Monday, October 13, 2025

Baltic Olympiad in Informatics 2024 (BOI 2024) Jobs

Problem Statement 

Maximize the sum of profits for a subset of jobs S:

max S⊆{1,...,n}i ∈ S xi

such that if job i is picked, then job p(i) is also picked:

i ∈ S → p(i) ∈ S

and there exists an order σ for the jobs in S in which no debt is incurred for all prefixes of the order:

∃ σ ( s + ∑kj=1 xσ(j) ≥ 0,   k = 1, ..., |S| )
Without the second constraint, the answer is simply just y(0), where y(i) = x(i) + max(0, y(first child job of job i)) + max(0, y(second child job of job i)) + … + max(0, y(last child job of job i)). That can easily be computed bottom-up since p(i) < i. This case is equivalent to the case of having enough initial euros to foot any possible intermediate debt: s = 10^18. Note, that x(0) = s and p(0) = 0: job 0 is a dummy job whose profit is the initial balance.

The final job picks will form a single connected component rooted at job 0 because of the first constraint: if job i is picked, then job p(i) is picked. So, picking a job is akin to merging that job with its parent. This is easy if both x(i), x(p(i)) > 0, but not  if x(i) > 0 and x(p(i)) < 0. So, you have to remember the minimum "cost to take" the new merged job that has profit x(i) + x(p(i)). You'd also need to remember the connected components across merges. Both can be done with auxiliary memory, e.g. cost_to_take(i) and a union-find data structure (U(i), find, merge).
 
How do you go about merging jobs to construct the solution then? Do you construct in a top-down manner by growing the solution at its' edges? I found that direction to be a "red herring", and all my attempts for such a job-merging strategy broke down for this example:
6 0
4 0
-3 
-1 2
-1 2
3 3
3 4
 
If x(i) > 0 and job p(i) is picked, then job i will certainly be picked. Further, if x(i) > 0 and cost_to_take(i)≤x(p(i)), then job i will certainly be picked if job p(i) is picked. There's more of these, but notice that these are true all across the job tree. It turns out that performing such job merges all across the tree greedily gives the answer.
 
The greedy job-merging solution is to merge the job that is least expensive to take and has positive profit with its parent job, i.e. out of all the jobs with x(i) > 0, take the one with the smallest cost_to_take(i) and merge it to its parent. I won't go into the actual casework for job-merging as it can be seen in the solution. Here's an additional description of the greedy algorithm. Also, there's a small catch for merging with the zeroth job.
 


Additional observation and questions:
There's a greedy solution to this problem, but I did not reach it from this starting point (the optimization program). Is there some structure in this optimization program that I am missing? I suspect that there is and that exploiting it makes the greedy solution very apparent. But, I have not spotted it. I am wondering about this because there's lots of other optimization problems with greedy solutions, e.g. huffman coding (minimize expected codeword length), basis pursuit (maximize sparsity of signal decomposition), ... 
 
Possible bottom-up dynamic programming solution:
Let z(i, k) = maximum profit attainable if you take job i and have k euros initially. This is enough for a bottom-up dynamic programming solution (convince yourself of this). I could not put together an implementation for it though; someone else supposedly did. One interesting thing to note is that if the job subtree rooted at job i is replaced with another job tree that has the exact same z(i,k) profile, then the problem is unchanged. So, z(i,k) carries all information needed about a job tree for this problem.

Tuesday, August 19, 2025

King's Summit

Atcoder ABC 419 problem C

observations
There are n points: x1=(r1,c1), x2=(r2,c2),...,xn=(rn,cn). The time it takes to get from point a=(ra,ca) to point b=(rb,cb) is the chebyshev distance between a and b: ||a-b||inf. This is length of the shortest "king-walk" between a and b on a chess-board. So, if all the persons meet at "summit" point s=(rs,cs), then it will take max(||s-x1||inf, ||s-x2||inf, ..., ||s-xn||inf) time, i.e. the longest time to reach the summit across all persons equals the total time it takes everyone to reach the summit. 

The chebyshev distance between points a and b is the Linf norm of the difference between a and b. The Linf norm selects the maximum component: ||a-b||inf=max(|ra-rb|,|ca-cb|). Additionally, |p| = max(-p, p). Hence, the Linf norm of ||a-b||inf = max(|ra-rb|,|ca-cb|) = max(ra-rb, rb-ra, ca-cb, cb-ca).

This leads to a big max expression:
time to reach summit becomes
= max(||s-x1||inf, ||s-x2||inf, ..., ||s-xn||inf)
= max(rs - rx1, rx1 - rs, cs - cx1, cx1 - cs,
           rs - rx2, rx2 - rs, cs - cx2, cx2 - cs,
           ...
           rs - rxn, rxn - rs, cs - cxn, cxn - cs)


Assume that the components of persons' starting positions are ordered:
rsmallest ≤ rsecond_smallest ≤ ... ≤ rbiggest
csmallest ≤ csecond_smallest ≤ ... ≤ cbiggest

Then, rs - rsmallest ≥ rs - rx2 ≥ ...rs - rbiggest. Exploiting this structure for the other maximum arguments, leads to this expression for the maximum time to summit:
= max(rs - rsmallest, rbiggest - rs, cs - csmallest, cbiggest - cs).

And, the minimum of this expression in terms of s=(rs,cs) is max( ceil((rbiggest - rsmallest) / 2), ceil((cbiggest - csmallest) / 2).

Friday, August 1, 2025

Codeforces union find problem involving cycles

Codeforces Round 1040 (Div. 2) Problem C

Observations

Given n ranges, the problem involves maximally covering the range [1, 2n] with those ranges, while jointly minimizing cycles between the endpoints of those ranges/ undirected edges. [1, 2n] will be maximally covered by taking all ranges, but this does not necessarily avoid cycles. The important observation is that all cycles have a minimum and maximum point, e.g. 1 and 4 in S = {(1, 2), (2, 4), (1, 4)}. There are always two "paths" to get from the minimum to the maximum. In the case of S, {(1, 2), (2, 4)} or {(1, 4)}. Getting from the minimum to the maximum while avoiding a cycle is then just a matter of take one path and only one path.

Solution

 Take ranges/ undirected edges greedily. Only take an edge (a, b) if b is not currently reachable from a, i.e. the other "path" has not already been taken

Tuesday, July 8, 2025

Distinct Values Subsequences

problem

observations
The number of non-empty subsequences of an array of length n is 2n - 1. If each number in the array appears once, then this is the answer.

The order of the numbers in the array does not affect the count, just the number frequencies.

Why is observation one true: each number can either appear or not appear, and the empty set is not counted, so (1 + 1)...(1 + 1) - 1 = 2n - 1. Each number contributes a term of (1 + 1): the first one is absence, while the second one is incidence.

But what if a number occurs more than once? A subsequence can either not have a number or have just one instance of it. That "choice" is easily expressed as a trivial combination: (frequency of number choose 1) = frequency of number. The expression for the general answer is (1 + frequency of first number)(1 + frequency of second number)...(1 + frequency of last number) - 1.

You could feasibly compute this expression directly mod 1e9+7, but there is a cooler incremental, indirect computation.

solution

#include<iostream>

#include<unordered_map>


using namespace std;


long long binpow(long long a, long long b, long long m) {

    a %= m;

    long long res = 1;

    while (b > 0) {

        if (b & 1)

            res = res * a % m;

        a = a * a % m;

        b >>= 1;

    }

    return res;

}


int main() {

        const long long MOD = 1000000007;

        int n;

        cin >> n;

        unordered_map<int, int> freq;

        long long count = 1;

        for (int i = 0;i < n;++i) {

                int x; 

                cin >> x;

                count = count + (count * binpow(freq[x] + 1, MOD - 2, MOD) % MOD) % MOD;

                ++freq[x];

        }

        count = count - 1 % MOD;

        cout << count << endl;

}

Monday, July 7, 2025

Distinct Values Subarrays

problem

observations

1. The number of subarrays of an array of length n is n(n + 1)/2. If each number in the array appears once, then this is the answer.

2. What if you find a covering of the array into disjoint segments each with distinct numbers, e.g. [1, 2, 3, 1] becomes [1, 2, 3], [1]. Can you apply the formula independently to the segments and sum the results? No, this fails to count segments that cross the split and would underestimate the correct answer. Also, such a covering might not be unique.

3. What if you forego disjoint-ness and expand the segments as much as possible while keeping the numbers distinct: [1, 2, 3, 1] becomes [1, 2, 3], [2, 3, 1]. This covering is unique, but now you overestimate the correct answer because the segments overlap. , that the amount of overshoot is equal to the number of subarrays of [2, 3]. A nice expression for the correct number of subarrays follows from this observation resembling the inclusion-exclusion principle formula.

4. If x1, ..., xk are distinct, but x1, ..., xk, xk+1 are not distinct, then one of the numbers in x1, ..., xk is xk+1. If xj = xk+1 with 1 ≤ j ≤ k, then xj+1, ..., xk+1 is distinct. This means that storing the index of incidence for each number in x1, ..., xk is enough to find out the start of the second segment in the construction from observation three.

solution

You can compute the expression from observation three in a sliding window fashion by exploiting the structure from observation four.

Wednesday, March 26, 2025

two problems

Codeforces Round 1011 (Div. 2) Problem C
Solution: x + y = x xor y as long as the addition involves no carries, i.e. the bits that are set in a should be disjoint from the ones in b. Multiples of powers of two are always disjoint from numbers less than that power of two. if x ≠ y, then without loss of generality, one of x and y will be bigger than the other, so choose the k that makes the the bigger one a power of two. If the numbers x and y given are equal to each other, then (x + k) + (y + k) ≠ (x + k) xor (y + k) for all non-negative numbers k (unless x = y = 0).

Solution: This is basically just a connected components problem. Once you identify the connected components, connect them minimally with # components - 1 edges: connect one vertex in each connected component to one vertex in a single chosen component of your choice. Here's my union find solution. You could also identify and connect the components using a depth first search and no union find structure, but then you need to store the adjacency matrix.