Monday, July 7, 2025

Distinct Values Subarrays

problem

observations

1. The number of subarrays of an array of length n is n(n + 1)/2. If each number in the array appears once, then this is the answer.

2. What if you find a covering of the array into disjoint segments each with distinct numbers, e.g. [1, 2, 3, 1] becomes [1, 2, 3], [1]. Can you apply the formula independently to the segments and sum the results? No, this fails to count segments that cross the split and would underestimate the correct answer. Also, such a covering might not be unique.

3. What if you forego disjoint-ness and expand the segments as much as possible while keeping the numbers distinct: [1, 2, 3, 1] becomes [1, 2, 3], [2, 3, 1]. This covering is unique, but now you overestimate the correct answer because the segments overlap. , that the amount of overshoot is equal to the number of subarrays of [2, 3]. A nice expression for the correct number of subarrays follows from this observation resembling the inclusion-exclusion principle formula.

4. If x1, ..., xk are distinct, but x1, ..., xk, xk+1 are not distinct, then one of the numbers in x1, ..., xk is xk+1. If xj = xk+1 with 1 ≤ j ≤ k, then xj+1, ..., xk+1 is distinct. This means that storing the index of incidence for each number in x1, ..., xk is enough to find out the start of the second segment in the construction from observation three.

solution

You can compute the expression from observation three in a sliding window fashion by exploiting the structure from observation four.