Wednesday, March 26, 2025

two problems

Codeforces Round 1011 (Div. 2) Problem C
Solution: x + y = x xor y exactly when the addition involves no carries, i.e. when the bits set in x are disjoint from the bits set in y. Any multiple of 2^m is bit-disjoint from any number less than 2^m. If x ≠ y, then without loss of generality one of x and y is bigger than the other, so choose the k that makes the bigger one a power of two. If the numbers x and y given are equal to each other, then (x + k) + (y + k) ≠ (x + k) xor (y + k) for all non-negative k (unless x = y = 0), since the xor of two equal numbers is 0 while their sum is positive.
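Here's a minimal sketch of that construction in Python (I'm assuming the task is to output a non-negative k with (x + k) + (y + k) = (x + k) xor (y + k), or -1 when none exists; the function name is mine):

```python
def find_k(x: int, y: int) -> int:
    """Return k >= 0 with (x+k) + (y+k) == (x+k) ^ (y+k), or -1 if impossible."""
    if x == y:
        # x + k == y + k, so the xor is 0 while the sum is positive, unless x = y = 0.
        return 0 if x == 0 else -1
    big = max(x, y)
    # Smallest power of two >= big: big + k gets a single set bit, and
    # min(x, y) + k stays strictly below it, so no bit is shared.
    power = 1 << (big - 1).bit_length()
    return power - big
```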

Solution: This is just a connected components problem. Once you identify the connected components, connect them minimally with (# components - 1) edges: pick one component and connect one vertex from every other component to a vertex in it. Here's my union find solution. You could also identify and connect the components using a depth first search and no union find structure, but then you need to store the graph's adjacency structure, whereas union find can consume the edge list directly.
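Since the original code isn't reproduced here, a sketch of the union find approach might look like this (n vertices and an edge list are assumed as input):

```python
class DSU:
    """Union-find with path halving."""
    def __init__(self, n: int):
        self.parent = list(range(n))

    def find(self, a: int) -> int:
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a

    def union(self, a: int, b: int) -> None:
        self.parent[self.find(a)] = self.find(b)

def edges_to_connect(n: int, edges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Minimal set of new edges that makes the graph connected."""
    dsu = DSU(n)
    for a, b in edges:
        dsu.union(a, b)
    # One representative vertex per component: the roots of the union-find forest.
    roots = [v for v in range(n) if dsu.find(v) == v]
    # Attach every other component to the first one: (# components - 1) edges.
    return [(roots[0], r) for r in roots[1:]]
```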

Saturday, February 8, 2025

how to pay n cents

How many different ways can you pay n cents using cents, nickels, dimes, quarters, and half-dollars? Let the answer be A_n. This is the first problem in Problems and Theorems in Analysis I. Let f_coin[n] be a discrete signal that equals 1 when n is a non-negative multiple of the coin's value, and 0 everywhere else:
Ex. f_nickel[-1] = 0, f_nickel[0] = 1, f_nickel[1] = 0, f_nickel[5] = 1, f_nickel[6] = 0, ...
This can be thought of as a forward echo, in which the "future" filter taps are not attenuated.

(f_cent * f_nickel * f_dime * f_quarter * f_half-dollar)[n] = A_n, where * is the convolution operator. This can be computed via dynamic programming. Start with f_cent[n] by setting dp[n] = 1 for all 0 ≤ n ≤ 100, then convolve with the next coin signal. For each coin, this amounts to applying the update rule dp[n] ← dp[n] + dp[n - coin_value] from n = coin_value to n = 100.
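As a sketch, the whole computation fits in a few lines; the answer for one dollar is the classic 292:

```python
def ways_to_pay(n: int) -> int:
    # dp starts as f_cent convolved in: exactly one way to pay any amount
    # using only cents.
    dp = [1] * (n + 1)
    # Convolving with each remaining coin signal is one in-place pass.
    for coin in (5, 10, 25, 50):
        for amount in range(coin, n + 1):
            dp[amount] += dp[amount - coin]
    return dp[n]

print(ways_to_pay(100))  # 292
```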

If order of payment matters, then the solution is a simple linear recurrence: A_n = A_(n-1) + A_(n-5) + A_(n-10) + A_(n-25) + A_(n-50), with A_0 = 1 and A_k = 0 for k < 0. (These are compositions rather than partitions, so they differ from the unordered counts above.)
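The same recurrence as a sketch:

```python
def ordered_ways(n: int) -> int:
    # dp[a] counts ordered sequences of coins summing to a.
    dp = [0] * (n + 1)
    dp[0] = 1  # one way to pay nothing: the empty sequence
    for amount in range(1, n + 1):
        dp[amount] = sum(dp[amount - c] for c in (1, 5, 10, 25, 50) if c <= amount)
    return dp[n]
```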

Pólya's explanation touches on generating functions.

Supposedly, they are useful for enumeration problems, like integer partitions. I guess the connection makes sense because polynomial multiplication amounts to convolution. See these notes:
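To make the connection concrete, the classic generating function for this problem is

A_n = [x^n] 1/((1 - x)(1 - x^5)(1 - x^10)(1 - x^25)(1 - x^50)),

where [x^n] extracts the coefficient of x^n. Each factor 1/(1 - x^c) = 1 + x^c + x^(2c) + ... is exactly the generating function of the coin signal f_coin for a coin of value c, and multiplying power series convolves their coefficient sequences.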

Dynamic programming solutions to the equivalent CSES problems:

Thursday, February 6, 2025

matrix 2-norm

Intuitively, the matrix 2-norm measures the greatest action achievable by applying A to x. A = UΣV*, where U and V are unitary matrices and Σ is a non-negative diagonal matrix. This is the full singular value decomposition of A. If A is Hermitian and positive semidefinite, then U = V. Basically, A is equivalent to an orthonormal basis change (V*), followed by scaling (Σ), followed by another orthonormal basis change (U). The L^2 norm of a vector is roughly equivalent to the "energy" of a discrete signal in a signal processing sense:
L^2 norm of a vector: (x*x)^(1/2)
Energy of a discrete signal: x*x
Unitary transforms, like rotations and reflections, preserve L^2 norms by Parseval's theorem:
(Qx)*(Qx) = x*Q*Qx = x*x
This is why the induced matrix 2-norm of A equals its biggest singular value. By definition,
||A||_2 = sup_(x ∈ C^n, x ≠ 0) ||Ax||_2 / ||x||_2
Because U and V* preserve the L^2 norm, the supremum is achieved when all of the vector's energy lies along the direction that Σ scales by the largest singular value.
||A||_2 = sup_(x ∈ C^n, ||x||_2 = 1) ||Ax||_2
||A||_2 = sup_(x ∈ C^n, ||x||_2 = 1) ||UΣV*x||_2 = sup_(y ∈ C^n, ||y||_2 = 1) ||Σy||_2, where y = V*x is still a unit vector
The unit vector that achieves this largest scaling is the first right singular vector v_1, the first column of V.
Notice that V*v_1 = e_1 and Σe_1 = σ_1 e_1, so ||Av_1||_2 = σ_1 = σ_max(A), the largest singular value of A.
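A quick numerical check of all of this (np.linalg.norm(A, 2) computes the induced 2-norm):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

U, s, Vh = np.linalg.svd(A, full_matrices=False)  # A = U @ np.diag(s) @ Vh
v1 = Vh.conj().T[:, 0]            # first right singular vector (first column of V)

print(np.linalg.norm(A, 2))       # induced 2-norm...
print(s[0])                       # ...equals the largest singular value
print(np.linalg.norm(A @ v1))     # and v1 achieves it: ||A v1||_2 = sigma_max
```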

Tuesday, February 4, 2025

more about projectors

Say you have a vector v in C^m and a matrix A in C^(m×n). How do you obtain the projection of v onto range(A)?

Consider v once it has been decomposed into a component in range(A) and a component in ⊥range(A).

v = proj_A(v) + (v - proj_A(v))

Because proj_A(v) is in range(A), proj_A(v) = Ax for some x in C^n.

v = Ax + (v - Ax)

v - Ax is orthogonal to range(A), so

(a_i)*(v - Ax) = 0 for all columns a_i. This is better shown in matrix form: A*(v - Ax) = 0

(A*A)x = A*v

When is A*A invertible? This is the case when null(A*A) = {0}. If the n columns of A in C^m are linearly independent, then null(A) = {0} by the rank-nullity theorem: # columns of A = rank(A) + dim(null(A)). This means that the linear map A is injective (one-to-one), or an "embedding". The vectors Ax in C^m all lie in range(A), which is orthogonal to the left null space null(A*), so Ax ↦ A*Ax is also injective. Thus the composite map x ↦ A*Ax is injective, and it is surjective too since it goes from C^n to C^n. Thus A*A is bijective and invertible.

x = (A*A)^(-1) A*v, and proj_A(v) = Ax = A(A*A)^(-1) A*v

What is the matrix P = A(A*A)^(-1)A*? P is the orthogonal projector onto range(A), since P^2 = P and P* = P.
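A small NumPy sanity check (real A for brevity; in floating point you'd normally avoid forming A*A explicitly, since that squares the condition number, and use QR or lstsq instead):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # tall matrix with (almost surely) independent columns
v = rng.standard_normal(6)

# Orthogonal projector onto range(A): P = A (A*A)^(-1) A*.
P = A @ np.linalg.solve(A.T @ A, A.T)

print(np.allclose(P @ P, P))      # P^2 = P
print(np.allclose(P.T, P))        # P* = P
# Pv = Ax, where x solves the least-squares problem min ||Ax - v||_2.
x, *_ = np.linalg.lstsq(A, v, rcond=None)
print(np.allclose(P @ v, A @ x))
```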




Wednesday, January 15, 2025

projectors and reflectors

Notation: if v is a vector, then v* is its conjugate transpose and vv* is an outer product. If Q is a matrix, Q* is its conjugate transpose. C^m is the m-dimensional vector space over the complex numbers. The conjugate transpose is just the regular transpose ^T in R^m.

Say you have a vector v in C^m; then P_1 = (vv*)/(v*v) is the rank one projector onto the line span(v). P_2 = (I - P_1) is the rank m - 1 complementary projector onto the hyperplane ⊥span(v). Intuitively, P_1 isolates the component that lives in span(v), while the transform P_2 = (I - P_1) "shaves off" that component. If v has unit length, then P_1 = vv*, since the inner product v*v = 1. You can project onto more than just hyperplanes and lines. If you have an m by n matrix Q (n ≤ m) with orthonormal columns, then QQ* projects onto the space spanned by those columns. Notice the similarity between vv* and QQ*. Also, notice that QQ* = I if Q is a full basis for C^m.
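A tiny NumPy illustration, splitting a vector into its span(v) and ⊥span(v) components:

```python
import numpy as np

v = np.array([3.0, 4.0])
P1 = np.outer(v, v) / (v @ v)  # rank one projector onto span(v)
P2 = np.eye(2) - P1            # complementary projector onto ⊥span(v)

x = np.array([1.0, 1.0])
print(P1 @ x + P2 @ x)                                     # [1. 1.]: the components reassemble x
print(np.allclose(P1 @ P1, P1), np.allclose(P2 @ P2, P2))  # both are idempotent
```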

What if you want to reflect a vector across the hyperplane ⊥span(v)? This is just a matter of negating the component in span(v): P_3 = (I - 2P_1). Here's an easy example of this in 3D space: imagine the hyperplane is the xy-plane (z = 0, x and y are free); the line orthogonal to this space is span(e_3). The projector onto the xy-plane is (I - [0, 0, e_3]) = [e_1, e_2, 0], which is just the identity matrix, except the last column is zero. (I - 2[0, 0, e_3]) is just the identity matrix, except the last diagonal entry is -1, which negates the z component.

What are the eigenvalues of a transform that reflects a vector in C^m across a hyperplane? The components of the vector in the hyperplane ⊥span(v) are untouched, while the component of the vector in span(v) is negated. Hence, the eigenvalues of the reflector are all one except for a single negative one, and the determinant, the product of the eigenvalues, is negative one. The reflector is Hermitian with eigenvalues of absolute value 1, so it is unitary and its singular values are all 1. This is the solution to exercise 10.1 in Numerical Linear Algebra by Trefethen and Bau.
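Checking the claims numerically (the reflector here is the Householder-style I - 2(vv*)/(v*v)):

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal(4)
R = np.eye(4) - 2 * np.outer(v, v) / (v @ v)  # reflector across ⊥span(v)

print(np.linalg.eigvalsh(R).round(6))    # [-1.  1.  1.  1.]
print(np.linalg.det(R).round(6))         # -1.0
print(np.linalg.svd(R)[1].round(6))      # all singular values equal 1
print(np.allclose(R.T @ R, np.eye(4)))   # unitary (orthogonal, since R is real here)
```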

Tuesday, January 14, 2025

intersection of two planes

Here's a nice problem from the book Numerical Linear Algebra by Trefethen and Bau (Lecture 7, Exercise 7.4). You are given two planes in R^3, A = span{a1, a2} and B = span{b1, b2}, where a1 and a2 are linearly independent, as are b1 and b2; now, construct a basis for A ∩ B.

If A = B, then A ∩ B = A = B. Because A and B are subspaces, both contain the origin, so they are never disjoint.

If A ≠ B, they intersect in a line, by dimension counting: dim(A ∩ B) = dim A + dim B - dim(A + B) = 2 + 2 - 3 = 1.

How do you figure out if the planes are equal? They are equal exactly when the spaces orthogonal to them are the same, i.e. ⊥A = ⊥B. Note that ⊥A and ⊥B are lines. What if you form the span of those two lines? That would be the space spanned by all vectors orthogonal to either A or B, and when A ≠ B it is a plane, since a nonzero vector cannot be orthogonal to A and B at the same time. What would be the space orthogonal to that plane? It would be A ∩ B.

Conveniently, a full QR factorization gives you both an orthonormal basis for the span of a set of linearly independent vectors and a basis for the space orthogonal to them.
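Here's a sketch of the whole construction with NumPy (the function name and tolerance are mine; mode="complete" requests the full QR factorization):

```python
import numpy as np

def plane_intersection(a1, a2, b1, b2, tol=1e-10):
    """Basis for span{a1, a2} ∩ span{b1, b2} in R^3."""
    # Full QR of [a1 a2]: columns 0-1 of Q span A, column 2 spans ⊥A.
    Qa, _ = np.linalg.qr(np.column_stack([a1, a2]), mode="complete")
    Qb, _ = np.linalg.qr(np.column_stack([b1, b2]), mode="complete")
    na, nb = Qa[:, 2], Qb[:, 2]          # unit normals: bases for ⊥A and ⊥B
    if abs(na @ nb) > 1 - tol:           # normals parallel, so A = B
        return [Qa[:, 0], Qa[:, 1]]
    # Otherwise A ∩ B = ⊥span{na, nb}: again the last column of a full QR.
    Qn, _ = np.linalg.qr(np.column_stack([na, nb]), mode="complete")
    return [Qn[:, 2]]

# The xy-plane and the xz-plane intersect in the x-axis.
print(plane_intersection(np.array([1.0, 0, 0]), np.array([0, 1.0, 0]),
                         np.array([1.0, 0, 0]), np.array([0, 0, 1.0])))
```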