Matrix completion is the task of filling in the missing entries of a partially observed matrix. A wide range of datasets are naturally organized in matrix form. One example is the movie-ratings matrix, as appears in the Netflix problem: given a ratings matrix in which each entry $(i, j)$ represents the rating of movie $j$ by customer $i$ if customer $i$ has watched movie $j$, and is otherwise missing, we would like to predict the remaining entries in order to make good recommendations to customers on what to watch next.
Without any restrictions on the number of degrees of freedom in the completed matrix this problem is underdetermined, since the hidden entries could be assigned arbitrary values. Thus matrix completion often seeks to find the lowest rank matrix or, if the rank of the completed matrix is known, a matrix of rank $r$ that matches the known entries.
From the statistical learning point of view, the matrix completion problem is an application of matrix regularization, which is a generalization of vector regularization. For example, in the low-rank matrix completion problem one may apply the regularization penalty taking the form of a nuclear norm $R(X) = \lambda \|X\|_*$.
Low rank matrix completion
One of the variants of the matrix completion problem is to find the lowest rank matrix $X$ which matches the matrix $M$, which we wish to recover, for all entries in the set $E$ of observed entries:

$\min_X \ \operatorname{rank}(X) \quad \text{subject to} \quad X_{ij} = M_{ij} \ \ \forall (i, j) \in E$
Candès and Recht proved that with assumptions on the sampling of the observed entries and sufficiently many sampled entries this problem has a unique solution with high probability.
An equivalent formulation, given that the matrix $M$ to be recovered is known to be of rank $r$, is to solve for any matrix $X$ of rank $r$ satisfying $X_{ij} = M_{ij}$ for all $(i, j) \in E$.
Assumptions
A number of assumptions on the sampling of the observed entries and the number of sampled entries are frequently made to simplify the analysis and to ensure the problem is not underdetermined.
Uniform sampling of observed entries
To make the analysis tractable, it is often assumed that the set $E$ of observed entries, of fixed cardinality, is sampled uniformly at random from the collection of all subsets of entries of that cardinality.
Lower bound on number of observed entries
Suppose the $m \times n$ matrix $M$ (with $m \le n$) we are trying to recover has rank $r$. There is an information-theoretic lower bound on the number of entries that must be observed before $M$ can be uniquely reconstructed. Firstly, the set of $m \times n$ matrices of rank at most $r$ has dimension $(m + n)r - r^2$, so at least this many entries must be observed.
Secondly, there must be at least one observed entry per row and per column of $M$: if row $i$ were entirely unobserved, the $i$-th coordinate of each left singular vector of $M$ could be changed arbitrarily without affecting the observed entries, and similarly for columns.
Combining the necessary conditions and assuming that $r \ll m, n$ (a valid assumption for many practical applications), the lower bound on the number of observed entries required to prevent the problem of matrix completion from being underdetermined is on the order of $nr \log n$.
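To make these orders of magnitude concrete, here is a small numeric check; the dimensions are hypothetical and chosen purely for illustration:

```python
import numpy as np

# Degrees of freedom of an m x n rank-r matrix, versus the ~ n r log n
# sampling level discussed above (hypothetical example sizes).
m, n, r = 1000, 1000, 10

dof = (m + n) * r - r**2          # dimension of the set of rank-<=r matrices
sampling = n * r * np.log(n)      # order of entries needed to avoid underdetermination

print(f"degrees of freedom: {dof}")           # 19900
print(f"n r log n:          {sampling:.0f}")  # ~69078
print(f"total entries:      {m * n}")         # 1000000
```

Even though the matrix has a million entries, on the order of only tens of thousands of observations can suffice when the rank is small.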
Incoherence
The concept of incoherence arose in compressed sensing. It is introduced in the context of matrix completion to ensure the singular vectors of $M$ are not too "spiky": the mass of each singular vector should be spread over many coordinates rather than concentrated on a few. Standard basis vectors are then undesirable as singular vectors, while a vector such as $\frac{1}{\sqrt{n}}(1, 1, \ldots, 1)$ is desirable.
Candès and Recht define the coherence of a subspace $U \subseteq \mathbb{R}^n$ of dimension $r$ as $\mu(U) = \frac{n}{r} \max_{1 \le i \le n} \|P_U e_i\|^2$, where $P_U$ is the orthogonal projection onto $U$ and $e_i$ is the $i$-th standard basis vector. Writing the singular value decomposition of the $m \times n$ matrix $M$ as $\sum_k \sigma_k u_k v_k^\dagger$, with $U$ and $V$ the spans of the left and right singular vectors, incoherence asserts that:

- $\mu(U), \mu(V) \le \mu_0$
- The entries of $\sum_k u_k v_k^\dagger$ are bounded in magnitude by $\mu_1 \sqrt{\frac{r}{mn}}$

for some constants $\mu_0, \mu_1$.
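The coherence $\mu(U)$ can be computed directly from its definition. The following sketch (assuming NumPy, with randomly generated data for illustration) evaluates it for the singular subspaces of a low-rank matrix, using the fact that $\|P_U e_i\|^2$ equals the squared norm of the $i$-th row of an orthonormal basis matrix for $U$:

```python
import numpy as np

def coherence(U):
    """mu(U) = (n / r) * max_i ||P_U e_i||^2, for U with orthonormal columns."""
    n, r = U.shape
    # Row i of U holds the coordinates of P_U e_i in the basis U,
    # so ||P_U e_i||^2 is the squared norm of that row.
    return (n / r) * np.max(np.sum(U**2, axis=1))

rng = np.random.default_rng(0)
r = 5
M = rng.standard_normal((50, r)) @ rng.standard_normal((r, 40))  # rank-5 matrix
U, s, Vt = np.linalg.svd(M, full_matrices=False)
print(coherence(U[:, :r]), coherence(Vt[:r, :].T))
```

Random low-rank matrices like this one tend to have small coherence; a matrix whose singular vectors align with standard basis vectors would instead have coherence near $n/r$.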
Low rank matrix completion with noise
In real-world applications, one often observes only a few entries, corrupted at least by a small amount of noise. For example, in the Netflix problem, the ratings are uncertain. Candès and Plan showed that it is possible to fill in the many missing entries of large low-rank matrices from just a few noisy samples by nuclear norm minimization. The noisy model assumes that we observe

$Y_{ij} = M_{ij} + Z_{ij}, \quad (i, j) \in \Omega,$

where $\{Z_{ij} : (i, j) \in \Omega\}$ is a noise term, which may be stochastic or deterministic. To recover the incomplete matrix, one then solves

$\min_X \ \|X\|_* \quad \text{subject to} \quad \|P_\Omega(X - Y)\|_F \le \delta,$

where $\delta$ bounds the total noise on the observed entries.
Among all matrices consistent with the data, this finds the one with minimum nuclear norm. Candès and Plan have shown that this reconstruction is accurate: they proved that when perfect noiseless recovery occurs, matrix completion is stable with respect to perturbations, with an error proportional to the noise level $\delta$, for all matrices $M$ satisfying the incoherence conditions.
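As a minimal sketch of this program, assuming the cvxpy modeling library is available; the data, sampling rate, and choice of $\delta$ below are illustrative, not from the original text:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n, r = 20, 20, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # low-rank ground truth
B = (rng.random((m, n)) < 0.6).astype(float)                   # 0/1 mask for Omega
Y = M + 0.01 * rng.standard_normal((m, n))                     # noisy observations
delta = 0.01 * np.sqrt(B.sum())                                # rough noise-energy bound

# min ||X||_*  subject to  ||P_Omega(X - Y)||_F <= delta
X = cp.Variable((m, n))
prob = cp.Problem(cp.Minimize(cp.normNuc(X)),
                  [cp.norm(cp.multiply(B, X - Y), "fro") <= delta])
prob.solve()
print("relative error:", np.linalg.norm(X.value - M) / np.linalg.norm(M))
```

Setting `delta = 0` recovers the noiseless equality-constrained program discussed under convex relaxation below.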
High rank matrix completion
High rank matrix completion is NP-hard in general. However, under certain assumptions, some incomplete high rank, or even full rank, matrices can be completed.
Eriksson, Balzano and Nowak have considered the problem of completing a matrix with the assumption that the columns of the matrix belong to a union of multiple low-rank subspaces. Since the columns belong to a union of subspaces, the problem may be viewed as a missing-data version of the subspace clustering problem. Let $X$ be an $n \times N$ matrix whose complete columns lie in a union of at most $k$ subspaces, each of dimension at most $r < n$.
The algorithm involves several steps: (1) local neighborhoods; (2) local subspaces; (3) subspace refinement; (4) full matrix completion. This method can be applied to Internet distance matrix completion and topology identification.
Convex relaxation
The rank minimization problem is NP-hard. One approach, proposed by Candès and Recht, is to form a convex relaxation of the problem and minimize the nuclear norm $\|X\|_*$ (the sum of the singular values of $X$) instead of $\operatorname{rank}(X)$ (the number of nonzero singular values), subject to the same data constraints $X_{ij} = M_{ij}$ for $(i, j) \in E$. This is analogous to relaxing the $\ell_0$ norm to the $\ell_1$ norm for vectors, and the relaxed problem can be solved by semidefinite programming (SDP).
The complexity of using SDP to solve the convex relaxation is $O(\max(m, n)^4)$.
Candès and Recht show, using the study of random variables on Banach spaces, that if the number of observed entries is on the order of $n^{1.2} r \log n$ (assuming without loss of generality $m \le n$, and under the incoherence assumptions above), the rank minimization problem has a unique solution which is also the solution of its convex relaxation, with high probability.
This result has been improved by Candès and Tao. They achieve bounds that differ from the optimal bounds only by polylogarithmic factors by strengthening the assumptions. Instead of the incoherence property, they assume the strong incoherence property with parameter $\mu_3$:

- $\left| \langle e_a, P_U e_{a'} \rangle - \frac{r}{m} 1_{a = a'} \right| \le \mu_3 \frac{\sqrt{r}}{m}$ for all $a, a' \le m$, and $\left| \langle e_b, P_V e_{b'} \rangle - \frac{r}{n} 1_{b = b'} \right| \le \mu_3 \frac{\sqrt{r}}{n}$ for all $b, b' \le n$
- The entries of $\sum_i u_i v_i^\dagger$ are bounded in magnitude by $\mu_3 \sqrt{\frac{r}{mn}}$
Intuitively, strong incoherence of a matrix $M$ asserts that the orthogonal projections of the standard basis vectors onto its singular subspaces behave much as they would if the singular vectors were distributed uniformly at random.
Candès and Tao find that when $r = O(1)$ and the number of observed entries is on the order of $\mu_3^4\, n \log^2 n$, the relaxed problem recovers $M$ exactly with high probability; for general $r$, a number of observed entries on the order of $\mu_3^2\, n r \log^6 n$ suffices.
Gradient descent
Keshavan, Montanari and Oh consider a variant of matrix completion where the rank of the $m \times n$ matrix $M$ to be recovered is known to be $r$. They assume Bernoulli sampling of entries, constant aspect ratio $\frac{m}{n}$, bounded magnitude of entries of $M$, and constant condition number $\frac{\sigma_1}{\sigma_r}$. Writing $M^E$ for the matrix that agrees with $M$ on the observed set $E$ and is zero elsewhere, their algorithm proceeds as follows:
- Trim $M^E$ by removing all observations from columns with degree larger than $\frac{2|E|}{n}$, setting the entries in those columns to zero, and do the same for rows.
- Project the trimmed $M^E$ onto its first $r$ principal components. Call the resulting matrix $\text{Tr}(M^E)$.
- Solve $\min_{X, Y} \min_{S \in \mathbb{R}^{r \times r}} \frac{1}{2} \sum_{(i, j) \in E} \left( M_{ij} - (X S Y^\dagger)_{ij} \right)^2 + \rho G(X, Y)$, where $G(X, Y)$ is some regularization function, by gradient descent with line search. Initialize $X, Y$ at $X_0, Y_0$, where $\text{Tr}(M^E) = X_0 S_0 Y_0^\dagger$, and choose $G(X, Y)$ as some function forcing $X, Y$ to remain incoherent throughout gradient descent if $X_0, Y_0$ are incoherent.
- Return the matrix $X S Y^\dagger$.
Steps 1 and 2 of the algorithm yield a matrix $\text{Tr}(M^E)$ that is very close to the true matrix $M$ (as measured by the root mean square error), provided sufficiently many entries are observed.
In Step 3, the space of candidate matrices $X, Y$ can be reduced by noticing that the inner minimization problem has the same solution for $(X, Y)$ as for $(XQ, YR)$, where $Q$ and $R$ are orthonormal $r \times r$ matrices. Gradient descent can then be performed over the cross product of two Grassmann manifolds.
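The following simplified sketch (assuming NumPy) illustrates the spirit of this approach: spectral initialization from the rescaled zero-filled observations, followed by gradient descent on the factors. The trimming step and the incoherence-forcing regularizer $G$ are omitted, and the step size and iteration count are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 60, 50, 3
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-r truth
B = rng.random((m, n)) < 0.5                                   # observed set E
p = B.mean()

# Spectral initialization: top-r SVD of the zero-filled, rescaled observations.
U, s, Vt = np.linalg.svd(np.where(B, M, 0.0) / p, full_matrices=False)
X = U[:, :r] * np.sqrt(s[:r])
Y = (Vt[:r, :].T) * np.sqrt(s[:r])

lr = 0.002
for _ in range(2000):
    R = np.where(B, X @ Y.T - M, 0.0)   # residual on observed entries only
    # Gradients of 0.5 * sum_E (X Y^T - M)_ij^2 with respect to X and Y.
    X, Y = X - lr * (R @ Y), Y - lr * (R.T @ X)

print("relative error:", np.linalg.norm(X @ Y.T - M) / np.linalg.norm(M))
```

Here the factorization $XY^\dagger$ absorbs the inner matrix $S$ into the factors, which keeps the sketch short at the cost of the manifold structure used in the full analysis.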
Alternating least squares minimization
Alternating minimization represents a widely applicable and empirically successful approach for finding low-rank matrices that best fit the given data. For example, for the problem of low-rank matrix completion, this method is believed to be one of the most accurate and efficient, and formed a major component of the winning entry in the Netflix problem. In the alternating minimization approach, the low-rank target matrix is written in a bilinear form $X = U V^\dagger$;
the algorithm then alternates between finding the best $U$ and the best $V$. While the overall problem is non-convex, each subproblem is convex and can be solved efficiently.
The alternating minimization algorithm can be viewed as an approximate way to solve the following non-convex problem:

$\min_{U \in \mathbb{R}^{m \times k},\, V \in \mathbb{R}^{n \times k}} \ \|P_\Omega(U V^\dagger - M)\|_F^2$
The AltMinComplete Algorithm proposed by Jain, Netrapalli and Sanghavi is listed here:
- Input: observed set $\Omega$, values $P_\Omega(M)$
- Partition $\Omega$ into $2T + 1$ subsets $\Omega_0, \cdots, \Omega_{2T}$, with each element of $\Omega$ belonging to one of the $\Omega_t$ with equal probability
- $\hat{U}^0 = SVD(\frac{1}{p} P_{\Omega_0}(M), k)$, i.e., the top-$k$ left singular vectors of $\frac{1}{p} P_{\Omega_0}(M)$
- Clipping: set to zero all elements of $\hat{U}^0$ with magnitude greater than $\frac{2 \mu \sqrt{k}}{\sqrt{n}}$, and orthonormalize the columns of $\hat{U}^0$
- for $t = 0, \cdots, T - 1$ do
- $\hat{V}^{t+1} \leftarrow \operatorname{argmin}_{V \in \mathbb{R}^{n \times k}} \|P_{\Omega_{t+1}}(\hat{U}^t V^\dagger - M)\|_F^2$
- $\hat{U}^{t+1} \leftarrow \operatorname{argmin}_{U \in \mathbb{R}^{m \times k}} \|P_{\Omega_{T + t + 1}}(U (\hat{V}^{t+1})^\dagger - M)\|_F^2$
- end for
- Return $X = \hat{U}^T (\hat{V}^T)^\dagger$
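A simplified NumPy sketch of the alternating least-squares updates follows; it omits AltMinComplete's sample splitting and clipping steps, and the problem sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 60, 50, 3
M = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))  # rank-k truth
B = rng.random((m, n)) < 0.5                                   # observed set Omega
p = B.mean()

# Initialization: top-k left singular vectors of (1/p) * P_Omega(M).
U = np.linalg.svd(np.where(B, M, 0.0) / p, full_matrices=False)[0][:, :k]

for _ in range(20):
    # V-step: for each column j, least squares over the rows observed in column j.
    V = np.stack([np.linalg.lstsq(U[B[:, j]], M[B[:, j], j], rcond=None)[0]
                  for j in range(n)])
    # U-step: for each row i, least squares over the columns observed in row i.
    U = np.stack([np.linalg.lstsq(V[B[i, :]], M[i, B[i, :]], rcond=None)[0]
                  for i in range(m)])

print("relative error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))
```

Each update decouples into small independent least-squares problems, one per row or column, which is what makes alternating minimization fast and easy to parallelize in practice.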
They showed that by observing a number of random entries of an incoherent matrix $M$ that is nearly linear in the dimension $n$ (up to polylogarithmic factors, with polynomial dependence on the rank $k$ and the condition number), the AltMinComplete algorithm can recover $M$ to within accuracy $\epsilon$ in $T = O(\log(\|M\|_F / \epsilon))$ iterations.
It is worth noting that, although convex relaxation based methods have rigorous analysis, alternating minimization based algorithms are more successful in practice.
Applications
Several applications of matrix completion are summarized by Candès and Plan as follows:
Collaborative filtering
Collaborative filtering is the task of making automatic predictions about the interests of a user by collecting taste information from many users. Companies like Apple, Amazon, Barnes and Noble, and Netflix are trying to predict their users' preferences from partial knowledge. In this kind of matrix completion problem, the unknown full matrix is often considered low rank because only a few factors typically contribute to an individual's tastes or preferences.
System identification
In control, one would like to fit a discrete-time linear time-invariant state-space model

$x(t + 1) = A x(t) + B u(t), \quad y(t) = C x(t) + D u(t)$

to a sequence of inputs $u(t) \in \mathbb{R}^m$ and outputs $y(t) \in \mathbb{R}^p$, $t = 0, \ldots, N$. The vector $x(t) \in \mathbb{R}^n$ is the state of the system at time $t$, and $n$ is the order of the system model. From the input/output pairs, one would like to recover the system matrices and the minimal order $n$; this can be posed as a rank minimization problem over a structured (block Hankel) matrix.
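As an illustration of why a Hankel matrix reveals the system order, the following sketch (assuming NumPy and SciPy, with a hypothetical second-order system) builds a Hankel matrix from the impulse response; its numerical rank equals the state dimension:

```python
import numpy as np
from scipy.linalg import hankel

# A hypothetical order-2 SISO system (controllable and observable).
A = np.array([[0.9, 0.2], [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Markov parameters h_t = C A^t B of the impulse response.
h = [(C @ np.linalg.matrix_power(A, t) @ B).item() for t in range(19)]

# Hankel matrix H[i, j] = h[i + j]; its rank equals the system order.
H = hankel(h[:10], h[9:])
print(np.linalg.matrix_rank(H, tol=1e-8))  # -> 2
```

With only partially known input/output data, the same Hankel structure is what turns minimal-order identification into a structured matrix completion problem.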
Global positioning
The global positioning problem emerges naturally in sensor networks. The problem is to recover the global positioning of points in Euclidean space from a local or partial set of pairwise distances. Thus it is a matrix completion problem with rank two if the sensors are located in a 2-D plane and three if they are in a 3-D space.
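In the Gram-matrix (inner-product) formulation, from which squared distances follow via $d_{ij}^2 = G_{ii} + G_{jj} - 2 G_{ij}$, the rank claim can be checked directly; a toy sketch with randomly placed sensors:

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.standard_normal((10, 2))   # 10 sensor positions in the plane
G = P @ P.T                        # matrix of pairwise inner products
print(np.linalg.matrix_rank(G))    # -> 2: one rank per spatial dimension
```

For sensors in 3-D space, `P` would have three columns and the rank would be three, matching the statement above.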