Suppose p ≫ n (many more predictor variables than observations). I have a design matrix X and a quantitative response vector y, and I plan to fit a linear regression model.

Someone told me that the ordinary least squares solution is not unique in this setting. Why? And what can I say about the residuals of any such solution?

Now, if I use ridge regression instead, is the solution unique or not? Why?

Last, suppose I compute a series of ridge solutions β̂(λ) for X and y, letting λ get monotonically smaller. How does the ridge solution behave in the limit as λ ↓ 0?
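
To make the setup concrete, here is a minimal NumPy sketch of the kind of experiment I have in mind. It is not a proof. The sizes n = 20 and p = 200, the random Gaussian X and y, the λ grid, and the helper name `ridge` are all assumptions chosen for illustration. It builds two different least squares solutions, checks their residuals, and tracks how far each ridge solution is from the minimum-norm least squares solution as λ shrinks.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 200                      # p >> n; sizes are arbitrary (assumption)
X = rng.standard_normal((n, p))     # illustrative random design matrix
y = rng.standard_normal(n)          # illustrative random response

# Minimum-norm least squares solution via the Moore-Penrose pseudoinverse.
beta_mn = np.linalg.pinv(X) @ y

# A second least squares solution: add a vector from the null space of X.
# With full_matrices=True, the last p - rank(X) rows of Vt span that null space.
_, _, Vt = np.linalg.svd(X, full_matrices=True)
null_vec = Vt[-1]
beta_alt = beta_mn + 3.0 * null_vec

print("residual norm (min-norm):", np.linalg.norm(y - X @ beta_mn))
print("residual norm (shifted): ", np.linalg.norm(y - X @ beta_alt))


def ridge(X, y, lam):
    """Ridge solution for lam > 0, computed in the n x n 'dual' form
    X^T (X X^T + lam I)^{-1} y, which equals (X^T X + lam I)^{-1} X^T y."""
    n_obs = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n_obs), y)


lams = [1.0, 1e-2, 1e-4, 1e-6]
dists = [np.linalg.norm(ridge(X, y, lam) - beta_mn) for lam in lams]
for lam, d in zip(lams, dists):
    print(f"lambda = {lam:g}, distance to min-norm solution = {d:.3e}")
```

In this example both coefficient vectors fit y essentially exactly even though they differ, and the distance from the ridge solution to the minimum-norm solution shrinks as λ decreases. I would like to understand why this happens in general.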

THX.