Problems with ordinary least squares and ridge solutions

I am confused about these two methods. For example:

Suppose p ≫ n (many more predictor variables than observations), I have a design matrix X and a quantitative response vector y, and I plan to fit a linear regression model.

Someone told me that the ordinary least squares solution is not unique (WHY?). And what can I say about the residuals of any such solution?

Now, if I want to use ridge regression instead, is the solution unique or not? WHY?

Finally, suppose I compute a series of ridge solutions β̂(λ) for X and y, letting λ get monotonically smaller. How does the ridge solution behave in the limit as λ ↓ 0?
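To make the last question concrete, here is a small numerical sketch of what I have in mind (assumes NumPy and a random X with p > n; `beta_min_norm` is the minimum-norm least-squares solution X⁺y, which I suspect is the limit, but that is exactly what I am asking about):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 100                          # p >> n
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

def ridge(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^{-1} X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Minimum-norm least-squares solution via the pseudoinverse.
beta_min_norm = np.linalg.pinv(X) @ y

# Distance from the ridge solution to beta_min_norm as lambda shrinks.
for lam in [1.0, 1e-2, 1e-4, 1e-6]:
    print(lam, np.linalg.norm(ridge(X, y, lam) - beta_min_norm))
```

When I run this, the distances shrink as λ decreases, which is what prompted the question.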

Maybe you've done this already, but before you use ridge, why not simplify your model by removing the explanatory variates that are highly correlated with each other, fit a multiple regression, and see how that looks? You could use stepwise regression to help.
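A minimal sketch of that pre-filtering step, assuming NumPy; the greedy strategy and the 0.9 correlation cutoff are arbitrary illustrative choices, not a recommendation:

```python
import numpy as np

def drop_correlated(X, threshold=0.9):
    """Greedily keep columns of X, skipping any column whose absolute
    correlation with an already-kept column exceeds the threshold.
    Returns the indices of the kept columns."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return kept

# Toy data: columns 0 and 1 are nearly collinear, column 2 is independent.
rng = np.random.default_rng(1)
z = rng.standard_normal(50)
X = np.column_stack([z,
                     z + 0.01 * rng.standard_normal(50),
                     rng.standard_normal(50)])
print(drop_correlated(X))   # the near-duplicate of column 0 is dropped
```

You would then fit an ordinary multiple regression on `X[:, kept]` only.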