Best-fitting lines in PCA

I'm learning about PCA and I don't understand the claim that PC2 is the next-best-fitting line after PC1. Wouldn't the second-best-fitting line be one that is close to PC1? Such a line would not be orthogonal to PC1 (which I'm aware is required), but it is likely to have a larger sum of squares of distances than the orthogonal PC2, and thus be a better description of the spread. Could someone explain this to me, please?


Ambassador to the humans
PC2 accounts for the most variation that remains after the variation along PC1 has been accounted for.

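As an example, think of points in 3D, laid out on the usual x, y, z axes. Suppose that the most variation lies exactly along the z direction, so PC1 is the z axis. To get PC2 we then set z aside, look only at x and y, and find the direction that accounts for the most variation there. A line close to PC1 would mostly re-describe variation that PC1 has already captured, so it adds very little that is new.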
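This can be checked numerically. Below is a minimal sketch, assuming NumPy and made-up Gaussian data whose spread is largest along z (the scales 1.0, 0.3 and 3.0 and the 0.1 tilt are arbitrary choices for illustration, not anything from the question). It removes the PC1 component from every point, then compares how much of the leftover variation is captured by PC2 versus by a line tilted only slightly away from PC1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: spread is largest along z, then x, then y.
X = rng.normal(size=(500, 3)) * np.array([1.0, 0.3, 3.0])
X -= X.mean(axis=0)  # PCA works on centered data

# Rows of Vt are the principal directions, ordered by explained variance.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
pc1, pc2 = Vt[0], Vt[1]

# Subtract each point's component along PC1: what is left is the
# variation that PC1 has NOT accounted for.
residual = X - np.outer(X @ pc1, pc1)

# The best-fitting direction of the leftover data is PC2 (up to sign).
_, _, Vt_res = np.linalg.svd(residual, full_matrices=False)
pc2_from_residual = Vt_res[0]

# A candidate "second line" that stays close to PC1.
near_pc1 = pc1 + 0.1 * pc2
near_pc1 /= np.linalg.norm(near_pc1)

# Leftover variance captured by each direction.
var_left_pc2 = np.var(residual @ pc2)
var_left_near = np.var(residual @ near_pc1)

print("PC1 . PC2:", pc1 @ pc2)
print("|PC2 . PC2 from residual|:", abs(pc2 @ pc2_from_residual))
print("leftover variance along PC2:", var_left_pc2)
print("leftover variance along near-PC1 line:", var_left_near)
```

In this setup the near-PC1 line looks good on the raw data only because it repeats what PC1 already describes; once that is removed, it captures a small fraction of what PC2 does.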