I have a panel dataset (firms) in which some variables are "choice" variables (that is, they reflect executives' free choices as well as objective characteristics of the firms that may affect those choices). Therefore, in the first stage, I regress the "choice" variables on other independent variables. In the second stage, I am supposed to use the results of the first stage as new independent variables instead of the original "choice" variables.

Now for my question. In all the books that I have read, the standard procedure is to use the predicted values of the choice variables instead of the original values. Here is what I don't understand. The whole idea of the first stage is to isolate the free choice of the decision makers and separate it from the firms' characteristics. I want to know how the choice to invest more or less than what is expected for a specific firm affects its performance. However, if I plug in the predicted values from Stage 1, I get the expected values of investment. In other words, I get the investment that each specific firm would be expected to make given its characteristics. Therefore, the free choice of the executives is thrown out, and it is this free choice that I am interested in!
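
To make the two options concrete, here is a minimal sketch of what I mean, on simulated data. Everything in it is an assumption for illustration only: the variable names (`z`, `invest`, `perf`), the coefficients, and the single-cross-section setup are made up and do not come from my panel.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical simulated data (not my actual panel):
# z      = an objective firm characteristic
# invest = the "choice" variable
# perf   = firm performance
z = rng.normal(size=n)
discretion = rng.normal(size=n)  # stands in for the executives' free choice
invest = 1.0 + 2.0 * z + discretion
perf = 0.5 + 1.5 * invest + rng.normal(size=n)


def ols(y, X):
    """Return OLS coefficients of y on X (X must already include a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Stage 1: regress the choice variable on the firm characteristic.
X1 = np.column_stack([np.ones(n), z])
b1 = ols(invest, X1)
invest_hat = X1 @ b1                # expected investment given characteristics
invest_resid = invest - invest_hat  # deviation from the expected investment

# Stage 2, option A: what the textbooks prescribe (predicted values).
b_fitted = ols(perf, np.column_stack([np.ones(n), invest_hat]))

# Stage 2, option B: what I think I want (residuals).
b_resid = ols(perf, np.column_stack([np.ones(n), invest_resid]))

print("slope on predicted values:", b_fitted[1])
print("slope on residuals:       ", b_resid[1])
```

By construction, `invest_hat` and `invest_resid` add up to `invest` and are uncorrelated in the sample, so the two second-stage regressions use entirely different parts of the variation in investment.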

Please let me know if my reasoning is correct. If it is, I should be using the residuals from the first stage instead of the original values. I want to know why all textbooks mandate the use of the predicted values in Stage 2 instead of the residuals. Thank you for your help!