Excluding a variable from a proxy


Hi all,
I have a question about using a proxy and whether a variable can be excluded from it. I am studying the effect of litigation risk on conservatism, and I am currently using a litigation risk model from Kim and Skinner (2012). One of the variables in their specification (firm size) overlaps with a variable I used to compute the dependent variable. Conservatism and litigation risk are positively associated. Firm size and conservatism are negatively associated, whereas firm size and litigation risk are positively associated. Because firm size raises the litigation risk proxy but is negatively related to conservatism, I am afraid that these opposing associations might bias the estimated effect of litigation risk. My results are different when I use an industry-based proxy instead, but various papers express concern that industry classification alone has relatively poor explanatory power for litigation risk.

Since firm size is only one component of the proxy, I am wondering whether it is statistically appropriate to exclude it from the litigation risk proxy in order to mitigate the issues described above.
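
To make the concern concrete, here is a minimal simulated sketch in Python (NumPy only). Everything in it is an assumption for illustration: the data are simulated, the coefficients are invented, `other` is a hypothetical stand-in for the non-size inputs of the litigation model, and the weights are not the Kim and Skinner (2012) estimates. It only compares the slope on a proxy that includes firm size with the slope on a proxy that leaves firm size out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated inputs (hypothetical, not real data).
size = rng.normal(size=n)    # firm size
other = rng.normal(size=n)   # stand-in for the non-size litigation determinants

# Two versions of the litigation risk proxy (weights are invented).
lit_full = 0.5 * size + 0.5 * other   # proxy including firm size
lit_no_size = 0.5 * other             # proxy with firm size excluded

# Assumed data-generating process: conservatism rises with litigation risk
# and, separately, falls with firm size.
conservatism = 1.0 * lit_full - 0.8 * size + rng.normal(size=n)

def ols_slope(y, x):
    """Return the OLS slope from regressing y on a constant and x."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

b_full = ols_slope(conservatism, lit_full)
b_no_size = ols_slope(conservatism, lit_no_size)

print(f"slope on proxy including size: {b_full:.2f}")
print(f"slope on proxy excluding size: {b_no_size:.2f}")
```

Under these invented assumptions (in particular, firm size simulated as independent of the other inputs), the slope on the full proxy comes out well below the slope on the size-free proxy, which is the kind of distortion I am worried about. I realize this does not show whether dropping firm size is appropriate in my actual data, where firm size is likely correlated with the other inputs.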

Thank you in advance for your reply!