Dear TalkStatstters,
I want to investigate possible wage gaps between different groups of people with a doctorate degree in my region. I have in mind a regression where the dependent variable is the log of the ratio of two average wages.
I have administrative data from the whole population of new doctorates, in a given year, from my region.
I also have survey data from a sample of this same population (the whole population was invited to participate in the survey, but only around 65% of doctorate holders responded; this is the sample I should work with).
There will be self-selection bias if the nonresponse is not random, so my estimates won't have external validity.
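To make the concern concrete, here is a minimal sketch in Python on simulated data. Every number in it is made up, and the `field` covariate, the response probabilities, and the helper `log_wage_ratio` are hypothetical names for illustration, not taken from my data. It assumes that nonresponse depends only on a variable that is observed in the administrative records, and it uses cell-based inverse response-rate weighting, which is just one possible adjustment and not something I have validated for this problem.

```python
import math
import random

random.seed(0)

# Simulated "administrative" population: every value here is made up.
N = 40_000
population = []
for _ in range(N):
    group = random.randint(0, 1)              # the two groups being compared
    p_field = 0.7 if group == 1 else 0.3      # hypothetical admin covariate
    field = 1 if random.random() < p_field else 0
    wage = 30_000 + 20_000 * field + 3_000 * group + random.gauss(0, 5_000)
    # Response depends on the covariate, so nonresponse is not random.
    responded = random.random() < (0.85 if field == 1 else 0.45)
    population.append((group, field, wage, responded))


def log_wage_ratio(rows, weights=None):
    """Log of (weighted mean wage of group 1 / weighted mean wage of group 0)."""
    if weights is None:
        weights = [1.0] * len(rows)
    total = {0: 0.0, 1: 0.0}
    weight_sum = {0: 0.0, 1: 0.0}
    for (group, _field, wage, _resp), w in zip(rows, weights):
        total[group] += w * wage
        weight_sum[group] += w
    return math.log((total[1] / weight_sum[1]) / (total[0] / weight_sum[0]))


respondents = [row for row in population if row[3]]

# Response rate per (group, field) cell. This is computable only because the
# administrative data cover respondents and nonrespondents alike.
cell_all, cell_resp = {}, {}
for group, field, _wage, responded in population:
    key = (group, field)
    cell_all[key] = cell_all.get(key, 0) + 1
    if responded:
        cell_resp[key] = cell_resp.get(key, 0) + 1

# Each respondent is weighted by the inverse of the response rate in their cell.
weights = [cell_all[(g, f)] / cell_resp[(g, f)] for g, f, _w, _r in respondents]

truth = log_wage_ratio(population)          # unknown in practice
naive = log_wage_ratio(respondents)         # respondents only, unweighted
weighted = log_wage_ratio(respondents, weights)
response_rate = len(respondents) / N

print(f"response rate:         {response_rate:.3f}")
print(f"population log ratio:  {truth:.4f}")
print(f"respondents, naive:    {naive:.4f}")
print(f"respondents, weighted: {weighted:.4f}")
```

In this toy setup the unweighted respondent estimate drifts away from the population value, while the weighted one stays close to it. That only works here because the simulated response mechanism depends on a variable I can see for everyone; if response also depends on the wage itself, this kind of weighting would not be enough.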
Any input on how to tackle this issue? Any literature on it?