# How to interpret relationship between variables when one value has a disproportionate number of samples

#### deez_horvath

##### New Member
Hello all!

I am at the beginning of a statistics project in my master's course, which will focus on regression analysis later on. We have not yet done any regression analysis, but we have been asked to turn in a paper on initial findings and analysis, and I am having a lot of trouble working out exactly how to interpret my results.

We are doing a study on how religious beliefs in Africa affect position on the political scale (0 being left-wing, 10 being right-wing). Our main assumption was that those who are more religious will have more right-wing beliefs. The issue I am running into, however, is that because an overwhelming majority in Africa finds religion extremely important, I have very large samples for those who find religion important (so more precise estimates) and much smaller samples for the lower end of the scale, and I am therefore having trouble interpreting my results.

For example, the variable for abortion is a scale of 1 to 10, with 1 being the belief that abortion is never justifiable and 10 being the belief that it is always justifiable. Below I have computed the mean political scale for each category.

| Justifiable: abortion | Mean political scale | Std. Dev. | Freq. |
|---|---|---|---|
| never | 5.4241474 | 2.9079906 | 5,102 |
| 2 | 5.7409949 | 2.2989203 | 583 |
| 3 | 5.3273381 | 2.3997685 | 278 |
| 4 | 5.1809045 | 2.3478848 | 199 |
| 5 | 4.9976526 | 2.3629481 | 426 |
| 6 | 5.25625 | 2.5084684 | 160 |
| 7 | 5.5490196 | 2.3531207 | 102 |
| 8 | 5.6046512 | 2.5862945 | 86 |
| 9 | 6.0365854 | 2.6871671 | 82 |
| always | 5.8351648 | 2.9861299 | 91 |
| Total | 5.4265016 | 2.7815021 | 7,109 |

I had assumed that the mean political score would be higher (more right-wing) in the categories where abortion is never or almost never justifiable, i.e. 1-5. And this is true: you can see above that the mean political score decreases (becomes more left-wing) as abortion becomes more acceptable. HOWEVER, this trend stops at category 5, after which the mean political score starts to go up (more right-wing) as abortion becomes more and more acceptable. I hypothesized that this might just be because I have so few observations for the people who believe abortion is acceptable (and therefore wider 95% confidence intervals), and that maybe my assumption is still correct, but I'm not really sure what I can say here.
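To put numbers on the confidence-interval point, here is a minimal Python sketch (not Stata, purely for illustration) that computes an approximate 95% interval for each category mean from the summary table. It assumes a normal approximation with z = 1.96 and simple random sampling, which may not hold for the actual survey design, and it assumes the categories are coded 1 ("never") to 10 ("always"). The names `rows` and `ci95` are made up for this example.

```python
import math

# (category, mean, std. dev., freq.) copied from the table above.
# Assumption: "never" is coded 1 and "always" is coded 10.
rows = [
    (1, 5.4241474, 2.9079906, 5102),
    (2, 5.7409949, 2.2989203, 583),
    (3, 5.3273381, 2.3997685, 278),
    (4, 5.1809045, 2.3478848, 199),
    (5, 4.9976526, 2.3629481, 426),
    (6, 5.25625, 2.5084684, 160),
    (7, 5.5490196, 2.3531207, 102),
    (8, 5.6046512, 2.5862945, 86),
    (9, 6.0365854, 2.6871671, 82),
    (10, 5.8351648, 2.9861299, 91),
]


def ci95(mean, sd, n, z=1.96):
    """Approximate 95% CI for a group mean (normal approximation)."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se


# Map each category to its (lower, upper) interval.
intervals = {cat: ci95(mean, sd, n) for cat, mean, sd, n in rows}

for cat, mean, sd, n in rows:
    lo, hi = intervals[cat]
    print(f"category {cat:2d}  n={n:5d}  mean={mean:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

Comparing which intervals overlap, for example category 5 against categories 9 and 10, gives a rough sense of whether the upturn could plausibly be sampling noise, though overlapping intervals are only an informal check and not a formal test.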

This happens with all of my variables (importance of God, confidence in churches, etc.), and I just don't know how to interpret this or whether I can say there is a correlation, given the reversal of the trend.
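As a rough way to see what a single linear correlation would report here, the sketch below recovers the pooled least-squares slope and Pearson correlation of political scale on the abortion category directly from the table. Because everyone in a category shares the same abortion value, the covariance with the political scale depends only on the group means and frequencies. Assumptions: categories coded 1 to 10, the scale treated as numeric, no survey weights, and the total standard deviation taken from the "Total" row. The variable names are hypothetical.

```python
import math

# (category, mean political scale, freq.) from the table; never = 1, always = 10 (assumed coding).
groups = [
    (1, 5.4241474, 5102), (2, 5.7409949, 583), (3, 5.3273381, 278),
    (4, 5.1809045, 199), (5, 4.9976526, 426), (6, 5.25625, 160),
    (7, 5.5490196, 102), (8, 5.6046512, 86), (9, 6.0365854, 82),
    (10, 5.8351648, 91),
]
sd_y_total = 2.7815021  # std. dev. of political scale from the "Total" row

N = sum(n for _, _, n in groups)
xbar = sum(n * x for x, _, n in groups) / N  # mean abortion category
ybar = sum(n * m for _, m, n in groups) / N  # overall mean political scale

# Sums of cross-products and squares; within a group x is constant,
# so only the group means of y contribute to the cross-product.
sxy = sum(n * (x - xbar) * (m - ybar) for x, m, n in groups)
sxx = sum(n * (x - xbar) ** 2 for x, _, n in groups)

slope = sxy / sxx                      # pooled OLS slope of y on x
sd_x = math.sqrt(sxx / (N - 1))        # sample std. dev. of x
r = slope * sd_x / sd_y_total          # Pearson correlation

print(f"N={N}, xbar={xbar:.4f}, ybar={ybar:.4f}")
print(f"pooled slope={slope:.5f}, correlation r={r:.5f}")
```

A pooled figure like this is dominated by the largest category and only measures a straight-line trend, so a down-then-up pattern can partly cancel out in it; it is meant as a sanity check alongside the per-category means, not as a replacement for them.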

I would really, really appreciate it if someone could help me out. I am sorry this is such a juvenile question. I tried to ask my professor, and he said that he is not going to help us with this part. I also tried to look for answers on the internet but couldn't find anything that matched this situation. I asked my friend who works in Stata, and he said maybe weights would help, but he's not particularly sure in this situation.

Thank you so much for taking the time to read this and to anyone who takes the time to reply!