# How to analyze UNPAIRED data for correlation/regression

I am a doctoral student struggling with the analysis of one of my research questions. I have two samples. One sample responded to one instrument while the other responded to a completely different instrument. I'm trying to figure out whether the responses of one sample are related to the responses of the other sample.

Since neither sample completed the other's instrument, the data are unpaired. Initial analysis yielded warning messages in SPSS. A friend suggested I pair the data with zeros, which at least provided some output and no error or warning messages! I've worked with the statistics department at my university and my dissertation chair, but the analyses they suggested I perform leave me with very weird results.

Is this analysis even possible? If so, how? I appreciate any guidance as I'd really like to finish this up soon!

You cannot perform correlation or regression without paired data. The only thing that you can do with this type of data is to perform a 2-sample t-test (or equivalent) to determine whether there is a statistically significant difference in means. However, any significant differences will not tell you whether the difference was in the instruments or in the samples tested unless you had randomly assigned the samples to the instruments.
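As a rough illustration of the two-sample comparison mentioned above, here is a minimal Python sketch of Welch's t statistic, one common "equivalent" that does not assume equal variances. The function name `welch_t`, the group names, and the scores are hypothetical; the values are synthetic and only the sample sizes (67 and 62) come from later in this thread. It computes the statistic and approximate degrees of freedom but not a p-value.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Synthetic scores for illustration only; these are NOT real study data.
random.seed(0)
group_a = [random.gauss(5.0, 1.0) for _ in range(67)]
group_b = [random.gauss(5.5, 1.0) for _ in range(62)]

t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

Note that comparing means only makes sense when both groups are measured on the same scale, and, as stated above, a difference cannot be attributed to the instrument or to the sample without random assignment.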

Thank you for your response! What about a "dummy" value to pair it with? Would that work?

I've read other research articles that discuss how the attributes of one population affect a different attribute of another one. I don't know where these articles are at this point so I can't refer to them.

Could you please tell us the topic of your research, and the exact research question?
How large are these samples? What are these instruments, and what do they measure?
Which analysis did you perform, and which warning message did you receive?
Could you please tell us what the statistics department and your dissertation chair suggested,
how you carried out the corresponding analyses, and what the "weird results" looked like?

I'm researching the topic of nursing empowerment.

How large are these samples? One sample contains 67 results, the other contains 62 results.

What are these instruments, and what do they measure? I'm using the Psychological Empowerment Instrument developed by Spreitzer (1995) for one sample (nurses participating in a specific course) and, for the other sample, the Status and Promotion of Professional Nursing Practice - Part II developed by Carlson-Catalano (1988), which measures the empowering teaching behaviors used by the instructors of the course.

Which analysis did you perform, and which warning message did you receive? I performed a bivariate correlation. The warning message stated the CI couldn't be computed because the number of valid cases did not exceed 3. Which makes sense since there is no paired data to correlate.

My chair (not a statistician, but a PhD-prepared nurse) suggested I contact the stats department. The stats department didn't really help; they sent me a few references that didn't apply. A friend suggested I use a "dummy" value of "0" so that the data could be paired. This yielded a correlation of -.857. This makes sense because each value is being compared to 0. I also tried using the mean for each of the values as the paired value, but got the same result. I just don't think this is right. Any guidance would be appreciated. Thank you!

Carol
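To see why pairing with zeros can produce a strong negative correlation no matter what the scores are, here is a small simulation sketch in Python. It assumes a layout the thread does not confirm: one row per respondent, with the real score in one column and a 0 entered in the column for the instrument that respondent did not complete. The scores are synthetic and drawn independently, so any correlation that appears comes from the padding alone. The names `pearson_r`, `pei`, and `etb` are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic, independently generated scores; NOT real study data.
random.seed(1)
pei = [random.gauss(5.5, 0.8) for _ in range(67)]
etb = [random.gauss(4.0, 0.6) for _ in range(62)]

# Assumed layout: each row holds one real score and a 0 for the other instrument.
x = pei + [0.0] * len(etb)
y = [0.0] * len(pei) + etb

r_padded = pearson_r(x, y)
print(f"correlation after zero-padding: {r_padded:.3f}")
```

Under this layout, every row is either "high on x, zero on y" or "zero on x, high on y", which forces a negative association. The coefficient therefore describes the padding scheme, not any relationship between the two instruments.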

Ok, now I know what your research topic is, but I still do not know what your exact research question is.
In particular, I do not know why you used these two instruments, or why you used them with 2 different classes
(although that might be a matter of the study design, not of the research question).

Could your research question be something like "participants who experience more empowering behaviors
by their instructors score higher on the psychological empowerment instrument"?

This is my specific research question:

Is there a relationship between empowering teaching behaviors (ETB) scores of instructors and PE post-test scores of the participants?

I want to use the PEI scores as the dependent variable and the ETB scores as the independent variable.

Ok, thank you. So indeed you chose a research design which is not able to answer your research question.
It is a bit surprising that your instructor did not tell you that you need to measure both variables in the
same subjects. Do you have any opportunity to collect the missing data (ETB in one class, PEI in the other)?

With kind regards

Thank you again for your quick response. I'm disappointed that neither my chair nor the stats department guided me to a different study design based on my research questions. I have PEI data on the instructors. One of my questions does ask whether there is a relationship between ETB and PEI among instructors. I can't collect any more data and the ETB isn't for student observations of ETB. It is self-reported from the instructors. Thank you again for helping me. I'll discuss this with my chair and stats department and see where they want me to go with this research question...

I found the PEI survey online: https://webuser.bus.umich.edu/spreitze/Pdfs/EmpowerInstrument.pdf But I haven't had any luck finding the ETB online. You say that the ETB is self-reported "from the instructors". Does this mean that each instructor has a PEI survey and those same instructors have an associated self-reported ETB? In this case, I don't see why you couldn't relate these. Maybe I am missing something.
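If each instructor really does have both a PEI score and a self-reported ETB score, the data are paired and a correlation or simple regression becomes possible. The following Python sketch shows the calculation, with ETB as the predictor and PEI as the outcome, as in the research question stated earlier. The number of instructors (20), the scores, and the names `pearson_and_ols`, `etb_self`, and `pei_self` are hypothetical; the data are synthetic.

```python
import math
import random


def pearson_and_ols(x, y):
    """Return Pearson r plus the slope and intercept of the least-squares line y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = my - slope * mx
    return r, slope, intercept


# Synthetic paired scores: row i holds BOTH scores for hypothetical instructor i.
random.seed(2)
etb_self = [random.gauss(4.0, 0.6) for _ in range(20)]
pei_self = [3.0 + 0.5 * e + random.gauss(0.0, 0.5) for e in etb_self]

r, slope, intercept = pearson_and_ols(etb_self, pei_self)
print(f"r = {r:.3f}, PEI = {intercept:.3f} + {slope:.3f} * ETB")
```

The key requirement is that position `i` in both lists refers to the same person. That is exactly what is missing when the two instruments are completed by different samples.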