Homework Help

#1
Hello!

We're currently studying regression in our basic statistics class. My professor gave each of us in the class a set of data and we were tasked with describing the relationship between the two variables within.

The two variables are Patient Satisfaction (KPasTot) and Evaluation of Nurse's Therapeutic Communication Skills (KTerTot).

To be honest, I'm still not entirely sure what I'm doing. I looked at the data and neither looks to be normal.
[Attached images: 1581774456572.png, 1581774435967.png]
I then performed Pearson and Spearman's rho correlation tests (I think Spearman's is for non-parametric data?) and both returned a significant positive correlation between the two variables. So then I plotted the data and got this ugly mess.

[Attached image: scatterplot with linear fit line, 1581774620343.png]
Those points are all over the place but they look to GENERALLY be following that linear fit line? Would I just describe the relationship as significantly positive, report the correlation coefficient and significance, and include the equation for the fit line and its R^2? I feel like describing the relationship as "significantly positively correlated" doesn't properly explain the relationship between the two, but (based on the super simple analysis I've done) I don't see a clear relationship between them.

I'm not looking for answers but how would you approach this question? And what would you do if faced with a scatterplot like the one above? I want to understand the thinking process behind the analysis and reporting more than to know the "correct" answer to this assignment.

Thanks in advance for anyone nice enough to help with suggestions or advice.
 

Karabiner

TS Contributor
#2
I then performed Pearson and Spearman's rho correlation tests (I think Spearman's is for non-parametric data?)
There is no such thing as non-parametric data. There are non-parametric tests (tests which do not assume certain distributional properties of the data or of the prediction errors). Spearman is used for rank data (or for interval data transformed into ranks). Personally, I would use both Spearman (a coefficient for the degree of monotonicity of the association) and Pearson (degree of linear association) to describe the relationship in the sample.
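
To illustrate the difference between the two coefficients, here is a minimal sketch in plain Python. The data are made up (not the KPasTot/KTerTot values from the assignment), and the function names `pearson`, `average_ranks` and `spearman` are only illustrative. Spearman is computed here as Pearson applied to the ranks, with tied values sharing their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: degree of linear association."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank the values from 1..n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up example: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print("Pearson :", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

In this toy example the relationship is perfectly monotone, so Spearman reaches its maximum while Pearson stays below it because the points do not lie on a straight line.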

It is interesting to note that both variables show high frequencies at the most extreme
positive value, although I do not know what to make of this.
So then I plotted the data and got this ugly mess.
I do not think it is ugly. The R² indicates that there is a large correlation between the
variables, which is more or less reflected in the scatterplot.

Would I just describe the relationship as significantly positive, report the correlation coefficient and significance, and include the equation for the fit line and its R^2?
Sounds ok.
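
For the fit line and its R², a minimal sketch of an ordinary least-squares fit in plain Python follows. The scores are invented for illustration (they are not the assignment data), and `fit_line`, `ter_scores` and `pas_scores` are hypothetical names.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus its R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Invented scores, only to show what would be reported.
ter_scores = [20, 25, 25, 30, 35, 40, 40, 45]   # predictor (e.g. KTerTot)
pas_scores = [30, 28, 40, 35, 50, 42, 55, 52]   # outcome (e.g. KPasTot)

slope, intercept, r2 = fit_line(ter_scores, pas_scores)
print(f"fit line: y = {intercept:.2f} + {slope:.2f} * x")
print(f"R^2 = {r2:.3f}")
```

With a single predictor, this R² is the square of the Pearson correlation, so the two numbers in the report should agree with each other.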

Just my 2pence

Karabiner
 
#3
Thanks a lot for your input.

I obviously have a lot more learning to do on this topic and I really appreciate your help steering me in the right direction :)

I decided to graph how the average of the Communication Therapeutic Skills related to the Patient Satisfaction and got a graphic that shows a much clearer relationship between the two variables.

[Attached image: 1581850086464.png]

Is this a valid way of going about looking at the data? I understand the linear fit line no longer describes the raw results but rather the average value of the samples. How valuable is this graphic in describing the general relationship between the two variables?
 

Karabiner

TS Contributor
#4
Unfortunately, I have no idea what you actually did there. "the average of the Communication Therapeutic Skills related to the Patient Satisfaction" What average? There is only 1 average in the sample, but you display a graph.

With kind regards

Karabiner
 

Dason

Ambassador to the humans
#5
Looks like they averaged over any replicated values. You would still want to do the regression on the original data though.
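
A small sketch of why the averaged plot looks so much cleaner, assuming the averaging was done as described (one mean y per distinct x value). The data are a made-up toy example and the names `fit_line` and `group_means` are illustrative only.

```python
def fit_line(x, y):
    """Least-squares slope, intercept and R^2 for y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

def group_means(x, y):
    """Average the y values that share the same x value."""
    groups = {}
    for a, b in zip(x, y):
        groups.setdefault(a, []).append(b)
    xs = sorted(groups)
    return xs, [sum(groups[a]) / len(groups[a]) for a in xs]

# Toy data: every x value occurs twice, with different y values.
x = [1, 1, 2, 2, 3, 3]
y = [1, 3, 2, 4, 3, 5]

raw_slope, raw_intercept, raw_r2 = fit_line(x, y)
xm, ym = group_means(x, y)
avg_slope, avg_intercept, avg_r2 = fit_line(xm, ym)

print("raw data      : slope", round(raw_slope, 3), "R^2", round(raw_r2, 3))
print("averaged data : slope", round(avg_slope, 3), "R^2", round(avg_r2, 3))
```

In this toy case the fitted slope is the same either way, but averaging discards the spread of individual responses around each mean, so the R² of the averaged plot overstates how well one variable predicts the other for an individual patient.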