# Help with choosing the right method

I'm new to stats as you will probably become aware.

Just need a little help to understand which statistical method should be
used to test a hypothesis.

The work centres around a surgical endoscopy procedure for the taking of biopsy samples. The hypothesis states that with an increase in the number of procedures carried out there should be a lowering in the time the procedure takes and the number of attempts to take a clean biopsy.

12 procedures have been observed and the data collected as follows.

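| Procedure | Time taken (min) | Attempts |
|---|---|---|
| 1 | 90 | 3 |
| 2 | 80 | 2 |
| 3 | 90 | 3 |
| 4 | 60 | 2 |
| 5 | 60 | 2 |
| 6 | 60 | 2 |
| 7 | 90 | 3 |
| 8 | 60 | 2 |
| 9 | 60 | 3 |
| 10 | 60 | 4 |
| 11 | 30 | 2 |
| 12 | 20 | 2 |
| Mean | 63.3 | 2.5 |
| Mode | 60 | 2 |
| Median | 60 | 2 |
| Std. deviation | 19.79 | 0.622 |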

Any advice on the method I should use to prove/disprove the hypothesis would be appreciated.

It was suggested that I use a chi-square test, which I did, and got the following: df = 11, chi-square = 4.3140201607494. At the .05 significance level the critical value is 19.68, so the result is not significant (p <= 1).

I don't know if this was the right test or if it is the start of another but any help would be appreciated.

Andrew

I would do two regressions, with Procedure as the x or independent variable, and Time Taken and Attempts as the y or dependent variables.

If Time and Attempts are in fact decreasing, the regression lines should slope downward as Procedure "increases." Additionally, test whether r and the slope of the line are significantly less than 0.
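A minimal sketch of the two regressions suggested above, using only the values from the table in the first post. Because the hypothesis is a decrease, the sketch checks whether each slope is significantly below zero with a one-sided t-test at the .05 level. The function names `linear_fit` and `t_statistic` are made up for illustration, and `T_CRIT` is the standard table value for 10 degrees of freedom. Results computed from the posted table need not match figures quoted elsewhere in the thread, which may have come from a different data file.

```python
from math import sqrt

# Data copied from the table in the original post.
procedure = list(range(1, 13))
time_min = [90, 80, 90, 60, 60, 60, 90, 60, 60, 60, 30, 20]
attempts = [3, 2, 3, 2, 2, 2, 3, 2, 3, 4, 2, 2]


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def t_statistic(r, n):
    """t statistic for testing a zero correlation (same as zero slope); df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# One-sided 5% critical value of Student's t with 10 degrees of freedom
# (taken from standard tables; valid only for n = 12 observations).
T_CRIT = 1.812

results = {}
for name, y in [("time", time_min), ("attempts", attempts)]:
    slope, intercept, r = linear_fit(procedure, y)
    t = t_statistic(r, len(y))
    results[name] = (slope, intercept, r, t)
    # A significant decrease means t falls below the negative critical value.
    decreasing = t < -T_CRIT
    print(f"{name}: slope={slope:.3f} intercept={intercept:.2f} "
          f"r={r:.3f} t={t:.2f} significant decrease: {decreasing}")
```

With only 12 observations, and with Attempts taking just three distinct values, the usual regression assumptions are hard to check, so the output is best read as a rough indication rather than proof.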

OK, I did the linear regression and got these results, but I am confused as to their meaning.

slope m= -5.673758865
y-int, b= 100.212766
r= -0.911211378

Sorry, I should say that the regression was just on the procedure and time values.

Andrew

Can you attach the file rather than the image? It's too small to read.

OK. Also included is the number of attempts, which to be honest I think is out of statistical control.

Thanks for all the help

slope m= -5.673758865
The slope represents the change in y for each unit change in x. In this case, it means that for each successive procedure, the time should be reduced, on average, by about 5.7 minutes.

y-int, b= 100.212766
The y-intercept is where the regression line crosses the y-axis; in other words, how long a procedure would take if x=0. However, in your case, the lowest meaningful x value is 1, so you may want to estimate where the regression line intersects the vertical line x=1. That would give you the estimated time, on average, that it takes to complete the very first procedure.

r= -0.911211378
r measures the strength and direction of the linear relationship between x and y, and can range from -1.0 to +1.0. A value of -0.91 indicates a strong negative relationship - i.e., as x increases, y tends to decrease (as the number of procedures increases, the time taken to complete them tends to decrease).
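As a rough worked example of the interpretation above, the quoted slope and intercept can be plugged into the line y = b + m*x to estimate the time at x=1. The helper name `predicted_time` is made up for illustration. The squared correlation is included as an assumption-laden extra: it is commonly read as the share of the variation in time that the fitted line accounts for.

```python
# Values quoted in the thread for the regression of time on procedure number.
slope = -5.673758865      # minutes per additional procedure
intercept = 100.212766    # fitted time at x = 0 (not a real procedure)
r = -0.911211378


def predicted_time(procedure_number):
    """Time in minutes that the fitted line gives for a procedure number."""
    return intercept + slope * procedure_number


first = predicted_time(1)     # fitted time for the first procedure, about 94.5 min
twelfth = predicted_time(12)  # fitted time for the twelfth procedure, about 32.1 min
r_squared = r ** 2            # about 0.83

print(f"first: {first:.1f} min, twelfth: {twelfth:.1f} min, r^2: {r_squared:.2f}")
```

Extending the line much beyond procedure 12 would eventually predict zero or negative times, so the fit is only sensible within the observed range.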

Great thanks for all the help John.

I've even attempted a chi-squared test with a significant result which I've attached below.

Is this helpful for this exercise?

Andrew

Chi squared tests work best with categorical data - I would stick with regression in your example.