# Statistical Analysis for Master's Thesis

#### may_survey

##### New Member
Hello Everyone,
So this is my first forum post ever, probably because I just couldn't figure out a solution by myself this time. Thank you in advance for your suggestions and help!

I have done stats projects and surveys before, but the response options were usually Yes/No, Strongly agree/Not agree, or some other scale that was common to all the questions. However, this time my survey has, let's say, around 5 questions, and each question has different answer options. For example:

Q1) How many people are in your agile team?
*1-5 *6-10 *11-20 * Even more

Q2) How is your team structured?
*Co-located *Near-located *Far-located

Q3) Every sprint has a defined goal
*True *False *Not sure

Q4) Each sprint for our project has life-span of
*1-2 weeks *3-4 weeks *5-7 weeks *Even more

Q5) How often do you retrospect during agile cycles?
*After every sprint *After every other sprint *Whenever Sprint does not go well

Now, as you can see, the answers are so different. I am having a hard time connecting them all so that I can use the mean, standard deviation, or, say, ANOVA or Pearson correlation to derive results. Do you guys have any idea? Or are graphs the only option for me?

#### CE479

##### New Member
Hi,
The tests that you describe (mean, standard deviation) apply to parametric data, but your data is not parametric.

What you do next with the data depends on your aim. Are you wanting to compare data between groups at all? If not, then you may not need to do anything more than bar graphs or pie charts to display your data.

Hope that helps

#### GretaGarbo

##### Human
> Hi,
> The tests that you describe (mean, standard deviation) apply to parametric data, but your data is not parametric.

There are parametric and non-parametric methods, but I have never heard of “non-parametric data”. I guess people mean that the data are not normally distributed. But there are many other parametric distributions, for example the binomial and the multinomial distribution, which is what may_survey's data above follow.

I suggest that may_survey estimate the proportions (and their standard errors; that is easy, and most software does it), as those are the parameters of the multinomial distribution.

And the sample mean estimates the population mean. What is wrong with that?
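
A minimal sketch of the proportion estimates suggested above, assuming made-up answers to Q1 (the counts are hypothetical, not survey data, and `proportions_with_se` is just an illustrative helper name). Each category's proportion is estimated as count/n, with standard error sqrt(p(1-p)/n):

```python
import math
from collections import Counter

# Hypothetical responses to Q1 (team size); NOT real survey data.
responses = ["1-5"] * 12 + ["6-10"] * 18 + ["11-20"] * 7 + ["Even more"] * 3

def proportions_with_se(answers):
    """Return {category: (proportion, standard_error)} for one question."""
    n = len(answers)
    counts = Counter(answers)
    result = {}
    for category, count in counts.items():
        p = count / n                      # estimated multinomial parameter
        se = math.sqrt(p * (1 - p) / n)    # standard error of that proportion
        result[category] = (p, se)
    return result

estimates = proportions_with_se(responses)
for category, (p, se) in estimates.items():
    print(f"{category:>10}: p = {p:.3f}, SE = {se:.3f}")
```

The same function could be applied to each question separately, since every question has its own set of categories.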

#### CE479

##### New Member
@GretaGarbo- thanks for correcting me. Very useful for my understanding.

#### rogojel

##### TS Contributor
hi,
what are the questions you need to answer using this data?

regards

#### may_survey

##### New Member
Hello Everyone,
@CE479 I have a total of around 40 questions, divided into 4 sections of 10 questions each. Each section focuses on a particular benefit of using Agile methodology for software development. However, since the options are so different, I was wondering whether I can move beyond just pie and bar charts.

@GretaGarbo Can I use a non-parametric test here too? I know we can code ordinal data numerically, but what about my answers, since the options are different for every question? What I can think of is to assign ranks to all the answer options, as below, and then do the calculations. Do you think this is one way?
Q1) How many people are in your agile team?
*1-5 *6-10 *11-20 * Even more
(rank1) (rank2) (rank3) (rank4)

Q2) How is your team structured?
*Co-located *Near-located *Far-located
(rank1) (rank2) (rank3)

Q3) Every sprint has a defined goal
*True *False *Not sure
(rank1) (rank2) (rank3)

Also, I read that I can use a contingency table to derive results. Any thoughts?
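
For reference, a rough sketch of what the contingency-table idea could look like for two of the questions, Q2 against Q3. The paired answers are invented for illustration (not survey data), and `contingency_table` and `chi_square` are hypothetical helper names. The code cross-tabulates the answers and computes the Pearson chi-square statistic with its degrees of freedom:

```python
from collections import Counter

# Hypothetical paired answers (Q2 team structure, Q3 sprint goal); NOT real survey data.
pairs = (
    [("Co-located", "True")] * 14 + [("Co-located", "False")] * 3
    + [("Co-located", "Not sure")] * 3
    + [("Near-located", "True")] * 6 + [("Near-located", "False")] * 4
    + [("Near-located", "Not sure")] * 2
    + [("Far-located", "True")] * 3 + [("Far-located", "False")] * 4
    + [("Far-located", "Not sure")] * 1
)

def contingency_table(pairs):
    """Cross-tabulate (row answer, column answer) pairs into a table of counts."""
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    counts = Counter(pairs)
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under independence
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

rows, cols, table = contingency_table(pairs)
stat, dof = chi_square(table)
print(rows, cols, table)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

The statistic would then be compared with a chi-square distribution with `dof` degrees of freedom; a library such as SciPy (`scipy.stats.chi2_contingency`) also reports the p-value. With expected counts as small as some of these, the chi-square approximation may be unreliable, so merging sparse categories or using an exact test might be worth considering.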

#### may_survey

##### New Member
Also, the questions that I mentioned in my initial post all belong to the same section. So I have to derive results for one section, which contains these 5 questions.