Is the result "no significant difference" (t-Test) of any use for my research thesis?

#1
Hi,

I am not very familiar with statistics, as I am a computer science student. I hope it is alright to ask this question here.

I am currently working towards the end of my thesis. I compared the user experience (UX) of Android apps and Progressive Web Apps (PWAs, basically websites that behave like native apps), and my research question was whether PWAs can keep up with Android apps in terms of UX.
To measure UX I conducted a user study using an existing questionnaire, the User Experience Questionnaire (UEQ), which yields scores on several UX scales.
To find out whether a significant difference exists between the two app types, I performed a t-test for each scale score.
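In case it helps, this is roughly the kind of test I ran for each scale (Welch's variant shown here; the scale name and scores are placeholders, not my actual data):

```python
from scipy import stats

# One mean UEQ score per participant and scale (placeholder values).
android = {"Attractiveness": [1.8, 2.1, 1.5, 2.4, 1.9, 2.2]}
pwa     = {"Attractiveness": [1.6, 2.0, 1.7, 2.3, 1.8, 1.9]}

for scale in android:
    # Welch's t-test: does not assume equal variances in the two groups.
    t, p = stats.ttest_ind(android[scale], pwa[scale], equal_var=False)
    print(f"{scale}: t = {t:.2f}, p = {p:.3f}")
```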

Every calculated p-value was well above the alpha level of 0.05, so I could not reject the null hypothesis for any scale. At first I thought everything was fine -> no statistical difference found -> PWAs can keep up.

But the more I read about this topic, the more I found that a high p-value does not indicate that the groups are equal or that there is no effect. Other sources say that "no statistical significance" just means the null hypothesis cannot be rejected and "anything is possible".

Question: Are the results of the questionnaire of any use for answering my research question, or do I have to say "anything is possible"?

In addition, I conducted a qualitative study, which concluded that both types of apps offer the same UX, but now I am doubting whether I can relate these results to the results of the quantitative questionnaire.
Originally I thought you could use the null hypothesis as a result, but I guess I was wrong. A number of research papers on similar topics simply said "no statistical significance has been found, therefore we conclude that both types of apps were able to ........", which doesn't seem to be statistically sound.
 

Karabiner

TS Contributor
#2
Are the results of the questionnaire of any use for answering my research question, or do I have to say "anything is possible"?
Interpretation depends largely on sample size. Non-significance with a small sample size can often be attributed to poor statistical power.

You could calculate 95% confidence intervals around your parameters (e.g. the mean difference) to get an impression of how variable the results could be if the same study (with the same sample size) were repeated over and over.
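A minimal sketch of such an interval for one scale, using Welch's approximation for unequal variances (the scores are placeholder values, not real data):

```python
import numpy as np
from scipy import stats

def mean_diff_ci(x1, x2, conf=0.95):
    """Confidence interval for mu1 - mu2 (Welch, unequal variances)."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = len(x1), len(x2)
    v1, v2 = x1.var(ddof=1), x2.var(ddof=1)
    se = np.sqrt(v1 / n1 + v2 / n2)
    # Welch-Satterthwaite degrees of freedom
    df = se**4 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    tcrit = stats.t.ppf(0.5 + conf / 2, df)
    diff = x1.mean() - x2.mean()
    return diff - tcrit * se, diff + tcrit * se

android = [1.8, 2.1, 1.5, 2.4, 1.9, 2.2]  # placeholder scores
pwa     = [1.6, 2.0, 1.7, 2.3, 1.8, 1.9]
low, high = mean_diff_ci(android, pwa)
print(f"95% CI for the mean difference: [{low:.2f}, {high:.2f}]")
```

A wide interval that straddles zero tells you the data are consistent both with "no difference" and with a sizeable difference, which is exactly the low-power problem.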

Originally I thought you could use the null hypothesis as a result, but I guess I was wrong.
What you want to achieve seems to call for equivalence testing. Basically, you will have to define how large a difference between apps is permitted so that you can still view them as nearly identical / practically identical / equivalent.
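A minimal sketch of such a test (TOST, two one-sided tests) for independent groups, assuming a hypothetical equivalence margin of 0.5 UEQ scale points (the margin has to be justified on substantive grounds, not derived from the data):

```python
import numpy as np
from scipy import stats

def tost_ind(x1, x2, delta, equal_var=False):
    """TOST for equivalence of two independent means.

    H0: |mu1 - mu2| >= delta  vs  H1: |mu1 - mu2| < delta.
    Shifting one sample by +/-delta turns each one-sided test
    into an ordinary t-test against zero.
    """
    x2 = np.asarray(x2, float)
    # Test 1: is mu1 - mu2 greater than -delta?
    _, p_low = stats.ttest_ind(x1, x2 - delta, equal_var=equal_var,
                               alternative="greater")
    # Test 2: is mu1 - mu2 less than +delta?
    _, p_upp = stats.ttest_ind(x1, x2 + delta, equal_var=equal_var,
                               alternative="less")
    return max(p_low, p_upp)  # equivalence claimed if this is below alpha

android = [1.8, 2.1, 1.5, 2.4, 1.9, 2.2]  # placeholder scores
pwa     = [1.6, 2.0, 1.7, 2.3, 1.8, 1.9]
print(f"TOST p-value (margin 0.5): {tost_ind(android, pwa, delta=0.5):.3f}")
```

If that p-value is below your alpha level, you can positively conclude that any difference between the two app types is smaller than the margin you defined.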

With kind regards

Karabiner
 

Dason

Ambassador to the humans
#4
You can say "There may well be a difference, but we don't have enough evidence to claim that there is."
The issue as I see it is that they want to say there is no difference. So equivalence testing is probably the best route forward for them.