Best test to use for hypothesis

#1
Hello - I am afraid it is a long time since I studied statistics and I am a bit rusty, but any ideas on the following would be gratefully received (the audience for the research would be academic, but not statisticians):

I have data - not a random sample but the full data - on new malt whisky distilleries in Scotland since 2000, and totals of all distilleries before this, by region (e.g. Highland, Lowland, Islands etc.).

If my null hypothesis is that new distilleries follow the existing regional pattern, can I test the alternative hypothesis that a particular region is now more likely to get a new distillery than before, based on this data?

For example, 11 new distilleries in the Lowlands since 2000, out of 14 Lowland distilleries in total, looks like a significant swing to me when there are 9 new in the Highlands and 2 new in Speyside since 2000, out of totals of 38 and 48 respectively. Percentage-wise the increase in the more usual Highlands/Speyside area is clearly much less - but what is the best way/test to show this in terms of significance?

Or is such a test inappropriate for this data?

It is actually the island distilleries I am interested in testing, to see if the increase there is unusual and suggestive of an increased trend towards offshore distilleries. From 2000 to 2018 there were 6 new ones out of a total of 19, against 29 new out of 122 overall. These figures are for malt whisky distilleries only, not grain. There have been more since 2018 too, but I am using this cut-off for now.

I can simply state % increases per region, but as I was once a student of statistics I was wondering if a more interesting test was possible. If anyone has any thoughts on a suitable test I'd be glad to read them. Apologies for the long missive, but I thought it best to give some figures and explain the data.

Iain
 

katxt

Active Member
#2
At first glance it looks like a chi-square situation. You have the observed figures of 11, 9 and 2 new distilleries (Lowlands, Highlands, Speyside). All you need now is how you would have expected those 22 new distilleries to be distributed by chance. And there's the problem.
One possibility is equal shares, about 7.3 in each area. Another is to split the 22 in a ratio reflecting those already there (14:38:48 on your totals), or perhaps according to actual physical area, or by population, or possibly by shortage of distilleries in each area.
In short, my opinion is that you don't have the data here needed to tell whether 11, 9 and 2 is unexpected or not using a statistical test.
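
To make the dependence on the null concrete, here is a rough sketch of the chi-square goodness-of-fit calculation under two of the candidate expectations, using the counts quoted in post #1 (11, 9 and 2 new distilleries in the Lowlands, Highlands and Speyside). It assumes the quoted totals of 14, 38 and 48 include the new distilleries, so the pre-2000 counts are taken as total minus new; that reading, and the helper name `chi_square_gof`, are assumptions made for illustration only. The p-value shortcut is valid only for three categories (2 degrees of freedom).

```python
import math

# Counts quoted in post #1, in the order Lowlands, Highlands, Speyside.
new = [11, 9, 2]        # new distilleries since 2000
totals = [14, 38, 48]   # ASSUMPTION: these totals include the new ones


def chi_square_gof(observed, weights):
    """Pearson goodness-of-fit statistic, expected counts and p-value.

    The p-value uses exp(-x / 2), which is the exact upper tail of the
    chi-square distribution with 2 degrees of freedom only, so this
    hypothetical helper is restricted to three categories.
    """
    if len(observed) != 3 or len(weights) != 3:
        raise ValueError("this sketch handles exactly three categories")
    n = sum(observed)
    w = sum(weights)
    expected = [n * wi / w for wi in weights]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected, math.exp(-stat / 2)


# Null 1: equal shares across the three regions.
stat_equal, exp_equal, p_equal = chi_square_gof(new, [1, 1, 1])

# Null 2: shares proportional to the pre-2000 counts (total minus new).
pre_2000 = [t - n for t, n in zip(totals, new)]
stat_prop, exp_prop, p_prop = chi_square_gof(new, pre_2000)

for label, stat, exp, p in [
    ("equal shares", stat_equal, exp_equal, p_equal),
    ("pre-2000 pattern", stat_prop, exp_prop, p_prop),
]:
    rounded = [round(e, 2) for e in exp]
    print(f"{label}: expected={rounded}, chi2={stat:.2f}, p={p:.3g}")
```

The two nulls give very different statistics, which is the point made above: the conclusion depends on what "expected by chance" is taken to mean. Under the second null the expected Lowland count is below 1, well under the usual rule of thumb of 5, so the chi-square approximation is unreliable there and an exact multinomial calculation would be safer.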
 
#3
Thank you - that's very useful feedback. I had kind of discounted chi-squared, but maybe that is because I am so rusty... I think new malt distilleries have clearly gone into areas where they were not found traditionally (illicit stills apart), but under a hypothesis based on previous experience I would have expected them to gravitate more to other areas, the Lowlands being more renowned for the big grain distilleries.
I could test more than one hypothesis of course. It would be an appendix anyway, as this is not a stats paper, so when I get the time I'll play around with the chi-square test and see what it throws up. Thanks again.
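
For the island question from post #1, one option alongside chi-square is an exact one-sided binomial test. The sketch below assumes the quoted totals (19 island, 122 overall) include the new distilleries, takes the null to be that each of the 29 new distilleries lands on an island with probability equal to the pre-2000 island share, and treats the openings as independent. All three are assumptions rather than anything established in this thread, and `binom_upper_tail` is a hypothetical helper name.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Figures quoted in post #1 for 2000 to 2018.
new_islands, total_islands = 6, 19
new_all, total_all = 29, 122

# ASSUMPTION: totals include the new distilleries, so pre-2000 = total - new.
pre_islands = total_islands - new_islands
pre_all = total_all - new_all

# Null: a new distillery goes to an island with the pre-2000 island share.
p_null = pre_islands / pre_all
expected_islands = new_all * p_null

# One-sided p-value: chance of seeing 6 or more island openings out of 29.
p_value = binom_upper_tail(new_islands, new_all, p_null)

print(f"pre-2000 island share: {p_null:.3f}")
print(f"expected new island distilleries: {expected_islands:.2f}")
print(f"P(X >= {new_islands}) = {p_value:.3f}")
```

Because the data are the full population rather than a sample, the p-value is best read as a description of how surprising 6 out of 29 would be under that particular null, not as an inference about a wider population.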