Need help choosing which test to use

cvda

New Member
#1
Hi there,

I am currently investigating whether a digital display in a store window leads to significantly more people entering the store. Of course it also matters how many people walk by, because more people are likely to enter the store when more people are walking by. So I collected data on how many people walk by and how many enter the store per hour. I would like to test whether the average proportion of people entering the store (visitors/passers-by) differs between condition 1 (no screen) and condition 2 (a screen).

Condition 1 (no screen) shows an average proportion of 0.011804049 of people entering. (This number is calculated as follows: for every opening hour, divide visitors by passers-by to get a ratio, then add up all these ratios and divide by the total number of hours.) So in an average hour about 1.18 out of every 100 passers-by walk in. The data cover one month.

Condition 2 (with screen) shows an average proportion of 0.017454835 (same calculation) of people entering. So in an average hour about 1.745 out of every 100 passers-by walk in. The data cover one month.
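As a minimal sketch of the calculation described above (the hourly counts are made up for illustration, not the store data, and the function names are my own), the average of hourly ratios can be computed like this. It generally differs from the pooled ratio (total visitors / total passers-by), because every hour gets equal weight no matter how busy it was.

```python
# Hypothetical hourly counts as (passers-by, visitors); not the real store data.
hours = [(120, 2), (300, 3), (80, 1), (500, 4)]

def mean_of_hourly_ratios(hours):
    """Average of visitors/passers-by over hours, each hour weighted equally."""
    ratios = [v / p for p, v in hours if p > 0]  # skip hours with no passers-by
    return sum(ratios) / len(ratios)

def pooled_ratio(hours):
    """Total visitors divided by total passers-by over all hours."""
    total_passersby = sum(p for p, _ in hours)
    total_visitors = sum(v for _, v in hours)
    return total_visitors / total_passersby

print(mean_of_hourly_ratios(hours))  # quiet hours count as much as busy ones
print(pooled_ratio(hours))           # busy hours dominate
```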

Now I want to test whether the difference between these average proportions (i.e. 0.011 and 0.017) is statistically significant. Can I just use an independent-samples t-test for that, or not?

If I do, the result is p < 0.001.

Thank you :)

Btw. data looks normally distributed
 


Karabiner

TS Contributor
#2
So you have n1 people in condition 1, of whom 0.0118*n1 walked in while (n1-0.0118*n1) did not walk in.
And you have n2 people in condition 2, of whom 0.0174*n2 walked in (etc.). You can create a 2*2 table
"condition"*"walk in/don't walk in" and use Chi² as a test of statistical significance.
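A minimal sketch of that 2*2 test, assuming plain Pearson Chi² without continuity correction; the function name and the totals in the usage line are hypothetical, not the store data. With 1 degree of freedom the p-value can be written with the complementary error function, so no statistics package is needed.

```python
import math

def chi2_2x2(entered1, passed1, entered2, passed2):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    passedN  = total number of passers-by in condition N
    enteredN = how many of those walked in
    """
    table = [[entered1, passed1 - entered1],
             [entered2, passed2 - entered2]]
    row_tot = [sum(row) for row in table]
    col_tot = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_tot)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical totals: 118 of 10000 entered without display, 174 of 10000 with.
chi2, p = chi2_2x2(118, 10000, 174, 10000)
print(chi2, p)
```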

But if your design is matched in some way (say, you compare Monday 8-9 o'clock with display with next week's
Monday 8-9 o'clock without display etc.), one could maybe think about a dependent samples t-test with
hourly rate as the dependent variable.
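A rough sketch of the dependent samples t-test on matched hourly rates; the rates and the function name are hypothetical. It only returns the t statistic and the degrees of freedom. The p-value would come from the t distribution with that many degrees of freedom, for example via `scipy.stats.ttest_rel` if SciPy is available.

```python
import math
from statistics import mean, stdev

def paired_t(rates_a, rates_b):
    """t statistic and degrees of freedom for the paired differences (b - a)."""
    if len(rates_a) != len(rates_b):
        raise ValueError("need one rate per matched hour in each condition")
    diffs = [b - a for a, b in zip(rates_a, rates_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # stdev uses n - 1
    return mean(diffs) / standard_error, n - 1

# Hypothetical proportions entering for four matched hours.
no_display = [0.010, 0.012, 0.011, 0.013]
with_display = [0.015, 0.018, 0.014, 0.019]
t, df = paired_t(no_display, with_display)
print(t, df)
```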

Btw. data does not look normally distributed, but whether unconditional sample data are normally distributed
is irrelevant for a t-test anyway.

With kind regards

Karabiner
 

cvda

New Member
#3
Thank you for the information. Yes, I compare the first Monday of Feb 2019 with the first Monday of Feb 2020, and so on (first Tuesday 2019 with first Tuesday 2020), for every opening hour that is the same in both years. Is a dependent t-test still an option?

And if I use the chi² test with condition 1 or 2 and walk in or don't walk in, does it still take into account that more people may be walking in simply because there are more people in the street?
 

Karabiner

TS Contributor
#4
I don't know whether 2019 and 2020 differ in many more ways than just sign/no sign, but anyway,
in my opinion it is justified to match by "same month&same weekday&same hour". This could
nicely account for peculiarities due to month/weekday/hour of the day.

Thinking about this a second time, I am not sure whether one can easily perform a statistical test on
these data, since observations within each condition are clearly not independent (what happens
during hour X on day Y is clearly not uncorrelated with hour X+1), and I don't know how one can account
for that. But if number of observations is large, as it seems to be (either in the analysis of n1+n2
individuals, or in the analysis of pairs of hours), then maybe the descriptive statistics are convincing
enough.

With kind regards

Karabiner
 

cvda

New Member
#5
Thank you,

would it maybe be better to do it as an A/B test, which is often used for online webshops? I think the situation is not so different for a normal shop (although it's hard to control a lot of things). For condition A I would enter the number of people visiting and the number of people walking by, and the same for condition B.

See the example below with the totals of all people walking in and walking by in Feb 2019 (version A) and in Feb 2020 (version B). Here "visitors" stands for passers-by and "conversion" for people walking in. The opening hours and days of the month (Mondays etc.) are the same in both versions.

1587042602055.png
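What such an A/B comparison does with the totals can be sketched roughly as follows. The numbers and the function name are hypothetical, and the 95% interval is a simple Wald interval that treats every passer-by as an independent observation, which is exactly the assumption questioned in #4.

```python
import math

def ab_summary(passersby_a, entered_a, passersby_b, entered_b, z=1.96):
    """Conversion rates, their difference (B - A) and a Wald interval for it."""
    rate_a = entered_a / passersby_a
    rate_b = entered_b / passersby_b
    diff = rate_b - rate_a
    standard_error = math.sqrt(rate_a * (1 - rate_a) / passersby_a
                               + rate_b * (1 - rate_b) / passersby_b)
    interval = (diff - z * standard_error, diff + z * standard_error)
    return rate_a, rate_b, diff, interval

# Hypothetical monthly totals, not the store data.
rate_a, rate_b, diff, interval = ab_summary(10000, 118, 10000, 174)
print(rate_a, rate_b, diff, interval)
```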
 


cvda

New Member
#8
I think I will do the following: run a paired t-test on the number of people who walk by before and after installing the digital display. If the result shows no significant difference in the number of people who walk by, I will proceed to another paired t-test on the number of people who enter the store.

But I have one question: is a paired t-test correct if the groups in conditions 1 and 2 are NOT the same people? The people walking in the street are not the same.
 

Karabiner

TS Contributor
#9
We discussed "hour" as the unit of observation and "proportion entering" as the dependent variable.
The idea was that First Monday in April 2019, 8-9 o'clock can perhaps be matched with First Monday
in April 2020, 8-9 o'clock.
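A small sketch of how such matching could be done in practice, assuming the data are stored as a mapping from (date, hour) to proportion entering. That layout, the function names and the example values are my assumptions. Each hour is keyed by month, weekday, which occurrence of that weekday in the month it is, and hour of day, so that the pair of observations, not the people, is what gets matched.

```python
from datetime import date

def match_key(d, hour):
    """(month, weekday, n-th occurrence of that weekday in the month, hour)."""
    nth = (d.day - 1) // 7 + 1
    return (d.month, d.weekday(), nth, hour)

def pair_hours(obs_a, obs_b):
    """Pair up hourly proportions from two years.

    obs_a, obs_b: dict mapping (date, hour) -> proportion entering.
    Returns a list of (rate_a, rate_b) for hours present in both.
    """
    keyed_b = {match_key(d, h): rate for (d, h), rate in obs_b.items()}
    pairs = []
    for (d, h), rate in obs_a.items():
        key = match_key(d, h)
        if key in keyed_b:
            pairs.append((rate, keyed_b[key]))
    return pairs

# Hypothetical values: first Monday of April in 2019 and in 2020.
obs_2019 = {(date(2019, 4, 1), 8): 0.011, (date(2019, 4, 1), 9): 0.012}
obs_2020 = {(date(2020, 4, 6), 8): 0.017}
print(pair_hours(obs_2019, obs_2020))  # only the 8-9 o'clock hour has a partner
```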

With kind regards

Karabiner