# Sample size for comparing two proportions

#### Buckeye

##### Active Member
Hi!

I posted a similar question in the past. I'm trying to construct an A/B test to compare proportions between a test and control group. The historic proportion in the control group is 43%. I tried to boil this down to a simple hypothesis test:

H0: Ptest = Pcontrol
H1: Ptest < Pcontrol

But I'm concerned that the desired effect size (maybe a 1-2 percentage-point difference) might be too small, requiring a sample size that isn't feasible. I'm trying to think of alternatives. Maybe a Bayesian model? I'll have to brush off the dust. Any suggestions would be appreciated.
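For a rough sense of feasibility, the per-group sample size for a one-sided two-proportion test can be sketched with the standard normal-approximation formula. This is only a sketch: it assumes α = 0.05 (one-sided) and 80% power, neither of which was stated above.

```python
from math import ceil, sqrt
from statistics import NormalDist  # stdlib, Python 3.8+

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group n for a one-sided two-proportion z-test
    (standard normal-approximation sample-size formula)."""
    z_a = NormalDist().inv_cdf(1 - alpha)   # critical value
    z_b = NormalDist().inv_cdf(power)       # power quantile
    p_bar = (p1 + p2) / 2                   # average proportion under H0
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

# Control at 43%; detecting a drop to 41% (a 2-point difference)
print(n_per_group(0.43, 0.41))  # roughly 7,500 per group
```

So a 2-point difference already calls for several thousand customers per arm under these assumptions, which is worth knowing before committing to the design.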

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Is the change randomized, and is 43% the baseline for both groups? So you're expecting a 1-2% change between groups? Happy to help; I need to crank out more sample size calcs.

#### Buckeye

##### Active Member
Is the change randomized?
I'm not sure what you mean here. The intervention will be randomized to participants, like an experimental design: maybe half will get the intervention and the other half will get no intervention.

We have an application where we send our customers a single robot call for a service they may need. If they don't answer the robot call, we send them a call from a human being. The actual process is more detailed than this, but it's proprietary info. We are trying to reduce the percentage of human calls because they cost time and money. So our intervention is to send two robot calls to see if the customers interact before the human call.

The way I'm thinking about this is: if the test group reduces the proportion of human calls by 2%, that is enticing enough for the business to adopt the intervention for all participants moving forward. I still need to ask about the desired effect size; it could be 5%, for example. But I'm trying to prepare for the case that we won't have enough data to test it in this simplistic fashion. I would say 43% is the baseline for the control group, and I'm thinking of calculating several sample sizes for various potential differences. Hopefully that helps explain.
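Calculating several sample sizes for various potential differences fits in one small loop. A sketch using the standard normal-approximation formula for a one-sided two-proportion test, where α = 0.05 (one-sided) and 80% power are assumptions on my part:

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group n for a one-sided two-proportion z-test
    (normal-approximation formula)."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

baseline = 0.43
# Required n per group for a 1- to 5-point reduction from baseline
sizes = {d: n_per_group(baseline, baseline - d / 100)
         for d in (1, 2, 3, 4, 5)}
for d, n in sizes.items():
    print(f"{d}-point drop: {n} per group")
```

Going from a 5-point to a 1-point detectable difference inflates the required n by roughly 25x, which is why the 1-2% scenario is the one to worry about.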


#### hlsmith

##### Less is more. Stay pure. Stay poor.
What is the outcome - just them answering the call, or some type of churn where they purchase a product? How would the customer know whether it is a bot or a person before answering? They wouldn't, right? So the process doesn't matter if the person is blinded - it is just the number of contacts needed for an answer and maybe churn.

#### Buckeye

##### Active Member
Without getting into too many details: we advertise this service to our customers. There is a process flow that reaches out to the customer if we believe they need service. Theoretically, the customer should know the steps if they are ever in the scenario that would prompt us to reach out. We are trying to get the customer to respond yes or no to the service before needing a human call. We send a series of communications that end in a human call if they don't respond to the previous steps. So, we are trying to reduce the percentage of customers that need a human call by adding an additional step.


#### hlsmith

##### Less is more. Stay pure. Stay poor.
Three scenarios, you say: yes / no / no response. What is the conversion rate for call one (43%?), and for the human call after call one? So now you want to see the conversion rate for call two prior to the human call.

Group #1: conversion rate for 1st call then human call at human call after adjusting for conversion rate at 1st call?
versus
Group #2: conversion rate for 2nd bot call at second bot call or at human call, after adjusting for conversion rate at 1st call?

To an outsider, this is more confusing than you may think. Can you write out what the design and outcome are?

#### Buckeye

##### Active Member
Let me know if this illustration helps:

It doesn't matter how they engage. They could accept or deny our service. We just want the customers to tell us before we use human resources. Due to the nature of the service, we are obliged to send a human call (if the flow reaches that point) and follow up in some fashion with the non-responding customers.


#### hlsmith

##### Less is more. Stay pure. Stay poor.
Much better! I'll provide comments tomorrow.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
So, for example, say you had a sample of 500 people and you randomize 50% to each group. Now you just compare the proportion that responds after one bot call (group 1) to the proportion that responds after two bot calls (group 2). You could check exchangeability between the groups by confirming that the proportion of responders after call 1 seems comparable in both groups.

You may not need a control group - you could use historic data - but you would have to strongly assume there is no historic bias that makes people or their actions different across time. With the recession, I am not sure that would be true. Also, is each person only targeted once? Meaning that if I don't respond you don't target me again in another year or so, and everyone is getting targeted at the same point in their relationship with your company, correct? So always ~6 months after establishing a relationship with you all.

This set-up seems pretty straightforward.
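Once the data are in, the comparison described above comes down to a one-sided two-proportion z-test. A minimal sketch; the counts below are made-up placeholders for illustration, not real data:

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z(x1, n1, x2, n2):
    """One-sided two-proportion z-test with pooled SE;
    H1: p1 < p2 (test arm needs fewer human calls)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, NormalDist().cdf(z)  # lower-tail p-value

# Hypothetical counts: 98 of 250 test customers still needed a human
# call, vs. 108 of 250 controls (placeholder numbers only)
z, p = two_prop_z(98, 250, 108, 250)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With samples this small, even a 4-point observed drop is nowhere near significant, which circles back to the sample-size question at the start of the thread.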

#### Buckeye

##### Active Member
It's fair to assume people are only targeted once. We reach out to the customer under a specific circumstance, so the timing of the interaction could vary.