Estimating the demand function for a product

Hi everyone!

I have around 20 products with 30 monthly observations for each that I need to analyze.

I have data on demand, normal price, and discount in percent, as well as some categorical variables, but we can disregard those for now.

I want to understand, for example, how much a discount affects sales per product, and find the products with the highest beta for it, so that I know which products are suitable to discount and which are not. The problem, however, is the small sample size: 30 observations, and discount is 0 for around 20 of the 30 weeks.

To solve this, a friend of mine told me that I could just run one regression with all products at the same time and separate them with a dummy variable for each one. But after reading up on this method, I realized that these dummies only change the Y-intercept for each product, not the slope for the % discount. This is a major issue, as I am certain that the slopes are very different for each product.

So, can anyone out there help me and tell me if there is a way to solve my problem?

You can start by looking at the scatterplots for the separate products. You have about 10 (30-20) observations per product where there was a discount active, right?
If you put the 20 no-discount weeks of all products in one group and the 10 discount weeks of all products in another group, you can first perform a univariate ANOVA to see whether discounts work in general.
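
To make that first step concrete, here is a minimal sketch of the one-way ANOVA F statistic for the two pooled groups, in plain Python. The function name one_way_anova_f and the demand figures are made up for illustration; they are not from your data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-group sum of squares: spread of the observations around their own group mean.
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical pooled demand figures (assumption, for illustration only).
all_no_discount = [100, 96, 104, 98, 102, 101, 99, 97]
all_discount = [118, 125, 121, 116]

f_stat, df_between, df_within = one_way_anova_f([all_no_discount, all_discount])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The p-value comes from the F distribution with those degrees of freedom; if SciPy is available, scipy.stats.f.sf(f_stat, df_between, df_within) gives it, and any statistics package will report it directly.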

Second, you want to find a factor that determines whether discounts are effective. You suspect that product_id is that factor. This is a categorical variable. In this case the 20 products are a random draw from the population of products, so tell your software to treat product_id as a random factor. You can now test each group of ten discount observations (product_id_discount) against the 20 x 20 = 400 no-discount observations of all products (all_no_discount) in the other group; that test will be less affected by the low sample size. You can also test each group of ten (product_id_discount) observations against that product's own 20 no-discount observations (product_id_no_discount). In that case, you may have to correct the significance level for multiple comparisons (or just use 99%; in the end this is open to experimentation and interpretation).
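
A rough sketch of the per-product comparison with a corrected significance level follows. Two things here are my own choices and not the only option: I use a permutation test on the difference in means (it needs no distribution tables and is reasonable for small samples), and I use a Bonferroni correction. The product names and numbers are invented.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    k = len(treated)
    hits = 0
    for _ in range(n_perm):
        # Reassign the group labels at random and recompute the difference.
        rng.shuffle(pooled)
        if abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data (assumption): pooled no-discount demand and
# discount-period demand for two made-up products.
all_no_discount = [100, 97, 103, 99, 101, 98, 102, 100, 96, 104]
product_id_discount = {
    "A": [120, 124, 118, 126, 122],
    "B": [101, 99, 100, 103, 98],
}

# Bonferroni: divide the overall significance level by the number of tests.
alpha = 0.05 / len(product_id_discount)

flagged = {}
for product, demand in product_id_discount.items():
    p = permutation_p_value(demand, all_no_discount)
    flagged[product] = p < alpha
    print(product, round(p, 4), "discount effect" if flagged[product] else "no clear effect")
```

To compare against each product's own no-discount observations instead, pass that product's 20 observations as the control argument; the correction works the same way.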

Regression and analysis of variance are mathematically the same linear model; however, you are not immediately interested in a linear model, especially with a categorical variable like product_id. Your buddy is right that you can dummy code product_id to turn it into numeric 0/1 variables, but the products have no natural order. You would find such an order if you used, for instance, normal price as the independent variable. In that case, dummy coding would be an unnecessary extra. With normal price you could see in a regression whether discounts work for certain normal price levels or ranges.
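
One way to write that last regression down, as a sketch only: the interaction term between discount and normal price is my assumption about how to let the discount effect depend on the price level, and the symbols (i for product, t for period) are just labels.

```latex
\text{demand}_{it} = \beta_0 + \beta_1\,\text{discount}_{it} + \beta_2\,\text{price}_{it}
  + \beta_3\,(\text{discount}_{it} \times \text{price}_{it}) + \varepsilon_{it}
\qquad\Rightarrow\qquad
\frac{\partial\,\text{demand}_{it}}{\partial\,\text{discount}_{it}} = \beta_1 + \beta_3\,\text{price}_{it}
```

Under this form the discount slope is no longer a single number: it shifts with normal price through \beta_3, and all products contribute to estimating it, which helps with the small per-product sample.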