Hi everybody,
I appreciate any kind of help and tips:
Problem:
- I have a dataset with 1600 orders
- For each of these orders I know the total costs & products included (e.g. 7975€ & A, C, D, H)
- I don't know the individual prices of the products (80 in total), which would be helpful, for instance, for calculating new offers.
--> please see the attached pic
Potential Solution: Multiple regression
- number of observations = number of orders
- independent variables = all products (= ~80)
- dependent variable = total cost (per order)
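
To make the setup above concrete, here is a minimal sketch of what I have in mind. It uses synthetic data, not my real orders: the prices, the 5% inclusion rate, and the noise level are all made-up assumptions, and it assumes each product appears at most once per order. Each product becomes a 0/1 indicator column, and the fitted coefficients are read as estimated unit prices. I left out the intercept, on the assumption that an empty order costs nothing.

```python
import numpy as np

rng = np.random.default_rng(0)
n_orders, n_products = 1600, 80

# Hypothetical "true" unit prices, used only to generate synthetic data
true_prices = rng.uniform(100, 3000, size=n_products)

# Design matrix: X[i, j] = 1 if order i contains product j, else 0
# (assumed inclusion rate of 5% per product, purely illustrative)
X = (rng.random((n_orders, n_products)) < 0.05).astype(float)

# Total cost per order = sum of the included prices plus some noise
y = X @ true_prices + rng.normal(0, 50, size=n_orders)

# Ordinary least squares without an intercept:
# each coefficient is the estimated price of one product
est_prices, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# Largest gap between estimated and "true" price across all products
max_error = np.max(np.abs(est_prices - true_prices))
print("rank of design matrix:", rank)
print("largest price error:", round(max_error, 2))
```

If orders can contain several units of the same product, the 0/1 entries would presumably have to be replaced by quantities.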
Questions:
- Is a multiple regression a suitable statistical tool?
- If not: Is there an alternative?
- If yes: Are there any particular issues I have to consider (e.g. multicollinearity --> how can I reduce it, etc.)?
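
Regarding multicollinearity, this is the kind of check I would try first. It is only a sketch on a toy matrix (6 orders, 4 hypothetical products A to D, where C and D are always ordered together): if the rank of the design matrix is lower than the number of products, some individual prices cannot be separated from the totals alone.

```python
import numpy as np

# Toy design matrix: rows are orders, columns are products A, B, C, D.
# C and D always appear together, so their columns are identical.
names = ["A", "B", "C", "D"]
X = np.array([
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 1, 1],
], dtype=float)

# A rank below the number of columns means the prices are not all identifiable
rank = np.linalg.matrix_rank(X)
print("rank:", rank, "of", X.shape[1], "columns")

# List the product pairs whose indicator columns are exactly identical
identical = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if np.array_equal(X[:, i], X[:, j])
]
print("always ordered together:", identical)
```

For such pairs only the sum of the two prices could be estimated, so I assume they would have to be merged into one bundle variable.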
Many thanks for your help!
Chris