I have a set of individuals (let's say ID_No 1 through 50) and a set of metrics that pertain to these individuals: things like age, SAT reading score, SAT math score, a motivation index metric, etc.
These individuals are also aggregated into teams (let's say ID_No 1 through 10 is team 1, ID_No 11 through 20 is team 2, etc.). Now for these teams, I have a set of performance metrics on a collective test that each team completed together; so for instance, team 1 scored an 89/100, team 2 scored a 75/100, etc.
I'm simply running a multivariate regression of the individual characteristic metrics against the team performance metric.
But my question is whether it's more appropriate to:
1. Find the average of each characteristic metric (e.g., SAT scores, age) for all individuals on a team and assign that average to each team. In this case, n=5 (the number of teams). So the regression would look like:
MODEL team_score = team_avg_age team_avg_math team_avg_read ...
OR
2. Assign the team performance metrics to each individual. In this case, n=50 (the number of individuals), and the same team score would be shared by all of the individuals on that team. So the regression would look like:
MODEL team_score = age math read ...
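To make the two layouts concrete, here is a minimal Python sketch of how each dataset would be built. Everything in it is an illustrative assumption: the random dummy values, the seed, the three hypothetical team scores beyond the 89 and 75 mentioned above, the helper name `ols`, and the use of NumPy least squares in place of the MODEL statements.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed; dummy data only

n_teams, team_size = 5, 10
n_ind = n_teams * team_size
# Team label for each individual: 0 for the first ten, 1 for the next ten, ...
team_id = np.repeat(np.arange(n_teams), team_size)

# Hypothetical individual characteristics: age, SAT math, SAT reading
X_ind = np.column_stack([
    rng.integers(18, 30, size=n_ind),
    rng.integers(400, 800, size=n_ind),
    rng.integers(400, 800, size=n_ind),
]).astype(float)

# One score per team; 89 and 75 come from the example, the rest are made up
team_score = np.array([89.0, 75.0, 82.0, 68.0, 91.0])

def ols(X, y):
    """Return least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Option 1: average each characteristic within a team, so n = 5 rows
X_team = np.vstack([X_ind[team_id == t].mean(axis=0) for t in range(n_teams)])
coef_team = ols(X_team, team_score)

# Option 2: copy the team score onto every member, so n = 50 rows
y_ind = team_score[team_id]
coef_ind = ols(X_ind, y_ind)

print("option 1 rows:", X_team.shape[0], "coefficients:", coef_team)
print("option 2 rows:", X_ind.shape[0], "coefficients:", coef_ind)
```

In option 1 each row is a team; in option 2 each row is an individual, and all ten members of a team carry the identical value of `team_score`.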
I have played around with some dummy data that I created, and I know there is a difference in the results, but I'm not sure which result is more valid. I'll also be completing simple bivariate regressions and probably PCA on this same data.
Thanks!