Does dummy coding produce a "high dimensional" dataset?

#1
Say you have a dataset with only a few categorical variables, but each categorical variable has a lot of levels. You dummy encode each categorical variable, giving every level its own binary variable, which increases the number of columns/variables in your dataset. Would this dataset be considered a "high dimensional" dataset even though almost all the columns are binary dummy variables containing only 0s and 1s?
 

obh

Active Member
#2
Hi Amd,

From a logical perspective, I would think about a categorical variable as only one dimension.

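The least-squares solution for multiple linear regression, in matrix form:
B = (X'X)^(-1) X'Y
From a calculation perspective, though, each additional dummy adds a column to the X matrix, so each dummy variable is another dimension in X.
With an intercept in the model, a categorical variable with k levels usually contributes k-1 dummy columns (one level serves as the reference), so a few variables with many levels can still make X very wide.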
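
To make the column count concrete, here is a minimal sketch in plain Python. The function name dummy_encode, the toy data, and the choice to drop the first level of each variable as the reference category are all assumptions made for illustration, not anything from the original question.

```python
def dummy_encode(rows, drop_first=True):
    """Dummy-code records whose fields are all categorical labels.

    Returns (names, matrix): an intercept column followed by one 0/1
    column per retained level of each variable.
    """
    n_vars = len(rows[0])
    names = ["intercept"]
    columns = []  # (variable index, level) for each dummy column
    for j in range(n_vars):
        levels = sorted({row[j] for row in rows})
        if drop_first:
            levels = levels[1:]  # first level becomes the reference category
        for level in levels:
            names.append(f"var{j}={level}")
            columns.append((j, level))
    matrix = [
        [1] + [1 if row[j] == level else 0 for j, level in columns]
        for row in rows
    ]
    return names, matrix


# Hypothetical toy data: 2 categorical variables with 4 and 3 levels.
rows = [
    ("red", "small"),
    ("green", "large"),
    ("blue", "medium"),
    ("yellow", "small"),
    ("red", "large"),
]

names, X = dummy_encode(rows)
n_rows, n_cols = len(X), len(X[0])
print(n_rows, n_cols)  # 2 original variables become 1 + 3 + 2 = 6 columns
print(names)
```

Here two "logical" variables already turn into six columns of X. With hundreds of levels per variable the column count can approach or exceed the number of rows, which is the situation where X'X becomes hard or impossible to invert.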