Does dummy coding produce a "high dimensional" dataset?

Say you have a dataset with only a few categorical variables, but each categorical variable has a lot of levels. You dummy encode each categorical variable, giving every level its own binary variable, which increases the number of columns/variables in your dataset. Would this dataset be considered "high dimensional" even though almost all the columns are binary dummy variables containing only 0's and 1's?


Hi Amd,

From a logical perspective, I would think about a categorical variable as only one dimension.

Multiple linear regression formula with matrices (least-squares model): \hat{\beta} = (X^\top X)^{-1} X^\top y
From a computational perspective, when you fit a linear regression, each additional dummy variable adds a column to the X matrix.
In that sense, each dummy variable is another dimension in X.
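
To make the point about the size of X concrete, here is a minimal sketch in Python using pandas. The column names ("city", "product"), the number of levels, and the row count are made up for illustration and do not come from the original question; the only thing it is meant to show is how two categorical columns turn into many binary columns once they are dummy encoded.

```python
import pandas as pd

# Hypothetical data: 2 categorical variables, 1000 rows.
# "city" has 50 levels and "product" has 30 levels (assumed values).
n_rows = 1000
df = pd.DataFrame({
    "city": [f"city_{i % 50}" for i in range(n_rows)],
    "product": [f"product_{i % 30}" for i in range(n_rows)],
})

# One binary column per level: 50 + 30 = 80 columns.
X_full = pd.get_dummies(df, dtype=float)

# Dropping one reference level per variable, as is usual when the
# regression has an intercept: (50 - 1) + (30 - 1) = 78 columns.
X = pd.get_dummies(df, drop_first=True, dtype=float)

print("original shape:       ", df.shape)
print("all dummies:          ", X_full.shape)
print("with reference levels:", X.shape)
```

So logically there are still only two variables, but the matrix that goes into the least-squares calculation has one column per dummy.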