GLM with multiple categorical predictors

Hello internet,

Let's say I want to predict an outcome using multiple categorical predictors. Please consider the following reproducible R code:


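library(dplyr)  # provides %>% and as_tibble(); the psych package must also be installed

set.seed(42)  # arbitrary seed so the simulated data are the same on every run
n <- 500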

df <- data.frame(
  loneliness = round(runif(n, 0, 1)),
  pet = factor(round(runif(n, 0, 1))),
  addiction = factor(round(runif(n, 0, 2))),
  sex = factor(round(runif(n, 0, 1)))
)

levels(df$pet) <- c('dog', 'cat')
levels(df$addiction) <- c('none', 'low', 'high')
levels(df$sex) <- c('male', 'female')

df_coded <- data.frame(
  loneliness = df$loneliness,
  psych::dummy.code(df$pet) %>% as_tibble() %>%  setNames(paste0('pet_', names(.))),
  psych::dummy.code(df$addiction) %>% as_tibble() %>%  setNames(paste0('addiction_', names(.))),
  psych::dummy.code(df$sex) %>% as_tibble() %>%  setNames(paste0('sex_', names(.)))
)

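# Build the formula string first, then convert it: as.formula() expects one formula/string
model_formula <- as.formula(paste0(
    'loneliness ~ ',
    paste0(colnames(df_coded)[-1], collapse = ' + ')
))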

model_fit <- glm(formula = model_formula, data = df_coded, family = binomial)
As you can see, I wish to predict loneliness (no/yes) from the kind of pet the subject has (dog/cat), the subject's level of addiction (none/low/high), and the subject's biological sex (male/female). So far, so good. I've dummy-coded each of these predictors, which gives one variable per predictor level to use in my linear model. I know that one dummy variable per predictor is usually omitted because it represents the baseline and is expressed by the intercept. However, I find this problematic with multiple categorical predictors, since the intercept then represents the intersection of one level from each categorical predictor (e.g. intercept = pet_dog + addiction_none + sex_male). I do not want my intercept to represent this intersection. In fact, I do not care about the intercept at all, since I merely wish to evaluate the impact of each category on the outcome. How is this usually achieved?
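
To make the point about the omitted baseline concrete, here is a minimal base-R sketch. It assumes a hypothetical toy data set (`toy`) with the same variable names as above but freshly simulated values, and a hypothetical helper `dummies()` that builds one 0/1 column per level. It shows that R's default treatment contrasts drop the first level of each factor, and that keeping every dummy next to an intercept gives a design matrix with redundant columns.

```r
# Hypothetical toy data mirroring the variables in the post (values are random).
set.seed(1)
n <- 500
toy <- data.frame(
  loneliness = rbinom(n, 1, 0.5),
  pet        = factor(sample(c('dog', 'cat'), n, replace = TRUE),
                      levels = c('dog', 'cat')),
  addiction  = factor(sample(c('none', 'low', 'high'), n, replace = TRUE),
                      levels = c('none', 'low', 'high')),
  sex        = factor(sample(c('male', 'female'), n, replace = TRUE),
                      levels = c('male', 'female'))
)

# Default treatment contrasts: the first level of each factor is the baseline
# and gets no column of its own.
X <- model.matrix(loneliness ~ pet + addiction + sex, data = toy)
print(colnames(X))

# Hypothetical helper: one 0/1 column for every level of a factor.
dummies <- function(f, prefix) {
  m <- sapply(levels(f), function(l) as.numeric(f == l))
  colnames(m) <- paste0(prefix, '_', levels(f))
  m
}

# Intercept plus ALL dummies: 8 columns, but they are linearly dependent,
# because the dummies of each factor sum to the intercept column.
X_full <- cbind(intercept = 1,
                dummies(toy$pet, 'pet'),
                dummies(toy$addiction, 'addiction'),
                dummies(toy$sex, 'sex'))
print(ncol(X_full))
print(qr(X_full)$rank)  # smaller than ncol(X_full)
```

Passing the factors to `glm()` directly, rather than hand-made dummy columns, lets R handle the baseline coding itself.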

Thanks a lot!


Ambassador to the humans
So if you don't care about it... why do you care about it? It literally is going to represent a predicted value for some combination of your factors. If you don't have anything specific in mind that you want it to represent, why does this particular representation bother you?
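
The claim that the intercept is just a predicted value for one combination of the factors can be checked with a short sketch. It assumes hypothetical simulated data (`toy`, not the original poster's) and R's default treatment contrasts, under which the reference combination is the first level of each factor (dog / none / male).

```r
# Hypothetical toy data with the same variable names as in the question.
set.seed(1)
n <- 500
toy <- data.frame(
  loneliness = rbinom(n, 1, 0.5),
  pet        = factor(sample(c('dog', 'cat'), n, replace = TRUE),
                      levels = c('dog', 'cat')),
  addiction  = factor(sample(c('none', 'low', 'high'), n, replace = TRUE),
                      levels = c('none', 'low', 'high')),
  sex        = factor(sample(c('male', 'female'), n, replace = TRUE),
                      levels = c('male', 'female'))
)

# Fit the logistic regression on the factors directly.
fit <- glm(loneliness ~ pet + addiction + sex, data = toy, family = binomial)

# One row describing the reference combination of all three factors.
ref <- data.frame(
  pet       = factor('dog',  levels = levels(toy$pet)),
  addiction = factor('none', levels = levels(toy$addiction)),
  sex       = factor('male', levels = levels(toy$sex))
)

# Predicted log-odds for that row versus the fitted intercept.
pred_ref  <- unname(predict(fit, newdata = ref, type = 'link'))
intercept <- unname(coef(fit)['(Intercept)'])
same <- isTRUE(all.equal(intercept, pred_ref))
print(same)
```

The other coefficients are then differences in log-odds relative to that reference row, which is why the choice of baseline changes the intercept but not the fitted values.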