I'm working on the "esoph" dataset, which ships with base R (in the `datasets` package, so it is available in RStudio without loading anything). The aim is to find out whether age, alcohol consumption and tobacco consumption are associated with oesophageal cancer.

There are 5 columns: three are categorical (agegp, alcgp and tobgp, stored as ordered factors) and two are numerical counts (ncases and ncontrols).
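A quick way to check how R actually stores each column is sketched below. It only uses base R; the name `is_num` is just illustrative.

```r
# Inspect the structure of the built-in dataset: one line per column,
# showing its type and the first few values
str(esoph)

# TRUE for columns stored as numbers, FALSE for factors
is_num <- sapply(esoph, is.numeric)
print(is_num)

# The grouping columns are ordered factors; levels() lists their categories
levels(esoph$agegp)
```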

After running a chi-squared test, I got a very small p-value. That means the null hypothesis of independence is rejected, so the variables do appear to be associated.
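For reference, here is a minimal sketch of how a chi-squared test can be run on this data. It assumes the question is whether alcohol group is associated with case/control status, and it is not necessarily the exact call used for the result above. The names `tab` and `res` are illustrative.

```r
# esoph is aggregated: each row holds counts of cases and controls for one
# combination of age, alcohol and tobacco group. Sum those counts per
# alcohol group to get a 4 x 2 contingency table (alcgp by cases/controls).
tab <- xtabs(cbind(ncases, ncontrols) ~ alcgp, data = esoph)
print(tab)

# Pearson's chi-squared test of independence on that table.
# The null hypothesis is that alcohol group and case/control status are
# independent; a small p-value is evidence against independence.
res <- chisq.test(tab)
print(res)
res$p.value
```

The same pattern works for `agegp` or `tobgp` by changing the right-hand side of the formula.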

After computing the correlation matrix, I noticed that all the correlations are low (below 0.5)...

Code:

```
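# Code used to get the correlation matrix
df <- esoph  # assumption: df is a copy of the built-in dataset
# Convert the ordered factors to their integer level codes (1, 2, 3, ...)
df$agegp <- as.numeric(df$agegp)
df$alcgp <- as.numeric(df$alcgp)
df$tobgp <- as.numeric(df$tobgp)
# Correlations of agegp (first row) with every column, sorted ascending
sort(cor(df)[1, ])
# Full Pearson correlation matrix
cor(df, use = "complete.obs", method = "pearson")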
```