# Which statistical test?

#### MG1997

I am interested in investigating the impact of the localization of malignant lung tumors on the pattern of lymph node metastases. The localization of the lung tumor is a categorical independent variable with three levels: upper lobe, middle lobe and lower lobe. The only way I can think of to investigate this research question is to treat the dependent variable (divided into e.g. intrapulmonary lymph nodes, mediastinal lymph nodes, extrathoracic lymph nodes, ...) as categorical too, and thus use a chi-squared test for the statistical analysis.

My question is whether there is a better way to investigate the impact of the localization of malignant tumors on the pattern of lymph node metastases.

#### fed2

Does

> divided in eg intrapulmonary lymph nodes, mediastinal lymph nodes, extrathoracic lymph nodes,...

have a sort of 'ordinal' character to it, i.e. increasing severity?

#### MG1997

The order of the lymph node metastases might differ depending on the localization of the primary tumor, so I don't think we can consider it an ordinal variable.

I want to predict the pattern of lymph node metastases. For example: primary lung tumors of the upper lobe typically metastasise to extrathoracic lymph nodes and to intrapulmonary lymph nodes, but not to mediastinal lymph nodes, while tumors of the middle lobe typically metastasise to intrapulmonary lymph nodes, to mediastinal lymph nodes and least frequently to extrathoracic lymph nodes.

#### Karabiner

How many categories for metastases will there be, and how large is your sample size?

#### MG1997

My sample size will be around 200 patients and there will be 8 categories for metastases.

#### Karabiner

So you will have about 8 patients per cell, on average, in a 3*8 table
(3 lobes by 8 metastasis categories, i.e. 24 cells for 200 patients),
and I suppose there will be a number of cells with expected (not actual) frequencies
below 5, which will distort the Chi² test. So maybe there is a way to reduce the
number of categories for metastases, based on theoretical considerations
(i.e. before looking at the data)?

Moreover, if each patient can have multiple metastases (dependent observations),
then the Chi² test for independent observations cannot be used at all.
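
To make the point about expected frequencies concrete, here is a minimal Python sketch. The counts are invented purely for illustration (200 patients, 3 lobes, 8 metastasis categories) and are not data from this thread; the function names `expected_counts` and `count_small_cells` are hypothetical helpers as well. It computes the expected count of each cell under independence (row total times column total, divided by the grand total) and counts how many fall below 5.

```python
# Hypothetical counts, invented for illustration only.
# Rows: tumor lobe (upper, middle, lower); columns: 8 metastasis categories.
observed = [
    [20, 15, 12, 10, 8, 6, 5, 4],  # upper lobe, 80 patients
    [8, 6, 5, 4, 3, 2, 1, 1],      # middle lobe, 30 patients
    [22, 18, 14, 12, 9, 7, 5, 3],  # lower lobe, 90 patients
]


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def count_small_cells(table, threshold=5.0):
    """Number of cells whose expected count is below the threshold."""
    expected = expected_counts(table)
    return sum(1 for row in expected for e in row if e < threshold)


expected = expected_counts(observed)
n_small = count_small_cells(observed)
n_cells = len(observed) * len(observed[0])
print(f"{n_small} of {n_cells} cells have expected counts below 5")
```

With a rarely affected lobe or rare metastasis sites, many expected counts drop below 5 even though the average is about 8 per cell, which is why merging categories on theoretical grounds beforehand may help. Note that this check still assumes one observation per patient.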

