Stata commands: ranksum vs median - which one to use?

wand

New Member
#1
Hi everyone,

I need some help. Does anyone know whether Mood's median test [Stata command 'median', I guess?] is still in use? I read on Wikipedia that Mood's test is 'obsolete.' Is that true?

If it is true, should I use the Mann–Whitney test instead [Stata command 'ranksum']?

And which approach is more suitable for my task, as outlined below:

My task is to test the differences in the median annual expenditure on some consumer products. The sample size is about 1000 [about 250 per year for 4 years]. I want to test the difference in the medians between income groups [one group vs the rest of the groups combined], and also to test any changes in the medians from one year to the next year for each income group.

I would appreciate any expert comments.

Thanks!

Wand
 

maartenbuis

TS Contributor
#2
I would consider qreg. It lets you test for differences in the median (or any other percentile of your choice), and it also gives you an idea of how large these differences are.
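
As a rough sketch of what that could look like, here is an example on the auto.dta data set that ships with Stata. The variables price and foreign are only stand-ins (my assumption) for an expenditure variable and a two-group indicator.

Code:
* Hypothetical stand-ins: price ~ expenditure, foreign ~ group indicator
sysuse auto, clear

* Difference in medians between the two groups (the default quantile is .5)
qreg price i.foreign

* The same comparison at the 75th percentile
qreg price i.foreign, quantile(.75)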
 

wand

New Member
#3
Thanks a lot, Maartenbuis, for your reply.
I see that qreg is a regression command. However, my task doesn't involve regression, so I'm not sure whether it works in my case.

Regards
Wand
 
#4
Maarten's recommendation to use -qreg- provides you a method
for comparing medians between pairs of companies. If you
don't want to think of this process as regression, just think of
it as a means of conducting t-tests on medians.

In the following toy example, I rename two variables and
create a series of dummy variables using the -auto.dta-
data set to mirror the problem you described.

In the -qreg- output, the p-value (P>|t|) for the dummy variable
for a pair of companies indicates whether the median expenditures
of those companies differ at a statistically significant level.

Code:
* Create hypothetical data set mirroring your variables
clear
sysuse auto.dta
rename headroom expenditure
rename rep78 company
tab company, gen(company)

* Create binary variables of all company pairs
* Note: -tab, gen()- leaves the dummies missing where company is
* missing, and a bare "if company1" treats missing as true, so
* each condition tests == 1 explicitly

gen comp12=1 if company1 == 1
replace comp12 = 2 if company2 == 1

gen comp13=1 if company1 == 1
replace comp13 = 2 if company3 == 1

gen comp14=1 if company1 == 1
replace comp14 = 2 if company4 == 1

gen comp15=1 if company1 == 1
replace comp15 = 2 if company5 == 1

gen comp23=1 if company2 == 1
replace comp23 = 2 if company3 == 1

gen comp24=1 if company2 == 1
replace comp24 = 2 if company4 == 1

gen comp25=1 if company2 == 1
replace comp25 = 2 if company5 == 1

gen comp34=1 if company3 == 1
replace comp34 = 2 if company4 == 1

gen comp35=1 if company3 == 1
replace comp35 = 2 if company5 == 1

gen comp45=1 if company4 == 1
replace comp45 = 2 if company5 == 1


* Compare median expenditure of company 1 with company 2
qreg expenditure comp12

* Compare median expenditure of company 1 with company 3
qreg expenditure comp13

* Compare median expenditure of company 1 with company 4
qreg expenditure comp14

* Compare median expenditure of company 1 with company 5
qreg expenditure comp15

* Compare median expenditure of company 2 with company 3
qreg expenditure comp23

* Compare median expenditure of company 2 with company 4
qreg expenditure comp24

* Compare median expenditure of company 2 with company 5
qreg expenditure comp25

* Compare median expenditure of company 3 with company 4
qreg expenditure comp34

* Compare median expenditure of company 3 with company 5
qreg expenditure comp35

* Compare median expenditure of company 4 with company 5
qreg expenditure comp45
To demonstrate this with means rather than medians, consider
the following OLS regression and equivalent t-test.

Code:
* Compare mean expenditures of company3 with mean expenditures of company5

regress expenditure comp35

ttest expenditure, by(comp35)
 

wand

New Member
#5
Thanks a lot, RedOwl. I will take the advice from you and Maartenbuis and use qreg instead.

Is there a reason why Mood's median test and the Mann-Whitney test are not recommended? Is qreg conceptually similar to or different from either or both of the two approaches?

Regards
Wand
 

maartenbuis

TS Contributor
#6
I would not say that I don't recommend either test; it is just that I recommend quantile regression more. The reason is that it also gives you an estimate of the size of the difference. This allows you to see whether a difference is substantively significant, on top of whether it is statistically significant.
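
To make the contrast concrete, here is a minimal sketch that runs all three commands on the same comparison. It uses auto.dta, with price and foreign as stand-ins (an assumption for illustration only) for an expenditure variable and a two-group indicator.

Code:
* Hypothetical stand-ins: price ~ expenditure, foreign ~ group indicator
sysuse auto, clear

* Mann-Whitney (Wilcoxon rank-sum) test
ranksum price, by(foreign)

* Mood's median test
median price, by(foreign)

* Median regression: the coefficient on foreign estimates the
* difference in medians between the two groups
qreg price i.foreign

By default the two tests report only a test statistic and a p-value, whereas the -qreg- coefficient comes with a standard error and a confidence interval, so you can judge the size of the difference as well.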