Confidence interval for median

winds

New Member
The question is:

A traffic warden notes the time cars have been illegally parked after their metered time has expired. For 16 offending cars he records the time in minutes as:
10 42 29 11 63 145 11 8 23 17 5 20 15 36 32 15
Obtain an appropriate 95 per cent confidence interval for the median overstay time of offenders prior to detection. What assumptions were you making to justify using the method you did? To what population do you think the confidence interval you obtained might apply?

I've been trying to figure out what he means by this. If I'm given a hypothetical median value, like 40, then I can use the Wilcoxon or sign test to see whether the null hypothesis is accepted or rejected. Then I can construct a confidence interval. But he doesn't give any such value. So how can I construct a confidence interval? Do I just use the sample median? If so, then the null will obviously be accepted, right? And so I just construct a confidence interval for that?

Does this sound right or am I totally misinterpreting the question?

Dragan

Super Moderator
A basic (large sample) confidence interval for the median is:

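Low_score = N*0.5 - 1.96*Sqrt[N*0.5*(1-0.5)] = 8 - 3.92 = 4.08, about 4

Upper_score = N*0.5 + 1.96*Sqrt[N*0.5*(1-0.5)] = 8 + 3.92 = 11.92, about 12

Here N = 16, and the two scores are ranks in the sorted data, not times. Thus, the median is 18.5 and has (95%) lower and upper limits of 11 and 32 (the 4th and 12th ordered values), respectively.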
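
The rank calculation above can be sketched in R as follows. This is only a minimal illustration: it assumes the ranks are rounded to the nearest integer (which reproduces the 4 and 12 quoted above), and the variable names are made up for this example. Other rounding conventions exist and can shift a limit by one rank.

```r
# Overstay times (minutes) from the original question
x <- c(10, 42, 29, 11, 63, 145, 11, 8, 23, 17, 5, 20, 15, 36, 32, 15)
n <- length(x)
z <- qnorm(0.975)  # about 1.96

# Normal approximation to Binomial(n, 0.5): mean n/2, sd sqrt(n * 0.5 * 0.5)
# Assumption: ranks are rounded to the nearest integer
low_rank <- round(n * 0.5 - z * sqrt(n * 0.5 * 0.5))
upp_rank <- round(n * 0.5 + z * sqrt(n * 0.5 * 0.5))

# The limits are the sorted observations at those ranks
xs <- sort(x)
ci <- c(lower = xs[low_rank], upper = xs[upp_rank])

median(x)  # 18.5
ci         # lower 11, upper 32
```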

gianmarco

TS Contributor
Hi,
besides the hint provided by Dragan, I would add that a CI for the median could be obtained with a bootstrap procedure (as far as I know, SYSTAT routinely provides this).

Hope this helps,
regards
gm

Dragan

Super Moderator

You're right gm, the bootstrap is really, in my opinion, the best way to handle this.

And, that was my original thought too, but I don't think that is what the poster's instructor is looking for because he's asking one to make a distributional assumption.

Note that I constructed a 95% bootstrap CI for these data using S+, and it gives the same upper and lower limits as the basic (classical) technique I described above, i.e., lower limit 11 and upper limit 32.
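
For readers without S+, a percentile bootstrap can be sketched in base R. This is an illustration under assumptions, not the S+ routine used above: the seed, the number of resamples, and the variable names are arbitrary choices made for this example, and the limits can vary slightly from run to run with a different seed.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Overstay times (minutes) from the original question
x <- c(10, 42, 29, 11, 63, 145, 11, 8, 23, 17, 5, 20, 15, 36, 32, 15)
B <- 10000   # number of bootstrap resamples (assumed value)

# Resample the data with replacement and record the median each time
boot_medians <- replicate(B, median(sample(x, replace = TRUE)))

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap medians
ci <- quantile(boot_medians, c(0.025, 0.975))
ci
```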

gianmarco

TS Contributor
Hi Dragan,
ok...fine.

I was just adding a hint complementing what you wrote.

Best regards,
gm

fed1

TS Contributor
Here is how to do an exact nonparametric interval. It is more difficult than the asymptotic interval.

Let $$X_{(i)}$$ be the ith order statistic, and let $$m_{0}$$ be the population median. Then

$$\Pr( X_{(i)} < m_{0} < X_{(k)} ) = 0.5^{n} \sum_{x = i}^{k-1} \binom{n}{x}$$

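For n = 16, this gives the 4th and 12th order statistics about 95.1% coverage (the 5th and 12th give about 92.3%), in very close agreement with the asymptotic interval. The 4th and 5th ordered values are both 11 here, so the interval is 11 to 32 either way. This can be useful for small samples too!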
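
The coverage formula can be checked numerically. Below is a minimal sketch in R; the helper name `coverage` is made up for this example, and it simply evaluates the binomial sum for a few pairs of ranks with n = 16.

```r
# Overstay times (minutes) from the original question
x <- c(10, 42, 29, 11, 63, 145, 11, 8, 23, 17, 5, 20, 15, 36, 32, 15)
n <- length(x)

# Pr(X_(i) < m0 < X_(k)) = 0.5^n * sum_{x = i}^{k - 1} choose(n, x)
# (hypothetical helper, named for this example only)
coverage <- function(i, k, n) sum(choose(n, i:(k - 1))) * 0.5^n

coverage(5, 12, n)  # about 0.923
coverage(4, 12, n)  # about 0.951
coverage(4, 13, n)  # about 0.979

# Ordered values at ranks 4, 5 and 12
sort(x)[c(4, 5, 12)]  # 11 11 32
```

Since the coverage can only move in jumps as the ranks change, an exact interval usually cannot hit 95% exactly; one common choice is the narrowest pair of ranks whose coverage is at least the nominal level.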

winds

New Member
Dragan, gianmarco and fed1, thank you all so much. The nonparametric estimate was required and as fed1 noted it's close to Dragan's version. I learned a lot from this, thanks!

MJH

New Member
Can you explain the notation in fed1's exact-interval formula, or suggest a source that explains it?

MJH

New Member
For Dragan's basic (large sample) confidence interval, what would be considered a "large sample" here?

wipeout

New Member
I agree with gianmarco; I think the nonparametric bootstrap is the most suitable technique for this problem. The code in R would be:

library(boot) # load the boot package
data <- c(10, 42, 29, 11, 63, 145, 11, 8, 23, 17, 5, 20, 15, 36, 32, 15) # your data
boot.median <- function(x, i) { median(x[i]) } # median of the resampled observations x[i]

median.boot <- boot(data, boot.median, R = 10000) # run the bootstrap
median.boot # get results:

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = data, statistic = boot.median, R = 10000)

Bootstrap Statistics :
original bias std. error
t1* 18.5 1.3979 5.598443

#So if you need a CI for median:
boot.ci(median.boot) # you get

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 10000 bootstrap replicates

CALL :
boot.ci(boot.out = median.boot)

Intervals :
Level Normal Basic
95% ( 6.53, 27.86 ) ( 5.00, 26.00 )

Level Percentile BCa
95% (11.0, 32.0 ) (11.0, 30.5 )

Generally, the BCa interval is the best choice. Furthermore, you can draw the histogram and normal Q-Q plot of the bootstrap medians with:

plot(median.boot)