Is a Z score of 21.4 a valid value?

#1
Hi folks,

I have a dataset that is highly right-skewed. When calculating z-scores, I was expecting to see values up to about 6 (6 standard deviations from the mean). In my head, 6 was the highest possible value. I'm pretty certain my calculation is correct, but I would just like a "nod" that it is technically possible to have a z-score of 21.4, or a "shake" that no, that's not possible.

The mean is 11452.7 and stdev is 53563.65.

Thank you!
 
#3
Z scores can range from negative infinity to positive infinity. There is no theoretical upper or lower limit. In a normal distribution, values beyond 3 or -3 are extremely rare, but it is possible to determine mathematically the probability that z would be, say, 230 or greater. Don't hold your breath waiting for a value in that range! On the other hand, in a highly skewed distribution, a large z is more likely. It is, however, still pretty odd. To get a z of 10.8 I had to create a very odd distribution with 118 values of 1 and one value of 200. Mean 2.67, SD 18.24, z for the value of 200 is 10.8. Very unusual pattern. Do you have a single value of over 1 million? Note that with a single value that large and a sample size of 100, the mean would be larger than the one you show (assuming all values are positive) just based on that one value!
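For anyone who wants to reproduce the numbers above, here is a minimal sketch. It assumes the sample standard deviation (n - 1 in the denominator), which is what matches the SD of 18.24 quoted here. The variable names are only illustrative. The last two lines back out the raw value that a z of 21.4 would imply for the mean and stdev given in post #1.

```python
import statistics

# The contrived sample from this post: 118 ones and a single 200.
data = [1] * 118 + [200]

mean = statistics.mean(data)
sd = statistics.stdev(data)      # sample SD (n - 1 in the denominator)
z_max = (max(data) - mean) / sd  # z score of the single large value

print(round(mean, 2), round(sd, 2), round(z_max, 1))

# Raw value implied by z = 21.4 with the mean and SD quoted in post #1.
implied_value = 11452.7 + 21.4 * 53563.65
print(round(implied_value))
```

The implied raw value comes out a little above 1.15 million, which is why the question about a single value over 1 million matters.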
 
#4
Followup. Do you remember (have you seen) Chebyshev's inequality? It says that regardless of the form of the distribution, no more than 1/k-squared of the values can be more than k SDs from the mean. So to get a z value over 21 you would need more than 441 observations (21^2 = 441). Do you have that many? I notice that in my odd distribution, increasing that largest value from 200 to 1,000 does not change the z score. I cannot get a z value over roughly the square root of my sample size (about 10.9 for 119 values), which is predictable from the inequality.
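A rough sketch of why the outlier's size stops mattering, again assuming the sample SD. With n - 1 in the denominator, the largest possible |z| in a sample of n values is (n - 1) / sqrt(n), a result usually called Samuelson's inequality (my label, it was not named above). For n = 119 that is about 10.82, close to the square root of n. The function `largest_z` is just a hypothetical helper for this illustration.

```python
import math
import statistics

def largest_z(outlier, n_small=118, small_value=1):
    """z score of one outlier among n_small identical values (sample SD)."""
    data = [small_value] * n_small + [outlier]
    return (outlier - statistics.mean(data)) / statistics.stdev(data)

n = 119
bound = (n - 1) / math.sqrt(n)   # cap on |z| when the SD uses n - 1

# However large the single outlier gets, its z score stays at the bound.
for outlier in (200, 1_000, 1_000_000):
    print(outlier, round(largest_z(outlier), 3), round(bound, 3))
```

Because the outlier inflates the SD in proportion to its own distance from the rest, the ratio stays fixed, so only a larger sample can produce a larger z.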
 
#5
Thanks, EdGr, for the pointer to Chebyshev's inequality. I hadn't heard of that!

The data has seven observations at 21 standard deviations, and there are 11,862 total observations in the dataset (see breakdown below). So, if I understand this correctly, at most 1/(21^2), or 0.2268%, of the population can be 21 or more standard deviations from the mean. In fact, as you'll see below, 0.06% of the population is at 21 standard deviations. So this means the data is consistent with Chebyshev's inequality, correct?


Standard Deviations   Frequency   % of total
1                     11599       97.79%
2                     139         1.17%
4                     43          0.36%
5                     16          0.13%
6                     11          0.09%
7                     15          0.13%
8                     11          0.09%
9                     1           0.01%
10                    5           0.04%
11                    3           0.03%
12                    1           0.01%
13                    1           0.01%
14                    2           0.02%
15                    3           0.03%
16                    1           0.01%
18                    3           0.03%
21                    7           0.06%
Grand Total           11861       100.00%

Thanks again for sharing this theorem!
JJ
 
#6
That has to be the most right skewed data I have ever seen! Does the degree of skew make sense to you given the nature of the data? Otherwise it appears your interpretation is correct. You have a z value of 21...
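A minimal sketch of that check against the frequency table in post #5. The binning rule behind the "Standard Deviations" column was not stated, so this assumes a row labeled k holds values at least k SDs from the mean. For each k it compares the share of observations in rows k and above with the Chebyshev limit of 1/k^2.

```python
# Frequency table from post #5: {standard deviations: count}.
counts = {
    1: 11599, 2: 139, 4: 43, 5: 16, 6: 11, 7: 15, 8: 11, 9: 1, 10: 5,
    11: 3, 12: 1, 13: 1, 14: 2, 15: 3, 16: 1, 18: 3, 21: 7,
}
total = sum(counts.values())

def tail_fraction(k):
    """Share of observations in rows labeled k or higher."""
    return sum(c for label, c in counts.items() if label >= k) / total

# Assumption: a row labeled k holds values at least k SDs from the mean.
results = {k: (tail_fraction(k), 1 / k**2) for k in counts if k >= 2}

for k, (observed, limit) in results.items():
    print(f"k={k:>2}  observed={observed:.4%}  Chebyshev limit={limit:.4%}")
```

Under that assumption every row stays under its limit, including the seven observations at 21 SDs.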
 
#7
Yes, the heavy right skew does make sense for this attribute. Thanks again. :tup: