t, Z and sigma again

#22
Hi Joe,

I showed that when you estimate the standard deviation, the t-distribution gives better results than the normal distribution, even when n > 30.
And I showed you that when you estimate the standard deviation, there is no practical difference between t and Z; and I don't understand what you're measuring.
 

Dason

Ambassador to the humans
#24
Code:
n <- 10
N <- 10000000
mu <- 0
sigma <- 1

# Test against H0: mu = 0; we'll (incorrectly for a z-test)
# estimate the sd from the data
ztest_fromdat <- function(dat){
  xbar <- mean(dat)
  s <- sd(dat)

  z <- (xbar - 0)/(s/sqrt(length(dat)))  # use the sample size of dat rather than a global n
 
  pval <- 2*pnorm(-abs(z), 0, 1)
  return(pval)
}

single_trial <- function(n, mu, sigma){
  dat <- rnorm(n, mu, sigma)
  t.p <- t.test(dat)$p.value
  z.p <- ztest_fromdat(dat)
  output <- c(t = t.p, z = z.p)
  return(output)
}

output <- replicate(N, single_trial(n=n, mu=mu, sigma=sigma))
# Over repeated trials, the proportion of p-values below our
# alpha (here 0.05) should equal alpha when the null is true;
# that's the definition of what alpha is.
colMeans(t(output) <= 0.05)
If we run that code...

Code:
> colMeans(t(output) <= 0.05)
      t       z
0.05068 0.08229
> # This means that in our 100000 replications we rejected the null in 5.068% of the reps for the t-test
> # and we rejected the null in 8.229% of the reps for the z-test.  We used alpha=0.05 so it should be 5%.
> # The value we see for the t-test is within sampling error of the true value.  The value we see
> # for the z-test isn't.
>
> # If you're interested in the math...
> # The value for z...
> # is awfully close to this...
> # (because it's the value it's estimating)
> pt(qnorm(0.025), 9)*2
[1] 0.08164913
So you'll see ... if you're using the z-test when you're estimating the standard deviation from your data then you're going to be wrong. Where "wrong" here means that if you say you're using alpha=0.05 it's actually higher than that. As the sample size of the data increases the 'true' alpha will converge to the nominal value that you choose but especially for small samples it's not correct.
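If you're curious how that plays out for other sample sizes, here's a quick sketch of the same math: since the statistic computed with s in the denominator really follows a t-distribution with n-1 degrees of freedom, the true alpha of the "z-test with estimated sd" can be computed directly (the printed values are approximate, and the n = 10, 30 and 100 ones line up with the numbers quoted elsewhere in this thread).

Code:
# Actual type-I error of the "z-test with estimated sd" at nominal alpha = 0.05.
# Because the statistic follows a t-distribution with n-1 df, the true alpha is
#   2 * P(T_{n-1} < qnorm(alpha/2))
alpha <- 0.05
n_values <- c(10, 30, 100, 1000)
true_alpha <- 2 * pt(qnorm(alpha/2), df = n_values - 1)
round(setNames(true_alpha, paste0("n=", n_values)), 4)
#   n=10   n=30  n=100 n=1000
# 0.0816 0.0597 0.0528 0.0503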
 
#25
Thanks, and help me understand.
This starts with a standard Normal distribution, mu = 0, sigma = 1; then sigma is changed. Call the changed sigma "s".
What is/how is s varied? What is the variation procedure/formula?
What is n, or the range of n? How is n decided for each rep?
Thanks;
joe b.
 

Dason

Ambassador to the humans
#26
Sigma is the actual standard deviation for the population. I generate data from the theoretical population. Once that happens we treat that just like we would with real data - basically I pretend like I don't know the parameters of the distribution that generated the data. It's just data at that point. So if we are going to do a test of any kind we need to estimate those parameters. We use the sample mean and sample standard deviation for this test because we don't know what the parameters actually are (or at least that's what we're treating the situation like).

From there I just conduct the t-test and z-test like we would with that data. I then do this many times and see what happens.

For our particular example every time I generate new fake data I generate a sample of size 10 (that's the 'n' in my script). I repeat the whole experiment 100000 times I guess (I think I changed that particular value after copying the code here but before running the script). Seems a bit overkill but I had time to let it run and wanted a fairly accurate sample I guess.

You're free to modify the code to see what happens for different values. The important thing to note, though, is that the z-test isn't correct. Why do we say that? Because 1) it's just mathematically incorrect (although I don't think you'd be able to follow the arguments for why), and as a result 2) the actual type-I error rate isn't the same as the alpha level we choose for the test.
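For example, here's a minimal sketch of that, reusing ztest_fromdat() and single_trial() from post #24 (with fewer replications than the original run so it finishes quickly):

Code:
# Assumes ztest_fromdat() and single_trial() from post #24 are already defined.
set.seed(42)
reps <- 100000  # fewer replications than before, for speed

rejection_rates <- sapply(c(10, 30, 100), function(n) {
  pvals <- replicate(reps, single_trial(n = n, mu = 0, sigma = 1))
  rowMeans(pvals <= 0.05)  # proportion of reps with p <= alpha, for t and z
})
colnames(rejection_rates) <- paste0("n=", c(10, 30, 100))
rejection_rates
# The "t" row should sit near 0.05 for every n; the "z" row starts near 0.08
# at n = 10 and drifts down toward 0.05 as n grows.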
 
#27
n = 10? As I have mentioned, Z is about n > 30, or 50, or 100, depending on the source. Z doesn't work well below some n like 30. It doesn't compute when n = 10. Nobody I know would use Z in an n = 10 situation.
How about n = 30, then n = 100?
I have no clue about your script/program.
Thanks;
joe b.






Dason

Ambassador to the humans
#28
I'm realizing now I basically just did a slightly more manual version of what was posted here: http://www.talkstats.com/threads/t-z-and-sigma-again.75495/post-221569

You might not think that the difference is large between the actual error rates, but the difference is big enough that you should believe the z-test isn't hitting the stated type-I error rate. And yes, the difference decreases as the sample size increases. That's partly where the common suggestion comes from that if n > 30 then it doesn't really matter. But if you ask me... if it doesn't *really* matter, why would you use the more inappropriate version when it's just as easy to do the appropriate version in any software?
 
#29
What IS the difference at n = 30 and 100?

 

Dason

Ambassador to the humans
#30
Not much. At n=30 the actual type-I error for the z-test is 0.0596722 and at n=100 it is 0.05281139. The point is that the t-test is valid and the z-test isn't.

I guess I'm wondering what your questions still are?
 
#31
Well, it depends where you're standing. In Stats 101, I thought and acted that the Normal distribution was of great importance, introducing the notions of a distribution, a probability distribution, the area under the curve, and the CLT. About random and independent and hypothesis testing and interval probabilities vs point probabilities and all of it.
We make believe that we know variances, ANOVA, and that's sorta OK. But we don't know variances, else we'd have **** few other questions. And to suggest that a Z test is not "correct" unless we know the variance is nibbling on the fringes, confusing the students and pissing me off. The difference with n > 30, between t and Z, is inconsequential. We're in the area of claiming that the accelerator, brake and steering wheel are accelerators, because velocity is a vector quantity. Standing tall in the weeds doesn't impress or help the student. I've only been bothered by this since 1972. However, she has reported that there is pie.
joe b.
(Swiss rich folk are happier than Swiss poor folk. They told me. It's the yodeling.)
 

Dason

Ambassador to the humans
#32
We make believe that we know variances, ANOVA, and that's sorta OK
Not sure what you're talking about. ANOVA doesn't pretend to know the variances. In an ANOVA we calculate the variance from the data - just like we do with a t-test. As a matter of fact if you do an ANOVA on just two groups it's equivalent to doing a t-test.
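For example, a quick check of that equivalence in R (with made-up data, just to illustrate):

Code:
# One-way ANOVA on two groups vs. the pooled-variance two-sample t-test:
# the p-values are identical and F = t^2.
set.seed(1)
y <- c(rnorm(12, mean = 0), rnorm(12, mean = 0.5))
group <- factor(rep(c("A", "B"), each = 12))

tt <- t.test(y ~ group, var.equal = TRUE)
av <- anova(lm(y ~ group))

tt$p.value             # p-value from the pooled t-test
av[["Pr(>F)"]][1]      # p-value from the ANOVA F-test (the same number)
unname(tt$statistic)^2 # t squared ...
av[["F value"]][1]     # ... equals F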

The normal distribution IS important for a lot of things. And it's a great teaching tool. It gets used in teaching situations... and in a lot of complex situations where asymptotics are the best we can do. But nobody I know really uses a simple z-test. But you know what's a lot easier to teach initially than a t-test? That's right - the z-test. So maybe don't get so mad and just understand that it's a much simpler thing to teach conceptually - there are a lot less moving pieces when teaching that particular test.

I'm legitimately not sure why you seem so perplexed and angry about this. What's the big deal? The t-test is the appropriate thing to do. Maybe you weren't taught that but it's pretty much taught in every intro class I've ever seen.

You also seem to go on tangents and just make a lot of weird unrelated points. Not sure if that's a language barrier issue or something but you might want to work on staying on topic.
 
#33
Back in the old days, ANOVA assumed that variances were equal, the thought being that squared differences led to the conclusion that means differed. I guess that has changed.
 

Dason

Ambassador to the humans
#34
No. It still assumes variances are equal. There are alternatives that don't require that. But it doesn't require that the population variance is known which is what it sounded like you were talking about.
 

obh

Active Member
#35
What IS the difference at n = 30 and 100?
Hi Joe,

Sorry if I wasn't clear when posting the attached chart; it should answer your question.
The chart shows the actual type I error (rejecting a correct H0):
Blue Z - The actual type I error for the Z test when using the sample standard deviation.
Red T - The actual type I error for the T-test.

The actual type I error for the T-test is around 0.05, as expected, since it uses the correct distribution.




[Attached chart: t_z_pvalue0.png]
 

obh

Active Member
#36
PS: if you use the Z-test with the population standard deviation, which is the correct use of the test, you get a type I error similar to the significance level (0.05).
This is the expected result for any test used properly.
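For example (a small sketch reusing the setup from Dason's post #24, but plugging the known sigma into the denominator):

Code:
# z-test using the known population sigma instead of the sample sd;
# under H0 its type-I error should sit at the nominal alpha even for small n.
ztest_knownsigma <- function(dat, sigma){
  z <- (mean(dat) - 0)/(sigma/sqrt(length(dat)))
  2*pnorm(-abs(z))
}

set.seed(7)
reps <- 100000
pvals <- replicate(reps, ztest_knownsigma(rnorm(10, 0, 1), sigma = 1))
mean(pvals <= 0.05)  # close to 0.05, as expected for a properly used test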
 
#37
It's interesting to reflect that there are a variety of tests that all perform just as well as the t-test asymptotically. For example, consider running a t-test but deleting the last observation: clearly it is just as good as n -> +inf. Well, I think a lot of stats comes down to the idea that it doesn't matter that much what test you choose as long as it isn't totally screwball and n is big. Sort of a "Jack Handey Deep Thoughts".
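A quick sketch of that "delete the last observation" example (a made-up simulation, just to illustrate):

Code:
# A t-test that throws away the last observation is still a valid test
# (its type-I error is still alpha); at any fixed n it just wastes a
# little information, and asymptotically the two behave the same.
set.seed(123)
reps <- 50000
n_obs <- 50
p_full <- replicate(reps, t.test(rnorm(n_obs))$p.value)
p_drop <- replicate(reps, { x <- rnorm(n_obs); t.test(x[-n_obs])$p.value })
mean(p_full <= 0.05)  # ~0.05
mean(p_drop <= 0.05)  # also ~0.05; only the power differs slightly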
 
#38
PS: if you use the Z-test with the population standard deviation, which is the correct use of the test, you get a type I error similar to the significance level (0.05).
This is the expected result for any test used properly.
If you use the Z-test with s and a reasonable n, which is the INcorrect use of the test, you get a type I error similar to the significance level (0.05).
 

obh

Active Member
#40
If you use the Z-test with s and a reasonable n, which is the INcorrect use of the test, you get a type I error similar to the significance level (0.05).
Hi Joe,

If you need to hammer a nail you can use a hammer or a stone; which will you use?

Seriously, you can use the correct test with the same effort and get an exact result, or use the incorrect test and get an inaccurate result. So which will you use?
Correct, when using the incorrect test with a larger n you may get a smaller inaccuracy, but why not use the correct test?
When you only have a stone you may use the stone, but we have the hammer :)