Student's t-distribution table

JBtje

New Member
#1
Dear all,

In a simple linear regression example, I have a hypothesis:
H_0: ß = 0.

Normally T will then be:
T = (b - ß) / se(b)
but since I am testing whether ß = 0, I can use:
T = b / se(b)

Question 1) Is this correct? Or is there another reason why I may use T = b / se(b) instead of (b - ß) / se(b)?


Now according to the book: (b-ß) / se(b) ~ t_(n-2)

So, if my N = 20, I have to look in the Student's t-distribution table at the row where "N" (or "df", as it is called in my table) is 18 (18 = 20 - 2).

Question 2) Why do I have to look in the row where df = 18? Why not t_(n-3) (df = 17) or t_n (df = 20)?

Could anyone explain to me why T = (b - ß) / se(b) comes with t_(n-2), and not n-3, n-4, etc.?


Thank you!
Jeffrey
 

trinker

ggplot2orBust
#2
Because the notation of equations and regression models differs from book to book, software program to software program, and person to person, I don't fully understand what you're asking in question 1. I think you may mean standardized versus unstandardized betas, but this should be stated explicitly.

For question 2, the degrees of freedom come from your number of observations (n) and the number of parameters you're estimating (with a t distribution this is always 2 to my knowledge). Hence n-2=df.

Please clarify Q1 and perhaps we can offer further insight there as well.
 

JBtje

New Member
#3
Hello Trinker,

Thanks for the clarification and the quick reply!

Q1:
If one wants to calculate the 95% confidence interval for ß, then one starts from
T = (b - ß) / se(b)

Let's say N = 20, so df = 18; then one gets:

-2.101 < (b-ß) / se(b) < 2.101
which can be rewritten to:
b - 2.101 x se(b) < ß < b + 2.101 x se(b)
Filling in the numbers for b and se(b) gives the confidence interval for ß... great (and understandable)...
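
As a small illustration of the rearrangement above, here is a minimal Python sketch. The values of b and se(b) are made up for the example (they do not come from the book or this thread), and the function name confidence_interval is hypothetical; only the table value 2.101 for df = 18 is taken from the post.

```python
def confidence_interval(b, se_b, t_crit):
    """Return (lower, upper) for  b - t_crit*se(b) < beta < b + t_crit*se(b)."""
    half_width = t_crit * se_b
    return b - half_width, b + half_width

# Hypothetical estimates, chosen only for illustration.
b = 1.50
se_b = 0.40

# 2.101 is the table value for df = 18 (N = 20) quoted in the post.
lower, upper = confidence_interval(b, se_b, t_crit=2.101)
print(f"95% CI for beta: ({lower:.4f}, {upper:.4f})")
```

Note that ß itself never enters the computation: it is the unknown quantity that the interval is built around.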

But now suppose I want to test H_0: ß = 0 against H_1: ß != 0 (ß is not 0).
Then (translated from the book):
From the result (b - ß) / se(b) ~ t_(n-2) we can derive the test statistic.
Under the null hypothesis, (b - ß) / se(b) becomes T = b / se(b), with T ~ t_(n-2).
So, under the null hypothesis, T = b / se(b) can suddenly be used because it is a derivation. But why? How come this " - ß" can suddenly be ignored? Does it have something to do with the hypothesis H_0: ß = 0? Can the " - ß" also be ignored if the hypothesis is H_0: ß = 10?

I hope this makes it a little more clear :)

Jeffrey
 

Dason

Ambassador to the humans
#4
Question 1) Is this correct? Or is there another reason why I may use T = b / se(b) instead of (b - ß) / se(b)?
The reason you use b/se(b) is just because you're hypothesizing ß=0 so the ß disappears from the formula. T distributions can come about in other applications but I wouldn't worry about that stuff yet.


(with a t distribution this is always 2 to my knowledge). Hence n-2=df.
That isn't correct.

You were correct in that the number of degrees of freedom is N-k where k is the number of parameters in the model but it doesn't always need to be 2 (unless you're just doing simple linear regression).
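
To make the N-k rule concrete, here is a rough Python sketch using only the standard library. The function names (residual_df, t_pdf, central_area, critical_value) are invented for this example, and the numerical integration plus bisection is just one simple way to reproduce a table entry; it is an assumption of the sketch, not how printed tables or statistics packages are actually produced.

```python
import math

def residual_df(n, k):
    """Degrees of freedom: observations minus estimated parameters."""
    return n - k

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def central_area(c, df, steps=2000):
    """P(-c < T < c) by Simpson's rule (steps must be even)."""
    h = 2 * c / steps
    total = t_pdf(-c, df) + t_pdf(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-c + i * h, df)
    return total * h / 3

def critical_value(df, alpha=0.05):
    """Two-sided critical value: the c with P(-c < T < c) = 1 - alpha."""
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection on c
        mid = (lo + hi) / 2
        if central_area(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Simple linear regression estimates 2 parameters (intercept and slope);
# a model with one extra predictor would estimate 3.
for n, k in [(20, 2), (20, 3)]:
    df = residual_df(n, k)
    print(f"n={n}, k={k}, df={df}, critical value ~ {critical_value(df):.3f}")
```

For n = 20 and k = 2 this should land very close to the 2.101 used earlier in the thread; with k = 3 the row to read is df = 17, and the critical value is slightly larger.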
 

Dason

Ambassador to the humans
#5
So, under the null hypothesis, T = b / se(b) can suddenly be used because it is a derivation. But why? How come this " - ß" can suddenly be ignored? Does it have something to do with the hypothesis H_0: ß = 0? Can the " - ß" also be ignored if the hypothesis is H_0: ß = 10?
When doing testing you can't just ignore the ß when calculating the t-statistic - you plug in the hypothesized value.

While constructing the confidence interval, the ß term is what you're constructing the CI for, so it's not really ignored, but it's not part of the calculation itself, as you show.
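
A tiny Python sketch of "plug in the hypothesized value" might look like this. The function name t_statistic and the numbers for b and se(b) are hypothetical, picked only to show that H_0: ß = 0 and H_0: ß = 10 lead to different statistics from the same estimates.

```python
def t_statistic(b, se_b, beta_0=0.0):
    """T = (b - beta_0) / se(b), where beta_0 is the value claimed by H_0."""
    return (b - beta_0) / se_b

# Hypothetical estimates, for illustration only.
b, se_b = 12.0, 1.5

t_zero = t_statistic(b, se_b)               # H_0: beta = 0, reduces to b / se(b)
t_ten = t_statistic(b, se_b, beta_0=10.0)   # H_0: beta = 10, the "- beta" stays

print(f"H_0: beta = 0  -> T = {t_zero:.3f}")
print(f"H_0: beta = 10 -> T = {t_ten:.3f}")
```

So the " - ß" only seems to vanish for H_0: ß = 0 because subtracting zero changes nothing; for any other hypothesized value it has to stay in the numerator.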
 

trinker

ggplot2orBust
#6
I thought this was a simple t-test (for which I believe n-2=df is correct), but I now see that this is about the slopes of a regression model. I read this too quickly; thanks for fixing my misinforming the poster, Dason :) (you're still a dirty bot).
 

JBtje

New Member
#7
Hello,

Thank you both for the replies!
Although the answers trigger more questions here, for now I know enough :D
I'll continue studying and reading, and when the next question arises, I know where to find you :D

Thanks again!
Jeffrey
 

JBtje

New Member
#8
Hello,

Well… I’m stuck again :(

Question 1) The first question I’m dealing with is about alpha…
I have written down a whole test with different outcomes to try to understand what “alpha” actually is, but I would like some confirmation from someone who actually knows it :D

OK, is it correct to say that alpha (in a Student’s t-table) is an area of X% of the total area under the bell-shaped curve, located in the far left and far right tails?
So, in the next graphic, the blue filled areas are “alpha”, correct?
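
One way to check the "alpha is the area in the two tails" reading numerically is the following Python sketch (standard library only). It assumes a two-sided table, reuses the df = 18 row and its value 2.101 from earlier in the thread, and the helper names t_pdf and central_area are made up for the example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def central_area(c, df, steps=2000):
    """Area under the curve between -c and c, by Simpson's rule."""
    h = 2 * c / steps
    total = t_pdf(-c, df) + t_pdf(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-c + i * h, df)
    return total * h / 3

df = 18
c = 2.101                          # table value for df = 18, two-sided 5%
middle = central_area(c, df)       # unshaded area between -c and c
alpha = 1 - middle                 # both shaded tails together
print(f"middle area ~ {middle:.4f}, alpha ~ {alpha:.4f}, each tail ~ {alpha / 2:.4f}")
```

The total area under the curve is 1, so whatever is not between -c and c sits in the two tails, split evenly because the curve is symmetric.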


Question 2)
1) Y_i = alpha + ß x_i + e_i, with the e_i independent and N(0, sigma^2) distributed
2) H_0 : ß = 0 VS H_1 : ß != 0
3) T = b / se(b) with se(b) = s / (s_x * sqrt(n-1))
4) Under H_0 : T ~ t_(n-2) = t_79 (N=81)
5) T = 0.83 / 0.065 = 12.77

OK, and now I need to get the p-value, but how?
6) P(|T| >= 12.77)

Is it correct to say that the p-value is a value like “alpha” that indicates how likely it is that |T| >= 12.77 occurs? Or, to turn it around: what would alpha need to be for the outcome of T to fall in the critical region?

P.S. I would guess one can find the p-value by looking in the Student's t-table, in the bottom row (where df = infinity, since t_79), and then searching for a value that is slightly higher than (in this case) 12.77?
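
For what it's worth, steps 3) to 6) can be sketched in Python with only the standard library. The tail probability is computed by numerical integration, which is just one possible approach and an assumption of this sketch; the helper names t_pdf and upper_tail are invented here. The numbers 0.83, 0.065 and N = 81 are the ones from the steps above.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T > t), integrating the tail directly with Simpson's rule.

    The substitution x = t + u / (1 - u) maps [t, infinity) onto [0, 1).
    Integrating the tail itself (instead of computing 1 - CDF) keeps very
    small probabilities from being lost to rounding.
    """
    def g(u):
        if u >= 1.0:
            return 0.0  # the integrand vanishes at the far end for df > 1
        w = 1.0 - u
        return t_pdf(t + u / w, df) / (w * w)

    h = 1.0 / steps
    total = g(0.0) + g(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

n = 81
df = n - 2                 # simple linear regression: 2 estimated parameters
T = 0.83 / 0.065           # test statistic from step 5
p_value = 2 * upper_tail(abs(T), df)   # two-sided: P(|T| >= observed value)

print(f"T = {T:.2f}, df = {df}, two-sided p-value = {p_value:.2e}")
```

The p-value here is the combined area of both tails beyond the observed |T| under the t_79 curve, so it can be read as the smallest alpha at which H_0 would still be rejected. A printed table will usually only bound it: 12.77 is larger than every entry in the df = 79 (or infinity) row, so all the table can say is that the p-value is smaller than the smallest alpha it lists.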

Thanks!
Jeffrey