PS: a 5-point Likert item is not normally distributed, but the normality requirement applies to the residuals.
Anyway, in your case the sample size is large (120), so it shouldn't be a problem even if the distribution of the residuals is skewed (not symmetrical).
The second model is incorrect, even if you ignore the interaction X1X2.
I will take one row as an example:
In the first model:
X1 X2 X1X2 Y
4 6 24 40
In the second model:
X1 X2 X1X2 Y
4 0 0 40
0 6 0 40...
What test do you run for each pair?
If, for example, you use a significance level of 0.05 in each test
In the end, in each test (pair) you run, the allowed probability of a type I error is 0.05, but the potential maximum allowed probability in all the tests
So the probability not to get a...
What test do you want to run?
How many tests?
For example, if you compare 4 algorithms: A, B, C, D:
If you run the following tests:
A-B (for example, a t-test to compare the average of A to the average of B)
In this example case, you run 6 comparisons so n=6.
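To make the count concrete, here is a minimal sketch that enumerates the pairs for the four algorithms and shows how the chance of at least one type I error grows over the 6 tests. It assumes the tests are independent, which is an assumption of this sketch and not something stated above; the variable names are only illustrative.

```python
from itertools import combinations

algorithms = ["A", "B", "C", "D"]
alpha = 0.05  # significance level used in each single test

# Every unordered pair of algorithms is one comparison.
pairs = list(combinations(algorithms, 2))
n = len(pairs)

# Probability of at least one type I error across all n tests,
# ASSUMING the tests are independent (an assumption of this sketch).
familywise_error = 1 - (1 - alpha) ** n

# Upper bound that holds without the independence assumption.
upper_bound = min(1.0, n * alpha)

print("pairs:", pairs)
print("number of comparisons n =", n)
print("familywise error (independent tests): %.4f" % familywise_error)
print("upper bound n * alpha: %.2f" % upper_bound)
```

With independent tests the familywise probability is 1 - 0.95^6, which is well above the 0.05 allowed in each single test.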
First, not all the correlation values in the matrix are necessarily significant; this needs to be proved (although the sample size is large).
For example, -0.004 may not be significant. (I didn't test)
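A rough way to check a single correlation is the usual t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below uses a normal approximation for the p-value, which is reasonable only for a large sample. The sample size of 1000 is a hypothetical placeholder, since the real one is not given here, and `correlation_p_value` is an illustrative helper, not a library function.

```python
import math

def correlation_p_value(r, n):
    """Two-sided p-value for H0: correlation = 0.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2) and a normal
    approximation to the t distribution (large n only).
    """
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

# n = 1000 is a HYPOTHETICAL sample size used only for illustration.
t_stat, p_value = correlation_p_value(-0.004, 1000)
print("t = %.4f, approximate p = %.4f" % (t_stat, p_value))
```

Under that assumed n, a correlation of -0.004 is nowhere near significant at the 0.05 level; with the real sample size the conclusion has to be checked again.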
The method is usually the opposite: first, you think about what you want to achieve and then you...
Thanks Miner, this is a very interesting article!
I assume it is relevant for any multiple regression.
I guess this is the reason why an automatic process like the stepwise method for multiple regression is good only as a screening method and not as a way to build a model?
I think that the sample size of 50 is small enough to use the binomial distribution instead of any other approximation distribution.
Since the distribution is discrete, the confidence level won't be exactly the required level.
On the other hand, maybe a sample size of 50 is big enough to use...
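The point about discreteness can be illustrated with a small sketch. It assumes the "exact" binomial interval means the Clopper-Pearson interval (my assumption, not stated above) and computes the actual coverage for n=50 at a hypothetical true proportion of 0.3. Only the standard library is used; all function names are illustrative.

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def bisect_increasing(g, target, iters=60):
    """Find p in [0, 1] with g(p) = target, for g increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials."""
    # Lower limit: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else bisect_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit: P(X <= k | p) = alpha / 2
    upper = 1.0 if k == n else bisect_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper

def coverage(n, p_true, alpha=0.05):
    """Probability that the interval contains p_true."""
    total = 0.0
    for k in range(n + 1):
        lower, upper = exact_interval(k, n, alpha)
        if lower <= p_true <= upper:
            total += binom_pmf(k, n, p_true)
    return total

n = 50
p_true = 0.3  # HYPOTHETICAL true proportion, for illustration only
low15, high15 = exact_interval(15, n)
actual = coverage(n, p_true)
print("interval for 15/50: (%.4f, %.4f)" % (low15, high15))
print("actual coverage at p = %.1f: %.4f" % (p_true, actual))
```

Because only 51 outcomes are possible, the coverage moves in jumps as the true proportion changes, so it is at least the nominal 95% but not equal to it.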
Okay, so let's think about the theoretical question...
Probably doing it separately for each of the 15 distributions will be okay.
Just an idea: what about multiple regression? (didn't say which one; for linear regression you need to meet the assumptions, and the normality requirement is only for the residuals)
Usually, when n=1000 you won't get df=3, but this was only an example and it describes your point well :)
As I understand it, when trying to identify the distribution you don't look only at the edge but at the entire distribution.
Like Shapiro-Wilk does for the normal distribution and...