Normality and Box-Cox transformation

#1
Hi... I have been studying time series forecasting for a while and have come up with the queries below. Can anyone please provide a simple explanation for each?

1. Why do we have to normalise the data using transformations? Won't it affect the relationship between variables? It seems to distort the scale of the data, but does it still reduce the effect of outliers?

2. I see that white-noise errors should have zero (or very low) correlation, so that the errors are completely random, which indicates the model is good. But if the errors show normality or stable variance, does that imply the errors follow no pattern and that we did not miss any variable in our model? Do residuals need to be normal to show that a model is good?

3. How is homoscedasticity different from white noise?

4. How do we calculate the optimal lambda for the Box-Cox transform?

5. Is there any simple derivation available for the Box-Cox transformation?

If anyone can address all these questions, it would be really helpful. Thank you in advance.
 

Miner

TS Contributor
#2
I cannot answer all of your questions, but I can refer you to information about Q4 and Q5. See Minitab Help, Methods and Formulas, for the Box-Cox transform.

Regarding Q3, homoscedasticity is typically used to indicate that the variance does not change with the magnitude of the variable. White noise would mean that only random elements of variation are present, and there are no signals such as a trend or seasonality. It may help you understand the concept to see how it is used in industrial statistics through control charts. Control charts quantify and use the random element of variation (common cause variation) to highlight the non-random element or signal (special cause variation).
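To supplement the Minitab reference for Q4 and Q5, here is a sketch of the standard one-parameter Box-Cox transform and why the log appears at lambda = 0 (this is the usual textbook form, not something specific to Minitab):

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\\[1ex]
\ln y, & \lambda = 0,
\end{cases}
\qquad y > 0.
% The \lambda = 0 branch is the limit of the \lambda \neq 0 branch:
% \lim_{\lambda \to 0} \frac{y^{\lambda}-1}{\lambda}
%   = \lim_{\lambda \to 0} \frac{e^{\lambda \ln y}-1}{\lambda}
%   = \ln y,
% using e^{\lambda \ln y} \approx 1 + \lambda \ln y for small \lambda.
```

The optimal lambda (Q4) is then usually chosen by maximizing the profile log-likelihood of the transformed data over a grid or by numerical optimization.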
 

noetsi

Fortran must die
#3
I am not a statistician, just someone who has read about time series for a good while. Let the buyer beware :p

If anything, time series analysts tend to ignore these issues more than most statistical analysts. This is not just my opinion; it is the view of other experts I have read.

Normality has nothing to do with forecasting point values, so if forecasting point values is what you are interested in, I would not worry about it. In the time series literature I have seen, it gets limited attention. Transformations are often used in econometrics based on a specific economic model or theory. Normality does influence the p-values and thus the statistical tests. From my observation, time series analysis pays little attention to these tests in most cases (unless you are doing something like regression with ARIMA errors). For classical exponential smoothing models, unlike ARIMA, it is ignored, since that method has no known distribution [although in recent decades it has been realized that these are transfer functions which do have theoretical distributions].

White noise simply means the model's errors have no serial patterns in them, as far as I know. I have never seen heteroscedasticity raised in discussions of it. Here is one view of this...

It is often incorrectly assumed that Gaussian noise (i.e., noise with a Gaussian amplitude distribution – see normal distribution) necessarily refers to white noise, yet neither property implies the other. Gaussianity refers to the probability distribution with respect to the value, in this context the probability of the signal falling within any particular range of amplitudes, while the term 'white' refers to the way the signal power is distributed (i.e., independently) over time or among frequencies.
But there are many definitions of white noise, and in some cases heteroscedasticity may matter.

https://en.wikipedia.org/wiki/White_noise
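As an illustrative sketch (my own example, not from the thread): the "no serial pattern" property can be checked by computing sample autocorrelations of a series; for true white noise they should all sit close to zero (roughly within ±2/√n), while any leftover signal such as a trend produces large autocorrelations.

```python
import numpy as np

def sample_acf(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

rng = np.random.default_rng(42)
n = 500
noise = rng.normal(size=n)           # white noise: purely random, no serial structure
trend = noise + 0.05 * np.arange(n)  # the same noise plus a deterministic trend

r1_noise = sample_acf(noise, 1)      # should be near zero (within ~2/sqrt(500) ~ 0.09)
r1_trend = sample_acf(trend, 1)      # should be large and positive: the trend dominates
print(r1_noise, r1_trend)
```

This is the idea behind residual diagnostics such as the ACF plot or the Ljung-Box test: if the residuals are white noise, no lag should show meaningful correlation.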

R has a way to determine lambda, but I do not know what search mechanism is utilized.
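For Q4, outside of R, one common approach is to pick the lambda that maximizes the Box-Cox log-likelihood of the transformed data; SciPy's `scipy.stats.boxcox` does this by numerical optimization when no lambda is supplied. A minimal sketch (the data here is simulated for illustration):

```python
import numpy as np
from scipy.stats import boxcox

rng = np.random.default_rng(0)
# Positive, right-skewed data (Box-Cox requires strictly positive values).
y = rng.lognormal(mean=0.0, sigma=1.0, size=200)

# With lmbda=None, boxcox returns the transformed series and the lambda
# that maximizes the Box-Cox log-likelihood.
y_transformed, lam = boxcox(y)
print(lam)
```

For lognormal data like this, the estimated lambda comes out near 0, i.e., close to a plain log transform, as expected.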
 
#4
Miner said:
Regarding Q3, homoscedasticity is typically used to indicate that the variance does not change with the magnitude of the variable. White noise would mean that only random elements of variation are present, and there are no signals such as a trend or seasonality. ...
Thank you for this explanation
 
#5
noetsi said:
Normality has nothing to do with forecasting point values. ... For classical exponential smoothing models, unlike ARIMA, it is ignored, since that method has no known distribution ...
How come you are saying most of these models, like exponential smoothing, won't fit any probability distribution, like the normal distribution? Even though the goal is to forecast point values, isn't a point forecast more like the mean of a forecast/probability distribution within its prediction interval?
 
#6
Miner said:
White noise would mean that only random elements of variation are present, and there are no signals such as a trend or seasonality. ...
So the white noise error is completely random, and if it's present it should be pretty obvious in any forecasting error measure, right? Like it will be obvious while assessing RMSE.
 

Miner

TS Contributor
#7
So the white noise error is completely random, and if it's present it should be pretty obvious in any forecasting error measure, right? Like it will be obvious while assessing RMSE.
Yes, white noise is random, and it will always be present in naturally occurring time series. Yes, this noise is a component of forecasting error, but it is not the only component. Using the wrong forecasting method would also generate forecasting error.
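As an illustrative sketch of that point (my own simulated example, not from the thread): even a forecast that matches the true signal exactly still has an RMSE around the noise's standard deviation, while a wrong method (here, a flat constant forecast for a trending series) pays that noise floor plus the error from the missed trend.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
signal = 0.1 * np.arange(n)                 # the true underlying trend
y = signal + rng.normal(scale=1.0, size=n)  # observed series: signal + white noise

def rmse(forecast, actual):
    return float(np.sqrt(np.mean((forecast - actual) ** 2)))

good = rmse(signal, y)               # "perfect" method: predicts the true signal
bad = rmse(np.full(n, y.mean()), y)  # wrong method: a flat, constant forecast

# good ~ 1.0 (the white-noise floor set by the noise std);
# bad is much larger because the flat forecast misses the trend entirely.
print(good, bad)
```

No method can beat the white-noise floor; extra RMSE above it comes from model error.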
 

noetsi

Fortran must die
#8
How come you are saying most of these models, like exponential smoothing, won't fit any probability distribution, like the normal distribution? Even though the goal is to forecast point values, isn't a point forecast more like the mean of a forecast/probability distribution within its prediction interval?
The people who created exponential smoothing did not build a theoretical model for it; it was purely an empirical model. It had no known distribution when these methods were created, and thus no confidence interval. Much later it was realized that ES models were special forms of transfer functions, so a theoretical distribution could be determined.
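For context on why no distribution was needed (a sketch of the textbook recursion, assumptions mine): simple exponential smoothing is a purely mechanical update in which each forecast is a weighted average of the latest observation and the previous smoothed level, with no probabilistic model attached.

```python
def simple_exp_smoothing(y, alpha):
    """Simple exponential smoothing: level = alpha*y_t + (1-alpha)*level.
    Returns the one-step-ahead point forecast after processing all of y."""
    level = y[0]                 # initialize the level at the first observation
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level

# A constant series is forecast exactly, regardless of alpha.
print(simple_exp_smoothing([5.0] * 10, alpha=0.3))  # -> 5.0
```

The recursion produces point forecasts with no error distribution; prediction intervals only became available once ES was embedded in a probabilistic framework.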