Which is the better prediction model?

The aim is to predict the breakdown time of a machine as a percentage of scheduled hours for the next day. So my time series looks like this,

Break_down_percentage = 7%, 8%, 10%, 6%, 12%, etc.

There are 315 data points available for testing the different models. I used ets(), arima() and nnetar() in a rolling-window, one-step-ahead prediction scheme to identify the best model. I also wanted to determine a good size for the rolling window used to predict the next day's value, so I tried all the models for a range of window sizes, to find both a good model and a good amount of past data for predicting the future value.

I plotted the MAPE values against the rolling-window size, as shown in the graph. From the graph it can be seen that the ideal window size is all of the data points, but it is not practical to use all previous data points given the system dynamics: very old data may not be useful in predicting the future value, since a lot may have changed in the machine since then. The expert view in manufacturing is that the last two to three months of values (roughly 60 to 90 data points) will be useful. Also, training on n−1 data points leaves only 1 data point to test the model, and that could be another reason for the low MAPE at the largest window size.
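For concreteness, here is a minimal sketch of the rolling-window, one-step-ahead evaluation described above. The original analysis used R's forecast package (ets, arima, nnetar); this Python version substitutes a hand-rolled simple exponential smoothing as the forecaster, and the breakdown percentages are made up for illustration.

```python
# Rolling-window, one-step-ahead MAPE evaluation (illustrative sketch).
# The model here is simple exponential smoothing; in the real analysis it
# would be ets()/arima()/nnetar() refit on each window.

def ses_forecast(history, alpha=0.3):
    """One-step-ahead forecast via simple exponential smoothing."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

def rolling_mape(series, window):
    """MAPE of one-step-ahead forecasts using only the last `window` points."""
    errors = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        forecast = ses_forecast(history)
        actual = series[t]
        errors.append(abs(actual - forecast) / actual)
    return 100 * sum(errors) / len(errors)

# Made-up daily breakdown percentages
breakdown_pct = [7, 8, 10, 6, 12, 9, 11, 8, 7, 10, 13, 9, 8, 11, 10, 7, 9, 12, 8, 10]
for w in (5, 10, 15):
    print(f"window {w}: MAPE {rolling_mape(breakdown_pct, w):.1f}%")
```

Sweeping `window` over a grid and plotting the resulting MAPEs reproduces the kind of curve described in the post.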

So my question is,

The ARMA model is found to be better than ETS between 60 and 90 data points, with the lowest MAPE of 8.8% at a window size of 74. Is this a very high value for a rolling-window prediction, or would it be advisable to use more data (a larger window size) for future predictions?
Are there any other models that would be interesting to try in addition to the ones above?
Note: I am self-taught on this topic and haven't attended any formal classes on data science or prediction, so apologies if this is obvious.


No cake for spunky
I am not sure what ETS is, but exponential smoothing (of which there is a family of models depending on trend and seasonality) has been shown to work as well as ARIMA (admittedly in tests that are now nearly 40 years old; I'm not sure that analysis has ever been reproduced). In particular, ARIMA has serious problems with non-linearity, while exponential smoothing can sometimes cope with it.

Unless someone has done this type of analysis, they are unlikely to be able to comment on what a "high" MAPE is. For a phenomenon that is highly predictable (where many types of time series models work well), the results you found might be poor; in a highly unpredictable, highly varied environment, a MAPE like this might be higher than normal and still acceptable. Is there any academic or industrial-engineering literature on this topic?

The advantage of exponential smoothing models (beyond simplicity, robustness, and the points I made above) is that they effectively weight recent data more heavily than past data. So you don't have to discard old data just because what happened in the past is less pertinent. Generally you don't want to throw away data in a time series unless you think there is a structural break (that is, the process has changed significantly), which I think may be the case here.
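The down-weighting of old data can be made concrete: the one-step simple exponential smoothing forecast is a weighted average of past observations with geometrically decaying weights α(1−α)^k on the observation k days ago. A quick sketch (the α value here is just an example):

```python
# Why exponential smoothing "forgets" old data on its own: the weight on
# the observation k days ago is alpha * (1 - alpha)^k, which decays
# geometrically, so truncating the window is usually unnecessary.

alpha = 0.2
weights = [alpha * (1 - alpha) ** k for k in range(10)]
for k, w in enumerate(weights):
    print(f"{k} days ago: weight {w:.3f}")

# The last 10 observations carry 1 - (1 - alpha)^10 of the total weight,
# about 89% with alpha = 0.2, so very old data barely matter even if kept.
print(f"weight on last 10 days: {sum(weights):.3f}")
```

Larger α forgets faster; smaller α spreads the weight further back, which is roughly the role the window size played in the rolling-window setup.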

I am hardly an expert in time series either; I have spent a lot of time learning it, but it's not a simple method. Personally, I would not have started with ARIMA, because trying to identify the real model (p, d, q), or a seasonal model, is far from easy with real-world data. The ideal cases presented in books rarely show up in real-world data.
Hello. Thank you for taking time and answering my question.

ets() is exponential smoothing. I agree that ARIMA is much more difficult to implement in practice and looks ideal mainly in textbook examples. I haven't seen any such examples in the industrial-engineering area. Forecasting is very popular when it comes to product demand forecasting and the like, but it's not yet popular within hardcore operations.

I agree with the advantages you pointed out for exponential smoothing. But operations are so dynamic in nature that everything changes within a short time, so the structure of the time series data also changes constantly. I'm not sure how to build a model that can capture this.


TS Contributor
You might also consider a different approach. Reliability and Maintainability techniques are often used to model machine breakdowns. Hazard rates would tell you whether the breakdowns are increasing, decreasing or constant. Reliability growth models would allow you to make predictions.
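One simple entry point into the reliability view sketched above is a trend test on the breakdown times: the Laplace test checks whether a failure rate is increasing, decreasing, or roughly constant. This is a hedged illustration, not the poster's analysis; the breakdown times below are made up, and `T` is the length of the (time-truncated) observation window.

```python
# Laplace trend test on cumulative breakdown times (illustrative sketch).
# Under a constant failure rate, U is approximately standard normal;
# U > 0 hints at an increasing rate (deterioration), U < 0 at improvement.
import math

def laplace_trend(times, T):
    """Laplace test statistic for n failure times observed over (0, T]."""
    n = len(times)
    return (sum(times) / n - T / 2) / (T * math.sqrt(1 / (12 * n)))

# Failures clustered late in the window suggest a rising rate:
print(round(laplace_trend([60, 70, 80, 85, 90, 95], 100), 2))  # prints 2.55
```

A clearly positive U would motivate a reliability growth (deterioration) model rather than a plain time series forecast.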


No cake for spunky
A lot of forecasts in business are likely what are called judgmental: people use their professional expertise (domain knowledge) rather than formal tests, or they adjust the statistics based on their knowledge. I hear exponential smoothing is used a lot there; it is certainly far easier to do and requires much less expertise than ARIMA.

I would think Cox Proportional Hazard models would show breakdowns and the predictors that drove them. But that is hardly an easy method to learn.

A good rule of thumb is to always do what Miner suggests in anything involving industrial engineering. I have for years :)
I guess that the aim is not:

"The aim is to predict the breakdown time"

but rather to find explanations and the cause of machine breakdowns. Predicting is not the same as finding the cause.

Then, like Miner, I would suggest looking at reliability issues.

First, about the time series data: is there any autocorrelation at a lag of 7 days?
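That lag-7 check is easy to sketch; a weekly pattern in the daily breakdown percentages would show up as a large autocorrelation at lag 7. The series below is made up and perfectly periodic purely to illustrate the computation.

```python
# Sample autocorrelation at a given lag (illustrative sketch).

def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var

series = [7, 8, 10, 6, 12, 9, 11] * 6  # repeats every 7 days by construction
print(round(autocorr(series, 7), 3))  # prints 0.833: strong weekly signal
```

On real data one would scan lags 1 through, say, 14 (or just call R's acf()) and look for spikes.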

Second, about the causes of breakdown: is it the repair time that increases the total, or, if repair time is constant, is it the severity of the breakdown that increases the time?

Maybe there are several types of breakdowns; then you would have several dependent variables (all measured in time).

I would model the data with a possibly skewed distribution, like the exponential or the gamma distribution. Maybe there are many days with no breakdowns at all; then you would need a zero-inflated distribution.
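A quick first check on whether zero inflation matters is simply the share of zero-breakdown days versus the shape of the positive part; if many days are exactly zero, a plain exponential or gamma fit will misbehave and the zero probability should be modelled separately. The data here are made up for illustration.

```python
# First-pass check for zero inflation in daily breakdown times (sketch).
# A large share of exact zeros suggests modelling P(zero) separately and
# fitting a skewed distribution (exponential/gamma) to the positive days.

data = [0, 0, 7, 0, 12, 0, 0, 5, 9, 0, 0, 0, 14, 6, 0, 8]

p_zero = data.count(0) / len(data)
positive = [v for v in data if v > 0]
mean_pos = sum(positive) / len(positive)

print(f"share of zero days: {p_zero:.2f}, mean of positive days: {mean_pos:.2f}")
```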

Often there are explanatory background variables ("x-variables") that can explain the problem. Search for recorded data. There seems to be a downward trend in the data, so it seems they have figured out some of the problems (found some of the causal x-variables, or been able to control them better).

There may also be a need for robust design along the lines of Taguchi's methods. Another possibility is "evolutionary operation".

Just because someone has given you some data doesn't mean that all the answers are in those data. Often you need to get other information. This is not a school assignment (or is it?)


TS Contributor
It seems to me that this is not so much about whether breakdowns happen (we basically have non-zero percentages all the time?) but more about the availability of the machine (actually 100 − availability, per day). So I guess survival analysis is not suitable for these data: we do not know how many breakdowns happened, when, etc. Is a low availability driven by many breakdowns that were quickly repaired, or did we have only one breakdown that took longer to fix?

So, in order to go deeper, I guess you would need to disentangle these two elements: to do a survival-type analysis you would need the number, and ideally the times, of the breakdowns; to analyse the availability, the waiting times until repair and the durations of the repairs.