Time series are one of the most important forms of data in machine learning, and a time series decomposition breaks down a series of observations over time into analyzable components. Examples of data that might form a time series include the prices of stocks at various times, the number of passengers flying on an airline per day, and the sales revenue of a company charted over time.
The components of a time series decomposition include trend, cycles, and seasonality, plus whatever residual noise is left over. By decomposing a time series into these components and plotting the result, one can gain greater insight into the nature of the time series. There are two forms of decomposition: additive and multiplicative.
In additive decomposition, you model the time series as the sum of various components, while in multiplicative decomposition, you model the time series as the product of various components.
How does additive decomposition work?
One particular way to perform an additive time series decomposition is to model the time series as the sum $y_t = S_t + C_t + T_t + R_t$. $S_t$ is the variation due to seasonality—a seasonal pattern exists when variation occurs regularly over a fixed period like a week or a quarter of the year.
$C_t$ is the cyclical component, representing variations that repeat but, unlike seasonal variations, do not have a fixed period.
$T_t$ is the trend, representing the persistent increasing or decreasing progression of the data, which does not have to be linear. Finally, $R_t$ is the residual, representing the random, irregular noise left over when all the other components are subtracted out. An additive decomposition is appropriate when the size of the variations around the trend does not depend on the level of the trend at any given point.
As a result, a time series whose trend is roughly constant, so that the data fluctuates around a fixed average, is generally well modeled by an additive decomposition.
Additive models are simpler than multiplicative models, and in this case the seasonal and cyclic variations cannot scale with the trend because the trend does not change at all. Examples of instances where an additive decomposition would be preferred to a multiplicative decomposition might be the level of a river as time passes or the amount of food in a refrigerator over 10 years. In both instances, the data is simply bouncing around an average as time passes.
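To make the additive model concrete, here is a minimal sketch. It assumes a classical moving-average approach, which is one common technique and not necessarily the only way to do it, and it leaves out the cyclical component $C_t$. The quarterly pattern, the constant level, and the function names are all hypothetical, and the data is noise-free so that the recovered components are easy to check by eye.

```python
from statistics import mean

def centered_moving_average(y, period):
    """Trend estimate: average over one full seasonal period, centered on t.
    Positions where the window does not fit are left as None."""
    half = period // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half : t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:
            # Even period: give the two end points half weight each.
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend

def additive_decompose(y, period):
    """Split y into trend, seasonal, and residual parts so that y = T + S + R."""
    trend = centered_moving_average(y, period)
    detrended = [None if tr is None else v - tr for v, tr in zip(y, trend)]
    # Average the detrended values at each position within the season.
    raw = []
    for k in range(period):
        vals = [d for i, d in enumerate(detrended) if d is not None and i % period == k]
        raw.append(mean(vals))
    center = mean(raw)
    pattern_est = [s - center for s in raw]  # force the seasonal effects to sum to zero
    seasonal = [pattern_est[i % period] for i in range(len(y))]
    resid = [None if tr is None else v - tr - s for v, tr, s in zip(y, trend, seasonal)]
    return trend, seasonal, resid

# Hypothetical quarterly data: a constant level of 10 plus a fixed seasonal pattern.
pattern = [2.0, -1.0, -3.0, 2.0]
y = [10.0 + pattern[i % 4] for i in range(20)]

trend, seasonal, resid = additive_decompose(y, period=4)
print(trend[2:6])    # trend estimate away from the edges
print(seasonal[:4])  # estimated seasonal effect for each quarter
```

Because the moving average spans exactly one seasonal period, the seasonal effects cancel inside each window and what remains is the level of the series. With real, noisy data the residual would not vanish, and the first and last few points would have no trend estimate.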
Multiplicative decomposition in action
One particular way to perform a multiplicative decomposition is to model the time series as the product $y_t = S_t C_t T_t R_t$, where the definitions of the variables are the same as for additive decomposition, except that the seasonal, cyclical, and residual components now act as scaling factors on the trend rather than as amounts added to it. A multiplicative decomposition is appropriate when variations around the trend of the data are proportional to the magnitude of the data at any given time.
In most instances where the trend of the data changes as time passes, a multiplicative time series decomposition is appropriate, because it’s natural for the variation around a trend to be proportional to the magnitude of the trend at any given time. Examples of instances where a multiplicative decomposition would be appropriate include the earnings of a growing company as time passes or the amount of text written by a child in school per day over a period of 10 years.
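The two forms are closely related. Assuming every component is strictly positive (which the product model needs anyway for the logarithm to be defined), taking logarithms turns the multiplicative model into an additive one:

$$\log y_t = \log S_t + \log C_t + \log T_t + \log R_t$$

In practice this means that a multiplicative series can, under that positivity assumption, be log-transformed and then handled with additive tools.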
Time series decomposition in industry
As a business application that illustrates how a decomposition plays out in the real world, consider decomposing the number of passengers flying per month on an airline over 10 years. It’s simplest to perform the decomposition without the cyclic variation, $C_t$, and assume only a seasonal variation, $S_t$, so that the model is $y_t = S_t T_t R_t$. One could assume that the airline’s business is growing over time because the population is growing. If population growth is in a linear regime, the trend line $T_t$ of the time series data will be linear as well. The seasonal swings, by contrast, will be proportional to the trend line itself, which is why $S_t$ enters as a multiplicative factor.
This is because the seasonal variation can be thought of as ratios of passengers willing to travel in a given month compared with those traveling in the month of January, rather than as an additive factor. The season may explain variations such as reduced travel during the winter months when it is less convenient to travel and higher travel during the summer months when many people decide to take vacations.
Finally, we consider the random noise, $R_t$, of the model. The number of passengers flying on an airline in a given month is driven by the individual decisions of millions of people, and this produces considerable randomness in the data from month to month. Sometimes the randomness is driven by events such as unusually bad weather or pandemics.
Separating out the three components $R_t$, $S_t$, and $T_t$ provides a more accurate and insightful view of the airline’s operations for business managers or investors than simply watching the non-decomposed series.
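The airline example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive method: the passenger numbers are synthetic, the monthly seasonal indices are invented, the noise term is fixed at 1, and the trend is estimated with a centered 12-month moving average (the classical ratio-to-moving-average approach). All names are hypothetical.

```python
from statistics import mean

PERIOD = 12  # months per year

# Invented monthly seasonal indices (average 1.0): quieter winters, busier summers.
true_seasonal = [0.8, 0.8, 0.9, 1.0, 1.0, 1.2, 1.3, 1.3, 1.0, 0.9, 0.9, 0.9]

# Synthetic 10-year series: linear trend times seasonal index, no noise (R_t = 1).
n = 10 * PERIOD
y = [(1000.0 + 10.0 * t) * true_seasonal[t % PERIOD] for t in range(n)]

def centered_ma(series, period):
    """Centered moving average for an even period; None where the window does not fit."""
    half = period // 2
    out = [None] * len(series)
    for t in range(half, len(series) - half):
        w = series[t - half : t + half + 1]
        out[t] = (0.5 * w[0] + sum(w[1:-1]) + 0.5 * w[-1]) / period
    return out

# Step 1: estimate the trend T_t.
trend = centered_ma(y, PERIOD)

# Step 2: divide out the trend, then average the ratios for each calendar month.
ratios = [None if tr is None else v / tr for v, tr in zip(y, trend)]
raw = [
    mean([r for i, r in enumerate(ratios) if r is not None and i % PERIOD == m])
    for m in range(PERIOD)
]
scale = mean(raw)
seasonal_index = [r / scale for r in raw]  # normalized so the indices average 1

# Step 3: whatever is left after dividing by trend and seasonality is the residual R_t.
residual = [
    None if tr is None else v / (tr * seasonal_index[i % PERIOD])
    for i, (v, tr) in enumerate(zip(y, trend))
]

# The same indices expressed as ratios relative to January.
jan_relative = [s / seasonal_index[0] for s in seasonal_index]

print([round(s, 2) for s in seasonal_index])
```

The estimated indices come out close to, but not exactly equal to, the invented ones, because a moving average of a trending multiplicative series is only an approximation of the trend. Normalizing the indices to average 1 is a design choice; dividing by the January value instead gives the January-relative ratios described above.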