Simple Exponential Smoothing is similar to a moving average, but instead of weighting each point equally, the weights are assigned by an exponentially decaying function. Forecasts are more sensitive to recently observed values and less so to ones further back in time.
This guide will look at two approaches to SES. The first is vectorised: we construct an array of weights and multiply it with a time series. It is computationally efficient, but we will discover its limitations. We'll then turn to a component form, where we travel step-by-step through a series and construct a new "smoothed" series. This approach also lays the foundation for future tutorials on Double and Triple Exponential Smoothing.
Let's take a simple example where we have 5 days of recorded sales.
| | DAY 1 | DAY 2 | DAY 3 | DAY 4 | DAY 5 |
|---|---|---|---|---|---|
| SALES | 1 | 4 | 2 | 0 | 5 |
If we were to forecast the sales for day 6 using the average of the previous days, we would just calculate
$$\frac{1+4+2+0+5}{5} = 2.4$$Another way to think of the average is to assign each value a weight. In this case, each day will have equal weighting - or you can think of it as each day having an equal influence on the final forecast. We have 5 days, so our weight will be $\frac{1}{5}=0.2$.
| | DAY 1 | DAY 2 | DAY 3 | DAY 4 | DAY 5 | FORECAST |
|---|---|---|---|---|---|---|
| SALES | 1 | 4 | 2 | 0 | 5 | |
| WEIGHT | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | |
| SALES * WEIGHT | 0.2 | 0.8 | 0.4 | 0 | 1 | 2.4 |
We can multiply the weight on each day by the recorded sales and sum the results. As expected, our forecast is again 2.4. Great, you say, a longer way of calculating the average. But we can adjust the weights, and in the case of simple exponential smoothing, assign a larger weight to Day 5 and decrease it exponentially the further back in time we go. This will make the forecast more sensitive to Day 5 and less sensitive to Day 1.
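The equal-weight calculation above can be reproduced in a couple of lines of NumPy:

```python
import numpy as np

# Recorded sales for Days 1-5
sales = np.array([1, 4, 2, 0, 5])

# Equal weighting: each of the 5 days contributes 1/5 to the forecast
weights = np.full(5, 0.2)

forecast = (weights * sales).sum()
print(f'{forecast:.1f}')  # 2.4
```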
The function we will use to calculate the weights takes the form of $\alpha(1-\alpha)^t$, where $\alpha$ is known as the smoothing factor and $t$ is the number of steps backward from the last observed value. The value of $\alpha$ must satisfy $0 < \alpha < 1$.
To begin, let's use $\alpha$ = 0.8.
$$\text{Day 5 weight} = 0.8 \times (1 - 0.8)^0 = 0.8$$
$$\text{Day 4 weight} = 0.8 \times (1 - 0.8)^1 = 0.16$$
$$\dots$$

| | DAY 1 | DAY 2 | DAY 3 | DAY 4 | DAY 5 | FORECAST |
|---|---|---|---|---|---|---|
| SALES | 1 | 4 | 2 | 0 | 5 | |
| WEIGHT | 0.00128 | 0.0064 | 0.032 | 0.16 | 0.8 | |
| SALES * WEIGHT | 0.00128 | 0.0256 | 0.064 | 0 | 4.0 | 4.09 |
We can again multiply the sales and weights and sum the results. This time our forecast is much higher, because it's heavily influenced by the last observed value of 5.
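The exponential weighting above can also be computed directly; here `t` counts steps backward from Day 5:

```python
import numpy as np

sales = np.array([1, 4, 2, 0, 5])
alpha = 0.8

# t = 4 for Day 1 (oldest) down to t = 0 for Day 5 (most recent)
weights = np.array([alpha * (1 - alpha)**t for t in range(4, -1, -1)])

forecast = (weights * sales).sum()
print(f'{forecast:.2f}')  # 4.09
```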
There are two scenarios to consider to gain a little intuition about setting $\alpha$ and its effect on the forecast. When $\alpha$ is close to 1, almost all of the weight falls on the most recent observations, so the forecast reacts quickly to new values. When $\alpha$ is close to 0, the weight is spread more evenly across the history of the series, so individual fluctuations are averaged out.
So while $\alpha$ is referred to as the smoothing factor, it's actually the lower values of $\alpha$ that will give a 'smoother' result.
We can plot the weights for different values of alpha to see how they change, e.g. alpha=0.5 and alpha=0.1.
import numpy as np
import matplotlib.pyplot as plt

# t runs from 9 (oldest point) down to 0 (most recent), so the largest
# weight sits at the right-hand end of each bar chart
alpha_high = np.array([0.5*(1-0.5)**t for t in range(9,-1,-1)])
alpha_low = np.array([0.1*(1-0.1)**t for t in range(9,-1,-1)])
# Plot weight arrays
fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True, figsize=(12,4))
ax1.bar(np.arange(1,11), alpha_high)
ax1.set_ylabel('Weight')
ax1.set_title('Alpha = 0.5')
ax2.bar(np.arange(1,11), alpha_low)
ax2.set_title('Alpha = 0.1')
plt.show()
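The bar charts also hint at a practical point: for a series of length $n$ the weights form a geometric progression, and their sum is $1 - (1-\alpha)^n$ rather than exactly 1. A quick check for the two alphas above, using the same 10 weights:

```python
import numpy as np

n = 10
for alpha in (0.5, 0.1):
    weights = np.array([alpha * (1 - alpha)**t for t in range(n)])
    # Geometric series: the n weights sum to 1 - (1 - alpha)**n
    print(f'alpha={alpha}: sum of weights = {weights.sum():.4f}')

# alpha=0.5: sum of weights = 0.9990
# alpha=0.1: sum of weights = 0.6513
```

With alpha=0.1, a full third of the total weight is "missing" from a 10-point series, which foreshadows the limitation we run into below.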
Let's go through a quick example. We'll create a random time-series with a size of 50, with values ranging between 0 and 9.
random_sample = np.random.randint(10, size=50)
# Plot randomly generated time series
plt.figure(figsize=(12,4))
plt.plot(random_sample)
plt.xlabel('Time')
plt.show()
We want to generate an array of weights using the function $\alpha(1-\alpha)^t$, which we can do using a list comprehension.
n = len(random_sample)
alpha = 0.1
weights = np.array([alpha*(1-alpha)**t for t in range(n-1,-1,-1)])
# Plot array of weights
plt.figure(figsize=(12,4))
plt.bar(np.arange(1,n+1), weights)
plt.ylabel('Weight')
plt.show()
One important check we need to do is to ensure the weights sum to 1.
print(f'Sum of Weights: {weights.sum():.5f}')
We are close to 1, but not quite there. This means we're introducing a bias: we will be underestimating the forecast slightly. Nevertheless, we'll continue on and multiply the weights array with the random time series that was generated.
forecast = (weights * random_sample).sum()
print(f'Forecast: {forecast:.3f}')
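As a side note, one possible remedy for the bias is to rescale the weights so they sum exactly to 1 before taking the weighted sum. The sketch below illustrates the idea; `ses_normalised` is a name chosen here for illustration, not part of the classical SES formulation this guide follows:

```python
import numpy as np

def ses_normalised(ts, alpha):
    """Weighted-average forecast with the weights rescaled to sum to 1."""
    n = len(ts)
    weights = np.array([alpha * (1 - alpha)**t for t in range(n - 1, -1, -1)])
    weights /= weights.sum()  # remove the bias from the truncated weight series
    return np.sum(weights * np.asarray(ts))

rng = np.random.default_rng(0)
sample = rng.integers(10, size=50)
print(f'Forecast: {ses_normalised(sample, alpha=0.1):.3f}')
```

A quick way to convince yourself the bias is gone: on a constant series the rescaled forecast returns the constant exactly, whereas the unscaled weights would undershoot it.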
We can put this approach into a fairly simple function, which will return a single point forecast. It will also send a warning if the sum of the weights is below 0.99.
def ses_vectorised(ts, alpha):
    """
    Vectorised approach to simple exponential smoothing, returns a
    single point forecast

    Parameters
    ----------
    ts : array_like
        1-D time series
    alpha : float
        Smoothing factor, `0 < alpha < 1`

    Returns
    -------
    forecast : float
    """
    n = len(ts)
    ts = np.array(ts)
    weights = np.array([alpha*(1-alpha)**t for t in range(n-1,-1,-1)])
    weights_tot = weights.sum()
    if weights_tot < 0.99:
        print(f'Warning: weights sum to {weights_tot:.3f}, larger alpha or '
              'longer time-series required.')
    return np.sum(weights * ts)
As a quick test of the function, we'll pass the original sample and a smoothing factor of 0.05.
forecast = ses_vectorised(random_sample, alpha=0.05)
print(f'Forecast: {forecast:.3f}')
The sum of the weights is too low, so we need to either increase the value of alpha or use a longer time series. This is not ideal, as more data may not always be available, or perhaps we want to use a low value of alpha. So we will pivot to a different approach to SES.
The component approach to SES overcomes the length limitation of the vectorised method. This will involve a step-by-step iteration through the series, creating a secondary "smoothed" series which will be the forecast, $F_t$. It follows the equation,
$$F_t = \alpha A_{t-1} + (1 - \alpha)F_{t-1}$$

Using the same sales example from earlier, we'll begin by first setting $F_1 = A_1$.
| TIME (t) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| ACTUAL ($A_t$) | 1 | 4 | 2 | 0 | 5 |
| FORECAST ($F_t$) | 1 | | | | |
We are now able to calculate the value for $F_2$. $$\begin{aligned} F_2 &= \alpha A_{1} + (1 - \alpha)F_{1} \\ &= 0.8*1 + (1-0.8)*1 \\ &= 1 \end{aligned}$$
Just a note that when $F_1 = A_1$, the equation reduces to $F_2 = \alpha A_1 + (1-\alpha)A_1 = A_1$, so this first step can sometimes be skipped. Nevertheless, we can continue on, applying the formula step-by-step to fill in the rest of the table.
$$\begin{aligned} F_3 &= \alpha A_{2} + (1 - \alpha)F_{2} \\ &= 0.8*4 + (1-0.8)*1 \\ &= 3.4 \end{aligned}$$
$$\dots$$

| TIME (t) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| ACTUAL ($A_t$) | 1 | 4 | 2 | 0 | 5 |
| FORECAST ($F_t$) | 1 | 1 | 3.4 | 2.28 | 0.456 |
We can also take a step further and calculate a forecast for Day 6 using the final values in this table.
$$\begin{aligned} F_6 &= \alpha A_{5} + (1 - \alpha)F_{5} \\ &= 0.8*5 + (1-0.8)*0.456 \\ &\approx 4.09 \end{aligned}$$

This approach to SES can now be put into a function.
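Before wrapping the recursion in a function, the table above can be reproduced with a short loop as a sanity check:

```python
sales = [1, 4, 2, 0, 5]
alpha = 0.8

F = [sales[0]]  # initialise F_1 = A_1
for t in range(1, len(sales) + 1):
    # F_t = alpha * A_{t-1} + (1 - alpha) * F_{t-1}
    F.append(alpha * sales[t - 1] + (1 - alpha) * F[-1])

print([round(f, 4) for f in F])  # [1, 1.0, 3.4, 2.28, 0.456, 4.0912]
```

The final value, 4.0912, is the Day 6 forecast, matching the hand calculation.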
def ses(ts, alpha):
    """
    Perform simple exponential smoothing on an array and
    return the smoothed series

    Parameters
    ----------
    ts (N,) : array_like
        1-D time series
    alpha : float
        Smoothing factor, `0 < alpha < 1`

    Returns
    -------
    forecast (N+1,) : ndarray
        1-D forecast array
    """
    n = len(ts) + 1
    forecast = np.zeros(n)
    forecast[0] = ts[0]
    for i in range(1, n):
        forecast[i] = alpha*ts[i-1] + (1-alpha)*forecast[i-1]
    return forecast
The random_sample time series from earlier can be passed to the ses() function to return a smoothed series. We'll set alpha=0.3, obtain the smoothed series, and plot it on top of the original series.
forecast = ses(random_sample, alpha=0.3)
print(f'Final value: {forecast[-1]:.3f}')
# Plot results
plt.figure(figsize=(12,4))
plt.plot(random_sample, label='Actual')
plt.plot(forecast, linestyle='--', label='SES, alpha=0.3')
plt.xlabel('Time')
plt.legend()
plt.show()
The value of alpha will affect the smoothed series and ultimately the forecast. So how do we choose an appropriate value? This is explored further in the next tutorial, where we will look at a Greenhouse Gas Emissions dataset and optimise the selection of alpha for a given series.