When outliers are present in a dataset, they can distort computed summary statistics such as the mean and standard deviation, biasing the model towards the outlier values and away from the majority of observations. The model is then forced to strike a balance between accurately fitting the outliers and performing well on normal data, which tends to degrade its overall performance on both. Robust regression algorithms address this problem by explicitly accounting for outliers in the dataset.

In this notebook we will show how to fit robust NeuralForecast methods. We will:
- Install NeuralForecast.
- Load the noisy AirPassengers dataset.
- Fit and predict a robustified NeuralForecast.
- Plot and evaluate the predictions.

You can run these experiments on GPU with Google Colab.

1. Installing NeuralForecast

!pip install neuralforecast
import logging
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
from random import random
from random import randint
from random import seed

from neuralforecast import NeuralForecast
from neuralforecast.utils import AirPassengersDF

from neuralforecast.models import NHITS
from neuralforecast.losses.pytorch import MQLoss, DistributionLoss, HuberMQLoss

from utilsforecast.losses import mape, mqloss
from utilsforecast.evaluation import evaluate
logging.getLogger("pytorch_lightning").setLevel(logging.ERROR)

2. Loading Noisy AirPassengers

In this example we will use the classic Box-Cox AirPassengers dataset, which we augment by introducing outliers.

In particular, we will focus on introducing outliers to the target variable, deviating it from the original observations by a multiple of its standard deviation, in this case 2 to 4 times.

# Original Box-Cox AirPassengers 
# as defined in neuralforecast.utils
Y_df = AirPassengersDF.copy() 
plt.plot(Y_df.y)
plt.ylabel('Monthly Passengers')
plt.xlabel('Timestamp [t]')
plt.grid()

# Here we add some artificial outliers to AirPassengers
seed(1)
for i in range(len(Y_df)):
    factor = randint(2, 4)
    if random() > 0.97:
        Y_df.loc[i, "y"] += factor * Y_df["y"].std()

plt.plot(Y_df.y)
plt.ylabel('Monthly Passengers + Noise')
plt.xlabel('Timestamp [t]')
plt.grid()

# Split datasets into train/test 
# Last 12 months for test
Y_train_df = Y_df.groupby('unique_id').head(-12)
Y_test_df = Y_df.groupby('unique_id').tail(12)
Y_test_df
     unique_id          ds      y
132        1.0  1960-01-31  417.0
133        1.0  1960-02-29  391.0
134        1.0  1960-03-31  419.0
135        1.0  1960-04-30  461.0
136        1.0  1960-05-31  472.0
137        1.0  1960-06-30  535.0
138        1.0  1960-07-31  622.0
139        1.0  1960-08-31  606.0
140        1.0  1960-09-30  508.0
141        1.0  1960-10-31  461.0
142        1.0  1960-11-30  390.0
143        1.0  1960-12-31  432.0

3. Fit and Predict Robustified NeuralForecast

Huber MQ Loss

The Huber loss, used in robust regression, is a loss function that is less sensitive to outliers in the data than the squared error loss. The Huber loss function is quadratic for small errors and linear for large errors. Here we will use a slightly modified version for probabilistic predictions. Feel free to play with the δ parameter.
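For intuition, here is a minimal NumPy sketch of the point-wise Huber loss. The huber helper and its delta argument are illustrative only, not part of NeuralForecast; in HuberMQLoss the threshold corresponds to the δ parameter mentioned above.

import numpy as np

def huber(residual, delta=1.0):
    # Quadratic inside |residual| <= delta, linear outside
    small = np.abs(residual) <= delta
    return np.where(small,
                    0.5 * residual ** 2,
                    delta * (np.abs(residual) - 0.5 * delta))

print(huber(np.array([0.5, 5.0])))  # [0.125 4.5]: small errors squared, large errors only linear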

Dropout Regularization

The dropout technique is a regularization method used in neural networks to prevent overfitting. During training, dropout randomly sets a fraction of the input units or neurons in a layer to zero at each update, effectively "dropping" those units. This means the network cannot rely on any individual unit, since it may be dropped out at any time. By preventing units from co-adapting too much, dropout forces the network to learn more robust and generalizable representations.

The dropout method can help us make the network more robust to outliers in the autoregressive features. You can explore it through the dropout_prob_theta parameter, illustrated in the sketch below.
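As a quick standalone illustration (plain PyTorch, independent of NHITS internals), dropout zeroes a random fraction of units during training and rescales the survivors:

import torch

torch.manual_seed(0)
drop = torch.nn.Dropout(p=0.6)  # same rate we pass to dropout_prob_theta below
x = torch.ones(10)
print(drop(x))  # ~60% of entries zeroed; survivors scaled by 1/(1-p) = 2.5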

Fit the NeuralForecast Models

Using the NeuralForecast.fit method, you can train a set of models on your dataset. You can define the forecasting horizon (12 in this example) and modify the models' hyperparameters. For example, for NHITS we set a robust Huber loss and increased its dropout regularization.

See the NHITS and MLP model documentation.

horizon = 12
level = [50, 80]

# Try different hyperparameters to improve accuracy.
models = [NHITS(h=horizon,                           # Forecast horizon
                input_size=2 * horizon,              # Length of input sequence
                loss=HuberMQLoss(level=level),    # Robust Huber Loss
                valid_loss=MQLoss(level=level),   # Validation signal
                max_steps=500,                       # Number of steps to train
                dropout_prob_theta=0.6,              # Dropout to robustify vs outlier lag inputs
                #early_stop_patience_steps=2,        # Early stopping regularization patience
                val_check_steps=10,                  # Frequency of validation signal (affects early stopping)
                alias='Huber',
              ),
          NHITS(h=horizon,
                input_size=2 * horizon,
                loss=DistributionLoss(distribution='Normal', 
                                      level=level), # Classic Normal distribution
                valid_loss=MQLoss(level=level),
                max_steps=500,
                #early_stop_patience_steps=2,
                dropout_prob_theta=0.6,
                val_check_steps=10,
                alias='Normal',
              )
          ]
nf = NeuralForecast(models=models, freq='M')
nf.fit(df=Y_train_df)
Y_hat_df = nf.predict()
# By default NeuralForecast produces forecast intervals
# In this case the lo-x and hi-x levels represent the
# low and high bounds of the prediction interval accumulating x% probability
Y_hat_df
    unique_id          ds  Huber-median  Huber-lo-80  Huber-lo-50  Huber-hi-50  Huber-hi-80      Normal  Normal-median  Normal-lo-80  Normal-lo-50  Normal-hi-50  Normal-hi-80
 0        1.0  1960-01-31    412.738525   401.058044   406.131958   420.779266   432.124268  406.459717     416.787842   -124.278656    135.413223    680.997070    904.871765
 1        1.0  1960-02-29    403.913544   384.403534   391.904419   420.288208   469.040375  399.827148     418.305725   -137.291870    103.988327    661.940430    946.699219
 2        1.0  1960-03-31    472.311523   446.644531   460.767334   486.710999   512.552979  380.263947     378.253998   -105.411003    117.415565    647.887695    883.611633
 3        1.0  1960-04-30    460.996674   444.471039   452.971802   467.544189   480.843903  432.131378     442.395844   -104.205200    135.457123    729.306885    974.661743
 4        1.0  1960-05-31    465.534790   452.048889   457.472626   476.141022   490.311005  417.186279     417.956543   -117.399597    150.915833    692.936523    930.934814
 5        1.0  1960-06-30    538.116028   518.049866   527.238159   551.501709   563.818848  444.510834     440.168396    -54.501572    189.301392    703.502014    946.068909
 6        1.0  1960-07-31    613.937866   581.048035   597.368408   629.111450   645.550659  423.707275     431.251526    -97.069489    164.821259    687.764526    942.432251
 7        1.0  1960-08-31    616.188660   581.982300   599.544128   632.137512   643.219543  386.655823     383.755157   -134.702011    139.954285    658.973022    897.393494
 8        1.0  1960-09-30    537.559143   513.477478   526.664856   551.563293   573.146667  388.874817     379.827057   -139.859344    110.772484    673.086182    926.355774
 9        1.0  1960-10-31    471.107605   449.207916   459.288025   486.402985   515.082458  401.483643     412.114990   -185.928085     95.805717    703.490784    970.837830
10        1.0  1960-11-30    412.758423   389.203308   398.727295   431.723602   451.208588  425.829895     425.018799   -172.022018    108.840889    723.424011   1035.656128
11        1.0  1960-12-31    457.254761   438.565582   446.097168   468.809296   483.967865  406.916595     399.852051   -199.963684    110.715050    729.735107    951.728577
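As a quick sanity check on these intervals (a sketch using the columns shown above and the test split defined earlier), we can measure how often the true test values fall inside the Huber model's 80% band:

# Empirical coverage of the Huber 80% prediction interval
coverage_df = Y_hat_df.merge(Y_test_df, on=['unique_id', 'ds'])
inside = coverage_df['y'].between(coverage_df['Huber-lo-80'],
                                  coverage_df['Huber-hi-80'])
print(f'Huber 80% interval empirical coverage: {inside.mean():.0%}')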

4. Plot and Evaluate Predictions

Finally, we plot the forecasts of both models against the real values.

Then we evaluate the accuracy of the NHITS-Huber and NHITS-Normal forecasters.

fig, ax = plt.subplots(1, 1, figsize = (20, 7))
plot_df = pd.concat([Y_train_df, Y_hat_df]).set_index('ds') # Concatenate the train and forecast dataframes
plot_df[['y', 'Huber-median', 'Normal-median']].plot(ax=ax, linewidth=2)

ax.set_title('Noisy AirPassengers Forecast', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()

To evaluate the median predictions we use the mean absolute percentage error (MAPE), defined as follows:

$$\mathrm{MAPE}(\mathbf{y}_{\tau}, \hat{\mathbf{y}}_{\tau}) = \mathrm{mean}\left(\frac{|\mathbf{y}_{\tau}-\hat{\mathbf{y}}_{\tau}|}{|\mathbf{y}_{\tau}|}\right)$$
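In plain NumPy the metric reads as follows (mape_np is an illustrative helper; the evaluation below uses utilsforecast.losses.mape, which operates on dataframes):

import numpy as np

def mape_np(y, y_hat):
    # Mean absolute percentage error over arrays of targets and predictions
    return np.mean(np.abs(y - y_hat) / np.abs(y))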

To evaluate the coherent probabilistic predictions we use the continuous ranked probability score (CRPS), defined as follows:

$$\mathrm{CRPS}(\hat{F}_{\tau}, \mathbf{y}_{\tau}) = \int^{1}_{0} \mathrm{QL}(\hat{F}_{\tau}, y_{\tau})_{q} \, dq$$
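In practice the integral is discretized into an average of quantile (pinball) losses over a grid of levels, which is essentially what MQLoss aggregates. A rough NumPy sketch (q_preds is a hypothetical list of per-quantile forecast arrays aligned with q_levels):

import numpy as np

def quantile_loss_np(y, y_hat_q, q):
    # Pinball loss at quantile level q
    diff = y - y_hat_q
    return np.maximum(q * diff, (q - 1.0) * diff)

def crps_approx(y, q_preds, q_levels):
    # Average the pinball losses over the quantile grid to
    # approximate the integral over q
    return np.mean([quantile_loss_np(y, q_preds[i], q).mean()
                    for i, q in enumerate(q_levels)])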

As you can see, the improvements from robust regression show up in both the point and the probabilistic forecasting settings.

df_metrics = Y_hat_df.merge(Y_test_df, on=['ds', 'unique_id'])
df_metrics.rename(columns={'Huber-median': 'Huber'}, inplace=True)

metrics = evaluate(df_metrics,
                   metrics=[mape, mqloss],
                   models=['Huber', 'Normal'],
                   level = [50, 80],
                   agg_fn="mean")

metrics
   metric     Huber     Normal
0    mape  0.034726   0.140207
1  mqloss  5.511535  61.891651
