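Quantile Forecasts

In forecasting, we are often interested in the distribution of predictions rather than only a point forecast, because we want a notion of the uncertainty around the forecast.

To this end, we can create quantile forecasts.

Quantile forecasts have an intuitive interpretation, since each one represents a specific percentile of the forecast distribution. This lets us make statements such as "we expect 90% of our air passenger observations to be above 100", which is the same as saying that the 10th-percentile forecast is 100. This approach helps with planning under uncertainty: it provides a range of possible future values and helps users make more informed decisions by considering the full spectrum of potential outcomes.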

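With TimeGPT, we can create a forecast distribution and extract quantile forecasts for specified percentiles. For instance, the 25th and 75th quantiles give the lower and upper quartiles of the expected outcomes, respectively, while the 50th quantile, or median, provides a central estimate.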
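To make the percentile interpretation concrete, here is a minimal sketch that computes quartiles from a synthetic sample with NumPy. The sample stands in for a forecast distribution and is purely illustrative; it is not produced by TimeGPT.

```python
import numpy as np

# Synthetic sample standing in for a forecast distribution (illustrative only).
rng = np.random.default_rng(0)
samples = rng.normal(loc=120.0, scale=15.0, size=10_000)

# Lower quartile, median, and upper quartile of the sample.
q25, q50, q75 = np.quantile(samples, [0.25, 0.50, 0.75])

# The 10th percentile is the value that about 90% of the samples exceed.
q10 = np.quantile(samples, 0.10)
share_above_q10 = float(np.mean(samples > q10))

print(f"q25={q25:.1f}, median={q50:.1f}, q75={q75:.1f}")
print(f"share of samples above q10: {share_above_q10:.2f}")
```

The last line shows the reading used in the air passengers statement: a quantile forecast at level 0.1 is the value that roughly 90% of outcomes are expected to exceed.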

TimeGPT uses conformal prediction to produce the quantiles.
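The sketch below illustrates the general idea behind a conformal-style construction: absolute errors from a held-out calibration window are turned into symmetric offsets around the point forecast. It is a simplified illustration under our own assumptions, not TimeGPT's internal implementation. The function name `conformal_quantiles` and the error values are hypothetical.

```python
import numpy as np

def conformal_quantiles(point_forecast, calibration_errors, quantiles):
    """Turn absolute calibration errors into quantile forecasts around a point forecast."""
    abs_errors = np.abs(np.asarray(calibration_errors, dtype=float))
    result = {}
    for q in quantiles:
        if q == 0.5:
            # In this sketch the median is the point forecast itself.
            result[q] = float(point_forecast)
            continue
        # Quantile q is an endpoint of a central interval with coverage |2q - 1|.
        coverage = abs(2.0 * q - 1.0)
        half_width = float(np.quantile(abs_errors, coverage))
        sign = 1.0 if q > 0.5 else -1.0
        result[q] = float(point_forecast) + sign * half_width
    return result

# Hypothetical errors (actual minus forecast) from a calibration window.
errors = [-12.0, 5.0, -3.0, 8.0, 15.0, -7.0, 2.0, -9.0]
fcst = conformal_quantiles(point_forecast=440.0, calibration_errors=errors,
                           quantiles=[0.1, 0.5, 0.9])
print(fcst)
```

With this construction, the quantiles at levels q and 1 - q sit at equal distances from the point forecast. The sample output later in this tutorial is also symmetric around the TimeGPT column, which is consistent with such an approach, although it does not by itself confirm how the service computes them.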

1. Import packages

First, we import the required packages and initialize the Nixtla client.

import pandas as pd
from nixtla import NixtlaClient

from IPython.display import display

nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)
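Since the client falls back to the NIXTLA_API_KEY environment variable, it can be convenient to fail early when no key is available. The helper below is a hypothetical sketch that uses only the standard library; `resolve_api_key` is not part of the nixtla package.

```python
import os

def resolve_api_key(explicit_key=None, env_var="NIXTLA_API_KEY"):
    """Return the explicit key if given, otherwise the environment variable's value."""
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise ValueError(f"No API key found; set {env_var} or pass one explicitly.")
    return key

# Usage sketch (not executed here):
# nixtla_client = NixtlaClient(api_key=resolve_api_key())
```

Keeping the key in the environment avoids hard-coding it in notebooks or scripts that may be shared.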

👍 Use an Azure AI endpoint

To use an Azure AI endpoint, set the base_url argument:

nixtla_client = NixtlaClient(base_url="your azure ai endpoint", api_key="your api_key")

2. Load data

df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')
df.head()

	timestamp	value
0	1949-01-01	112
1	1949-02-01	118
2	1949-03-01	132
3	1949-04-01	129
4	1949-05-01	121
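Before forecasting, it can help to confirm that the timestamps parse cleanly and follow a regular frequency. The following sketch rebuilds the five rows shown above by hand, so it runs without downloading the file, and checks them with pandas.

```python
import pandas as pd

# First five rows of the air passengers data, copied from the table above.
sample = pd.DataFrame({
    "timestamp": ["1949-01-01", "1949-02-01", "1949-03-01", "1949-04-01", "1949-05-01"],
    "value": [112, 118, 132, 129, 121],
})

# Parse the strings into datetimes and infer the sampling frequency.
timestamps = pd.to_datetime(sample["timestamp"])
freq = pd.infer_freq(timestamps)  # month-start data is reported as "MS"

# Check the target column for missing values.
has_missing = bool(sample["value"].isna().any())

print(freq, has_missing)
```

The inferred "MS" (month start) matches the frequency that the client reports in its log output when forecasting this series.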

3. Forecast with quantiles

When using TimeGPT for time series forecasting, you can set the quantiles you want to predict. Here is how:

quantiles = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
timegpt_quantile_fcst_df = nixtla_client.forecast(
    df=df, h=12, 
    quantiles=quantiles, 
    time_col='timestamp', target_col='value',
)
timegpt_quantile_fcst_df.head()

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...

	timestamp	TimeGPT	TimeGPT-q-10	TimeGPT-q-20	TimeGPT-q-30	TimeGPT-q-40	TimeGPT-q-50	TimeGPT-q-60	TimeGPT-q-70	TimeGPT-q-80	TimeGPT-q-90
0	1961-01-01	437.837952	431.987091	435.043799	435.384363	436.402155	437.837952	439.273749	440.291541	440.632104	443.688812
1	1961-02-01	426.062744	412.704956	414.832837	416.042432	421.719196	426.062744	430.406293	436.083057	437.292651	439.420532
2	1961-03-01	463.116577	437.412564	444.234985	446.420233	450.705762	463.116577	475.527393	479.812921	481.998169	488.820590
3	1961-04-01	478.244507	448.726837	455.428375	465.570038	469.879114	478.244507	486.609900	490.918976	501.060638	507.762177
4	1961-05-01	505.646484	478.409872	493.154315	497.990848	499.138708	505.646484	512.154260	513.302121	518.138654	532.883096
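A quick sanity check on this kind of output is to confirm that the quantile columns do not cross. The sketch below uses the first forecast row from the table above, typed in by hand, and needs no API call.

```python
# First forecast row (1961-01-01), copied from the output table above.
row = {
    "TimeGPT": 437.837952,
    "TimeGPT-q-10": 431.987091,
    "TimeGPT-q-20": 435.043799,
    "TimeGPT-q-30": 435.384363,
    "TimeGPT-q-40": 436.402155,
    "TimeGPT-q-50": 437.837952,
    "TimeGPT-q-60": 439.273749,
    "TimeGPT-q-70": 440.291541,
    "TimeGPT-q-80": 440.632104,
    "TimeGPT-q-90": 443.688812,
}

quantile_cols = [f"TimeGPT-q-{p}" for p in range(10, 100, 10)]
values = [row[c] for c in quantile_cols]

# Quantile forecasts should be non-decreasing in q (no quantile crossing).
is_monotone = all(a <= b for a, b in zip(values, values[1:]))

# In this output the median coincides with the point forecast.
median_equals_point = row["TimeGPT-q-50"] == row["TimeGPT"]

# The 10th and 90th percentiles bound a central 80% interval.
interval_80 = (row["TimeGPT-q-10"], row["TimeGPT-q-90"])

print(is_monotone, median_equals_point, interval_80)
```

Pairs of symmetric quantiles, such as 0.1 and 0.9, can be read as the endpoints of a central prediction interval, here one with 80% nominal coverage.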

📘 Available models in Azure AI

If you are using an Azure AI endpoint, be sure to set model="azureai":

nixtla_client.forecast(..., model="azureai")

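For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon.

By default, timegpt-1 is used. For how and when to use timegpt-1-long-horizon, see the tutorial on long-horizon forecasting.

TimeGPT returns the forecast for each quantile q in a column named TimeGPT-q-{int(100 * q)}.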
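The naming rule above can be reproduced with a small helper, which is handy for selecting quantile columns programmatically. The function `quantile_column` is a hypothetical convenience and not part of the nixtla package. It applies the stated format to the quantiles used in this tutorial.

```python
def quantile_column(q, model_name="TimeGPT"):
    """Build the column name for quantile q using the format described above."""
    return f"{model_name}-q-{int(100 * q)}"

quantiles = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
columns = [quantile_column(q) for q in quantiles]
print(columns)
```

With the forecast frame from this section, `timegpt_quantile_fcst_df[columns]` would then select only the quantile forecasts.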

nixtla_client.plot(
    df, timegpt_quantile_fcst_df, 
    time_col='timestamp', target_col='value',
)

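It is important to note that the choice of quantile (or quantiles) depends on your specific use case. For high-stakes forecasts, you might lean toward a more conservative quantile, for example the 10th or 20th percentile when a low outcome is the worst case, to make sure you are prepared for worst-case scenarios. On the other hand, if over-preparing is costly, you might choose a quantile closer to the median, such as the 50th percentile, to balance caution and efficiency.

For instance, if you are managing inventory for a retail business during a big sales event, planning against a higher demand quantile (for example, the 80th or 90th percentile) can help you avoid running out of stock, even if it means you may overstock somewhat. If you are scheduling staff for a restaurant, however, you might choose a quantile closer to the median so that you have enough staff on hand without overstaffing.

Ultimately, the choice depends on how you weigh risk against cost in your specific context, and TimeGPT's quantile forecasts give you the flexibility to tailor your strategy to that balance.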
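One common way to formalize this trade-off is the newsvendor critical ratio, which sets the target quantile to the cost of under-preparing divided by the sum of both costs. The sketch below is an illustration with hypothetical cost figures and helper names; it is not a TimeGPT feature.

```python
def target_quantile(cost_under, cost_over):
    """Newsvendor critical ratio: the quantile that balances the two unit costs."""
    return cost_under / (cost_under + cost_over)

def nearest_available(q, available):
    """Pick the forecast quantile closest to the target."""
    return min(available, key=lambda a: abs(a - q))

available = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Hypothetical costs: a lost sale costs 4 per unit, a leftover unit costs 1.
q_star = target_quantile(cost_under=4.0, cost_over=1.0)
chosen = nearest_available(q_star, available)

print(q_star, chosen)
```

When both costs are equal the ratio is 0.5, which corresponds to planning against the median. The more expensive a shortfall is relative to an excess, the higher the quantile to plan against.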

Historical forecast

You can also compute quantile forecasts for historical forecasts by adding the add_history=True parameter, as follows:

timegpt_quantile_fcst_df = nixtla_client.forecast(
    df=df, h=12, 
    quantiles=quantiles, 
    time_col='timestamp', target_col='value',
    add_history=True,
)
timegpt_quantile_fcst_df.head()

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
INFO:nixtla.nixtla_client:Calling Historical Forecast Endpoint...

	timestamp	TimeGPT	TimeGPT-q-10	TimeGPT-q-20	TimeGPT-q-30	TimeGPT-q-40	TimeGPT-q-50	TimeGPT-q-60	TimeGPT-q-70	TimeGPT-q-80	TimeGPT-q-90
0	1951-01-01	135.483673	111.937768	120.020593	125.848879	130.828935	135.483673	140.138411	145.118467	150.946753	159.029579
1	1951-02-01	144.442398	120.896493	128.979318	134.807604	139.787660	144.442398	149.097136	154.077192	159.905478	167.988304
2	1951-03-01	157.191910	133.646004	141.728830	147.557116	152.537172	157.191910	161.846648	166.826703	172.654990	180.737815
3	1951-04-01	148.769363	125.223458	133.306284	139.134570	144.114625	148.769363	153.424102	158.404157	164.232443	172.315269
4	1951-05-01	140.472946	116.927041	125.009866	130.838152	135.818208	140.472946	145.127684	150.107740	155.936026	164.018852

nixtla_client.plot(
    df, timegpt_quantile_fcst_df, 
    time_col='timestamp', target_col='value',
)
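Historical quantile forecasts make it possible to check calibration: for a well-calibrated quantile q, roughly a share q of the actual values should fall at or below the forecast. The sketch below computes this empirical coverage on tiny synthetic frames. The helper `empirical_coverage` is hypothetical, and it assumes that both frames store the timestamp column with the same dtype.

```python
import pandas as pd

def empirical_coverage(actuals, forecasts, quantiles,
                       time_col="timestamp", target_col="value"):
    """Share of actual values at or below each quantile forecast."""
    # Assumes time_col has the same dtype in both frames.
    merged = actuals.merge(forecasts, on=time_col, how="inner")
    return {
        q: float((merged[target_col] <= merged[f"TimeGPT-q-{int(100 * q)}"]).mean())
        for q in quantiles
    }

# Tiny synthetic frames for illustration only.
actuals = pd.DataFrame({
    "timestamp": pd.date_range("2020-01-01", periods=4, freq="MS"),
    "value": [10.0, 12.0, 9.0, 11.0],
})
forecasts = pd.DataFrame({
    "timestamp": actuals["timestamp"],
    "TimeGPT-q-10": [9.0, 10.0, 9.5, 9.0],
    "TimeGPT-q-90": [12.0, 13.0, 12.0, 10.5],
})

coverage = empirical_coverage(actuals, forecasts, [0.1, 0.9])
print(coverage)
```

In real usage, the inputs would be df and the frame returned with add_history=True, after converting the timestamp column in df with pd.to_datetime if needed.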

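Cross-validation

The quantiles argument can also be passed to the cross_validation method, which allows comparing TimeGPT's performance across different time windows and different quantiles.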

timegpt_cv_quantile_fcst_df = nixtla_client.cross_validation(
    df=df, 
    h=12, 
    n_windows=5,
    quantiles=quantiles, 
    time_col='timestamp', 
    target_col='value',
)
timegpt_cv_quantile_fcst_df.head()

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
INFO:nixtla.nixtla_client:Validating inputs...

cutoffs = timegpt_cv_quantile_fcst_df['cutoff'].unique()
for cutoff in cutoffs:
    fig = nixtla_client.plot(
        df.tail(100), 
        timegpt_cv_quantile_fcst_df.query('cutoff == @cutoff').drop(columns=['cutoff', 'value']),
        time_col='timestamp', 
        target_col='value'
    )
    display(fig)
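Beyond plotting each window, quantile forecasts can be scored numerically. A standard metric is the pinball (quantile) loss, computed here per cutoff and per quantile. This is a sketch on synthetic data. It relies on the cutoff and value columns used in the plotting loop above, and the helper names are hypothetical.

```python
import numpy as np
import pandas as pd

def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss of forecasts y_pred at quantile level q."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))

def pinball_by_cutoff(cv_df, quantiles, target_col="value"):
    """Score each cross-validation window separately for each quantile."""
    rows = []
    for cutoff, window in cv_df.groupby("cutoff"):
        for q in quantiles:
            col = f"TimeGPT-q-{int(100 * q)}"
            rows.append({
                "cutoff": cutoff,
                "quantile": q,
                "pinball_loss": pinball_loss(window[target_col], window[col], q),
            })
    return pd.DataFrame(rows)

# Synthetic cross-validation output with two windows (illustrative only).
cv_example = pd.DataFrame({
    "cutoff": ["2020-06-01"] * 2 + ["2020-07-01"] * 2,
    "value": [10.0, 12.0, 11.0, 9.0],
    "TimeGPT-q-50": [9.0, 13.0, 11.0, 10.0],
})

scores = pinball_by_cutoff(cv_example, [0.5])
print(scores)
```

Lower values are better. Comparing the loss across cutoffs shows whether forecast quality is stable over time, and comparing it across quantiles shows which parts of the distribution are estimated well.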
