Anomaly detection

Import packages

First, we import the packages required for this tutorial and create an instance of NixtlaClient.

import pandas as pd
from nixtla import NixtlaClient

nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)
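
The comment in the snippet above notes that the client falls back to the NIXTLA_API_KEY environment variable when no key is passed. A minimal sketch of keeping the key out of source code follows. resolve_api_key is a hypothetical helper, not part of the nixtla package, and the placeholder value is set here only so the example runs on its own.

```python
import os


def resolve_api_key(explicit_key=None, env_var="NIXTLA_API_KEY"):
    """Return the explicit key if given, else the value of env_var.

    Hypothetical helper for illustration; it is not part of nixtla.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise ValueError(f"No API key passed and {env_var} is not set")
    return key


# Placeholder so the sketch is self-contained; in practice the variable
# would be exported in the shell or injected by a secrets manager.
os.environ.setdefault("NIXTLA_API_KEY", "my_api_key_provided_by_nixtla")
api_key = resolve_api_key()
```

The resulting value can then be passed along as NixtlaClient(api_key=api_key).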

👍 Use an Azure AI endpoint

To use an Azure AI endpoint, set the base_url parameter:

nixtla_client = NixtlaClient(base_url="your azure ai endpoint", api_key="your api_key")

Load the dataset

Now, let's load the dataset for this tutorial. We use the Peyton Manning dataset, which tracks visits to Peyton Manning's Wikipedia page.

df = pd.read_csv('https://datasets-nixtla.s3.amazonaws.com/peyton-manning.csv')
df.head()

	ds	y
0	2007-12-10	9.590761
1	2007-12-11	8.519590
2	2007-12-12	8.183677
3	2007-12-13	8.072467
4	2007-12-14	7.893572
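
Because detect_anomalies is called with freq='D' in this tutorial, it can help to confirm that the timestamps really are daily before sending the data. The sketch below checks only the five rows printed above, using the standard library. find_gaps is a hypothetical helper name, and running the same check on the whole series would require the complete ds column.

```python
from datetime import date, timedelta

# The five rows shown by df.head() above.
rows = [
    ("2007-12-10", 9.590761),
    ("2007-12-11", 8.519590),
    ("2007-12-12", 8.183677),
    ("2007-12-13", 8.072467),
    ("2007-12-14", 7.893572),
]


def find_gaps(dates, step=timedelta(days=1)):
    """Return (previous, current) pairs whose spacing differs from step."""
    return [(a, b) for a, b in zip(dates, dates[1:]) if b - a != step]


dates = [date.fromisoformat(ds) for ds, _ in rows]
gaps = find_gaps(dates)
print(gaps)  # an empty list means the timestamps are consecutive days
```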

nixtla_client.plot(df, max_insample_length=365)

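Anomaly detection

We now perform anomaly detection. By default, TimeGPT uses a 99% confidence interval. If a point falls outside that interval, it is considered an anomaly.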

anomalies_df = nixtla_client.detect_anomalies(df, freq='D')
anomalies_df.head()

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Querying model metadata...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Calling Anomaly Detector Endpoint...

	ds	y	TimeGPT	TimeGPT-hi-99	TimeGPT-lo-99	anomaly
0	2008-01-10	8.281724	8.224187	9.503586	6.944788	False
1	2008-01-11	8.292799	8.151533	9.430932	6.872135	False
2	2008-01-12	8.199189	8.127243	9.406642	6.847845	False
3	2008-01-13	9.996522	8.917259	10.196658	7.637861	False
4	2008-01-14	10.127071	9.002326	10.281725	7.722928	False

📘 Models available in Azure AI

If you use an Azure AI endpoint, be sure to set model="azureai":

nixtla_client.detect_anomalies(..., model="azureai")

For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon.

timegpt-1 is used by default. See the corresponding tutorial for how and when to use timegpt-1-long-horizon.
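
The model choice described above can be kept in one place. This is a small sketch that uses only the model names given in this section. pick_model is a hypothetical helper and not part of the nixtla package.

```python
def pick_model(using_azure, long_horizon=False):
    """Return the model name to pass as the `model` argument.

    Hypothetical helper: Azure AI endpoints use "azureai", while the
    public API uses "timegpt-1" (default) or "timegpt-1-long-horizon".
    """
    if using_azure:
        return "azureai"
    return "timegpt-1-long-horizon" if long_horizon else "timegpt-1"


model = pick_model(using_azure=False)
```

The returned string would then be passed along, for example as nixtla_client.detect_anomalies(df, freq='D', model=model).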

As you can see, False is assigned to "normal" values because they fall within the confidence interval. The label True is assigned to anomalous points.
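
The labeling rule can be reproduced from the output columns. The following is a minimal sketch, not the service's implementation: it copies the five rows printed above, and outside_interval is a hypothetical helper. Whether a value lying exactly on a bound counts as an anomaly is not stated in this tutorial, so the sketch assumes it does not.

```python
# Columns: ds, y, TimeGPT, TimeGPT-hi-99, TimeGPT-lo-99, anomaly
rows = [
    ("2008-01-10", 8.281724, 8.224187, 9.503586, 6.944788, False),
    ("2008-01-11", 8.292799, 8.151533, 9.430932, 6.872135, False),
    ("2008-01-12", 8.199189, 8.127243, 9.406642, 6.847845, False),
    ("2008-01-13", 9.996522, 8.917259, 10.196658, 7.637861, False),
    ("2008-01-14", 10.127071, 9.002326, 10.281725, 7.722928, False),
]


def outside_interval(y, lo, hi):
    """True when y lies strictly outside the [lo, hi] interval."""
    return y < lo or y > hi


# Recompute the flag for each row and compare it with the reported label.
recomputed = [outside_interval(y, lo, hi) for _, y, _, hi, lo, _ in rows]
reported = [row[-1] for row in rows]
print(recomputed == reported)
```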

We can also plot the anomalies using NixtlaClient.

nixtla_client.plot(df, anomalies_df)

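Anomaly detection with exogenous features

Previously, we performed anomaly detection without using any exogenous features. Now, we can create features specifically for this scenario to give the model additional information when it performs the anomaly detection task.

Here, we create date features that the model can use.

This is done with the date_features parameter. We can set it to True, in which case it generates all possible features from the given dates and the frequency of the data. Alternatively, we can specify a list of the features we want. In this example, we only want features at the month and year level.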

anomalies_df_x = nixtla_client.detect_anomalies(
    df,
    freq='D', 
    date_features=['month', 'year'],
    date_features_to_one_hot=True,
)

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Using the following exogenous features: ['month_1.0', 'month_2.0', 'month_3.0', 'month_4.0', 'month_5.0', 'month_6.0', 'month_7.0', 'month_8.0', 'month_9.0', 'month_10.0', 'month_11.0', 'month_12.0', 'year_2007.0', 'year_2008.0', 'year_2009.0', 'year_2010.0', 'year_2011.0', 'year_2012.0', 'year_2013.0', 'year_2014.0', 'year_2015.0', 'year_2016.0']
INFO:nixtla.nixtla_client:Calling Anomaly Detector Endpoint...
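
The feature names in the log above suggest one indicator column per month and per year. As a rough illustration of what one-hot encoding produces for a single date (not the library's actual implementation), the sketch below builds such a row with the standard library. one_hot_date_features is a hypothetical helper, and the `.0` suffix simply mimics the names in the log.

```python
from datetime import date


def one_hot_date_features(day, months, years):
    """Return a dict of 0/1 indicators, one per month and one per year.

    Hypothetical helper for illustration only.
    """
    row = {f"month_{float(m)}": int(day.month == m) for m in months}
    row.update({f"year_{float(y)}": int(day.year == y) for y in years})
    return row


# Months 1-12 and years 2007-2016, matching the names in the log above.
features = one_hot_date_features(date(2008, 1, 10), range(1, 13), range(2007, 2017))
active = [name for name, value in features.items() if value == 1]
print(active)
```

Exactly one month indicator and one year indicator are set for any date inside the covered range, which is why the two requested features expand into 22 columns.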

We can then plot the detected anomalies, now that the model uses the additional information from the exogenous features.

nixtla_client.plot(df, anomalies_df_x)

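Modifying the confidence interval

We can adjust the confidence interval with the level parameter. It accepts any value between 0 and 100, including decimals.

Narrowing the confidence interval detects more anomalies, while widening it reduces the number of anomalies.

For example, here we reduce the interval to 70%, and we will notice that more anomalies (red dots) are plotted.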

anomalies_df = nixtla_client.detect_anomalies(
    df, 
    freq='D',
    level=70
)

INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Calling Anomaly Detector Endpoint...

nixtla_client.plot(df, anomalies_df)
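
One way to build intuition for why a lower level flags more points: if the interval behaved like a symmetric Gaussian interval, its half-width would scale with the standard normal quantile for the chosen level. That is an assumption made only for this sketch, since this tutorial does not state how TimeGPT constructs its intervals. gaussian_half_width_factor is a hypothetical helper.

```python
from statistics import NormalDist


def gaussian_half_width_factor(level):
    """Two-sided standard normal quantile for a level given in percent.

    Assumes a symmetric Gaussian interval, which is an illustrative
    assumption and not a documented property of TimeGPT. The strict
    bounds check is a restriction of this sketch only.
    """
    if not 0 < level < 100:
        raise ValueError("level must be strictly between 0 and 100")
    return NormalDist().inv_cdf(0.5 + level / 200)


z99 = gaussian_half_width_factor(99)
z70 = gaussian_half_width_factor(70)
print(f"99%: {z99:.3f}  70%: {z70:.3f}  ratio: {z70 / z99:.3f}")
```

Under that assumption the 70% interval is less than half as wide as the 99% one, so more observations fall outside it.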
