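Comparing three exponential smoothing methods for forecasting internet user connections: damped Holt's method performs best
In this example, we compare the forecasting performance of the three exponential smoothing methods considered so far when forecasting the number of users connected to the internet via a server. Figure 8.5 shows the data observed over a period of 100 minutes.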
<p><pre class="sourceCode r"> <code class="sourceCode r"><span id="cb168-1">www_usage <span class="ot"><-</span> <span class="fu">as_tsibble</span>(WWWusage)</span>
<span id="cb168-2">www_usage <span class="sc">|></span> <span class="fu">autoplot</span>(value) <span class="sc">+</span></span>
<span id="cb168-3"><span class="fu">labs</span>(<span class="at">x=</span><span class="st">"Minute"</span>, <span class="at">y=</span><span class="st">"Number of users"</span>,</span>
<span id="cb168-4"> <span class="at">title =</span> <span class="st">"Internet usage per minute"</span>)</span></code></pre></p>
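Figure 8.5: Users connected to the internet through a server
We will use time series cross-validation to compare the one-step forecast accuracy of the three methods.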
<p><pre class="sourceCode r"> <code class="sourceCode r"><span id="cb169-1">www_usage <span class="sc">|></span></span>
<span id="cb169-2"><span class="fu">stretch_tsibble</span>(<span class="at">.init =</span> <span class="dv">10</span>) <span class="sc">|></span></span>
<span id="cb169-3"><span class="fu">model</span>(</span>
<span id="cb169-4"> <span class="at">SES =</span> <span class="fu">ETS</span>(value <span class="sc">~</span> <span class="fu">error</span>(<span class="st">"A"</span>) <span class="sc">+</span> <span class="fu">trend</span>(<span class="st">"N"</span>) <span class="sc">+</span> <span class="fu">season</span>(<span class="st">"N"</span>)),</span>
<span id="cb169-5"> <span class="at">Holt =</span> <span class="fu">ETS</span>(value <span class="sc">~</span> <span class="fu">error</span>(<span class="st">"A"</span>) <span class="sc">+</span> <span class="fu">trend</span>(<span class="st">"A"</span>) <span class="sc">+</span> <span class="fu">season</span>(<span class="st">"N"</span>)),</span>
<span id="cb169-6"> <span class="at">Damped =</span> <span class="fu">ETS</span>(value <span class="sc">~</span> <span class="fu">error</span>(<span class="st">"A"</span>) <span class="sc">+</span> <span class="fu">trend</span>(<span class="st">"Ad"</span>) <span class="sc">+</span></span>
<span id="cb169-7"> <span class="fu">season</span>(<span class="st">"N"</span>))</span>
<span id="cb169-8">) <span class="sc">|></span></span>
<span id="cb169-9"><span class="fu">forecast</span>(<span class="at">h =</span> <span class="dv">1</span>) <span class="sc">|></span></span>
<span id="cb169-10"><span class="fu">accuracy</span>(www_usage)</span>
<span id="cb169-11"><span class="co">#> # A tibble: 3 × 10</span></span>
<span id="cb169-12"><span class="co">#>   .model .type     ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1</span></span>
<span id="cb169-13"><span class="co">#>   <chr>  <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl></span></span>
<span id="cb169-14"><span class="co">#> 1 Damped Test  0.288   3.69  3.00 0.347  2.26 0.663 0.636 0.336</span></span>
<span id="cb169-15"><span class="co">#> 2 Holt   Test  0.0610  3.87  3.17 0.244  2.38 0.701 0.668 0.296</span></span>
<span id="cb169-16"><span class="co">#> 3 SES    Test  1.46    6.05  4.81 0.904  3.55 1.06  1.04  0.803</span></span></code></pre></p>
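The mechanics of the cross-validation above can be sketched in base R. This is only an illustration of the expanding-window idea, not a reproduction of the pipeline: the forecaster `naive_forecast` is a hypothetical stand-in (last observed value), not one of the ETS models, so its error values are not comparable with the table.

```r
# Base-R sketch of expanding-window (time series) cross-validation.
y <- as.numeric(WWWusage)   # the 100 one-minute observations, shipped with base R
init <- 10                  # size of the first training window, mirroring .init = 10

# Hypothetical stand-in forecaster (an assumption for illustration only):
# the one-step forecast is simply the last value in the training window.
naive_forecast <- function(train) train[length(train)]

# Each origin t trains on y[1:t] and is scored on y[t + 1],
# so the last usable origin is one before the end of the series.
origins <- init:(length(y) - 1)
errors <- sapply(origins, function(t) y[t + 1] - naive_forecast(y[1:t]))

rmse <- sqrt(mean(errors^2))   # root mean squared one-step error
mae  <- mean(abs(errors))      # mean absolute one-step error
print(c(folds = length(origins), RMSE = rmse, MAE = mae))
```

Every training window starts at the first observation and grows by one minute at a time, so each forecast uses only data that would have been available at that point.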
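Damped Holt's method is best whether you compare MAE or RMSE values. We will therefore proceed with damped Holt's method and apply it to the whole data set to obtain forecasts for the next few minutes.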
<p><pre class="sourceCode r"> <code class="sourceCode r"><span id="cb170-1">fit <span class="ot"><-</span> www_usage <span class="sc">|></span></span>
<span id="cb170-2"><span class="fu">model</span>(</span>
<span id="cb170-3"> <span class="at">Damped =</span> <span class="fu">ETS</span>(value <span class="sc">~</span> <span class="fu">error</span>(<span class="st">"A"</span>) <span class="sc">+</span> <span class="fu">trend</span>(<span class="st">"Ad"</span>) <span class="sc">+</span></span>
<span id="cb170-4"> <span class="fu">season</span>(<span class="st">"N"</span>))</span>
<span id="cb170-5">)</span>
<span id="cb170-6"><span class="co"># Estimated parameters:</span></span>
<span id="cb170-7"><span class="fu">tidy</span>(fit)</span>
<span id="cb170-8"><span class="co">#> # A tibble: 5 × 3</span></span>
<span id="cb170-9"><span class="co">#>   .model term  estimate</span></span>
<span id="cb170-10"><span class="co">#>   <chr>  <chr>    <dbl></span></span>
<span id="cb170-11"><span class="co">#> 1 Damped alpha 1.00</span></span>
<span id="cb170-12"><span class="co">#> 2 Damped beta 0.997 </span></span>
<span id="cb170-13"><span class="co">#> 3 Damped phi 0.815 </span></span>
<span id="cb170-14"><span class="co">#> 4 Damped l 90.4 </span></span>
<span id="cb170-15"><span class="co">#> 5 Damped b -0.0173</span></span></code></pre></p>
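The smoothing parameter for the slope is estimated to be almost one, indicating that the trend changes to mostly reflect the slope between the last two minutes of internet usage. The value of \(\alpha\) is very close to one, showing that the level reacts strongly to each new observation.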
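A short derivation may make this reading of the estimates more concrete. It assumes the usual component form of the damped trend method, with level \(\ell_t\), slope \(b_t\), and slope smoothing parameter \(\beta^*\); this notation is an assumption here, since the recursions are not restated in this section.

\[
\begin{aligned}
\hat{y}_{t+h|t} &= \ell_t + (\phi + \phi^2 + \dots + \phi^h)\, b_t \\
\ell_t &= \alpha y_t + (1-\alpha)(\ell_{t-1} + \phi b_{t-1}) \\
b_t &= \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)\,\phi\, b_{t-1}
\end{aligned}
\]

Taking both smoothing parameters as exactly one (an idealisation of the estimates above) gives \(\ell_t = y_t\) and \(b_t = y_t - y_{t-1}\), so that

\[
\hat{y}_{T+h|T} \approx y_T + (\phi + \phi^2 + \dots + \phi^h)\,(y_T - y_{T-1}).
\]

Under this approximation, the forecasts start from the last observation and extend the most recent one-minute change, damped by powers of \(\phi\).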
<p><pre class="sourceCode r"> <code class="sourceCode r"><span id="cb171-1">fit <span class="sc">|></span></span>
<span id="cb171-2"><span class="fu">forecast</span>(<span class="at">h =</span> <span class="dv">10</span>) <span class="sc">|></span></span>
<span id="cb171-3"><span class="fu">autoplot</span>(www_usage) <span class="sc">+</span></span>
<span id="cb171-4"><span class="fu">labs</span>(<span class="at">x=</span><span class="st">"Minute"</span>, <span class="at">y=</span><span class="st">"Number of users"</span>,</span>
<span id="cb171-5"> <span class="at">title =</span> <span class="st">"Internet usage per minute"</span>)</span></code></pre></p>
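Figure 8.6: Forecasting internet usage: comparing forecasting performance of non-seasonal methods
The resulting forecasts look sensible. They show a decreasing trend that flattens out because the damping parameter (0.815) is low, and the relatively wide prediction intervals reflect the variation in the historical data. The prediction intervals are calculated using methods described in a separate section.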
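The flattening can be checked with a little arithmetic. The sketch below assumes the standard damped-trend forecast function, in which the last slope is multiplied by \(\phi + \phi^2 + \dots + \phi^h\) at horizon \(h\); the variable names are illustrative only.

```r
# How quickly does a damped trend flatten when phi = 0.815?
phi <- 0.815                  # damping parameter estimate reported above
h <- 1:10                     # forecast horizons, matching forecast(h = 10)

multiplier <- cumsum(phi^h)   # phi + phi^2 + ... + phi^h, applied to the last slope
limit <- phi / (1 - phi)      # value the multiplier approaches as h grows

print(data.frame(
  h = h,
  multiplier = round(multiplier, 3),
  share_of_limit = round(multiplier / limit, 3)
))
```

The multiplier can never exceed `phi / (1 - phi)`, which is about 4.41 here, so the forecast trend contributes at most roughly four and a half times the last slope, however far ahead we forecast.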
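In this example, the process of selecting a method was relatively easy, as both the MSE (equivalently, the RMSE reported above) and MAE comparisons suggested the same method: damped Holt's. However, different accuracy measures will sometimes suggest different forecasting methods, and we then have to decide which method to use. Forecasting tasks can vary along many dimensions, such as the length of the forecast horizon, the size of the test set, the forecast error measure, and the frequency of the data, so it is unlikely that one method will be better than all others in every forecasting scenario. What we require from a forecasting method is consistently sensible forecasts, and these should be evaluated frequently against the task at hand.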