Quantifying the World

Case Study 8 - Time Series - Stock Data

Stacy Conant

March 2, 2020

Introduction

ARIMA, or ‘AutoRegressive Integrated Moving Average’, is a forecasting method based on the idea that the past values of a time series alone can be used to predict its future values. Economic data, such as stock prices, are notoriously tricky to analyze and forecast because many factors influence the economy. In this case study, two years of British Petroleum (BP) closing stock price data are analyzed, using ARIMA identification rules and the Akaike Information Criterion (AIC) to choose the best-fitting model. The analysis is then repeated using a grid search. The forecasts from each model are then compared using the average squared error (ASE). R is used for this analysis, especially the tswge and fpp2 packages.
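
As a point of reference, the general form of an ARIMA($p$,$d$,$q$) model can be written with the backshift operator $B$, where $BX_t = X_{t-1}$. The sign convention below, with minus signs in both polynomials, is an assumption chosen to match the convention that tswge appears to use; other texts and packages write the moving-average polynomial with plus signs.

$$\phi(B)(1-B)^d X_t = \theta(B)\,a_t, \qquad \phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p, \qquad \theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$$

Here $a_t$ is white noise, $p$ is the autoregressive order, $d$ is the number of differences, and $q$ is the moving-average order. The mean is omitted for brevity.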

In [14]:
#load necessary libraries
library(quantmod)
options("getSymbols.warning4.0"=FALSE)
library(tswge)
library(ggplot2)
options(warn=-1)
library(tseries)
library(fpp2)

To pull the data, the getSymbols() function from the quantmod package is used. For this analysis, the closing stock prices for BP are used. The time range is set to two years, February 2018 to February 2020.

In [15]:
#quantmod package - pull data
getSymbols("BP",from = "2018-2-1", to = "2020-2-3", auto.assign=TRUE)
'BP'
In [16]:
#Summary of data
summary(BP)
     Index               BP.Open         BP.High          BP.Low     
 Min.   :2018-02-01   Min.   :35.92   Min.   :36.09   Min.   :35.73  
 1st Qu.:2018-08-01   1st Qu.:39.05   1st Qu.:39.23   1st Qu.:38.70  
 Median :2019-02-01   Median :41.28   Median :41.47   Median :40.96  
 Mean   :2019-01-31   Mean   :41.38   Mean   :41.61   Mean   :41.08  
 3rd Qu.:2019-08-01   3rd Qu.:43.71   3rd Qu.:43.99   3rd Qu.:43.49  
 Max.   :2020-01-31   Max.   :47.38   Max.   :47.83   Max.   :47.38  
    BP.Close       BP.Volume         BP.Adjusted   
 Min.   :36.05   Min.   : 2348400   Min.   :33.88  
 1st Qu.:38.95   1st Qu.: 4712150   1st Qu.:36.81  
 Median :41.22   Median : 5975500   Median :38.47  
 Mean   :41.33   Mean   : 6525950   Mean   :38.45  
 3rd Qu.:43.80   3rd Qu.: 7845700   3rd Qu.:40.03  
 Max.   :47.79   Max.   :19987200   Max.   :42.95  
In [17]:
#get number of values/days
close = BP$BP.Close
nrow(close)
503

The data pulled by quantmod include open, high, low, close, and adjusted prices as well as trading volume, but for this study only the closing price is analyzed. The mean closing price is \$41.33 and there are 503 observations.

Method

To analyze this stock data, the time series should be stationary. Stationarity has three conditions:

  1. Subpopulations of $X_t$ have the same mean for each $t$. Restated, the mean does not depend on time ($t$).
  2. Subpopulations of $X_t$ for a given time have a finite and constant variance for all $t$. Restated, the variance does not depend on time.
  3. The correlation of $X_{t_1}$ and $X_{t_2}$ depends only on $t_2 - t_1$. That is, the correlation between data points is dependent only on how far apart they are, not where they are.
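
The second condition can be illustrated with a small simulation. This is a minimal sketch using simulated data only (not the BP prices); the names `shocks` and `walks` are hypothetical. It compares the variance across many realizations of white noise, which stays constant over time, with the variance of random walks, which grows with $t$.

```r
set.seed(1)
n_paths <- 2000   # number of simulated realizations
n_time  <- 400    # length of each realization

# Each row is one realization of white noise
shocks <- matrix(rnorm(n_paths * n_time), nrow = n_paths)

# Cumulative sums along each row turn white noise into a random walk
walks <- t(apply(shocks, 1, cumsum))

# Variance across realizations at each time point
var_noise <- apply(shocks, 2, var)
var_walk  <- apply(walks, 2, var)

# White noise: variance is about 1 at every t.
# Random walk: variance grows roughly in proportion to t.
round(c(noise_t100 = var_noise[100], noise_t400 = var_noise[400],
        walk_t100  = var_walk[100],  walk_t400  = var_walk[400]), 2)
```

A random walk therefore violates condition 2, which is why a wandering price series is a natural candidate for differencing.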

Stationary data should be a "flat looking" series, without trend, with constant variance over time, a constant autocorrelation structure over time and no periodic seasonality. There are tools in R that can assist in assessing the stationarity of a time series.

For this analysis, the R package tswge will be primarily used to explore and model the data. First, the BP closing price time series will be plotted with plotts.wge().

In [18]:
#plot time series
plotts.wge(close)
title("Plot of Daily Closing Price for British Petroleum, Feb 2018 - Feb 2020")

Figure 1: Plot of Daily Closing Price for British Petroleum. It appears to have a wandering pattern with a few larger jumps in price.

Next, looking at the spectral density can give clues about the frequencies that may be present in this time series. The parzen.wge() function calculates and plots the smoothed periodogram using the Parzen window. Along with the plot, it returns the frequencies at which the smoothed periodogram is calculated (freq) and the smoothed periodogram values at those frequencies (pzgram).

In [19]:
parzen.wge(close, trunc=100)
$freq
  1. 0.00198807157057654
  2. 0.00397614314115308
  3. 0.00596421471172962
  4. 0.00795228628230616
  5. 0.0099403578528827
  6. 0.0119284294234592
  7. 0.0139165009940358
  8. 0.0159045725646123
  9. 0.0178926441351889
  10. 0.0198807157057654
  11. 0.0218687872763419
  12. 0.0238568588469185
  13. 0.025844930417495
  14. 0.0278330019880716
  15. 0.0298210735586481
  16. 0.0318091451292246
  17. 0.0337972166998012
  18. 0.0357852882703777
  19. 0.0377733598409543
  20. 0.0397614314115308
  21. 0.0417495029821074
  22. 0.0437375745526839
  23. 0.0457256461232604
  24. 0.047713717693837
  25. 0.0497017892644135
  26. 0.0516898608349901
  27. 0.0536779324055666
  28. 0.0556660039761431
  29. 0.0576540755467197
  30. 0.0596421471172962
  31. 0.0616302186878728
  32. 0.0636182902584493
  33. 0.0656063618290258
  34. 0.0675944333996024
  35. 0.0695825049701789
  36. 0.0715705765407555
  37. 0.073558648111332
  38. 0.0755467196819085
  39. 0.0775347912524851
  40. 0.0795228628230616
  41. 0.0815109343936382
  42. 0.0834990059642147
  43. 0.0854870775347912
  44. 0.0874751491053678
  45. 0.0894632206759443
  46. 0.0914512922465209
  47. 0.0934393638170974
  48. 0.095427435387674
  49. 0.0974155069582505
  50. 0.099403578528827
  51. 0.101391650099404
  52. 0.10337972166998
  53. 0.105367793240557
  54. 0.107355864811133
  55. 0.10934393638171
  56. 0.111332007952286
  57. 0.113320079522863
  58. 0.115308151093439
  59. 0.117296222664016
  60. 0.119284294234592
  61. 0.121272365805169
  62. 0.123260437375746
  63. 0.125248508946322
  64. 0.127236580516899
  65. 0.129224652087475
  66. 0.131212723658052
  67. 0.133200795228628
  68. 0.135188866799205
  69. 0.137176938369781
  70. 0.139165009940358
  71. 0.141153081510934
  72. 0.143141153081511
  73. 0.145129224652087
  74. 0.147117296222664
  75. 0.149105367793241
  76. 0.151093439363817
  77. 0.153081510934394
  78. 0.15506958250497
  79. 0.157057654075547
  80. 0.159045725646123
  81. 0.1610337972167
  82. 0.163021868787276
  83. 0.165009940357853
  84. 0.166998011928429
  85. 0.168986083499006
  86. 0.170974155069582
  87. 0.172962226640159
  88. 0.174950298210736
  89. 0.176938369781312
  90. 0.178926441351889
  91. 0.180914512922465
  92. 0.182902584493042
  93. 0.184890656063618
  94. 0.186878727634195
  95. 0.188866799204771
  96. 0.190854870775348
  97. 0.192842942345924
  98. 0.194831013916501
  99. 0.196819085487078
  100. 0.198807157057654
  101. 0.200795228628231
  102. 0.202783300198807
  103. 0.204771371769384
  104. 0.20675944333996
  105. 0.208747514910537
  106. 0.210735586481113
  107. 0.21272365805169
  108. 0.214711729622266
  109. 0.216699801192843
  110. 0.218687872763419
  111. 0.220675944333996
  112. 0.222664015904573
  113. 0.224652087475149
  114. 0.226640159045726
  115. 0.228628230616302
  116. 0.230616302186879
  117. 0.232604373757455
  118. 0.234592445328032
  119. 0.236580516898608
  120. 0.238568588469185
  121. 0.240556660039761
  122. 0.242544731610338
  123. 0.244532803180915
  124. 0.246520874751491
  125. 0.248508946322068
  126. 0.250497017892644
  127. 0.252485089463221
  128. 0.254473161033797
  129. 0.256461232604374
  130. 0.25844930417495
  131. 0.260437375745527
  132. 0.262425447316103
  133. 0.26441351888668
  134. 0.266401590457256
  135. 0.268389662027833
  136. 0.27037773359841
  137. 0.272365805168986
  138. 0.274353876739563
  139. 0.276341948310139
  140. 0.278330019880716
  141. 0.280318091451292
  142. 0.282306163021869
  143. 0.284294234592445
  144. 0.286282306163022
  145. 0.288270377733598
  146. 0.290258449304175
  147. 0.292246520874752
  148. 0.294234592445328
  149. 0.296222664015905
  150. 0.298210735586481
  151. 0.300198807157058
  152. 0.302186878727634
  153. 0.304174950298211
  154. 0.306163021868787
  155. 0.308151093439364
  156. 0.31013916500994
  157. 0.312127236580517
  158. 0.314115308151093
  159. 0.31610337972167
  160. 0.318091451292247
  161. 0.320079522862823
  162. 0.3220675944334
  163. 0.324055666003976
  164. 0.326043737574553
  165. 0.328031809145129
  166. 0.330019880715706
  167. 0.332007952286282
  168. 0.333996023856859
  169. 0.335984095427435
  170. 0.337972166998012
  171. 0.339960238568588
  172. 0.341948310139165
  173. 0.343936381709742
  174. 0.345924453280318
  175. 0.347912524850895
  176. 0.349900596421471
  177. 0.351888667992048
  178. 0.353876739562624
  179. 0.355864811133201
  180. 0.357852882703777
  181. 0.359840954274354
  182. 0.36182902584493
  183. 0.363817097415507
  184. 0.365805168986084
  185. 0.36779324055666
  186. 0.369781312127237
  187. 0.371769383697813
  188. 0.37375745526839
  189. 0.375745526838966
  190. 0.377733598409543
  191. 0.379721669980119
  192. 0.381709741550696
  193. 0.383697813121272
  194. 0.385685884691849
  195. 0.387673956262425
  196. 0.389662027833002
  197. 0.391650099403579
  198. 0.393638170974155
  199. 0.395626242544732
  200. 0.397614314115308
  201. 0.399602385685885
  202. 0.401590457256461
  203. 0.403578528827038
  204. 0.405566600397614
  205. 0.407554671968191
  206. 0.409542743538767
  207. 0.411530815109344
  208. 0.41351888667992
  209. 0.415506958250497
  210. 0.417495029821074
  211. 0.41948310139165
  212. 0.421471172962227
  213. 0.423459244532803
  214. 0.42544731610338
  215. 0.427435387673956
  216. 0.429423459244533
  217. 0.431411530815109
  218. 0.433399602385686
  219. 0.435387673956262
  220. 0.437375745526839
  221. 0.439363817097416
  222. 0.441351888667992
  223. 0.443339960238569
  224. 0.445328031809145
  225. 0.447316103379722
  226. 0.449304174950298
  227. 0.451292246520875
  228. 0.453280318091451
  229. 0.455268389662028
  230. 0.457256461232604
  231. 0.459244532803181
  232. 0.461232604373757
  233. 0.463220675944334
  234. 0.465208747514911
  235. 0.467196819085487
  236. 0.469184890656064
  237. 0.47117296222664
  238. 0.473161033797217
  239. 0.475149105367793
  240. 0.47713717693837
  241. 0.479125248508946
  242. 0.481113320079523
  243. 0.483101391650099
  244. 0.485089463220676
  245. 0.487077534791252
  246. 0.489065606361829
  247. 0.491053677932406
  248. 0.493041749502982
  249. 0.495029821073559
  250. 0.497017892644135
  251. 0.499005964214712
$pzgram
  1. 16.772838980633
  2. 16.2804130300713
  3. 15.4655306322954
  4. 14.3394680899425
  5. 12.9235899723269
  6. 11.2560143774781
  7. 9.40118523874183
  8. 7.46091486856595
  9. 5.57937884624677
  10. 3.92280103864778
  11. 2.61455261852462
  12. 1.65722171320321
  13. 0.928236882161266
  14. 0.263550359724702
  15. -0.454433984108863
  16. -1.25221773329896
  17. -2.05326464894426
  18. -2.70400678411528
  19. -3.06832033262573
  20. -3.14276066102779
  21. -3.05579940967414
  22. -2.95949676424466
  23. -2.95264232770598
  24. -3.0738933533886
  25. -3.32376349841868
  26. -3.6870620162321
  27. -4.14750534640359
  28. -4.69167825573682
  29. -5.30216165687236
  30. -5.94586701993394
  31. -6.57061146176876
  32. -7.12210334747956
  33. -7.57422388522595
  34. -7.93968422372334
  35. -8.24010397248525
  36. -8.46306139844307
  37. -8.55586763587466
  38. -8.47666460407302
  39. -8.25754183158291
  40. -8.00435432941316
  41. -7.83655803324915
  42. -7.83437529246574
  43. -8.02043108437659
  44. -8.36065083735745
  45. -8.7741853288975
  46. -9.16062865195203
  47. -9.45034274629949
  48. -9.64823139219479
  49. -9.8252570303157
  50. -10.0647076890436
  51. -10.4118409505434
  52. -10.8511243707112
  53. -11.3098570354503
  54. -11.6904485349461
  55. -11.9267627401175
  56. -12.0237387214262
  57. -12.0392544751017
  58. -12.033148487768
  59. -12.037116811164
  60. -12.0599850947725
  61. -12.1091547098747
  62. -12.2051691014617
  63. -12.3783829451447
  64. -12.6517970619093
  65. -13.0200804368539
  66. -13.4321145783132
  67. -13.7864400901281
  68. -13.9595234940699
  69. -13.8758467678216
  70. -13.5684819732624
  71. -13.1601781570653
  72. -12.7912089344433
  73. -12.5692062276261
  74. -12.5580635457563
  75. -12.7827418174628
  76. -13.2322569131487
  77. -13.8564282118466
  78. -14.5621375022977
  79. -15.2259892140301
  80. -15.7412627803471
  81. -16.0793653012469
  82. -16.3023072000104
  83. -16.5080148982994
  84. -16.7643452378501
  85. -17.0709515986716
  86. -17.3475287879383
  87. -17.4580648629926
  88. -17.2947446767954
  89. -16.8724630428233
  90. -16.3189023415657
  91. -15.7748488874803
  92. -15.3243222495906
  93. -14.9896573486603
  94. -14.7587574469466
  95. -14.6170141639488
  96. -14.5679285254408
  97. -14.6326079522044
  98. -14.8298958180319
  99. -15.1490669168999
  100. -15.5297173549928
  101. -15.8681323217479
  102. -16.0693684904563
  103. -16.1205621258814
  104. -16.1067963344504
  105. -16.1492607098927
  106. -16.3385219714121
  107. -16.7091846328388
  108. -17.2423197765399
  109. -17.8728163824163
  110. -18.4916308226665
  111. -18.9492685804453
  112. -19.0929957860066
  113. -18.86211978889
  114. -18.3588231699607
  115. -17.7872239955281
  116. -17.3347052204913
  117. -17.1117892207522
  118. -17.1451410471108
  119. -17.3791094089647
  120. -17.6776409423237
  121. -17.8627133121044
  122. -17.820020073133
  123. -17.5863806952797
  124. -17.2967099277124
  125. -17.0608820934124
  126. -16.9013451568555
  127. -16.7645809134903
  128. -16.5752953896538
  129. -16.3013348410484
  130. -15.9803222786664
  131. -15.6875042875766
  132. -15.4858099672874
  133. -15.4003515713002
  134. -15.4210282028074
  135. -15.5197591263274
  136. -15.6693518170235
  137. -15.8524157657403
  138. -16.0541468592882
  139. -16.2456412808584
  140. -16.3761459926091
  141. -16.3920007902818
  142. -16.2774253594852
  143. -16.0795955737192
  144. -15.8887340387414
  145. -15.7938637214101
  146. -15.8496325286025
  147. -16.0618254712732
  148. -16.383144500235
  149. -16.7200941116107
  150. -16.9660055973877
  151. -17.0618667921584
  152. -17.0381474392318
  153. -16.9893162938786
  154. -17.0103734587098
  155. -17.1531810453361
  156. -17.4146924130013
  157. -17.742031873205
  158. -18.046618690445
  159. -18.2299743890962
  160. -18.2210370705358
  161. -18.0097327453929
  162. -17.6538195570827
  163. -17.2539318949409
  164. -16.916652290387
  165. -16.7268135648599
  166. -16.7339981969209
  167. -16.9472049640355
  168. -17.3323358766924
  169. -17.8146379733182
  170. -18.2957142308177
  171. -18.6887211815998
  172. -18.9482414718107
  173. -19.0617782937151
  174. -19.0171132627711
  175. -18.7994997787257
  176. -18.4315479652903
  177. -17.9996739176678
  178. -17.6248810634248
  179. -17.410152806823
  180. -17.4068784017244
  181. -17.6018926735514
  182. -17.9140949086896
  183. -18.2105849947292
  184. -18.3682689060502
  185. -18.3605842345937
  186. -18.2767087208553
  187. -18.2464320413389
  188. -18.3590250050362
  189. -18.6303193966252
  190. -19.0046526100888
  191. -19.3816324290244
  192. -19.6740210356948
  193. -19.872206414636
  194. -20.0528439744937
  195. -20.3203138543656
  196. -20.7384352260115
  197. -21.286419070774
  198. -21.8364976961457
  199. -22.1802133921256
  200. -22.1569341854739
  201. -21.7967158388447
  202. -21.2780500880908
  203. -20.7715986299015
  204. -20.365923131846
  205. -20.0800207379009
  206. -19.896341126447
  207. -19.7867798407909
  208. -19.7286996181756
  209. -19.7121292797738
  210. -19.7387612432839
  211. -19.8133783598748
  212. -19.9294547995832
  213. -20.0538028390635
  214. -20.121769900063
  215. -20.0597847328854
  216. -19.8352872650179
  217. -19.4917008995066
  218. -19.1258237304144
  219. -18.8310160799687
  220. -18.6563340282809
  221. -18.5959984437718
  222. -18.6011653189642
  223. -18.6095075919515
  224. -18.5845671374489
  225. -18.5400695430221
  226. -18.528180346754
  227. -18.6038510629099
  228. -18.7930631551122
  229. -19.0768799446571
  230. -19.3904665552673
  231. -19.6402522385743
  232. -19.7440930944685
  233. -19.6773709392895
  234. -19.4853452583798
  235. -19.2506462957609
  236. -19.0511960944803
  237. -18.9384459626803
  238. -18.9348300323497
  239. -19.0379362682099
  240. -19.2231879832111
  241. -19.4431882391269
  242. -19.6283063835854
  243. -19.6995491320087
  244. -19.6012771543106
  245. -19.3365103373485
  246. -18.9688263745037
  247. -18.5855262075561
  248. -18.2569184010669
  249. -18.018094317108
  250. -17.872309651484
  251. -17.8058798397819

Figure 2: Parzen plot of Spectral Density.

The smoothed spectral density is highest at the lowest frequency (about 16.8 near $f = 0.002$) and falls off as frequency increases. A peak at or near frequency 0 indicates that this is indeed a wandering time series with no discernible seasonality.
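
The frequency grid in the output above can be reproduced in base R, and the smoothing itself can be sketched as well. This is a rough illustration under stated assumptions: the helper names `parzen_weight` and `parzen_db` are hypothetical, the estimate is assumed to be reported on a decibel scale, and the code is not necessarily the exact computation inside parzen.wge(). It is run on a simulated random walk rather than the BP data.

```r
n <- 503   # number of closing prices reported above
M <- 100   # truncation point, as in trunc = 100

# Frequencies j/n for j = 1, ..., floor(n/2)
freq <- seq_len(floor(n / 2)) / n

# Parzen lag window evaluated at lag k with truncation point M
parzen_weight <- function(k, M) {
  u <- abs(k) / M
  ifelse(u <= 0.5, 1 - 6 * u^2 + 6 * u^3,
         ifelse(u <= 1, 2 * (1 - u)^3, 0))
}

# Window-smoothed spectral estimate (in dB) built from sample autocorrelations
parzen_db <- function(x, M, freq) {
  rho <- as.numeric(acf(x, lag.max = M, plot = FALSE)$acf)[-1]  # lags 1..M
  w <- parzen_weight(seq_len(M), M)
  spec <- sapply(freq, function(f) {
    1 + 2 * sum(w * rho * cos(2 * pi * f * seq_len(M)))
  })
  10 * log10(spec)
}

set.seed(2020)
x_sim <- cumsum(rnorm(n))   # simulated random walk, not the BP data
est <- parzen_db(x_sim, M, freq)
range(freq)                 # first and last frequency on the grid
```

With 503 observations the grid has 251 frequencies, from 1/503 to 251/503, matching the `$freq` values printed above.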

The next task is to check the ACF and PACF. The ACF, or autocorrelation function, gives the correlation between a series and its own lagged values. Plotted against the lag, it shows how strongly the present value of the series is related to its past values.

The PACF, or partial autocorrelation function, gives the correlation between the series and a given lag after the effects of the shorter, intervening lags have been removed. It is ‘partial’ rather than ‘complete’ because the variation already explained by earlier lags is removed before the next correlation is found.
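
Both definitions can be checked by hand on a short example. The sketch below uses a simulated AR(1) series (an assumption for illustration, not the BP data) and a hypothetical helper `sample_acf`. It computes the lag-1 and lag-2 sample autocorrelations directly, derives the lag-2 partial autocorrelation from them, and compares the results with the built-in acf() and pacf() functions.

```r
set.seed(42)
x <- as.numeric(arima.sim(model = list(ar = 0.8), n = 300))  # simulated AR(1)

# Sample autocorrelation at lag k: lagged cross-products over the total sum of squares
sample_acf <- function(x, k) {
  n  <- length(x)
  xc <- x - mean(x)
  sum(xc[(k + 1):n] * xc[1:(n - k)]) / sum(xc^2)
}

r1 <- sample_acf(x, 1)
r2 <- sample_acf(x, 2)

# Lag-2 partial autocorrelation: correlation at lag 2 after removing the lag-1 effect
pacf2 <- (r2 - r1^2) / (1 - r1^2)

builtin_acf  <- as.numeric(acf(x, lag.max = 2, plot = FALSE)$acf)   # lags 0, 1, 2
builtin_pacf <- as.numeric(pacf(x, lag.max = 2, plot = FALSE)$acf)  # lags 1, 2

c(r1 = r1, r2 = r2, pacf2 = pacf2)
```

At lag 1 the ACF and PACF are equal, because there are no intervening lags to remove.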

In [20]:
acf(close)

Figure 3: ACF for Daily BP Closing Stock Price.

This ACF damps slowly toward zero and is still large past 25 lags. This could indicate that future values of the series are heavily correlated with past values and that the stock prices are not stationary.

Increasing lag.max in the function call plots more lags and shows where the autocorrelations begin to approach zero.

In [21]:
acf(close, lag.max = 105)

Figure 4: ACF to lag 105 for Daily BP Closing Stock Price.

There is strong positive correlation up to about lag 80, which is the point where the ACF crosses the upper confidence limit.

In [22]:
pacf(close)

Figure 5: PACF for Daily BP Closing Stock Price.

The PACF crosses inside the 95% limit lines and approaches zero at lag 2. This could indicate an AR(2) process in this data.

A formal test for stationarity is the Augmented Dickey–Fuller (ADF) test, which checks whether the series has a unit root (a non-stationary series such as a random walk has a unit root and yields a large p-value). The adf.test() function from the tseries package is used.

In [23]:
#test for stationarity
adf.test(close)
	Augmented Dickey-Fuller Test

data:  close
Dickey-Fuller = -2.694, Lag order = 7, p-value = 0.2846
alternative hypothesis: stationary

The Augmented Dickey-Fuller test fails to reject the null hypothesis of a unit root (p-value = 0.2846), so there is no evidence that the BP closing price series is stationary.
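
The idea behind the test can be sketched with an ordinary regression. The code below is a simplified, non-augmented Dickey-Fuller regression with a constant and linear trend, written as a hypothetical helper `df_stat` and run on simulated series; it omits the lagged-difference terms that adf.test() adds, so it is not expected to reproduce the statistic of -2.694 reported above. The statistic must be compared with Dickey-Fuller critical values (roughly -3.41 at the 5% level for this specification), not with the usual t distribution.

```r
set.seed(7)
n <- 500
walk       <- cumsum(rnorm(n))                              # has a unit root
stationary <- as.numeric(arima.sim(list(ar = 0.5), n = n))  # no unit root

# Regress the first difference on the lagged level plus a linear trend;
# the t value on the lagged level is the Dickey-Fuller statistic.
df_stat <- function(y) {
  dy    <- diff(y)
  y_lag <- y[-length(y)]
  trend <- seq_along(dy)
  fit   <- lm(dy ~ y_lag + trend)
  summary(fit)$coefficients["y_lag", "t value"]
}

stat_walk       <- df_stat(walk)
stat_stationary <- df_stat(stationary)
c(walk = stat_walk, stationary = stat_stationary)
```

A strongly negative statistic, as for the stationary series, is evidence against a unit root; a statistic closer to zero is not.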

Differenced Data

To attempt to stationarize the data, one difference can be taken. Differencing can help stabilise the mean of a time series by computing the differences between consecutive observations to eliminate or reduce any trend and seasonality. To do this, tswge has the artrans.wge() function.

In [24]:
#take 1 difference with artrans
close.dif = artrans.wge(close, 1)

Figure 6: Plots of original time series and ACF (top) and differenced time series and ACF (bottom).

The transformed time series certainly appears more stationary. The ACF for the transformed data now resembles that of a white noise series. Another Dickey-Fuller test can assess the stationarity of the differenced data.

In [25]:
#stationarity test of differenced data
adf.test(close.dif)
	Augmented Dickey-Fuller Test

data:  close.dif
Dickey-Fuller = -7.9607, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary

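The test on the differenced data rejects the null hypothesis of a unit root, indicating that the differenced BP series is stationary. The reported p-value of 0.01 is the smallest value that adf.test() prints, so the actual p-value may be smaller.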
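
The first difference taken above is simply $X_t - X_{t-1}$. A minimal base R sketch on a few hypothetical prices (not the BP data) shows the operation, its built-in equivalent diff(), and how cumulative sums undo it.

```r
x <- c(41.2, 41.5, 41.1, 40.8, 41.9)   # hypothetical closing prices

d_manual  <- x[-1] - x[-length(x)]     # X_t - X_{t-1}
d_builtin <- diff(x)                   # base R first difference

# Undo the difference: first observation plus cumulative sums of the differences
x_rebuilt <- c(x[1], x[1] + cumsum(d_builtin))

d_builtin
x_rebuilt
```

The differenced series is one observation shorter than the original, so the 503 closing prices yield 502 differences.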

Another useful check is the Ljung-Box test for independence. The Ljung-Box test examines whether there is significant evidence for non-zero autocorrelations at a given set of lags, with the null hypothesis that the series is independent (white noise); a series with remaining autocorrelation will have a low p-value. The ljung.wge() function of tswge is used for this test.

In [26]:
ljung.wge(close.dif)
Obs -0.07141393 -0.008923918 0.04868184 0.05578057 -0.08628334 0.02453343 -0.01434213 0.002765201 -0.008621435 -0.006807761 0.05791577 -0.01510345 -0.04470057 0.04159326 0.002429646 0.02486328 -0.01619557 0.006486719 -0.02650535 0.01441494 0.01414213 0.03533125 -0.02384849 0.01602258 
$test
'Ljung-Box test'
$K
24
$chi.square
15.60342289125
$df
24
$pval
0.901855815289356

The test on the differenced time series returns a large p-value (0.90), so the null hypothesis of no autocorrelation is not rejected. The differenced series is consistent with white noise, which is stationary.
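
The statistic reported above can be recomputed from the printed autocorrelations using the Ljung-Box formula $Q = n(n+2)\sum_{k=1}^{K} \hat{\rho}_k^2/(n-k)$. The sketch below assumes $n = 502$, that is, the 503 closing prices minus the one observation lost to differencing.

```r
# Sample autocorrelations of the differenced series at lags 1..24,
# copied from the ljung.wge() output above
rho <- c(-0.07141393, -0.008923918, 0.04868184, 0.05578057, -0.08628334,
         0.02453343, -0.01434213, 0.002765201, -0.008621435, -0.006807761,
         0.05791577, -0.01510345, -0.04470057, 0.04159326, 0.002429646,
         0.02486328, -0.01619557, 0.006486719, -0.02650535, 0.01441494,
         0.01414213, 0.03533125, -0.02384849, 0.01602258)

n <- 502            # assumed length of the differenced series (503 - 1)
K <- length(rho)    # number of lags tested

# Ljung-Box statistic and its chi-square p-value with K degrees of freedom
Q <- n * (n + 2) * sum(rho^2 / (n - seq_len(K)))
p_value <- pchisq(Q, df = K, lower.tail = FALSE)

c(Q = Q, p_value = p_value)
```

For a series held in R, `Box.test(x, lag = 24, type = "Ljung-Box")` from base R computes the same kind of statistic.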

Next, the ACF and PACF of the differenced data can be assessed to attempt to determine the $p$ and $q$ terms for the ARIMA model. The acf() and pacf() functions are used for this.

In [27]:
#plot acf of differenced data
acf(close.dif)

Figure 7: Plot of the differenced time series ACF.

The ACF can help to determine the value of $q$ for the MA process. All of the lags are within the 95% confidence limits. This could mean that 0 is an appropriate MA order, but an MA order of 1 or 2 is also plausible. Rule 7 indicates that a negative lag-1 autocorrelation, as shown in Fig. 7, could mean that the series is slightly "overdifferenced" and an MA term should be added to the model.

In [28]:
#plot pacf of differenced data
pacf(close.dif)

Figure 8: Plot of the differenced time series PACF.

The PACF can be assessed for the AR term, or $p$. Here again, all of the lags are within the 95% limit lines. Conservatively, 0 could be used as the AR order, but Rule 6 indicates that 1 or 2 may be more likely to give a better model.

Several models can be tried and judged in terms of AIC score. The Akaike Information Criterion (AIC) is an estimator of relative model quality and is widely used for model selection. It measures the relative information lost by a given model: the less information a model loses, the better its quality, so lower AIC values are preferred.

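$\mathrm{AIC} = 2k - 2\ln(\widehat{L})$

Here $k$ is the number of estimated parameters and $\widehat{L}$ is the maximized value of the likelihood function.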
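
The definition can be checked numerically with base R. This sketch fits an AR(1) model to simulated data with arima() from the stats package (an assumption for illustration; it is not the tswge workflow used in this study) and rebuilds the AIC from the log-likelihood and the parameter count. The absolute values need not match the `$aic` values returned by est.arma.wge(), which may be on a different scale; only comparisons between models fitted with the same function are meaningful.

```r
set.seed(123)
y <- arima.sim(model = list(ar = 0.6), n = 400)   # simulated AR(1) series

fit <- arima(y, order = c(1, 0, 0))               # AR(1) with a mean term

# k counts the AR coefficient, the mean, and the innovation variance
k <- attr(logLik(fit), "df")

# AIC = 2k - 2 ln(L-hat), using the maximized log-likelihood
aic_manual <- 2 * k - 2 * as.numeric(logLik(fit))

c(k = k, aic_manual = aic_manual, aic_builtin = AIC(fit))
```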

The function est.arma.wge() is used to calculate maximum likelihood estimates of parameters of stationary models. The $p$ and $q$ terms are plugged in to the function along with the differenced data. The AIC can be printed for the model.

In [29]:
#model 1 - AR(2) and MA(2)
dif.est1 = est.arma.wge(close.dif, p=2, q=2)
#print AIC
dif.est1$aic
Coefficients of Original polynomial:  
-0.1317 -0.8183 

Factor                 Roots                Abs Recip    System Freq 
1+0.1317B+0.8183B^2   -0.0805+-1.1025i      0.9046       0.2616
  
  
-1.28453096636655
In [30]:
#model 2 - AR(2) and MA(1)
dif.est2 = est.arma.wge(close.dif, p=2, q=1)
#print AIC
dif.est2$aic
Coefficients of Original polynomial:  
-0.3278 -0.0411 

Factor                 Roots                Abs Recip    System Freq 
1+0.3278B+0.0411B^2   -3.9830+-2.9051i      0.2028       0.3997
  
  
-1.26951496486221
In [31]:
#model 3 - AR(1) and MA(2)
dif.est3 = est.arma.wge(close.dif, p=1, q=2)
#print AIC
dif.est3$aic
Coefficients of Original polynomial:  
0.5969 

Factor                 Roots                Abs Recip    System Freq 
1-0.5969B              1.6753               0.5969       0.0000
  
  
-1.27086594941382

The first model, an ARIMA(2,1,2), has the lowest AIC (-1.2845, compared with -1.2695 and -1.2709 for the other two models).
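
The factor table printed for the first model can be reproduced from its AR coefficients. The sketch below takes the rounded coefficients from the output above, finds the roots of the AR characteristic polynomial $1 - \phi_1 z - \phi_2 z^2$ with polyroot(), and derives the "Abs Recip" and "System Freq" columns; the variable names are hypothetical.

```r
phi <- c(-0.1317, -0.8183)   # AR coefficients printed for model 1

# polyroot() expects coefficients in increasing order of power:
# 1 - phi1*z - phi2*z^2  ->  c(1, -phi1, -phi2)
roots <- polyroot(c(1, -phi))

abs_recip   <- 1 / Mod(roots)              # reciprocal of the root modulus
system_freq <- abs(Arg(roots)) / (2 * pi)  # frequency in cycles per observation

round(roots, 4)
round(abs_recip, 4)
round(system_freq, 4)
```

Both roots lie outside the unit circle (absolute reciprocal below 1), so the fitted AR part is stationary.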

To assess this model's appropriateness more fully, the ACF of the residuals of the estimate can be viewed and tested for whiteness using the Ljung-Box test again.

In [32]:
#print pval of the Ljung-Box test of the residuals
ljung.wge(dif.est1$res)$pval
#view ACF
acf(dif.est1$res)
Obs -0.001238373 0.02357949 -0.01260141 0.03958959 -0.03265358 0.02737353 -0.0591823 0.0009563824 0.02623148 -0.004441566 0.02961887 -0.01142664 -0.0234686 0.03268989 -0.009071479 0.03756145 -0.004747382 -0.008058259 -0.03768635 0.0256482 0.02613063 0.03129668 -0.0295573 0.01725465 
0.99758910091187

Figure 9: ACF of the residuals of the ARIMA(2,1,2) model.

The ACF of the residuals shows the lags staying completely within the 95% limit lines, and the Ljung-Box test returns a large p-value (0.998). Both of these indicate that the model does not exhibit any significant lack of fit and that forecasting can be performed.

Calling dif.est1 displays the data output by the estimate, including phi and theta terms.

In [33]:
#display estimate data
dif.est1
$phi
  1. -0.131705494787098
  2. -0.818270945091426
$theta
  1. -0.058987572259534
  2. -0.793412552735592
$res
  1. -1.8885399143342
  2. -1.53215660742561
  3. 0.424904183017137
  4. -0.376514708988139
  5. -0.165468569015401
  6. -0.629103635457036
  7. 0.530303560070368
  8. 0.0560000571354485
  9. 0.882208805961413
  10. -0.656114215603534
  11. -0.36654917191723
  12. 0.0076182992996029
  13. -0.554731192718225
  14. 0.293376101359371
  15. 0.798482479951524
  16. 0.385658992202769
  17. -0.697638682458714
  18. -0.825322889471043
  19. -0.020444846302116
  20. -0.0690425559681614
  21. 0.542228156214026
  22. 0.305078835681294
  23. -0.232351335686394
  24. 0.237250580409219
  25. 0.221813123182491
  26. 0.165840751644937
  27. -0.269007749695408
  28. -0.651669371709304
  29. 0.10491863986532
  30. 0.45358772243333
  31. -0.621724526831332
  32. 0.266954882106652
  33. 1.03647828084281
  34. -0.964003988266217
  35. 0.0996327207632204
  36. 1.19881291328305
  37. -0.326816196056205
  38. -0.269309464957135
  39. 0.826322407279121
  40. -0.615945064889349
  41. 0.895004405291044
  42. 0.373841481727189
  43. 0.882685286838026
  44. -0.101529203358621
  45. -0.0590708597457392
  46. 1.08802867577183
  47. 0.356458278011931
  48. 0.0971425466899544
  49. -0.131348852093519
  50. -0.294410656901625
  51. -0.0337720664200627
  52. 0.938059921876043
  53. 0.198166918447283
  54. -0.159356966191151
  55. 0.382423725296806
  56. 0.376858720983786
  57. 0.0113847210606692
  58. 0.669219462104861
  59. -0.618143592729759
  60. 0.377029166169253
  61. 0.239508833484989
  62. -0.0547639421629195
  63. 0.0299501931805441
  64. 0.601893607168308
  65. 0.423703882062744
  66. -0.289213816690184
  67. 1.39794979377578
  68. -0.828511982951741
  69. -0.0142379483087327
  70. 0.51324863432872
  71. 0.433264001086306
  72. 0.0330866712737663
  73. 0.47096366983035
  74. 0.161809263558545
  75. 0.472665459944687
  76. -0.618707197824834
  77. -0.707140242161941
  78. -0.796395999406434
  79. -1.37977283137356
  80. 0.174168918554502
  81. 1.16198108415538
  82. 0.240924172459278
  83. 0.163803517137472
  84. -0.0295484041589014
  85. 0.316959353443105
  86. 0.0990244668062661
  87. 0.959479309475799
  88. -0.154095315436447
  89. 0.0876471292185722
  90. -0.975211786793233
  91. -0.160177090480593
  92. 0.408520371725512
  93. -1.21230191643322
  94. 0.174790267853405
  95. -0.263325061110419
  96. -0.331390983536557
  97. -0.55845690516855
  98. 1.54580313015057
  99. -1.43760729374703
  100. 0.659194724555339
  101. 0.505289074215003
  102. 0.309575858725515
  103. 0.0792636513496216
  104. -0.495326913102547
  105. 0.625649209349199
  106. 0.199965546235594
  107. 0.0458506695204354
  108. 0.600646524427609
  109. 0.444104564899117
  110. -1.78534721046905
  111. 0.107703731619003
  112. -0.0769904547661952
  113. -0.628357368797166
  114. -0.146177092935981
  115. -0.283473543250281
  116. 0.171605828248357
  117. 0.220610320609359
  118. -0.242442689264336
  119. 0.480984435907078
  120. 0.249843030526219
  121. -0.382884621642548
  122. -0.0547094798756334
  123. -0.184855093649186
  124. 0.732859140073168
  125. -0.420745629222272
  126. -0.805886522006007
  127. 0.0511188833610657
  128. 0.228925556510797
  129. 0.748750081940452
  130. -0.0790854445689188
  131. -1.19039397039156
  132. -0.597467535897074
  133. -0.273421247922618
  134. -0.034440801847144
  135. -1.09583513084325
  136. 0.110532394842963
  137. 0.222302683124281
  138. 0.577771465309221
  139. 0.114995622227337
  140. 0.478015214903966
  141. 0.00259442287761481
  142. 0.548900369643642
  143. 0.474135423854383
  144. -0.593352379286744
  145. 0.104783001434842
  146. 0.20065540433654
  147. -0.635488521522906
  148. -0.190352313135152
  149. 0.00965195395311359
  150. -0.514478393553439
  151. -0.155203339841672
  152. 0.0153125780491068
  153. 0.570937039561489
  154. 0.565108355985027
  155. 0.370576099729256
  156. -0.184907000983589
  157. -0.0424355331550941
  158. 0.363737092491557
  159. 0.486079719501295
  160. 0.617980144639591
  161. 0.127346643104944
  162. 0.576212392293611
  163. 1.14731940193369
  164. 0.0381579779920656
  165. 0.617731488288549
  166. -0.77738758721756
  167. 0.651559343564156
  168. 0.104613987138894
  169. 0.23847055612096
  170. -0.232547750874384
  171. -0.299089667269953
  172. -0.675347528007671
  173. 0.193410114674369
  174. -0.845756694903534
  175. -1.04509602111103
  176. 0.342714501309399
  177. 0.00774058972103765
  178. 0.292801753456265
  179. -0.359903321928166
  180. -0.555437752416736
  181. 0.135466477053889
  182. -0.781727902258028
  183. -0.871993993395716
  184. -1.42458958546773
  185. 0.54408498648426
  186. -0.356634120612233
  187. -0.0261365803829446
  188. 1.08241039091775
  189. 1.24310698780851
  190. -0.50570044344183
  191. -1.12401958986306
  192. 0.778175666559971
  193. 0.197827060210435
  194. 0.682042894330426
  195. -1.90892687197932
  196. -0.304567608585988
  197. -0.318956690337209
  198. -0.844729353338343
  199. 0.408607832134855
  200. 0.705319834351269
  201. -0.122898358843441
  202. 0.1927396771398
  203. -0.957086349240333
  204. 1.05908197630402
  205. -1.1719115116974
  206. 0.672548530422078
  207. -0.336700584761106
  208. 0.5303724084039
  209. -0.199407811178766
  210. -0.132872308424535
  211. 0.722436279463528
  212. -0.144849994867385
  213. -1.24082298429658
  214. -0.239301733224531
  215. -0.236700324812568
  216. -0.141987251135926
  217. 0.0415065017980782
  218. 0.137044742414345
  219. -0.635873395165866
  220. -0.179727999097719
  221. -0.595241913333765
  222. -0.24647733022787
  223. -0.380277700993725
  224. 0.465604094253516
  225. -0.982789120434471
  226. 1.32779154293082
  227. -0.249411127672199
  228. 0.499609270693666
  229. -0.230851396967784
  230. 0.614958065946683
  231. 0.311374426487995
  232. 1.31901816193398
  233. 0.173932130737699
  234. -0.143291316564817
  235. 0.317792988176219
  236. 0.400114649997336
  237. -0.392846499136997
  238. 0.131425122880487
  239. -0.20067213741801
  240. -0.0607768232062802
  241. 0.0377966405712152
  242. 0.607195234547966
  243. -0.626558567229277
  244. 0.04923320070171
  245. -0.0388515375911057
  246. 0.0227254319194149
  247. -0.490612971316123
  248. 0.597873648287759
  249. 0.493180161780241
  250. 0.552046065213196
  251. 0.268078627493372
  252. 0.0214678209143141
  253. 1.43072760844662
  254. 0.375909053486928
  255. -0.270143274199643
  256. -0.328989985820252
  257. -0.324047436150033
  258. 0.309508864206891
  259. 0.285471151824734
  260. -0.803511725451539
  261. 0.347594327902329
  262. 0.160344809243029
  263. 0.198368196004572
  264. -0.214020717851268
  265. 0.110121431752409
  266. 0.108313259614765
  267. 0.0900150266842628
  268. 0.299067501677955
  269. -0.200643891614644
  270. -0.0305411700282991
  271. 0.0843870260826492
  272. 0.0979642523237162
  273. -0.0223843306744335
  274. -0.0417601503323462
  275. -0.373644379850205
  276. 0.0559325384851133
  277. 0.150829064197495
  278. 1.42739733492391
  279. 0.231024636767828
  280. -0.0612220464354793
  281. 0.306055807263793
  282. 0.271872289170673
  283. 0.162271553001167
  284. -0.00109100451845905
  285. -0.63150298919905
  286. -0.43019015664549
  287. 0.314591510870854
  288. -0.18463510538793
  289. 0.0363827734906659
  290. 0.0794960349686295
  291. 0.409112645340063
  292. 0.203253894690365
  293. 0.0126979606198384
  294. -0.0993689589640338
  295. 0.393331958157855
  296. 0.768311857960217
  297. -0.166174559822918
  298. 0.0194787078635519
  299. -0.201019945192484
  300. -0.156991119296019
  301. 0.0313876670216857
  302. -0.202861442815747
  303. 0.130122476434634
  304. -0.185669677213477
  305. 0.670062569578086
  306. 0.0132771280062386
  307. -0.865090408086601
  308. -0.235006330861663
  309. -0.672328156402497
  310. -0.260516704789649
  311. 0.716815965867129
  312. -0.601288178822055
  313. -0.459814870576402
  314. 0.391136191698367
  315. -0.0891221012912028
  316. -0.638523289967293
  317. -0.0197951932694879
  318. -0.667202100567757
  319. -0.100734737460492
  320. -0.241871386784294
  321. 0.283972171237526
  322. 0.24445075893586
  323. 0.589177904655416
  324. 0.10657094396947
  325. 0.219558640544682
  326. 0.181283023292249
  327. -0.348033370806427
  328. -1.0101130893753
  329. 0.31660390377646
  330. -0.2321078695181
  331. -0.1126268244765
  332. -0.277888179744952
  333. -0.379608541493894
  334. 0.353742193304801
  335. 0.41688751268603
  336. -0.293177896085279
  337. 0.504315332816312
  338. 0.574170437434487
  339. 0.0703063737998513
  340. 0.254161117043312
  341. -1.33753962139377
  342. 0.1094938239799
  343. -0.253467839706602
  344. -0.42374448372486
  345. 0.578621020619081
  346. 0.0398588146239449
  347. 0.811661617497072
  348. 0.691591050721548
  349. -0.253247055541312
  350. -0.337560480420804
  351. 0.205846988001099
  352. -0.409106913172943
  353. 0.0883324648848052
  354. 0.361594978138647
  355. -0.167818842449012
  356. -0.466942009686339
  357. -0.38709749983011
  358. 0.0188003664235115
  359. -0.0593430290088476
  360. 0.310462532567425
  361. 0.046036837551466
  362. -0.062120928291302
  363. -0.484902284494166
  364. -0.49979076857385
  365. -0.97642620346911
  366. -0.473033561002265
  367. 0.401753965109803
  368. 0.191021341547152
  369. 0.15299286236459
  370. -0.348505154355295
  371. -0.168009770938077
  372. 0.0147146971709352
  373. 0.14646189370463
  374. 0.914075323482991
  375. -0.183494454445337
  376. -0.722813507946524
  377. -0.605721926249326
  378. -0.856705294780885
  379. -0.0242917299060402
  380. -0.0283313509289719
  381. -0.36978543900408
  382. -0.395452277813811
  383. -0.146689831808683
  384. 0.559298404511868
  385. -0.873280858874577
  386. -0.264273839571527
  387. 0.179712024336779
  388. 0.64693222964836
  389. -0.282518070269094
  390. 0.237011696984912
  391. -0.2795374457367
  392. -0.238287296765391
  393. 0.124897677115609
  394. 0.267964726971252
  395. 0.25634225029511
  396. 0.196128992830196
  397. 0.0171401078388894
  398. -0.0303462410440101
  399. 0.376275196404285
  400. 0.137377828391158
  401. 0.103920879852371
  402. 0.0225873112673989
  403. 0.356348275993848
  404. 0.0348347432719014
  405. -0.113758920388478
  406. 0.157303859883425
  407. 1.50902729418289
  408. -0.637154383681341
  409. 0.0227454394583739
  410. -0.0239439578551908
  411. 0.365536347549757
  412. -0.0831460231667694
  413. -0.671838274827695
  414. -0.0974991277479994
  415. 0.136642063612542
  416. 0.150644256252293
  417. -0.292431241734459
  418. -0.391568796734324
  419. -1.08592497455603
  420. -0.200677624740478
  421. 0.554251406183514
  422. 0.0422983394158966
  423. -0.0327007112905325
  424. 0.295101206646038
  425. -0.0130596178872149
  426. 0.355246586398312
  427. 0.00882029693464186
  428. -0.0740779530756531
  429. -0.165179724309739
  430. 0.577751361285519
  431. -0.205242549292805
  432. 0.784966825735593
  433. 0.259554162440557
  434. 0.618017421060799
  435. 0.204515251684371
  436. 0.135483719586877
  437. -0.110990858792344
  438. -1.28279504403861
  439. 0.459272835533301
  440. -0.533353171093103
  441. 0.846398590109737
  442. 0.715325696082604
  443. 0.517707221287629
  444. -0.398041818642966
  445. -0.11123876573397
  446. -0.141936807193929
  447. 0.00213290461027701
  448. 0.00892808588273648
  449. 0.0705007780377298
  450. -0.218987064118407
  451. 0.250232541998235
  452. -0.12083656956659
  453. -0.404878523719906
  454. -0.168901752048625
  455. 0.085355564079961
  456. 0.0130326723803826
  457. 0.0997903409940982
  458. -0.520339306860613
  459. -0.382104788294255
  460. -0.341266434779688
  461. -0.0979736907245979
  462. -0.478071705265993
  463. 0.219475994052466
  464. -0.384897072871485
  465. 0.459038181161329
  466. 0.0388468283575872
  467. -0.0670594423453729
  468. -0.452450987141151
  469. 0.375065648070735
  470. 0.0294028667358086
  471. 0.553440756626007
  472. 0.171597920346784
  473. 0.067748762273184
  474. 0.0185840961288457
  475. 0.222329266146763
  476. 0.286568375365839
  477. -0.0373237416223778
  478. -0.0535484713923346
  479. -0.12432428607271
  480. -0.247000663422062
  481. 0.148863221793096
  482. 0.420960064428863
  483. 0.742388969197263
  484. 1.08847670949239
  485. -0.346192518908257
  486. -0.635786659904385
  487. 0.0475631298824613
  488. -0.164116806005477
  489. 0.137422944002005
  490. -0.0019238083438465
  491. -0.139108242494969
  492. 0.169695854268346
  493. -0.0144441748519712
  494. -0.458545815585642
  495. -0.489952422705455
  496. 0.314629220768641
  497. 0.0103668713177734
  498. -0.659532649959547
  499. -0.119391674927083
  500. -0.250153191291754
  501. -0.058588787182633
  502. -0.884808286500733
$avar
0.271321372289188
$aic
-1.28453096636655
$aicc
-0.280208860849224
$bic
-1.24251303688755
$se.phi
  1. 0.0170644788585324
  2. 0.0263467889100477
$se.theta
  1. 0.0198434265366474
  2. 0.0290305029623824

Next, the estimates from model 1, ARIMA(2,1,2), are used to create a forecast. The phi and theta estimates and the difference term ($d = 1$) are passed to the forecast function. The forecast covers 30 days and begins 30 days before the end of the series so that it can be compared against the actual values.

In [34]:
#forecast in tswge for model 1 - ARIMA(2,1,2)
dif.fore1 = fore.aruma.wge(close, phi = dif.est1$phi, theta = dif.est1$theta, d = 1, n.ahead = 30, lastn = T, limits = T)
title("Time Series & Forecast for Last 30 Days of BP Closing with ARIMA (2,1,2)")

Figure 10: Plot of Forecasts for Last 30 Days of BP Closing Stock Price with ARIMA (2,1,2).

The forecast does not track the actual closing prices well: it settles near 37.59 almost immediately.

In [35]:
#Display of forecasts
dif.fore1$f
  1. 37.5974142829317
  2. 37.5978153015433
  3. 37.591695592887
  4. 37.5921734502652
  5. 37.5971180936086
  6. 37.5960758401021
  7. 37.5921670526341
  8. 37.5935347071834
  9. 37.5965530267799
  10. 37.5950363855234
  11. 37.5927663322817
  12. 37.5943063342414
  13. 37.5959610261328
  14. 37.5944829552595
  15. 37.5933236390174
  16. 37.5946857897871
  17. 37.5954550218431
  18. 37.5942391013569
  19. 37.5937698045247
  20. 37.5948265659016
  21. 37.595071396584
  22. 37.5941744339072
  23. 37.5940922309865
  24. 37.59483701606
  25. 37.594806188035
  26. 37.5942008122693
  27. 37.5943057692612
  28. 37.5947873072486
  29. 37.5946380027928
  30. 37.5942636384659
In [36]:
#close up the forecast
plot(seq(474,503,1), close[474:503],type = 'l', xlim = c(474,503))
lines(seq(474,503), dif.fore1$f, col = 'blue')
title("Plot of Forecasts for Last 30 Days of BP Closing with ARIMA (2,1,2)")

Figure 11: Close-up Plot of Forecasts for Last 30 Days of BP Closing Stock Price with ARIMA (2,1,2). The forecast does not track the actual closing prices well.

Forecasting errors can be evaluated in terms of Average Square Error (ASE). Lower ASE models are preferred.

$ASE = \text{mean}\left((\text{Forecasted Value} - \text{Observed Actual Value})^2\right)$
In [37]:
#calculate ASE for model 1 - ARIMA(2,1,2)
ASE1 = mean((dif.fore1$f - close[(length(close) - 29):length(close)])^2)
ASE1
0.895149546218454
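
As a small illustration of the ASE calculation, the sketch below wraps the formula in a helper and applies it to a toy hold-out. The function name `ase` and the toy numbers are assumptions made for this example; they are not part of the notebook or of tswge.

```r
# Hypothetical helper: average squared error between forecasts and held-out values
ase <- function(forecast, actual) {
  stopifnot(length(forecast) == length(actual))
  mean((forecast - actual)^2)
}

# Toy numbers (not BP data): hold out the last 3 points of a short series
x <- c(10, 11, 12, 13, 14, 15)
h <- 3
actual <- tail(x, h)                  # the held-out values 13, 14, 15
naive  <- rep(x[length(x) - h], h)    # repeat the last training value, 12

ase(naive, actual)                    # (1 + 4 + 9) / 3
```

The same helper could be applied to any forecast vector and the matching slice of the observed series, as long as both have the same length.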


Next, a grid search function can be used to attempt to find the best-fitting model for the BP stock data.

The auto.arima() function from the forecast package (loaded with fpp2) uses a variation of the Hyndman-Khandakar algorithm, which combines unit root tests, minimisation of the AICc and MLE to obtain an ARIMA model.

  1. The number of differences $0≤d≤2$ is determined using repeated KPSS tests.
  2. The values of $p$ and $q$ are then chosen by minimising the AICc after differencing the data $d$ times. Rather than considering every possible combination of $p$ and $q$, the algorithm uses a stepwise search to traverse the model space.

    a. Four initial models are fitted:

    • ARIMA(0,$d$,0),
    • ARIMA(2,$d$,2),
    • ARIMA(1,$d$,0),
    • ARIMA(0,$d$,1).

      A constant is included unless $d = 2$. If $d \leq 1$, an additional model is also fitted: ARIMA(0,$d$,0) without a constant.

    b. The best model (with the smallest AICc value) fitted in step (a) is set to be the “current model”.

    c. Variations on the current model are considered:

    • vary $p$ and/or $q$ from the current model by $\pm 1$;
    • include/exclude $c$ from the current model.

      The best model considered so far (either the current model or one of these variations) becomes the new current model.

    d. Repeat Step 2(c) until no lower AICc can be found.
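
The stepwise idea in step 2 can be sketched in base R. This is a simplified illustration only, not the code used by auto.arima(): it assumes the series has already been differenced, omits the constant term $c$, and uses the hypothetical helper names `aicc_of` and `stepwise_search` together with simulated data.

```r
# Simplified stepwise order search on an already-differenced series.
# Assumptions: no constant term, base R arima(); helper names are hypothetical.
aicc_of <- function(x, p, q) {
  fit <- tryCatch(
    arima(x, order = c(p, 0, q), include.mean = FALSE, method = "ML"),
    error = function(e) NULL
  )
  if (is.null(fit)) return(Inf)       # treat a failed fit as "worst possible"
  k <- p + q + 1                      # AR + MA coefficients plus the noise variance
  n <- length(x)
  fit$aic + (2 * k * (k + 1)) / (n - k - 1)
}

stepwise_search <- function(x, max.p = 5, max.q = 5) {
  # Step (a): score the starting orders (p, q)
  start  <- list(c(0, 0), c(2, 2), c(1, 0), c(0, 1))
  scores <- sapply(start, function(o) aicc_of(x, o[1], o[2]))
  # Step (b): the best of these becomes the current model
  best       <- start[[which.min(scores)]]
  best_score <- min(scores)
  repeat {
    # Step (c): vary p and/or q by +/- 1 around the current model
    grid <- expand.grid(dp = -1:1, dq = -1:1)
    cand <- unique(data.frame(p = best[1] + grid$dp, q = best[2] + grid$dq))
    cand <- cand[cand$p >= 0 & cand$q >= 0 & cand$p <= max.p & cand$q <= max.q, ]
    cand_scores <- mapply(aicc_of, p = cand$p, q = cand$q, MoreArgs = list(x = x))
    # Step (d): stop once no neighbour has a lower AICc
    if (min(cand_scores) >= best_score) break
    i          <- which.min(cand_scores)
    best       <- c(cand$p[i], cand$q[i])
    best_score <- cand_scores[i]
  }
  list(p = best[1], q = best[2], aicc = best_score)
}

# Usage on simulated AR(1) data (not BP data)
set.seed(1)
x   <- arima.sim(list(ar = 0.6), n = 300)
res <- stepwise_search(x)
res
```

Because each pass only looks at the immediate neighbours of the current model, far fewer models are fitted than in an exhaustive search over every $(p, q)$ pair, at the cost of possibly stopping at a local minimum.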

For this grid search, the maximum values of $p$ and $d$ are raised above their defaults (max.p from 5 to 8 and max.d from 2 to 3) so that more models are searched; max.q is left at its default of 5.

In [38]:
#grid search function of fpp2 package
auto.arima(close.dif, max.p = 8, max.q = 5, max.d=3)
Series: close.dif 
ARIMA(2,0,1) with zero mean 

Coefficients:
          ar1      ar2     ma1
      -0.3148  -0.0400  0.2388
s.e.   0.5719   0.0595  0.5705

sigma^2 estimated as 0.2785:  log likelihood=-389.94
AIC=787.89   AICc=787.97   BIC=804.76

The grid search suggests an ARIMA(2,0,1) on the differenced data, which is equivalent to an ARIMA(2,1,1) on the original data.
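
The link between the two model orders can be written out with backshift notation. This is a sketch that assumes the tswge sign convention, in which both the AR and MA polynomials are written with minus signs; $X_t$ (closing price), $Y_t$ (its first difference) and $a_t$ (white noise) are symbols introduced here for illustration.

```latex
\begin{align*}
Y_t &= (1 - B)\,X_t = X_t - X_{t-1} \\
(1 - \phi_1 B - \phi_2 B^2)\,Y_t &= (1 - \theta_1 B)\,a_t
  && \text{ARMA(2,1) on the differenced data} \\
(1 - \phi_1 B - \phi_2 B^2)(1 - B)\,X_t &= (1 - \theta_1 B)\,a_t
  && \text{ARIMA(2,1,1) on the original data}
\end{align*}
```

Under this convention, the ma1 coefficient printed by auto.arima(), which writes the MA polynomial with a plus sign, would correspond to $\theta_1$ with the opposite sign.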

Next, the parameters of the suggested model are estimated, as was done for the manually selected model 1.

In [39]:
#create estimate of suggested ARIMA(2,1,1) model
dif.est4 = est.arma.wge(close.dif, p=2, q=1)
Coefficients of Original polynomial:  
-0.3278 -0.0411 

Factor                 Roots                Abs Recip    System Freq 
1+0.3278B+0.0411B^2   -3.9830+-2.9051i      0.2028       0.3997
  
  
In [40]:
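#forecasting based on suggested grid search model ARIMA(2,1,1)
#note: only phi and d are passed below; theta = dif.est4$theta is not supplied, so the MA(1) term is left out of this forecast
dif.fore4 = fore.aruma.wge(close, phi = dif.est4$phi, d = 1, n.ahead = 30, lastn = T, limits = T)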
title("Time Series and Forecast for Last 30 Days of BP Closing with ARIMA (2,1,1)")

Figure 12: Plot of Forecasts for Last 30 Days of BP Closing Stock Price with ARIMA (2,1,1) as suggested by grid search. The forecast does not track the actual closing prices well.

In [41]:
#close up the forecast
plot(seq(474,503,1), close[474:503],type = 'l', xlim = c(474,503))
lines(seq(474,503), dif.fore4$f, col = 'blue')
title("Plot of Forecasts for Last 30 Days of BP Closing with ARIMA (2,1,1)")

Figure 13: Close-up Plot of Forecasts for Last 30 Days of BP Closing Stock Price with ARIMA (2,1,1). The forecast does not track the actual closing prices well.

In [42]:
#calculate ASE for model 4 - ARIMA(2,1,1)
ASE4 = mean((dif.fore4$f - close[(length(close) - 29):length(close)])^2)
ASE4
0.967104064727587

Combining auto.arima() with forecast() also makes it easy to plot forecasts for the next $n$ values.

In [43]:
#plot of future 30 days
plot(forecast(auto.arima(close, max.p = 8, max.q = 5, max.d=3),h=30))

Figure 14: Plot of Forecasts for Next 30 Days of BP Closing Stock Price with ARIMA (2,1,1). The forecast only seems to repeat the same value.
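
A flat forecast is expected from a once-differenced model with no drift term. The sketch below illustrates this in its simplest form, an ARIMA(0,1,0), using base R's arima() and predict() on a simulated random walk; both the simulated data and the choice of model are assumptions made for illustration and are not the BP series or the model fitted above. With AR and MA terms added, the forecasts would move briefly before levelling off, much like the values near 37.59 shown earlier.

```r
# Simulated random walk (not BP data); base R arima()/predict() are used here
set.seed(42)
x <- cumsum(rnorm(200))

# ARIMA(0,1,0) with no drift: the differences are modelled as zero-mean noise
fit <- arima(x, order = c(0, 1, 0))
fc  <- predict(fit, n.ahead = 5)

fc$pred       # every point forecast equals the last observed value
tail(x, 1)    # the last observed value, for comparison
fc$se         # the standard errors still grow with the horizon
```

The point forecast stays constant while the prediction interval widens, which matches the shape of a forecast that only seems to repeat the same value.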

Summary of manual model search and grid search with auto.arima().

The ASE for the grid-searched ARIMA(2,1,1) is slightly higher (0.9671 versus 0.8951), indicating that the first model tried in the manual search, ARIMA(2,1,2), forecast the last 30 days of BP closing prices slightly better.

Manual Selection    Grid Search
ARIMA(2,1,2)        ARIMA(2,1,1)
ASE = 0.8951        ASE = 0.9671


Conclusion

Analyzing stock data is no easy task. Neither the manually selected nor the grid-searched ARIMA model in this case study did an adequate job of forecasting the actual BP closing prices. This may be due to an error in the modelling, such as a seasonal term that was not accounted for. It may also reflect that economics is a complicated field with many other variables at work, such as the political environment, interest rates, and supply and demand. In the future, a multivariate model could be used that might better estimate the daily closing prices.
