A warning about arima() in R
Posted by Alex, 2010/12/22
I just spent a bit of time working out why arima() and lm() were giving different estimates on some dengue data. Having solved the problem, I thought I’d share what I found, using simulated data.
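set.seed(666) # for luck
n = 10000 # large sample size
y = c(1, 2, 3, 4) # initial values
# AR(4) update using the true values quoted below; the N(0,1) noise term is an assumption
for (i in 5:n)
  y[i] = 0.5 + 0.3*y[i-1] + 0.2*y[i-2] + 0.1*y[i-3] - 0.1*y[i-4] + rnorm(1)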
Calling arima() gives the following coefficients:
- ar1 0.29 [should be 0.3, ok]
- ar2 0.19 [should be 0.2, ok]
- ar3 0.09 [should be 0.1, ok]
- ar4 -0.09 [should be -0.1, ok]
- intercept 1.0 [should be 0.5, wrong!]
while lm() gives
- ar1 0.29
- ar2 0.19
- ar3 0.09
- ar4 -0.09
- intercept 0.53 [ok!]
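For anyone wanting to try the comparison, here is a minimal sketch of how the two fits might be obtained. It is not the code used for the numbers above: the seed, the standard normal noise, and the use of stats::filter() and embed() to build the series and the lagged regressors are all assumptions of mine, so the estimates will differ a little from those listed.

```r
set.seed(1)  # arbitrary seed, not the one used for the numbers above
n <- 10000
phi <- c(0.3, 0.2, 0.1, -0.1)  # true AR coefficients
intercept <- 0.5               # true regression-style intercept

# Recursive filter: y[i] = x[i] + phi[1]*y[i-1] + ... + phi[4]*y[i-4],
# with x[i] = intercept + noise (standard normal noise is an assumption).
y <- as.numeric(stats::filter(intercept + rnorm(n), phi, method = "recursive"))

# AR(4) fit with arima(); the last coefficient is labelled "intercept".
fit_arima <- arima(y, order = c(4, 0, 0))

# The same model as an ordinary regression on four lags.
# embed(y, 5) has columns y[t], y[t-1], ..., y[t-4].
lags <- embed(y, 5)
fit_lm <- lm(lags[, 1] ~ lags[, 2:5])

print(round(coef(fit_arima), 2))
print(round(coef(fit_lm), 2))
```

The AR coefficients from the two fits should agree closely; only the term labelled as the intercept should differ.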
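An explanation can be found via the website of Shumway and Stoffer: the value that arima() labels “intercept” is the mean of the stationary distribution, not the “intercept” that you’d expect having learned basic regression. For a stationary AR model the two are related by mean = intercept / (1 - sum of the AR coefficients), which here is 0.5 / (1 - 0.3 - 0.2 - 0.1 + 0.1) = 1.0, just what arima() reported. I wonder how many people have been caught out by this?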
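If you want the regression-style intercept back from an arima() fit, one way is to multiply the reported mean by (1 - sum of the AR coefficients). The sketch below illustrates this on a fresh simulated series; the seed, the use of arima.sim(), and the variable names are my own choices for illustration, not part of the original analysis.

```r
set.seed(1)  # arbitrary seed
phi <- c(0.3, 0.2, 0.1, -0.1)

# arima.sim() draws a zero-mean AR(4) series; adding 1 gives it mean 1.
y <- 1 + arima.sim(model = list(ar = phi), n = 10000)

fit <- arima(y, order = c(4, 0, 0))
mu_hat  <- coef(fit)[["intercept"]]      # despite the label, the estimated mean
phi_hat <- coef(fit)[paste0("ar", 1:4)]  # estimated AR coefficients

# Regression-style intercept implied by the fit.
intercept_hat <- mu_hat * (1 - sum(phi_hat))
print(intercept_hat)
```

With these settings the result should land near 0.5, the value lm() would report as its intercept.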
PS Just noticed in the original version I wrote anova in place of arima. Doh!