238 Marcella Niglio and Cira Perna
Procedure 2: Subsample length selection
- Fix $m < T$ and compute $\hat{X}_{T,m}$, the subsampling estimate from the entire data set $X_T$.
- For all $b_m < m$, compute $\hat{X}^{(i)}_{m,b_m}$, the subsampling estimate of the forecast computed from $(X_i, X_{i+1}, \ldots, X_{i+m-1})$.
- Select the value $\hat{b}_m$ that minimizes the estimated mean square error (EMSE):
  $$\mathrm{EMSE}(m, b_m) = (T-m+1)^{-1} \sum_{i=1}^{T-m+1} \left( \hat{X}^{(i)}_{m,b_m} - \hat{X}_{T,m} \right)^2 .$$
- Choose $\hat{b} = (T/m)^{\delta} \, \hat{b}_m$, where $\delta \in (0,1)$ is a fixed real number.
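The steps of Procedure 2 can be sketched in Python as follows. This is only an illustrative sketch: the callable `estimate(series, b)` is a hypothetical stand-in for the subsampling estimate of the forecast (it is not defined in this section), `toy_estimate` is a placeholder used purely to make the example run, and the function names and the default `delta=0.4` are assumptions of the sketch.

```python
import numpy as np


def emse(x, m, b, estimate):
    """Estimated mean square error EMSE(m, b) of Procedure 2.

    `estimate(series, b)` is a hypothetical callable returning the
    subsampling estimate computed from `series` with subsample length b.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    # Target: estimate from the entire series with subsample length m.
    target = estimate(x, m)
    # One estimate per window (X_i, ..., X_{i+m-1}), i = 1, ..., T-m+1.
    sq_errors = [
        (estimate(x[i:i + m], b) - target) ** 2
        for i in range(T - m + 1)
    ]
    return float(np.mean(sq_errors))


def select_subsample_length(x, m, estimate, delta=0.4):
    """Return (b_hat_m, b_hat, scores) for a fixed window length m."""
    T = len(x)
    if not 1 < m < T:
        raise ValueError("m must satisfy 1 < m < T")
    # Evaluate EMSE for every candidate b_m < m.
    scores = {b: emse(x, m, b, estimate) for b in range(1, m)}
    # b_hat_m minimizes the EMSE (first minimizer in case of ties).
    b_hat_m = min(scores, key=scores.get)
    # Rescale from window length m to the full length T.
    b_hat = (T / m) ** delta * b_hat_m
    return b_hat_m, b_hat, scores


def toy_estimate(series, b):
    """Placeholder estimator (assumption): mean of the last b values."""
    return float(np.mean(series[-b:]))


rng = np.random.default_rng(0)
x = rng.standard_normal(70)
b_hat_m, b_hat, scores = select_subsample_length(x, m=20, estimate=toy_estimate)
```

The rescaled value `b_hat` is returned as a real number; how it is mapped to an integer subsample length is left open here, since the procedure as stated does not specify it.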
3 Simulation results
To illustrate the performance of the proposed procedure we have used simulated data
sets generated by models with known structure. The aim is to evaluate the ability
of our procedure to select a proper value for the autoregressive parameter $p$ in the
presence of given data-generating processes.
The simulated time series have been generated by two structures: a linear autoregressive model (AR) and a self-exciting threshold autoregressive model (SETAR), both of which, as is well known, belong to the class of Markov processes (1).
More precisely the simulated models are:
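Model 1 - AR(1): $X_t = -0.8 X_{t-1} + \varepsilon_t$, with $\varepsilon_t \sim N(0,1)$;
Model 2 - SETAR(2;1,1):
$$X_t = \begin{cases} 1.5 - 0.9 X_{t-1} + \varepsilon_t, & X_{t-1} \le 0 \\ -0.4 - 0.6 X_{t-1} + \varepsilon_t, & X_{t-1} > 0, \end{cases} \qquad \text{with } \varepsilon_t \sim N(0,1),$$
where Model 2 has been used in [21] to evaluate the forecast ability of SETAR models.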
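As an illustration, the two data-generating processes can be simulated with a short NumPy sketch. The function names, the zero starting value, and the burn-in period of 200 observations are assumptions of this sketch and are not taken from the simulation design described here.

```python
import numpy as np


def simulate_ar1(T, phi=-0.8, burn_in=200, rng=None):
    """Model 1: X_t = phi * X_{t-1} + eps_t, eps_t ~ N(0, 1)."""
    rng = np.random.default_rng(rng)
    n = T + burn_in
    eps = rng.standard_normal(n)
    x = np.zeros(n)  # assumed starting value X_0 = 0
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x[burn_in:]  # discard the (assumed) burn-in period


def simulate_setar(T, burn_in=200, rng=None):
    """Model 2: SETAR(2;1,1) with threshold 0 on X_{t-1}."""
    rng = np.random.default_rng(rng)
    n = T + burn_in
    eps = rng.standard_normal(n)
    x = np.zeros(n)  # assumed starting value X_0 = 0
    for t in range(1, n):
        if x[t - 1] <= 0:
            # Lower regime: X_{t-1} <= 0
            x[t] = 1.5 - 0.9 * x[t - 1] + eps[t]
        else:
            # Upper regime: X_{t-1} > 0
            x[t] = -0.4 - 0.6 * x[t - 1] + eps[t]
    return x[burn_in:]


# One replication of each model for the two series lengths considered.
series = {
    (name, T): sim(T, rng=seed)
    for seed, (name, sim) in enumerate(
        [("AR(1)", simulate_ar1), ("SETAR(2;1,1)", simulate_setar)]
    )
    for T in (70, 100)
}
```

Passing an integer as `rng` makes a replication reproducible, which is convenient when the same series has to be reused across the candidate values of $p$ and $m$.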
The simulation study has been implemented by defining a grid of values for $p = 1, 2, 3, 4$ and using series of length $T = 70$ and $T = 100$.
In order to take into account the two different lengths, we have chosen two grids for $m$. When $T = 70$, the grid is $m = \{20, 25, 30, 35\}$, whereas for $T = 100$ it is $m = \{25, 35, 40, 50\}$. The two values for $T$ have been chosen to evaluate the proposed procedure in the presence of series of moderate length, whereas the grid for $m$ has been defined following [14].
Starting from these values, Procedure 1 has been run in a Monte Carlo study with
100 replications. Following [13] we have fixed the parameter $\delta = 0.4$, whereas the
kernel function is Gaussian and the bandwidths $h_i$ ($i = 1, 2, \ldots, p$) in (3) are selected
using a cross-validation criterion.
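To give a sense of the scale of the final step of Procedure 2 with $\delta = 0.4$, the rescaling factor $(T/m)^{\delta}$ can be evaluated for two of the $(T, m)$ pairs in the grids; the values are rounded to three decimals, and $\hat{b}_m$ is left symbolic because it depends on the data.

```latex
\begin{align*}
T = 100,\ m = 25: &\quad \hat{b} = (100/25)^{0.4}\, \hat{b}_m = 4^{0.4}\, \hat{b}_m \approx 1.741\, \hat{b}_m, \\
T = 70,\ m = 35:  &\quad \hat{b} = (70/35)^{0.4}\, \hat{b}_m = 2^{0.4}\, \hat{b}_m \approx 1.320\, \hat{b}_m.
\end{align*}
```

Across both grids the ratio $T/m$ lies between 2 and 4, so the factor ranges from about 1.32 to about 1.74.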
The results are summarised in Tables 1 and 2 where the distribution of the 100
simulated series is presented for the AR(1) and SETAR(2; 1,1) models respectively,
comparing the classes in which $\hat{b}$ lies and the candidate values for $p$.