## Abstract

We consider the regression model y_{t} = η_{t} + ε_{t}, t = 0, 1, …, n − 1, where y_{t} are scalar observations, η_{t} is the unknown regression function, and ε_{t} are unobservable errors generated by a zero-mean weakly stationary process, independent of η_{t} and with completely unknown autocorrelation structure. We propose a data-driven method for selecting a parametric or nonparametric estimator of η_{t}. The method is based on cross-validation in the frequency domain and requires no assumptions about the form of the estimator or the error correlations. It does, however, require the discrete Fourier transform (DFT) of the signal η_{t} to be a smooth complex function of frequency, as is the case, for example, with transient signals or growth and decay curves. After giving some general motivations for the method, we focus on the special case of linear estimators of a nonparametric regression function, including both kernel and spline estimators. For these estimators, we develop efficient methods of evaluating the frequency domain cross-validation (FDCV) function. The standard time domain cross-validation (TDCV) method, which leaves out data points one at a time, is sensible only when the errors are independent. Autocorrelation among the errors can cause severe biases in the TDCV function, leading to poor selections. FDCV leaves out discrete Fourier transform values one at a time. These values are approximately independent regardless of the error correlation structure, and hence FDCV remains valid even for correlated errors, as long as the DFT of η_{t} at the omitted frequency can be predicted from those remaining. Asymptotic properties of FDCV are given for a class of transient signals. Then the usefulness of FDCV for transient and other signals is demonstrated in a Monte Carlo study comparing the performances of TDCV and FDCV for selecting a kernel estimator of a nonparametric regression function. 
The use of FDCV is illustrated with data on international airline travel.
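The leave-one-out idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's FDCV procedure. The names `loo_kernel_cv`, `select_bandwidth`, and `demo` are hypothetical, as are the Gaussian kernel, the AR(1) error model, and the exponential-decay signal. In particular, the frequency-domain call smooths the DFT ordinates directly across frequency as a stand-in for "predicting the omitted DFT value from those remaining"; the paper instead evaluates time-domain estimators of η_{t}, which this sketch does not reproduce.

```python
import numpy as np

def loo_kernel_cv(values, h):
    """Leave-one-out CV score for a Gaussian kernel smoother on an equally spaced grid.

    `values` may be real (time-domain observations) or complex (DFT ordinates).
    Very small h can make every off-diagonal weight underflow to zero, so keep h >= ~0.5.
    """
    values = np.asarray(values)
    idx = np.arange(values.shape[0])
    # Gaussian weights between every pair of grid positions.
    w = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / h) ** 2)
    np.fill_diagonal(w, 0.0)               # omit the point being predicted
    w /= w.sum(axis=1, keepdims=True)      # renormalise the remaining weights
    pred = w @ values                      # prediction of each omitted value
    return float(np.mean(np.abs(values - pred) ** 2))

def select_bandwidth(values, grid):
    """Return the bandwidth in `grid` with the smallest leave-one-out score."""
    return min(grid, key=lambda h: loo_kernel_cv(values, h))

def demo(n=128, phi=0.8, seed=0):
    """Compare the two selections on a simulated transient signal (illustrative only)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    eta = 5.0 * np.exp(-t / 10.0)          # assumed transient signal (decay curve)
    innov = rng.normal(size=n)
    eps = np.zeros(n)
    eps[0] = innov[0]
    for i in range(1, n):                  # assumed AR(1) errors, autocorrelated
        eps[i] = phi * eps[i - 1] + innov[i]
    y = eta + eps
    grid = [1.0, 2.0, 4.0, 8.0, 16.0]
    h_time = select_bandwidth(y, grid)     # omit data points one at a time
    dft = np.fft.rfft(y) / np.sqrt(n)      # DFT ordinates at nonnegative frequencies
    h_freq = select_bandwidth(dft, grid)   # omit DFT ordinates one at a time
    return h_time, h_freq

if __name__ == "__main__":
    print(demo())
```

The same routine handles both cases because the leave-one-out residual is defined identically for real and complex values. The two selected bandwidths are not comparable to each other, since one is measured in time steps and the other in frequency bins; the sketch only shows the mechanics of omitting one ordinate at a time in each domain.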

Original language | English (US) |
---|---|
Pages (from-to) | 705-714 |
Number of pages | 10 |
Journal | Journal of the American Statistical Association |
Volume | 85 |
Issue number | 411 |
DOIs | |
State | Published - Sep 1990 |

## Keywords

- Cross-validation
- Model selection
- Nonparametric regression

## ASJC Scopus subject areas

- Statistics and Probability
- Statistics, Probability and Uncertainty