Description
In an information theoretical set-up, Rissanen (1983) has proposed the algorithm `context' for data compression. Using his idea we introduce a new sub-class of stationary, possibly sparse, Markov chains (context models), whose dimension is allowed to grow with increasing sample size. Asymptotically, this new class covers infinite dimensional models. Proposing a modification of Rissanen's algorithm (context algorithm), we show how such context models can be selected and fitted in a data-driven way. From this we gain several grounds: an excellent exploratory tool, a novel universal resampling procedure for categorical time series (context bootstrap) and a nonparametric prediction machine. We prove a novel consistency result for our data-driven context algorithm in an asymptotically infinite dimensional setting and also show the asymptotic validity of the context bootstrap for a broad range of situations. The computations can be done recursively and are very fast. A simulation study for the context bootstrap completes our exposition.