Sparse Markov Models for High-dimensional Inference

Abstract: Finite-order Markov models are theoretically well-studied models for dependent data. Despite their generality, they are rarely applied in empirical work when the order is larger than one. Practitioners avoid higher-order Markov models because (1) the number of parameters grows exponentially with the order and (2) the interpretation is often difficult. Mixture of transition distribution (MTD) models were introduced to overcome both limitations. MTD models represent higher-order Markov models as a convex mixture of single-step Markov chains, reducing the number of parameters and increasing interpretability. Nevertheless, in practice, estimation of MTD models with large orders is still limited by the curse of dimensionality and high algorithmic complexity. Here, we prove that if only a few lags are relevant, we can consistently and efficiently recover the lags and estimate the transition probabilities of high-order MTD models. Furthermore, we show that using the selected lags we can construct non-asymptotic confidence intervals for the transition probabilities of the model. The key innovation is a recursive procedure for selecting the relevant lags of the model. Our results are based on (1) a new structural result for the MTD and (2) an improved martingale concentration inequality. Our theoretical results are illustrated through simulations.
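
To make the mixture representation concrete, the following minimal Python sketch evaluates the transition distribution of an MTD model and compares parameter counts with a full Markov chain of the same order. It assumes the common MTD form P(X_t = a | past) = sum_j lambda_j p_j(a | x_{t-j}) with nonnegative weights summing to one and no independent component. The function names, the toy kernels, and the weights are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def mtd_transition(past, weights, kernels):
    """Return P(X_t = . | past) under an assumed MTD form.

    past[j] is the symbol observed at lag j + 1, weights[j] is the mixture
    weight of that lag, and kernels[j] is a row-stochastic matrix whose row
    b gives the single-step distribution of X_t given X_{t-(j+1)} = b.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to one")
    probs = np.zeros(kernels[0].shape[1])
    for j, (w, kernel) in enumerate(zip(weights, kernels)):
        probs += w * kernel[past[j]]  # convex combination of kernel rows
    return probs

def n_free_params_full(m, d):
    """Free parameters of a full order-d chain on m symbols."""
    return m**d * (m - 1)

def n_free_params_mtd(m, d):
    """Free parameters of this MTD parameterization: d - 1 weights
    plus d kernels with m * (m - 1) free entries each."""
    return (d - 1) + d * m * (m - 1)

# Toy binary example of order 3 in which lag 2 is irrelevant (weight zero),
# mimicking the sparse setting where only a few lags matter.
weights = [0.7, 0.0, 0.3]
kernels = [
    np.array([[0.9, 0.1], [0.2, 0.8]]),  # lag 1
    np.array([[0.5, 0.5], [0.5, 0.5]]),  # lag 2 (unused: weight is zero)
    np.array([[0.6, 0.4], [0.3, 0.7]]),  # lag 3
]

print(mtd_transition((1, 0, 0), weights, kernels))
print(n_free_params_full(2, 10), n_free_params_mtd(2, 10))
```

In this parameterization the parameter count grows linearly in the order d rather than exponentially, and a lag with zero weight has no effect on the transition distribution, which is the sense in which only the relevant lags need to be recovered.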