Speaker
Xinyuan (Eric) Chen, Department of Statistics and Data Science, Yale University
Title
Virtual Statistics Seminar
Subtitle
Variational Bayesian Analysis of Nonhomogeneous Hidden Markov Models with Long and Ultra-long Sequences
Abstract: Nonhomogeneous hidden Markov models (NHMMs) are useful for modeling sequential and autocorrelated data. Bayesian approaches, particularly Markov chain Monte Carlo (MCMC) methods, are the principal statistical inference tools for NHMMs. However, MCMC sampling is computationally demanding, especially for long observation sequences. We develop a variational Bayes (VB) method for NHMMs that uses a structured variational family of Gaussian distributions with factorized covariance matrices to approximate the target posteriors, and that combines the forward-backward algorithm with stochastic gradient ascent for estimation. To improve efficiency and handle ultra-long sequences, we further propose a subsequence VB (SVB) method that works on subsamples. The SVB method exploits the memory decay property of NHMMs and uses buffers to control the bias caused by breaking sequential dependence when subsampling. We highlight that the local nonhomogeneity of NHMMs substantially affects the required buffer lengths, and we propose the use of local Lyapunov exponents, which characterize the local memory decay rates of NHMMs and determine buffer lengths adaptively. Our methods are applied to two problems: modeling eye-tracking scan-paths of autistic children to examine the comparative social salience of humanoid representation and person in designed social-communicative scenes, and modeling ultra-long sequences of customers' telecom records to uncover the relationship between their mobile Internet use behavior and conventional telecommunication behaviors.
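The memory decay and buffering idea mentioned in the abstract can be illustrated with a small toy example. The sketch below is not the speaker's SVB method: it only runs a normalized forward filter on a simulated two-state NHMM and compares the filtered state probability at one time point when the recursion starts from the beginning of the sequence versus from a uniform guess placed a fixed number of steps earlier. The model (logistic transitions driven by a sinusoidal covariate, unit-variance Gaussian emissions), the fixed buffer length of 100, and all function and variable names are assumptions made for illustration; no local Lyapunov exponents are computed.

```python
import math
import random


def transition_matrix(x, beta):
    """2x2 row-stochastic matrix; beta[i] = (intercept, slope) for P(next=1 | current=i)."""
    rows = []
    for b0, b1 in beta:
        p1 = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        rows.append([1.0 - p1, p1])
    return rows


def emission_pdf(y, mean):
    """Unit-variance Gaussian density (an assumed emission model)."""
    return math.exp(-0.5 * (y - mean) ** 2) / math.sqrt(2.0 * math.pi)


def forward_filter(ys, xs, beta, means, init):
    """Normalized forward recursion; returns filtered state probabilities per time step."""
    alpha = [init[k] * emission_pdf(ys[0], means[k]) for k in range(2)]
    total = sum(alpha)
    alpha = [a / total for a in alpha]
    out = [alpha]
    for t in range(1, len(ys)):
        P = transition_matrix(xs[t], beta)
        pred = [sum(alpha[i] * P[i][j] for i in range(2)) for j in range(2)]
        alpha = [pred[j] * emission_pdf(ys[t], means[j]) for j in range(2)]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
        out.append(alpha)
    return out


def simulate(T, beta, means, seed=0):
    """Simulate observations from the toy NHMM with covariate x_t = sin(t / 50)."""
    rng = random.Random(seed)
    xs = [math.sin(t / 50.0) for t in range(T)]
    z, ys = 0, []
    for t in range(T):
        if t > 0:
            z = 1 if rng.random() < transition_matrix(xs[t], beta)[z][1] else 0
        ys.append(rng.gauss(means[z], 1.0))
    return ys, xs


BETA = [(-2.0, 1.0), (2.0, 1.0)]   # hypothetical transition coefficients
MEANS = [-1.0, 1.0]                # hypothetical state-specific emission means
ys, xs = simulate(2000, BETA, MEANS)
t0 = 1000
full = forward_filter(ys, xs, BETA, MEANS, [0.5, 0.5])[t0]


def buffered_error(buffer_len):
    """Gap between the full-sequence filter at t0 and one started buffer_len steps earlier."""
    s = t0 - buffer_len
    sub = forward_filter(ys[s:t0 + 1], xs[s:t0 + 1], BETA, MEANS, [0.5, 0.5])[-1]
    return abs(sub[1] - full[1])


err_unbuffered = buffered_error(0)    # subsequence starts exactly at t0
err_buffered = buffered_error(100)    # 100 extra observations used as a buffer
print(err_unbuffered, err_buffered)
```

In this toy setting the discrepancy caused by cutting the sequence shrinks rapidly as the buffer grows, because the filter forgets its starting distribution. How fast it forgets depends on the transition matrices along the way, which is the kind of local variation the abstract says should inform the buffer length.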