Hidden Markov models : Identification, control and inverse filtering

Abstract: The hidden Markov model (HMM) is one of the workhorse tools in, for example, statistical signal processing and machine learning. It has found applications in a vast number of fields, ranging all the way from bioscience to speech recognition to modeling of user interactions in social networks. In an HMM, a latent state transitions according to Markovian dynamics. The state is only observed indirectly via a noisy sensor – that is, it is hidden. This type of model is at the center of this thesis, which in turn touches upon three main themes.Firstly, we consider how the parameters of an HMM can be estimated from data. In particular, we explore how recently proposed methods of moments can be combined with more standard maximum likelihood (ML) estimation procedures. The motivation for this is that, albeit the ML estimate possesses many attractive statistical properties, many ML schemes have to rely on local-search procedures in practice, which are only guaranteed to converge to local stationary points in the likelihood surface – potentially inhibiting them from reaching the ML estimate. By combining the two types of algorithms, the goal is to obtain the benefits of both approaches: the consistency and low computational complexity of the former, and the high statistical efficiency of the latter.The filtering problem – estimating the hidden state of the system from observations – is of fundamental importance in many applications. As a second theme, we consider inverse filtering problems for HMMs. In these problems, the setup is reversed; what information about an HMM-filtering system is exposed by its state estimates? We show that it is possible to reconstruct the specifications of the sensor, as well as the observations that were made, from the filtering system’s posterior distributions of the latent state. This can be seen as a way of reverse engineering such a system, or as using an alternative data source to build a model.Thirdly, we consider Markov decision processes (MDPs) – systems with Markovian dynamics where the parameters can be influenced by the choice of a control input. In particular, we show how it is possible to incorporate prior information regarding monotonic structure of the optimal decision policy so as to accelerate its computation. Subsequently, we consider a real-world application by investigating how these models can be used to model the treatment of abdominal aortic aneurysms (AAAs). Our findings are that the structural properties of the optimal treatment policy are different than those used in clinical practice – in particular, that younger patients could benefit from earlier surgery. This indicates an opportunity for improved care of patients with AAAs.