
Understanding Latent Tree Models in Machine Learning
Explore the fundamentals of Latent Tree Models in machine learning, covering basic properties, Bayesian networks, joint distributions, Pouch Latent Tree Models, and more. Learn about the relationship with finite mixture models and phylogenetic trees.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
AAAI 2014 Tutorial Latent Tree Models Part II: Definition and Properties Nevin L. Zhang Dept. of Computer Science & Engineering The Hong Kong Univ. of Sci. & Tech. http://www.cse.ust.hk/~lzhang
Part II: Concept and Properties Latent Latent Tree Tree Models Models Definition Relationship with finite mixture models Relationship with phylogenetic trees Basic Properties AAAI 2014 Tutorial Nevin L. Zhang HKUST 2
Basic Latent Tree Models (LTM) Bayesian network All variables are discrete Structure is a rooted tree Leaf nodes are observed (manifest variables) Internal nodes are not observed (latent variables) Also known as Hierarchical latent class (HLC) models, HLC models (Zhang. JMLR 2004) Parameters: P(Y1), P(Y2|Y1),P(X1|Y2), P(X2|Y2), Semantics: AAAI 2014 Tutorial Nevin L. Zhang HKUST 3
Joint Distribution over Observed Variables Marginalizing out the latent variables in , we get a joint distribution over the observed variables . In comparison with Bayesian network without latent variables, LTM: Is computationally very simple to work with. Represent complex relationships among manifest variables. What does the structure look like without the latent variables? AAAI 2014 Tutorial Nevin L. Zhang HKUST 4
Pouch Latent Tree Models (PLTM) (Poon et al. ICML 2010) An extension of basic LTM Rooted tree Internal nodes represent discrete latent variables Each leaf node consists of one or more continuous observed variable, called a pouch. AAAI 2014 Tutorial Nevin L. Zhang HKUST 5
More General Latent Variable Tree Models Some internal nodes can be observed Internal nodes can be continuous Forest (Choi et al. JMLR 2011) Primary focus of this tutorial: the basic LTM AAAI 2014 Tutorial Nevin L. Zhang HKUST 6
Part II: Concept and Properties Latent Latent Tree Tree Models Models Definition Relationship with finite mixture models Relationship with phylogenetic trees Basic Properties AAAI 2014 Tutorial Nevin L. Zhang HKUST 7
Finite Mixture Models (FMM) Gaussian Mixture Models (GMM): Continuous attributes Graphical model AAAI 2014 Tutorial Nevin L. Zhang HKUST 8
Finite Mixture Models (FMM) GMM with independence assumption Block diagonal co-variable matrix Graphical Model AAAI 2014 Tutorial Nevin L. Zhang HKUST 9
Finite Mixture Models Latent class models (LCM): Discrete attributes Graphical Model Distribution for cluster k: Product multinomial distribution: All FMMs One latent variable Yielding one partition of data AAAI 2014 Tutorial Nevin L. Zhang HKUST 10
From FMMs to LTMs Start with several GMMs, Each based on a distinct subset of attributes Each partitions data from a certain perspective. Different partitions are independent of each other Link them up to form a tree model Get Pouch LTM Consider different perspectives in a single model Multiple partitions of data that are correlated. AAAI 2014 Tutorial Nevin L. Zhang HKUST 11
From FMMs to LTMs Start with several LCMs, Each based on a distinct subset of attributes Each partitions data from a certain perspective. Different partitions are independent of each other Link them up to form a tree model Get LTM Consider different perspectives in a single model Multiple partitions of data that are correlated. Summary: An LTM can be viewed as a collections of FMMs, with their latent variables linked up to form a tree structure. AAAI 2014 Tutorial Nevin L. Zhang HKUST 12
Part II: Concept and Properties Latent Latent Tree Tree Models Models Definition Relationship with finite mixture models Relationship with phylogenetic trees Basic Properties AAAI 2014 Tutorial Nevin L. Zhang HKUST 13
Phylogenetic trees TAXA (sequences) identify species Edge lengths represent evolution time Usually, bifurcating tree topology Durbin, et al. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press. AAAI 2014 Tutorial Nevin L. Zhang HKUST 14
Probabilistic Models of Evolution Two assumptions There are only substitutions, no insertions/deletions (aligned) One-to-one correspondence between sites in different sequences Each site evolves independently and identically P(x|y, t) = i=1 to m P(x(i) | y(i), t) m is sequence length P(x(i)|y(i), t) Jukes-Cantor (Character Evolution) Model [1969] Rate of substitution AAAI 2014 Tutorial Nevin L. Zhang HKUST 15
Phylogenetic Trees are Special LTMs When focus on one site, phylogenetic trees are special latent tree models The structure is a binary tree The variables share the same state space. Each conditional distribution is characterized by only one parameters, i.e., the length of the corresponding edge AAAI 2014 Tutorial Nevin L. Zhang HKUST 16
Hidden Markov Models Hidden Markov models are also special latent tree models All latent variables share the same state space. All observed variables share the same state space. P(yt |st) and P(st+1 | st ) are the same for different t s. AAAI 2014 Tutorial Nevin L. Zhang HKUST 17
Part II: Concept and Basic Properties Latent Tree Models Definition Relationship with finite mixture models Relationship with phylogenetic trees Basic Properties Basic Properties AAAI 2014 Tutorial Nevin L. Zhang HKUST 18
Two Concepts of Models So far, a model consists of Observed and latent variables Connections among the variables Probability values For the rest of Part II, a model consists of Observed and latent variables Connections among the variables Probability parameters AAAI 2014 Tutorial Nevin L. Zhang HKUST 19
Model Inclusion AAAI 2014 Tutorial Nevin L. Zhang HKUST 20
Model Equivalence If m includes m and vice versa, then they are marginally equivalent. If they also have the same number of free parameters, then they are equivalent. It is not possible to distinguish between equivalent models based on data. AAAI 2014 Tutorial Nevin L. Zhang HKUST 21
Root Walking AAAI 2014 Tutorial Nevin L. Zhang HKUST 22
Root Walking Example Root walks to X2; Root walks to X3 AAAI 2014 Tutorial Nevin L. Zhang HKUST 23
Root Walking Theorem Theorem: Root walking leads to equivalent latent tree models. (Zhang, JMLR 2004) Special case of covered arc reversal in general Bayesian network, Chickering, D. M. (1995). A transformational characterization of equivalent Bayesian network structures. UAI. AAAI 2014 Tutorial Nevin L. Zhang HKUST 24
Implication Edge orientations in latent tree models are not identifiable. Technically, better to start with alternative definition of LTM: A latent tree model (LTM) is a Markov random field over an undirected tree, or tree-structured Markov network where variables at leaf nodes are observed and variables at internal nodes are hidden. AAAI 2014 Tutorial Nevin L. Zhang HKUST 25
Implication For technical convenience, we often root an LTM at one of its latent nodes and regard it as a directed graphical model. Rooting the model at different latent nodes lead to equivalent directed models. This is why we introduced LTM as directed models. AAAI 2014 Tutorial Nevin L. Zhang HKUST 26
Regularity |X|: Cardinality of variable X, i.e., the number of states. AAAI 2014 Tutorial Nevin L. Zhang HKUST 27
Regularity (Zhang, JMLR 2004) Can focus on regular models only Irregular models can be made regular Regularized models better than irregular models Theorem: Theorem: The set of all regular models for a given set of observed variables is finite. AAAI 2014 Tutorial Nevin L. Zhang HKUST 28