Recovering tree models via spectral graph theory
Modeling high-dimensional data with latent tree graphical models is a common approach in many machine learning applications. In these models, the key task is to infer the structure of the tree given only observations at its leaves. A canonical example is the tree of life, where the evolutionary history of a set of organisms is inferred from their nucleotide or protein sequences.
In this talk, we will show that the tree structure is strongly related to the spectral properties of a fully connected graph defined over the terminal nodes of the tree. This relation forms the theoretical basis of two new methods to recover latent tree models: (i) spectral neighbor joining, where subsets of nodes are iteratively merged to form the full tree, and (ii) spectral top-down recovery, where the terminal nodes are iteratively partitioned into smaller subsets. Comparing our approach with several competing methods, we show that in many settings spectral methods have stronger theoretical guarantees and work better in practice.
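To give a feel for how tree structure can show up in spectral quantities, here is a minimal sketch on a hypothetical four-leaf tree. It assumes, as an illustration only, that the weight between two terminal nodes is the product of per-edge similarities along the path connecting them; the abstract does not specify the graph weights, and the names `leaf_similarity` and `second_singular_value` are invented for this example. Under that assumption, the block of the weight matrix that crosses a true split of the tree has rank one, while the block crossing a wrong split does not.

```python
import numpy as np

# Hypothetical quartet tree: leaves 0 and 1 attach to hidden node h1,
# leaves 2 and 3 attach to hidden node h2, and h1 is joined to h2.
leaf_w = np.array([0.9, 0.8, 0.7, 0.6])  # assumed similarity on each leaf edge
inner_w = 0.5                             # assumed similarity on the edge h1 - h2
side = np.array([0, 0, 1, 1])             # hidden node each leaf attaches to


def leaf_similarity(leaf_w, inner_w, side):
    """Weight between two leaves = product of edge similarities on their path."""
    n = len(leaf_w)
    S = np.eye(n)
    for i in range(n):
        for j in range(n):
            if i != j:
                crosses = side[i] != side[j]
                S[i, j] = leaf_w[i] * leaf_w[j] * (inner_w if crosses else 1.0)
    return S


def second_singular_value(S, subset):
    """Second singular value of the block between `subset` and the other leaves."""
    rest = [k for k in range(S.shape[0]) if k not in subset]
    block = S[np.ix_(subset, rest)]
    return np.linalg.svd(block, compute_uv=False)[1]


S = leaf_similarity(leaf_w, inner_w, side)
true_split = second_singular_value(S, [0, 1])   # split induced by the edge h1 - h2
false_split = second_singular_value(S, [0, 2])  # split that matches no tree edge
print(true_split, false_split)
```

In this toy setting, the second singular value is numerically zero for the split that matches the tree and clearly positive for the one that does not, which is the kind of spectral signal a merging or partitioning procedure could use to score candidate subsets.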
Last updated: 15/12/2020