## Advances in Approximate Bayesian Inference

NIPS 2015 Workshop; December 11, 2015
Room 513 ab, Palais des Congrès de Montréal, Montréal, Canada

### Session 1

#### Chair: Tamara Broderick

- 8:30 - 8:35 Introduction
- 8:35 - 9:00 Rajesh Ranganath: Challenges to variational inference: Optimization, automation, and accuracy
- 9:00 - 9:15 Theophane Weber: Reinforced Variational Inference (contributed)
- 9:15 - 10:00 Panel: Tricks of the Trade
  - Matt Hoffman, Danilo Rezende, David Duvenaud, Alp Kucukelbir, Stephan Mandt, Michael Betancourt
  - Moderator: Tamara Broderick

### Session 2

#### Chair: Alp Kucukelbir

- 10:30 - 10:55 Andrea Montanari: Approximate inference with semidefinite relaxations
- 10:55 - 11:10 Rajesh Ranganath: Hierarchical Variational Models (contributed)
- 11:10 - 11:30 Poster spotlights
  - Jose Miguel Hernandez-Lobato: Black-box α-divergence Minimization
  - Brooks Paige: Inference Networks for Graphical Models
  - Ryan Giordano: Robust Inference with Variational Bayes
  - Philip Bachman: Training Deep Generative Models: Variations on a Theme
- 11:30 - 1:00 Poster session

### Session 3

#### Chairs: Dustin Tran, James McInerney

- 2:40 - 3:05 James Hensman: A framework for variational inference in Gaussian process models
- 3:05 - 3:20 Daniel Hernandez-Lobato: Stochastic Expectation Propagation for Large Scale Gaussian Process Classification (contributed)
- 3:20 - 3:35 Cedric Archambeau: Incremental Variational Inference for Latent Dirichlet Allocation (contributed)
- 3:35 - 4:00 Emily Fox: Variational inference for large-scale and streaming sequential data

### Session 4

#### Chair: Stephan Mandt

- 4:30 - 4:55 Manfred Opper: Approximate inference for Ising models with random couplings
- 4:55 - 6:00 Panel: On the Foundations and Future of Approximate Inference
  - Max Welling, Yee Whye Teh, Andrew Gelman, Steve MacEachern, Ulrich Paquet
  - Moderator: David Blei

### Abstracts

#### Challenges to variational inference: Optimization, automation, and accuracy

Abstract. Variational inference is experiencing a resurgence. In recent years, researchers have expanded the scope of variational inference to more complex Bayesian models, reduced its computational cost, and developed new theoretical insights. However, several challenges remain. In this talk I will discuss some of our recent work on these challenges: addressing local optima with tempering, automating variational inference in Stan, and constructing richer variational approximations with variational models. Along the way, I'll discuss some research questions related to these areas.
Joint work with Dave Blei, Alp Kucukelbir, Stephan Mandt, James McInerney, and Dustin Tran.

#### Approximate inference with semidefinite relaxations

Abstract. Statistical inference problems arising within signal processing, data mining, and machine learning naturally give rise to hard combinatorial optimization problems. These problems become intractable when the dimensionality of the data is large, as is often the case for modern datasets. A popular idea is to construct convex relaxations of these combinatorial problems, which can be solved efficiently for large-scale datasets. Semidefinite programming (SDP) relaxations are among the most powerful methods in this family, and are surprisingly well-suited for a broad range of problems where data take the form of matrices or graphs. It has been observed several times that, when the ‘statistical noise’ is small enough, SDP relaxations correctly detect the underlying combinatorial structures. I will present a few asymptotically exact predictions for the ‘detection thresholds’ of SDP relaxations, with applications to synchronization and community detection.
Joint work with Adel Javanmard, Federico Ricci-Tersenghi, and Subhabrata Sen.

#### A framework for variational inference in Gaussian process models

Abstract. Gaussian process models are widely used in statistics and machine learning. There are three key challenges to inference that might be tackled using variational methods: inference over the latent function values when the likelihood is non-Gaussian; scaling the computation to large datasets; and inference over the kernel parameters. I'll show how the variational framework can be used to tackle any or all of these challenges. In particular, I'll share recent insights which allow us to distinguish the variational stochastic process approximation, improving on the idea of a low-rank approximation to the posterior. To do this we show that it's possible to minimize the KL divergence between the true and approximate stochastic processes.
Joint work with Alexander G. de G. Matthews, Nicolo Fusi, Maurizio Filippone, Neil D. Lawrence, and Zoubin Ghahramani.

#### Variational inference for large-scale and streaming sequential data

Abstract. Variational inference algorithms have proven successful for Bayesian analysis in large data settings, with recent advances using stochastic variational inference (SVI). However, such methods have largely been studied in independent data settings. We develop an SVI algorithm to learn the parameters of hidden Markov models (HMMs). The challenge in applying stochastic optimization in this setting arises from dependencies in the chain, which must be broken to consider minibatches of observations. We propose an algorithm that harnesses the memory decay of the chain to adaptively bound errors arising from edge effects. We demonstrate the effectiveness of our algorithm on synthetic experiments and a large genomics dataset where a batch algorithm is computationally infeasible. We will also briefly discuss the streaming data scenario.
Joint work with Nick Foti, Jason Xu, Dillon Laird, and Alex Tank.

#### Approximate inference for Ising models with random couplings

Abstract. Assume that we try to compute the expectations of variables in an Ising model with pairwise interactions. Interpreting this model as a latent Gaussian variable model, an EP-style algorithm would seem to be a possible solution. However, the necessary matrix inversions would make this approach numerically infeasible when the model is large. Things simplify when it is known that the couplings in the model are drawn at random from an invariant random matrix distribution. In this case, the fixed points of EP are solutions to so-called TAP equations studied in statistical physics, for which some costly matrix terms are ‘averaged out’. But how should one solve such equations? In this talk I will propose an algorithmic approach for this task and analyse its properties for large systems using dynamical functional methods of statistical physics. I will finally present results for different random matrix ensembles.
Joint work with Burak Cakmak and Ole Winther.