|8:30 - 8:35
|8:35 - 9:00
|Rajesh Ranganath: Challenges to variational inference: Optimization, automation, and accuracy Slides
|9:00 - 9:15
|Theophane Weber: Reinforced Variational Inference (contributed) Slides
|9:15 - 10:00
|Panel: Tricks of the Trade
Matt Hoffman, Danilo Rezende, David Duvenaud, Alp Kucukelbir, Stephan Mandt, Michael Betancourt
Moderator: Tamara Broderick
|10:30 - 10:55
|Andrea Montanari: Approximate inference with semidefinite relaxations Slides
|10:55 - 11:10
|Rajesh Ranganath: Hierarchical Variational Models (contributed) Slides
|11:10 - 11:30
|11:30 - 1:00
|2:40 - 3:05
|James Hensman: A framework for variational inference in Gaussian process models Slides
|3:05 - 3:20
|Daniel Hernandez-Lobato: Stochastic Expectation Propagation for Large Scale Gaussian Process Classification (contributed) Slides
|3:20 - 3:35
|Cedric Archambeau: Incremental Variational Inference for Latent Dirichlet Allocation (contributed) Slides
|3:35 - 4:00
|Emily Fox: Variational inference for large-scale and streaming sequential data Slides
|4:30 - 4:55
|Manfred Opper: Approximate inference for Ising models with random couplings Slides
|4:55 - 6:00
|Panel: On the Foundations and Future of Approximate Inference
Max Welling, Yee Whye Teh, Andrew Gelman, Steve MacEachern, Ulrich Paquet
Moderator: David Blei
Variational inference is experiencing a resurgence. In
recent years, researchers have expanded the scope of
variational inference to more complex Bayesian models,
reduced its computational cost, and developed new
theoretical insights. However, several challenges remain. In
this talk I will discuss some of our recent work on these
challenges: addressing local optima with tempering,
automating variational inference in Stan, and constructing
richer variational approximations with variational models.
Along the way, I'll discuss some research questions related
to these areas.
Joint work with Dave Blei, Alp Kucukelbir, Stephan Mandt, James McInerney, and Dustin Tran.
Statistical inference problems arising within signal
processing, data mining, and machine learning naturally give
rise to hard combinatorial optimization problems. These
problems become intractable when the dimensionality of the
data is large, as is often the case for modern datasets. A
popular idea is to construct convex relaxations of these
combinatorial problems, which can be solved efficiently for
large scale datasets.
Semidefinite programming (SDP) relaxations are among the
most powerful methods in this family, and are surprisingly
well-suited for a broad range of problems where data take
the form of matrices or graphs. It has been observed several
times that, when the ‘statistical noise’ is small enough,
SDP relaxations correctly detect the underlying
I will present a few asymptotically exact predictions for
the ‘detection thresholds’ of SDP relaxations, with
applications to synchronization and community detection.
Joint work with Adel Javanmard, Federico Ricci-Tersenghi, and Subhabrata Sen.
Gaussian process models are widely used in statistics and
machine learning. There are three key challenges to
inference that might be tackled using variational methods:
inference over the latent function values when the
likelihood is non-Gaussian; scaling the computation to large
datasets; and inference over the kernel parameters. I'll show
how the variational framework can be used to tackle any or
all of these challenges. In particular, I'll share recent
insights which allow us to distinguish the variational
stochastic process approximation, improving on the idea of a
low-rank approximation to the posterior. To do this we show
that it's possible to minimize the KL divergence between the
true and approximate stochastic processes.
Joint work with Alexander G. de G. Matthews, Nicolo Fusi, Maurizio Filippone, Neil D. Lawrence, and Zoubin Ghahramani.
Variational inference algorithms have proven successful for
Bayesian analysis in large data settings, with recent
advances using stochastic variational inference (SVI).
However, such methods have largely been studied in
independent data settings. We develop an SVI algorithm to
learn the parameters of hidden Markov models (HMMs). The
challenge in applying stochastic optimization in this
setting arises from dependencies in the chain, which must be
broken to consider minibatches of observations. We propose
an algorithm that harnesses the memory decay of the chain to
adaptively bound errors arising from edge effects. We
demonstrate the effectiveness of our algorithm on synthetic
experiments and a large genomics dataset where a batch
algorithm is computationally infeasible. We will also
briefly discuss the streaming data scenario.
Joint work with Nick Foti, Jason Xu, Dillon Laird, and Alex Tank.
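The memory-decay idea in the abstract above can be illustrated with a small sketch. This is not the authors' algorithm (which bounds the error adaptively); it only pads a minibatch subchain with a fixed buffer on each side and measures how far the resulting state marginals are from the full-chain ones. The two-state model, the uniform initial distribution used for every subchain, and all function names are assumptions made for illustration.

```python
import random


def forward_backward(pi, A, B, obs):
    """Posterior state marginals p(z_t | obs) for a discrete HMM."""
    K, T = len(pi), len(obs)

    def normalize(v):
        s = sum(v)
        return [x / s for x in v]

    # Forward pass, normalized at every step to avoid underflow.
    alpha = [normalize([pi[k] * B[k][obs[0]] for k in range(K)])]
    for t in range(1, T):
        prev = alpha[-1]
        alpha.append(normalize([
            B[k][obs[t]] * sum(prev[j] * A[j][k] for j in range(K))
            for k in range(K)
        ]))

    # Backward pass, also normalized.
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = normalize([
            sum(A[k][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(K))
            for k in range(K)
        ])

    return [normalize([a * b for a, b in zip(alpha[t], beta[t])])
            for t in range(T)]


def buffered_marginals(pi, A, B, obs, start, end, buf):
    """Marginals for obs[start:end] from a subchain padded by `buf` steps."""
    lo, hi = max(0, start - buf), min(len(obs), end + buf)
    gamma = forward_backward(pi, A, B, obs[lo:hi])
    return gamma[start - lo:end - lo]


def sample_hmm(pi, A, B, T, rng):
    """Draw an observation sequence of length T from the HMM."""
    def draw(p):
        return rng.choices(range(len(p)), weights=p)[0]

    z = draw(pi)
    obs = [draw(B[z])]
    for _ in range(T - 1):
        z = draw(A[z])
        obs.append(draw(B[z]))
    return obs


def max_edge_error(pi, A, B, obs, start, end, buf):
    """Largest absolute gap between buffered and full-chain marginals."""
    full = forward_backward(pi, A, B, obs)[start:end]
    approx = buffered_marginals(pi, A, B, obs, start, end, buf)
    return max(abs(f - a)
               for fr, ar in zip(full, approx)
               for f, a in zip(fr, ar))


# Hypothetical two-state model, chosen only for illustration.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]   # transition matrix
B = [[0.8, 0.2], [0.3, 0.7]]   # emission matrix
obs = sample_hmm(pi, A, B, 400, random.Random(0))
for buf in (0, 5, 20, 50):
    print(buf, max_edge_error(pi, A, B, obs, 150, 250, buf))
```

With no buffer, the marginals at the edges of the block are distorted because the subchain ignores everything outside it. As the buffer grows, the distortion shrinks, which is the effect that makes minibatches of a dependent chain usable at all.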
Assume that we try to compute the expectations of variables
in an Ising model with pairwise interactions. Interpreting
this model as a latent Gaussian variable model, an EP style
algorithm would seem to be a possible solution. However, the
necessary matrix inversions would make this approach
numerically infeasible when the model is large. Things
simplify when it is known that the couplings in the model
are drawn at random from an invariant random matrix
distribution. In this case, the fixed points of EP are
solutions to so-called TAP equations studied in statistical
physics for which some costly matrix terms are ‘averaged
out’. But how should one solve such equations? In this
talk I will propose an algorithmic approach for this task
and analyse its properties for large systems using dynamical
functional methods of statistical physics. I will finally
present results for different random matrix ensembles.
Joint work with Burak Cakmak and Ole Winther.
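For readers unfamiliar with TAP equations, the sketch below shows their classical form for couplings with independent Gaussian entries, m_i = tanh(beta * (h_i + sum_j J_ij m_j) - beta^2 * m_i * sum_j J_ij^2 (1 - m_j^2)), solved by a plain damped fixed-point iteration. Both choices are assumptions made for illustration: the abstract concerns general invariant random matrix ensembles, and the naive iteration here is not the algorithm proposed in the talk and is not guaranteed to converge. The function name and parameters are hypothetical.

```python
import math
import random


def tap_iterate(J, h, beta, n_iter=200, damping=0.5):
    """Damped fixed-point iteration of the classical TAP equations.

    J: symmetric coupling matrix (list of lists) with zero diagonal.
    h: external fields, one per spin.
    Returns approximate magnetizations m_i, the estimates of E[s_i].
    """
    n = len(h)
    m = [0.0] * n
    for _ in range(n_iter):
        new_m = []
        for i in range(n):
            # Mean-field term: external field plus field from the other spins.
            field = h[i] + sum(J[i][j] * m[j] for j in range(n))
            # Onsager reaction term that corrects naive mean field.
            onsager = beta * m[i] * sum(
                J[i][j] ** 2 * (1.0 - m[j] ** 2) for j in range(n)
            )
            new_m.append(math.tanh(beta * (field - onsager)))
        # Damping: move only part of the way toward the new values.
        m = [damping * old + (1.0 - damping) * new
             for old, new in zip(m, new_m)]
    return m


# Usage on a small random instance (assumed setup: Gaussian couplings
# with variance 1/n, weak random fields, high temperature).
rng = random.Random(1)
n = 20
J = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        J[i][j] = J[j][i] = rng.gauss(0.0, 1.0 / math.sqrt(n))
h = [rng.gauss(0.0, 0.3) for _ in range(n)]
print(tap_iterate(J, h, beta=0.5))
```

Each sweep costs O(n^2) and needs no matrix inversion, which is the practical appeal of the averaged equations. Whether and how fast such iterations converge is exactly the kind of question the talk addresses.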