Fall 2021 Colloquia
Thank you for your interest! At this time there are no upcoming colloquia for Fall 2021. Our Colloquia series will return in Spring 2022!
Title: Approximate Laplace Approximation
Abstract: Bayesian model selection requires an integration exercise in order to assign posterior model probabilities to each candidate model. The computation becomes cumbersome when the integral has no closed form, particularly when the sample size is large, or the number of models is large. We present a simple yet powerful idea based on the Laplace approximation (LA) to an integral. LA uses a quadratic Taylor expansion at the mode of the integrand and is typically quite accurate, but requires cumbersome likelihood evaluations (for large n) and optimization (for large p). We propose the approximate Laplace approximation (ALA), which uses a Taylor expansion at the null parameter value. ALA brings very significant speed-ups by avoiding optimizations altogether, and evaluating likelihoods via sufficient statistics. ALA is an approximate inference method equipped with strong model selection properties in the family of non-linear GLMs, attaining comparable rates to exact computation. When (inevitably) the model is misspecified the ALA rates can actually be faster than for exact computation, depending on the type of misspecification. We show examples in non-linear Gaussian regression with non-local priors, for which no closed-form integral exists, as well as non-linear logistic, Poisson and survival regression.
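As a rough illustration of the contrast drawn in the abstract above, the following Python sketch compares LA and ALA on a one-dimensional integral of exp(h(theta)). The function names, the toy Poisson data, and the standard normal prior are illustrative assumptions and not the speaker's implementation. The only point is that LA first runs an optimization to find the mode, while ALA expands at a fixed value (here 0) and needs no optimization.

```python
import math

def laplace(h, dh, d2h, theta0=0.0, steps=50):
    """LA: Newton iterations to the mode, then integrate the quadratic expansion there."""
    theta = theta0
    for _ in range(steps):
        theta -= dh(theta) / d2h(theta)
    return math.exp(h(theta)) * math.sqrt(2.0 * math.pi / -d2h(theta))

def approximate_laplace(h, dh, d2h, theta0=0.0):
    """ALA: quadratic expansion at the fixed point theta0, with no optimization."""
    g, H = dh(theta0), d2h(theta0)  # gradient and (negative) curvature at theta0
    return math.exp(h(theta0) - g * g / (2.0 * H)) * math.sqrt(2.0 * math.pi / -H)

# Toy data (assumed): Poisson regression with log link and one coefficient.
x = [0.5, -1.0, 1.5, 0.3]
y = [1, 0, 3, 1]

def h(t):
    # Log-likelihood plus log N(0, 1) prior, up to additive constants.
    return sum(yi * xi * t - math.exp(xi * t) for xi, yi in zip(x, y)) - 0.5 * t * t

def dh(t):
    return sum(yi * xi - xi * math.exp(xi * t) for xi, yi in zip(x, y)) - t

def d2h(t):
    return -sum(xi * xi * math.exp(xi * t) for xi in x) - 1.0

print("LA :", laplace(h, dh, d2h))
print("ALA:", approximate_laplace(h, dh, d2h))
```

When h is exactly quadratic the two approximations coincide with the true integral. They differ otherwise, and the abstract's claims concern how that difference affects model selection rates.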
Title: Search Algorithms and Loss Functions for Bayesian Clustering
Abstract: We propose a randomized greedy search algorithm to find a point estimate for a random partition based on a loss function and posterior Monte Carlo samples. Given the large size and awkward discrete nature of the search space, the minimization of the posterior expected loss is challenging. Our approach is a stochastic search based on a series of greedy optimizations performed in a random order and is embarrassingly parallel. We consider several loss functions, including Binder loss and variation of information. We note that criticisms of Binder loss are the result of using equal penalties of misclassification and we show an efficient means to compute Binder loss with potentially unequal penalties. Furthermore, we extend the original variation of information to allow for unequal penalties and show no increased computational costs. We provide a reference implementation of our algorithm. Using a variety of examples, we show that our method produces clustering estimates that better minimize the expected loss and are obtained faster than existing methods.
David B. Dahl; Devin J. Johnson; Peter Mueller
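To make the idea of minimizing a posterior expected loss concrete, here is a minimal Python sketch of the expected Binder loss with unequal penalties and a single greedy sweep in random order. It is a simplified illustration, not the authors' reference implementation. The function names, the penalty names `a` and `b`, and the toy samples are assumptions made for this example.

```python
import random
from itertools import combinations

def similarity(samples):
    """Pairwise co-clustering frequencies estimated from posterior partition samples."""
    n = len(samples[0])
    psm = [[0.0] * n for _ in range(n)]
    for s in samples:
        for i in range(n):
            for j in range(n):
                if s[i] == s[j]:
                    psm[i][j] += 1.0 / len(samples)
    return psm

def expected_binder(est, psm, a=1.0, b=1.0):
    """Expected Binder loss: `a` penalizes splitting a pair, `b` penalizes joining one."""
    loss = 0.0
    for i, j in combinations(range(len(est)), 2):
        if est[i] == est[j]:
            loss += b * (1.0 - psm[i][j])  # joined, weighted by P(pair is apart)
        else:
            loss += a * psm[i][j]          # split, weighted by P(pair is together)
    return loss

def greedy_sweep(est, psm, a=1.0, b=1.0, rng=random):
    """Visit items in random order; move each to the label that minimizes the loss."""
    est = list(est)
    order = list(range(len(est)))
    rng.shuffle(order)
    for i in order:
        candidates = sorted(set(est)) + [max(est) + 1]  # existing labels plus a new one
        est[i] = min(
            candidates,
            key=lambda lab: expected_binder(est[:i] + [lab] + est[i + 1:], psm, a, b),
        )
    return est

samples = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
psm = similarity(samples)
start = [0, 0, 0, 0]
estimate = greedy_sweep(start, psm, rng=random.Random(0))
print(estimate, expected_binder(estimate, psm))
```

Because each item's current label is always among the candidates, a sweep never increases the expected loss. Independent sweeps from different random orders and starting partitions can be run in parallel and the best result kept.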
Title: Sparse Models for Sparse Networks
Abstract: Networks are ubiquitous in modern society and science. Stylized features of a typical network include network sparsity, degree heterogeneity and homophily among many others. This talk introduces a framework with a class of sparse models that utilize parameters to explicitly account for these network features. In particular, the degree heterogeneity is characterized by node-specific parametrization while homophily is captured by the use of covariates. To avoid over-parametrization, one of the key assumptions in our framework is to differentially assign node-specific parameters. We start by discussing the sparse \beta model when no covariates are present, and proceed to discuss a generalized model to include covariates. Interestingly, for the former we can use \ell_0 penalization to identify and estimate the heterogeneity parameters, while for the latter we resort to penalized logistic regression with an \ell_1 penalty, thus immediately connecting our methodology to the lasso literature. Along the way, we demonstrate the fallacy of what we call data-selective inference, a common practice in the literature to discard less well-connected nodes in order to fit a model, which can be of independent interest.
Title: Gaussian Graphical Regression Models with High Dimensional Responses and Covariates
Abstract: Though Gaussian graphical models have been widely used in many scientific fields, relatively limited progress has been made to link graph structures to external covariates. We propose a Gaussian graphical regression model, which regresses both the mean and the precision matrix of a Gaussian graphical model on covariates. In the context of co-expression quantitative trait locus (QTL) studies, our method can determine how genetic variants and clinical conditions modulate the subject-level network structures, and recover both the population-level and subject-level gene networks. Our framework encourages sparsity of covariate effects on both the mean and the precision matrix. In particular for the precision matrix, we stipulate simultaneous sparsity, i.e., group sparsity and element-wise sparsity, on effective covariates and their effects on network edges, respectively. We establish variable selection consistency first under the case with known mean parameters and then a more challenging case with unknown means depending on external covariates, and establish in both cases the l2 convergence rates of the estimated precision parameters. The utility and efficacy of our proposed method are demonstrated through simulation studies and an application to a co-expression QTL study with brain cancer patients.
Title: Two Approaches to Unmeasured Spatial Confounding
Abstract: Spatial confounding has different interpretations in the spatial and causal inference literatures. I will begin this talk by clarifying these two interpretations. Then, seeing spatial confounding through the causal inference lens, I discuss two approaches to account for unmeasured variables that are spatially structured when we are interested in estimating causal effects. The first approach is based on the propensity score. We introduce the distance adjusted propensity scores (DAPS) that combine spatial distance and propensity score difference of treated and control units in a single quantity. Treated units are then matched to control units if their corresponding DAPS is low. We can show that this approach is consistent, and we propose a way to choose how much matching weight should be given to unmeasured spatial variables. In the second approach, we aim to bridge the spatial and causal inference literatures by estimating causal effects in the presence of unmeasured spatial variables using outcome modeling tools that are popular in spatial statistics. Motivated by the bias term of commonly-used estimators in spatial statistics, we propose an affine estimator that addresses this deficiency. I will discuss that estimation of causal parameters in the presence of unmeasured spatial confounding can only be achieved under an untestable set of assumptions. We provide one such set of assumptions which describe how the exposure and outcome of interest relate to the unmeasured variables.
Title: Least Squares and Maximum Likelihood Estimation of Sufficient Reductions in Regressions with Matrix Valued Predictors
Abstract: We propose methods to estimate sufficient reductions of matrix-valued predictors for regression or classification. We assume that the first moment of the predictor matrix given the response can be decomposed into a row and column component via a Kronecker product structure. We obtain least squares and maximum likelihood estimates of the sufficient reductions of the matrix predictors, derive statistical properties of the resulting estimates and present fast computational algorithms with assured convergence. The performance of the proposed approaches in regression and classification is compared in simulations. We illustrate the methods on two examples, using longitudinally measured serum biomarker and neuroimaging data.
Title: Challenges with Differential Abundance Analysis of Microbiome Data
Abstract: Increasingly, researchers are interested in understanding changes in microbial compositions in two or more study groups or experimental conditions. Two common statistical parameters of interest are differential abundance of taxa and differential relative abundance of taxa, in a unit volume of an ecosystem. A variety of statistical methods are introduced in the literature under various assumptions. These methods are often evaluated using simulation studies. Unfortunately, the simulation studies are not always precise about the null hypothesis and thus result in misleading conclusions. One of the major challenges with these data is normalization of the data. This issue is not limited to microbiome data but also exists for other count data. In this talk we shall describe some recent developments in this field and illustrate our methodology using unpublished HIV/AIDS microbiome data.
Title: Inference and Privacy (and Bird Migration)
Abstract: Differential privacy is a dominant standard for privacy-preserving data analysis, and requires that an algorithm is randomized “just enough” to mask the contribution of any one individual in the input data set when viewing the algorithm’s output. This leads to new inference problems where we wish to reason about the truth given measurements with excess noise added for privacy. I will discuss recent work from my research group on inference and differential privacy, including probabilistic inference for more accurately answering database queries under privacy constraints, private Bayesian inference, and private confidence interval construction. I will briefly mention how underlying technical ideas were motivated by, and helped advance, inference about bird migration patterns from community science data.
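For readers unfamiliar with how an algorithm is randomized "just enough", the sketch below shows the standard Laplace mechanism for a counting query, one common way to obtain epsilon-differential privacy. It is a textbook illustration and not the speaker's method. The function name `private_count` and the toy data are assumptions made for this example.

```python
import random

def private_count(data, predicate, epsilon, rng=random):
    """Release a count with Laplace noise of scale 1/epsilon.

    Adding or removing one individual changes a count by at most 1
    (sensitivity 1), so Laplace(0, 1/epsilon) noise masks that change.
    """
    true_count = sum(1 for row in data if predicate(row))
    # The difference of two independent Exponential(rate=epsilon) draws
    # is Laplace distributed with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

observations = [1, 0, 1, 1, 0, 1]  # toy data: 1 marks a record matching the query
release = private_count(observations, lambda r: r == 1, epsilon=1.0, rng=random.Random(0))
print(release)
```

The released value is the truth plus noise, which is why downstream inference has to account for the excess noise, as the abstract describes.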
Bio: Daniel Sheldon is an associate professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst, where, until 2020, he held a joint appointment with the Department of Computer Science at Mount Holyoke College. Previously, he was a postdoctoral fellow at Oregon State University. He received a Ph.D. in computer science from Cornell University in 2009 and an AB in mathematics from Dartmouth College in 1999. His research investigates fundamental problems in probabilistic machine learning and applied algorithms, often motivated by applications, including bird migration, endangered species conservation, data privacy, and epidemiology. His work is regularly funded by the NSF. He received an NSF CAREER award, and two best paper awards for computational sustainability at AAAI.
Title: Recent Approaches for Efficient Analysis of Multiple and Multivariate Time Series
Abstract: We present two modeling approaches for efficient analysis of multiple and multivariate time series. The first approach is based on a time-varying model representation in the partial autocorrelation domain, or TV-VPARCOR, that allows us to analyze multivariate non-stationary time series in a computationally efficient manner. A key aspect of the TV-VPARCOR representation is that it is of lower dimension than time-varying vector autoregressive representations that are commonly used in practice, while still providing significant modeling flexibility. We also discuss hierarchical extensions that allow us to analyze multiple, rather than multivariate, non-stationary time series. The second approach deals with modeling multivariate time series in the spectral domain and proposes fast and scalable inference via stochastic gradient variational Bayes. The PARCOR and spectral domain approaches are illustrated in extensive simulation studies and in several applied settings, including the analysis of multi-channel neuroimaging data.
Title: Non-Asymptotic Aspects of Sampling From Heavy-Tailed Distributions via Transformed Langevin Monte Carlo
Abstract: Langevin Monte Carlo (LMC) algorithms and their stochastic versions are widely used for sampling and large-scale Bayesian inference. Non-asymptotic properties of the LMC algorithm have been examined intensely over the last decade. However, existing analyses are restricted to the case of light-tailed (yet multi-modal) densities. In this talk, I will first present a variable transformation based approach for sampling from heavy-tailed densities using the LMC algorithm. This algorithm is motivated by a related approach for the Metropolis random walk algorithm by Johnson and Geyer (2013). I will next present a non-asymptotic oracle complexity analysis of the proposed algorithm with illustrative examples. It will be shown that the proposed approach 'works' as long as the heavy-tailed target density satisfies certain tail conditions closely related to the so-called weak-Poincaré inequality.
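As background for the abstract above, this is a minimal Python sketch of the basic (unadjusted) LMC update, x_{k+1} = x_k + h * grad log pi(x_k) + sqrt(2h) * xi_k, applied to a light-tailed standard normal target. The variable transformation for heavy-tailed targets discussed in the talk is not reproduced here. The function name and the step size are assumptions chosen for illustration.

```python
import math
import random

def langevin_monte_carlo(grad_log_density, x0, step, n_steps, rng=random):
    """Unadjusted Langevin iterations: gradient drift plus Gaussian noise."""
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * grad_log_density(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples

# Standard normal target: log pi(x) = -x^2 / 2 + const, so the gradient is -x.
chain = langevin_monte_carlo(lambda x: -x, x0=0.0, step=0.1, n_steps=5000,
                             rng=random.Random(0))
print(sum(chain) / len(chain))
```

Without a Metropolis correction the chain targets a slightly biased distribution whose error depends on the step size, which is one reason non-asymptotic analyses of LMC are of interest.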
Title: Stick-Breaking Non-Parametric Priors via Dependent Length Variables
Abstract: In this talk, we present new classes of Bayesian nonparametric prior distributions. By allowing length random variables, in stick-breaking constructions, to be exchangeable or Markovian, appealing models for discrete random probability measures appear. Tuning the stochastic dependence in such length variables allows one to recover extreme families of random probability measures, i.e., Dirichlet and Geometric processes. As a byproduct, the ordering of the weights, in the species sampling representation, can be controlled and thus tuned for efficient MCMC implementations in density estimation or unsupervised classification problems. Various theoretical properties and illustrations will be presented.
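The stick-breaking construction mentioned above can be sketched in a few lines of Python. The example contrasts the two extremes named in the abstract under the usual Beta(1, theta) choice for the length variables: independent lengths (Dirichlet process weights) and a single shared length (Geometric process weights). The function name, the truncation at 20 sticks, and the value of theta are assumptions for illustration only.

```python
import random

def stick_breaking(lengths):
    """Turn length variables v_1, v_2, ... into weights w_k = v_k * prod_{j<k} (1 - v_j)."""
    weights, remaining = [], 1.0
    for v in lengths:
        weights.append(v * remaining)  # break off a fraction v of what is left
        remaining *= 1.0 - v
    return weights

rng = random.Random(1)
theta = 2.0

# Independent lengths: truncated Dirichlet process weights.
independent = stick_breaking([rng.betavariate(1.0, theta) for _ in range(20)])

# One shared length: Geometric process weights v * (1 - v)^(k - 1), which are decreasing.
v = rng.betavariate(1.0, theta)
identical = stick_breaking([v] * 20)

print(independent[:3], identical[:3])
```

Exchangeable or Markovian length variables sit between these two cases, which is the dependence the abstract proposes to tune.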
Title: Orthogonal Subsampling for Big Data Linear Regression
Abstract: The dramatic growth of big datasets presents a new challenge to data storage and analysis. Data reduction, or subsampling, that extracts useful information from datasets is a crucial step in big data analysis. We propose an orthogonal subsampling (OSS) approach for big data with a focus on linear regression models. The approach is inspired by the fact that an orthogonal array of two levels provides the best experimental design for linear regression models in the sense that it minimizes the average variance of the estimated parameters and provides the best predictions. The merits of OSS are three-fold: (i) it is easy to implement and fast; (ii) it is suitable for distributed parallel computing and ensures the subsamples selected in different batches have no common data points; and (iii) it outperforms existing methods in minimizing the mean squared errors of the estimated parameters and maximizing the efficiencies of the selected subsamples. Theoretical results and extensive numerical results show that the OSS approach is superior to existing subsampling approaches. It is also more robust to the presence of interactions among covariates and, when they do exist, OSS provides more precise estimates of the interaction effects than existing methods. The advantages of OSS are also illustrated through analysis of real data.