DAVID DUNSON - Duke University

“Sparse Nonparametric Bayesian Learning from Big Data”
When: 20 October 2011, 4:00 PM to 5:00 PM
Where: 111 Tyson Bldg.

In modern applications, data sets tend to be big and highly structured, and large p, small n problems are commonly encountered. In such settings, sparse representations of the data are crucial, and there is a rich frequentist literature focused on inducing sparsity through penalization (typically L1). Motivated by genetic epidemiology and imaging applications, we instead develop nonparametric Bayesian methods that avoid parametric assumptions while favoring low-dimensional representations of complex high-dimensional data. In this talk, the particular focus is on Bayesian probabilistic tensor factorizations, which generalize low-rank matrix factorizations, such as the SVD, to higher orders. The framework accommodates general joint modeling of object data of different types (images, text, categorical, real, etc.), but for simplicity we focus on two applications: (1) high-dimensional multivariate categorical data analysis (contingency tables); (2) estimation of lower-dimensional manifolds from point cloud data. In the contingency table case, we propose a collapsed Tucker factorization and develop associated methods for testing associations and interactions in huge sparse tables. In the manifold learning case, we propose a tensor product of basis functions for estimating 3D closed surfaces. In both settings, theoretical results are provided on large support and asymptotic properties, and efficient computational methods are developed that scale to large data sets.
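
As a rough illustration of how a tensor factorization generalizes a low-rank matrix factorization, the sketch below first recovers a rank-2 matrix from its truncated SVD, then builds the joint probability table of three categorical variables from a small core tensor and one factor matrix per variable (a generic Tucker form). It is not the collapsed Tucker factorization or the Bayesian model from the talk; the function names, dimensions, and random inputs are all assumptions chosen for illustration.

```python
import numpy as np


def truncated_svd(matrix, rank):
    """Best rank-`rank` approximation of a matrix via the SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]


def tucker_reconstruct(core, factors):
    """Rebuild a 3-way tensor from a core tensor and three factor matrices.

    core has shape (r1, r2, r3); factors[j] has shape (d_j, r_j).
    """
    a, b, c = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, a, b, c)


rng = np.random.default_rng(0)

# Matrix case: a 6 x 5 matrix of rank 2 is recovered by a rank-2 SVD.
m = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5))
svd_error = np.max(np.abs(m - truncated_svd(m, rank=2)))

# Tensor case: a 4 x 3 x 5 contingency-table probability tensor.
# The 2 x 2 x 2 core holds weights that sum to one, and each column of a
# factor matrix is a probability vector over that variable's levels
# (hypothetical values drawn at random).
core = rng.dirichlet(np.ones(8)).reshape(2, 2, 2)
factors = [rng.dirichlet(np.ones(d), size=2).T for d in (4, 3, 5)]
probs = tucker_reconstruct(core, factors)

print(svd_error)
print(probs.shape, probs.sum())
```

In this sketch the full table has 4 x 3 x 5 = 60 cells but is described by an 8-entry core and three thin factor matrices, which is the kind of low-dimensional representation the abstract refers to.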

Joint work with Anirban Bhattacharya and Debdeep Pati.
