Talks and presentations

See a map of all the places I've given a talk!

Bayesian and empirical Bayesian forests.

July 06, 2015

Talk, The 32nd International Conference on Machine Learning (ICML), Lille, France

We derive ensembles of decision trees through a nonparametric Bayesian model, allowing us to view random forests as samples from a posterior distribution. This insight provides large gains in interpretability and motivates a class of Bayesian forest (BF) algorithms that yield small but reliable performance gains. Using the BF framework, we show that high-level tree hierarchy is stable in large samples. This leads to an empirical Bayesian forest (EBF) algorithm for building approximate BFs on massive distributed datasets, and we show that EBFs outperform subsampling-based alternatives by a large margin.
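To make the "forest as posterior samples" idea concrete, here is a minimal sketch, not the implementation from the talk. It assumes each tree is fit to the full data with independent Exponential(1) observation weights (a Bayesian bootstrap) in place of the usual resampling, and it uses one-split regression stumps on a single feature to stay short. The names `fit_weighted_stump`, `bayesian_forest`, and `predict`, and the toy data, are hypothetical.

```python
import random


def fit_weighted_stump(xs, ys, weights):
    """Fit a one-split regression tree on 1-D inputs by weighted squared error."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue  # no threshold fits between identical values
        left, right = order[:k], order[k:]
        wl = sum(weights[i] for i in left)
        wr = sum(weights[i] for i in right)
        ml = sum(weights[i] * ys[i] for i in left) / wl
        mr = sum(weights[i] * ys[i] for i in right) / wr
        loss = (sum(weights[i] * (ys[i] - ml) ** 2 for i in left)
                + sum(weights[i] * (ys[i] - mr) ** 2 for i in right))
        if best is None or loss < best[0]:
            best = (loss, (lo + hi) / 2, ml, mr)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def bayesian_forest(xs, ys, n_trees=50, seed=0):
    """Each tree sees all observations, reweighted by fresh Exponential(1) draws."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Assumption: exponential weights stand in for bootstrap resampling.
        weights = [rng.expovariate(1.0) for _ in xs]
        forest.append(fit_weighted_stump(xs, ys, weights))
    return forest


def predict(forest, x):
    """Average the per-tree predictions, i.e. a posterior mean over trees."""
    preds = [lm if x <= t else rm for t, lm, rm in forest]
    return sum(preds) / len(preds)


# Toy data with a step between x = 0.9 and x = 1.3 (made up for illustration).
xs = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7, 2.0, 2.4]
ys = [0.0, 0.1, 0.0, 0.2, 1.0, 0.9, 1.1, 1.0]
forest = bayesian_forest(xs, ys)
thresholds = [t for t, _, _ in forest]
```

Inspecting `thresholds` across trees gives a rough view of how stable the top split is under reweighting, which is the kind of stability the EBF idea relies on when it fixes the upper part of the tree.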

Latent Dirichlet Allocation-based Diversified Retrieval for E-commerce Search.

February 24, 2014

Talk, The 7th ACM International Conference on Web Search and Data Mining (WSDM), New York City, USA

Diversified retrieval is an important problem on many e-commerce sites, e.g., eBay and Amazon. Using IR approaches without optimizing for diversity results in a clutter of redundant items that belong to the same products. Most existing product taxonomies are too noisy, with overlapping structures and non-uniform granularity, to be used directly in diversified retrieval. To address this problem, we propose a Latent Dirichlet Allocation (LDA)-based diversified retrieval approach that selects diverse items based on hidden user intents.
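As a rough illustration of selecting items by hidden intent, the sketch below uses a generic greedy coverage heuristic, not the method presented in the talk. It assumes each item already carries a relevance score and a per-intent probability vector (for example, topic proportions from a fitted LDA model). The function name `diversify`, the item identifiers, and all numbers are hypothetical.

```python
def diversify(items, k):
    """Greedily pick k items, rewarding coverage of intents not yet served.

    items: list of (item_id, relevance, topic_probs), where topic_probs holds
    per-intent probabilities assumed to come from a fitted LDA model.
    """
    n_topics = len(items[0][2])
    uncovered = [1.0] * n_topics  # share of each intent still unserved
    remaining = list(items)
    chosen = []
    while remaining and len(chosen) < k:
        def gain(item):
            _, relevance, probs = item
            return relevance * sum(u * p for u, p in zip(uncovered, probs))

        best = max(remaining, key=gain)
        remaining.remove(best)
        chosen.append(best[0])
        # Discount each intent by how strongly the chosen item serves it.
        uncovered = [u * (1.0 - p) for u, p in zip(uncovered, best[2])]
    return chosen


# Two near-duplicate listings and one item serving a different intent.
items = [
    ("phone_a", 0.90, [0.9, 0.1]),
    ("phone_b", 0.85, [0.9, 0.1]),
    ("case_a", 0.60, [0.1, 0.9]),
]
top2 = diversify(items, 2)  # → ["phone_a", "case_a"]
```

Ranking by relevance alone would return both phone listings. The coverage discount promotes the item that serves the second intent instead.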