CSE 291: Topics in Unsupervised Learning

Time
W 5-7.50 in Center 201

Instructor:
Sanjoy Dasgupta
Office hours Wed 3-5 in AP&M 4848


Lecture schedule

[1] September 29: Large deviation theory
Finite sample bounds: Markov, Chebyshev, Chernoff-Hoeffding
The central limit theorem; Berry-Esseen theorem

Readings:
Kearns and Vazirani, An Introduction to Computational Learning Theory, appendix on large deviations.
Devroye, Gyorfi and Lugosi, A Probabilistic Theory of Pattern Recognition, pp. 121--124, 133--137.

[2] October 6: Johnson-Lindenstrauss theorem

Readings:
Dasgupta and Gupta, An elementary proof of a theorem of Johnson and Lindenstrauss.
Ball, An elementary introduction to modern convex geometry, pp. 1--8.

[3] October 13: CLT and Gaussianity
Random sections of product distributions
Projection pursuit
Other effects of random projection

Readings:
Huber, Projection pursuit, Annals of Statistics, 13(2), June 1985, pp. 435--475.
Diaconis and Freedman, Asymptotics of graphical projection pursuit. Annals of Statistics, 12(3), Sept. 1984, pp. 793--815.
Dasgupta, Experiments with random projection.

[4] October 20: Analyzing EM
A probabilistic analysis of EM
Weak Gaussianity

Readings:
Dasgupta and Schulman, A two-round variant of EM for Gaussian mixtures.

[5] October 27: Data mining in information spaces
Clustering with KL divergence
Exponential families

Readings:
Brown, Fundamentals of statistical exponential families, pp 1--20.
Pereira, Tishby and Lee, Distributional clustering of English words.
Tishby, Pereira and Bialek, The information bottleneck method.
Slonim, Somerville, Tishby and Lahav, Objective classification of galaxy spectra.

[6] November 3: Information geometry
Analytic and convexity properties of exponential families
Bregman distances

Readings:
Azoury and Warmuth, Relative loss bounds for on-line density estimation with the exponential family.

[7] November 10: Information geometry, cont'd
Extending k-means and EM to exponential families
Extending PCA to exponential families
Geometry of Bregman spaces

Readings:
Banerjee, Merugu, Dhillon and Ghosh, Clustering with Bregman divergences.
Collins, Dasgupta and Schapire, A generalization of PCA to the exponential family.
Lafferty, Della Pietra and Della Pietra, Statistical learning algorithms based on Bregman divergences.

[8] November 17: Data mining in metric spaces
Wrap-up of information geometry
Two algorithms for the k-center problem
Uses and abuses of approximation algorithms
Set cover

*** No class on November 24 ***

[9] December 1: Clustering, cont'd
Asymmetric k-center
The k-median problem: a local search algorithm

Readings:
Panigrahy and Vishwanathan, An O(log* n) approximation algorithm for the asymmetric k-center problem. Journal of Algorithms, 27(2), 1998.
Arya, Garg, Khandekar, Munagala, Meyerson and Pandit, Local search heuristic for k-median.

[10] December 8: Project presentations in AP&M 4882