CS 584 - Big Data Analytics

Course Description

The course covers scalable machine learning and data mining algorithms for large/complex data. Topics include large-scale optimization techniques, hashing, recommendation systems, and tensor factorization. This will be structured as a seminar course with emphasis on public data sets such as Kaggle competitions, MovieLens, and various healthcare datasets. There will be introductory lectures that set the context and provide reviews of relevant material.

Course prerequisites: Graduate Data Mining (CS 570) and familiarity with Python, Matlab, or R.

Instructor: Joyce Ho
Office Hours: Tu/Th 1-4 pm @ MSC W414

Course Schedule

Topic	Subtopic	Readings	Presenter	Slides
Introduction			Instructor	Intro Convex Opt Part I Convex Opt Part II Data Mining Review Deep Learning Intro
Large-Scale Learning Techniques	Stochastic Gradient Descent	Large Scale Online Learning [Bottou & LeCun, 2003] Large-Scale Machine Learning with Stochastic Gradient Descent [Bottou, 2010] Stochastic Gradient Descent Tricks [Bottou, 2012] Hogwild! [Niu et al., 2011]	Instructor	SGD Hogwild!
	Alternating direction method of multipliers	Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers [Boyd et al., 2011] Augmented Lagrangian and Alternating Direction Methods for Convex Optimization: A Tutorial and Some Illustrative Computational Results [Eckstein, 2012] A distributed algorithm for fitting generalized additive models [Chu et al., 2013]	Instructor	ADMM GAM+ADMM
	Sampling	Big Data Bootstrap [Kleiner et al., 2012] Bayesian Methods for Machine Learning [Neal, 20014] An Introduction to MCMC for Machine Learning [Andrieu et al., 2003] Scalable and Robust Bayesian Inference via the Median Posterior [Minsker et al., 2014]	Instructor	Sampling+Performance BDB Bayesian Methods Overview M-Posterior
Nearest Neighbor Search	KD-tree	An introductory tutorial on kd-trees [Moore, 1991] An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions [Arya et al., 1998]	Instructor	NN+kd-tree
	Locality-sensitive hashing	Chapter 3 of Mining of Massive Datasets [Leskovec et al, 2014] Locality-Sensitive Hashing for Finding Nearest Neighbors [Slaney & Casey, 2008] Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions [Andoni & Indyk, 2008] Hashing for Similarity Search: A Survey [Wang et al., 2014] Streaming similarity search over one billion tweets using parallel locality-sensitive hashing [Sunderam et al., 2013] Coding for Random Projections [Li et al., 2014]	Instructor + students	LSH Part I LSH Part II Coding for Random Projections
	Sketches	Chapter 4 of Mining of Massive Datasets [Leskovec et al, 2014] Sketch Techniques for Approximate Query Processing [Cormode, 2011] Conditional Random Sampling: A Sketch-based Sampling Technique for Sparse Data [Li et al., 2006] One Sketch For All: Theory and Application of Conditional Random Sampling [Li et al, 2008] FLAG: Fast Large-Scale Graph Construction for NLP [Goyal et al., 2012]	Instructor + students	Sketching Overview
Matrix Factorization	Distributed matrix factorization	(Introduction) Matrix Factorization Techniques for Recommender Systems [Koren et al., 2009] Scalable Collaborative Filtering Approaches for Large Recommender Systems [Takacs et al., 2009] Large-Scale Matrix Factorization with Distributed Stochastic Gradient Descent [Gemulla et al., 2011] A Fast Parallel SGD for Matrix Factorization in Shared Memory Systems [Zhuang et al., 2013] Divide-and-Conquer Matrix Factorization [MacKey et al., 2011] Parallel Matrix Factorization for Recommender Systems [Yu et al., 2014]	Students	MF Overview
Tensor Factorization	Large scale tensor factorization	(Introduction) Tensor Decompositions and Applications [Kolda & Bader, 2009] PARCUBE: Sparse Parallelizable CANDECOMP-PARAFAC Tensor Decomposition [Papalexakis, 2015] FlexiFaCT: Scalable Flexible Factorization of Coupled Tensors on Hadoop [Beutel et al., 2014] Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data [Hu et al., 2015]	Students	Tensor Overview FlexiFaCT
Transfer Learning	Multitask learning	A Survey on Transfer Learning [Pan & Yang, 2010] Integrating Low-Rank and Group-Sparse Structures for Robust Multi-Task Learning [Chen et al., 2011] A Regularization Approach to Learning Task Relationships in Multitask Learning [Zhang & Yeung, 2014] Scalable Hierarchical Multitask Learning Algorithms for Conversion Optimization in Display Advertising [Ahmed et al., 2014]	Students	Low-Rank + Group-Sparse Multitask