Overview of the course
Machine Learning 101 deals primarily with supervised learning problems. Machine Learning 102 covers unsupervised learning and fault detection.
Both 101 and 102 begin at the level of elementary probability and statistics and from that background survey a broad array of machine learning techniques. The classes will give participants a working knowledge of these techniques and will leave them prepared to apply those techniques to real problems. To get the most out of the class, participants will need to work through the homework assignments.
Prerequisites
This class assumes a moderate level of computer programming proficiency. We will use R (the open source statistics language) for the homework and for the examples in class. We will cover some of the basics of R and do not assume any prior knowledge of it. You can find references on using R on this website, and we will hand out sample code during classes to help you get started.
You'll need some general beginner-level background in probability, calculus, linear algebra and vector calculus. We will cover most of what is required during the lectures. The appendices in the back of the Tan text are more than sufficient for this class.
Machine Learning 101 and 102 can be taken in any order. The prerequisites for the two classes are the same. The second five-week session (Machine Learning 102) will culminate in the students giving presentations on papers they have read.
Why use R?
We're going to use R as our lingua franca for looking at homework problems, discussing them and comparing different solution approaches. Load R from http://cran.r-project.org/ onto your laptop or desktop computer before you come to the first class. We will include some descriptive material on using R in the first two lectures in order to get everyone up to speed on it. This site also links to instructions for integrating R with Eclipse, to references for R, and to comments on those references.
Please note that anyone can read this web site, but only the instructors have permission to write on it. We welcome new members to the class, but we are not granting permissions to edit this site.
General Sequence of Classes:
Machine Learning 101: Supervised learning
Machine Learning 102: Unsupervised Learning and Fault Detection
Text: "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach and Vipin Kumar
Machine Learning 201: Advanced Regression Techniques, Generalized Linear Models, and Generalized Additive Models
Machine Learning 202: Collaborative Filtering, Bayesian Belief Networks, and Advanced Trees
Text: "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Future Topics
Data Mining Social Networks
Text Mining
Recommender Methods
Big Data
Outline for Fall 2010
Machine Learning 101
1st Week
22 Sep: Chapters 1 & 2 (notes: Notes for First Week)
23 Sep: Chapter 3
2nd Week
29 Sep: Chapter 4 (notes: Notes for 2nd Week); HW #1 due
30 Sep: Chapter 4
3rd Week
6 Oct: Simple Regression (notes: Notes for Week 03); HW #2 due
7 Oct: Ridge Regression & k Nearest Neighbors
4th Week
13 Oct: Chapter 5, finish k Nearest Neighbors, Naive Bayes (notes: Week04); HW #3 due
14 Oct: Chapter 5, Support Vector Machines
5th Week
20 Oct: Chapter 5, finish SVM, start Ensemble Methods (notes: Week05); HW #4 due
21 Oct: Chapter 5, finish Ensemble Methods

Machine Learning 102
6th Week
27 Oct: Chapter 5, Class Imbalance (notes: Week06); HW #5 due
28 Oct: Chapter 6
7th Week
3 Nov: Chapter 8 (notes: Week07); HW #6 due
4 Nov: Chapter 8, Cluster Analysis
8th Week
10 Nov: Papers (notes: Week08); work hard on your presentations
11 Nov: Papers, Group 1 and Group 2
13 Nov: Data Mining Camp, Saturday (see Instructions)
9th Week
17 Nov: Chapter 9 (notes: Week09); HW #7 on Chapter 8 due
18 Nov: Chapter 9
10th Week
1 Dec: Chapter 10 (notes: Week10); HW #8 on Chapter 9 due
2 Dec: Chapter 10



Lectures are in the Lectures Folder
Homeworks are in the Homework Folder
There are more Machine Learning References on Patricia's web site http://patriciahoffmanphd.com/