The topics are divided into three units (modules), as outlined below. The pace is roughly one or two topics per week, with one graded item (a lab notebook or exam) per week.
Dates | Notes | Topic | Lab/exam |
---|---|---|---|
Aug 21 | | 0+1 | 0 DUE Sun Aug 27 |
Aug 28 | Watch videos | 2 | 1 DUE Sun Sep 3 |
Sep 4 | Holiday (9/4) | 3 | 2 DUE Sun Sep 10 |
Sep 11 | | 4 | 3 (not graded) |
Sep 18 | | 5+6 | 4 DUE Sun Sep 24 |
Sep 25 | | 7 | 5 DUE Sun Oct 1; 6 is optional |
Oct 2 | | 8 | 7 DUE Sun Oct 8 |
Oct 9 | Fall break (10/9) | 9 | Midterm 1 on Wed Oct 11 (Topics 1-6) |
Oct 16 | | 9 | 8 DUE Sun Oct 22 |
Oct 23 | Drop deadline (10/28) | 10 | 9 DUE Sun Oct 29 |
Oct 30 | | 11 | 10 DUE Nov 5 |
Nov 6 | | 12 | 11 DUE Nov 12 |
Nov 13 | | 13 | Midterm 2 on Wed Nov 15 (Topics 7-11) |
Nov 20 | | 14 | 12 DUE Nov 26 |
Nov 27 | | 15 | 13+14 DUE Dec 3 |
Dec 4 | | 16 | 15 DUE Dec 5 |
Dec 12 | Finals week | | Final exam on Tu Dec 12 (all topics) |
The final exam will be held on Tuesday, December 12 from 6-8:50 pm in Scheller 200.
Module 0: Fundamentals.
- Topic 0: Overview + intro to Jupyter
- Topic 1: Python bootcamp review
- Topic 2: Pairwise association mining
  - default dictionaries, asymptotic running time
- Topic 3: Mathematical preliminaries
  - probability, calculus, linear algebra
- Topic 4: Representing numbers
  - floating-point arithmetic, numerical analysis
Module 1: Representing, transforming, and visualizing data.
- Topic 5: Preprocessing unstructured data
  - Strings and regular expressions
- Topic 6: Mining the web
  - (Notebook only) HTML processing, web APIs
- Topic 7: Tidying data
  - Pandas, merge/join, tibbles and bits, melting and casting
- Topic 8: Visualizing data and results
  - Seaborn, Bokeh
- Topic 9: Relational data (SQL)
Module 2: The analysis of data.
- Topic 10: Intro to numerical computing
  - NumPy / SciPy
- Topic 11: Ranking relational objects
  - Graphs as (sparse) matrices, PageRank
- Topic 12: Linear regression
  - Direct (e.g., QR) and online (e.g., LMS) methods
- Topic 13: Classification
  - Logistic regression, numerical optimization
- Topic 14: Clustering
  - The k-means algorithm
- Topic 15: Compression
  - Principal components analysis (PCA), singular value decomposition (SVD)
- Topic 16: Putting it all together
  - (Notebook only) Eigenfaces