Who are you?
We are your instructors for the Fall 2017 semester: Professor Richard (Rich) Vuduc and graduate teaching assistant Mikhail Isaev.
The best way to reach us is to use this course’s online discussion forum, available through the course’s Canvas site. You can also visit us during our respective office hours; see below. (We will set these during the first week of class.)
Office hours.
- Vuduc: Wednesdays, 1-2 pm, in Klaus 1334
- Isaev: Thursdays, 11 am to noon, in Klaus 1335 (or if that is occupied, the nearby alcove)
What will I learn?
You will build, “from scratch,” the basic components of a data analysis pipeline: collection, preprocessing, storage, analysis, and visualization. You will see many examples of high-level data analysis questions, concepts and techniques for formalizing those questions into mathematical or computational tasks, and methods for translating those tasks into code. Beyond programming languages and best practices, you’ll learn elementary data processing algorithms, notions of program correctness and efficiency, and numerical methods for linear algebra and mathematical optimization.
How will I do all that?
(Labs and exams.) Your grade will be based on a combination of “lab notebooks” (programming homework) and three exams.
- Lab notebooks: 51% (except the first, all notebooks are weighted equally)
- Midterm 1: 10%
- Midterm 2: 15%
- Final exam: 24%
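To see how these weights combine, here is a minimal Python sketch. It assumes you already have a single overall percentage for the lab notebooks, since the exact weight of the first notebook is not stated here. The names `WEIGHTS`, `course_score`, and the example scores are hypothetical and for illustration only; this is not the official grade calculation.

```python
# Weights taken from the grading breakdown above (they sum to 100%).
WEIGHTS = {
    "labs": 0.51,
    "midterm1": 0.10,
    "midterm2": 0.15,
    "final": 0.24,
}


def course_score(scores):
    """Return the weighted course score on a 0-100 scale.

    `scores` maps each component name in WEIGHTS to a percentage (0-100).
    Assumption: "labs" is one combined lab-notebook percentage.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores, not real data.
example = {"labs": 90.0, "midterm1": 80.0, "midterm2": 70.0, "final": 85.0}
print(round(course_score(example), 2))  # prints 84.8
```

Because the labs carry just over half of the total weight, a change of 10 points in your lab average moves the overall score by about 5 points, more than the same change on any single exam.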
What should I know already?
(Prerequisites.) You should have at least an undergraduate-level understanding in the following topics:
- Programming proficiency in Python or a similar language
- Basic calculus
- Probability and statistics
- Linear algebra
What does “programming proficiency” mean? For context, this course aims to fill in gaps in your programming background so you can complete other programming-intensive courses in the MS Analytics program, most notably, CSE 6242. If you have a significant programming background already, you will probably get less out of this course than you might have otherwise.
The formal prerequisite for this course, as it appears in OSCAR, is: Undergraduate Semester level CS 1371 Minimum Grade of D.
In more human terms, you should be familiar with basic programming ideas at the level of the Python Bootcamp, which most on-campus MS Analytics students would have taken.
Will I need any school supplies?
The main pieces of equipment you will need are a pen or pencil, paper, an internet-enabled device, and your brain!
There is no required textbook; however, the following may be a handy resource.
- William McKinney. Python for Data Analysis: Data wrangling with Pandas, NumPy, and IPython. O’Reilly Media, October 2012. ISBN-13: 978-1449319793.
Note: The edition above is the first, which is pretty old at this point (2012). A new edition is scheduled to come out in early Fall 2017, which is why we did not require it for the course.
Accommodations for individuals with disabilities.
If you have learning needs that require special accommodation, please contact the Office of Disability Services at (404) 894-2563 or http://disabilityservices.gatech.edu/, as soon as possible, to make an appointment to discuss your special needs and to obtain an accommodations letter. Please also e-mail me as soon as possible in order to set up a time to discuss your learning needs.
Any advice?
Of course!
First, the basic philosophy of this course is that you learn the material best by a combination of reading, thinking, and most importantly, actively doing. Therefore, you should make an effort to complete all assignments, including any “optional” parts.
Second, don’t cheat! All course participants (you, the teaching assistants, and me) are expected and required to abide by the letter and the spirit of the Georgia Tech Honor Code. In particular, always keep the following in mind:
- Ethical behavior is extremely important in all facets of life. Honest and ethical behavior is expected at all times.
- You are responsible for completing your own work.
- All incidents of suspected dishonesty or violations of the Georgia Tech Honor Code will be reported to and handled by the Dean of Students office. Penalties for violating the collaboration policy can be severe; alleged violations are adjudicated by the Dean of Students office and not by the instructor.
The kinds of allowed collaboration will be spelled out with the assignments.