Midterm 1 Practice Problems Release Notes
Practice problems will be released along with Notebook 5.
This collection of notebooks contains both exams from prior semesters and additional problems which we think will help you hone your skills with the material covered by Midterm 1. The problem numbering suggests that there are 27 problems; however, we have curated the collection to remove some of the more outdated or redundant ones. There are 12 notebooks, listed below in order of how useful we think they will be in your preparation. The grading details and time limits for the exam problems have changed over many iterations of the course. Where available, we give the basic grading criteria.
Tier 1 and 2 problems were given in recent semesters and follow the same format you will see on your Midterm 1 exam: a 3-4 hour time limit, independent exercises, and random test cases.
Tier 1
Past exam problem. Same format as your exam: length, random test cases, independent exercises, and test case variables.
Problem 26
- 7 exercises; 14 available points; 11 points required for 100%; Time limit 4 hours
- Topic: USDA food labels - In this notebook you will explore ingesting data from a raw JSON object, reformatting to a more useful form, extracting statistics, and performing some simple analytics.
- Key skills: Navigating nested data, string processing, implementing mathematical functions.
- Note - This problem includes the test case variables feature, which you saw in Notebook 1. It is worth taking the time to get familiar with this feature, as it will be included on your exam!
Tier 2
Past exam problems. Same format as your exam in terms of length, random test cases, and independent exercises, but without test case variables.
Problem 25
- 8 exercises; 16 available points; 14 points required for 100%; Time limit 3 hours
- Topic: Chess Ratings - In this notebook you will process a .pgn file (a popular format for storing chess game data) to compute the changes to players' Elo ratings from match to match and from the beginning to the end of a chess tournament.
- Key skills: String processing, building nested data structures, implementing mathematical functions, functional programming.
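As a warm-up for the "implementing mathematical functions" skill, here is a minimal sketch of the standard Elo update for a single game. This is not taken from the notebook: the function names, the 400-point scale, and the K-factor of 32 are assumptions for illustration, and the notebook may define the update differently.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (standard Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a, rating_b, score_a, k=32):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k=32 is an assumed K-factor, not a value from the notebook.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating is preserved.
    return rating_a + delta, rating_b - delta


new_a, new_b = update_elo(1500, 1500, 1.0)
print(new_a, new_b)
```

With equal starting ratings the expected score is 0.5, so a win moves each player by half of K in opposite directions.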
Problem 23
- 7 exercises; 14 available points; 12 points required for 100%; Time limit 3 hours
- Topic: How partisan is the US Congress? - In this notebook you will work with JSON-formatted data about Congressional votes, performing some filtering and transformations. The ultimate goals are to calculate a metric that gauges how differently the two major parties vote and to measure how similar a pair of members are in terms of their voting records.
- Key skills: Navigating nested data, control flow logic, implementing mathematical functions.
Note - Problems 21 and 22 were given together as part of the same exam. There was a total time limit of 3 hours, and a total of 16 points was required for 100%.
Problem 22
- 6 exercises; 10 available points
- Topic: Ingredient substitution - In this notebook you will work with recipe data to expand on pairwise association mining to determine the most suitable substitutions for an ingredient in a recipe by calculating similarity metrics.
- Key skills: Control flow logic, implementing mathematical functions, functional programming.
Problem 21
- 5 exercises; 10 available points
- Topic: Caption contest - In this notebook you will work with JSON data to extract and clean text captions and implement an algorithm to rank them.
- Key skills: String processing, implementing mathematical functions.
Tier 3
Past exam problems with an older format than what you will see on your exam. Students were given a longer time window for exams, but had to solve multiple problems for a single exam. Additionally, the exercises were often interdependent, making each problem more of an "all-or-nothing" proposition.
Also included are problems which were not given as part of an exam. The value in these problems is in practicing your problem-solving skills rather than in being a dry run for your exam.
Problem 20
- Topic: Document clustering
- Key skills: String processing, implementing mathematical functions, working with nested data.
Problem 16
- Topic: Debug-a-thon
- Key skills: Debugging.
- Note - This one is a little "different". Here you are given 5 exercises with solutions which are almost correct but fail for some reason. Your task is to find and correct the bug (or just rewrite from scratch).
Problem 15
- Topic: Hidden test demo
- Note - This notebook is a demonstration of the "hidden test" feature. It will be used in your exam. The hidden tests on your exam contain the same logic as the exposed tests, so if you pass the exposed test you can expect to pass the hidden test.
Problem 14
- Topic: Scraping data from "FiveThirtyEight"
- Key skills: String processing, implementing mathematical functions.
Problem 9
- Topic: Maximum likelihood and floating-point
- Key skills: Implementing mathematical functions.
Problem 7
- Topic: Hamlet Sentence Generator
- Key skills: String processing, implementing mathematical functions.
Problem 2
- Topic: DNA Sequence Analysis
- Key skills: String processing.