Practice problems will be released along with Notebook 11.

This collection of notebooks contains exams from prior semesters that we think will help you hone your skills with the material covered by Midterm 2. The problem numbering suggests that there are 23 problems, but we have curated the collection to remove some of the more outdated or redundant ones. The 11 notebooks that remain are listed below in order of how useful we think they will be in your preparation. The grading details and time limits for the exam problems have changed over many iterations of the course. Where available, we give the basic grading criteria.

Tier 1

Past exam problems. Same format as the upcoming exam: similar length, random test cases, independent exercises, and test case variables. Tier 1 problems will be very similar in format to the upcoming exam.

Problem 23

  • 9 exercises; 17 available points; 12 points required for 100% (lowered from 14); Time limit 4 hours
  • Topic: Actor network analysis. In this notebook you will explore a dataset of film credits and create/analyze a relationship network of actors starring in the films.
  • Key skills: Pandas, native Python data structures, and incorporating new tools given appropriate documentation.
  • Note: This notebook includes the test case variables feature which you saw in Notebook 1. It is worth taking the time to get familiar with this feature, as it will be included on your exam!
  • A link to the solution is in the notebook's introduction.

Problem 22

  • 7 exercises; 13 available points; 9 points required for 100%; Time limit 4 hours
  • Topic: Campaign finance geography. In this notebook you will calculate how similar ZIP codes in the United States are to one another based on their residents' donations to political candidates in the 2020 election cycle.
  • Key skills: Pandas, SQLite, sparse matrices
  • Note: This notebook includes an early version of the test case variables feature. It is the same basic concept as what you will see on your exam, but the details are slightly different.
  • A link to the solution is in the notebook's introduction.

Tier 2

Past exam problems. Same format as the upcoming exam: similar length, random test cases, and independent exercises. Tier 2 problems will be similar to the upcoming exam, but they do not have the test case variables feature implemented.

Problem 21

  • 8 exercises; 13 available points; 11 points required for 100%; Time limit 3.5 hours
  • Topic: Taxi Data. In this notebook you will work with NYC taxi data to extract useful information on fares, trip durations, and shortest paths between zones.
  • Key skills: Pandas and Numpy

Problem 20

  • 8 exercises; 18 available points; 15 points required for 100%; Time limit 3 hours
  • Topic: Human migration. In this notebook you will combine data from several sources to predict future migration patterns in the United States.
  • Key skills: Pandas, Numpy, SQLite, and basic Python

Note: Problems 18 and 19 were given as a single exam. Together, the two problems had 17 available points, 12 points required for 100%, and a 4-hour time limit.

Problem 19

  • 4 exercises; 6 points available
  • Topic: Shortest paths. In this notebook you will work with geographic data to determine the shortest route between two points in California.
  • Key skills: SQLite, Pandas, Numpy

Problem 18

  • 5 exercises; 11 points available
  • Topic: Covid-19. In this notebook you will perform analysis to determine whether the connectivity of the US air transportation network has an effect on the spread of Covid-19 (as of Spring 2020).
  • Key skills: Pandas, SQL, PageRank

Tier 3

Past exam problems. These differ from the upcoming exam in format and length, and their exercises may be interdependent. Tier 3 problems test the same content and problem-solving techniques required to succeed on the upcoming exam, but in a somewhat different format.

Problem 15

  • Topic: Forward pass of a neural network.
  • Key skills: Numpy

Problem 13

  • Topic: Soccer Guru
  • Key skills: SQLite

Problem 9

  • Topic: SQL operations
  • Key skills: SQL or Pandas (either tool for working with tabular data can be used)

Problem 7

  • Topic: Tensor computations
  • Key skills: Pandas, Numpy

Problem 2

  • Topic: Conway's "Game of Life"
  • Key skills: Numpy
Updated: 2022-12-16