This problem is about tensors, which generalize matrices from linear algebra. It tests whether you can read an abstract description of a computational problem and translate it into an efficient implementation.
In particular, you need to implement a computational operation of the form,
$$ A(i, r) = \sum_{j=0}^{N_1-1} \sum_{k=0}^{N_2-1} \sum_{l=0}^{N_3-1} T(i, j, k, l) \cdot B(j, r) \cdot C(k, r) \cdot D(l, r). $$

Observe that it involves not only matrices ($A$, $B$, $C$, and $D$), but a tensor, $T$, which in this case has 4 "modes" (i.e., indices or axes) rather than just 2 as in the case of matrices. While that may, at first glance, seem innocuous enough, here is the catch: we will give you one implementation, and your goal is to create another one that is faster. You will get some points for correctness on a small dataset, and additional points for correctness and speed on a larger dataset.
We believe most of you will find this problem to be hard. So plan accordingly. The problem has two parts, or "exercises."
In this problem, we are providing a random subset of blogs taken from the Blog Authorship Corpus, which in turn were taken from blogger.com in 2004.
Each blog consists of several posts. Each post consists of words. For each post, we went ahead and created "word co-occurrence pairs" based on whatever words appear close to one another within the post. Lastly, the corpus has several pieces of metadata, namely, the author's self-reported gender, age, job sector, and Western zodiac sign.
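To make the idea of co-occurrence pairs concrete, here is a minimal sketch of how such pairs might be generated, assuming a hypothetical fixed-size sliding window. The corpus's actual definition of "close" is not specified in this problem, so treat this only as an illustration.

def cooccurrence_pairs(words, window=2):
    """Sketch: pair each word with the next `window` words that follow it."""
    pairs = set()
    for n, w in enumerate(words):
        for w_next in words[n+1 : n+1+window]:
            pairs.add((w, w_next))
    return pairs

# Example: with window=2, "the" pairs with "cat" and "sat", and so on.
print(cooccurrence_pairs(['the', 'cat', 'sat', 'down']))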
Let's use Pandas to read in these data, which are stored in comma-separated value (CSV) files. To do that, run the following code cell.
from IPython.display import display
import pandas as pd

def get_timer(): # Used in some timing experiments
    from time import perf_counter
    return perf_counter()

def read_blogs(tag, verbose=True):
    def msg(f):
        if verbose: print("Reading {}...".format(f))
    from os.path import isdir, isfile
    ROOT = './resource/asnlib/publicdata/blogs/' if isdir('.voc') else './publicdata/blogs/'
    BLOGGERS_FILE = '{}bloggers{}.csv'.format(ROOT, tag)
    POSTS_FILE = '{}posts{}.csv'.format(ROOT, tag)
    OCCUR_FILE = '{}occur{}.csv'.format(ROOT, tag)
    TENSOR_COORDS_DF_SOLN_FILE = '{}tensor_coords_df_soln{}.csv'.format(ROOT, tag)
    t_read__ = get_timer()
    msg(BLOGGERS_FILE)
    bloggers_df = pd.read_csv(BLOGGERS_FILE)
    msg(POSTS_FILE)
    posts_df = pd.read_csv(POSTS_FILE)
    msg(OCCUR_FILE)
    occur_df = pd.read_csv(OCCUR_FILE)
    if isfile(TENSOR_COORDS_DF_SOLN_FILE):
        msg(TENSOR_COORDS_DF_SOLN_FILE)
        soln_df = pd.read_csv(TENSOR_COORDS_DF_SOLN_FILE)
    else:
        soln_df = None
    t_read__ = get_timer() - t_read__
    if verbose:
        print("==> Total loading time: ~ {:.1f} seconds.".format(t_read__))
    return bloggers_df, posts_df, occur_df, soln_df

# Read in a small sample dataset:
bloggers_df, posts_df, occur_df, soln_df = read_blogs('-25x2')
The cell produces three notable dataframes: bloggers_df, posts_df, and occur_df. Let's quickly look at each one in turn.

Bloggers (bloggers_df). The first dataframe consists of bloggers:
print("There are {} bloggers.".format(len(bloggers_df)))
display(bloggers_df.head())
Each blogger has a unique integer ID number, blog_id, which starts at zero. The age is another integer; the gender, job sector, and zodiac sign are strings.
For example, suppose the following were a row in the dataframe:
blog_id | age | gender | sector | zodiac
---|---|---|---|---
11 | 16 | male | Student | Taurus
Then that would indicate that blogger #11 is 16 years of age, male, a student, and is a Taurus.
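If you'd like to confirm those column types for yourself, one quick (optional) check:

print(bloggers_df.dtypes)  # expect integer dtypes for blog_id and age; object (string) dtypes for the rest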
Posts (posts_df). The second dataframe is a list of posts:
print("There are {} posts.".format(len(posts_df)))
display(posts_df.head())
Each post has a unique integer ID, post_id, numbered starting at zero. The author is given by blog_id, his or her integer ID. For instance, suppose this row were in the table:
post_id | blog_id
---|---
2291 | 94
Then that would indicate that blogger #94 wrote a post whose ID number is #2291.
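As a quick (optional) exploration, a groupby shows how the posts are distributed across bloggers:

posts_per_blogger = posts_df.groupby('blog_id')['post_id'].count()
print("Posts per blogger: min = {}, max = {}".format(posts_per_blogger.min(), posts_per_blogger.max()))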
Co-occurrences (occur_df). This third dataframe is the list of co-occurrences:
print("There are {} co-occurrence triplets.".format(len(occur_df)))
occur_df.head()
The column word_i is an integer ID of a word; the column word_j is an integer ID of a second word; and the column post_id is the integer ID of a post. Each row means that the two corresponding words appeared "near" one another within a given post. For example, suppose the following row were part of the table:
word_i | word_j | post_id
---|---|---
37 | 1738 | 598
Then that would mean word #37 and word #1738 appeared "close together" in post #598.
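To get a feel for the size of these data, here is a quick (optional) sketch that counts the distinct IDs in play. It assumes word IDs are numbered densely from zero, which is how the cells further below treat them.

num_words = max(occur_df['word_i'].max(), occur_df['word_j'].max()) + 1
num_posts_referenced = occur_df['post_id'].nunique()
print("Word IDs span 0 to {}; {} distinct posts appear.".format(num_words - 1, num_posts_referenced))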
Exercise 0 (2 points). Suppose we assign each Western zodiac sign the following integer IDs (run the next cell to see the IDs):
zodiac_signs = ['Aquarius', 'Pisces', 'Aries', 'Taurus', 'Gemini',
'Cancer', 'Leo', 'Virgo', 'Libra', 'Scorpio', 'Sagittarius', 'Capricorn']
zodiac_ids = {s: zodiac_signs.index(s) for s in zodiac_signs}
zodiac_ids
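As an aside, the same mapping can be built with enumerate, which avoids the repeated (quadratic) list.index() lookups; at twelve signs, of course, the difference is negligible:

zodiac_ids_alt = {s: i for i, s in enumerate(zodiac_signs)}
assert zodiac_ids_alt == zodiac_ids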
Write a function,

def add_zodiac_ids(occur_df, posts_df, bloggers_df):

that takes the three dataframes as input and returns a copy of occur_df with one new column, zodiac_id, containing the integer ID of the zodiac sign for the blogger who authored the post.
For example, suppose occur_df has the following row:
word_i | word_j | post_id
---|---|---
126 | 2618 | 165
If you look up post #165 in posts_df, you'll see the following entry:
post_id | blog_id
---|---
165 | 11
And blogger #11 has these attributes, according to bloggers_df:
blog_id | age | gender | sector | zodiac
---|---|---|---|---
11 | 16 | male | Student | Taurus
Since a Taurus has an ID of 3, your output should contain a row of the form,
word_i | word_j | post_id | zodiac_id
---|---|---|---
126 | 2618 | 165 | 3
import numpy as np

def add_zodiac_ids(occur_df, posts_df, bloggers_df):
    ### BEGIN SOLUTION
    # Alternative approach, left for reference:
    #posts_zodiac_df = posts_df.merge(bloggers_df[['blog_id', 'zodiac']], how='left', on='blog_id')
    #posts_zodiac_df['zodiac_id'] = posts_zodiac_df['zodiac'].apply(lambda z: zodiac_ids[z])
    #occur_zodiac_df = occur_df.merge(posts_zodiac_df[['post_id', 'zodiac_id']], how='left', on='post_id')
    #return occur_zodiac_df
    tempdf = occur_df.merge(posts_df, on='post_id')         # attach blog_id to each co-occurrence
    tempdf = tempdf.merge(bloggers_df, on='blog_id')        # attach the blogger's attributes
    tempdf['zodiac_id'] = tempdf['zodiac'].map(zodiac_ids)  # translate sign -> integer ID
    return tempdf.drop(['blog_id', 'age', 'gender', 'sector', 'zodiac'], axis=1)
    ### END SOLUTION
# Preview your result:
occur_zodiac_df = add_zodiac_ids(occur_df, posts_df, bloggers_df)
assert type(occur_zodiac_df) is pd.DataFrame, "Does your function return a Pandas DataFrame?"
print("First few rows:")
display(occur_zodiac_df.head())
print("\nRow from the example:")
occur_zodiac_df[(occur_zodiac_df['word_i'] == 126)
& (occur_zodiac_df['word_j'] == 2618)
& (occur_zodiac_df['post_id'] == 165)]
# Test cell: `occur_zodiac_df_test`

def canonicalize_tibble(X):
    var_names = sorted(X.columns)
    Y = X[var_names].copy()
    Y.sort_values(by=var_names, inplace=True)
    Y.reset_index(drop=True, inplace=True)
    return Y

def tibbles_are_equivalent(A, B):
    A_hat = canonicalize_tibble(A)
    B_hat = canonicalize_tibble(B)
    equal = (A_hat == B_hat)
    return equal.all().all()

print("Checking equivalence...")
assert tibbles_are_equivalent(occur_zodiac_df,
                              soln_df), "Your result does not match the sample solution."
print("\n(Passed!)")
Matrices are suitable for looking at 2-way relations (between rows and columns), as in the pairwise association mining problem (Topic 2) and PageRank (Topic 11).
However, a dataset may often be naturally decomposed into more than two features. For example, consider this recent blog post by Chris Moody, a data analyst at the e-commerce clothing site, Stitch Fix.
Moody considers the problem of mining user comments using, as an example, a three-way data representation: two to represent a word pair and the third to represent the document (i.e., comment) in which a word pair co-occurs. That is, imagine a 3-D table $T$, where each entry, $T(i, j, k)$, somehow measures the association of word $i$, word $j$, and document $k$. The left-hand side of the following image illustrates this concept.
The acronym "PMI" in this figure stands for pointwise mutual information, but for the purpose of this problem, you don't need to know anything about this concept.
Definition: tensors and tensor modes. We call this type of multiway table a tensor. In particular, a $d$-way tensor is one with $d$ "axes" or modes.
You've already seen tensors where $d \leq 2$: a scalar is a 0-way tensor; a vector is a 1-way tensor; and a matrix is a 2-way tensor.
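If it helps to see that terminology in code, NumPy uses the same idea: an array's number of axes (its .ndim) is the tensor's number of modes. Here is a small illustration, not needed for the problem itself:

import numpy as np
s = np.float64(3.14)     # 0-way tensor (scalar): 0 axes
v = np.zeros(5)          # 1-way tensor (vector): shape (5,)
M = np.zeros((5, 4))     # 2-way tensor (matrix): shape (5, 4)
T3 = np.zeros((5, 4, 3)) # 3-way tensor: shape (5, 4, 3)
print(s.ndim, v.ndim, M.ndim, T3.ndim)  # prints: 0 1 2 3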
In Exercise 0 above, you created occur_zodiac_df. Each row could be interpreted as the coordinates of a 4-way tensor, and indeed, we will use it as such below.
Tensor decompositions. Moody's computational analysis task is to "factor" a $d$-way tensor into a kind of multiplication among $d$ matrices, as illustrated in the right-hand side of the figure above. This factoring is also known as a tensor decomposition. You don't need to know anything specific about tensor decompositions, only that they motivated this exam problem.
The computational bottleneck: an "MTTKRP." The slowest part of many tensor decompositions is an operation known by the (regrettable) acronym, MTTKRP, which stands for "matricized tensor times Khatri-Rao product." (Try saying that five times quickly in succession!) There are several variations, but here is the one you need in this problem.
Suppose $d=4$. Next, let $N_0$, $N_1$, $N_2$, $N_3$, and $R$ be positive integers. Also let $T$ be an $N_0 \times N_1 \times N_2 \times N_3$ tensor, and let $A$, $B$, $C$, and $D$ be matrices of size $N_0 \times R$, $N_1 \times R$, $N_2 \times R$, and $N_3 \times R$, respectively.
The mode-0 MTTKRP computes $A$ from $T$, $B$, $C$, and $D$ as follows. For all $0 \leq i < N_0$ and $0 \leq r < R$,
$$ A(i, r) = \sum_{j=0}^{N_1-1} \sum_{k=0}^{N_2-1} \sum_{l=0}^{N_3-1} T(i, j, k, l) \cdot B(j, r) \cdot C(k, r) \cdot D(l, r). $$

We will also refer to $T$ as the input tensor; $B$, $C$, and $D$ as the input factors; and $A$ as the output factor.
Assumptions. In data analysis problems, $T$ is typically very sparse, meaning it is mostly zero. By contrast, the factor matrices $A$, $B$, $C$, and $D$ are typically dense matrices. Also, the factors are usually "tall-and-skinny," meaning $R$ is small compared to $N_0$, $N_1$, $N_2$, and $N_3$.
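To see just how sparse $T$ is here, the following (optional) sketch estimates its density, i.e., the fraction of entries that are nonzero. It reuses the coordinate dataframe from Exercise 0, so run it after that exercise:

nnz = len(occur_zodiac_df)  # one stored nonzero per coordinate row
n_words = max(occur_zodiac_df['word_i'].max(), occur_zodiac_df['word_j'].max()) + 1
n_posts = occur_zodiac_df['post_id'].max() + 1
n_zodiac = occur_zodiac_df['zodiac_id'].max() + 1
density = nnz / (n_words * n_words * n_posts * n_zodiac)
print("Density of T: {:.3e} (fraction of entries that are nonzero)".format(density))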
Based on those assumptions, we have provided you with a baseline implementation written in "pure" Python.
The next code cell implements this approach. Take a minute to inspect it and compare it to the formula for $A(i, r)$. Here is the same formula repeated, so it's easier to compare against the code: for all $0 \leq i < N_0$ and $0 \leq r < R$,
$$ A(i, r) = \sum_{j=0}^{N_1-1} \sum_{k=0}^{N_2-1} \sum_{l=0}^{N_3-1} T(i, j, k, l) \cdot B(j, r) \cdot C(k, r) \cdot D(l, r). $$

def mttkrp_0__py(T, B, C, D, N0, R):
    assert R >= 1
    assert (type(T) is dict) and all([type(X) is list for X in [B, C, D]])
    A = [0.] * (N0 * R)  # output factor, stored as a row-major list
    for (i, j, k, l), T_ijkl in T.items():  # loop over stored nonzeros only
        for r in range(R):
            A[i*R + r] += T_ijkl * B[j*R + r] * C[k*R + r] * D[l*R + r]
    return A
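Note that the baseline touches each stored nonzero exactly once and performs R updates per nonzero, so its cost grows like (number of nonzeros) times R, independent of the dense sizes $N_0$ through $N_3$. As a quick sanity check (a sketch, separate from the graded tests below), here is a tiny instance you can verify against the formula by hand:

# Two nonzeros: T(0,0,0,0) = 2 and T(1,1,0,0) = 3, with N0 = N1 = 2,
# N2 = N3 = 1, and R = 1. Matrices are row-major lists, as above.
T_tiny = {(0, 0, 0, 0): 2.0, (1, 1, 0, 0): 3.0}
B_tiny = [10.0, 20.0]   # N1 x R = 2 x 1
C_tiny = [1.0]          # N2 x R = 1 x 1
D_tiny = [1.0]          # N3 x R = 1 x 1
A_tiny = mttkrp_0__py(T_tiny, B_tiny, C_tiny, D_tiny, 2, 1)
assert A_tiny == [20.0, 60.0]   # A(0,0) = 2*10*1*1; A(1,0) = 3*20*1*1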
Now run the following two code cells. The first one creates a sparse tensor, T_py, using the coordinates from occur_zodiac_df and random nonzero values. It also creates the random input factor matrices, using $R=6$. The second cell runs mttkrp_0__py to compute the output factor and measures how long it takes.
from cse6040bench import benchit

def get_timer(): # Used in some timing experiments
    from time import perf_counter
    return perf_counter()

def uniform_random_values(K):
    """Returns a Python list having K uniformly sampled values in (-1, 1)."""
    from random import uniform
    return [uniform(-1, 1) for _ in range(K)]

# Construct a 4-way tensor from `occur_zodiac_df`
def assemble_random_tensor_4way_from_df(df, columns):
    from random import uniform
    assert type(columns) is list and all([type(c) is str for c in columns])
    ci, cj, ck, cl = columns[0], columns[1], columns[2], columns[3]
    I, J, K, L = list(df[ci]), list(df[cj]), list(df[ck]), list(df[cl])
    V = [uniform(-1, 1) for _ in range(len(I))]
    T = {(i, j, k, l): v for i, j, k, l, v in zip(I, J, K, L, V)}
    return T

def generate_inputs(df, R=None):
    assert R is not None
    num_words = max(df['word_i'].max(), df['word_j'].max()) + 1
    num_posts = df['post_id'].max() + 1
    num_zodiac = df['zodiac_id'].max() + 1
    N0, N1, N2, N3 = num_words, num_words, num_posts, num_zodiac
    print("Using a 4-way tensor of size N0 x N1 x N2 x N3 == {} x {} x {} x {}.".format(N0, N1, N2, N3))
    print("There are {} nonzero entries.".format(len(df)))
    print("Using factors with {} columns.".format(R))
    print("\nGenerating tensor... (may take a minute)")
    t0__ = get_timer()
    T_py = assemble_random_tensor_4way_from_df(df, ['word_i', 'word_j', 'post_id', 'zodiac_id'])
    print("==> Done generating inputs. (~ {:.1f} seconds)".format(get_timer() - t0__))
    print("\nGenerating input factors...")
    B_py = uniform_random_values(N1 * R)
    C_py = uniform_random_values(N2 * R)
    D_py = uniform_random_values(N3 * R)
    print("==> Done.")
    return T_py, B_py, C_py, D_py, N0, N1, N2, N3, R
# Demo: Generate random-valued inputs
T_py, B_py, C_py, D_py, N0, N1, N2, N3, R = generate_inputs(occur_zodiac_df, R=6)
def generate_solution(T_py, B_py, C_py, D_py, N0, R):
    print("Executing the baseline MTTKRP method...")
    t1__ = get_timer()
    A_py = mttkrp_0__py(T_py, B_py, C_py, D_py, N0, R)
    print("==> Done with one instance. (~ {:.1f} seconds)".format(get_timer() - t1__))
    print("""
==> NOTE: This timing is only a quick estimate;
    it is based on only one measurement, and so should
    not be regarded as reliable. (Later, we will test
    more rigorously.)
""")
    return A_py
# Demo
A_py = generate_solution(T_py, B_py, C_py, D_py, N0, R)
Your task is now to implement a version of the mode-0 MTTKRP that is faster than the pure Python one.
To start, think about how you want to store the data objects. The pure Python method used a dictionary keyed on tuples to store the tensor and row-major lists to store the matrices. What are some alternatives? You'll want to think carefully about your approach. In particular, although we will test your code for correctness on a small example, we will perform timings using both a small and a (relatively) large example. So be sure you think about efficient use of time and storage.
Next, think about how your choice affects how the MTTKRP operation will look in code.
There are many potential approaches. One idea is to view the sparse MTTKRP operation as a relational query, but do not feel constrained by this suggestion.
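To make the design space a little more concrete, here is a sketch of one other possible direction (different from the relational-query idea): a pure-NumPy "scatter-add." It assumes you have converted the tensor to coordinate (COO) index arrays plus a value array, and the factors to NumPy arrays; treat it as an illustration under those assumptions, not a prescription.

import numpy as np

def mttkrp_0_scatter_sketch(I, J, K, L, V, B, C, D, N0, R):
    # I, J, K, L, V: length-nnz coordinate and value arrays for T.
    # B, C, D: NumPy arrays of shape (N1, R), (N2, R), (N3, R).
    E = V[:, np.newaxis] * B[J, :] * C[K, :] * D[L, :]  # (nnz x R) elementwise products
    A = np.zeros((N0, R))
    np.add.at(A, I, E)  # unbuffered A[I[n], :] += E[n, :], accumulating repeated indices
    return A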
Once you think you have a promising approach, you'll need to implement it by creating three different functions. You must create these; our testing and timing code will assume their existence.
The first function is convert_inputs():

def convert_inputs(T_py, B_py, C_py, D_py, N0, N1, N2, N3, R):
    ...

T, B, C, D = convert_inputs(...)

It should translate the baseline's data structures into your alternative representation. Our test code needs convert_inputs() to ensure that your implementation has the same inputs as the baseline. Your function should return the converted objects.
We will not grade based on the speed of convert_inputs(), so it's okay if it is not very fast. However, it should not be too slow, either, as you don't want the autograder to time out!
The second function you need to write is mttkrp_0(), which should use your converted objects to calculate the mode-0 MTTKRP:

def mttkrp_0(T, B, C, D, N0, N1, N2, N3, R):
    ...

A = mttkrp_0(T, B, ...)

Your mttkrp_0() should return the output factor, A, as your alternative object type.
We will grade based on how fast mttkrp_0() is. Your implementation needs to be at least two times ($2\times$) faster than the baseline to get full credit. (There is some partial credit if it is faster but does not meet that target.)
The third function is convert_output(). It should take the result of your mttkrp_0() function and convert it back into a row-major list:

def convert_output(A, N0, R):
    ...

A_py_result = convert_output(A, N0, R)

We need convert_output() so we can check your output against the baseline.
Lastly, before you start, note that there are three separate code cells below, one for each of the above functions. After those, there are several more cells for testing and timing.
import numpy as np
import random

def convert_inputs(T_py, B_py, C_py, D_py, N0, N1, N2, N3, R):
    assert type(T_py) is dict and all([type(X) is list for X in [B_py, C_py, D_py]])
    assert len(B_py) >= N1*R and len(C_py) >= N2*R and len(D_py) >= N3*R
    ### BEGIN SOLUTION
    # This approach uses Pandas to store the tensor (in coordinate form)
    # and Numpy to store the matrices.
    T = py_tensor_to_df(T_py, columns=['i', 'j', 'k', 'l', 'v'])
    B = py_matrix_to_np(B_py, N1, R)
    C = py_matrix_to_np(C_py, N2, R)
    D = py_matrix_to_np(D_py, N3, R)
    return T, B, C, D

def py_tensor_to_df(T_py, columns):
    indices = list(T_py.keys())
    values = list(T_py.values())
    index_colnames = columns[:-1]
    value_colname = columns[-1]
    T = pd.DataFrame(indices, columns=index_colnames)
    T[value_colname] = values
    return T

def py_matrix_to_np(X, M, N):
    return np.array(X).reshape((M, N))
### END SOLUTION

print("Calling your conversion routine...")
T, B, C, D = convert_inputs(T_py, B_py, C_py, D_py, N0, N1, N2, N3, R)
print(len(T_py))
print(len(B_py))
print(len(C_py))
print(len(D_py))
print(N0)
print("==> Done. At least it runs without crashing!")
def mttkrp_0(T, B, C, D, N0, N1, N2, N3, R):
    ### BEGIN SOLUTION
    I = T['i']
    J = T['j']
    K = T['k']
    L = T['l']
    T_values = T['v'].values
    # Elementwise products: one length-R row per stored nonzero,
    # giving an (nnz x R) array, E_np:
    E_np = T_values[:, np.newaxis] * B[J, :] * C[K, :] * D[L, :]
    # Convert to a dataframe so we can group by the mode-0 coordinate, i:
    E_df = pd.DataFrame(E_np)
    E_df['i'] = I
    # For each (i, r), evaluate the sum over (j, k, l).
    # The result is an N0_hat x R array, where N0_hat <= N0.
    # (There may be fewer rows because the output might be sparse.)
    A_df = E_df.groupby('i').sum()
    # Return the result as a dense Numpy array:
    A = np.zeros((N0, R))
    A[A_df.index, :] = A_df.values
    return A
    ### END SOLUTION

print("Calling your mode-0 MTTKRP...")
t2__ = get_timer()
A = mttkrp_0(T, B, C, D, N0, N1, N2, N3, R)
print(" ==> Done (~ {:.3f} seconds).".format(get_timer() - t2__))
print("""\n==> NOTE: This timing is only a quick estimate;
    it is based on only one measurement, and so should
    not be regarded as reliable. (Later, we will test
    more rigorously, after a correctness check.)
""")
def convert_output(A, N0, R):
    ### BEGIN SOLUTION
    return list(A.reshape(N0*R))
    ### END SOLUTION

print("Converting back for testing...")
A_py__yours = convert_output(A, N0, R)
Test cells for Exercise 1. Here are the test cells that must pass. There are several, so you can accumulate some partial credit. The comment at the top of each test cell tells you how many points it is worth.
# Test cell: `correctness_0` (1 point)

def compare_answers(A_py, A_py__yours, multiplier=None):
    assert multiplier is not None
    ABS_ERR_THRESHOLD = 2e-15 * multiplier
    print("Basic correctness check: Comparing your output to the reference, A_py...")
    print("    [threshold: {}]...".format(ABS_ERR_THRESHOLD))
    your_differences = [(a-b) for a, b in zip(A_py, A_py__yours)]
    your_abs_differences = [abs(x) for x in your_differences]
    your_max_abs_difference = max(your_abs_differences)
    print("==> Maximum absolute error observed:", your_max_abs_difference)
    if your_max_abs_difference <= ABS_ERR_THRESHOLD:
        big_err_locations = None
    else:
        big_err_locations = [(i // R, i % R, x) for i, x in enumerate(your_abs_differences) if x > ABS_ERR_THRESHOLD]
        print("Your solution produced a difference greater than the acceptable tolerance, {}.".format(ABS_ERR_THRESHOLD))
        print("For a list of entries, print the list, `big_err_locations`.")
        print("(Note that {} locations exceeded the threshold.)".format(len(big_err_locations)))
    return big_err_locations

big_err_locations = compare_answers(A_py, A_py__yours, multiplier=len(T_py))
assert big_err_locations is None, \
    f"Errors at {len(big_err_locations)} locations. Check `big_err_locations` for details."
print("\n(Passed correctness check!)")
# Test cell: `correctness_1` (1 point)
#
# This cell tests on the same structure with different nonzero values.
print("Generating a new test instance...")
from random import choice
T_py, B_py, C_py, D_py, N0, N1, N2, N3, R = generate_inputs(occur_zodiac_df, R=1+choice(range(9)))
print("==> Done.")
print("Running your implementation...")
print(" input conversion...")
T, B, C, D = convert_inputs(T_py, B_py, C_py, D_py, N0, N1, N2, N3, R)
print(" mode-0 MTTKRP...")
t3__ = get_timer()
A = mttkrp_0(T, B, C, D, N0, N1, N2, N3, R)
t3__ = get_timer() - t3__
print(" output conversion...")
A_py__yours = convert_output(A, N0, R)
print("==> Done. (MTTKRP ~ {:.1f} seconds.)".format(t3__))
print("""==> NOTE: This timing is only a quick estimate;
it is based on only one measurement, and so should
not be regarded as reliable. (Later, we will test
more rigorously, after a correctness check.""")
print("Checking the answer...")
A_py = generate_solution(T_py, B_py, C_py, D_py, N0, R)
big_err_locations = compare_answers(A_py, A_py__yours, multiplier=len(T_py))
assert big_err_locations is None
print("\n(Passed!)")
# Test cell: `correctness_2` (1 point)
#
# This cell tests on the same structure but on a larger problem.
print("Loading a larger test problem...")
bloggers_df, posts_df, occur_df, soln_df = read_blogs('-100x2')
occur_zodiac_df = add_zodiac_ids(occur_df, posts_df, bloggers_df)
T_py, B_py, C_py, D_py, N0, N1, N2, N3, R = generate_inputs(occur_zodiac_df, R=5)
print("==> Done.")
print("Running your implementation...")
print(" input conversion...")
T, B, C, D = convert_inputs(T_py, B_py, C_py, D_py, N0, N1, N2, N3, R)
print(" mode-0 MTTKRP...")
t3__ = get_timer()
A = mttkrp_0(T, B, C, D, N0, N1, N2, N3, R)
t3__ = get_timer() - t3__
print(" output conversion...")
A_py__yours = convert_output(A, N0, R)
print("==> Done. (MTTKRP ~ {:.1f} seconds.)".format(t3__))
print("""==> NOTE: This timing is only a quick estimate;
it is based on only one measurement, and so should
not be regarded as reliable. (Later, we will test
more rigorously, after a correctness check.)
""")
print("Checking the answer...")
A_py = generate_solution(T_py, B_py, C_py, D_py, N0, R)
big_err_locations = compare_answers(A_py, A_py__yours, multiplier=len(T_py))
assert big_err_locations is None
print("\n(Passed!)")
# Test cell: `performance_0`
#
# This cell checks that you achieved at least a 1.25x (25%) speed improvement.
print("\nTiming baseline...")
t_baseline = benchit("mttkrp_0__py(T_py, B_py, C_py, D_py, N0, R)", scope=globals())
print("\nTiming your method...")
t_better = benchit("mttkrp_0(T, B, C, D, N0, N1, N2, N3, R)", scope=globals())
speedup = t_baseline['mean_time_per_run'] / t_better['mean_time_per_run']
speedup_target = 1.25
assert speedup >= speedup_target, \
    "Your method was only {:.2f}x faster; the target is {}x (< 1 means it was slower)!".format(speedup, speedup_target)
print("\n(Passed -- you were {:.1f}x > {}x faster!)".format(speedup, speedup_target))
# Test cell: `performance_1`
#
# This cell checks that you achieved at least a 1.5x (50%) speed improvement.
speedup_target = 1.5
assert speedup >= speedup_target, \
    "Your method was only {:.2f}x faster; the target is {}x (< 1 means it was slower)!".format(speedup, speedup_target)
print("\n(Passed -- you were {:.1f}x > {}x faster!)".format(speedup, speedup_target))
# Test cell: `performance_2`
#
# This cell checks that you achieved at least a 2x (100%) speed improvement.
speedup_target = 2.0
assert speedup >= speedup_target, \
    "Your method was only {:.2f}x faster; the target is {}x (< 1 means it was slower)!".format(speedup, speedup_target)
print("\n(Passed -- you were {:.1f}x > {}x faster!)".format(speedup, speedup_target))
Fin! You've reached the end of this problem. Be sure to restart the kernel and re-run the notebook from top to bottom to make sure everything works. Then try submitting it to the autograder. If it passes the autograder, you'll be good to go!