Spring 2026 Midterm 2: Predicting At-Risk Students with Learning Analytics Data

Version 1.0.0

All of the header information is important. Please read it.

Topics and number of exercises: This problem builds on your knowledge of SQL queries and aggregations, pandas feature engineering, and NumPy classification metrics. It has 11 exercises, numbered 0 to 10. There are 22 available points; however, to earn 100%, the threshold is 14 points. (Therefore, once you hit 14 points, you can stop. There is no extra credit for exceeding this threshold.)

Exercise ordering: Each exercise builds logically on previous exercises, but you may solve them in any order. That is, if you can't solve an exercise, you can still move on and try the next one. Use this to your advantage, as the exercises are not necessarily ordered by difficulty. Higher point values generally indicate more difficult exercises.

Demo cells: Code cells starting with the comment ### Run Me!!! load results from prior exercises applied to the entire data set and use those to build demo inputs. These must be run for subsequent demos to work properly, but they do not affect the test cells. The data loaded in these cells may be rather large (at least in terms of human readability). You are free to print or otherwise use Python to explore them, but we may not print them in the starter code.

Debugging your code: Right before each exercise test cell, there is a block of text explaining the variables available to you for debugging. You may use these to test your code and can print/display them as needed (be careful when printing large objects; you may want to print just the head, or chunks of rows at a time).

Exercise point breakdown:

  • Exercise 0: 2 point(s)

  • Exercise 1: 2 point(s)

  • Exercise 2: 3 point(s)

  • Exercise 3: 2 point(s)

  • Exercise 4: 2 point(s)

  • Exercise 5: 1 point(s)

  • Exercise 6: 2 point(s)

  • Exercise 7: 3 point(s)

  • Exercise 8: 3 point(s)

  • Exercise 9: 1 point(s)

  • Exercise 10: 1 point(s)

Final reminders:

  • Submit after every exercise
  • Review the generated grade report after you submit to see what errors were returned
  • Stay calm, skip problems as needed and take short breaks at your leisure
In [1]:
### Global imports
import dill
from cse6040_devkit import plugins, utils
from cse6040_devkit.training_wheels import run_with_timeout, suppress_stdout
import tracemalloc
from time import time
from pprint import pprint
import sqlite3 
import pandas as pd
import numpy as np
In [2]:
DATABASE_URL = r'./resource/asnlib/publicdata/oulad2.db'
conn = sqlite3.connect(DATABASE_URL)

The Problem

This exam explores early identification of at-risk students using the Open University Learning Analytics Dataset (OULAD). Student dropout is a critical challenge in distance education. At the Open University, approximately 78% of students fail to complete their degree. The highest dropout rates occur within the first few weeks of a course, making early intervention essential but difficult.

Traditional predictive models rely on legacy data from previous courses to identify students at risk of failing or withdrawing. However, this approach cannot be used for new courses since they lack historical data. The Ouroboros methodology, introduced by Hlosta et al. (2017), proposes a solution: a "self-learning" model that uses data from the current course rather than legacy data.

Your Overall Task

In this exam, you will implement components of the Ouroboros approach for early risk prediction. The methodology centers on a critical insight: the first assessment (A1) serves as a key indicator of student success. Students who submit A1 have dramatically different outcomes than those who don't, and their behaviors can be used to predict which currently-enrolled students are at risk of not submitting.

The exam has three parts:

  1. Explore the OULAD Database Use SQL queries to understand course-level outcomes, assessment submission patterns, and Virtual Learning Environment (VLE) engagement data that reveal early behavioral indicators of student risk.

  2. Engineer Predictive Features Use Python, Pandas, and NumPy to transform raw student interaction data into meaningful features (VLE clicks, days active, registration timing) that capture engagement patterns before assessment deadlines.

  3. Implement Temporal Modeling Implement the Ouroboros "self-learning" concept by creating temporal train/test splits and calculating evaluation metrics that account for severe class imbalance inherent in early prediction windows.

By the end of this exam, you will have implemented key parts of the pipeline that identifies at-risk students without relying on legacy course data.

The Data

The Open University Learning Analytics Dataset (OULAD) contains anonymized data from 32,593 students across 7 modules and 22 course presentations spanning 2013-2014.

The database consists of 7 tables:

Core Student & Course Tables:

  • studentInfo - Student demographics and final course results
  • courses - Module and presentation information for each course offering
  • studentRegistration - Enrollment dates and withdrawal timestamps

Assessment Tables:

  • assessments - Assessment definitions including type, deadlines, and weights
  • studentAssessment - Individual student scores and submission dates

VLE Interaction Tables:

  • vle - Learning materials (activity_type, availability weeks) organized by module
  • studentVle - Daily student interaction logs

Schema Diagram:

oulad_schema_diagram.png

Part 1: Exploring Course Outcomes and Student Behavior with SQL

The OULAD contains 32,593 students across 22 course presentations, with over 10 million VLE interaction records and detailed assessment submission data.

Pass rates vary significantly across courses. Some courses see fewer than 50% of students pass while others exceed 60%. Understanding these baseline statistics provides context for why early intervention matters and helps us identify which courses might benefit most from predictive modeling.

Introduction: Exercise 0

Before we can analyze how students interact with learning materials, we need to understand what types of materials exist and how they're distributed across courses.

The following exercise explores the distribution of learning materials. This foundational knowledge shapes how we'll aggregate VLE interactions into meaningful features for prediction.

Exercise 0: (2 points)

vle_activity_summary

Your task: define vle_activity_summary_query as follows:

Write a SQL query that summarizes Virtual Learning Environment (VLE) activity types and their usage patterns across all courses.

Inputs: None

Return: query: a query that produces a table with 3 columns:

  • activity_type: The type of VLE activity
  • resource_count: Total number of VLE resources of each activity type across all courses
  • courses_using: Number of distinct course presentations utilizing each activity type

Requirements/steps:

  • Query the vle table
  • Group results by activity_type
  • For resource_count: count all VLE entries of each activity_type
  • For courses_using: count the distinct courses (combinations of code_module and code_presentation) that use each activity_type
  • Order the results by resource_count in descending order then courses_using in ascending order
  • Return activity_type, resource_count, courses_using in that order
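The crux of this exercise is counting distinct composite keys. As a hint, the pattern can be sketched on a toy in-memory table (the table name matches the exercise, but the rows here are invented):

```python
import sqlite3
import pandas as pd

# Toy sketch (not the exam data): COUNT(*) counts rows per group, while
# COUNT(DISTINCT module || '_' || presentation) counts unique courses.
conn_demo = sqlite3.connect(':memory:')
pd.DataFrame({
    'activity_type':     ['quiz',  'quiz',  'quiz',  'url'],
    'code_module':       ['AAA',   'AAA',   'BBB',   'AAA'],
    'code_presentation': ['2013J', '2013J', '2014B', '2013J'],
}).to_sql('vle', conn_demo, index=False)

demo_query = '''
SELECT activity_type,
       COUNT(*) AS resource_count,
       COUNT(DISTINCT code_module || '_' || code_presentation) AS courses_using
  FROM vle
 GROUP BY activity_type
 ORDER BY resource_count DESC, courses_using ASC;
'''
# 'quiz' has 3 rows but only 2 distinct courses (AAA_2013J appears twice)
print(pd.read_sql(demo_query, conn_demo))
```

Concatenating the two key columns with a separator is one idiomatic way to get a distinct count over a composite key in SQLite, which does not support `COUNT(DISTINCT a, b)` directly.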
In [3]:
### Solution - Exercise 0  
vle_activity_summary_query = '''YOUR QUERY HERE'''
###
### YOUR CODE HERE
###
### BEGIN SOLUTION
vle_activity_summary_query = '''
SELECT activity_type,
       COUNT(*) AS resource_count,
       COUNT(DISTINCT code_module || '_' || code_presentation) AS courses_using
  FROM vle
 GROUP BY activity_type
 ORDER BY resource_count DESC,
          courses_using ASC;
'''
### END SOLUTION

### Demo function call
demo_result_vle_activity_summary = pd.read_sql(vle_activity_summary_query, conn)
demo_result_vle_activity_summary
Out[3]:
activity_type resource_count courses_using
0 resource 807 4
1 subpage 122 4
2 oucontent 77 4
3 forumng 56 4
4 url 50 4
5 quiz 19 4
6 oucollaborate 6 3
7 glossary 5 4
8 questionnaire 4 1
9 homepage 4 4
10 sharedsubpage 3 3
11 ouelluminate 1 1

The demo should display this output.

activity_type resource_count courses_using
0 resource 807 4
1 subpage 122 4
2 oucontent 77 4
3 forumng 56 4
4 url 50 4
5 quiz 19 4
6 oucollaborate 6 3
7 glossary 5 4
8 questionnaire 4 1
9 homepage 4 4
10 sharedsubpage 3 3
11 ouelluminate 1 1


The cell below will test your solution for vle_activity_summary (exercise 0). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [4]:
### Test Cell - Exercise 0  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=plugins.sql_executor(vle_activity_summary_query),
              ex_name='vle_activity_summary',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=50)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to vle_activity_summary did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 1.76 seconds
memory after test: 3.83 MB
memory peak during test: 90.31 MB
Passed! Please submit.

Introduction: Exercise 1

The first assessment (A1) is an important indicator. Submission of A1 correlates strongly with eventual course completion. The following exercise examines this relationship by analyzing submission patterns across courses. You will explore differences between students who submit A1 early versus those who submit late. This behavioral difference is fundamental to the self-learning approach: early submitters provide the signal that enables prediction of which current students are at risk.

Exercise 1: (2 points)

course_pass_rate

Your task: define course_pass_rate_query as follows:

Write a SQL query that calculates pass rate statistics for each course presentation.

Inputs: None

Return: query: a query that produces a table with 5 columns:

  • code_module: The course module code
  • code_presentation: The presentation code
  • total_students: Total number of students enrolled in the course presentation
  • passed_students: Number of students who achieved 'Pass' or 'Distinction' as their final result
  • pass_rate: Percentage of students who passed (rounded to 2 decimal places)

Requirements/steps:

  • Join the courses table with the studentInfo table using both code_module and code_presentation
  • Group results by code_module and code_presentation
  • Calculate total_students by counting all students in each grouping
  • Calculate passed_students by summing students whose final_result is either 'Pass' or 'Distinction'
  • Calculate pass_rate as a percentage: (passed_students / total_students) * 100, rounded to 2 decimal places
  • Filter to include only groupings with at least 500 students
  • Order results by pass_rate descending, then code_presentation ascending (to break ties deterministically)
  • Return code_module, code_presentation, total_students, passed_students, pass_rate in that order
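The conditional-count pattern (`SUM(CASE WHEN ...)`) combined with `ROUND` and a `HAVING` filter can be sketched on a toy in-memory table; the table and values below are invented, not the exam data:

```python
import sqlite3
import pandas as pd

# Toy sketch: count passing rows per group, compute a rounded percentage,
# and drop small groups with HAVING.
conn_demo = sqlite3.connect(':memory:')
pd.DataFrame({
    'grp':    ['X', 'X', 'X', 'Y'],
    'result': ['Pass', 'Fail', 'Distinction', 'Pass'],
}).to_sql('outcomes', conn_demo, index=False)

demo_query = '''
SELECT grp,
       COUNT(*) AS total,
       SUM(CASE WHEN result IN ('Pass', 'Distinction') THEN 1 ELSE 0 END) AS passed,
       ROUND(100.0 * SUM(CASE WHEN result IN ('Pass', 'Distinction') THEN 1 ELSE 0 END)
             / COUNT(*), 2) AS pass_rate
  FROM outcomes
 GROUP BY grp
HAVING COUNT(*) >= 2
 ORDER BY pass_rate DESC;
'''
# Group 'Y' (only 1 row) is filtered out; 'X' passes 2 of 3 (66.67%)
print(pd.read_sql(demo_query, conn_demo))
```

Note the `100.0` literal: multiplying by a float first avoids SQLite's integer division truncating the ratio to 0.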
In [5]:
### Solution - Exercise 1  
course_pass_rate_query = '''YOUR QUERY HERE'''
###
### YOUR CODE HERE
###
### BEGIN SOLUTION
course_pass_rate_query = '''
SELECT c.code_module,
       c.code_presentation,
       COUNT(s.id_student) AS total_students,
       SUM(CASE WHEN s.final_result IN ('Pass', 'Distinction') THEN 1 ELSE 0 END) AS passed_students,
       ROUND(100.0 * SUM(CASE WHEN s.final_result IN ('Pass', 'Distinction') THEN 1 ELSE 0 END) / COUNT(s.id_student), 2) AS pass_rate
  FROM courses AS c
  JOIN studentInfo AS s
    ON c.code_module = s.code_module
   AND c.code_presentation = s.code_presentation
 GROUP BY c.code_module, c.code_presentation
HAVING COUNT(s.id_student) >= 500
 ORDER BY pass_rate DESC, c.code_presentation ASC;
'''
### END SOLUTION

### Demo function call
demo_result_course_pass_rate = pd.read_sql(course_pass_rate_query, conn)
demo_result_course_pass_rate
Out[5]:
code_module code_presentation total_students passed_students pass_rate
0 BBB 2014J 2292 1152 50.26
1 BBB 2013J 2237 1072 47.92
2 BBB 2013B 1767 803 45.44
3 BBB 2014B 1613 727 45.07

The demo should display this output.

code_module code_presentation total_students passed_students pass_rate
0 BBB 2014J 2292 1152 50.26
1 BBB 2013J 2237 1072 47.92
2 BBB 2013B 1767 803 45.44
3 BBB 2014B 1613 727 45.07


The cell below will test your solution for course_pass_rate (exercise 1). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [6]:
### Test Cell - Exercise 1  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=plugins.sql_executor(course_pass_rate_query),
              ex_name='course_pass_rate',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=50)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to course_pass_rate did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 1.66 seconds
memory after test: 0.53 MB
memory peak during test: 75.99 MB
Passed! Please submit.

Introduction: Exercise 2

Not all A1 submitters are equal. Students who submit assessments early versus late show different engagement patterns in the Virtual Learning Environment.

In the following exercise, you'll compare VLE interaction patterns before the median A1 submission date. This reveals behavioral differences that separate engaged students from those who may be struggling. These differences are visible long before the assignment deadline.

Exercise 2: (3 points)

DEBUG_early_late_submitters

Your task: define DEBUG_early_late_submitters_query as follows:

Write a SQL query that compares VLE engagement patterns between students who submitted the first assessment early versus late.

Inputs: None

Return: query: a query that produces a table with 5 columns:

  • submission_category: Either 'Early' or 'Late' based on submission timing
  • student_count: Total number of unique students in each category
  • avg_total_clicks: Average total VLE clicks per student (rounded to 2 decimal places)
  • avg_days_active: Average number of distinct days with VLE activity (rounded to 2 decimal places)
  • avg_submission_day: Average day when the assessment was submitted (rounded to 2 decimal places)

Requirements/steps:

  • This is a debugging exercise! You are free to delete the provided code and rewrite.
  • Bugs exist in the function provided below. Identify and fix them all.
  • Join studentAssessment with the assessments table on id_assessment
  • Calculate total VLE engagement metrics for each id_student, code_module, and code_presentation, then merge them with the joined studentAssessment/assessments result on id_student, code_module, and code_presentation:
    • total_clicks: SUM of sum_click from studentVle table
    • days_active: COUNT of DISTINCT date values from studentVle table
    • Filter to course code_module = 'BBB' and code_presentation = '2013B'
  • Categorize students into submission categories using a CASE statement:
    • 'Early': students with date_submitted <= 25
    • 'Late': students with date_submitted > 25
  • Filter analysis to id_assessment = 14984 (the first assessment)
  • Group results by submission_category
  • For each submission_category, calculate the four aggregate metrics (student_count, avg_total_clicks (rounded to 2 decimal places), avg_days_active (rounded to 2 decimal places), avg_submission_day (rounded to 2 decimal places))
  • Students with no VLE activity should default to 0 for total_clicks and days_active
  • Return submission_category, student_count, avg_total_clicks, avg_days_active, avg_submission_day in that order
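One bug class worth knowing here is the join type: an INNER JOIN against an activity subquery silently drops students with no activity, while a LEFT JOIN plus COALESCE keeps them with a default of 0. A minimal sketch on invented toy tables:

```python
import sqlite3
import pandas as pd

# Toy sketch: student 2 has no activity rows but must still appear
# with total_clicks defaulting to 0.
conn_demo = sqlite3.connect(':memory:')
pd.DataFrame({'id_student': [1, 2]}).to_sql('submitters', conn_demo, index=False)
pd.DataFrame({'id_student': [1, 1],
              'sum_click':  [3, 4]}).to_sql('activity', conn_demo, index=False)

demo_query = '''
SELECT s.id_student,
       COALESCE(a.total_clicks, 0) AS total_clicks
  FROM submitters s
  LEFT JOIN (SELECT id_student, SUM(sum_click) AS total_clicks
               FROM activity
              GROUP BY id_student) a
    ON s.id_student = a.id_student
 ORDER BY s.id_student;
'''
print(pd.read_sql(demo_query, conn_demo))
# With INNER JOIN instead of LEFT JOIN, student 2 would vanish entirely,
# skewing every per-category average upward.
```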
In [7]:
### Solution - Exercise 2  
# DEBUG_early_late_submitters_query = '''
# SELECT
#     CASE
#         WHEN sa.date_submitted < 25 THEN 'Early'
#         ELSE 'Late'
#     END AS submission_category,
#     COUNT(sa.id_student) AS student_count,
#     AVG(sv.total_clicks) AS avg_total_clicks,
#     AVG(sv.days_active) AS avg_days_active,
#     AVG(sa.date_submitted) AS avg_submission_day
# FROM studentAssessment sa
# INNER JOIN assessments a ON sa.id_assessment = a.id_assessment
# INNER JOIN (
#     SELECT
#         id_student,
#         code_module,
#         code_presentation,
#         SUM(sum_click) as total_clicks,
#         COUNT(date) as days_active
#     FROM studentVle
#     GROUP BY id_student, code_module, code_presentation
# ) sv
#     ON sa.id_student = sv.id_student
#     AND a.code_module = sv.code_module
#     AND a.code_presentation = sv.code_presentation
# WHERE sa.id_assessment = 14984
# GROUP BY submission_category
# ORDER BY submission_category;
# '''
###
### YOUR CODE HERE
###
### BEGIN SOLUTION
DEBUG_early_late_submitters_query = '''
SELECT
    CASE
        WHEN sa.date_submitted <= 25 THEN 'Early'
        ELSE 'Late'
    END AS submission_category,
    COUNT(DISTINCT sa.id_student) AS student_count,
    ROUND(AVG(COALESCE(sv.total_clicks, 0)), 2) AS avg_total_clicks,
    ROUND(AVG(COALESCE(sv.days_active, 0)), 2) AS avg_days_active,
    ROUND(AVG(sa.date_submitted), 2) AS avg_submission_day
FROM studentAssessment sa
INNER JOIN assessments a ON sa.id_assessment = a.id_assessment
LEFT JOIN (
    SELECT
        id_student,
        code_module,
        code_presentation,
        SUM(sum_click) as total_clicks,
        COUNT(DISTINCT date) as days_active
    FROM studentVle
    WHERE code_module = 'BBB'
      AND code_presentation = '2013B'
    GROUP BY id_student, code_module, code_presentation
) sv
    ON sa.id_student = sv.id_student
    AND a.code_module = sv.code_module
    AND a.code_presentation = sv.code_presentation
WHERE sa.id_assessment = 14984
GROUP BY submission_category
ORDER BY submission_category;
'''
### END SOLUTION

### Demo function call
demo_result_DEBUG_early_late_submitters = pd.read_sql(DEBUG_early_late_submitters_query, conn)
demo_result_DEBUG_early_late_submitters
Out[7]:
submission_category student_count avg_total_clicks avg_days_active avg_submission_day
0 Early 1276 1022.25 59.48 17.45
1 Late 76 377.47 23.51 34.68

The demo should display this output.

submission_category student_count avg_total_clicks avg_days_active avg_submission_day
0 Early 1276 1022.25 59.48 17.45
1 Late 76 377.47 23.51 34.68


The cell below will test your solution for DEBUG_early_late_submitters (exercise 2). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [8]:
### Test Cell - Exercise 2  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=plugins.sql_executor(DEBUG_early_late_submitters_query),
              ex_name='DEBUG_early_late_submitters',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=50)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to DEBUG_early_late_submitters did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 4.86 seconds
memory after test: 0.42 MB
memory peak during test: 297.21 MB
Passed! Please submit.

Introduction: Exercise 3

There are additional early indicators beyond submission patterns that predict at-risk students. VLE engagement metrics such as total clicks and days active correlate strongly with course outcomes.

The following exercise introduces the importance of behavioral features. Students with low engagement early in the course likely need intervention.

Exercise 3: (2 points)

vle_engagement_risk

Your task: define vle_engagement_risk_query as follows:

Write a SQL query that identifies at-risk students based on low VLE engagement in the first 30 days of a course.

Inputs: None

Return: query: a query that produces a table with 5 columns:

  • id_student: The student identifier
  • total_clicks: Total sum of VLE clicks across all interactions before day 30
  • days_active: Count of distinct days with VLE activity before day 30
  • at_risk_flag: Binary flag (0 or 1) indicating risk status
  • final_result: The student's final course outcome

Requirements/steps:

  • Filter studentVle records to code_module = 'BBB', code_presentation = '2013J', and date < 30
  • Join the filtered studentVle with studentInfo on id_student, code_module, and code_presentation
  • Group results by id_student and final_result
  • Calculate VLE engagement metrics for each student:
    • total_clicks: SUM of sum_click values
    • days_active: COUNT of DISTINCT date values
  • Create at_risk_flag using a CASE statement:
    • Set flag to 1 if total_clicks < 50 OR days_active < 5
    • Set flag to 0 otherwise
  • Include the final_result column from the studentInfo table
  • Order results by id_student ascending, then final_result ascending
  • Return id_student, total_clicks, days_active, at_risk_flag, final_result in that order
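Two details matter here: `COUNT(DISTINCT date)` versus `COUNT(date)`, and a `CASE` expression over aggregates. A toy sketch with invented rows (a student who clicks twice on the same day has two rows but one active day):

```python
import sqlite3
import pandas as pd

# Toy sketch: distinct-day counting plus an at-risk CASE flag over
# aggregate expressions.
conn_demo = sqlite3.connect(':memory:')
pd.DataFrame({
    'id_student': [1, 1, 1],
    'date':       [0, 0, 3],
    'sum_click':  [5, 2, 1],
}).to_sql('studentVle', conn_demo, index=False)

demo_query = '''
SELECT id_student,
       SUM(sum_click) AS total_clicks,
       COUNT(date) AS rows_counted,
       COUNT(DISTINCT date) AS days_active,
       CASE WHEN SUM(sum_click) < 50 OR COUNT(DISTINCT date) < 5
            THEN 1 ELSE 0 END AS at_risk_flag
  FROM studentVle
 GROUP BY id_student;
'''
# 3 rows, but only 2 distinct days; 8 clicks < 50, so the flag is 1
print(pd.read_sql(demo_query, conn_demo))
```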
In [9]:
### Solution - Exercise 3  
vle_engagement_risk_query = '''YOUR QUERY HERE'''
###
### YOUR CODE HERE
###
### BEGIN SOLUTION
vle_engagement_risk_query = '''
SELECT sv.id_student,
       SUM(sv.sum_click) AS total_clicks,
       COUNT(DISTINCT sv.date) AS days_active,
       CASE 
           WHEN SUM(sv.sum_click) < 50 OR COUNT(DISTINCT sv.date) < 5 
           THEN 1 
           ELSE 0 
       END AS at_risk_flag,
       s.final_result
  FROM studentVle AS sv
  JOIN studentInfo AS s
    ON sv.id_student = s.id_student
   AND sv.code_module = s.code_module
   AND sv.code_presentation = s.code_presentation
 WHERE sv.code_module = 'BBB'
   AND sv.code_presentation = '2013J'
   AND sv.date < 30
 GROUP BY sv.id_student, s.final_result
 ORDER BY sv.id_student ASC, s.final_result ASC;
'''
### END SOLUTION

### Demo function call
demo_result_vle_engagement_risk = pd.read_sql(vle_engagement_risk_query, conn)
demo_result_vle_engagement_risk.head(20)
Out[9]:
id_student total_clicks days_active at_risk_flag final_result
0 23798 152 12 0 Distinction
1 27759 123 12 0 Fail
2 30091 219 8 0 Pass
3 31014 409 26 0 Withdrawn
4 31849 414 17 0 Pass
5 37622 88 4 1 Pass
6 38234 34 5 1 Fail
7 47855 5 3 1 Withdrawn
8 48040 307 26 0 Pass
9 48503 121 8 0 Pass
10 51301 502 36 0 Withdrawn
11 52797 280 15 0 Pass
12 52899 448 21 0 Pass
13 53360 22 3 1 Withdrawn
14 54388 42 1 1 Withdrawn
15 55026 118 13 0 Pass
16 56340 4 1 1 Fail
17 56789 124 8 0 Pass
18 57079 112 12 0 Fail
19 63044 74 6 0 Fail

The demo should display this output.

id_student total_clicks days_active at_risk_flag final_result
0 23798 152 12 0 Distinction
1 27759 123 12 0 Fail
2 30091 219 8 0 Pass
3 31014 409 26 0 Withdrawn
4 31849 414 17 0 Pass
5 37622 88 4 1 Pass
6 38234 34 5 1 Fail
7 47855 5 3 1 Withdrawn
8 48040 307 26 0 Pass
9 48503 121 8 0 Pass
10 51301 502 36 0 Withdrawn
11 52797 280 15 0 Pass
12 52899 448 21 0 Pass
13 53360 22 3 1 Withdrawn
14 54388 42 1 1 Withdrawn
15 55026 118 13 0 Pass
16 56340 4 1 1 Fail
17 56789 124 8 0 Pass
18 57079 112 12 0 Fail
19 63044 74 6 0 Fail


The cell below will test your solution for vle_engagement_risk (exercise 3). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [10]:
### Test Cell - Exercise 3  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=plugins.sql_executor(vle_engagement_risk_query),
              ex_name='vle_engagement_risk',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=50)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to vle_engagement_risk did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 2.71 seconds
memory after test: 0.46 MB
memory peak during test: 179.99 MB
Passed! Please submit.
In [11]:
### Run Me!!!
students_by_course = utils.load_object_from_publicdata('students_by_course')
registrations_by_course = utils.load_object_from_publicdata('registrations_by_course')
vle_by_course = utils.load_object_from_publicdata('vle_by_course')
STUDENT_FEATURES = utils.load_object_from_publicdata('STUDENT_FEATURES')
REGISTRATION_FEATURES = utils.load_object_from_publicdata('REGISTRATION_FEATURES')
RESULT_FEATURES = utils.load_object_from_publicdata('RESULT_FEATURES')

Part 2: Feature Engineering and Temporal Modeling

Now we will shift from exploration to implementation. You'll transform raw VLE interaction data into structured features for machine learning and implement the temporal modeling framework that enables self-learning predictions.

Introduction: Exercise 4

The analysis requires that we create features that capture engagement patterns before assessment submissions. We need to aggregate VLE interactions, calculate days active, and extract registration timing given a temporal cutoff day. In the following exercise, you will produce one row per student with engineered behavioral features computed only using data available up to the cutoff day. These features are the foundation for subsequent modeling.

Exercise 4: (2 points)

create_student_features

Your task: define create_student_features as follows:

Create a comprehensive student feature matrix that combines demographics, registration timing, and VLE engagement metrics.

Inputs:

  • students_df (DataFrame): Students from studentInfo table with columns: code_module, code_presentation, id_student, age_band, highest_education, disability, num_of_prev_attempts, final_result
  • registrations_df (DataFrame): Registration records from studentRegistration table with columns: code_module, code_presentation, id_student, date_registration
  • vle_df (DataFrame): VLE interaction records from studentVle table with columns: code_module, code_presentation, id_student, date, sum_click
  • cutoff_day (int): Temporal cutoff for VLE activity analysis (only include VLE activity before this day)

Return: A pandas DataFrame with the following structure:

  • Index: id_student (student identifier)
  • Columns: All feature columns plus code_module and code_presentation for course identification

Requirements/steps:

  1. Start with students_df using only the columns in STUDENT_FEATURES
  2. Merge registrations_df onto it (registrations_df as the right table) to add registration timing features:
    • Use a left join on code_module, code_presentation, and id_student
    • From registrations_df, keep only the columns in REGISTRATION_FEATURES
  3. Calculate VLE features from vle_df:
    • Filter vle_df to only include records where date < cutoff_day
    • Group by code_module, code_presentation, and id_student
    • Calculate two aggregated metrics:
      • total_clicks: SUM of sum_click
      • total_active_days: COUNT of DISTINCT date values
    • Calculate avg_clicks_per_day, equal to total_clicks / total_active_days
  4. Merge the VLE features (as the right table) onto the features DataFrame from Step 2, using a left join on code_module, code_presentation, and id_student
  5. Fill any missing VLE feature values with 0 (for students with no VLE activity)
  6. Ensure dtypes for (total_clicks, total_active_days, avg_clicks_per_day) are (int64, int64, float64)
  7. Create a column days_registered_before_start equal to date_registration
  8. Return only columns from RESULT_FEATURES indexed by id_student

Note: The preloaded lists STUDENT_FEATURES, REGISTRATION_FEATURES, and RESULT_FEATURES are provided to save typing.

In [12]:
### Solution - Exercise 4  
def create_student_features(students_df, registrations_df, vle_df, cutoff_day):
    ###
    ### YOUR CODE HERE
    ###
    ### BEGIN SOLUTION
    
    features = students_df[STUDENT_FEATURES].copy()
    
    features = features.merge(
        registrations_df[REGISTRATION_FEATURES], 
        on=['code_module', 'code_presentation', 'id_student'], 
        how='left'
    )

    vle_filtered = vle_df[vle_df['date'] < cutoff_day].copy() 
    vle_agg = vle_filtered.groupby(['code_module', 'code_presentation', 'id_student']).agg(
        total_clicks=('sum_click', 'sum'),
        total_active_days=('date', 'nunique')
    ).reset_index()
    
    vle_agg['avg_clicks_per_day'] = vle_agg['total_clicks'] / vle_agg['total_active_days']
    
    features = features.merge(vle_agg, on=['code_module', 'code_presentation', 'id_student'], how='left')
    
    features['total_clicks'] = features['total_clicks'].fillna(0).astype(int)
    features['total_active_days'] = features['total_active_days'].fillna(0).astype(int)
    features['avg_clicks_per_day'] = features['avg_clicks_per_day'].fillna(0.0)
    
    features['days_registered_before_start'] = features['date_registration']
    
    result = features[RESULT_FEATURES].set_index('id_student')
    
    return result
    ### END SOLUTION
    
### Demo function call
demo_students = students_by_course[('AAA', '2013J')].head(50)
demo_registrations = registrations_by_course[('AAA', '2013J')]
demo_vle = vle_by_course[('AAA', '2013J')]

demo_cutoff = 30

result = create_student_features(demo_students, demo_registrations, demo_vle, demo_cutoff)

result.head(10)
Out[12]:
code_module code_presentation age_band highest_education disability num_of_prev_attempts days_registered_before_start total_clicks total_active_days avg_clicks_per_day final_result
id_student
11391 AAA 2013J 55<= HE Qualification N 0 -159.0 419 9 46.555556 Pass
28400 AAA 2013J 35-55 HE Qualification N 0 -53.0 618 19 32.526316 Pass
30268 AAA 2013J 35-55 A Level or Equivalent Y 0 -92.0 281 12 23.416667 Withdrawn
31604 AAA 2013J 35-55 A Level or Equivalent N 0 -52.0 503 23 21.869565 Pass
32885 AAA 2013J 0-35 Lower Than A Level N 0 -176.0 567 24 23.625000 Pass
38053 AAA 2013J 35-55 A Level or Equivalent N 0 -110.0 727 32 22.718750 Pass
45462 AAA 2013J 0-35 HE Qualification N 0 -67.0 561 15 37.400000 Pass
45642 AAA 2013J 0-35 A Level or Equivalent N 0 -29.0 423 20 21.150000 Pass
52130 AAA 2013J 0-35 A Level or Equivalent N 0 -33.0 366 19 19.263158 Pass
53025 AAA 2013J 55<= Post Graduate Qualification N 0 -179.0 571 20 28.550000 Pass

The demo should display this output.

code_module code_presentation age_band highest_education disability num_of_prev_attempts days_registered_before_start total_clicks total_active_days avg_clicks_per_day final_result
id_student
11391 AAA 2013J 55<= HE Qualification N 0 -159.0 419 9 46.555556 Pass
28400 AAA 2013J 35-55 HE Qualification N 0 -53.0 618 19 32.526316 Pass
30268 AAA 2013J 35-55 A Level or Equivalent Y 0 -92.0 281 12 23.416667 Withdrawn
31604 AAA 2013J 35-55 A Level or Equivalent N 0 -52.0 503 23 21.869565 Pass
32885 AAA 2013J 0-35 Lower Than A Level N 0 -176.0 567 24 23.625000 Pass
38053 AAA 2013J 35-55 A Level or Equivalent N 0 -110.0 727 32 22.718750 Pass
45462 AAA 2013J 0-35 HE Qualification N 0 -67.0 561 15 37.400000 Pass
45642 AAA 2013J 0-35 A Level or Equivalent N 0 -29.0 423 20 21.150000 Pass
52130 AAA 2013J 0-35 A Level or Equivalent N 0 -33.0 366 19 19.263158 Pass
53025 AAA 2013J 55<= Post Graduate Qualification N 0 -179.0 571 20 28.550000 Pass


The cell below will test your solution for create_student_features (exercise 4). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [13]:
### Test Cell - Exercise 4  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=plugins.sqlite_blocker(create_student_features),
              ex_name='create_student_features',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=20)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to create_student_features did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 7.33 seconds
memory after test: 0.86 MB
memory peak during test: 253.98 MB
Passed! Please submit.
In [14]:
### Run Me!!!
all_students = utils.load_object_from_publicdata('all_students')
all_student_assessments = utils.load_object_from_publicdata('all_student_assessments')

Introduction: Exercise 5

In the following exercise, you'll create binary labels indicating whether students submitted A1 (1) or did not submit (0). These labels must be merged with the features from Exercise 4. Together, the features and labels create our machine learning dataset, with each row representing a student's behavioral profile and their eventual submission outcome.

Exercise 5: (1 point)

create_submission_labels

Example: we have defined create_submission_labels as follows:

Generate binary submission labels indicating whether students submitted the first assessment within a specific temporal window.

THIS IS A FREE EXERCISE. Submit your exam to obtain your free point.

Inputs:

  • students_df (DataFrame): Students from studentInfo table with columns including: code_module, code_presentation, id_student
  • student_assessments_df (DataFrame): Assessment submissions from studentAssessment table merged with assessments table, containing columns including: id_student, id_assessment, date_submitted, code_module, code_presentation
  • first_assessment_id (int): ID of the first assessment to analyze
  • label_start_day (int): Start of the label window (inclusive)
  • label_end_day (int): End of the label window (inclusive)

Return: A pandas DataFrame with the following columns:

  • id_student: Student identifier
  • code_module: Course module code
  • code_presentation: Course presentation code
  • label: Binary label (1 if submitted within window, 0 otherwise)

Requirements/steps:

  1. Filter student_assessments_df to only include records where id_assessment equals first_assessment_id
  2. Create a submission DataFrame from the filtered student assessments (from step 1):
    • Filter submissions to those where label_start_day <= date_submitted <= label_end_day
    • Only keep the columns id_student, code_module, and code_presentation
  3. Merge the students_df (on left) with submission DataFrame (from step 2):
    • Only keep the columns id_student, code_module, and code_presentation from students_df
    • Use a left join on id_student, code_module, and code_presentation
    • This ensures all students from students_df are included, even those who never submitted
  4. Create label column which corresponds to 1 for students who submitted within the window, 0 otherwise
  5. Select id_student, code_module, code_presentation, and label, and return them with id_student set as the index
In [15]:
### Solution - Exercise 5  
def create_submission_labels(students_df, student_assessments_df, first_assessment_id, label_start_day, label_end_day):
    filtered_assessments = student_assessments_df[
        student_assessments_df['id_assessment'] == first_assessment_id
    ]
    
    submitted_in_window = filtered_assessments[
        (filtered_assessments['date_submitted'] >= label_start_day) &
        (filtered_assessments['date_submitted'] <= label_end_day)
    ][['id_student', 'code_module', 'code_presentation']].drop_duplicates()
    
    result = students_df[['id_student', 'code_module', 'code_presentation']].copy()
    
    result = result.merge(
        submitted_in_window,
        on=['id_student', 'code_module', 'code_presentation'],
        how='left',
        indicator=True
    )
    
    result['label'] = (result['_merge'] == 'both').astype(int)
    
    result = result.drop(columns=['_merge'])
    
    result = result.set_index('id_student')
    
    return result

### Demo function call
demo_students = all_students[
    (all_students['code_module'] == 'AAA') &
    (all_students['code_presentation'] == '2013J')
].head(50)

demo_student_assessments = all_student_assessments[
    (all_student_assessments['code_module'] == 'AAA') &
    (all_student_assessments['code_presentation'] == '2013J')
]

course_assessments = demo_student_assessments.sort_values('date_submitted')
first_assessment_id = course_assessments.iloc[0]['id_assessment'] if len(course_assessments) > 0 else None

result = create_submission_labels(
    demo_students,
    demo_student_assessments,
    first_assessment_id,
    label_start_day=0,
    label_end_day=30
)

print(result['label'].iloc[:10])
id_student
11391    1
28400    1
30268    0
31604    1
32885    1
38053    1
45462    1
45642    1
52130    1
53025    1
Name: label, dtype: int64

The demo should display this printed output.

id_student
11391    1
28400    1
30268    0
31604    1
32885    1
38053    1
45462    1
45642    1
52130    1
53025    1
Name: label, dtype: int64


The test cell below will always pass. Please submit to collect your free point for create_submission_labels (exercise 5).

In [16]:
### Test Cell - Exercise 5  


print('Passed! Please submit.')
Passed! Please submit.

Introduction: Exercise 6

A difficulty with the Ouroboros approach is class imbalance: 75% or more of students submit A1. This imbalance can lead to misleading evaluation metrics. For example, if 90% of students submit A1, a model that always predicts "submit" would achieve 90% accuracy but would be useless for identifying at-risk students.

This exercise quantifies the imbalance ratio and calculates how many samples we have in each class. Understanding this imbalance is essential because it affects which evaluation metrics we should use.
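To see the problem concretely, here is a small sketch (hypothetical 90/10 cohort) comparing a constant "always submit" baseline's accuracy against its ability to flag at-risk students:

```python
import numpy as np

# Hypothetical cohort: 90 submitters (label 1), 10 non-submitters (label 0).
labels = np.array([1] * 90 + [0] * 10)
always_submit = np.ones_like(labels)            # constant "submit" prediction

accuracy = (always_submit == labels).mean()
at_risk_flagged = int(((always_submit == 0) & (labels == 0)).sum())

print(accuracy)         # 0.9 -- looks strong...
print(at_risk_flagged)  # 0   -- ...yet no at-risk student is identified
```

This is why imbalance-aware metrics matter more than raw accuracy for this task.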

Exercise 6: (2 points)

analyze_class_imbalance

Your task: define analyze_class_imbalance as follows:

Analyze class imbalance in binary classification labels. In early prediction scenarios, positive examples (students who submit) are typically much fewer than negative examples (students who don't submit), creating severe class imbalance. Understanding this imbalance is crucial for model evaluation and metric selection.

Inputs:

  • labels (Series): Binary labels (0 or 1) indexed by id_student

Return: A Python dictionary containing the following keys and values:

  • 'total_samples': Total number of labels (integer)
  • 'positive_count': Count of labels with value 1 (integer)
  • 'negative_count': Count of labels with value 0 (integer)
  • 'positive_ratio': Ratio of positive samples to total samples (float, rounded to 4 decimal places)
  • 'imbalance_ratio': Ratio of majority class to minority class (float, rounded to 4 decimal places)
  • 'majority_class': The more frequent class label (integer, either 0 or 1)

Requirements/steps:

  1. Count the total number of samples and name it total_samples
  2. Count the number of positive labels (value = 1) and name it positive_count
  3. Count the number of negative labels (value = 0) and name it negative_count
  4. Calculate the positive ratio: positive_count / total_samples, rounded to 4 decimal places and name it positive_ratio
  5. Calculate the imbalance ratio: max(positive_count, negative_count) / min(positive_count, negative_count), rounded to 4 decimal places and name it imbalance_ratio
    • If either positive_count or negative_count is 0, then set imbalance_ratio to inf
  6. Determine the majority class: the label (0 or 1) that appears more frequently and name it majority_class
    • If classes are equal, set majority_class to 1
  7. Return all six metrics in a dictionary with the exact keys and values specified above
In [17]:
### Solution - Exercise 6  
def analyze_class_imbalance(labels):
    ###
    ### YOUR CODE HERE
    ###
    ### BEGIN SOLUTION
    total_samples = len(labels)
    positive_count = int(labels.sum())
    negative_count = total_samples - positive_count
    
    positive_ratio = round(positive_count / total_samples, 4)
    
    if negative_count == 0:
        imbalance_ratio = float('inf')
    elif positive_count == 0:
        imbalance_ratio = float('inf')
    else:
        imbalance_ratio = round(max(positive_count, negative_count) / min(positive_count, negative_count), 4)
    
    majority_class = 1 if positive_count >= negative_count else 0
    
    return {
        'total_samples': total_samples,
        'positive_count': positive_count,
        'negative_count': negative_count,
        'positive_ratio': positive_ratio,
        'imbalance_ratio': imbalance_ratio,
        'majority_class': majority_class
    }
    ### END SOLUTION
    
### Demo function call
demo_labels = pd.Series(
    [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1] * 2,
    index=range(11391, 11391 + 40)
)

result = analyze_class_imbalance(demo_labels)

pprint(result)
{'imbalance_ratio': 4.0,
 'majority_class': 1,
 'negative_count': 8,
 'positive_count': 32,
 'positive_ratio': 0.8,
 'total_samples': 40}

The demo should display this printed output.

{'imbalance_ratio': 4.0,
 'majority_class': 1,
 'negative_count': 8,
 'positive_count': 32,
 'positive_ratio': 0.8,
 'total_samples': 40}


The cell below will test your solution for analyze_class_imbalance (exercise 6). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [18]:
### Test Cell - Exercise 6  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=analyze_class_imbalance,
              ex_name='analyze_class_imbalance',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=50)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to analyze_class_imbalance did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 0.13 seconds
memory after test: 0.06 MB
memory peak during test: 1.51 MB
Passed! Please submit.
In [19]:
### Run Me!!!
all_student_features = utils.load_object_from_publicdata('all_student_features')
all_submission_labels = utils.load_object_from_publicdata('all_submission_labels')

Introduction: Exercise 7

Not all behavioral metrics are equally informative. In the following exercise, you will compute correlations between VLE engagement features and submission labels to show which patterns have the strongest predictive power.

By identifying which features correlate most with submission outcomes, we can focus on the most relevant behavioral signals. This step is crucial for building an effective predictive model.

Exercise 7: (3 points)

compute_feature_correlations

Your task: define compute_feature_correlations as follows:

Compute Pearson correlation coefficients between numerical features and binary outcome labels. This analysis identifies which features have the strongest linear relationships with the outcome, guiding feature selection for predictive modeling.

Inputs:

  • features_df (DataFrame): Student features with id_student as the index
  • outcome_series (Series): Binary outcomes (0 or 1) with id_student as the index
  • numerical_features (list): List of column names for numerical features to analyze

Return: A pandas DataFrame with three columns:

  • feature_name: Name of the feature (string)
  • correlation: Pearson correlation coefficient with the outcome (float, rounded to 4 decimal places)
  • abs_correlation: Absolute value of the correlation (float, rounded to 4 decimal places)

The DataFrame should be sorted by abs_correlation with the strongest correlations first.

Requirements/steps:

  1. Align features_df and outcome_series on the intersection of their shared id_student index
  2. For each feature in numerical_features:
    • Extract the feature values as a Series
    • Exclude NaN values
    • If the feature has at least 2 remaining values and their standard deviation is positive:
      • Calculate the Pearson correlation coefficient of the feature with outcome_series
      • Calculate the absolute value of the correlation
      • Round both the correlation and absolute value of the correlation to 4 decimal places
  3. Create a DataFrame with feature_name, correlation, and abs_correlation as specified above
  4. Sort the DataFrame by abs_correlation in descending order
  5. Return the sorted DataFrame

Hint: Use pandas .corr() method to calculate Pearson correlations.
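As a tiny illustration of the hint (toy values): Series.corr computes the Pearson coefficient by default and aligns the two Series on their shared index:

```python
import pandas as pd

x = pd.Series([1, 2, 3, 4], index=[10, 11, 12, 13])  # toy feature values
y = pd.Series([0, 0, 1, 1], index=[10, 11, 12, 13])  # toy binary outcome

print(round(x.corr(y), 4))  # 0.8944 (Pearson by default)
```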

In [20]:
### Solution - Exercise 7  
def compute_feature_correlations(features_df, outcome_series, numerical_features):
    ###
    ### YOUR CODE HERE
    ###
    ### BEGIN SOLUTION
    common_index = features_df.index.intersection(outcome_series.index)
    aligned_features = features_df.loc[common_index]
    aligned_outcome = outcome_series.loc[common_index]
    
    correlations = []
    
    for feature in numerical_features:
        if feature not in aligned_features.columns:
            continue
        
        feature_data = aligned_features[feature]
        
        valid_mask = feature_data.notna() & aligned_outcome.notna()
        feature_clean = feature_data[valid_mask]
        outcome_clean = aligned_outcome[valid_mask]
        
        if len(feature_clean) > 1 and feature_clean.std() > 0:
            corr = feature_clean.corr(outcome_clean)
            correlations.append({
                'feature_name': feature,
                'correlation': round(corr, 4),
                'abs_correlation': round(abs(corr), 4)
            })
    
    result_df = pd.DataFrame(correlations)
    result_df = result_df.sort_values('abs_correlation', ascending=False).reset_index(drop=True)
    
    return result_df
    ### END SOLUTION
    
### Demo function call
demo_features = pd.DataFrame({
    'num_of_prev_attempts': [0, 1, 0, 2, 0, 1, 3, 0, 1, 0],
    'days_registered_before_start': [-50, -30, -20, -60, -10, -45, -25, -15, -35, -5],
    'total_clicks': [120, 80, 150, 45, 200, 90, 60, 180, 70, 160],
    'total_active_days': [15, 10, 18, 5, 20, 12, 8, 19, 9, 17],
    'avg_clicks_per_day': [8.0, 8.0, 8.3, 9.0, 10.0, 7.5, 7.5, 9.5, 7.8, 9.4]
}, index=pd.Index(range(11391, 11401), name='id_student'))

demo_labels = pd.Series(
    [1, 0, 1, 0, 1, 1, 0, 1, 0, 1],
    index=pd.Index(range(11396, 11406), name='id_student')
)

demo_numerical_features = [
    'num_of_prev_attempts',
    'days_registered_before_start',
    'total_clicks',
    'total_active_days',
    'avg_clicks_per_day'
]

demo_result_compute_feature_correlations = compute_feature_correlations(
    demo_features, 
    demo_labels, 
    demo_numerical_features
)

demo_result_compute_feature_correlations
Out[20]:
feature_name correlation abs_correlation
0 total_active_days 0.8474 0.8474
1 total_clicks 0.7873 0.7873
2 num_of_prev_attempts -0.7454 0.7454
3 avg_clicks_per_day 0.6168 0.6168
4 days_registered_before_start 0.2887 0.2887

The demo should display this output.

feature_name correlation abs_correlation
0 total_active_days 0.8474 0.8474
1 total_clicks 0.7873 0.7873
2 num_of_prev_attempts -0.7454 0.7454
3 avg_clicks_per_day 0.6168 0.6168
4 days_registered_before_start 0.2887 0.2887


The cell below will test your solution for compute_feature_correlations (exercise 7). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [21]:
### Test Cell - Exercise 7  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=plugins.sqlite_blocker(compute_feature_correlations),
              ex_name='compute_feature_correlations',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=50)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to compute_feature_correlations did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 1.03 seconds
memory after test: 0.11 MB
memory peak during test: 3.41 MB
Passed! Please submit.
In [22]:
### Run Me!!!
all_students = utils.load_object_from_publicdata('all_students')
all_student_assessments = utils.load_object_from_publicdata('all_student_assessments')

Introduction: Exercise 8

In the following exercise, you will implement the core Ouroboros temporal windowing concept. Given a current_day and the cutoff_day for A1, we must split students into training and testing sets based on their submission timing.

  • Training set: Students who already submitted A1 by current_day. Their features (computed from behavior before they submitted) and their submission timing (early vs. late) provide labeled examples for learning.

  • Testing set: Students who haven't submitted yet. These are the at-risk candidates we want to identify. We compute their current behavioral features and predict whether they will eventually submit.

This is "self-learning": recent submitters train the model to recognize patterns that predict outcomes for students still enrolled. As each day passes, more students move from the test set to the training set, continuously improving our predictions.

This is a DEBUG EXERCISE. We have provided the code and requirements. The task is to identify and fix any issues to ensure the function works as intended.


Example: demo case (current_day=15, cutoff_day=25)

Day:   0────────[current_day=15]───────────[cutoff_day=25]──>

Student  Submission day                                         Label
──────── ──────────────────────────────────────────────────    ──────────
S1       ●(5)          |                          |            train = 1
S2         ●──(10)     |                          |            train = 1
S3           ●──(12)   |                          |            train = 1
S4                     |                          |            test  = 0  (never)
S5                     |                          |            test  = 0  (never)
S6                     |     ●─(18)               |            test  = 1
S7                     |              ●──────(25) |            test  = 1
S8                     |                          |  ●─(30)    test  = 0  (past cutoff)
S9                     |                          |            test  = 0  (never)
S10                    |                          |            test  = 0  (never)
       ◄──── TRAIN ────►◄────── TEST WINDOW ──────►
  • ● marks the day a student submitted A1; an empty timeline means no submission yet as of that day
  • Students who submitted on or before current_day → TRAIN (train_labels)
  • All other students → TEST (test_labels); label = 1 only if the submission falls within the test window (current_day, cutoff_day]
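The split shown in the diagram can be reproduced with boolean masks on a toy Series of submission days (values taken from the diagram; None/NaN means the student never submitted):

```python
import pandas as pd

sub_day = pd.Series({'S1': 5, 'S2': 10, 'S3': 12, 'S4': None, 'S5': None,
                     'S6': 18, 'S7': 25, 'S8': 30, 'S9': None, 'S10': None})
current_day, cutoff_day = 15, 25

# Students who already submitted form the training population.
train_mask = sub_day <= current_day           # NaN compares as False
train_students = list(sub_day.index[train_mask])

# Everyone else is in the test set; label = 1 iff they submit in the window.
test_days = sub_day[~train_mask]
test_labels = ((test_days > current_day) & (test_days <= cutoff_day)).astype(int)

print(train_students)          # ['S1', 'S2', 'S3']
print(test_labels.to_dict())   # S6 and S7 get 1; S8 (past cutoff) and the rest get 0
```

Note how S8 lands in the test set with label 0: a submission after cutoff_day counts the same as no submission at all.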

Exercise 8: (3 points)

DEBUG_create_temporal_split

Your task: define DEBUG_create_temporal_split as follows:

Implement the temporal windowing methodology to create train/test splits for self-learning prediction.

Note: This is a debugging exercise!

Inputs:

  • students_df (DataFrame): Student records with columns: id_student, code_module, code_presentation
  • student_assessments_df (DataFrame): Assessment submissions with columns: id_student, id_assessment, date_submitted, code_module, code_presentation
  • first_assessment_id (int): ID of the first assessment to predict
  • current_day (int): Current day for making predictions
  • cutoff_day (int): Deadline for assessment submission

Return: A tuple containing two pandas Series: (train_labels, test_labels)

  • train_labels: Series indexed by id_student for all course students
    • Value = 1 if the student submitted between day 0 and current_day (inclusive)
    • Value = 0 if the student did not submit by current_day
  • test_labels: Series indexed by id_student
    • Only includes students who had NOT submitted by current_day (i.e., students with train_labels == 0)
    • Value = 1 if the student submitted after current_day and by cutoff_day
    • Value = 0 if the student never submitted or submitted after cutoff_day

Requirements/steps:

  1. This is a debugging exercise! You are free to delete the provided code and rewrite.
  2. Bugs exist in the function provided below. Identify and fix them all.
  3. Extract code_module and code_presentation from students_df first row. Filter both students_df and student_assessments_df to keep only rows where code_module and code_presentation match these extracted values.
  4. Filter student_assessments_df to only include records where id_assessment equals first_assessment_id
  5. Create training labels for all course students:
    • From course_assessments (step 4 result), identify students who submitted by current_day (inclusive): date_submitted <= current_day
    • Using all students from course_students (step 3 result), assign label = 1 if they appear in the submitted set, 0 otherwise
    • Index the Series by id_student and set index.name = 'id_student'
  6. Create test labels for students who have not yet submitted:
    • Start from the subset of course_students whose id_student is NOT in the submitted-by-current set (i.e., students with train_labels == 0); this subset defines both the test population and the test Series index
    • Among these students, identify those who submitted after current_day and by cutoff_day (inclusive): current_day < date_submitted <= cutoff_day
    • Assign label = 1 for students who submitted in this window, 0 for students who never submitted or submitted after cutoff_day
    • Index the Series by id_student and set index.name = 'id_student'
  7. Ensure both Series contain only integer values (0 or 1)
  8. The train and test sets are mutually exclusive by construction: test_labels contains only students with train_labels == 0
  9. Return the tuple (train_labels, test_labels)
In [23]:
### Solution - Exercise 8  
def DEBUG_create_temporal_split(students_df, student_assessments_df,
                                 first_assessment_id, current_day, cutoff_day):
    ###
    ### YOUR CODE HERE
    ###
#     target_module = students_df['code_module'].iloc[0]
#     target_presentation = students_df['code_presentation'].iloc[0]

#     course_students = students_df.copy()

#     course_assessments = student_assessments_df[
#         (student_assessments_df['code_module'] == target_module) &
#         (student_assessments_df['code_presentation'] == target_presentation) &
#         (student_assessments_df['id_assessment'] == first_assessment_id)
#     ].copy()

#     submitted_by_current = course_assessments[
#         course_assessments['date_submitted'] < current_day
#     ]['id_student'].unique()

#     submitted_after_current = course_assessments[
#         course_assessments['date_submitted'] > current_day
#     ]['id_student'].unique()

#     train_labels = pd.Series(
#         course_students['id_student'].isin(submitted_by_current).astype(int).values,
#         index=course_students['id_student']
#     )

#     test_labels = pd.Series(
#         course_students['id_student'].isin(submitted_after_current).astype(int).values,
#         index=course_students['id_student']
#     )
#     test_labels.index.name = 'id_student'

#     return train_labels, test_labels

    ### BEGIN SOLUTION
    target_module = students_df['code_module'].iloc[0]
    target_presentation = students_df['code_presentation'].iloc[0]

    course_students = students_df[
        (students_df['code_module'] == target_module) &
        (students_df['code_presentation'] == target_presentation)
    ].copy()

    course_assessments = student_assessments_df[
        (student_assessments_df['code_module'] == target_module) &
        (student_assessments_df['code_presentation'] == target_presentation) &
        (student_assessments_df['id_assessment'] == first_assessment_id)
    ].copy()

    submitted_by_current = course_assessments[
        course_assessments['date_submitted'] <= current_day
    ]['id_student'].unique()

    submitted_after_current = course_assessments[
        (course_assessments['date_submitted'] > current_day) &
        (course_assessments['date_submitted'] <= cutoff_day)
    ]['id_student'].unique()

    train_labels = pd.Series(
        course_students['id_student'].isin(submitted_by_current).astype(int).values,
        index=course_students['id_student']
    )
    train_labels.index.name = 'id_student'

    students_not_submitted_by_current = course_students[
        ~course_students['id_student'].isin(submitted_by_current)
    ]

    test_labels = pd.Series(
        students_not_submitted_by_current['id_student'].isin(submitted_after_current).astype(int).values,
        index=students_not_submitted_by_current['id_student']
    )
    test_labels.index.name = 'id_student'

    return train_labels, test_labels
    ### END SOLUTION

### Demo function call
demo_students = pd.DataFrame({
    'id_student': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'code_module': ['AAA'] * 10,
    'code_presentation': ['2013J'] * 10
})

demo_student_assessments = pd.DataFrame({
    'id_student': [1, 2, 3, 6, 7, 8],
    'id_assessment': [1001] * 6,
    'date_submitted': [5, 10, 12, 18, 25, 30],
    'code_module': ['AAA'] * 6,
    'code_presentation': ['2013J'] * 6
})

first_assessment_id = 1001
current_day = 15
cutoff_day = 25

train_labels, test_labels = DEBUG_create_temporal_split(
    demo_students, demo_student_assessments,
    first_assessment_id, current_day, cutoff_day
)

print('Training set:')
pprint(train_labels)
print('\nTesting set:')
pprint(test_labels)
Training set:
id_student
1     1
2     1
3     1
4     0
5     0
6     0
7     0
8     0
9     0
10    0
dtype: int64

Testing set:
id_student
4     0
5     0
6     1
7     1
8     0
9     0
10    0
dtype: int64

The demo should display this printed output.

Training set:
id_student
1     1
2     1
3     1
4     0
5     0
6     0
7     0
8     0
9     0
10    0
dtype: int64

Testing set:
id_student
4     0
5     0
6     1
7     1
8     0
9     0
10    0
dtype: int64


The cell below will test your solution for DEBUG_create_temporal_split (exercise 8). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [24]:
### Test Cell - Exercise 8  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=DEBUG_create_temporal_split,
              ex_name='DEBUG_create_temporal_split',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=40)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to DEBUG_create_temporal_split did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 5.49 seconds
memory after test: 0.40 MB
memory peak during test: 163.16 MB
Passed! Please submit.

Introduction: Exercise 9

Now we need to evaluate predictions on this imbalanced dataset. However, standard accuracy is misleading when 90% of students submit A1: a naive classifier that always predicts "will submit" achieves 90% accuracy but cannot help identify at-risk students.

Instead, you will calculate precision, recall, and F1-score using a 0.5 probability threshold. These metrics balance the tradeoff between identifying at-risk students and avoiding false alarms. The F1-score combines precision and recall, providing a single metric that reflects model quality on imbalanced classification tasks.

Use the following formulas to compute these metrics:

  • Precision = TP / (TP + FP)
  • Recall = TP / (TP + FN)
  • F1-score = 2 * (Precision * Recall) / (Precision + Recall)
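As a quick worked example of these formulas, suppose a classifier produces 8 true positives, 2 false positives, and 4 false negatives (assumed counts, purely for illustration):

```python
# Assumed confusion-matrix counts for illustration
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)  # 8 / 10 = 0.8
recall = tp / (tp + fn)     # 8 / 12 ~= 0.6667
f1 = 2 * (precision * recall) / (precision + recall)  # ~= 0.7273

print(round(precision, 4), round(recall, 4), round(f1, 4))
# 0.8 0.6667 0.7273
```

Note that F1 sits between precision and recall but closer to the smaller of the two; the harmonic mean penalizes imbalance between them.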

Exercise 9: (1 point)

calculate_metrics

Your task: define calculate_metrics as follows:

Calculate standard binary classification metrics (precision, recall, and F1-score) for predicted probabilities.

Inputs:

  • y_pred_proba (array): Predicted probabilities for the positive class (values between 0.0 and 1.0)
  • y_true (array): True binary labels (0 or 1)

Return: A Python dictionary containing three metrics:

  • 'precision': Proportion of positive predictions that are correct (float, rounded to 4 decimal places)
  • 'recall': Proportion of actual positives correctly identified (float, rounded to 4 decimal places)
  • 'f1_score': Harmonic mean of precision and recall (float, rounded to 4 decimal places)

Requirements/steps:

  1. Convert predicted probabilities to binary predictions using a threshold of 0.5:
    • Prediction = 1 if y_pred_proba >= 0.5
    • Prediction = 0 if y_pred_proba < 0.5
  2. Count the confusion-matrix entries using the following logic:
    • True Positives: prediction == 1 and actual == 1
    • False Positives: prediction == 1 and actual == 0
    • True Negatives: prediction == 0 and actual == 0
    • False Negatives: prediction == 0 and actual == 1
  3. Calculate precision:
    • precision = True Positives / (True Positives + False Positives)
    • If there are no positive predictions, precision should be 0.0
  4. Calculate recall:
    • recall = True Positives / (True Positives + False Negatives)
    • If there are no actual positives, recall should be 0.0
  5. Calculate F1-score:
    • f1_score = 2 * (precision * recall) / (precision + recall)
    • If both precision and recall are 0, F1-score should be 0.0
  6. Round all three metrics to 4 decimal places
  7. Return the metrics in a dictionary with the exact keys specified above

Hint: The metric should be 0.0 when division by zero occurs.

In [25]:
### Solution - Exercise 9  
def calculate_metrics(y_pred_proba, y_true):
    ###
    ### YOUR CODE HERE
    ###
    ### BEGIN SOLUTION
    y_pred_proba = np.array(y_pred_proba)
    y_true = np.array(y_true)
    
    y_pred = (y_pred_proba >= 0.5).astype(int)
    
    true_positives = np.sum((y_pred == 1) & (y_true == 1))
    false_positives = np.sum((y_pred == 1) & (y_true == 0))
    true_negatives = np.sum((y_pred == 0) & (y_true == 0))
    false_negatives = np.sum((y_pred == 0) & (y_true == 1))
    
    if true_positives + false_positives > 0:
        precision = true_positives / (true_positives + false_positives)
    else:
        precision = 0.0
    
    if true_positives + false_negatives > 0:
        recall = true_positives / (true_positives + false_negatives)
    else:
        recall = 0.0
    
    if precision + recall > 0:
        f1_score = 2 * (precision * recall) / (precision + recall)
    else:
        f1_score = 0.0
    
    return {
        'precision': round(precision, 4),
        'recall': round(recall, 4),
        'f1_score': round(f1_score, 4)
    }
    ### END SOLUTION
    
### Demo function call
from pprint import pprint

demo_y_pred_proba = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.45, 0.3, 0.2, 0.1, 0.05])
demo_y_true = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0, 0])

result = calculate_metrics(demo_y_pred_proba, demo_y_true)

pprint(result)
{'f1_score': 0.8, 'precision': 0.8, 'recall': 0.8}

The demo should display this printed output.

{'f1_score': 0.8, 'precision': 0.8, 'recall': 0.8}


The cell below will test your solution for calculate_metrics (exercise 9). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [26]:
### Test Cell - Exercise 9  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=calculate_metrics,
              ex_name='calculate_metrics',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=50)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to calculate_metrics did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 0.09 seconds
memory after test: 0.04 MB
memory peak during test: 1.45 MB
Passed! Please submit.

Introduction: Exercise 10

Universities can't provide intensive support to every student; limited resources force them to prioritize. We need to answer the question, "If we can only intervene with a certain percentage of the students we are most confident are at-risk, how accurate are our top predictions?"

Top-K Precision, which you will implement in the following exercise, answers this question. With K=10%, we rank all students by predicted risk and evaluate only the top 10%. This will allow student support teams to make the most of their limited resources.

This is a DEBUG EXERCISE. We have provided the code and requirements. The task is to identify and fix any issues to ensure the function works as intended.

Exercise 10: (1 point)

DEBUG_calculate_topk_precision

Your task: define DEBUG_calculate_topk_precision as follows:

Calculate precision among the top K% of predictions ranked by predicted probability.

Note: This is a debugging exercise!

Inputs:

  • y_pred_proba (array): Predicted probabilities for the positive class (values between 0.0 and 1.0)
  • y_true (array): True binary labels (0 or 1)
  • k_percent (float): Percentage of top predictions to consider (e.g., 0.10 for top 10%, 0.25 for top 25%)

Return: A float representing the top-K precision, rounded to 2 decimal places.

Requirements/steps:

  1. This is a debugging exercise! You are free to delete the provided code and rewrite.
  2. Bugs exist in the function provided below. Identify and fix them all.
  3. Calculate the number of samples to include: k = ceil(k_percent * total_number_of_samples)
  4. Sort all predictions by y_pred_proba with the highest probabilities first
  5. Select the top K predictions and their corresponding true labels
  6. Calculate precision among these top K predictions:
    • top_k_precision = number_of_true_positives_in_top_k / k
  7. Round the result to 2 decimal places
  8. Return the rounded top-K precision as a float
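The ranking step above can be sketched on toy arrays (illustrative data only). One detail worth keeping in mind: `np.argsort` sorts ascending, so the index order must be reversed to rank the highest probabilities first.

```python
import numpy as np

# Toy predictions and labels, purely for illustration
proba = np.array([0.2, 0.9, 0.6, 0.4])
labels = np.array([0, 1, 1, 0])

k = int(np.ceil(0.5 * len(proba)))     # top 50% of 4 samples -> k = 2
top_idx = np.argsort(proba)[::-1][:k]  # indices of the k highest probabilities
topk_prec = labels[top_idx].sum() / k  # fraction of true positives in the top k

print(top_idx.tolist(), topk_prec)     # [1, 2] 1.0
```

Here the two highest-scoring samples (probabilities 0.9 and 0.6) are both actual positives, so the top-K precision is 1.0.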
In [27]:
### Solution - Exercise 10  
def DEBUG_calculate_topk_precision(y_pred_proba, y_true, k_percent):
    ###
    ### YOUR CODE HERE
    ###
#     y_pred_proba = np.array(y_pred_proba)
#     y_true = np.array(y_true)

#     k = int(len(y_pred_proba) * k_percent)

#     if k < 1:
#         k = 1

#     sorted_indices = np.argsort(y_pred_proba)

#     top_k_indices = sorted_indices[:k]

#     top_k_labels = y_true[top_k_indices]

#     precision = np.sum(top_k_labels) / k

#     return precision

    ### BEGIN SOLUTION
    y_pred_proba = np.array(y_pred_proba)
    y_true = np.array(y_true)

    k = int(np.ceil(len(y_pred_proba) * k_percent))

    if k < 1:
        k = 1

    sorted_indices = np.argsort(y_pred_proba)[::-1]

    top_k_indices = sorted_indices[:k]

    top_k_labels = y_true[top_k_indices]

    precision = np.sum(top_k_labels) / k

    return round(precision, 2)
    ### END SOLUTION
    
### Demo function call
demo_y_pred_proba = np.array([0.95, 0.88, 0.82, 0.76, 0.65, 0.58, 0.45, 0.32, 0.21, 0.15,
                               0.92, 0.84, 0.73, 0.61, 0.54, 0.48, 0.38, 0.28, 0.18, 0.09])
demo_y_true = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0, 0,
                        1, 1, 1, 0, 0, 1, 0, 0, 0, 0])
demo_k_percent = 0.10

result = DEBUG_calculate_topk_precision(demo_y_pred_proba, demo_y_true, demo_k_percent)

print(f"Top-K Precision (K={demo_k_percent*100}%): {result}")
Top-K Precision (K=10.0%): 1.0

The demo should display this printed output.

Top-K Precision (K=10.0%): 1.0


The cell below will test your solution for DEBUG_calculate_topk_precision (exercise 10). The testing variables will be available for debugging under the following names in a dictionary format.

  • input_vars - Input variables for your solution.
  • original_input_vars - Copy of input variables from prior to running your solution. Any key:value pair in original_input_vars should also exist in input_vars - otherwise the inputs were modified by your solution.
  • returned_output_vars - Outputs returned by your solution.
  • true_output_vars - The expected output. This should "match" returned_output_vars based on the question requirements - otherwise, your solution is not returning the correct output.
In [28]:
### Test Cell - Exercise 10  


from cse6040_devkit.tester_fw.testers import Tester
from yaml import safe_load
from time import time

tracemalloc.start()
mem_start, peak_start = tracemalloc.get_traced_memory()
print(f"initial memory usage: {mem_start/1024/1024:.2f} MB")

# Load testing utility
with open('resource/asnlib/publicdata/execute_tests', 'rb') as f:
    executor = dill.load(f)

@run_with_timeout(error_threshold=200.0, warning_threshold=100.0)
@suppress_stdout
def execute_tests(**kwargs):
    return executor(**kwargs)


# Execute test
start_time = time()
passed, test_case_vars, e = execute_tests(func=DEBUG_calculate_topk_precision,
              ex_name='DEBUG_calculate_topk_precision',
              key=b'lHBM3PNfS53dAewPCes_bZJ3D8A_5sWq911kmfTrzK0=', 
              n_iter=50)
# Assign test case vars for debugging
input_vars, original_input_vars, returned_output_vars, true_output_vars = test_case_vars
duration = time() - start_time
print(f"Test duration: {duration:.2f} seconds")
current_memory, peak_memory = tracemalloc.get_traced_memory()
print(f"memory after test: {current_memory/1024/1024:.2f} MB")
print(f"memory peak during test: {peak_memory/1024/1024:.2f} MB")
tracemalloc.stop()
if e: raise e
assert passed, 'The solution to DEBUG_calculate_topk_precision did not pass the test.'

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print('Passed! Please submit.')
initial memory usage: 0.00 MB
Test duration: 0.08 seconds
memory after test: 0.04 MB
memory peak during test: 1.63 MB
Passed! Please submit.