Bishop State Student-Level Dataset - Data Dictionary
File: bishop_state_student_level_with_predictions.csv
Records: 4,000 students (one row per student)
Features: 156 columns (including 22 ML prediction columns)
Purpose: Predictive modeling for student outcomes in education with comprehensive ML predictions
📋 FEATURE CATEGORIES
1. IDENTIFIERS (3 features)

| Feature | Description | Type |
| --- | --- | --- |
| id | Original cohort record ID | Integer |
| Student_GUID | Unique student identifier (PRIMARY KEY) | String |
| ar_id | AR data record ID | Integer |

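
Because `Student_GUID` is documented as the primary key and the file holds one row per student, a quick uniqueness check is a reasonable first step before modeling. The sketch below is only an illustration: it uses a hypothetical three-row in-memory sample in place of the real CSV, and the helper name `duplicate_keys` is an assumption, not part of the dataset tooling.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the real file (4,000 rows, 156 columns).
SAMPLE_CSV = """id,Student_GUID,ar_id
1,guid-a,101
2,guid-b,102
3,guid-b,103
"""


def duplicate_keys(rows, key="Student_GUID"):
    """Return the key values that appear on more than one row, sorted."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
dupes = duplicate_keys(rows)
print(dupes)  # an empty list would mean the key is unique in this sample
```

To run the same check on the actual file, replace `io.StringIO(SAMPLE_CSV)` with an open handle to `bishop_state_student_level_with_predictions.csv`.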
2. DEMOGRAPHICS (10 features)

| Feature | Description | Values/Type |
| --- | --- | --- |
| Student_Age | Age category at enrollment | "20 and younger", ">20 - 24", "Older than 24" |
| Race | Student's race | "White", "Black or African American", "Asian", "Hispanic", etc. |
| Ethnicity | Hispanic/Latino ethnicity | "H" (Hispanic), "N" (Non-Hispanic) |
| Gender | Student's gender | "M", "F" |
| First_Gen | First generation student status (cohort) | "A", "B", "C", "N", "P", "UK" |
| NASPA_First_Generation | NASPA definition of first-gen (cohort) | -1, 0, 1, etc. |
| ar_naspa_first_gen | NASPA first-gen (AR data) | -1, 0, 1, etc. |
| Incarcerated_Status | Incarceration status | "Y", "N" |
| Military_Status | Military service status | 0, 1, 2, 3, etc. |
| Disability_Status | Disability status | "Y", "N" |

3. ENROLLMENT CHARACTERISTICS (10 features)

| Feature | Description | Values/Type |
| --- | --- | --- |
| Institution_ID | Institution identifier | Integer |
| Cohort | Cohort year | "2019-20", "2018-19", etc. |
| Cohort_Term | Initial enrollment term | "FALL", "SPRING", "SUMMER" |
| Enrollment_Type | Type of enrollment | "First-Time", "Transfer-In" |
| Enrollment_Intensity_First_Term | First term intensity | "Full-Time", "Part-Time" |
| Dual_and_Summer_Enrollment | Dual enrollment or summer start | "DE", "SE", etc. |
| Pell_Status_First_Year | Pell grant recipient | "Y", "N", "UK" |
| Attendance_Status_Term_1 | Term 1 attendance | "First-Time Full-Time", "Transfer-In Full-Time", etc. |
| Special_Program | Participation in special programs | "Bridge Program", etc. |
| Employment_Status | Employment status | -1, 0, 1, 2, 3, 4 |

4. ACADEMIC PREPARATION (4 features)

| Feature | Description | Values/Type |
| --- | --- | --- |
| Math_Placement | Math placement level | "C" (College-level), "R" (Remedial), "N" (Not placed) |
| English_Placement | English placement level | "C", "R", "N" |
| Reading_Placement | Reading placement level | "C", "R", "N" |
| Foreign_Language_Completion | Foreign language completed | "Y", "N" |

5. PROGRAM INFORMATION (3 features)

| Feature | Description | Values/Type |
| --- | --- | --- |
| Credential_Type_Sought_Year_1 | Initial credential goal | "A" (Associate's), "B" (Bachelor's), "01" (Certificate), etc. |
| Program_of_Study_Term_1 | CIP code for initial program | Float (e.g., 420101.0) |
| Program_of_Study_Year_1 | CIP code for year 1 program | Float |

6. 🎯 ENGINEERED COURSE FEATURES (29 features)
These features are aggregated from course-level data and are KEY PREDICTORS.
Enrollment Metrics

| Feature | Description | Type | Example |
| --- | --- | --- | --- |
| total_courses_enrolled | Total number of courses taken | Integer | 4.4 avg |
| unique_course_prefixes | Number of different subject areas | Integer | e.g., ENG, MAT, HIS |

Credit Metrics

| Feature | Description | Type | Range |
| --- | --- | --- | --- |
| total_credits_attempted | All credits attempted | Float | 0-200+ |
| total_credits_earned | All credits earned | Float | 0-200+ |
| avg_credits_per_course | Average credits per course | Float | 1-6 |
| course_completion_rate | % of attempted credits earned | Float | 0.0-1.0 |

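
As an illustration of how the credit metrics above relate to each other, the following minimal sketch derives them from per-course records. It is not the pipeline that produced the dataset. The input shape (a list of attempted/earned pairs), the function name `credit_metrics`, the use of attempted credits for `avg_credits_per_course`, and the use of `None` when there is nothing to divide by are all assumptions.

```python
def credit_metrics(courses):
    """Aggregate per-course credits into student-level credit features.

    courses: list of (credits_attempted, credits_earned) tuples, one per course.
    This input shape is hypothetical; the source only documents the outputs.
    """
    attempted = sum(a for a, _ in courses)
    earned = sum(e for _, e in courses)
    n = len(courses)
    return {
        "total_credits_attempted": float(attempted),
        "total_credits_earned": float(earned),
        # Assumption: the average is taken over attempted credits.
        "avg_credits_per_course": attempted / n if n else None,
        # Share of attempted credits that were earned, on a 0.0-1.0 scale.
        "course_completion_rate": earned / attempted if attempted else None,
    }


example = credit_metrics([(3, 3), (4, 0), (3, 3), (2, 2)])
print(example)
```

Returning `None` rather than 0 for students with no attempted credits keeps "no data" distinct from "earned nothing", which may matter when imputing values for modeling.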
Grade/Performance Metrics

| Feature | Description | Type | Range |
| --- | --- | --- | --- |
| courses_with_grades | Number of graded courses | Integer | 0-50+ |
| average_grade | Mean GPA across all courses | Float | 0.0-4.0 |
| min_grade | Lowest grade received | Float | 0.0-4.0 |
| max_grade | Highest grade received | Float | 0.0-4.0 |
| grade_std_dev | Grade variability | Float | 0.0-4.0 |
| failing_grades_count | Number of grades < 2.0 | Integer | 0-20+ |
| passing_rate | % of courses passed (≥2.0) | Float | 0.0-1.0 |

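
The grade metrics above can be sketched from a list of grade points on the 0.0-4.0 scale. The 2.0 threshold comes from the table. Everything else here is an assumption made for illustration: ungraded courses are represented as `None`, `passing_rate` is computed over graded courses only, and `grade_std_dev` uses the population standard deviation (the source does not say whether the sample or population form was used).

```python
from statistics import mean, pstdev

PASSING_THRESHOLD = 2.0  # from the dictionary: failing is < 2.0, passing is >= 2.0


def grade_metrics(grade_points):
    """Summarize grade points (0.0-4.0); None marks an ungraded course."""
    graded = [g for g in grade_points if g is not None]
    if not graded:
        return {"courses_with_grades": 0}
    failing = sum(1 for g in graded if g < PASSING_THRESHOLD)
    return {
        "courses_with_grades": len(graded),
        "average_grade": mean(graded),
        "min_grade": min(graded),
        "max_grade": max(graded),
        # Assumption: population standard deviation.
        "grade_std_dev": pstdev(graded),
        "failing_grades_count": failing,
        # Assumption: denominator is graded courses, not all enrolled courses.
        "passing_rate": (len(graded) - failing) / len(graded),
    }


example = grade_metrics([4.0, 3.0, 1.0, None, 2.0])
print(example)
```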
Course Type Metrics

| Feature | Description | Type |
| --- | --- | --- |
| core_courses_taken | Number of core courses | Integer |
| gateway_math_courses | Gateway math courses taken | Integer |
| gateway_english_courses | Gateway English courses taken | Integer |
| corequisite_courses | Co-requisite courses | Integer |

Delivery Method

| Feature | Description | Type |
| --- | --- | --- |
| online_courses | Number of online courses | Integer |
| face_to_face_courses | Number of in-person courses | Integer |
| hybrid_courses | Number of hybrid courses | Integer |
| pct_online | Percentage of courses online | Float (0-100) |

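
Note that `pct_online` is documented on a 0-100 scale, while `course_completion_rate` and `passing_rate` use 0.0-1.0. The sketch below shows one way the delivery counts and the percentage could be derived. The delivery labels `"online"`, `"face_to_face"`, and `"hybrid"` and the function name are hypothetical; the source does not document the raw course-level codes.

```python
from collections import Counter


def delivery_metrics(methods):
    """Count courses by delivery method.

    methods: one label per course. The label strings are hypothetical
    placeholders for whatever codes the course-level data uses.
    """
    counts = Counter(methods)
    total = len(methods)
    return {
        "online_courses": counts["online"],
        "face_to_face_courses": counts["face_to_face"],
        "hybrid_courses": counts["hybrid"],
        # 0-100 scale, matching the dictionary entry for pct_online.
        "pct_online": 100.0 * counts["online"] / total if total else None,
    }


example = delivery_metrics(["online", "hybrid", "face_to_face", "online"])
print(example)
```

If a model expects all rate features on the same scale, dividing `pct_online` by 100 brings it in line with the 0.0-1.0 features.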
Temporal Patterns

| Feature | Description | Type |
| --- | --- | --- |
| unique_academic_years | Years of enrollment | Integer |
| unique_academic_terms | Terms of enrollment | Integer |
| fall_courses | Courses taken in Fall | Integer |
| spring_courses | Courses taken in Spring | Integer |
| summer_courses | Courses taken in Summer | Integer |

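
A minimal sketch of the temporal features follows, assuming each course record carries an academic year and a term label such as "FALL", "SPRING", or "SUMMER" (the same labels listed for `Cohort_Term`). Whether `unique_academic_terms` counts distinct (year, term) pairs, as done here, or only distinct term names is an assumption; the source does not specify.

```python
def temporal_metrics(enrollments):
    """Summarize when a student took courses.

    enrollments: list of (academic_year, term) pairs, one per course.
    This input shape is hypothetical.
    """
    normalized = [(year, term.upper()) for year, term in enrollments]
    terms = [term for _, term in normalized]
    return {
        "unique_academic_years": len({year for year, _ in normalized}),
        # Assumption: a term is identified by its year and its name together.
        "unique_academic_terms": len(set(normalized)),
        "fall_courses": terms.count("FALL"),
        "spring_courses": terms.count("SPRING"),
        "summer_courses": terms.count("SUMMER"),
    }


example = temporal_metrics([
    ("2019-20", "FALL"),
    ("2019-20", "Fall"),
    ("2019-20", "SPRING"),
    ("2020-21", "FALL"),
])
print(example)
```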
Instructor Metrics

| Feature | Description | Type |
| --- | --- | --- |
| courses_with_fulltime_instructors | Courses with full-time faculty | Integer |
| courses_with_parttime_instructors | Courses with part-time faculty | Integer |

Other Institutions

| Feature | Description | Type |
| --- | --- | --- |
| enrolled_other_institutions | Concurrent enrollment count | Integer |

7. COHORT PERFORMANCE METRICS (10 features)

| Feature | Description | Type |
| --- | --- | --- |
| GPA_Group_Term_1 | GPA category for term 1 | Float |
| GPA_Group_Year_1 | GPA category for year 1 | Float |
| Number_of_Credits_Attempted_Year_1 | Year 1 credits attempted | Float |
| Number_of_Credits_Earned_Year_1 | Year 1 credits earned | Float |
| Number_of_Credits_Attempted_Year_2 | Year 2 credits attempted | Float |
| Number_of_Credits_Earned_Year_2 | Year 2 credits earned | Float |
| Number_of_Credits_Attempted_Year_3 | Year 3 credits attempted | Float |
| Number_of_Credits_Earned_Year_3 | Year 3 credits earned | Float |
| Number_of_Credits_Attempted_Year_4 | Year 4 credits attempted | Float |
| Number_of_Credits_Earned_Year_4 | Year 4 credits earned | Float |

8. GATEWAY COURSE COMPLETION (12 features)

| Feature | Description | Values |
| --- | --- | --- |
| Gateway_Math_Status | Math gateway status | "C" (Completed), "R" (Required), "N" (Not required) |