Supervised Learning in R:
Using Deep Learning and Machine Learning Techniques to Predict Law School Admissions.
For this data science final project, I trained, tuned, and tested neural networks and random forest models on a dataset of over 400,000 law school admissions observations. The response variable of interest was the admissions outcome, i.e. whether the applicant was accepted, rejected, or waitlisted. I used over 30 predictor variables, the most influential being LSAT Score, High School GPA, and Acceptance Rate of School.
The entire report can be downloaded and viewed here.
Predictive Modeling in R:
Using NBA Rookie Performance to Predict Career Measures with Both Regression and Classification Modeling Techniques.
For this data science final project, I used rookie-year statistics (found in these datasets) to predict career measures. The response variables of interest were Career VORP (overall career performance) and whether a player stayed in the league for five years (a binary TRUE or FALSE outcome). Because the first response is continuous and the second is binary, I used both regression and classification modeling techniques, and I evaluated the 40+ models in the final set using a validation set.
The entire report can be downloaded and viewed here.
Data Exploration in R:
Examining Positional Shot Data for the 2014 World Cup to Understand Goal Likelihood and Shot Direction Relative to Shot Position.
For this data science final project, I performed an exploratory data analysis on a positional soccer dataset. The dataset, available here, contained information on each shot from the 2012 Champions League competition. The data cleaning, the analysis, and all of the visuals were done in R.
The entire report can be downloaded and viewed here.