RStudio Projects
March Madness Predictor
This project aimed to predict the winner of the 2025 NCAA Men’s March Madness Tournament using a Multiple Linear Regression (MLR) model trained on KenPom performance metrics, including Net Rating, Offensive Rating, Defensive Rating, and Luck. After the data for the Round of 64 teams was preprocessed and feature differences were computed for each matchup, a logistic regression model simulated game outcomes round by round. The model predicted over 85% of games correctly through the Sweet 16 and ultimately picked Duke as the national champion. While the model performed well on performance-based prediction, future enhancements could include simulating randomness and training on historical brackets.
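The matchup step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes a logistic regression fitted on team-A-minus-team-B feature differences, and the column names (net_diff, luck_diff, a_wins) and the simulated data are hypothetical.

```r
# Hypothetical data: all names and numbers are illustrative, not the project's.
set.seed(42)
n <- 200
games <- data.frame(
  net_diff  = rnorm(n, mean = 0, sd = 10),    # Net Rating: team A minus team B
  luck_diff = rnorm(n, mean = 0, sd = 0.05)   # Luck: team A minus team B
)

# Simulate outcomes in which a larger Net Rating edge raises team A's win probability
games$a_wins <- rbinom(n, size = 1, prob = plogis(0.15 * games$net_diff))

# Logistic regression on the matchup differences
fit <- glm(a_wins ~ net_diff + luck_diff, data = games, family = binomial)

# Estimated win probability for a new (hypothetical) matchup
new_game <- data.frame(net_diff = 6.5, luck_diff = -0.01)
p_a <- predict(fit, newdata = new_game, type = "response")

# Advance whichever team has the higher estimated win probability
winner <- if (p_a >= 0.5) "Team A" else "Team B"
```

Repeating this prediction for every game in a round, and feeding the winners into the next round, gives a full bracket. Always advancing the favorite makes the bracket deterministic; drawing each winner with rbinom(1, 1, p_a) instead would be one way to add the randomness mentioned as a future enhancement.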
Predicting Covers with Random Forest and Poisson
In this project, I developed machine-learning and statistical models to predict daily reservation cancellations. After creating a cancellation count variable from expected and actual values, I trained both Random Forest regression and Poisson regression models to estimate future cancellation levels.
I evaluated each model using standard performance metrics (MAE, RMSE) and residual diagnostics, and generated prediction visualizations. The Random Forest model captured nonlinear relationships well, while the Poisson model provided interpretable coefficients for the count-based outcome.
This project highlights skills in feature engineering, model training, comparison, and forecasting.
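As a rough sketch of the Poisson side of this workflow, the example below derives a cancellation count from expected and actual values, fits a Poisson regression with base R, and scores it with MAE and RMSE on held-out days. The data are simulated, and the column names (expected, actual, weekend) are assumptions made for illustration, not the project's variables.

```r
# Hypothetical daily data: names and values are illustrative only.
set.seed(1)
n <- 300
bookings <- data.frame(
  expected = rpois(n, lambda = 120),             # expected covers per day
  weekend  = rbinom(n, size = 1, prob = 2 / 7)   # 1 = weekend day
)

# Simulate actual covers so that they never exceed the expected covers
cancel_rate <- 0.08 + 0.04 * bookings$weekend
bookings$actual <- bookings$expected -
  rbinom(n, size = bookings$expected, prob = cancel_rate)

# Cancellation count derived from expected and actual values
bookings$cancellations <- bookings$expected - bookings$actual

# Simple holdout split: first 240 days for training, last 60 for testing
train <- bookings[1:240, ]
test  <- bookings[241:300, ]

# Poisson regression for the count outcome
fit <- glm(cancellations ~ expected + weekend, data = train, family = poisson)

# Predicted mean cancellations on the held-out days
pred <- predict(fit, newdata = test, type = "response")

# Error metrics on the held-out days
mae  <- mean(abs(test$cancellations - pred))
rmse <- sqrt(mean((test$cancellations - pred)^2))
```

The same mae and rmse calculations can be applied to the predictions of a random forest fitted on the same split, which puts the two models on a common footing for comparison.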

Predicting Future Safety Incidents
This project uses R to build a predictive analytics workflow that analyzes historical data to anticipate workplace safety incidents. It uses multinomial logistic regression and random forest classifiers to estimate the likely severity of an incident from factors such as the date, the weather, whether PPE was worn, the tool used, and the time of day. The workflow covers data preparation, feature engineering, model training, and performance evaluation, and includes a scenario simulator that forecasts the most likely outcome of a future incident.
The project provides interpretable and actionable results by combining variable importance visualizations with confusion matrices. It is intended to support safety planning, risk reduction, and resource allocation in operational settings.
Code Output
🚨 Predicted Next Incident:
Date: 2025-06-20
Time of Day: Morning
Likely Incident Level: A2
Likely Cause: Human Error with a Ladder on a Clear day.
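For illustration only, a scenario prediction of this kind could be sketched with a multinomial logistic regression from the nnet package, a recommended package bundled with standard R installations. The data below are simulated, and the predictor names, factor levels, and severity labels A1 to A3 are assumptions. Only the label A2 appears in the output above, and this sketch does not reproduce that output.

```r
library(nnet)  # provides multinom()

# Hypothetical incident log: names, levels, and probabilities are illustrative.
set.seed(7)
n <- 400
incidents <- data.frame(
  time_of_day = factor(sample(c("Morning", "Afternoon", "Night"), n, replace = TRUE)),
  weather     = factor(sample(c("Clear", "Rain"), n, replace = TRUE)),
  ppe_worn    = factor(sample(c("Yes", "No"), n, replace = TRUE))
)

# Simulated severity: missing PPE shifts probability toward higher levels
with_ppe <- c(A1 = 0.6, A2 = 0.3, A3 = 0.1)
no_ppe   <- c(A1 = 0.3, A2 = 0.4, A3 = 0.3)
incidents$level <- factor(ifelse(
  incidents$ppe_worn == "No",
  sample(names(no_ppe), n, replace = TRUE, prob = no_ppe),
  sample(names(with_ppe), n, replace = TRUE, prob = with_ppe)
))

# Multinomial logistic regression on the categorical predictors
fit <- multinom(level ~ time_of_day + weather + ppe_worn,
                data = incidents, trace = FALSE)

# Confusion matrix on the training data
confusion <- table(actual = incidents$level,
                   predicted = predict(fit, type = "class"))

# Scenario simulator: most likely severity level for one hypothetical scenario
scenario <- data.frame(
  time_of_day = factor("Morning", levels = levels(incidents$time_of_day)),
  weather     = factor("Clear",   levels = levels(incidents$weather)),
  ppe_worn    = factor("No",      levels = levels(incidents$ppe_worn))
)
likely_level <- as.character(predict(fit, newdata = scenario, type = "class"))
```

A random forest classifier could be fitted to the same data frame and queried with the same scenario row, which would allow the two models' predicted levels to be compared side by side.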