Python/Jupyter Notebook

Profitable App Profiles for the App Store and Google Play Markets

In this project, I analyzed datasets from the Apple App Store and Google Play to identify which types of free apps are most likely to attract large user bases. I cleaned and prepared the data by removing corrupted entries, handling duplicates, filtering non-English and non-free apps, and standardizing categories. I then used frequency tables and engagement metrics (ratings, installs) to evaluate which genres show the strongest demand.
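
To illustrate the filtering and frequency-table steps, here is a minimal sketch using only the standard library. It is not the notebook's actual code: the sample rows, the `is_english` heuristic (allowing up to three non-ASCII characters), and the function names are all assumptions made for the example.

```python
from collections import Counter


def freq_table(rows, column):
    """Return {value: percentage} for one column, most common value first."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {
        value: round(count / total * 100, 2)
        for value, count in counts.most_common()
    }


def is_english(name, max_non_ascii=3):
    """Assumed heuristic: a name counts as English if it contains
    at most `max_non_ascii` characters outside the ASCII range."""
    return sum(ord(ch) > 127 for ch in name) <= max_non_ascii


# Hypothetical rows: (app name, genre, price in USD)
apps = [
    ("Maps Go", "Navigation", 0.0),
    ("Puzzle Fun", "Games", 0.0),
    ("Word Quest", "Games", 0.0),
    ("中文应用商店", "Entertainment", 0.0),
    ("Pro Dictionary", "Reference", 4.99),
]

# Keep only free apps whose names pass the English heuristic.
free_english = [app for app in apps if is_english(app[0]) and app[2] == 0.0]

# Share of each genre (column index 1) among the remaining apps.
genre_share = freq_table(free_english, column=1)
print(genre_share)
```

A percentage table like this shows how crowded a genre is; pairing it with average installs or rating counts per genre is what separates a crowded genre from a popular one.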

My analysis showed that while both markets are saturated with Games, the most promising opportunities come from high-engagement utility categories such as Navigation, Reference, Social Networking, and Music. These genres attract large, consistent user bases while facing far less competition than Games.

Salary Prediction Analysis: Feature Engineering and Linear Regression

I built a full machine learning pipeline to predict annual developer salaries using survey data. This included extensive data cleaning, handling missing values, filtering outliers, and converting multi-select text fields into numeric skill indicators. I engineered features for experience, education, job roles, organizational size, and key technologies, then applied ordinal and one-hot encoding.
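
The encoding steps can be sketched roughly as follows. This is a simplified standard-library illustration, not the project's pipeline: the column names, the semicolon separator, the skill list, and the education ordering are all hypothetical.

```python
def multi_select_to_indicators(value, known_skills, sep=";"):
    """Turn a delimited multi-select answer into 0/1 indicator features."""
    selected = {item.strip() for item in (value or "").split(sep) if item.strip()}
    return {f"skill_{skill}": int(skill in selected) for skill in known_skills}


# Assumed ordering for an ordinal feature (lowest to highest).
EDUCATION_ORDER = ["Secondary", "Bachelor", "Master", "Doctorate"]


def ordinal_encode(value, order):
    """Map a category to its rank in `order`; unknown values become -1."""
    return order.index(value) if value in order else -1


def one_hot(value, categories, prefix):
    """Return one 0/1 feature per category, for unordered variables."""
    return {f"{prefix}_{category}": int(value == category) for category in categories}


# Hypothetical survey response.
row = {"languages": "Python; SQL", "education": "Master", "org_size": "Small"}

features = {}
features.update(multi_select_to_indicators(row["languages"], ["Python", "SQL", "Rust"]))
features["education_level"] = ordinal_encode(row["education"], EDUCATION_ORDER)
features.update(one_hot(row["org_size"], ["Small", "Medium", "Large"], "org"))
print(features)
```

Ordinal encoding suits fields with a natural order, such as education level, while one-hot encoding avoids implying an order for fields that have none.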

After pruning weak predictors, I trained a linear regression model that achieved an R² of about 0.51 and a mean absolute error (MAE) of about $33K. I evaluated performance with predicted-vs-actual plots, residual analysis, and error breakdowns across salary bands and country groups. The project showcases my ability to transform messy real-world data into a structured, interpretable, model-ready dataset.
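
For reference, the two headline metrics and a per-band error breakdown can be computed as in the sketch below. The salary figures are made-up illustration data (they are not the project's results), and the single-threshold banding in `mae_by_band` is an assumed simplification.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mae_by_band(actual, predicted, threshold):
    """MAE computed separately for actual salaries below / at or above a threshold."""
    bands = {"below": [], "at_or_above": []}
    for a, p in zip(actual, predicted):
        key = "below" if a < threshold else "at_or_above"
        bands[key].append(abs(a - p))
    return {band: sum(errs) / len(errs) for band, errs in bands.items() if errs}


# Hypothetical salaries and predictions, for illustration only.
actual = [50_000, 80_000, 120_000, 150_000]
predicted = [60_000, 70_000, 110_000, 170_000]

mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
band_errors = mae_by_band(actual, predicted, threshold=100_000)
print(mae, round(r2, 3), band_errors)
```

Splitting the error by band shows whether a model is systematically worse at one end of the salary range, which a single overall MAE would hide.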