Khulekani Mgenge
Data Scientist Portfolio

Data Scientist well versed in Statistical Analysis, A/B Testing, Machine Learning, SQL, R, Python, and Tableau. @KhulekaniMgenge

TikTok Machine Learning Model

A machine learning model that determines whether a video contains a claim or offers an opinion. With a successful prediction model, TikTok can reduce the backlog of user reports and prioritize them more efficiently.

Regression Model For Predicting Taxi Fares

A regression model for the New York City Taxi and Limousine Commission (TLC) that helps predict taxi fares before the ride, based on data that TLC has gathered. The project focuses on the business need for such a model.

Waze User Churn Classification Model

The goal of this project is to develop a machine learning (ML) model that predicts user churn, where churn quantifies the number of users who have uninstalled the Waze app or stopped using it.

Smart Device Analysis
in R Programming

This project focused on one of Bellabeat's products. I was asked to analyze smart device data to gain insight into how consumers use their smart devices. The report examines the business question: how do consumers use Bellabeat smart devices? The insights from the analysis will help guide the company's marketing strategy.

NBA Web Scraping
using Python

This project scrapes the NBA website to collect data for analysis. I identified the site's HTML structure and extracted the relevant data, applied data cleaning and preprocessing techniques to ensure data quality and integrity, and conducted data exploration and analysis to gain insights from the scraped data. I documented the entire process, including data sources, data collection methods, data cleaning, and analysis results, and provided recommendations for player improvements.
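
As a rough illustration of the extraction and cleaning steps, here is a minimal Python sketch. It parses an inline, made-up HTML table with the standard library rather than the live NBA site. The table layout, the column names, the sample values, and the helper names `TableParser`, `to_float`, and `scrape_table` are all assumptions for illustration, not the project's actual code.

```python
from html.parser import HTMLParser

# Hypothetical HTML fragment standing in for a scraped stats table.
# The real page structure and values are not described here.
SAMPLE_HTML = """
<table id="stats">
  <tr><th>Player</th><th>PTS</th></tr>
  <tr><td> Player A </td><td>27.1</td></tr>
  <tr><td>Player B</td><td>n/a</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text chunks of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Cleaning step: strip stray whitespace around the cell text.
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def to_float(text):
    """Cleaning step: convert numeric text to float, or None if it is not a number."""
    try:
        return float(text)
    except ValueError:
        return None


def scrape_table(html):
    """Return one dict per data row, keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    records = [dict(zip(header, row)) for row in body]
    for record in records:
        record["PTS"] = to_float(record["PTS"])
    return records


records = scrape_table(SAMPLE_HTML)
```

Missing values such as `n/a` become `None` here so that later analysis can decide how to handle them explicitly.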

Data Exploration in SQL

This project demonstrates how the dataset behind the Covid-19 Monitoring dashboard was collected from ourworldindata.org and uploaded to a MySQL database for efficient storage. Data manipulation created the relevant columns, which are used to track COVID-19 cases and calculate the percentage of people infected. SQL queries were used to explore and understand the dataset. A view was created so that the data could be visualized in a business intelligence tool such as Tableau.
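
The percentage calculation and the view can be sketched as follows. This is a minimal example under stated assumptions: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for MySQL, and the table name `covid_cases`, the view name `percent_infected`, the column names, and the sample rows are all made up for illustration rather than taken from the project.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical schema; the real dataset's columns may differ.
conn.execute(
    """
    CREATE TABLE covid_cases (
        location    TEXT,
        report_date TEXT,
        population  INTEGER,
        total_cases INTEGER
    )
    """
)

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO covid_cases VALUES (?, ?, ?, ?)",
    [
        ("Country A", "2021-01-01", 1000, 50),
        ("Country A", "2021-01-02", 1000, 80),
        ("Country B", "2021-01-01", 2000, 10),
    ],
)

# Derived column: percentage of the population infected.
# Multiplying by 100.0 forces floating-point division.
conn.execute(
    """
    CREATE VIEW percent_infected AS
    SELECT location,
           report_date,
           total_cases,
           ROUND(total_cases * 100.0 / population, 2) AS pct_infected
    FROM covid_cases
    """
)

# Exploration query: highest infection percentage recorded per location.
rows = conn.execute(
    """
    SELECT location, MAX(pct_infected)
    FROM percent_infected
    GROUP BY location
    ORDER BY location
    """
).fetchall()
```

A visualization tool can then read from the view directly, so the percentage logic lives in one place instead of being repeated in each chart.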

Tableau Projects

A portfolio of all my Tableau analysis projects, such as the Covid-19 Monitoring Dashboard and an analysis of Chicago Cyclistic bikeshare ride usage by annual members versus casual riders.