End-to-end machine learning pipeline to predict customer churn in telecommunications — from raw IBM Telco data to a production-ready XGBoost model with SHAP explainability.
7,032 customer records after cleaning, spanning 21 features across demographics, services subscribed, billing details, and contract type.
Thematic analyses revealing behavioral, structural, and financial drivers of churn — each with a direct business implication.
Domain-driven features encoding business logic that raw columns cannot capture — consistently top SHAP contributors in the final model.
11 model variants with systematic hyperparameter tuning, class imbalance handling, and threshold optimization.
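One common way to handle class imbalance with XGBoost is to weight the positive (churn) class by the ratio of negatives to positives and pass that value as the `scale_pos_weight` parameter. The sketch below only computes the ratio. The labels are made up for illustration, and the source does not say which imbalance technique each model variant actually used.

```python
from collections import Counter


def pos_weight(labels):
    """Return negatives / positives, a common heuristic for scale_pos_weight."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive (churn) examples in labels")
    return counts[0] / counts[1]


# Hypothetical toy labels: 1 = churned, 0 = stayed.
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0]
weight = pos_weight(toy_labels)
print(weight)  # 3.0 (six non-churners for every two churners)
```

The resulting value could then be supplied when constructing the classifier, for example `XGBClassifier(scale_pos_weight=weight)`, assuming that is the imbalance strategy in use.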
Recall on the minority (Churn) class is the primary metric, because each missed churner costs that customer's full lifetime value.
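To make the link between recall and threshold optimization concrete, here is a minimal sketch that computes churn-class recall at a given decision threshold and picks the highest candidate threshold that still meets a recall target. The function names, the toy labels, and the toy probabilities are illustrative assumptions, not values from the project.

```python
def recall_at(threshold, y_true, y_prob):
    """Recall on the positive (churn) class when predicting churn if p >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


def pick_threshold(y_true, y_prob, min_recall, candidates):
    """Highest candidate threshold whose recall is at least min_recall, else None."""
    ok = [t for t in candidates if recall_at(t, y_true, y_prob) >= min_recall]
    return max(ok) if ok else None


# Hypothetical toy data: 1 = churned, 0 = stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.35, 0.4, 0.1, 0.8, 0.55]

print(recall_at(0.5, y_true, y_prob))  # 0.75 (the churner scored 0.35 is missed)
print(pick_threshold(y_true, y_prob, 1.0, [0.3, 0.4, 0.5, 0.6]))  # 0.3
```

Lowering the threshold catches more churners at the price of more false alarms, so in practice the recall target would be weighed against precision or retention-offer cost.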
SHapley Additive exPlanations on the best XGBoost model — global importance plus a live individual customer churn predictor.
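The idea behind SHAP can be shown with an exact Shapley computation on a tiny hand-written scoring function: each feature's contribution is its average marginal effect over all orderings in which features are switched from a baseline to the customer's values. This is a sketch of the underlying concept only. The `toy_model`, its feature names, and its coefficients are hypothetical, and the `shap` library uses a much faster tree-specific algorithm on the real XGBoost model.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)  # start from the baseline customer
        prev = model(current)
        for name in order:
            current[name] = x[name]  # switch one feature to the customer's value
            value = model(current)
            totals[name] += value - prev
            prev = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(f):
    """Hypothetical churn score with an interaction term (not the project's model)."""
    return (10 + 30 * f["month_to_month"] + 20 * f["short_tenure"]
            + 10 * f["month_to_month"] * f["short_tenure"])


customer = {"month_to_month": 1, "short_tenure": 1}
baseline = {"month_to_month": 0, "short_tenure": 0}

phi = shapley_values(toy_model, customer, baseline)
print(phi)  # {'month_to_month': 35.0, 'short_tenure': 25.0}
# Additivity: contributions sum to model(customer) - model(baseline) = 70 - 10.
print(sum(phi.values()))  # 60.0
```

The interaction term of 10 is split evenly between the two features, which is why the contributions are 35 and 25 rather than 30 and 20.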
Adjust customer attributes and watch the churn probability update in real time.
Directly derived from model findings — ranked by estimated business impact and ease of implementation.