
Contents:
- Motivation
- Objectives
- Approach
- Workflow
- Implementation
- Initial Data Analysis
- Churn Analytics
- Data Preparation
- RF Model
- GB Model
- AB Model
- LGBM Model
- Calibration Plots
- Feature Engineering
- Cluster Analysis
- Conclusions
- Related Links
Motivation
Cohort analysis [1-10] is a way to understand customer churn (aka attrition), i.e., the number or percentage of customers who do not purchase additional products or services.
Today’s most successful companies address churn by leveraging Machine Learning (ML) as part of Artificial Intelligence (AI) to build models that accurately predict churn and take action before a customer leaves [1-3]. Companies that are looking for a targeted and effective approach to reducing customer churn would do well to make use of the possibilities that ML/AI has to offer [4].
According to Glassbox [5], AI has the potential to boost rates of profitability by an average of 38% by 2035. In fact, we’ll create more data in the next 3 years than during the past 30 years—making data analysis even more difficult. AI can surface trends and patterns, revealing the big picture behind user behavior.
Key Benefits [4-6]:
- AI helps you reduce customer turnover
- AI enables you to anticipate negative changes in your customers’ behaviour
- Your data exploration tells you exactly which experiences/customers are at risk
- Approach dissatisfied customers in time because AI picks up on alarm signals
- Put an extra effort into loyal customers based upon the 80/20 principle
Objectives
Customer churn is one of the most important and challenging problems for businesses such as credit card companies, cable service providers, SaaS and telecommunication companies worldwide [7].
These problems can be framed as maximizing the rate ratio
max [ (Customer Retention Rate) / (Customer Churn Rate) ]
or, equivalently (since the retention and churn rates sum to 100% over a given period),
max (Customer Retention Rate)
while
min (Customer Churn Rate).
At the end of this study, we’ll be able to answer the following questions [10]:
- Which customers are churning
- Why they’re cancelling
- How to fix the problem
Churn analytics [8] is the process of measuring the rate at which customers quit the product, site, or service. It answers the questions “Are we losing customers?” and “If so, how?” to allow teams to take action. Lower churn rates lead to happier customers, larger margins, and higher profits. To prevent churn, teams must first measure it with analytics.
There are two types of churn. Customer churn is the rate at which you are losing specific customers/accounts. Revenue (aka MRR) churn measures the overall volume of recurring revenue lost in a given period:

Example: Large B2B product company
A large pool toys manufacturer sells its products through 700,000 resellers around the United States. Last month, it added 15,000 resellers and lost 9,000.
Monthly reseller churn rate: 9,000 / 700,000 = 1.3%
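The same arithmetic is trivial to script; here is a minimal sketch in Python (the variable names are only illustrative):
# Churn-rate arithmetic for the reseller example above
resellers_at_start = 700_000
resellers_lost = 9_000
monthly_churn_rate = resellers_lost / resellers_at_start
print(f"Monthly reseller churn rate: {monthly_churn_rate:.1%}")  # ~1.3%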
Approach
Any churn analysis consists of the following three steps [10]:
- Setup Churn Analytics Tools
- Find out why customers are churning
- Analyze churn by cohorts
The idea is to get a feel for which cohorts—or groups of customers with shared characteristics, such as when they subscribed to your product or where they shop—are leaving. Then use surveys and personal outreach to get insights into what’s driving those cohort members away so you can take proactive steps to retain them.
Workflow
- Import and install relevant Python libraries and packages
- Read input dataset as a table and check the content
- Churn Exploratory Data Analysis (EDA)
- Feature Engineering and Impact Analysis
- Train/Test Data Splitting, Sampling and Preparation
- Run Training Classifiers – RF, GB, Ada, and LGBM
- Compare ML performance QC metrics
- Prepare a full classification report
- Apply cluster PCA analysis (optional)
Implementation
Let’s import and/or install the required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib
import warnings
warnings.filterwarnings('ignore')
!pip install lightgbm
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import recall_score
from sklearn.metrics import precision_score
from sklearn.metrics import classification_report, accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_validate
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import plot_confusion_matrix
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import AdaBoostClassifier
from lightgbm import LGBMClassifier
and set the working directory /YOURPATH
import os
os.chdir('YOURPATH')
Let’s read the input dataset
data = pd.read_csv('https://raw.githubusercontent.com/andhikaw789/Telco-Customer-Churn/main/Telco-Customer-Churn.csv')
and check the content
data.head()

data.shape
(7043, 21)
There are 21 columns (features) and 7043 rows (customers).
data['customerID'].duplicated().value_counts()
False    7043
Name: customerID, dtype: int64
There are no duplicate customer IDs.
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7043 entries, 0 to 7042
Data columns (total 21 columns):
 #   Column            Non-Null Count  Dtype
---  ------            --------------  -----
 0   customerID        7043 non-null   object
 1   gender            7043 non-null   object
 2   SeniorCitizen     7043 non-null   int64
 3   Partner           7043 non-null   object
 4   Dependents        7043 non-null   object
 5   tenure            7043 non-null   int64
 6   PhoneService      7043 non-null   object
 7   MultipleLines     7043 non-null   object
 8   InternetService   7043 non-null   object
 9   OnlineSecurity    7043 non-null   object
 10  OnlineBackup      7043 non-null   object
 11  DeviceProtection  7043 non-null   object
 12  TechSupport       7043 non-null   object
 13  StreamingTV       7043 non-null   object
 14  StreamingMovies   7043 non-null   object
 15  Contract          7043 non-null   object
 16  PaperlessBilling  7043 non-null   object
 17  PaymentMethod     7043 non-null   object
 18  MonthlyCharges    7043 non-null   float64
 19  TotalCharges      7043 non-null   object
 20  Churn             7043 non-null   object
dtypes: float64(1), int64(2), object(18)
memory usage: 1.1+ MB
Let’s coerce the unwanted (blank) TotalCharges entries to NaN values
data['TotalCharges'] = pd.to_numeric(data['TotalCharges'], errors='coerce')
data['TotalCharges'].replace(' ', np.nan, inplace=True)
data['TotalCharges'] = data['TotalCharges'].astype(float)
data.isna().sum()
customerID           0
gender               0
SeniorCitizen        0
Partner              0
Dependents           0
tenure               0
PhoneService         0
MultipleLines        0
InternetService      0
OnlineSecurity       0
OnlineBackup         0
DeviceProtection     0
TechSupport          0
StreamingTV          0
StreamingMovies      0
Contract             0
PaperlessBilling     0
PaymentMethod        0
MonthlyCharges       0
TotalCharges        11
Churn                0
dtype: int64
There are only 11 missing values, all in TotalCharges, produced when the blank strings were coerced to NaN above.
Initial Data Analysis
Let’s begin the EDA phase and check the overall churn proportion
yy=data['Churn']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing Churn", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="Churn",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
#plt.show()
plt.savefig('telco_churn_piechart.png')

We can see that there is an unequal distribution of the two Churn classes in the dataset, so we are facing an imbalanced classification problem.
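As a quick numeric check of this imbalance, we can print the class proportions directly (the Churn column still holds the raw 'Yes'/'No' labels at this point):
# Roughly three quarters 'No' vs. one quarter 'Yes'
print(data['Churn'].value_counts(normalize=True))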
Let’s check the gender factor
yy=data['gender']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing Gender", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="Gender",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
#plt.show()
plt.savefig('telco_gender_piechart.png')

The plot shows that there is no gender gap in the dataset – both M and F are equally represented.
Let’s check the Partner feature
yy=data['Partner']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing Partner", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="Partner",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
#plt.show()
plt.savefig('telco_partner_piechart.png')

It is clear that the dataset is well balanced in terms of the partnership proportion.
Let’s check the Dependents factor
yy=data['Dependents']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing Dependents", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="Dependents",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
#plt.show()
plt.savefig('telco_dependents_piechart.png')

It is clear that the dependency ratio is not well balanced in the dataset.
Let’s look at the SeniorCitizen factor
yy=data['SeniorCitizen']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing SeniorCitizen", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="SeniorCitizen",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
#plt.show()
plt.savefig('telco_seniorcitizen_piechart.png')

We can see that senior citizens are underrepresented in the dataset.
Let’s check the PhoneService factor
yy=data['PhoneService']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing PhoneService", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="PhoneService",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
#plt.show()
plt.savefig('telco_phoneservice_piechart.png')

We can see that the share of customers without phone service (PhoneService=No) is small, roughly 10% of the dataset.
Let’s look at the PaymentMethod piechart
yy=data['PaymentMethod']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing PaymentMethod", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="PaymentMethod",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
#plt.show()
plt.savefig('telco_payment_piechart.png')

We can see that electronic check is the most common payment method (roughly a third of customers), while mailed check, bank transfer, and credit card are each represented at about 20-22%.
Let’s consider the PaperlessBilling feature
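The plotting code is not reproduced here; a sketch following the same pattern as the other pie charts (the output filename is only a suggestion) would be:
yy=data['PaperlessBilling']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing PaperlessBilling", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="PaperlessBilling",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
plt.savefig('telco_paperlessbilling_piechart.png')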

This chart shows that PaperlessBilling=Yes is the dominant value in the dataset (close to 60%).
Let’s check the Contract feature (month-to-month, one-year and two-year contracts)
yy=data['Contract']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing Contract", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="Contract",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
#plt.show()
plt.savefig('telco_contract_piechart.png')

As we can see, the month-to-month contract has more than a 55% share in the dataset.
Let’s look at the piechart showing MultipleLines (no phone service and yes/no multiple lines)
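Again, the plotting code is omitted above; a sketch following the same pie-chart pattern (filename is only a suggestion) would be:
yy=data['MultipleLines']
plt.figure(figsize=(10,6))
fig = yy.value_counts(normalize=True).plot.pie(autopct='%1.2f%%')
plt.title("Pie-chart showing MultipleLines", fontdict={'fontsize': 20, 'fontweight': 5, 'color': 'Green'})
fig.legend(title="MultipleLines",
           loc="center left",
           bbox_to_anchor=(1, 0, 0.5, 1))
plt.savefig('telco_multiplelines_piechart.png')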

It is clear that the share of MultipleLines='No phone service' is small compared to the 'Yes' and 'No' categories.
Let’s plot the histograms of tenure, MonthlyCharges, and TotalCharges
data.hist(column='tenure')
plt.savefig('telco_tenure_hist.png')

data.hist(column='MonthlyCharges')
plt.savefig('telco_monthlycharges_hist.png')

data.hist(column='TotalCharges')
plt.savefig('telco_totalcharges_hist.png')

We can see the following dominant trends in our dataset: tenure<10 and tenure>60, MonthlyCharges<30, and TotalCharges<1000.
Churn Analytics
Now, in order to identify churn by different customer segments, we will group the data by each feature and count how many customers in each group have Churn = 'Yes'. We will then use count plots to see how many customers are leaving the company [1,2].
Let’s begin with Gender
plt.figure(figsize=(8,5), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='gender', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Gender')
for p in ax.patches:
    ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=13)
#plt.show()
plt.savefig('telecom_gender.png')

This plot shows that gender does not affect the customer churn.
Let’s look at the impact of Senior Citizen
plt.figure(figsize=(8, 5), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='SeniorCitizen', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Senior Citizen')
for p in ax.patches:
    ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=10)
plt.savefig('telecom_seniorcitizen.png')

It appears that senior citizens are more likely to churn.
Let’s identify attrition by Partner
plt.figure(figsize=(8, 5), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='Partner', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Partner')
for p in ax.patches:
    ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=13)
plt.savefig('telecom_partner.png')

It is clear that there is a weak correlation between the customer churn and the Partner feature.
Let’s explore the Dependents feature
plt.figure(figsize=(8, 5), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='Dependents', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Dependents')
for p in ax.patches:
    ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=10)
plt.savefig('telecom_dependents.png')

It appears that customers without dependents are somewhat more likely to churn.
Let’s check the churn by Phone Service
sns.set(style='whitegrid')
plt.figure(figsize=(8, 5), facecolor='white')
ax = sns.countplot(data=data, x='PhoneService', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Phone Service')
for p in ax.patches:
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_phoneservice.png')

Since the churn proportions are about the same for customers with and without phone service, this feature does not appear to affect customer churn.
Let’s look at the churn by Multiple Lines
plt.figure(figsize=(8, 5), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='MultipleLines', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Multiple Lines')
plt.legend(loc='upper right')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=8)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_multiplelines.png')

We can see only a weak indication that having no multiple lines affects the customer churn.
Let’s look at InternetService
plt.figure(figsize=(8, 6), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='InternetService', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Internet Service')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=10)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_internetservice.png')

It appears that the fiber optic internet service is associated with higher churn.
Let’s look at the churn by Online Security
plt.figure(figsize=(8, 6), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='OnlineSecurity', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Online Security')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=8)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_onlinesecurity.png')

This plot shows that there’s a possibility that no online security affects the customer churn.
Let’s check Online Backup
sns.set(style='whitegrid')
plt.figure(figsize=(8, 6), facecolor='white')
ax = sns.countplot(data=data, x='OnlineBackup', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Online Backup')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=8)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_onlinebackup.png')

We can see that there’s a chance that no online backup affects the customer churn.
Let’s look at Device Protection
sns.set(style='whitegrid')
plt.figure(figsize=(8, 6), facecolor='white')
ax = sns.countplot(data=data, x='DeviceProtection', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Device Protection')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=8)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_deviceprotection.png')

We can see that there’s a chance that no device protection affects the customer churn.
Let’s look at Tech Support
plt.figure(figsize=(8,6), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='TechSupport', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Tech Support')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=10)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_techsupport.png')

There’s a possibility that no tech support is linked to the customer churn.
Let’s examine Streaming TV
plt.figure(figsize=(8,6), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='StreamingTV', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Streaming TV')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=10)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_streamingtv.png')

This plot shows that streaming TV may not affect the customer churn.
Let’s examine Streaming Movies
plt.figure(figsize=(9,7), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='StreamingMovies', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Streaming Movies')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=10)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold', fontsize=9)
plt.savefig('telecom_streamingmovies.png')

It appears that streaming movies are weakly related to the customer churn.
Let’s check the churn by Contract
plt.figure(figsize=(8,7), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='Contract', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Contract')
for p in ax.patches:
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold')
plt.savefig('telecom_contract.png')

It is clear that there’s a chance that the month-to-month contract is linked to the customer churn.
Let’s look at Paperless Billing
plt.figure(figsize=(8,6), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='PaperlessBilling', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Paperless Billing')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=13)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold')
plt.savefig('telecom_paperless.png')

We can see that there’s a chance that paperless billing is related to the customer churn.
Let’s now consider Payment Method
plt.figure(figsize=(10,6), facecolor='white')
sns.set(style='whitegrid')
ax = sns.countplot(data=data, x='PaymentMethod', hue='Churn', saturation=1, alpha=0.9, palette='bright')
ax.set_title('Churn by Payment Method')
for p in ax.patches:
    #ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.2, p.get_height()), ha='center', va='top', color='white', size=13)
    number = '{}'.format(p.get_height().astype('int64'))
    ax.annotate(number, (p.get_x() + p.get_width()/2., p.get_height()), ha='center', va='center',
                xytext=(0,5), textcoords='offset points', color='black', fontweight='semibold')
plt.savefig('telecom_payment.png')

It is likely that the electronic check payment method is linked to the customer churn.
Let’s explore the remaining features (Tenure and Monthly/Total Charges) by plotting their histograms
plt.figure(figsize=(12,5), facecolor='white')
sns.set(style='whitegrid')
sns.histplot(data=data, x='tenure', hue='Churn', binwidth=2, kde=True)
plt.title('Tenure')
#plt.show()
plt.savefig('telecom_tenurehist.png')

plt.figure(figsize=(11,5), facecolor='white')
sns.set(style='whitegrid')
sns.histplot(data=data, x='MonthlyCharges', hue='Churn', binwidth=2, kde=True)
plt.title('Monthly Charges')
#plt.show()
plt.savefig('telecom_chargeshist.png')

plt.figure(figsize=(13,5), facecolor='white')
sns.set(style='whitegrid')
sns.histplot(data=data, x='TotalCharges', hue='Churn', binwidth=100, kde=True)
plt.title('Total Charges')
#plt.show()
plt.savefig('telecom_totalchargeshist.png')

These three plots reveal the following trends typical for churned customers:
- Tenure = 0-2 months
- MonthlyCharges = 70-100
- TotalCharges < 200 (overlap with no-churn)
Key takeaways
- There is a relationship between the churn rate and the following features: senior citizen, fiber optic internet service, paperless billing, month-to-month contract, electronic check payment, and high monthly charges.
- There is a weak correlation between the churn rate and certain variables such as dependents, online security/backup, tech support, streaming tv/movies, and device protection.
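These takeaways can be cross-checked numerically. A minimal sketch (run before the encoding step below, while the Churn column still holds 'Yes'/'No' labels) computes the churn rate within each category of a few features of interest:
# Churn rate per category; higher values indicate categories more prone to churn
churn_flag = (data['Churn'] == 'Yes').astype(int)
for col in ['Contract', 'InternetService', 'PaymentMethod', 'SeniorCitizen']:
    print(f"\nChurn rate by {col}:")
    print(churn_flag.groupby(data[col]).mean().sort_values(ascending=False).round(3))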
Data Preparation
Let’s check the missing values
data_null = data[data['TotalCharges'].isnull()]
data_null[['tenure', 'MonthlyCharges', 'TotalCharges', 'Churn']]

Let’s impute the missing values with 0 using fillna
data['TotalCharges'].fillna(0, inplace=True)
data_prep = data.copy()
Let’s proceed with encoding to convert categorical variables into numerical ones. We will use OneHotEncoder for the “yes/no” variables
from sklearn.preprocessing import OneHotEncoder
ohe = OneHotEncoder(categories=[['Yes', 'No']], handle_unknown='ignore', sparse=False)
cols = ['Partner', 'Dependents', 'PhoneService', 'PaperlessBilling', 'Churn']
for i in cols:
    y = np.array(data_prep[i]).reshape(-1, 1)
    ohe.fit(y)
    # keep only the 'Yes' indicator column (Yes -> 1, No -> 0)
    data_prep[i] = ohe.transform(y)[:, 0]
and LabelEncoder for other available categorical variables
from sklearn.preprocessing import LabelEncoder
lenc = LabelEncoder()
cols = ['gender', 'MultipleLines', 'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract', 'PaymentMethod']
for i in cols:
    lenc.fit(data_prep[i])
    data_prep[i] = lenc.transform(data_prep[i])
Let’s split the target variable and other variables
data_x = data_prep[['gender', 'SeniorCitizen', 'Partner', 'Dependents', 'tenure', 'PhoneService', 'MultipleLines', 'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod', 'MonthlyCharges', 'TotalCharges']].copy()
data_y = data_prep['Churn']
Let’s split the training and test data by setting test_size=0.2, random_state=5
from sklearn.model_selection import train_test_split
from collections import Counter
x_train, x_test, y_train, y_test = train_test_split(data_x, data_y, test_size=0.2, random_state=5)
Counter(y_train)
Counter({1.0: 1483, 0.0: 4151})
Let’s scale both the training and test data using MinMaxScaler (fitting the scaler on the training data only)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaled_train = np.array(x_train[['tenure', 'MonthlyCharges', 'TotalCharges']]).reshape(-1, 3)
scaler = scaler.fit(scaled_train)
x_train[['tenure', 'MonthlyCharges', 'TotalCharges']] = scaler.transform(scaled_train)
scaled_test = np.array(x_test[['tenure', 'MonthlyCharges', 'TotalCharges']]).reshape(-1, 3)
x_test[['tenure', 'MonthlyCharges', 'TotalCharges']] = scaler.transform(scaled_test)
Let’s resample the training data using SMOTE with the parameters sampling_strategy = 0.8, k_neighbors=5, random_state=5 (resampled data 1)
from imblearn.over_sampling import SMOTE
sm = SMOTE(sampling_strategy = 0.8, k_neighbors=5, random_state=5)
x_resample, y_resample = sm.fit_resample(x_train, y_train)
Counter(y_resample)
Counter({1.0: 3320, 0.0: 4151})
Also, we can define the parameter sampling_strategy = 0.66 (resampled data 2)
sm = SMOTE(sampling_strategy = 0.66, k_neighbors=5, random_state=5)
x_resample_2, y_resample_2 = sm.fit_resample(x_train, y_train)
Counter(y_resample_2)
Counter({1.0: 2739, 0.0: 4151})
RF Model
Let’s begin with RandomForestClassifier
Original data:
rf = RandomForestClassifier(random_state=5, criterion='entropy', n_estimators=18, max_depth=12)
rf.fit(x_train, y_train)
prediction = rf.predict(x_test)
print(confusion_matrix(y_test, prediction))
print("Accuracy Random Forest: %.2f" % (accuracy_score(y_test, prediction)*100))
print("Recall Random Forest:", recall_score(y_test, prediction)*100)
print("Precision Random Forest:", precision_score(y_test, prediction)*100)
[[915 108]
 [195 191]]
Accuracy Random Forest: 78.50
Recall Random Forest: 49.48186528497409
Precision Random Forest: 63.87959866220736
Resampled data 1:
rf.fit(x_resample, y_resample)
prediction = rf.predict(x_test)
print(confusion_matrix(y_test, prediction))
print("Accuracy Random Forest: %.2f" % (accuracy_score(y_test, prediction)*100))
print("Recall Random Forest:", recall_score(y_test, prediction)*100)
print("Precision Random Forest:", precision_score(y_test, prediction)*100)
[[841 182]
 [130 256]]
Accuracy Random Forest: 77.86
Recall Random Forest: 66.32124352331607
Precision Random Forest: 58.44748858447488
Resampled data 2:
rf.fit(x_resample_2, y_resample_2)
prediction = rf.predict(x_test)
print(confusion_matrix(y_test, prediction))
print("Accuracy Random Forest: %.2f" % (accuracy_score(y_test, prediction)*100))
print("Recall Random Forest:", recall_score(y_test, prediction)*100)
print("Precision Random Forest:", precision_score(y_test, prediction)*100)
[[859 164]
 [135 251]]
Accuracy Random Forest: 78.78
Recall Random Forest: 65.02590673575129
Precision Random Forest: 60.48192771084337
Let’s plot the confusion matrix
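The heatmap code for this plot is not shown above; a sketch mirroring the pattern used for the other classifiers later in this post (the output filename is only a suggestion) would be:
cf_matrix = confusion_matrix(y_test, prediction)
sns.heatmap(cf_matrix/np.sum(cf_matrix), annot=True,
            fmt='.2%', cmap='Blues')
plt.savefig('telecom_rf_confusion.png')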

Let’s plot the ROC curve
from sklearn.metrics import roc_curve
y_pred_proba = rf.predict_proba(x_test)[:,1]
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
plt.plot([0,1],[0,1],'k-')
plt.plot(fpr, tpr, label='RF')
plt.xlabel('FPR')
plt.ylabel('TPR')
plt.title('RF ROC curve')
plt.savefig('roc_rf_curve.png')

and get the score
from sklearn.metrics import roc_auc_score
roc_auc_score(y_test,y_pred_proba)
0.832993988016552
Let’s plot the KS statistic plot
import scikitplot as skplt
Y_test_probs = rf.predict_proba(x_test)
skplt.metrics.plot_ks_statistic(y_test, Y_test_probs, figsize=(10,6));

Let’s plot the Lift curve
#import scikitplot as skplt
skplt.metrics.plot_lift_curve(y_test, Y_test_probs, figsize=(10,6));

Let’s plot the Learning Curve
#import scikitplot as skplt
skplt.estimators.plot_learning_curve(rf, x_test, y_test,
                                     cv=7, shuffle=True, scoring="accuracy",
                                     n_jobs=-1, figsize=(6,4), title_fontsize="large", text_fontsize="large",
                                     title="RandomForestClassifier Learning Curve");

We can see that the training score of ~1.0 is an indicator of overfitting the training data (low bias, high variance).
In principle, we can use randomized search cross-validation (RandomizedSearchCV) to identify the best hyperparameters for the RF classifier and then use them to build an improved model, as sketched below.
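A minimal sketch of such a search, assuming an illustrative parameter grid (the ranges and the recall scoring choice below are placeholders, not settings used in this study):
from sklearn.model_selection import RandomizedSearchCV
# Hypothetical search space for illustration only
param_dist = {
    'n_estimators': [10, 18, 50, 100, 200],
    'max_depth': [4, 8, 12, 16, None],
    'criterion': ['gini', 'entropy'],
    'min_samples_leaf': [1, 2, 5, 10],
}
search = RandomizedSearchCV(RandomForestClassifier(random_state=5),
                            param_distributions=param_dist,
                            n_iter=20, cv=5, scoring='recall',
                            random_state=5, n_jobs=-1)
search.fit(x_resample_2, y_resample_2)
print(search.best_params_)
rf_tuned = search.best_estimator_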
GB Model
Let’s train, test and validate the GradientBoosting Classifier
Original data:
gb_clf = GradientBoostingClassifier(random_state=5, learning_rate=1, loss='exponential', max_depth=1, max_features=1)
gb_clf.fit(x_train, y_train)
predictiongnb = gb_clf.predict(x_test)
print(confusion_matrix(y_test, predictiongnb))
print("Accuracy Gradient Boost: %.2f" % (accuracy_score(y_test, predictiongnb)*100))
print("Recall Gradient Boost:", recall_score(y_test, predictiongnb)*100)
print("Precision Gradient Boost:", precision_score(y_test, predictiongnb)*100)
print("")
[[924  99]
 [173 213]]
Accuracy Gradient Boost: 80.70
Recall Gradient Boost: 55.181347150259064
Precision Gradient Boost: 68.26923076923077
Resampled data 1
gb_clf.fit(x_resample, y_resample)
predictiongnb = gb_clf.predict(x_test)
print(confusion_matrix(y_test, predictiongnb))
print("Accuracy Gradient Boost: %.2f" % (accuracy_score(y_test, predictiongnb)*100))
print("Recall Gradient Boost:", recall_score(y_test, predictiongnb)*100)
print("Precision Gradient Boost:", precision_score(y_test, predictiongnb)*100)
print("")
[[813 210]
 [113 273]]
Accuracy Gradient Boost: 77.08
Recall Gradient Boost: 70.72538860103627
Precision Gradient Boost: 56.52173913043478
Resampled data 2
gb_clf.fit(x_resample_2, y_resample_2)
predictiongnb = gb_clf.predict(x_test)
print(confusion_matrix(y_test, predictiongnb))
print("Accuracy Gradient Boost: %.2f" % (accuracy_score(y_test, predictiongnb)*100))
print("Recall Gradient Boost:", recall_score(y_test, predictiongnb)*100)
print("Precision Gradient Boost:", precision_score(y_test, predictiongnb)*100)
print("")
[[850 173]
 [125 261]]
Accuracy Gradient Boost: 78.85
Recall Gradient Boost: 67.61658031088082
Precision Gradient Boost: 60.13824884792627
Let’s plot the confusion matrix
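As with the RF model, the heatmap code is not shown; a sketch following the same pattern (filename is only a suggestion) would be:
cf_matrix = confusion_matrix(y_test, predictiongnb)
sns.heatmap(cf_matrix/np.sum(cf_matrix), annot=True,
            fmt='.2%', cmap='Blues')
plt.savefig('telecom_gb_confusion.png')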

Let’s plot the ROC curve
#from sklearn.metrics import roc_curve
y_pred_proba = gb_clf.predict_proba(x_test)[:,1]
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
plt.plot([0,1],[0,1],'k-')
plt.plot(fpr, tpr, label='GB')
plt.xlabel('FPR')
plt.ylabel('TPR')
plt.title('Gradient Boost ROC curve')
#plt.show()
plt.savefig('roc_gb_curve.png')

and get the score
#from sklearn.metrics import roc_auc_score
roc_auc_score(y_test,y_pred_proba)
0.851618727809602
Let’s look at the KS statistic plot
Y_test_probs = gb_clf.predict_proba(x_test)
skplt.metrics.plot_ks_statistic(y_test, Y_test_probs, figsize=(10,6));

and plot the Lift curve
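The lift-curve call is omitted above; it presumably mirrors the scikit-plot call used for the RF model:
Y_test_probs = gb_clf.predict_proba(x_test)
skplt.metrics.plot_lift_curve(y_test, Y_test_probs, figsize=(10,6));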

Let’s plot the Learning curve
#import scikitplot as skplt
skplt.estimators.plot_learning_curve(gb_clf, x_test, y_test,
                                     cv=7, shuffle=True, scoring="accuracy",
                                     n_jobs=-1, figsize=(6,4), title_fontsize="large", text_fontsize="large",
                                     title="Gradient Boost Classifier Learning Curve");

That is the best Learning curve we have obtained so far.
AB Model
Let’s look at the AdaBoost Classifier
Original data:
ada = AdaBoostClassifier(random_state=5, learning_rate=0.5, n_estimators=50)
ada.fit(x_train, y_train)
predictionada = ada.predict(x_test)
print(confusion_matrix(y_test, predictionada))
print("Accuracy Ada Boost: %.2f" % (accuracy_score(y_test, predictionada)*100))
print("Recall Ada Boost:", recall_score(y_test, predictionada)*100)
print("Precision Ada Boost:", precision_score(y_test, predictionada)*100)
print("")
[[926  97]
 [173 213]]
Accuracy Ada Boost: 80.84
Recall Ada Boost: 55.181347150259064
Precision Ada Boost: 68.70967741935485
Resampled data 1:
ada.fit(x_resample, y_resample)
predictionada = ada.predict(x_test)
print(confusion_matrix(y_test, predictionada))
print("Accuracy Ada Boost: %.2f" % (accuracy_score(y_test, predictionada)*100))
print("Recall Ada Boost:", recall_score(y_test, predictionada)*100)
print("Precision Ada Boost:", precision_score(y_test, predictionada)*100)
print("")
[[797 226]
 [109 277]]
Accuracy Ada Boost: 76.22
Recall Ada Boost: 71.76165803108809
Precision Ada Boost: 55.069582504970185
Resampled data 2:
ada.fit(x_resample_2, y_resample_2)
predictionada = ada.predict(x_test)
print(confusion_matrix(y_test, predictionada))
print("Accuracy Ada Boost: %.2f" % (accuracy_score(y_test, predictionada)*100))
print("Recall Ada Boost:", recall_score(y_test, predictionada)*100)
print("Precision Ada Boost:", precision_score(y_test, predictionada)*100)
print("")
[[818 205]
 [120 266]]
Accuracy Ada Boost: 76.93
Recall Ada Boost: 68.9119170984456
Precision Ada Boost: 56.475583864118896
Let’s look at the confusion matrix
#from sklearn.metrics import classification_report, confusion_matrix
cf_matrix = confusion_matrix(y_test, predictionada)
sns.heatmap(cf_matrix/np.sum(cf_matrix), annot=True,
            fmt='.2%', cmap='Blues')
plt.savefig('telecom_ada_confusion.png')

Let’s plot the ROC curve
#from sklearn.metrics import roc_curve
y_pred_proba = ada.predict_proba(x_test)[:,1]
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
plt.plot([0,1],[0,1],'k-')
plt.plot(fpr, tpr, label='AdaBoost')
plt.xlabel('FPR')
plt.ylabel('TPR')
plt.title('Ada Boost ROC curve')
#plt.show()
plt.savefig('telecom_ada_roc_curve.png')

and get the score
#from sklearn.metrics import roc_auc_score
roc_auc_score(y_test,y_pred_proba)
0.8481632301622273
Let’s construct the KS statistic plot
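The code for this plot is omitted above; a sketch following the scikit-plot pattern used for the other classifiers (it also refreshes Y_test_probs so that the lift curve below uses the AdaBoost probabilities) would be:
Y_test_probs = ada.predict_proba(x_test)
skplt.metrics.plot_ks_statistic(y_test, Y_test_probs, figsize=(10,6));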

We also plot the Lift curve
skplt.metrics.plot_lift_curve(y_test, Y_test_probs, figsize=(10,6));

Let’s plot the Learning curve
#import scikitplot as skplt
skplt.estimators.plot_learning_curve(ada, x_test, y_test,
                                     cv=7, shuffle=True, scoring="accuracy",
                                     n_jobs=-1, figsize=(6,4), title_fontsize="large", text_fontsize="large",
                                     title="Ada Boost Classifier Learning Curve");

We can see that the cross-validation score has relatively large confidence intervals.
LGBM Model
Let’s look at the LGBM Classifier.
Original data:
lgbm = LGBMClassifier(random_state=5, learning_rate=0.05, n_estimators=90, num_leaves=20, boosting_type='dart')
lgbm.fit(x_train, y_train)
predictionlgbm = lgbm.predict(x_test)
print(confusion_matrix(y_test, predictionlgbm))
print("Accuracy LightGBM: %.2f" % (accuracy_score(y_test, predictionlgbm)*100))
print("Recall LightGBM:", recall_score(y_test, predictionlgbm)*100)
print("Precision LightGBM:", precision_score(y_test, predictionlgbm)*100)
print("")
[[936  87]
 [191 195]]
Accuracy LightGBM: 80.27
Recall LightGBM: 50.51813471502591
Precision LightGBM: 69.14893617021278
Resampled data 1:
lgbm.fit(x_resample, y_resample)
predictionlgbm = lgbm.predict(x_test)
print(confusion_matrix(y_test, predictionlgbm))
print("Accuracy LightGBM: %.2f" % (accuracy_score(y_test, predictionlgbm)*100))
print("Recall LightGBM:", recall_score(y_test, predictionlgbm)*100)
print("Precision LightGBM:", precision_score(y_test, predictionlgbm)*100)
print("")
[[826 197]
 [116 270]]
Accuracy LightGBM: 77.79
Recall LightGBM: 69.94818652849742
Precision LightGBM: 57.81584582441114
Resampled data 2:
lgbm.fit(x_resample_2, y_resample_2)
predictionlgbm = lgbm.predict(x_test)
print(confusion_matrix(y_test, predictionlgbm))
print("Accuracy LightGBM: %.2f" % (accuracy_score(y_test, predictionlgbm)*100))
print("Recall LightGBM:", recall_score(y_test, predictionlgbm)*100)
print("Precision LightGBM:", precision_score(y_test, predictionlgbm)*100)
print("")
[[869 154]
 [136 250]]
Accuracy LightGBM: 79.42
Recall LightGBM: 64.76683937823834
Precision LightGBM: 61.88118811881188
Let’s plot the confusion matrix
#from sklearn.metrics import classification_report, confusion_matrix
cf_matrix = confusion_matrix(y_test, predictionlgbm)
sns.heatmap(cf_matrix/np.sum(cf_matrix), annot=True,
            fmt='.2%', cmap='Blues')
plt.savefig('telecom_lgbm_confusion.png')

Let’s look at the ROC curve
#from sklearn.metrics import roc_curve
y_pred_proba = lgbm.predict_proba(x_test)[:,1]
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
plt.plot([0,1],[0,1],'k-')
plt.plot(fpr, tpr, label='LGBM')
plt.xlabel('FPR')
plt.ylabel('TPR')
plt.title('LightGBM ROC curve')
#plt.show()
plt.savefig('telecom_lgbm_roc.png')

and get the score
#from sklearn.metrics import roc_auc_score
roc_auc_score(y_test,y_pred_proba)
0.8485127051899575
Let’s construct the KS Statistic plot
Y_test_probs = lgbm.predict_proba(x_test)
skplt.metrics.plot_ks_statistic(y_test, Y_test_probs, figsize=(10,6));

Let’s look at the Lift curve
skplt.metrics.plot_lift_curve(y_test, Y_test_probs, figsize=(10,6));

Let’s look at the Learning curve
#import scikitplot as skplt
skplt.estimators.plot_learning_curve(lgbm, x_test, y_test,
                                     cv=7, shuffle=True, scoring="accuracy",
                                     n_jobs=-1, figsize=(6,4), title_fontsize="large", text_fontsize="large",
                                     title="LightGBM Classifier Learning Curve");

Calibration Plots
Let’s compare the Calibration curves of our classifiers
rf_probas = RandomForestClassifier().fit(x_train, y_train).predict_proba(x_test)
gbc_probas = GradientBoostingClassifier().fit(x_train, y_train).predict_proba(x_test)
abc_probas = AdaBoostClassifier().fit(x_train, y_train).predict_proba(x_test)
lgbmc_scores = LGBMClassifier().fit(x_train, y_train).predict_proba(x_test)
probas_list = [rf_probas, gbc_probas, abc_probas, lgbmc_scores]
clf_names = ['RandomForest', 'GradientBoosting', 'AdaBoost', 'LGBM']
skplt.metrics.plot_calibration_curve(y_test,
probas_list,
clf_names, n_bins=15,
figsize=(12,6)
);

We can see that GB (blue curve) is the best classifier in terms of calibration.
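To back this visual comparison with a single number, one could also compute the Brier score (the mean squared difference between predicted probability and actual outcome; lower is better) for each classifier, reusing the probability arrays defined above. A minimal sketch:
from sklearn.metrics import brier_score_loss
# Lower Brier score = better-calibrated probabilities
for name, probas in zip(clf_names, probas_list):
    print(name, round(brier_score_loss(y_test, probas[:, 1]), 4))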
Feature Engineering
Let’s compute the correlation heatmap by defining the mask to set the values in the upper triangle to True
plt.figure(figsize=(18, 10))
mask = np.triu(np.ones_like(data_prep.corr(), dtype=bool))
heatmap = sns.heatmap(data_prep.corr(), mask=mask, vmin=-1, vmax=1, annot=True, cmap='BrBG')
heatmap.set_title('Triangle Correlation Heatmap', fontdict={'fontsize':18}, pad=16);
plt.savefig('telcom_corrmatrix.png')

Let’s run random forest to get feature importance
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=25).fit(x_train, y_train)
feats = x_train.columns
for feature in zip(feats, rf.feature_importances_):
    print(feature)
('gender', 0.027237348338938393)
('SeniorCitizen', 0.020887518841213648)
('Partner', 0.022358221011906435)
('Dependents', 0.0192192077185847)
('tenure', 0.15460929865409162)
('PhoneService', 0.004699263026114435)
('MultipleLines', 0.023072448889206308)
('InternetService', 0.028156528258148645)
('OnlineSecurity', 0.04750854161214951)
('OnlineBackup', 0.02195234642958269)
('DeviceProtection', 0.03126612890688831)
('TechSupport', 0.04543420250489264)
('StreamingTV', 0.017003835114310445)
('StreamingMovies', 0.016991860057308194)
('Contract', 0.07936673266861258)
('PaperlessBilling', 0.026547872337756764)
('PaymentMethod', 0.051829192705564645)
('MonthlyCharges', 0.176231663037708)
('TotalCharges', 0.1856277898870221)
We can sort these values in the descending order
imp_df = pd.DataFrame({
    "Varname": x_train.columns,
    "Imp": rf.feature_importances_
})
imp_df.sort_values(by="Imp", ascending=False)

Let’s create and plot the list of important features with weights defined above
importances = rf.feature_importances_
weights = pd.Series(importances,
                    index=x_train.columns.values)
weights.sort_values()[-10:].plot(kind='barh')

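Impurity-based importances can be biased toward high-cardinality and continuous features, so as a cross-check one could also compute permutation importance on the held-out test set. A minimal sketch (not part of the original workflow):
from sklearn.inspection import permutation_importance
# Importance = drop in accuracy when a feature's values are shuffled
perm = permutation_importance(rf, x_test, y_test, n_repeats=10, random_state=5, n_jobs=-1)
perm_imp = pd.Series(perm.importances_mean, index=x_test.columns).sort_values(ascending=False)
print(perm_imp.head(10))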
Cluster Analysis
Let’s look at the Elbow plot
from sklearn.cluster import KMeans
skplt.cluster.plot_elbow_curve(KMeans(random_state=1),
x_test,
cluster_ranges=range(2, 20),
figsize=(8,6));

and check PCA component explained variances
from sklearn.decomposition import PCA
pca = PCA(random_state=1)
pca.fit(x_test)
skplt.decomposition.plot_pca_component_variance(pca, figsize=(8,6));

We can also look at the 2-D PCA projection
skplt.decomposition.plot_pca_2d_projection(pca, x_test, y_test,
                                            figsize=(10,10),
                                            cmap="tab10");

Let’s perform the Silhouette analysis
kmeans = KMeans(n_clusters=10, random_state=1)
kmeans.fit(x_train, y_train)
cluster_labels = kmeans.predict(x_test)
skplt.metrics.plot_silhouette(x_test, cluster_labels,
figsize=(8,6));

and compare the Silhouette coefficient values assigned to our 10 cluster labels defined above.
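The same comparison can be made numerically. A short sketch that computes the overall silhouette score along with the per-cluster averages for the labels defined above:
from sklearn.metrics import silhouette_score, silhouette_samples
print("Overall silhouette score:", round(silhouette_score(x_test, cluster_labels), 3))
sil_values = silhouette_samples(x_test, cluster_labels)
for label in np.unique(cluster_labels):
    print(f"cluster {label}: {sil_values[cluster_labels == label].mean():.3f}")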
Conclusions
Model (resampled data 2) | RF | GB | AB | LGBM
Accuracy %               | 78 | 79 | 77 | 79
Recall %                 | 65 | 67 | 69 | 65
Precision %              | 60 | 60 | 56 | 62
ROC AUC %                | 83 | 85 | 85 | 85
The most dominant features are Total/Monthly Charges, tenure and contract.
We have identified 10 K-means clusters. The PCA explained variance ratio is 0.792 for the first 8 components.
As a company grows, manually evaluating customer churn becomes difficult. Yet it’s important to regularly calculate and track churn metrics over time so you can spot and ameliorate problems.
The proposed Python sequence can help churn analytics teams analyze and understand the company’s churn rate from various perspectives while forecasting future churn for planning purposes. The ML/AI capability offers a strong churn analysis program that can help you model out future capital needs, make a workforce management plan and inform other essential business decisions.
Step one is complete. You know which customers are churning, and why. The next question is what do you do with everything you found? Lucky for you, there are 6 proven strategies to reduce churn.
Related Links
[1] https://medium.com/@andhikaw.789/telco-customer-churn-machine-learning-prediction-170f16ee2fa6
[2] https://medium.com/mlearning-ai/analyzing-ibm-employee-attrition-ec9b8b9f5b0e
[5] https://discover.glassbox.com/
[7] https://towardsdatascience.com/customer-churn-analysis-4f77cc70b3bd
[8] https://mixpanel.com/blog/churn-analytics/
[9] https://www.gainsight.com/glossary/what-is-customer-churn-analysis/
[10] https://baremetrics.com/blog/churn-analysis
