CUSTOMER BANK CHURN MODELLING WITH MACHINE LEARNING
CONTENT
CHAPTER 1
INTRODUCTION
OBJECTIVE
REASON
BENEFITS
CHAPTER 2
DATA DESCRIPTION AND ANALYSIS (EDA)
2.1 Variable Description
2.2 Dataset Loading and Basic Statistics
2.3 Visualizing Data
2.4 Univariate Analysis
2.5 Bivariate Analysis
2.6 Pair Plot
2.7 Multivariate Analysis
CHAPTER 3
Distribution of Data
3.1 Computing Confidence Interval and Histogram
3.2 Statistics Test (K-S Test)
3.3 Plotting KDE and Q-Q Plot
CHAPTER 4
Data Pre-processing and Standardization
CHAPTER 5
Modelling Data with Machine Learning Algorithms
CHAPTER 6
Model Performance Evaluation
CHAPTER 7
Parameter Tuning and Performance Evaluation
7.1 Logistic Regression Hyper-parameter Tuning
7.2 PCA and Random Forest Implementation
7.3 Random Forest Hyper-parameter Tuning
CHAPTER 8
Neural network Implementation
8.1 Building Models of neural network
8.2 Visualizing the model
8.3 Performance Evaluation in Neural Network
8.4 Visualizing Accuracy and Loss
CHAPTER 1
INTRODUCTION
The objective of this project is to predict which bank customers will churn by means of machine learning modelling techniques.
It is strategically important for companies to manage relationships with their customers in order to increase their revenue. In business, customer relationship management (CRM) therefore aims at ensuring customer satisfaction, and companies apply CRM in order to improve their retention power.
Churn modelling allows a company to identify sufficiently far in advance which clients are likely to leave, so that it can take the necessary measures to prevent churn.
CHAPTER 2
DATA DESCRIPTION AND ANALYSIS (EDA)
2.1 Information about the variables and their types in the data
Surname : The surname of the customer
CreditScore : The credit score of the customer
Geography : The country of the customer (Germany/France/Spain)
Gender : The gender of the customer (Female/Male)
Age : The age of the customer
Tenure : The customer's number of years with the bank
Balance : The customer's account balance
NumOfProducts : The number of bank products that the customer uses
HasCrCard : Does the customer have a credit card? (0 = No, 1 = Yes)
IsActiveMember : Does the customer have an active membership? (0 = No, 1 = Yes)
EstimatedSalary : The estimated salary of the customer
Exited : Churned or not? (0 = No, 1 = Yes)
2.2 Dataset Loading and Basic Statistics
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
IMPORT FILES FROM DRIVE INTO GOOGLE-COLAB:
STEP-1: Import Libraries
Code to read a CSV file into Colaboratory:
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
STEP-2: Authenticate E-Mail ID
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
STEP-3: Get File from Drive using file-ID
Get the file
downloaded = drive.CreateFile({'id':'19NVxVEhM_aP_l3wzpoRo-vVG0WDi-e_K'}) # replace the id with the id of the file you want to access
downloaded.GetContentFile('Churn_Modelling.csv')
df = pd.read_csv('Churn_Modelling.csv')
df
df.shape
(10000, 14)
- The data has 10000 rows (sample points) and 14 columns (features).
- All feature column names are shown below.
df.columns
Index(['RowNumber', 'CustomerId', 'Surname', 'CreditScore', 'Geography', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited'], dtype='object')
- The summary of the data frame lists all columns with their data types and the number of non-null values in each column.
df.info()
df['Geography'].value_counts()
France 5014
Germany 2509
Spain 2477
Name: Geography, dtype: int64
- So Geography can be treated as a categorical variable for analysis.
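As a small illustration (a sketch working on a copy of df, not part of the original notebook), the two text columns can be cast to pandas categorical dtypes so that later grouping and encoding steps treat them explicitly as categories:
# Sketch: cast the low-cardinality text columns to pandas categoricals on a copy
df_cat = df.copy()
for col in ['Geography', 'Gender']:
    df_cat[col] = df_cat[col].astype('category')
print(df_cat[['Geography', 'Gender']].dtypes)
print(df_cat['Geography'].cat.categories)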
df.describe()
2.3 DATA VISUALIZATION
- Visualizing missing data
df.isnull().sum()
RowNumber 0
CustomerId 0
Surname 0
CreditScore 0
Geography 0
Gender 0
Age 0
Tenure 0
Balance 0
NumOfProducts 0
HasCrCard 0
IsActiveMember 0
EstimatedSalary 0
Exited 0
dtype: int64
plt.figure(figsize=(12,8))
sns.heatmap(df.isnull(), cmap='viridis')
The above heatmap shows that there is no missing data present.
Outlier visualization of the different features
2.4 Univariate Analysis
plt.figure(figsize=(15,10))
sns.boxplot(data=df, x='CreditScore', y='Geography', hue='Gender')
- It can be seen that CreditScore has a small number of lower outer-fence outliers.
plt.figure(figsize=(15,10))
sns.boxplot(data=df, x='Age', y='Geography', hue='Gender')
- The Age feature has outer-fence outliers, and there are noticeably more of them.
plt.figure(figsize=(15,10))
sns.boxplot(data=df, x='Tenure', y='Balance', hue='Gender')
plt.figure(figsize=(15,10))
sns.boxplot(data=df, y='IsActiveMember', x='EstimatedSalary', hue='Exited')
plt.figure(figsize=(15,10))
sns.boxplot(data=df, x='EstimatedSalary')
Count plots of churned vs retained customers
sns.countplot(x='Exited', data=df, hue='Gender')
sns.countplot(x='Exited', data=df, hue='Geography')
So the customers of Germany have churned the most, while France has the largest number of retained customers.
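A quick numerical cross-check of this observation (a short sketch, using the df loaded above):
# Churn rate (mean of Exited) per country, sorted descending
print(df.groupby('Geography')['Exited'].mean().sort_values(ascending=False))
# Absolute counts of churned vs retained customers per country
print(pd.crosstab(df['Geography'], df['Exited']))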
2.5 Bi-variate Analysis
import seaborn as sns
sns.scatterplot(data=df, x='Balance', y='Age', hue='Gender')
df.columns
Index(['RowNumber', 'CustomerId', 'Surname', 'CreditScore', 'Geography', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited'], dtype='object')
sns.scatterplot(data=df, x='Balance', y='Exited', hue='Gender')
sns.scatterplot(data=df, x='NumOfProducts', y='Exited', hue='Gender')
plt.figure(figsize=(15,10))
sns.heatmap(df.corr().abs(),annot=True)
sns.scatterplot(data=df, x='Age', y='Exited', hue='Gender')
sns.scatterplot(data=df, x='IsActiveMember', y='Exited', hue='Gender')
sns.scatterplot(data=df, x='Balance', y='NumOfProducts', hue='Gender')
2.6 Pair plot
sns.pairplot(data=df, hue='Geography')
2.7 Multivariate Analysis
plt.figure(figsize=(20,20))
import plotly.express as px
fig = px.scatter_3d(df, x='IsActiveMember', y='Age', z='Exited',
                    color='Geography')
fig.show()
CHAPTER 3
Distribution of Data
df1 = df.drop(columns=['RowNumber', 'CustomerId', 'Surname', 'Geography', 'Gender'])
df1
list1=list(df1.columns)
list1
['CreditScore',
'Age',
'Tenure',
'Balance',
'NumOfProducts',
'HasCrCard',
'IsActiveMember',
'EstimatedSalary',
'Exited']
3.1 Computing Confidence Interval and Histogram
import numpy
from sklearn.utils import resample
from matplotlib import pyplot
import seaborn as sns

# Bootstrap the median of every numeric feature in df1
for feature in list1:
    x = df1[feature]
    print(feature)
    # KDE of the raw feature values
    sns.kdeplot(x, shade=True, bw_adjust=100)
    pyplot.show()
    # configure bootstrap
    n_iterations = 1000
    n_size = int(len(x))
    # run bootstrap
    medians = list()
    for _ in range(n_iterations):
        # resample with replacement and record the median of the resample
        s = resample(x, n_samples=n_size)
        medians.append(numpy.median(s))
    # plot the bootstrap distribution of the median
    pyplot.hist(medians)
    pyplot.show()
    # confidence interval from the bootstrap percentiles
    alpha = 0.95
    p = ((1.0 - alpha) / 2.0) * 100
    lower = numpy.percentile(medians, p)
    p = (alpha + ((1.0 - alpha) / 2.0)) * 100
    upper = numpy.percentile(medians, p)
    print('%.1f%% confidence interval: %.1f to %.1f' % (alpha * 100, lower, upper))
3.2 Statistics Test (K-S Test)
import numpy as np
import seaborn as sns
from scipy import stats
import matplotlib.pyplot as plt
for i in list1:
    x = df1[i]
    print(stats.kstest(x, 'norm'))
KstestResult(statistic=1.0, pvalue=0.0)
KstestResult(statistic=1.0, pvalue=0.0)
KstestResult(statistic=0.8324498680518208, pvalue=0.0)
KstestResult(statistic=0.6383, pvalue=0.0)
KstestResult(statistic=0.8413447460685429, pvalue=0.0)
KstestResult(statistic=0.5468447460685429, pvalue=0.0)
KstestResult(statistic=0.5, pvalue=0.0)
KstestResult(statistic=1.0, pvalue=0.0)
KstestResult(statistic=0.5, pvalue=0.0)
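Note that stats.kstest(x, 'norm') compares the raw values against a standard normal with mean 0 and standard deviation 1, so unstandardized features such as EstimatedSalary are rejected almost trivially. A more informative variant (a sketch, not part of the original output) z-scores each feature before the test:
from scipy import stats
for i in list1:
    x = df1[i]
    z = (x - x.mean()) / x.std()      # standardize before comparing to N(0, 1)
    stat, p = stats.kstest(z, 'norm')
    print(f'{i}: statistic={stat:.4f}, p-value={p:.4g}')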
3.3 Plotting KDE and Q-Q plot sequentially
import pylab

stats.probplot(df1['Age'], dist="norm", plot=pylab)
pylab.show()
sns.kdeplot(df1['Age'], shade=True, bw_adjust=100)

stats.probplot(df1['Tenure'], dist="norm", plot=pylab)
pylab.show()
sns.kdeplot(df1['Tenure'], shade=True, bw_adjust=100)

stats.probplot(df1['Balance'], dist="norm", plot=pylab)
pylab.show()
sns.kdeplot(df1['Balance'], shade=True, bw_adjust=100)

stats.probplot(df1['NumOfProducts'], dist="norm", plot=pylab)
pylab.show()
sns.kdeplot(df1['NumOfProducts'], shade=True, bw_adjust=100)

stats.probplot(df1['HasCrCard'], dist="norm", plot=pylab)
pylab.show()
sns.kdeplot(df1['HasCrCard'], shade=True, bw_adjust=100)

stats.probplot(df1['IsActiveMember'], dist="norm", plot=pylab)
pylab.show()
sns.kdeplot(df1['IsActiveMember'], shade=True, bw_adjust=100)

stats.probplot(df1['EstimatedSalary'], dist="norm", plot=pylab)
pylab.show()
sns.kdeplot(df1['EstimatedSalary'], shade=True, bw_adjust=100)
CHAPTER 4
DATA PRE-PROCESSING AND STANDARDIZATION
In the previous chapter we saw that the KDE of every feature looks roughly bell-shaped, resembling a normal distribution, but the Q-Q plots show that the features "Tenure", "Balance" and "NumOfProducts" have widely dispersed outliers. To reduce the impact of these outliers and obtain a more reliable model, we will standardize the data so that every feature has zero mean and unit standard deviation; this is less sensitive to outliers than min-max normalization, which we therefore do not use.
df.info()
The dataset has no missing values. Note that the target classes are not perfectly balanced: roughly 20% of the customers churned, which is why stratified splitting is used later. If there were missing values, we would impute them with the column median, since the median is less affected by outliers.
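For completeness, a minimal sketch of the median-imputation strategy mentioned above (hypothetical here, since this dataset has no missing values; it operates on a copy of df):
# Hypothetical sketch of median imputation for numeric columns with missing values
df_imp = df.copy()
num_cols = df_imp.select_dtypes(include='number').columns
df_imp[num_cols] = df_imp[num_cols].fillna(df_imp[num_cols].median())
print(df_imp[num_cols].isnull().sum().sum())   # 0 missing values remain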
df2 = df.drop(columns=['RowNumber', 'CustomerId', 'Surname'])
df2
Here we have dropped the "RowNumber", "CustomerId" and "Surname" columns, as they carry no useful information for the prediction model.
df3 = df2.loc[:, ['Geography', 'Gender']]
df3
df3.shape
(10000, 2)
pd.get_dummies(df3).shape
(10000, 5)
df6=pd.get_dummies(df3)
df6
df7 = df6.drop('Gender_Male', axis=1)
df7
df8 = df.drop(columns=['RowNumber', 'CustomerId', 'Surname', 'Gender', 'Geography'])
df8
result = pd.concat([df7, df8], ignore_index=False, sort=False,axis=1)
result1=result.iloc[:,:-1]
result1
DATA STANDARDIZATION
from sklearn import preprocessing
# Get column names first
names = result1.columns
# Create the Scaler object
scaler = preprocessing.StandardScaler()
# Fit your data on the scaler object
scaled_df = scaler.fit_transform(result1)
scaled_df = pd.DataFrame(scaled_df, columns=names)
scaled_df
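A quick sanity check (a short sketch using the scaled_df computed above) confirms that each standardized column now has mean close to 0 and standard deviation close to 1:
# Each column should now have mean ~0 and standard deviation ~1
print(scaled_df.mean().round(3))
print(scaled_df.std().round(3))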
X = scaled_df
y = result.iloc[:,-1]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=111,stratify=y)
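Because stratify=y is used, the churn ratio is preserved in both splits; this can be verified with a short check (a sketch, assuming y, y_train and y_test as defined above):
# The proportion of churned customers should match across the full data, train and test splits
print(y.value_counts(normalize=True))
print(y_train.value_counts(normalize=True))
print(y_test.value_counts(normalize=True))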
CHAPTER 5
MODELLING DATA WITH MACHINE LEARNING ALGORITHMS
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model, metrics
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=111,stratify=y)
#model.fit(X_train,y_train)
algorithms = [LogisticRegression(), GaussianNB(), RandomForestClassifier()]
models = []
from sklearn.metrics import classification_report
for model in algorithms:
    # fit each classifier on the training split and report on the test split
    model.fit(X_train, y_train)
    b = model.predict(X_test)
    print(model, classification_report(y_test, b))
    # accuracy_score (imported above) replaces the original r2_score, which is not an accuracy measure
    a = accuracy_score(y_test, b)
    models.append([model, a])

df10 = pd.DataFrame(models, columns=['model', 'Accuracy'])
df10
LogisticRegression()
              precision    recall  f1-score   support

           0       0.83      0.96      0.89      2389
           1       0.57      0.22      0.31       611

    accuracy                           0.81      3000
   macro avg       0.70      0.59      0.60      3000
weighted avg       0.77      0.81      0.77      3000

GaussianNB()
              precision    recall  f1-score   support

           0       0.85      0.92      0.88      2389
           1       0.54      0.36      0.43       611

    accuracy                           0.81      3000
   macro avg       0.69      0.64      0.66      3000
weighted avg       0.79      0.81      0.79      3000

RandomForestClassifier()
              precision    recall  f1-score   support

           0       0.88      0.96      0.91      2389
           1       0.74      0.47      0.57       611

    accuracy                           0.86      3000
   macro avg       0.81      0.71      0.74      3000
weighted avg       0.85      0.86      0.84      3000
CHAPTER 6
MODEL PERFORMANCE AND EVALUATION
We plot the ROC curve and its AUC for the different algorithms. The ROC curve plots the true positive rate against the false positive rate as the classification threshold is varied. An algorithm with a larger AUC value performs better than one with a lower AUC value.
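To make the axes concrete, TPR and FPR can be computed from a confusion matrix at a single threshold; the ROC curve simply sweeps this threshold. A short sketch, using the last classifier fitted in the Chapter 5 loop (the random forest, left in the variable model after the loop):
from sklearn.metrics import confusion_matrix

# Illustration at a single threshold; the ROC curve repeats this for every threshold
y_pred = model.predict(X_test)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
tpr = tp / (tp + fn)   # true positive rate: correctly flagged churners
fpr = fp / (fp + tn)   # false positive rate: retained customers wrongly flagged
print('TPR = %.3f, FPR = %.3f' % (tpr, fpr))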
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model, metrics
from sklearn import svm
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve,auc
sc_x=StandardScaler()
X_train=sc_x.fit_transform(X_train)
X_test=sc_x.transform(X_test)
# SVM CLASSIFIER
model_svc = SVC()
model_svc.fit(X_train,y_train)
y_pred_svm = model_svc.decision_function(X_test)
#GaussianNB
model_GNB=GaussianNB()
model_GNB.fit(X_train,y_train)
y_pred_GNB = model_GNB.predict_proba(X_test)
#RandomForestClassifier
model_RFC=RandomForestClassifier()
model_RFC.fit(X_train,y_train)
y_pred_RFC = model_RFC.predict_proba(X_test)
#logistic classifier
model_logistic=LogisticRegression()
model_logistic.fit(X_train,y_train)
y_pred_logistic = model_logistic.decision_function(X_test)
#plot ROC and compare AUC
from sklearn.metrics import roc_auc_score,auc
log_fpr,log_tpr,threshold=roc_curve(y_test,y_pred_logistic)
GNB_fpr,GNB_tpr,threshold=roc_curve(y_test,y_pred_GNB[:,1])
RFC_fpr,RFC_tpr,threshold=roc_curve(y_test,y_pred_RFC[:,1])
svm_fpr,svm_tpr,threshold=roc_curve(y_test,y_pred_svm)
#AUC
auc_svm=auc(svm_fpr,svm_tpr)
auc_log=auc(log_fpr,log_tpr)
auc_GNB=auc(GNB_fpr,GNB_tpr)
auc_RFC=auc(RFC_fpr,RFC_tpr)
plt.figure(dpi=100)
plt.plot(svm_fpr, svm_tpr, linestyle='-', label='SVM (auc=%0.3f)' % auc_svm)
plt.plot(log_fpr, log_tpr, linestyle='-', label='Logistic (auc=%0.3f)' % auc_log)
plt.plot(GNB_fpr, GNB_tpr, linestyle='-', label='GNB (auc=%0.3f)' % auc_GNB)
plt.plot(RFC_fpr, RFC_tpr, linestyle='-', label='RFC (auc=%0.3f)' % auc_RFC)
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend()
plt.show()
It can be clearly seen that the area under the curve satisfies RFC > SVM > GNB > Logistic. So it can be concluded that, for this data, the random forest classifier misclassifies fewer points than the other algorithms.
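The same ordering can be checked numerically with roc_auc_score (a short sketch, reusing the score arrays computed above):
from sklearn.metrics import roc_auc_score
print('RFC AUC:', roc_auc_score(y_test, y_pred_RFC[:, 1]))
print('SVM AUC:', roc_auc_score(y_test, y_pred_svm))
print('GNB AUC:', roc_auc_score(y_test, y_pred_GNB[:, 1]))
print('LOG AUC:', roc_auc_score(y_test, y_pred_logistic))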
# One-hot encode the target for the neural network's softmax output (Chapter 8);
# the original 0/1 labels in y_train and y_test are kept for the scikit-learn models.
y_train_nn = pd.get_dummies(y_train)
y_test_nn = pd.get_dummies(y_test)
y_train_nn
CHAPTER 7
PARAMETER TUNING AND PERFORMANCE EVALUATION
Here we take logistic regression for hyper-parameter tuning because it has lower accuracy than the other algorithms. We try to improve its accuracy through tuning and thereby improve the performance of the model.
7.1 Logistic Regression Hyper-parameter Tuning
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import warnings
estimators = []
estimators.append(('SC', StandardScaler()))
estimators.append(('LR', LogisticRegression()))
estimators
from sklearn.pipeline import Pipeline
model = Pipeline(estimators)
model.fit(X_train,y_train)
model.score(X_test,y_test)
from sklearn.model_selection import cross_val_score, StratifiedKFold
skf = StratifiedKFold(n_splits=10, shuffle=True,random_state=111)
results = cross_val_score(model, X, y, cv=skf)
print(results.mean())
from sklearn.model_selection import GridSearchCV
model.get_params()
pg = {'LR__C': [0.001, 0.01, 0.1, 1.0],
      'LR__penalty': ['l1', 'l2', 'elasticnet', 'none']}
# Note: with the default lbfgs solver only 'l2' and 'none' are supported;
# 'l1' and 'elasticnet' require solver='saga' (elasticnet also needs l1_ratio),
# so those candidates fail to fit and are scored as NaN by the search.
gs_model = GridSearchCV(model, param_grid=pg, cv=10, verbose=2)
gs_model.fit(X_train,y_train)
gs_model.best_params_
gs_model.best_score_
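The tuned pipeline can then be scored on the held-out test set (a short sketch; GridSearchCV refits the best estimator on the training data automatically):
from sklearn.metrics import classification_report
best_lr = gs_model.best_estimator_                     # pipeline refit with the best C and penalty
print('Test accuracy:', best_lr.score(X_test, y_test))
print(classification_report(y_test, best_lr.predict(X_test)))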
7.2 PCA and RandomForest Implementation
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
steps = []
steps.append(('PCA', PCA(n_components=3)))
steps.append(('RF', RandomForestClassifier()))
model = Pipeline(steps)
model.fit(X_train,y_train)
model.score(X_test,y_test)
0.818
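The variance retained by the three principal components can be inspected from the fitted pipeline (a short sketch using the model defined above):
pca = model.named_steps['PCA']
print(pca.explained_variance_ratio_)          # share of variance captured by each component
print(pca.explained_variance_ratio_.sum())    # total variance retained by the 3 components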
7.3 Random Forest Hyper-parameter Tuning
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.ensemble import RandomForestRegressor
# Note: a RandomForestRegressor is used here, following the original run;
# a RandomForestClassifier would match the binary churn target more directly.
rf = RandomForestRegressor(random_state=42)
from sklearn.model_selection import RandomizedSearchCV
skf = StratifiedKFold(n_splits=10, shuffle=True,random_state=111)
results = cross_val_score(model, X, y, cv=skf)
print(results.mean())
from sklearn.model_selection import GridSearchCV
model.get_params()
# Number of trees in random forest
n_estimators = [int(x) for x in np.linspace(start = 200, stop = 2000, num = 10)]
# Number of features to consider at every split
max_features = ['auto', 'sqrt']
# Maximum number of levels in tree
max_depth = [int(x) for x in np.linspace(3,13, num =10)]
max_depth.append(None)
# Minimum number of samples required to split a node
min_samples_split = [2, 5, 10]
# Minimum number of samples required at each leaf node
min_samples_leaf = [1, 2, 4]
# Method of selecting samples for training each tree
bootstrap = [True, False]
# Create the random grid
random_grid = {'n_estimators': n_estimators,
               'max_features': max_features,
               'max_depth': max_depth,
               'min_samples_split': min_samples_split,
               'min_samples_leaf': min_samples_leaf,
               'bootstrap': bootstrap}
print(random_grid)
rf = RandomForestRegressor()
# Random search of parameters, using 3 fold cross validation,
# search across 100 different combinations, and use all available cores
rf_random = RandomizedSearchCV(estimator = rf, param_distributions = random_grid, n_iter = 100, cv = 3, verbose=2, random_state=42, n_jobs = -1)
# Fit the random search model
rf_random.fit(X_train,y_train)
rf_random.best_params_
0.8219999999999998
{'n_estimators': [200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000], 'max_features': ['auto', 'sqrt'], 'max_depth': [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, None], 'min_samples_split': [2, 5, 10], 'min_samples_leaf': [1, 2, 4], 'bootstrap': [True, False]}
Fitting 3 folds for each of 100 candidates, totalling 300 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=-1)]: Done 37 tasks | elapsed: 6.2min
[Parallel(n_jobs=-1)]: Done 158 tasks | elapsed: 18.6min
[Parallel(n_jobs=-1)]: Done 300 out of 300 | elapsed: 38.1min finished
RandomizedSearchCV(cv=3, estimator=RandomForestRegressor(), n_iter=100, n_jobs=-1,
                   param_distributions={'bootstrap': [True, False],
                                        'max_depth': [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, None],
                                        'max_features': ['auto', 'sqrt'],
                                        'min_samples_leaf': [1, 2, 4],
                                        'min_samples_split': [2, 5, 10],
                                        'n_estimators': [200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000]},
                   random_state=42, verbose=2)
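The full cross-validation results of the randomized search can also be inspected as a data frame (a short sketch, assuming rf_random has been fitted as above):
cv_results = pd.DataFrame(rf_random.cv_results_)
cols = ['mean_test_score', 'std_test_score', 'param_n_estimators', 'param_max_depth']
print(cv_results.sort_values('mean_test_score', ascending=False)[cols].head())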
CHAPTER 8
NEURAL NETWORK IMPLEMENTATION
import keras
#pip install keras,tensorflow
from keras.models import Sequential
from keras.layers import Dense
8.1 Building models of neural network
model = Sequential()
#input layer
model.add(Dense(units=12, input_dim=12, activation='relu'))
#activation functions could be: sigmoid, relu, leaky relu, tanh
# 1st hidden layer
model.add(Dense(units=8, activation='relu'))
# 2nd hidden layer
model.add(Dense(units=4, activation='relu'))
# Output Layer
model.add(Dense(units=2, activation='softmax'))
model.summary()
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_28 (Dense)             (None, 12)                156
_________________________________________________________________
dense_29 (Dense)             (None, 8)                 104
_________________________________________________________________
dense_30 (Dense)             (None, 4)                 36
_________________________________________________________________
dense_31 (Dense)             (None, 2)                 10
=================================================================
Total params: 306
Trainable params: 306
Non-trainable params: 0
_________________________________________________________________
8.2 Visualizing the Model
#Visualizing the model
%pip install ann_visualizer
from ann_visualizer.visualize import ann_viz
from graphviz import Source
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train_nn, epochs=25, batch_size=50)
#Prediction time
dm_pred = model.predict(X_test)
dm_pred.shape
(3000, 2)
dm_pred[0]
array([0.99096984, 0.00903017], dtype=float32)
np.max(dm_pred[0])
0.99096984
ann_viz(model, title='Neural Network Model of churn prediction')
graph_source = Source.from_file('network.gv')
graph_source
8.3 Performance evaluation in Neural Network
test = []
for i in range(len(y_test_nn)):
    test.append(np.argmax(y_test_nn.values[i]))   # convert the one-hot rows back to 0/1 labels
pred = []
for i in range(len(dm_pred)):
    pred.append(np.argmax(dm_pred[i]))            # predicted class = index of the larger softmax output
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(test,pred))
print(classification_report(test,pred))
history = model.fit(X_train, y_train_nn, epochs=25, batch_size=50, validation_data=(X_test, y_test_nn))
8.4 Visualize Accuracy and Loss
#To visualize accuracy and loss
import matplotlib.pyplot as plt
plt.figure(figsize=(10,8))
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend(['Train', 'Test'], loc='lower right')
plt.figure(figsize=(10,8))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend(['Train', 'Test'], loc='upper right')
So we can see from the above graphs that the training loss decreases as the number of epochs increases; that is, the more epochs the model is trained for, the lower its training loss becomes, while the validation curves show how well the model generalizes to the test data.
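If the validation loss were to flatten or rise while the training loss keeps falling, the model would be starting to overfit; a common safeguard (a sketch, not used in the run above) is Keras early stopping:
from keras.callbacks import EarlyStopping

# Stop training once the validation loss has not improved for 5 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
history = model.fit(X_train, y_train_nn, epochs=100, batch_size=50,
                    validation_data=(X_test, y_test_nn), callbacks=[early_stop])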