In highly competitive markets, retaining customers and understanding their behaviour is essential, and it is especially crucial for businesses with subscription-based products. A bad experience can lead customers to switch to one of the many other providers within the same product category. Businesses have to deal with churn because it can cause material losses and, in turn, affect reputation and policy decisions. In this project, we will discuss the attributes that impact churn and explore how a machine learning model can predict it.

Let us first look at how to prepare and model the data, and then walk through the code used to make predictions.

**Preparing the data:**

The accuracy of the model depends on collecting relevant data, and in general more data gives the model more to learn from. Once sufficient data has been collected for the analysis, the next step is *data preparation*. Before getting started with the prediction, we clean and transform the data. Cleaning the data ensures uniformity. It is an important step prior to processing and often involves reformatting data, making corrections to data, and combining data sets to enrich the data. We then pre-process the data and transform it by normalizing or scaling it.
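As a rough illustration of the cleaning steps described above, the sketch below uses pandas on a small made-up table. The column names (`customer_id`, `country`, `balance`, `tenure`) and all values are hypothetical and are not taken from the churn data set.

```python
import pandas as pd

# Hypothetical raw records: inconsistent text formatting, a missing value, a duplicate row
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "country": [" france", "Spain ", "Spain ", "GERMANY"],
    "balance": [100.0, None, None, 250.0],
})

# Hypothetical second table used to enrich the first one
extra = pd.DataFrame({"customer_id": [1, 2, 3], "tenure": [2, 5, 1]})

clean = raw.copy()
# Reformat: strip whitespace and use one consistent capitalisation
clean["country"] = clean["country"].str.strip().str.title()
# Correct: replace the missing balance with the column median
clean["balance"] = clean["balance"].fillna(clean["balance"].median())
# Remove the duplicated record
clean = clean.drop_duplicates().reset_index(drop=True)
# Combine: enrich with the second table
clean = clean.merge(extra, on="customer_id", how="left")

print(clean)
```

Which corrections are appropriate (for example, median filling versus dropping rows) depends on the data, so this is one possible choice and not a recipe.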

For this project, we will use an Artificial Neural Network (ANN).

**Artificial Neural Network:**

The term “**Artificial Neural Network**” is derived from the biological neural networks of our nervous system. The human brain consists of roughly 86 billion neurons interconnected with one another, which help us respond to stimuli and take in and understand various kinds of data. In other words, the human brain is made up of an enormous number of parallel processors. Artificial neural networks were designed to mimic the human brain and make decisions in a human-like manner. They are loosely modelled on interconnected brain cells.

**Structure of Neural Networks:**

Understanding the structure of a neural network is crucial to understanding how it actually works. A network consists of a large number of artificial nodes arranged in a sequence of layers: an input layer, one or more hidden layers, and an output layer.

**Input layer:** As the name suggests, it accepts the inputs provided by the programmer. It represents the raw information that is fed into the network.

**Hidden layer:** The hidden layer sits between the input and output layers. There may be one or more hidden layers in a network. The actual processing is done in these layers. Each node takes its inputs, computes their weighted sum, and adds a bias (the transfer function). The weighted total is then passed as input to an activation function to produce the node's output.

**Output layer:** The output layer produces the final result of the series of transformations applied to the input.
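To make the hidden-layer computation concrete, here is a minimal NumPy sketch of one layer: a weighted sum plus a bias, followed by an activation. The input values, the weights, and the `dense_forward` helper are made up for illustration and are not part of the project code.

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: keeps positive values, replaces negatives with 0."""
    return np.maximum(0.0, z)

def dense_forward(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = inputs @ weights + bias
    return activation(z)

# One sample with 3 input features (illustrative values)
inputs = np.array([[1.0, 2.0, 3.0]])
# 3 inputs feeding 2 hidden nodes: weight matrix of shape (3, 2)
weights = np.array([[0.5, -1.0],
                    [0.25, 0.5],
                    [0.5, -1.0]])
bias = np.array([0.5, 1.0])

hidden = dense_forward(inputs, weights, bias, relu)
print(hidden)  # node 1 gives 3.0; node 2 sums to -2.0 and is clipped to 0.0
```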

Let us now walk through the code.

First, we import the necessary packages, namely pandas, NumPy, and Matplotlib, so that the later operations can be carried out.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

Now we have to load the data set. For this, we use **read_csv**.

    dataset = pd.read_csv('Churn_Modelling=4+&+5.csv')

Print the dataset.

    print(dataset)


Next, separate the dependent and independent variables. “iloc” is used to **select particular cells, rows, or columns by integer position**, in the order in which they appear in the data set.

    x = dataset.iloc[:, 3:13].values  # columns at positions 3 to 12: independent variables
    y = dataset.iloc[:, -1].values    # last column: dependent variable
    print(x.shape)
    print(y.shape)

**Output**:

    (10000, 10)
    (10000,)

__Explanation__: Here x holds the independent variables and y holds the dependent variable.
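The position-based slicing used above can be tried on a toy table. The column names and values below are made up; only the `iloc` slicing pattern mirrors the one used for the churn data.

```python
import pandas as pd

# Toy table with made-up columns: an ID, two features, and a target in the last column
toy = pd.DataFrame({
    "id": [101, 102, 103],
    "age": [25, 40, 31],
    "balance": [0.0, 1500.5, 320.0],
    "exited": [0, 1, 0],
})

features = toy.iloc[:, 1:3].values  # all rows, columns at positions 1 and 2 (3 is excluded)
target = toy.iloc[:, -1].values     # all rows, last column

print(features.shape)  # (3, 2)
print(target.shape)    # (3,)
```

As with ordinary Python slices, the end position is exclusive, which is why `3:13` in the project code selects ten columns.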

Now we import LabelEncoder, which encodes categorical labels as integers between 0 and n_classes - 1.

    from sklearn.preprocessing import LabelEncoder

    le1 = LabelEncoder()
    x[:, 1] = le1.fit_transform(x[:, 1])
    print(x)


**Explanation: ‘fit_transform’ fits the encoder to the data and then transforms it. Here we transformed the second column of x.**
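A small sketch of what the encoder does, using a made-up list of labels in place of the real column:

```python
from sklearn.preprocessing import LabelEncoder

# Made-up categorical column
countries = ["Spain", "France", "Germany", "France"]

encoder = LabelEncoder()
codes = encoder.fit_transform(countries)

print(list(encoder.classes_))                   # classes are stored in sorted order
print(list(codes))                              # each label replaced by its index in classes_
print(list(encoder.inverse_transform(codes)))   # back to the original labels
```

Because the codes are just positions in the sorted class list, they imply an ordering that the categories may not really have; that is acceptable for a two-valued column but worth keeping in mind for columns with more categories.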

Now we will transform the third column with a second label encoder object, le2.

    le2 = LabelEncoder()
    # A separate encoder keeps le1's fitted classes for the second column intact
    x[:, 2] = le2.fit_transform(x[:, 2])
    print(x)


We will use the train_test_split function from the scikit-learn library to divide the data into training and test sets. Let us look at how to split the dataset into these subsets.

    from sklearn.model_selection import train_test_split

    x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.8, random_state=0)

__Explanation__: 80% of the data is used for training and 20% for testing. Setting random_state means the training and test sets will be the same on every run; if we do not set random_state, the split is not deterministic and will differ from run to run.
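The effect of random_state can be checked on a tiny made-up array: splitting twice with the same seed gives the same rows in each subset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten made-up samples with two features each, and made-up binary labels
features = np.arange(20).reshape(10, 2)
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Same random_state twice: identical splits
a_train, a_test, _, _ = train_test_split(features, labels, train_size=0.8, random_state=0)
b_train, b_test, _, _ = train_test_split(features, labels, train_size=0.8, random_state=0)

print(a_train.shape, a_test.shape)     # 8 training rows, 2 test rows
print(np.array_equal(a_test, b_test))  # True: the split is reproducible
```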

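Now we fit a StandardScaler object on the training set only and use it to standardize the independent variables in both the training and test sets. The test set is transformed with the mean and standard deviation learned from the training set, so that no information from the test data leaks into the preprocessing.

    from sklearn.preprocessing import StandardScaler

    sc = StandardScaler()
    x_train = sc.fit_transform(x_train)  # learn mean and std from the training set
    x_test = sc.transform(x_test)        # reuse the training statistics on the test set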
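As a sketch of what standardization computes, the example below fits a scaler on made-up training values and applies it to made-up test values, then reproduces the result by hand. A common practice, assumed here, is to take the mean and standard deviation from the training data only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up single-feature data
train = np.array([[1.0], [2.0], [3.0], [4.0]])
test = np.array([[2.5], [10.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train
test_scaled = scaler.transform(test)        # reuses the training mean and std

# The same result computed by hand: (value - train mean) / train std
manual = (test - train.mean(axis=0)) / train.std(axis=0)

print(train_scaled.mean(), train_scaled.std())  # approximately 0 and 1
print(np.allclose(test_scaled, manual))         # True
```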

__Importing Keras Python Library (ANN in action):__

Keras is a high-level deep learning API for building neural networks. It is used for developing and evaluating deep learning models. The Sequential model API is a way of creating deep learning models by adding layers one at a time.

Let us import Keras first. After that, we will add the input layer and the first hidden layer.

    import keras
    from keras.models import Sequential
    from keras.layers import Dense

    ANN_model = Sequential()
    ANN_model.add(Dense(units=6, input_dim=10, kernel_initializer='uniform', activation='relu'))

__Explanations__**: Units: the output size of the layer. One common rule of thumb is to use the average of the number of nodes in the input and output layers, here (10 + 1) / 2 = 5.5, rounded up to 6.**

__Kernel_initializer__**: The initializer tells Keras how to set the initial values of the layer's weight matrix. The bias vector has its own parameter, bias_initializer.**

__Activation__**: The element-wise activation function used in the dense layer, here the Rectified Linear Unit (ReLU).**

__Input_dim__: The number of independent input variables. It is specified only for the first layer added to the model, which here is the first hidden layer.
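As a small worked example of how `input_dim` and `units` determine the size of a dense layer: each of the `units` nodes has one weight per input plus one bias. The `dense_param_count` helper below is hypothetical, written only for this illustration.

```python
def dense_param_count(n_inputs, units):
    """Trainable parameters in a fully connected layer: weights plus biases."""
    return n_inputs * units + units

# First hidden layer from the text: 10 inputs, 6 units
first_hidden = dense_param_count(10, 6)
print(first_hidden)  # 10 * 6 weights + 6 biases = 66
```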

Now we create the second hidden layer and the output layer, and compile the model.

    ANN_model.add(Dense(units=6, kernel_initializer='uniform', activation='relu'))
    ANN_model.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
    ANN_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])


__Explanations__: **The activation function of the last layer differs from that of the previous layers. ‘sigmoid’ is normally used here because the output is Boolean (binary).**

__Optimizer__**: Updates the weight parameters to minimize the loss function.**

__Loss function__**: Acts as a guide to the terrain, telling the optimizer whether it is moving in the right direction. It also tells us how much the predicted output differs from the actual output.**

__Metrics__: A metric is a function used to judge the performance of the model. Unlike the loss, it is not used to train the model. Here we track accuracy.
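The relationship between the sigmoid output, the loss, and the accuracy metric can be sketched in plain NumPy. The helper functions and the toy values below are illustrative assumptions, not the Keras implementations.

```python
import numpy as np

def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps avoids log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    return np.mean((y_prob > threshold).astype(int) == y_true)

y_true = np.array([1, 0, 1, 0])
good = sigmoid(np.array([3.0, -3.0, 2.0, -2.0]))  # confident and correct
poor = sigmoid(np.array([0.2, -0.1, -0.5, 0.3]))  # hesitant, two of them wrong

print(binary_crossentropy(y_true, good), accuracy(y_true, good))
print(binary_crossentropy(y_true, poor), accuracy(y_true, poor))
```

The loss is lower for the confident, correct predictions than for the hesitant ones, which is the signal the optimizer follows, while accuracy only counts how many thresholded predictions match.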

We call fit(), which trains the model by slicing the data into batches of ‘batch_size’ and repeatedly iterating over the whole dataset for a given number of epochs.

    ANN_model.fit(x_train, y_train, batch_size=50, epochs=50)

Output:

Runs until epoch 50/50.
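To see what batch_size and epochs mean in numbers, here is the arithmetic for this run. It assumes the training set has 8,000 rows, which follows from the 80/20 split of the 10,000 rows shown earlier.

```python
import math

n_train = 8000    # 80% of the 10,000 rows
batch_size = 50
epochs = 50

steps_per_epoch = math.ceil(n_train / batch_size)  # batches needed to cover the data once
total_updates = steps_per_epoch * epochs           # one weight update per batch

print(steps_per_epoch)  # 160
print(total_updates)    # 8000
```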

We can now get predictions for the test set with ANN_model.predict(x_test), or pass a new array of the same shape, scaled in the same way, to get a new prediction.

    y_pred = ANN_model.predict(x_test)
    # Threshold the predicted probabilities at 0.5 and flatten to a 1-D array of 0/1 labels
    y_pred = (y_pred > 0.5).astype(int).ravel()
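For a single new customer, the input has to be a 2-D array with one row and the same number of columns as the training data, scaled with the scaler that was fitted on the training set. The sketch below shows only that preparation step, on made-up numbers with three features instead of ten; the commented-out last line marks where the trained model would be called.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up training data with 3 features (the churn model above uses 10)
train = np.array([[600.0, 40.0, 1.0],
                  [700.0, 30.0, 0.0],
                  [650.0, 50.0, 1.0]])
scaler = StandardScaler().fit(train)

# One new sample as a flat vector, reshaped to (1, n_features)
new_sample = np.array([680.0, 35.0, 0.0]).reshape(1, -1)
new_scaled = scaler.transform(new_sample)

print(new_scaled.shape)  # (1, 3)
# churn_probability = ANN_model.predict(new_scaled)  # requires the trained model
```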

A good way to evaluate the performance of a classifier is to look at the confusion matrix. Each row in a confusion matrix represents an actual class, while each column represents a predicted class.

So now we will import confusion_matrix from scikit-learn and compute the confusion matrix.

    from sklearn.metrics import confusion_matrix

    cm = confusion_matrix(y_test, y_pred)
    cm

**Output**:

    array([[1557,   38],
           [ 281,  124]], dtype=int64)

Next we import accuracy_score from scikit-learn and compute it. It returns the accuracy classification score.

    from sklearn.metrics import accuracy_score

    accuracy_score(y_test, y_pred)

__Output__: 0.8405

__Explanation__: *The output indicates that the model's predictions are about 84% accurate on the test set.*
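The accuracy can also be recovered by hand from the confusion matrix reported above. The sketch below does that and, assuming class 1 means the customer churned, derives recall and precision for that class as well.

```python
import numpy as np

# Confusion matrix reported above: rows are actual classes, columns are predicted classes
cm = np.array([[1557, 38],
               [281, 124]])

tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()  # share of all predictions that are correct
recall = tp / (tp + fn)          # share of actual class-1 cases that were caught
precision = tp / (tp + fp)       # share of predicted class-1 cases that are correct

print(round(accuracy, 4))   # 0.8405
print(round(recall, 4))     # 0.3062
print(round(precision, 4))  # 0.7654
```

On these numbers the accuracy matches the 0.8405 returned by accuracy_score, while recall for class 1 is considerably lower, which accuracy alone does not show.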