How to build a clustering model using the Telecom Churn dataset (CSV method)

This article shows you, step by step, how to build a clustering model using the CSV-formatted Telecom Churn demo dataset.

Updated over 5 months ago

This tutorial assumes you've been through Getting Started and relies on the Telecom Churn dataset. A more comprehensive version of this tutorial is available on our blog.

STEP 1: Create a dataset

First, connect a dataset in the G2M platform. Pick one of the available demo data files; the following steps assume you picked the Telecom Churn dataset. Create a CSV dataset using these steps.

STEP 2: Create a model

Next, create a model associated with the dataset you just created. Pick the Clustering Model option when creating your model. You can come back later and try a different model type. Create a model using these steps.

STEP 3: Explore the data used by your model

On the Models page, select the model you just created, then click Explore in the side navigation bar. You will need to load the Telecom Churn CSV file. Click here if you want to learn more about loading CSV files in the G2M platform. Once the file is loaded, click the Explore Variables step and review the variables in your dataset. When you are done, click the Next Steps step. You are now ready to start the Training phase.
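For readers who want to see what reviewing variables amounts to outside the platform, here is a minimal Python sketch. It is not how G2M implements the Explore Variables step; it only counts missing values and distinct values and checks whether each column parses as numeric. The inline sample and every column name other than customerID are hypothetical stand-ins for the real file.

```python
import csv
import io

# Hypothetical three-row sample; only customerID is named in this tutorial.
SAMPLE_CSV = """customerID,tenure,MonthlyCharges,Contract
A-001,1,29.85,Month-to-month
A-002,34,56.95,One year
A-003,2,,Month-to-month
"""


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize(csv_text):
    """Summarize each column: missing count, distinct count, numeric or not."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]
        summary[column] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "numeric": all(is_number(v) for v in present),
        }
    return summary


summary = summarize(SAMPLE_CSV)
for name, stats in summary.items():
    print(name, stats)
```

To try this on a real file, you could pass the file's text to `summarize` instead of the inline sample.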

STEP 4: Train your model

Click Train in the side navigation bar. In the Select Variables step, select the variables you want to use with your model: choose the "Select all filtered and valid as independent variables" option in the bulk action dropdown, then click Apply. In the customerID row, double-click the variable type and change it to RecordIndex. You should now have 1 index variable and 17 independent variables selected.
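The variable roles chosen in this step can be pictured as a simple mapping from column name to role. The sketch below is an illustration only, not G2M code: the role labels mirror this tutorial's wording, the column names other than customerID are hypothetical, and the uniqueness check reflects the usual reason an identifier is kept as an index rather than used as a model input.

```python
def assign_roles(rows, index_column="customerID"):
    """Mark one column as the record index and the rest as independent."""
    columns = list(rows[0].keys())
    ids = [row[index_column] for row in rows]
    # An index should identify each record uniquely.
    if len(set(ids)) != len(ids):
        raise ValueError(f"{index_column} is not unique per record")
    return {
        column: "RecordIndex" if column == index_column else "Independent"
        for column in columns
    }


# Hypothetical records; the real dataset has 17 independent variables.
rows = [
    {"customerID": "A-001", "tenure": 1, "MonthlyCharges": 29.85},
    {"customerID": "A-002", "tenure": 34, "MonthlyCharges": 56.95},
]
roles = assign_roles(rows)
independent = [name for name, role in roles.items() if role == "Independent"]
print(roles)
```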

In the Select Algorithm step select PCA K-Means (simple method) for now. You can come back later and pick a different algorithm. Keep other settings unchanged.
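The algorithm name suggests dimensionality reduction with PCA followed by K-Means clustering. The platform's exact implementation is not described in this tutorial, so the following is only a sketch of that general technique using NumPy. The feature values, the choice of two components and two clusters, and the farthest-point initialization (used here to keep the example deterministic) are all assumptions.

```python
import numpy as np


def pca_project(X, n_components):
    """Standardize columns, then project onto the top principal components."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def farthest_point_init(points, k):
    """Deterministic start: first point, then the point farthest from all centers."""
    centers = [points[0]]
    while len(centers) < k:
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[nearest.argmax()])
    return np.array(centers)


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate assignment and center update."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical features per customer: tenure, monthly charges, support calls.
X = [[1, 70, 5], [2, 75, 4], [3, 72, 6], [60, 25, 0], [58, 20, 1], [62, 22, 0]]
projected = pca_project(X, n_components=2)
labels, centers = kmeans(projected, k=2)
print(labels)
```

In this toy data the first three customers and the last three form two clearly separated groups, so the clustering recovers them regardless of how the principal axes are oriented.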

In the Train Model step, click Start to start training your model. Click here to learn how data can be encoded prior to processing by the analytics engine.
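As a rough illustration of what encoding means here, the sketch below applies one-hot encoding to a categorical column, a common way to turn text categories into numbers. Whether G2M uses this particular scheme is not stated in this tutorial, and the category values are hypothetical.

```python
def one_hot(values):
    """Map each category to a 0/1 indicator vector, one position per category."""
    categories = sorted(set(values))
    encoded = [[1 if value == c else 0 for c in categories] for value in values]
    return categories, encoded


# Hypothetical contract types for four customers.
contracts = ["Month-to-month", "One year", "Month-to-month", "Two year"]
categories, encoded = one_hot(contracts)
print(categories)  # ['Month-to-month', 'One year', 'Two year']
print(encoded)     # [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
```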

Once training is complete you are ready to review results; click the Review Cluster Metrics step to do so.
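This tutorial does not list the metrics shown in the Review Cluster Metrics step. As one common example of a cluster metric, the sketch below computes cluster sizes and the within-cluster sum of squares (lower values mean tighter clusters). The points and labels are made up for illustration.

```python
from collections import defaultdict


def cluster_metrics(points, labels):
    """Return cluster sizes and the within-cluster sum of squared distances."""
    members = defaultdict(list)
    for point, label in zip(points, labels):
        members[label].append(point)

    sizes = {label: len(group) for label, group in members.items()}
    wcss = 0.0
    for group in members.values():
        # Center of the cluster: the mean of each coordinate.
        center = [sum(coord) / len(group) for coord in zip(*group)]
        for point in group:
            wcss += sum((p - c) ** 2 for p, c in zip(point, center))
    return sizes, wcss


# Toy 2-D points in two obvious clusters.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = [0, 0, 1, 1]
sizes, wcss = cluster_metrics(points, labels)
print(sizes)  # {0: 2, 1: 2}
print(wcss)   # 4.0
```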

You may end up with slightly different results due to random sampling of the data when splitting your dataset into a training set and a validation set. After reviewing results click on the Next Steps step to complete the Training phase.
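The effect of random sampling can be sketched with Python's standard library. This is a generic illustration, not the platform's splitting logic; the 20% validation fraction and the `seed` parameter are assumptions chosen for the example.

```python
import random


def split(records, validation_fraction=0.2, seed=None):
    """Shuffle the records, then split them into training and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


records = list(range(10))  # stand-ins for customer rows

# With a fixed seed the split is reproducible.
train_a, valid_a = split(records, seed=42)
train_b, valid_b = split(records, seed=42)
print(train_a == train_b and valid_a == valid_b)  # True

# Without a seed, each run may pick different validation rows.
train_c, valid_c = split(records)
print(len(train_c), len(valid_c))  # 8 2
```

Because a model trained on a different subset sees slightly different data, its clusters and metrics can shift a little from run to run.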

STEP 5: Predict using your model

You are now ready to predict new customer records using your model! To do so follow these steps.
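For a clustering model, prediction typically means assigning a new record to the nearest cluster center. The sketch below shows that idea with made-up centers and records; it is an assumption about the general technique, not a description of G2M's prediction step. In a real pipeline, a new record would first go through the same encoding and transformation as the training data.

```python
import math


def predict_cluster(record, centers):
    """Return the index of the cluster center closest to the record."""
    distances = [math.dist(record, center) for center in centers]
    return distances.index(min(distances))


# Hypothetical cluster centers learned during training.
centers = [(0.0, 1.0), (10.0, 1.0)]

new_records = [(1.0, 1.0), (9.0, 1.0)]
predictions = [predict_cluster(record, centers) for record in new_records]
print(predictions)  # [0, 1]
```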

STEP 6: Wrap-Up

Once you are done with this exercise, feel free to delete your model: go to the Models page, select your model, click the ellipsis in the top right corner of the model card, and select Delete Model. Once your model is deleted, you will also be able to delete your dataset.
