Predicting Customer Churn Using Azure Databricks

Showcasing a customer churn model using the power of Spark

Recorded June 2019


In today's complex, omnichannel retailing market, maintaining a high customer retention rate is critical to success. Understanding when a customer may be at risk of breaking ties with your organization can help you take a more targeted approach to relationship management, plan effectively for the financial impact, and even prevent the loss of customers in the first place.

Using the right tools, it is possible to accurately predict customer churn by analyzing historical data from previous and existing clients. 

In this webinar, the BlueGranite team demonstrates the value of cloud-based technologies for customer churn prediction. Using Azure Databricks, an Apache Spark-based cluster platform, data scientists and data engineers collaboratively build an extremely fast and efficient solution from a mix of product and customer data.

Additionally, we explore how data sets can be enriched to identify root causes of churn so that campaigns and conversations can be created to not only prevent churn, but also to potentially re-acquire dissatisfied customers.

Learn more about BlueGranite's work in the Retail Industry on our Retail and CPG Analytics page. 

 Webinar Summary

  • Learn how to greatly improve your organization's machine learning capabilities for customer churn models using a highly scalable, cloud-based platform
  • Review the data transformations needed to prepare customer datasets for churn analysis

  • Review how to set up easier operationalization (making APIs or scheduling jobs) in a collaborative data engineering and modeling environment for multiple team members to see and interact with at once

  • See how data engineers and data scientists working together can bring their own preferred programming language to the solution - R, SQL, Python, or Scala

  • Review a solution architecture using Azure Databricks for a parallelized data solution, combining data from multiple sources at once, structured or unstructured, with parallelized model training (trying different algorithms and doing a faster hyperparameter sweep and cross-validation)
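The parallelized sweep described in the last point can be sketched roughly as follows. This is only an illustration in plain Python, not the notebook shown in the webinar: the customer records are synthetic, the feature names (`months_inactive`, `support_tickets`) and every function name are hypothetical, and a local thread pool stands in for the Spark cluster. Threads here only illustrate how independent candidates fan out; on Azure Databricks the candidates would be distributed across cluster workers instead.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def make_customers(n=200, seed=0):
    """Build synthetic (months_inactive, support_tickets, churned) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        inactive = rng.randint(0, 12)
        tickets = rng.randint(0, 8)
        # Made-up ground truth with noise; not derived from real data.
        churned = int(inactive + 0.5 * tickets + rng.gauss(0, 1.5) > 8)
        rows.append((inactive, tickets, churned))
    return rows


def score(row, weight):
    """Risk score: inactivity plus weighted ticket count."""
    inactive, tickets, _ = row
    return inactive + weight * tickets


def accuracy(rows, weight, threshold):
    """Share of rows where 'score > threshold' matches the churn label."""
    hits = sum(int(score(r, weight) > threshold) == r[2] for r in rows)
    return hits / len(rows)


def fit_threshold(train, weight):
    """'Train' the model: pick the cutoff with the best training accuracy."""
    candidates = sorted({score(r, weight) for r in train})
    return max(candidates, key=lambda t: accuracy(train, weight, t))


def cross_validate(rows, weight, k=5):
    """Mean held-out accuracy over k folds for one hyperparameter value."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        threshold = fit_threshold(train, weight)
        scores.append(accuracy(held_out, weight, threshold))
    return sum(scores) / k


def sweep(rows, weights, max_workers=4):
    """Evaluate every candidate weight independently, then pick the best."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda w: (w, cross_validate(rows, w)), weights))
    best = max(results, key=lambda pair: pair[1])
    return best, results


customers = make_customers()
best, results = sweep(customers, [0.0, 0.25, 0.5, 1.0])
print("best (weight, mean held-out accuracy):", best)
```

Because each candidate is cross-validated independently of the others, the sweep scales by adding workers, which is the property a Spark cluster exploits at much larger data volumes.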

 Webinar Details

  • Recorded June 2019


Dr. Colby Ford, AI Architect

Coming from a background in mathematics, statistics, and bioinformatics, Colby combines this expertise to bring Data Science to everyone. He utilizes R and Python and puts Machine Learning to work to gain insight from data. Outside of BlueGranite, Colby is an avid pianist and genomics researcher. Check out Colby’s website at

Bret Myers, Senior Consultant

Bret has expertise in data warehouse design and development, the SQL Server BI Stack, and Microsoft Office Professional applications. Bret obtained his Bachelor of Science in Computer Science from Michigan State University and has worked as a BI developer since 2012. He also enjoys working with industries such as healthcare, retail, manufacturing, and nonprofits.







How much energy should we generate next quarter to meet demand?



Based on previous usage history at this location, how many gallons of water will a new customer likely use per month?



How will a cold winter affect natural gas usage compared to the past few years?





How many of this item do I expect to sell this month?



Given that it takes a while to get a delivery, how much of each item should I order?

Customer Traffic

How many workers should I schedule based on projected customer traffic today?


Information Technology

Infrastructure Utilization

What level of bandwidth utilization should we expect each day this month?

System Outages

Given past history of outages, when is the next outage likely to occur?

Personnel Support

Given current trends in employee headcount, how many employees do we expect to support next year?