Mar 13, 2019

Improve Predictive Healthcare Models with Azure Databricks

Posted by Jacque Carlson

Microsoft Azure Resources Advance Digital Transformation Within Healthcare

Healthcare organizations struggle with several daunting issues; many affect healthcare quality. Low unemployment, high demand for healthcare providers, and ever-changing healthcare epidemics are a few of the challenges facing healthcare industry leaders. Finding ways to optimize their workforce and workplace is imperative. For many organizations, Microsoft’s data analytics resources are becoming essential to the process.

When healthcare organizations cannot accurately predict patient volume or demand, care quality suffers. Providers find it difficult to spend adequate time with patients, which undercuts the quality of patient care, increases burnout among providers, and drives job turnover.

Healthcare administrators who can predict patient demand and provide necessary staffing levels can positively impact overall productivity. Accurately anticipating healthcare demands bolsters the most vital objectives: patient satisfaction, quality of care, satisfaction and retention of providers, and improvement in healthcare costs and revenues.

Data-driven methods, such as predictive models, are replacing traditional practices of determining providers’ work schedules. The ongoing development, management, and renewal of models that predict patients’ length of hospitalization and the necessary staffing needs for a given patient load can improve the efficiency, cost, and quality of healthcare.

Microsoft’s rich ecosystem of advanced data analytic tools and technologies can provide healthcare organizations with digital transformation opportunities to help control healthcare costs and improve patient care outcomes. Specifically, by providing resources to develop value-based care models faster and more efficiently, these analytic tools and technologies can be key to reducing costs.

Two Microsoft Azure resources – Azure Machine Learning service and Azure Databricks – can advance the development and utility of predictive healthcare models. When using them in combination, a healthcare organization can see faster returns on data-driven initiatives and gain deeper insights.

Azure Machine Learning Service & Azure Databricks

Predictive models provide the data-driven, technology-enabled forecasting that can address many of the predictive needs of healthcare organizations. However, a key bottleneck in predictive model development is the time and organization required to work through the iterative model development process. Several steps in the model development pipeline can become time- and effort-intensive: feature engineering and selection, algorithm selection, tuning of model parameters, and model performance evaluation. On top of these comes the cyclical process of working through all the combinations of features, algorithms, parameters, and evaluation metrics.
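To give a sense of how quickly these combinations multiply, here is a minimal sketch that enumerates every permutation of a small search space. The feature names, algorithm names, and parameter values are hypothetical and chosen only for illustration; they are not taken from the pipeline described in this post.

```python
from itertools import product

# Hypothetical search space: every name and value here is illustrative only.
feature_sets = {
    "demographics": ["age", "gender"],
    "demographics_plus_vitals": ["age", "gender", "pulse", "respiration"],
}
algorithms = ["linear_regression", "random_forest"]
param_grid = {
    "linear_regression": [{"fit_intercept": True}, {"fit_intercept": False}],
    "random_forest": [{"n_estimators": 50}, {"n_estimators": 200}],
}


def enumerate_permutations(feature_sets, algorithms, param_grid):
    """Yield one dict per (feature set, algorithm, parameter setting) combination."""
    for (fs_name, columns), algo in product(feature_sets.items(), algorithms):
        for params in param_grid[algo]:
            yield {
                "feature_set": fs_name,
                "columns": columns,
                "algorithm": algo,
                "params": params,
            }


permutations = list(enumerate_permutations(feature_sets, algorithms, param_grid))
print(len(permutations))  # 2 feature sets x 2 algorithms x 2 settings each
```

Even this toy space yields eight experiments to train, evaluate, and keep track of, which is the bookkeeping burden the rest of this post addresses.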

Building predictive models for patient demands (e.g., length of stay in the hospital) begins with gathering historical health records and running the data through as many of the permutations of the machine learning pipeline as necessary to achieve optimum results. This is where the Workspace feature in Azure Machine Learning (AML) service comes into play.

“Azure Machine Learning service provides a cloud-based environment to develop, train, test, deploy, manage, and track machine learning models.”
Microsoft Documentation

Specifically, the Workspace feature available in the AML service provides a well-designed organizational structure for the building of predictive models. A key capability of the AML Workspace is its ability to reduce the time to develop predictive models by providing a space that organizes model development (model experiments) and the evaluation of model performance (model metrics).

Data scientists can work from many Python-supported environments (e.g., their local machines, data science virtual machines, Azure Notebooks, or cloud-based compute resources) and use their favorite open-source machine learning libraries. Azure Databricks is an Apache Spark-based, unified analytics, cloud-based computing resource. A custom version of the AML service SDK has been created specifically for Azure Databricks. The combination of these Azure resources provides data scientists with added functionality for their machine learning experimentation, testing, and model evaluation processes.

[Image: A view of the Azure Machine Learning service Workspace.]

Using Azure Databricks and an anonymized demo data set of patient health records, a machine learning pipeline was developed to predict patient length of stay in the hospital. The pipeline included data cleaning, exploratory data analysis, feature engineering, algorithm selection, and tuning of algorithm parameters. Model training, testing, and evaluation were organized and managed using the AML Workspace feature within the Azure portal.

Within the modeling code, several of the azureml.core modules are imported, and code is written to log parent-child runs of the different model permutations (or experiments). These data points and metrics are then sent to the identified AML Workspace.

[Image: model code]
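The parent-child logging pattern described above might look roughly like the following sketch. It is not the code from the original post. The function only assumes an object with `child_run`, `log`, `tag`, and `complete` methods, which are methods of the `Run` class in `azureml.core`. So that the sketch runs without an Azure subscription, it uses a hypothetical in-memory stand-in (`InMemoryRun`); the run names, the experiment name, and the metric values are placeholders, not real results.

```python
class InMemoryRun:
    """Hypothetical local stand-in mimicking the few Run methods used below."""

    def __init__(self, name="parent"):
        self.name = name
        self.metrics = {}
        self.tags = {}
        self.children = []
        self.completed = False

    def child_run(self, name=None):
        child = InMemoryRun(name)
        self.children.append(child)
        return child

    def log(self, name, value):
        self.metrics[name] = value

    def tag(self, key, value=None):
        self.tags[key] = value

    def complete(self):
        self.completed = True


def log_permutations(parent_run, results):
    """Record each model permutation as a child run of parent_run."""
    for result in results:
        child = parent_run.child_run(name=result["name"])
        child.tag("algorithm", result["algorithm"])
        for metric_name, value in result["metrics"].items():
            child.log(metric_name, value)
        child.complete()
    parent_run.complete()


# With the azureml-core SDK installed and a workspace config file available,
# the parent run would instead come from something like:
#   from azureml.core import Workspace, Experiment
#   ws = Workspace.from_config()
#   parent_run = Experiment(workspace=ws, name="length-of-stay").start_logging()
parent_run = InMemoryRun()

# Placeholder results: names and numbers are invented for illustration.
results = [
    {"name": "lr_all_features", "algorithm": "linear_regression", "metrics": {"rmse": 1.0}},
    {"name": "rf_all_features", "algorithm": "random_forest", "metrics": {"rmse": 2.0}},
]
log_permutations(parent_run, results)
print([child.name for child in parent_run.children])
```

Passing the run in as an argument keeps the logging logic independent of where it executes, so the same function can be dry-run locally and then pointed at a real Workspace.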

Code to run the model experiments within an Azure Databricks notebook:

[Image: Databricks notebook]

In one command cell, several models are run and evaluated. These models and the associated metrics are then logged and tagged within the AML Workspace that is accessed via the Azure portal. This organization allows for faster, more organized, and less error-prone machine learning pipeline development.
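As a rough illustration of running and evaluating several models in one cell, the sketch below fits two candidate models on a toy training set and compares them on held-out data. Everything here is an assumption made for the example: the data is synthetic, the single feature is hypothetical, and the two candidates (a mean baseline and a one-feature least-squares line) stand in for whatever algorithms a real pipeline would use.

```python
from math import sqrt


def fit_mean_baseline(xs, ys):
    """Return a model that always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_simple_linear(xs, ys):
    """Return a one-feature ordinary least squares model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def rmse(model, xs, ys):
    """Root mean squared error of model predictions on (xs, ys)."""
    return sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Synthetic toy data, not patient records: x is a hypothetical numeric feature,
# y is length of stay in days, generated as y = 2 + x.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
test_x = [6, 7]
test_y = [8.0, 9.0]

candidates = {"mean_baseline": fit_mean_baseline, "simple_linear": fit_simple_linear}

# Fit every candidate and score it on the held-out data in a single loop.
scores = {}
for name, fit in candidates.items():
    model = fit(train_x, train_y)
    scores[name] = rmse(model, test_x, test_y)

best = min(scores, key=scores.get)
print(best, scores)
```

In a real notebook, each entry in `scores` is the kind of value that would be logged to its own child run so the comparison is preserved in the Workspace.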


Within a given experiment, model evaluation run times, algorithm choices, evaluation metrics, graphs, and more can be saved. [Image: model evaluation metrics for one child run (permutation) using a linear regression algorithm.]
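The post does not list which evaluation metrics were logged for this run. As an assumption, the sketch below computes three metrics commonly reported for regression models (mean absolute error, root mean squared error, and R squared) from made-up actual and predicted lengths of stay.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R squared for paired actual and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": sqrt(ss_res / n),
        # R squared is undefined when the actual values have no variance.
        "r2": 1 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Made-up values in days, for illustration only.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

A dictionary like `metrics` maps naturally onto per-run logging: each key becomes a metric name and each value the number recorded for that child run.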

The Workspace feature within the AML service provides a well-structured DevOps utility that reduces the data scientist’s workload of having to track each permutation of selected features, chosen algorithm, parameter selections, and model evaluation.

The AML service's capabilities extend beyond organizing model development. By ingesting clean training data and building the modeling pipeline automatically, the automated machine learning feature within the service takes over much of the development process, further reducing the time and effort involved in model development (see Microsoft’s documentation for more information).

Predictive Analytics in Healthcare

Healthcare organizations can use targeted predictive models to achieve tangible outcomes, such as higher staff morale, improved patient care, and lower healthcare costs. Models that provide accurate predictions (e.g., patient demand) and clear actionable insights (e.g., appropriate staffing schedules, purchase of sufficient medical supplies) position healthcare organizations for increased efficiency and the delivery of high-quality care in today’s cost-competitive markets. Predictive models such as these can be developed by capitalizing on the speed and organization of model development with Azure Machine Learning service’s Workspace and the analytic power of Azure Databricks.

Contact BlueGranite to learn more about how your healthcare organization can capitalize upon the data-driven insights of predictive analytics.

About The Author

Jacque Carlson

Dr. Jacque Carlson is a Data Scientist at BlueGranite. As a former college professor in Research Psychology, her expertise is in methodology, statistics, and human behavioral questions. She is an enthusiastic promoter of well-structured, data-driven decision making. When not at BlueGranite, you can find Jacque outside; she is an avid trail runner and hiker.
