Open source software makes machine learning accessible to healthcare workers

Steve Rogerson
December 6, 2016
Health Catalyst believes its repository of healthcare-focused open source software can make machine learning accessible to the thousands of healthcare professionals who possess little or no data science skills but who share an interest in using the technology to improve patient care.
Use of machine learning and predictive analytics to improve health outcomes has so far been limited to highly trained data scientists, mostly in the top academic medical centres.
However, by making its central repository of proven machine learning algorithms available for free, the company aims to enable a large, diverse group of technical healthcare professionals to use machine-learning tools to build accurate models quickly. The site provides one central spot to download algorithms and tools, read documentation, request features, submit questions, follow the blog and contribute code.
The project was started by Health Catalyst, a Utah-based data warehousing, analytics and outcomes improvement company that is contributing ongoing support to the open source community. The company has used these tools to build predictive models that drive its clients' outcomes improvement efforts and span its product lines. These include a predictive model for central line-associated bloodstream infection, readmission models for COPD and other chronic conditions, schedule optimisation, and financial predictions such as patient propensity to pay.
"Machine learning and artificial intelligence are going to transform healthcare," said Dale Sanders, executive vice president of Health Catalyst. "We are seeing amazing results and yet we are barely getting started. We are applying it to the reduction of patient harm events, care management, hospital-acquired infections, revenue cycle management, patient risk stratification, and more. With machine learning, the data are talking to us, exposing insights that we've never seen before with traditional business intelligence and analytics."
He said that by open sourcing the repository, the company hoped to facilitate industry-wide collaboration and advance the adoption of machine learning, making it easy for healthcare organisations to learn from and enhance these tools together, without the need for a team of data scientists.
"All of us have seen what open source software has achieved in other industries and we want to be a part of that in healthcare," he said.
The repository makes it easy to create predictive and pattern recognition models using a healthcare organisation's own data. It features packages for two common languages in healthcare data science – R and Python. These packages are designed to streamline healthcare machine learning by simplifying the workflow of creating and deploying models, and delivering functionality specific to healthcare.
Both packages provide an easy way to create models on a health system's own data. This includes linear and random forest models, ways to handle missing data, guidance on feature selection, proper performance metrics, and easy database connections.
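To make the workflow described above concrete, here is a minimal sketch in standard-library Python of three of the steps mentioned: filling in missing values, fitting a linear (logistic) model, and scoring it with a suitable performance metric (area under the ROC curve). It is an illustration only and does not use or reproduce the Health Catalyst packages; the function names, the learning rate and the synthetic data are all assumptions made for the example.

```python
import math
import random


def column_means(rows):
    """Mean of the observed (non-None) values in each column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return means


def impute(rows, means):
    """Replace each missing value (None) with the supplied column mean."""
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of the positive class for each row."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(y_true, scores):
    """Area under the ROC curve: chance a positive outranks a negative (ties count half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n=200, seed=0):
    """Made-up data (not real patient data): two features, about 10% of values missing."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y.append(1 if x1 + 0.5 * x2 + rng.gauss(0, 0.5) > 0 else 0)
        X.append([None if rng.random() < 0.1 else v for v in (x1, x2)])
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
    means = column_means(X_train)  # learn imputation values from training rows only
    w, b = fit_logistic(impute(X_train, means), y_train)
    scores = predict_proba(impute(X_test, means), w, b)
    print("held-out AUC:", round(auc(y_test, scores), 3))
```

The column means are computed on the training rows alone and then reused for the held-out rows, so that no information from the evaluation data leaks into the model. A random forest, which the article also mentions, would replace only the `fit_logistic` and `predict_proba` steps; the imputation and scoring would stay the same.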
"We believe that machine learning is too helpful and important to be handled solely by full-time data scientists," said Sanders. "The new tools in [the repository] enable BI developers, data architects and SQL developers to create appropriate and accurate models with healthcare data, without hiring a data scientist. These tools will democratise machine learning in a realm that needs it most, because everyone benefits when healthcare is made safer, more efficient and effective. And we are not just being altruistic here. By submitting our tools and algorithms to the open source community, we and our clients will benefit from the collective intelligence that exists beyond our team of data scientists."
Health Catalyst is a data warehousing, analytics and outcomes-improvement company that helps healthcare organisations of all sizes perform clinical, financial and operational reporting and analysis.