Three Key Actions for High-Quality Data in Healthcare

Katie McCurdy’s article “The Limits of Health Data Aggregation” accurately points out the flaws in our current healthcare data repositories. She describes the “holy grail” mentality: the belief that if we could achieve interoperability and data aggregation, we could gain insights that would allow us to improve diagnosis and treatment for all patients. But we know that electronic medical record data, coding and billing data, clinical registry data, administrative data, etc., all suffer from poor data quality. Unfortunately, the combination of human error (accentuated by the volume model in healthcare) and poorly designed, fragmented systems means that just making data available and interoperable will not be sufficient to gain meaningful insights from healthcare datasets: garbage in, garbage out. After working with many data scientists and engineers for over a decade, we have learned about the need for decentralized, curated data in the context of each definable care process, which is achieved by engaging the front-line clinical team. There are at least three key actions for high-quality data in healthcare:

  1. Focus on “What Matters” – For data to be meaningful, it requires context (a whole, definable care process). It is important to determine the most crucial patient factors, treatment factors, and outcomes to be measured relative to the specific care process that is being analyzed. We collect far too much data for each patient problem in healthcare. Probably less than 10% of the data collected have any meaningful impact on patient outcomes, and those meaningful data will differ depending on the context. Getting rid of the noise and finding what matters most leads to a high-quality dataset. Documenting only what really matters for each care process would also significantly decrease the administrative burden of data entry for healthcare workers, which is a major contributing factor to burnout and job dissatisfaction.

  2. Curate the data with a small, diverse team – As mentioned, healthcare data, as it exists, is of poor quality. A small, diverse team made up of the people most engaged in the specific care process being analyzed is well placed to assess and improve that quality. This team can help determine “what matters,” identify where anomalies and obvious errors exist, and suggest how to improve measurements. Even patients can contribute to improvement: past patients of our hernia program who had wound complications suggested ways to measure those complications that reflected the value of the care they received much better than the current CDC method for measuring wound complications.

  3. Facilitate periodic feedback loops – Gaining insights from data is not magic; it requires intentional data curation and the use of appropriate data visualization and analytical tools. These tools are not intended to prove that one treatment is better than another; they are designed to help the team learn from the data and apply what it learns to improve measured outcomes. After an analysis is done, the front-line clinical team can use the insights gained to try to improve both measurements and outcomes. Additional data are then collected and analyzed again. This cycle is repeated over time to gain more valuable insights and to develop a dataset whose quality improves with each feedback loop.


By: Bruce Ramshaw, MD | Co-Founder & CEO of CQInsights