

White Paper

Comparing the Three Major Approaches to Healthcare Data Warehousing:

A Deep Dive Review


by Steve Barlow

Co-founder & SVP Client Operations

Health Catalyst

Under the U.S. government's Affordable Care Act (ACA), health systems are being asked to deliver better care and boost productivity while simultaneously reducing costs and waste. Government mandates, subsidies, and health insurance marketplaces are all part of the mix of incentives to increase compliance with the ACA and ultimately improve the delivery of healthcare and reduce Medicare spending.

The task to improve healthcare presents a significant challenge to providers, health systems, and payers. But according to the Institute for Healthcare Improvement (IHI), if health systems focus on achieving the objectives of a framework called the Triple Aim, they will be able to optimize their performance and meet the government's requirements. As stated by the IHI's website, the Triple Aim objectives are as follows:

Improve the patient experience of care (including quality and satisfaction)

Improve the health of populations

Reduce the per capita cost of healthcare

Copyright © 2017 Health Catalyst

Before health systems can focus on achieving the Triple Aim, however, they need to redesign their system with the following components listed on the IHI's website: (1) focus on individuals and families, (2) redesign primary care services and structures, (3) focus on population health management, (4) implement a cost control platform, and (5) achieve system integration and execution.

A healthcare-specific enterprise data warehouse provides complete and accurate information from across an entire organization.

Changing health systems to fit this new model is possible. But it is critical for health systems to choose the most appropriate data warehouse for healthcare's specific needs as they redesign their systems. This is because the traditional approach of cobbling together reports that pull in data from the many various source systems is too costly and time-consuming. A better approach is for health systems to increase their reliance on complete and accurate information from across the enterprise-wide data ecosystem of their organization, which requires a healthcare-specific data warehouse. Once this data warehouse is implemented, health systems will be able to store and mine their enormous amounts of data to achieve the Triple Aim.


A 2014 report from the Commonwealth Fund reveals that the U.S. healthcare system is the most expensive in the world and ranks last for access, efficiency, and equity in comparison to 10 other nations. The 10 nations from the report include Australia, Canada, France, Germany, the Netherlands, New Zealand, Norway, Sweden, Switzerland, and the United Kingdom. In addition, a 2013 report from the Centers for Disease Control and Prevention (CDC) states that personal healthcare expenditures in the United States total $2.3 trillion, with expenditures for hospital care accounting for 31.5 percent and physician and clinical services accounting for 20 percent of all national healthcare expenditures. Yet despite the high cost of U.S. healthcare, Americans are not any healthier than citizens of other industrialized nations, nor do they enjoy greater longevity. Consider these facts about the costs of healthcare and mortality:

The 2013 report U.S. Health in International Perspective compared the life expectancy of Americans to the citizens of 17 high-income peer countries from Western Europe, Australia, Japan, and Canada. The findings showed that life expectancy for American males ranks last, and life expectancy for American females ranks next to last.

Preterm-related causes of death accounted for 35 percent of infant deaths in 2009 as stated on the CDC website.



At least 44,000 and perhaps as many as 98,000 Americans die in hospitals each year as a result of medical errors according to the book To Err Is Human: Building a Safer Health System.

Medicare could save at least $12 billion per year by reducing preventable readmissions that occur within 30 days of discharge, according to the 2007 report to Congress Promoting Greater Efficiency in Medicare.

These facts highlight merely a few of the problems healthcare is facing. To respond to these issues, there are many data warehouse choices being developed and marketed to health systems. Knowing which model will be the most effective and provide the best return on investment (ROI) can be difficult until the advantages and disadvantages of each option are understood. Then, with this knowledge, health systems will be able to make a well-informed decision about their investment.


Currently there are three main types of data warehouses from which health systems can choose to store and mine their data. The data warehouse models are as follows: the enterprise model, the independent data mart model, and the late-binding™ architecture model. While all three models offer a data warehouse solution, some have unique attributes that make them ideal for healthcare.

1. Enterprise Model

Bill Inmon, called the "Father of Data Warehousing" on his website, developed the enterprise model for data warehouses. This is a complex, top-down design that includes the construction of a big centralized data warehouse from the outset of the planning stages. With the enterprise model approach, it is necessary to determine in advance all of the data elements anyone would ever need to use for data analysis, such as safety and patient satisfaction data. Analysts are forced to make lasting decisions about the data model in the beginning without being able to plan for changes in the short or long term. They then need to structure the database accordingly, which can take months or even years to complete.


For certain industries, such as manufacturing, banking, and retail, or when there is a need to design a new transaction processing system, this model may be appropriate. But in the healthcare analytics environment, the enterprise model is difficult, expensive, and time-consuming to construct because data architects must first build a comprehensive blueprint of various data elements, such as medications, labs, and billing data. To further complicate the process, the blueprint is not necessarily based on data that has already been captured, just the forecasted data needs. As a result, the data model may include superfluous data elements.

Figure 1: The traditional enterprise model requires significant transformation of the data model from the source systems (dark blue ovals) as it is loaded into the enterprise data warehouse (light blue rectangles). Diagram labels include financial sources (e.g., EPSi, Lawson, PeopleSoft), API Time Tracking, Patient, Provider, Bad Debt, Charge, Housekeeping, Cost, Facility, Survey, Census, Timekeeping, and Cath Lab, with the annotations "more transformation," "less transformation," and "enforced referential integrity."

Additional limitations of the enterprise model approach are listed below.

Delayed Time to Value

The enterprise model approach creates additional expense and time for the health system because of the considerable transformation required to force fit the data into the net new data model. This delayed time to value is a significant downside of the enterprise model approach. For example, complex calculations and derivations tend to add increased work and time to an analyst's job. In comparison, other data models allow for inputs to a calculation to be loaded directly into a data warehouse, allowing for greater flexibility and faster delivery of reports.



Obscures Data Quality Issues

The enterprise model has rigid acceptance criteria for the data, causing the need to clean and scrub the data each time it is loaded from the primary system into the secondary system. This method obscures data quality issues and delays the improvement of data quality issues at the primary source. For example, suppose a health system wants to measure gestational age for mothers because studies have shown that inducing labor before 39 weeks increases the risk of complications. What the health system will discover as it goes to pull the data from its EMR is that the data has been captured in many different ways. Some entries may show "39.2," "39 weeks and 2 days," "39W and 2D," or "thirty-nine weeks and two days." The variations go on and on, making it impossible to easily and accurately use the data without a significant cleansing effort.
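To make the cleansing effort concrete, the following is a minimal Python sketch that normalizes the four gestational-age variants quoted above into a (weeks, days) pair. It is an illustration only, not a method described in this paper: the function name `parse_gestational_age`, the helper names, and the regular expressions are all hypothetical. It also assumes that "39.2" means 39 weeks and 2 days rather than 39.2 decimal weeks, which is an ambiguity a real project would have to resolve with the clinicians who entered the data.

```python
import re

# Hypothetical vocabulary for spelled-out numbers (illustrative, not exhaustive).
_ONES = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
_TENS = {"twenty": 20, "thirty": 30, "forty": 40}

# A number is either digits or a word such as "two" or "thirty-nine".
_NUM = r"(\d+|[a-z]+(?:-[a-z]+)?)"
_PATTERN = re.compile(
    rf"{_NUM}\s*(?:weeks?|wks?|w)\s*(?:(?:and|,)?\s*{_NUM}\s*(?:days?|d))?"
)


def _to_int(token):
    """Convert '39', 'two', or 'thirty-nine' to an int; None if unrecognized."""
    if token.isdigit():
        return int(token)
    total = 0
    for part in token.split("-"):
        if part in _ONES:
            total += _ONES[part]
        elif part in _TENS:
            total += _TENS[part]
        else:
            return None
    return total


def parse_gestational_age(raw):
    """Return (weeks, days), or None when the entry cannot be interpreted."""
    text = raw.strip().lower()
    match = re.fullmatch(r"(\d+)\.(\d)", text)
    if match:
        # ASSUMPTION: "39.2" is read as weeks.days, not as decimal weeks.
        weeks, days = int(match.group(1)), int(match.group(2))
    else:
        match = _PATTERN.fullmatch(text)
        if not match:
            return None
        weeks = _to_int(match.group(1))
        days = _to_int(match.group(2)) if match.group(2) else 0
    if weeks is None or days is None or not 0 <= days <= 6:
        return None
    return weeks, days


entries = ["39.2", "39 weeks and 2 days", "39W and 2D",
           "thirty-nine weeks and two days", "term"]
parsed = {entry: parse_gestational_age(entry) for entry in entries}
# Entries that could not be parsed are kept visible instead of being dropped.
unparsed = [entry for entry, value in parsed.items() if value is None]
```

Keeping the raw entry next to the parsed value, and reporting what could not be parsed, leaves the data quality problem visible so it can be corrected at the primary source instead of being scrubbed away on every load.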

Data Is Bound Early

Top-down data warehouse models require early binding of the data. (Data binding is a technique in which raw data elements are mapped to conceptual definitions.) Early binding means the data is mapped into a predefined data model as it is brought into the warehouse, which limits the ability to make changes to the data in the future. For situations where data rules are relatively static, nonvolatile, and do not frequently change, early binding may be appropriate. Industries that employ early binding models include manufacturing, communications, retail, and financial services.
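The difference between binding at load time and binding later can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the architecture of any particular product: the source codes (`DX_A`, `DX_B`, `DX_C`), the two rule versions, and all function names are hypothetical placeholders.

```python
# Hypothetical source rows: (patient_id, source_code). Codes are placeholders.
SOURCE_ROWS = [("p1", "DX_A"), ("p2", "DX_B"), ("p3", "DX_C")]

# Version 1 of a rule: which source codes map to the concept "in registry".
RULE_V1 = {"DX_A"}
# Version 2, after the definition changes: DX_B now qualifies as well.
RULE_V2 = {"DX_A", "DX_B"}


def load_early_bound(rows, rule):
    """Early binding: apply the rule during load and keep only the result."""
    return [{"patient_id": pid, "in_registry": code in rule}
            for pid, code in rows]


def load_late_bound(rows):
    """Late binding: keep the raw source value and bind nothing at load."""
    return [{"patient_id": pid, "source_code": code} for pid, code in rows]


def registry_members(late_rows, rule):
    """Bind at query time, using whichever rule version the analysis needs."""
    return [row["patient_id"] for row in late_rows
            if row["source_code"] in rule]


early = load_early_bound(SOURCE_ROWS, RULE_V1)
late = load_late_bound(SOURCE_ROWS)

members_v1 = registry_members(late, RULE_V1)
members_v2 = registry_members(late, RULE_V2)

# The early-bound rows no longer carry the source code, so applying RULE_V2
# to them would require going back to the source systems and reloading.
early_can_rebind = all("source_code" in row for row in early)
```

In this sketch the late-bound rows answer both versions of the rule from the same load, while the early-bound rows only hold the answer to the first version.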

When health systems try to bind every data element to business rules early, however, they face a time-consuming and expensive approach to data warehousing. It is also difficult to make changes to the data. This is because business rules and vocabulary standards in healthcare are among the most complex in any industry, and they undergo almost constant change, resulting in high volatility.

Figure 2: This illustration details the six points in a data warehouse where data can be bound to rules and vocabularies. Rules and vocabularies with low volatility can be bound at points 1 and 2 (early binding); points 4 and 5 are appropriate for those with high volatility.

In fact, there are only a limited number of core data elements that should be bound early because they are fundamental to almost all analytic use cases. Because they are fundamental and not volatile, it is appropriate to bind those data elements early. All other data elements should be bound late (i.e., when clinicians are trying to solve a problem) because of their volatility. For example, length of stay (LOS) in a hospital may sound straightforward on paper, but surgeons might define LOS as point of incision to discharge from the post-anesthesia care unit (PACU), and cardiologists might define it as emergency department (ED) arrival to discharge. Because the LOS definition will change for different use cases, the objective is to bind it later. The following examples show which types of data are volatile and which types are not volatile.

Volatile data that should be bound late:

Calculating length of stay (LOS)

Attributing a primary care provider to a particular patient with a chronic disease

Calculating revenue (or expense) allocation and projections to a department or physician

Data definitions of general disease states for patient registries

Defining patient exclusion criteria for disease and/or population management

Defining patient admission, discharge, and transfer rules
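Taking length of stay, the first item in the list above, as an example, late binding could look like the Python sketch below. The encounter record, its field names, the timestamps, and the `LOS_DEFINITIONS` table are all hypothetical; the two definitions follow the surgeon and cardiologist examples given earlier, and a real implementation would depend on how each source system records these events.

```python
from datetime import datetime

# Hypothetical encounter with raw timestamps loaded as-is from source systems.
encounter = {
    "ed_arrival": datetime(2017, 3, 1, 8, 0),
    "incision": datetime(2017, 3, 1, 14, 0),
    "pacu_discharge": datetime(2017, 3, 1, 18, 30),
    "hospital_discharge": datetime(2017, 3, 3, 8, 0),
}

# Each use case binds its own LOS definition as (start_field, end_field).
# Adding or changing a definition does not require reloading any data.
LOS_DEFINITIONS = {
    "surgery": ("incision", "pacu_discharge"),
    "cardiology": ("ed_arrival", "hospital_discharge"),
}


def length_of_stay_hours(enc, use_case):
    """Compute LOS in hours using the definition bound to this use case."""
    start_field, end_field = LOS_DEFINITIONS[use_case]
    delta = enc[end_field] - enc[start_field]
    return delta.total_seconds() / 3600


surgery_los = length_of_stay_hours(encounter, "surgery")
cardiology_los = length_of_stay_hours(encounter, "cardiology")
```

The same stored encounter yields 4.5 hours under the surgical definition and 48 hours under the cardiology definition, which is why fixing a single LOS calculation at load time would not serve both groups.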

Nonvolatile data that may be bound early:

Facility identifier

Provider identifier

Patient identifier

Gender

Date

Time of arrival

Boundless Scope

When there is boundless scope and complexity with a data warehouse project, the final product may miss the mark or completely fail to deliver the functionality leadership expects.

Healthcare business processes can be complex, and teams can spend anywhere from six months to multiple years mapping their organization's information systems to a single enterprise-wide data model. To account for this intricacy, the data model can become enormous in scope and complexity and may miss the mark in terms of having the functionality leadership expects. In fact, Gartner estimated in 2005 that as many as 50 percent of data warehouse projects
