February 7, 2024

Why Data Product Design is More Important Than Ever for Building Data Warehouses

Sam Abraham
Marketing Manager

Data warehouses are complex to implement, no matter which approach you choose: Kimball, Inmon, or Data Vault.

There’s no denying the need for good engineering and a relatively modern data stack. Most organizations realize this and have invested in capable data engineers.

But it is not the technical implementation that brings the most value from the project.

And bad engineering is not why most data warehouse projects fail.

Data warehouses are, and always were, intended to support business decisions.

Ralph Kimball, one of the OGs, identifies a failure to do so as the number one pitfall when implementing a data warehouse.

“Neglect to acknowledge that data warehouse success is tied directly to user acceptance. If the users haven’t accepted the data warehouse as a foundation for improved decision making, then your efforts have been exercises in futility.”

And three more of his top 10 pitfalls relate to bridging the gap between business and tech.

Are Data Warehouses Fulfilling Their Promise of Data-Based Decisions?

We say no.

It’s not just us. More and more data teams are realizing that data projects are too far away from business impact.

Screenshots of social media posts from data experts on the gap between data teams & business results.

The feeling goes beyond anecdotal evidence. A 2023 Gartner report said that “Chief Data and Analytics Officers need to create value and make an impact NOW.”

We don’t think the world of data has changed drastically in the nine months (as of February 2024) since that report was published.

The report added: “Current data and analytics governance practices are insensitive to business context, making them inadequate for responding quickly to opportunities. Data and analytics leaders must modernize existing governance practices to materialize return on investment.”

This applies to most data investments, including data warehouses.

Companies choose to build data warehouses for two reasons:

1. Reporting requirements.

These are often mandatory for compliance reasons, and the additional benefit of business analytics is something companies are willing to invest in right now.

2. Data-driven decision-making.

This goes beyond the dashboard. Analytics currently focuses on tracking historical and current data. But the idea of being data-driven is only realized when you build accurate predictive models.

There is a massive gulf between what existing data warehouses can do for predictive analytics and the models needed to reliably inform business decisions.

Few businesses that have invested heavily in “data” have been able to see a return on investment on the predictive side of things.

Most are not prepared to take advantage of the recent developments in generative AI and LLMs, and will not catch up for many years.

Why Technology Won’t Turn the Tide

Ironically, investments in the modern data stack have aimed to solve usability issues through technology. This has resulted in better ways to work with data.

But continuing to improve the tech side is trying to build something that helps data engineers build something that helps data scientists/analysts build something that helps business analysts build something that’s useful for a decision maker.

The answer always seems to be a better, faster way to move and organize data. Automations, databases in the cloud, agile software tools, and so on.

It’s another top 10 Kimball pitfall, to “become overly enamored with technology and data rather than focussing on the business’s requirements and goals.”

Maybe the key to solving the problem is simpler, and also more complicated.

It’s simple because it’s what humans have always done: talk to each other.

It’s complicated because businesses are complicated and modeling a useful process is complicated.

An infographic representation of a data warehouse project.

What Makes for a Good Data Warehouse?

We won’t go over data engineering. There are incredibly talented people who’ve already solved pretty much every possible problem.

But not all data warehouses are planned the same way. A warehouse might be built well while its blueprint still fails to support business needs.


The objective and approach are what determine the business outcome of a data project.

Ending up with a recognizable and relatively accurate schema does not mean the data warehouse can become the basis for more accurate reporting or predictive analytics.

As we’ve already said, a data warehouse is a complex project for any large enough company. There is no easy way to come in and unify systems and business domains to deliver universal results for everyone who needs data.

You need to build towards simplifying processes and making data products scalable.

Why Consulting with Business Leaders is Important in a Data Warehouse Project

Currently, far too often, the project team structure and decisions flow in one direction:

- A CXO or a domain expert demands a data warehouse to either meet regulatory requirements or to support their business needs.

- The demand is passed on to a technical head like the CIO, CDO or even the CTO.

- The task is assigned to a technical team with a project manager, who then gathers requirements (with the business experts involved).

- The requirements don’t go beyond the data points. There is, at best, a basic understanding of what the business needs.

- The tech/data team then starts building the warehouse, often piece by piece within an Agile Development framework.

- The tech/data team cannot provide visibility to the project manager in a way that gives them an idea of what the result is going to be. They’re technically adept and the project is probably well-engineered.

- This then means that the PM cannot inform the end user or the CXO of the probable success of the project with any confidence. Again, technically, they’re on track.

- The key stakeholders don't have visibility of the end product through the lifetime of the project.

- Data projects often seem like a black box because the end result is not guaranteed until the last stage.

- Technically, everything is fine. And that's the point. This is not a technology problem.

So, after a decade of massive data warehouse investments, poor results are putting pressure on CXOs and data teams to provide better business-oriented results.

Data Modeling as the Bridge Between Business & Technology

You’re probably familiar with data modeling. Conceptual data modeling ignores technical aspects of data engineering and data warehouse architecture to focus on capturing business needs.

It’s the semantic layer that is primarily influenced by the domain expert, while the data modeler plays the supporting role.

Learn (a lot) more on Ellie’s YouTube channel.

Such an approach gives you three advantages:

1. You understand the business’s pain points, something technology cannot do.

2. You take all the investments in technology and provide business value.

3. You can tailor solutions to a business unit while still being able to build data warehouses that support multi-billion dollar companies.

The pushback to data modeling when building data warehouses was based on a feeling that it was moving away from agile development.

Data architects and engineers saw modeling as a long process that forced them to gather requirements in one go for a massive project, often taking weeks or months to complete.

But this is no longer the case. With the right tools, you can break up large data warehouse projects into more manageable sizes.

Each of these models will be entirely focussed on delivering business value.

The key is to have reusable concepts and components so that the smaller models can then be stitched together to form an effective model of the entire domain, like the logistics or financial departments of Fortune 2000 companies (and eventually the entire organization).
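To make the idea of reusable concepts a little more concrete, here is a minimal Python sketch of how small conceptual models could be stitched together through shared entities. Everything in it is an illustrative assumption: the `Entity` and `ConceptualModel` classes, the `stitch` function, and the sample entities (Customer, Order, Shipment, Invoice) are hypothetical and do not represent how any particular modeling tool works internally.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A business concept defined once in a shared glossary (hypothetical)."""
    name: str
    definition: str


@dataclass
class ConceptualModel:
    """A small model covering one business area (hypothetical)."""
    name: str
    entities: dict[str, Entity] = field(default_factory=dict)
    relationships: set[tuple[str, str, str]] = field(default_factory=set)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.name] = entity

    def relate(self, source: str, verb: str, target: str) -> None:
        # Only allow relationships between entities this model contains.
        if source not in self.entities or target not in self.entities:
            raise KeyError(f"unknown entity in relationship: {source} -> {target}")
        self.relationships.add((source, verb, target))


def stitch(name: str, models: list[ConceptualModel]) -> ConceptualModel:
    """Merge small models into one; entities with the same name must share a definition."""
    combined = ConceptualModel(name)
    for model in models:
        for entity in model.entities.values():
            existing = combined.entities.get(entity.name)
            if existing is not None and existing != entity:
                raise ValueError(f"conflicting definitions for {entity.name!r}")
            combined.entities[entity.name] = entity
        combined.relationships |= model.relationships
    return combined


# Shared glossary: each concept is defined once and reused by every small model.
customer = Entity("Customer", "A party that buys goods or services")
order = Entity("Order", "A confirmed request to purchase")
shipment = Entity("Shipment", "A physical delivery of ordered goods")
invoice = Entity("Invoice", "A request for payment for an order")

sales = ConceptualModel("Sales")
sales.add_entity(customer)
sales.add_entity(order)
sales.relate("Customer", "places", "Order")

logistics = ConceptualModel("Logistics")
logistics.add_entity(order)
logistics.add_entity(shipment)
logistics.relate("Order", "is fulfilled by", "Shipment")

finance = ConceptualModel("Finance")
finance.add_entity(customer)
finance.add_entity(order)
finance.add_entity(invoice)
finance.relate("Order", "is billed through", "Invoice")
finance.relate("Customer", "pays", "Invoice")

enterprise = stitch("Enterprise", [sales, logistics, finance])
print(sorted(enterprise.entities))
```

Because Customer and Order come from the same glossary in all three models, the combined model connects sales, logistics and finance without anyone drawing those links by hand. If two teams defined the same term differently, `stitch` would raise an error, which is exactly the kind of conversation between business domains that a shared glossary is meant to trigger early.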

A tool like Ellie does this in the background, weaving together the various parts of the business without you having to make these connections.

Reusable entities and a common data glossary ensure that the platform builds a massive representative model that connects business domains within a multi-billion dollar company while you focus on smaller, more manageable projects.

Most importantly, this process will be driven by business stakeholders and follow real-world requirements.