January 8, 2025

Anatomy of a Good Data Model: Bridging Reality and Data

Hannu Järvi
Co-Founder & Chief Success Officer

Operational data will reflect its original purpose as information gathered during a business process, carrying the baggage of that purpose.

You have probably worked with software that uses, for instance:

  • Denormalized structures for quick access and aggregation, such as enabling a customer service agent to retrieve a full 360-degree customer view during a call.
  • Hypernormalized structures for high configurability, such as ERP localization for regional tax rules, languages, and compliance requirements.

While they serve their purpose in an operational context, the same structures might inhibit analytics.

A good data model must transform how the logical structure of operational data is represented to suit analytical needs, without altering the data itself—since it represents reality.

We’ll focus on structured data at this point. There are incredible amounts of unstructured data being generated, but one problem at a time.

Dealing with Real-World Data

Business processes capture granular data within a structure tailored specifically to support the application logic running those processes. While this purpose-built design works seamlessly within its intended context, it can appear unintuitive—or even nonsensical—when viewed outside that domain.

And, of course, every application is different.

So how would you represent multiple operational systems or processes as a universally applicable data model that accurately reflects reality?

In my view, the data modeling process itself should help you get there.

You should be able to:

  • Transform the records of a single table into a universal normalized structure.
  • Transform the schema of several tables—or multiple interconnected schemas—into a universal normalized structure.
  • Generalize the approach to other types of data objects (e.g. JSON).
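
As a rough illustration of the three capabilities above, here is a minimal Python sketch. It assumes, purely for the example, that the "universal normalized structure" is a flat list of (entity, key, attribute, value) facts; the article does not prescribe a target shape, and the names `Fact`, `flatten`, `to_facts`, and the sample columns are all hypothetical. Values are copied unchanged, so only the structure is transformed.

```python
from typing import Any, Iterator

# Hypothetical target shape: one (entity, key, attribute, value) row per fact.
Fact = tuple[str, str, str, Any]


def flatten(obj: Any, prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (path, scalar) pairs from nested dicts and lists."""
    if isinstance(obj, dict):
        for name, child in obj.items():
            yield from flatten(child, f"{prefix}.{name}" if prefix else name)
    elif isinstance(obj, list):
        for index, child in enumerate(obj):
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        yield prefix, obj


def to_facts(entity: str, key_field: str, record: dict[str, Any]) -> list[Fact]:
    """Turn one table row or JSON document into normalized facts.

    The values are not altered; only their structure changes.
    """
    key = str(record[key_field])
    return [
        (entity, key, path, value)
        for path, value in flatten(record)
        if path != key_field
    ]


# A denormalized table row (column names are made up for the example).
row = {"customer_id": 42, "name": "Ada", "city": "Turku"}

# A JSON document describing the same kind of thing, with nesting.
doc = {"customer_id": 43, "name": "Bo", "address": {"city": "Oulu"}, "tags": ["vip"]}

facts = to_facts("customer", "customer_id", row) + to_facts("customer", "customer_id", doc)

for fact in facts:
    print(fact)
```

The same function handles the flat row and the nested document, which is the point of the third bullet. Note that the row's `city` and the document's `address.city` still end up as different attributes: giving them a shared meaning is a separate step.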

Most importantly, you need to be able to give this data a real-world meaning to make it universally applicable.

It’s not simple. It’s not about attaching a description to each column. It’s not about cataloging tables, columns, and other data assets.

It’s about identifying the business intent of specific parts of the data to make the data interoperable.

Your business domains should be able to look at online customer data and in-store customer data with some common frame of reference to analyze profitability.
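
To make the idea of a common frame of reference concrete, here is a small Python sketch. Everything in it is an assumption made for illustration: the `Sale` class, the source column names, and the intent mappings are invented, and real sources would need far more than three fields. The point is only that each source declares which of its columns carries which business meaning, after which both channels can be analyzed together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """Hypothetical common frame of reference for a customer purchase."""
    customer: str
    channel: str
    revenue: float
    cost: float

    @property
    def profit(self) -> float:
        return self.revenue - self.cost


# Each source keeps its own column names. The mapping records the business
# intent of each column (all names here are made up for the example).
ONLINE_INTENT = {"customer": "user_email", "revenue": "order_total", "cost": "fulfilment_cost"}
STORE_INTENT = {"customer": "loyalty_card", "revenue": "receipt_amount", "cost": "cogs"}


def to_sale(record: dict, intent: dict[str, str], channel: str) -> Sale:
    """Read a source record through its intent mapping."""
    return Sale(
        customer=str(record[intent["customer"]]),
        channel=channel,
        revenue=float(record[intent["revenue"]]),
        cost=float(record[intent["cost"]]),
    )


online = [{"user_email": "ada@example.com", "order_total": 120.0, "fulfilment_cost": 90.0}]
store = [{"loyalty_card": "L-1001", "receipt_amount": 80.0, "cogs": 50.0}]

sales = [to_sale(r, ONLINE_INTENT, "online") for r in online]
sales += [to_sale(r, STORE_INTENT, "in-store") for r in store]

# With a shared meaning in place, profitability is one loop over both channels.
profit_by_channel: dict[str, float] = {}
for sale in sales:
    profit_by_channel[sale.channel] = profit_by_channel.get(sale.channel, 0.0) + sale.profit

print(profit_by_channel)
```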

You also need to be able to go beyond the basic layer of unit economics.

An Interoperable Data Modeling Process (for Enterprises)

Smaller businesses can get away with data models that aren’t intrinsically interoperable. It’s not advisable, but sometimes speed wins over everything else.

Enterprise data models cannot afford that trade-off.

Here are some things I keep in mind when planning enterprise data products and the models that support them.

  • A large data model must be broken down into parts, requiring a partitioning strategy that aligns with the business goal.
  • The partitioning is done step-by-step, breaking down scope through prioritization and delegation.
  • In each submodel, the scope is managed by prioritizing which “things” and “events” are important from the perspective of the overall business goal.
  • Large-scale data modeling efforts must bring together professionals from diverse backgrounds. So you need simple, shared practices to integrate their contributions.
  • Enforce these shared practices through templating and guided workflows.
  • A model composed of partitions must, as a whole, still have the characteristics of a good data model.
  • The practice of model(ing) governance is still immature. Building a good model from the contributions of individuals and teams requires experienced people.
  • You must be able to review a large data model. You need to be able to review the enterprise design as well as the details, and have the two aligned. Changes at the design level must result in updates to details, while updates to details must fit the overall design.
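
The last point, keeping the enterprise design and the details aligned, can be partly automated. The Python sketch below is one possible shape for such a check, not an established practice. It assumes the design is a mapping from partition to the "things" and "events" it owns, and that each detailed entity is tagged with its partition. The partition names, entity names, and the `review` function are all hypothetical.

```python
# Design level: partitions and the "things" and "events" each one owns.
design = {
    "sales": {"Customer", "Order"},
    "logistics": {"Shipment"},
}

# Detail level: entities actually modeled, tagged with their partition.
details = [
    {"entity": "Customer", "partition": "sales"},
    {"entity": "Order", "partition": "sales"},
    {"entity": "Invoice", "partition": "sales"},  # not part of the design
]


def review(design: dict[str, set[str]], details: list[dict[str, str]]) -> list[str]:
    """Return findings where the design and the details disagree."""
    modeled = {(d["partition"], d["entity"]) for d in details}
    planned = {(p, e) for p, entities in design.items() for e in entities}

    findings = []
    for partition, entity in sorted(modeled - planned):
        findings.append(f"{partition}.{entity}: modeled in detail but missing from the design")
    for partition, entity in sorted(planned - modeled):
        findings.append(f"{partition}.{entity}: in the design but not yet modeled in detail")
    return findings


findings = review(design, details)
for finding in findings:
    print(finding)
```

A check like this only catches structural drift in both directions. Whether a detail actually fits the intent of the design still needs review by experienced people.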