Operational data will reflect its original purpose as information gathered during a business process, carrying the baggage of that purpose.
You should be familiar with, for instance, software that uses:
While they serve their purpose in an operational context, the same structures might inhibit analytics.
A good data model must transform how the logical structure of operational data is represented to suit analytical needs, without altering the data itself—since it represents reality.
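As a minimal sketch of that idea, the snippet below re-labels a couple of operational rows into an analytical shape while passing every value through untouched. The table, column names, and mapping are all hypothetical, chosen only to illustrate the point.

```python
# Hypothetical operational rows, shaped for the application that wrote them.
operational_rows = [
    {"ord_id": 1001, "cust_ref": "C-17", "sts_cd": "S", "amt": "59.90"},
    {"ord_id": 1002, "cust_ref": "C-17", "sts_cd": "X", "amt": "15.00"},
]

# Assumed mapping from operational column names to analytical names.
COLUMN_MAP = {
    "ord_id": "order_id",
    "cust_ref": "customer_id",
    "sts_cd": "order_status_code",
    "amt": "order_amount",
}


def to_analytical(row):
    """Re-label the columns of one row; the values pass through unchanged."""
    return {COLUMN_MAP[name]: value for name, value in row.items()}


analytical_rows = [to_analytical(row) for row in operational_rows]
```

Only the representation changes here. A real model would also regroup and relate records, but the recorded values would stay as the business process captured them.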
We’ll focus on structured data at this point. There are incredible amounts of unstructured data being generated, but one problem at a time.
Business processes capture granular data within a structure tailored specifically to support the application logic running those processes. While this purpose-built design works seamlessly within its intended context, it can appear unintuitive—or even nonsensical—when viewed outside that domain.
And, of course, every application is different.
So how would you represent multiple operational systems or processes as a universally applicable data model that accurately reflects reality?
As I see it, your data modeling process is what should help you get there.
You should be able to:
Most importantly, you need to be able to give this data a real-world meaning to make it universally applicable.
It’s not simple. It’s not about describing what each column holds, and it’s not about cataloging tables, columns, and other data assets.
It’s about identifying the business intent of specific parts of the data to make the data interoperable.
Your business domains should be able to look at online customer data and in-store customer data with some common frame of reference to analyze profitability.
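One way to picture a common frame of reference is a shared record type that both channels map into. The sketch below is only illustrative: the `Sale` type, the source field names, and the choice of a normalized email as the customer key are all assumptions, not a description of any particular system.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Sale:
    """Shared meaning: one sale to one customer through one channel."""
    customer_key: str
    channel: str
    revenue: Decimal
    cost: Decimal

    @property
    def profit(self) -> Decimal:
        return self.revenue - self.cost


def from_online(row: dict) -> Sale:
    # Hypothetical web-shop record: amounts in cents, customer identified by email.
    return Sale(
        customer_key=row["email"].strip().lower(),
        channel="online",
        revenue=Decimal(row["total_cents"]) / 100,
        cost=Decimal(row["cogs_cents"]) / 100,
    )


def from_in_store(row: dict) -> Sale:
    # Hypothetical point-of-sale record: decimal strings, email taken from a loyalty card.
    return Sale(
        customer_key=row["loyalty_email"].strip().lower(),
        channel="in_store",
        revenue=Decimal(row["sale_amt"]),
        cost=Decimal(row["cost_amt"]),
    )


def profit_by_customer(sales):
    """Sum profit per customer across every channel."""
    totals = {}
    for sale in sales:
        totals[sale.customer_key] = totals.get(sale.customer_key, Decimal("0")) + sale.profit
    return totals


online_rows = [{"email": "Ana@Example.com ", "total_cents": 12000, "cogs_cents": 7500}]
store_rows = [{"loyalty_email": "ana@example.com", "sale_amt": "40.00", "cost_amt": "22.50"}]

sales = [from_online(r) for r in online_rows] + [from_in_store(r) for r in store_rows]
profits = profit_by_customer(sales)
```

The two source systems keep their own shapes. What makes them comparable is the agreed meaning of "customer", "revenue", and "cost" that each mapping function encodes.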
You also need to be able to go beyond the basic layer of unit economics.
Smaller businesses can get away with data models that aren’t intrinsically interoperable. It’s not advisable, but sometimes speed wins out over everything else.
Enterprises cannot afford to make the same trade-off.
Here are some things I keep in mind when planning enterprise data products and the models that support them.