How to Calculate the Value of Your Data

Let's get to the point where effort creates agreed value...
I'm always very encouraged when professionals from different architectural disciplines can converge on common ground. This can be a rare event, so when it does happen, I like to call it out. It happened recently with Robert DuWors, a contact from the Business Architecture discipline, when we were both trying to put some metrics around the measurement of data value in our respective areas of expertise.

I believe that Business Architecture and Information Architecture are the two core pillars of the architecture of an enterprise. But practitioners of these interconnected disciplines can frequently rub each other up the wrong way, each side devaluing the other's methods and approaches. So, to reach agreement across the two on what constitutes the enterprise value of our efforts is a happy place to be.

What came out of those discussions was this equation:

DataValue (Consensus ((scope)(impact)(appetite)(intersection)), Epistemology, Content(Semantics,(who)(when)(why)(where)(what)(how)))

...and together, we agreed that this represents the value of a specific item of data to the enterprise from both Information and Business perspectives. Now, of course, this may be refined over time, but it already contains most of the aspects that Robert and I believe are key to this metric...

So, what does this equation give us? It's in three major sections, which I will call Horizons...

The numeric values these generate (you can choose your own scale, as long as it is applied consistently and without bias towards any particular horizon) can then be used as a decision point: values under an agreed limit are deemed unable to return sufficient value to the enterprise relative to the specific effort required to achieve that value.

A subjective judgement call will need to be made on a case-by-case basis...but it means that general low-cost effort can be applied to the majority and focus placed on the big-ticket items, regardless of what makes them big-ticket items...priorities can be set, but the idea is you get everywhere eventually.

Part 1: Consensus Horizon : Consensus ((scope)(impact)(appetite)(intersection))

Robert DuWors originally stated this as a function FrameOfReference(). In this subsequent version, we agreed to expand it to enable the concept of consensus...or a frame of reference that can be agreed by all interested parties across the entire enterprise. This could also include external parties key to the enterprise.

The four elements are:

  • Scope: whether the consensus is global, industry, tribal or localized; the wider the scope, the bigger the value
  • Impact: the comparative level of business benefit that this data enables; how important the data is to the enterprise
  • Appetite: the comparative level of business risk, appetite and obligation compliance in play
  • Intersection: the comparative scale of reusability and coverage

The consensus horizon is critical in establishing where value can be recognized as worth the effort of increased data or business management attention. The product of the four consensus aspects creates the consensus horizon metric. We use the product here because all four elements are equally critical: a low score on any one of them pulls down the whole metric.
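As a minimal sketch of the product described above, the snippet below multiplies the four aspect scores together. The 0 to 5 scale, the function name `consensus_score` and the example scores are illustrative assumptions; the equation itself leaves the choice of scale to you.

```python
from math import prod

# Assumed scale for illustration only: each aspect is scored 0 (none) to 5 (highest).
MAX_SCORE = 5


def consensus_score(scope: int, impact: int, appetite: int, intersection: int) -> int:
    """Return the product of the four consensus aspects."""
    aspects = (scope, impact, appetite, intersection)
    for value in aspects:
        if not 0 <= value <= MAX_SCORE:
            raise ValueError(f"score {value} is outside 0..{MAX_SCORE}")
    return prod(aspects)


# Made-up scores for two hypothetical data items.
strong = consensus_score(scope=4, impact=5, appetite=3, intersection=4)  # 240
weak = consensus_score(scope=4, impact=5, appetite=0, intersection=4)    # 0
```

Because the aspects are multiplied, the second item scores zero even though three of its four aspects are high, which is the behaviour a product is chosen for.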

Part 2: Business Communication Horizon : Epistemology

Both Robert DuWors and I agreed that business language in all its natural forms is the carrier for all business communications and therefore a metric should be included that represents the level of understanding of an item of data from this perspective.

We used the term Epistemology because this includes all aspects of natural language; you will see that Semantics is covered in the third horizon because it is more syntactical and closer to context. To score high on this horizon, you must have a complete understanding of terminology...synonyms, translations, antonyms, variations, acronyms etc.

Part 3: Content Horizon : Content(Semantics,(who)(when)(why)(where)(what)(how))

This was expanded from a simplified version, Content (Semantics, Context), into this fuller form. When we say "Semantics" here, we mean the degree to which we can provide, or have provided, an unambiguous name and meaning for an item of data, regardless of any consensus required to do so...that is covered in the Consensus Horizon...or the business language used...the Business Communication Horizon.

This level is then added to the product of context to derive a content score...addition here because even if its context definition is poor, the Content Horizon still has some intrinsic value in terms of its nomenclature and meaning.

Context is defined here as the product of the levels of understanding of the six basic contextual questions, namely Who, Where, What, When, Why and How. Being a product, a low understanding in any of the six areas significantly reduces the ability to demonstrate full knowledge of context and therefore the full contextual value.

A context score of zero is therefore highly likely if data management is poorly executed, or executed with a narrow or partial focus. This, we thought, was essential to ensure full data management discipline is applied to this process.

Context therefore represents a score that signifies the level and scope of classification and knowledge management applied, and together with Semantics provides a level of knowledge about the data value.
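The Content Horizon arithmetic can be sketched as follows: context is the product of the six question scores, and the semantics score is added to it. The function names, the integer scale and the example scores are assumptions made for illustration, not part of the agreed equation.

```python
from math import prod

# The six contextual questions named in the equation.
CONTEXT_QUESTIONS = ("who", "when", "why", "where", "what", "how")


def context_score(levels: dict[str, int]) -> int:
    """Product of the understanding levels for the six contextual questions."""
    missing = [q for q in CONTEXT_QUESTIONS if q not in levels]
    if missing:
        raise KeyError(f"missing context levels: {missing}")
    return prod(levels[q] for q in CONTEXT_QUESTIONS)


def content_score(semantics: int, levels: dict[str, int]) -> int:
    """Semantics is added to the context product, so it survives a zero context."""
    return semantics + context_score(levels)


# Made-up scores for a hypothetical data item.
well_understood = {"who": 3, "when": 3, "why": 2, "where": 3, "what": 3, "how": 2}
no_why = {**well_understood, "why": 0}  # nobody has recorded why the data exists

full = content_score(4, well_understood)  # 4 + 324 = 328
partial = content_score(4, no_why)        # 4 + 0 = 4
```

A single unanswered question drives the context product to zero, yet the item keeps its semantics score of 4, matching the reasoning for using addition at this level.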

Using this value equation...

So, to make use of this equation, you need to capture a score for each of its elements for every item of data as you model or classify your data estate. Set acceptable limits (absolute or tiered) and the order of precedence or rules of priority between the horizons, and plug this into your architectural governance process...then prioritize your work from the highest scorers downwards until you have covered everything.
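One possible way to wire the scores into a governance step is sketched below. Everything specific here is an assumption for illustration: the `DataItemScore` record, the limit values, the item names, and the choice to rank by comparing horizons in a fixed order of precedence. The equation does not prescribe how the three horizons are combined, so treat this as one option among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItemScore:
    """Hypothetical record holding one score per horizon for a data item."""
    name: str
    consensus: int
    epistemology: int
    content: int


# Assumed absolute limits per horizon; set your own to match your scale.
LIMITS = {"consensus": 50, "epistemology": 2, "content": 10}

# Assumed order of precedence between the horizons.
PRECEDENCE = ("consensus", "content", "epistemology")


def meets_limits(item: DataItemScore, limits: dict[str, int] = LIMITS) -> bool:
    """True if the item reaches the agreed limit on every horizon."""
    return all(getattr(item, horizon) >= limit for horizon, limit in limits.items())


def prioritize(items: list[DataItemScore]) -> list[DataItemScore]:
    """Rank items from highest to lowest, comparing horizons in precedence order."""
    return sorted(
        items,
        key=lambda item: tuple(getattr(item, h) for h in PRECEDENCE),
        reverse=True,
    )


# Made-up items and scores.
estate = [
    DataItemScore("legacy_flag", consensus=12, epistemology=1, content=4),
    DataItemScore("order_total", consensus=240, epistemology=3, content=120),
    DataItemScore("customer_id", consensus=240, epistemology=4, content=328),
]

ranked = prioritize(estate)
focus = [item for item in ranked if meets_limits(item)]            # big-ticket items
low_effort = [item for item in ranked if not meets_limits(item)]   # general low-cost effort
```

Items below a limit are not discarded; they land in `low_effort` so that everything is still covered eventually.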

This should be a continuous process, baselined and kept up to date through every business and digital transformation.


About the author 

Robert Vane

Robert Vane is the co-founder of the Q6FSA Method for Global Information Management, a freelance full enterprise scope data architect with over 25 years' experience of getting it all wrong, now dedicated to solving the foundational root causes of failure within the information management space and getting it all right.
