The 4 main roles of metadata

I love metadata because of the benefits it offers. In fact, I never “metadata” (“met a data”) I did not find useful. Geeky joke, I know. If you don’t have a metadata management program or initiative in place, there are plenty of reasons to consider investing in one. The following 4 main roles of metadata should give you a glimpse of the benefits of managing it:

1. Classification

Data has many characteristics by which it can be grouped or classified. Why should you care? Because these categories allow you to organize and manage it. Data can be classified by any of the following:

Subject – Ex: financial data, student data, fundraising data, health data, product data, etc.
Usage – Ex: transactional, analytical, regulatory, etc.
Time – Ex: live and current data, historical, predictive, etc.
Content – Ex: geo-spatial data, machine data, structured vs unstructured data, etc.
Scope – Ex: enterprise, external, departmental, master data, etc.

Managing data by classification or group allows you to apply the same standards, procedures, and processes to each group, and to assign the same data stewards and owners. The same data can fall under multiple groups, however, which adds another layer of complexity to how it should be managed. For example, the same metadata can indicate that a data set is transactional, falls under GDPR, is enterprise-wide, and is live, unstructured health data. Usually these groups can be placed within a hierarchy to determine which classification should take precedence over the others.
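To make the precedence idea concrete, here is a minimal Python sketch. The `DataAsset` class, the labels, and the ordering in `PRECEDENCE` are all assumptions made for illustration; your own hierarchy will depend on your governance rules.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical precedence order: earlier entries win when groups overlap.
# The ordering is an assumption, not something the article prescribes.
PRECEDENCE = ["regulatory", "health", "enterprise", "transactional", "live", "unstructured"]


@dataclass
class DataAsset:
    name: str
    classifications: set = field(default_factory=set)

    def governing_classification(self) -> Optional[str]:
        """Return the highest-precedence classification attached to this asset."""
        for label in PRECEDENCE:
            if label in self.classifications:
                return label
        return None  # none of the asset's labels appear in the hierarchy


# The same data set falls under several groups at once.
patient_notes = DataAsset(
    name="patient_notes",
    classifications={"transactional", "regulatory", "enterprise", "live", "unstructured", "health"},
)
print(patient_notes.governing_classification())  # prints "regulatory"
```

A flat ordered list is the simplest possible hierarchy. A real program might need a tree, or different precedence rules per subject area.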

If you want to know about the 3 classification groups to help with GDPR, please read our other article, too.

2. Description

Describing the data helps you understand both its logical and physical aspects. A data description should include:

Data meaning – Business definitions, data modeling entities and attributes
Data structure – Description of data objects (entities, tables, records, etc.), their logical groupings and relationships
Data content – The types of data such as date, currency, text, number, etc.
Data values – What values are allowed, what reference data is available, what patterns or value ranges should it follow, what constraints should it meet, etc.
Data lineage – What is the data source, how was the data created, derived, and/or calculated, how was it transformed, etc.

Without this description you will be treading water when collecting data, integrating it with internal or external systems, maintaining it, or deriving useful information from it.
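As a rough illustration of how descriptive metadata can be put to work, the sketch below records the meaning, type, allowed values, and source of one field and uses them to check incoming values. `FieldDescription`, the `currency_code` field, and its allowed values are hypothetical examples, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FieldDescription:
    name: str
    meaning: str                                  # data meaning: business definition
    data_type: type                               # data content: a Python type standing in for text, number, etc.
    allowed_values: Optional[frozenset] = None    # data values: reference data, if any
    source: str = "unknown"                       # data lineage: where the value comes from

    def violations(self, value: Any) -> list:
        """Return a list of human-readable problems found in a value."""
        problems = []
        if not isinstance(value, self.data_type):
            problems.append(
                f"{self.name}: expected {self.data_type.__name__}, got {type(value).__name__}"
            )
        elif self.allowed_values is not None and value not in self.allowed_values:
            problems.append(f"{self.name}: {value!r} is not an allowed value")
        return problems


# Hypothetical field description used only for this example.
currency_code = FieldDescription(
    name="currency_code",
    meaning="Currency in which the transaction was made",
    data_type=str,
    allowed_values=frozenset({"CAD", "USD", "EUR"}),
    source="payments system (hypothetical)",
)

print(currency_code.violations("CAD"))  # prints []
print(currency_code.violations("XYZ"))
print(currency_code.violations(42))
```

The point is that the description is data in its own right: once it is recorded, the same definition can drive documentation, integration checks, and data quality rules.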

3. Guidance

Metadata can serve as a guide that helps any technical or business user find the data they need, through search engines or other processes. This guidance metadata can include:

Keywords – This could be any metadata described so far
Taxonomies – Yet another example of how classification helps
Date/time stamps – Usually automatically added at the table or row level
Associated reports, processes, people – Knowing where data surfaces, who the data users or data stewards are, how data is captured and transformed could serve as a good starting point for finding what you need
Synonyms, aliases, related terms

Providing your stakeholders with the guidance to find the data they need for reporting, analyzing, testing, prototyping, troubleshooting, etc., saves time and makes better use of available resources.
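Here is a small sketch of how keywords and synonyms might guide a search. The catalog entries, the synonym map, and the `find_datasets` function are invented for illustration; a real data catalog would offer far richer search.

```python
# Hypothetical catalog: data set name -> keywords attached to it.
CATALOG = {
    "donor_gifts": {"fundraising", "donation", "gift"},
    "student_enrolment": {"student", "enrolment", "registration"},
}

# Hypothetical synonyms and aliases, mapped onto the catalog's keywords.
SYNONYMS = {
    "enrollment": "enrolment",
    "contribution": "donation",
    "pupil": "student",
}


def find_datasets(term: str) -> list:
    """Return the names of data sets whose keywords match the term or its synonym."""
    normalized = term.strip().lower()
    keyword = SYNONYMS.get(normalized, normalized)
    return sorted(name for name, keywords in CATALOG.items() if keyword in keywords)


print(find_datasets("Enrollment"))    # prints ['student_enrolment']
print(find_datasets("contribution"))  # prints ['donor_gifts']
print(find_datasets("payroll"))       # prints []
```

Without the synonym map, a user searching for "enrollment" would find nothing, even though the data exists under a different spelling.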

4. Control

Metadata can provide the knowledge needed to determine which data should be controlled and which controls should be enforced on it. It helps enforce constraints that stem from:

Regulatory compliance & internal policies
Retention & archival
Privacy & security
Service levels & business requirements
Technical requirements

These controls help ensure compliance with internal and external rules and regulations, policies, and business and technical requirements.
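Retention is one control that is easy to sketch. In the Python example below, the retention periods in `RETENTION_DAYS` and the `is_past_retention` function are assumptions for illustration only; actual periods come from your regulations and internal policies.

```python
from datetime import date, timedelta

# Hypothetical retention periods per classification, in days.
RETENTION_DAYS = {
    "financial": 7 * 365,
    "marketing": 2 * 365,
}


def is_past_retention(classification: str, created_on: date, today: date) -> bool:
    """Return True when a record has outlived the retention period for its classification."""
    days = RETENTION_DAYS.get(classification)
    if days is None:
        return False  # no rule recorded, so do not flag the record automatically
    return today - created_on > timedelta(days=days)


print(is_past_retention("marketing", date(2020, 1, 1), date(2023, 1, 1)))  # prints True
print(is_past_retention("financial", date(2020, 1, 1), date(2023, 1, 1)))  # prints False
```

Notice that the control depends on classification metadata and a creation date stamp, both of which appeared under earlier roles.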

Conclusion

These metadata roles are not mutually exclusive. From the examples above you may have already noticed how a taxonomy helps with classification as well as with guidance and control. A single metadata item can serve multiple roles and it is this fact that increases its value.

Given all the things you can do with metadata, the one thing that confuses many practitioners and confounds many data management initiatives is captured in the last sentence of George Firican’s article:

“A single metadata item can serve multiple roles and it is this fact that increases its value.”

Take out the word ‘metadata’ and that statement is true of any data item. The problem, of course, is maintaining the integrity of the values when they are used – and reused – in many different places. The answer is to register each individual value *once* in a *context-independent* catalog, assigning it an identity and a semantic class and declaring its equivalent forms. For example, “John O’Gorman” has an ID of 2a8f33b4147cc7900; a semantic class of Person; and declared equivalents of “O’Gorman, John” and “John D. O Gorman”.

Then, using the article’s four main roles as a guide, I can use that identifier as metadata anywhere.
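A minimal sketch of such a register-once catalog might look like the following. The `RegisteredValue` and `ValueRegistry` classes are hypothetical, lookups are exact-match only, and the code uses straight apostrophes; the ID, semantic class, and equivalent forms are the ones from the example above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RegisteredValue:
    identifier: str
    semantic_class: str
    preferred_form: str
    equivalents: set = field(default_factory=set)


class ValueRegistry:
    """Hypothetical context-independent catalog: each value is registered once."""

    def __init__(self) -> None:
        self._by_form = {}  # any known form -> RegisteredValue

    def register(self, value: RegisteredValue) -> None:
        for form in {value.preferred_form, *value.equivalents}:
            existing = self._by_form.get(form)
            if existing is not None and existing.identifier != value.identifier:
                raise ValueError(f"{form!r} is already registered as {existing.identifier}")
            self._by_form[form] = value

    def resolve(self, form: str) -> Optional[str]:
        """Return the identifier for any declared form, or None if it is unknown."""
        entry = self._by_form.get(form)
        return entry.identifier if entry else None


registry = ValueRegistry()
registry.register(
    RegisteredValue(
        identifier="2a8f33b4147cc7900",
        semantic_class="Person",
        preferred_form="John O'Gorman",
        equivalents={"O'Gorman, John", "John D. O Gorman"},
    )
)

print(registry.resolve("O'Gorman, John"))  # prints 2a8f33b4147cc7900
print(registry.resolve("J. O'Gorman"))     # prints None (form was never declared)
```

Every system that stores the identifier rather than one of the spellings refers to the same registered value, which is what keeps reuse from eroding integrity.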

About the author

George Firican

George Firican is the Director of Data Governance and Business Intelligence at the University of British Columbia, which is ranked among the top 20 public universities in the world. His passion for data led him towards award-winning program implementations in the data governance, data quality, and business intelligence fields. Due to his desire for continuous improvement and knowledge sharing, he founded LightsOnData, a website which offers free templates, definitions, best practices, articles and other useful resources to help with data governance and data management questions and challenges. He also has over twelve years of project management and business/technical analysis experience in the higher education, fundraising, software and web development, and e-commerce industries.
