What is a data domain?

Determining your data domains is an important part of your data strategy. So what is a data domain? 

It can actually mean a couple of things, depending on whether we look at it from the point of view of data management and database management, or from the point of view of data governance. Think of it as looking at it from the technical side or the business side. You might say, "George, why do we care about both? Let's just focus on the data governance side." Well, I think that you need to be aware of both.

Even if you're in data governance, you need to understand the technical side. Otherwise, when you talk to technical data stewards, data custodians, and IT, they might use the term differently than you do. Even when you're talking to vendors, I think it's good to understand both views of the term. Yes, I know, it's frustrating for the same term to have different meanings, but when you work in data governance it's something you'll get used to, as clarifying these differences is one of the things data governance tries to do.

Data domain (database management)

From a database management point of view, or better yet, a data modeling point of view, a data domain represents the collection of values that a data element may contain. A better way to understand this is through an example. Imagine an online form that we fill in, with a drop-down field. Let's say that field is gender.

When we click on that drop-down we might be getting some options, such as the following:

  • Male
  • Female
  • Non-binary
  • Not specified

Of course there could be other options, depending on your definition for gender. That's not the point. The idea is that we would have these fixed options. When we record this in a table of a database, the value assigned to gender can only be one of these 4 values. So we say that the data domain for the gender column is "male", "female", "non-binary", or "not specified".

GENDER_TABLE

MALE

FEMALE

NON-BINARY

NOT SPECIFIED
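
To make this concrete, here is a minimal sketch of how such a data domain could be enforced in a database, using Python's built-in sqlite3 module and a CHECK constraint. The person table, its columns, and the helper names are my own assumptions for illustration; your database and modeling tool may handle domains differently (for example, with a lookup table and a foreign key).

```python
import sqlite3

# The data domain: the complete set of values the gender column may hold.
GENDER_DOMAIN = ("male", "female", "non-binary", "not specified")


def create_person_table(conn):
    """Create a hypothetical 'person' table whose gender column is limited to GENDER_DOMAIN."""
    allowed = ", ".join(f"'{value}'" for value in GENDER_DOMAIN)
    conn.execute(
        "CREATE TABLE person ("
        "name TEXT NOT NULL, "
        f"gender TEXT NOT NULL CHECK (gender IN ({allowed})))"
    )


def try_insert(conn, name, gender):
    """Return True if the row is accepted, False if gender is outside the domain."""
    try:
        conn.execute(
            "INSERT INTO person (name, gender) VALUES (?, ?)", (name, gender)
        )
        return True
    except sqlite3.IntegrityError:
        # SQLite rejects the row because the CHECK constraint failed.
        return False


conn = sqlite3.connect(":memory:")
create_person_table(conn)
```

With this in place, try_insert(conn, "Ana", "female") is accepted, while a value that is not part of the domain, such as "unknown", is rejected by the database itself.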

Data domain (data governance)

From a data governance perspective, data domain means something else. Here, a data domain is "a logical grouping of items of interest to the organization, or areas of interest within the organization".

You can think of data domains as high-level categories of data for the purpose of assigning accountability and responsibility for the data. By the way, a data domain is also called a "subject area" or a "data concept", so you might encounter any of these terms. Within data governance, they all refer to the same thing.

Note that some people use "data domain" to mean the same thing as a data set. That's not accurate, as a data domain can contain multiple data sets, as long as those data sets represent the same area of interest within the organization.
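
The difference between a data domain and a data set can be sketched as a simple mapping. The data set names below are made up for illustration and don't come from any real catalog; the point is only that one domain groups several data sets.

```python
# Hypothetical catalog: each data domain groups several data sets
# that represent the same area of interest.
DOMAIN_DATASETS = {
    "Customer": ["crm_contacts", "loyalty_members", "support_tickets"],
    "Product": ["product_catalog", "price_list"],
}


def domain_of(dataset):
    """Return the data domain a data set belongs to, or None if it is not cataloged."""
    for domain, datasets in DOMAIN_DATASETS.items():
        if dataset in datasets:
            return domain
    return None
```

Here domain_of("price_list") returns "Product", while a data set that was never assigned to a domain returns None.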

If this is still clear as mud, let's look at some examples. 

Data domain examples

  • Customer
  • Product (or Service)
  • Location
  • Vendor (or Supplier)
  • Transaction (or Order, or Sale)
  • Legal

An average organization would have anywhere between 5 and 10 data domains, and they aren't always these, though these are usually the most common ones. In the end, it really depends on the industry that you're part of.

Let's look at some industry specific data domains. 

In the education sector, you might have:

  • Student
  • Research
  • Faculty
  • Alumni
  • Advancement

In the healthcare sector, you might have:

  • Patient
  • Facility
  • Medical procedure

In the insurance sector, you might encounter:

  • Provider
  • Member

In any of these sectors you could also have some of the previous data domains. For example, I'm sure that all 3 sectors would have "Location", "Transaction", and "Legal" as data domains.

Data sub-domain

There's also the concept of a data sub-domain. Typically, each data domain will have between 3 and 10 data sub-domains.

What is a sub-domain? It's simply a way to divide that data domain even further into other categories.

There are some considerations, though:

  • The sub-domain is unique
  • There's a 1 to 1 relationship between a data sub-domain and its data domain, meaning each sub-domain belongs to exactly one data domain
  • It inherits the characteristics of its data domain
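
The considerations above can be sketched as a small registry. This is only an illustration under my own assumptions: the DomainRegistry class, its method names, and the "owner" characteristic are hypothetical and not taken from any governance tool.

```python
class DomainRegistry:
    """Hypothetical registry that enforces the sub-domain considerations."""

    def __init__(self):
        self.domains = {}    # data domain name -> its characteristics
        self.parent_of = {}  # sub-domain name -> the single domain it belongs to

    def add_domain(self, name, **characteristics):
        """Register a data domain together with its characteristics."""
        self.domains[name] = characteristics

    def add_sub_domain(self, domain, sub_domain):
        """Attach a sub-domain to exactly one existing data domain."""
        if domain not in self.domains:
            raise KeyError(f"unknown data domain: {domain}")
        if sub_domain in self.parent_of:
            # Unique name and a single parent: reject a second registration.
            raise ValueError(
                f"{sub_domain} already belongs to {self.parent_of[sub_domain]}"
            )
        self.parent_of[sub_domain] = domain

    def characteristics(self, sub_domain):
        """A sub-domain inherits the characteristics of its parent domain."""
        return self.domains[self.parent_of[sub_domain]]


registry = DomainRegistry()
registry.add_domain("Customer", owner="Head of Sales")  # made-up owner
registry.add_sub_domain("Customer", "Individual")
```

In this sketch, registry.characteristics("Individual") returns the characteristics recorded for "Customer", and trying to register "Individual" under a second domain raises a ValueError.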

Data sub-domain examples

Let me provide you with some sub-domain examples for some of the data domains mentioned above.

Customer

  • Individual
  • Corporation
  • Government
  • Charity
  • Group
  • Household

Vendor

  • Vendor specification
  • Pricing
  • Service level agreement

Location

  • Site
  • Geographical area
  • Building
  • Office
  • Warehouse
  • Outdoor space

Conclusion

What you should remember is that these data domains and sub-domains are a way of grouping the most important data of an organization, and they cut across business units and systems. For the same domain, you might have different stakeholders from different lines of business and departments, and the data can be stored in, produced by, or consumed by different systems.

That being said, the reality can be a bit more complicated. When data doesn't slot perfectly into one subject area or another, it can be associated with more than one domain. This is not a recommended approach, but it is sometimes unavoidable.
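
If you keep a list of which data belongs to which domain, spotting these exceptions is straightforward. Here is a minimal sketch, assuming the assignments are stored as (data set, domain) pairs; the data set names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical assignments of data sets to data domains.
ASSIGNMENTS = [
    ("crm_contacts", "Customer"),
    ("product_catalog", "Product"),
    ("web_analytics", "Customer"),
    ("web_analytics", "Product"),
]


def multi_domain_datasets(assignments):
    """Return the data sets associated with more than one data domain."""
    domains_by_dataset = defaultdict(set)
    for dataset, domain in assignments:
        domains_by_dataset[dataset].add(domain)
    return {
        dataset: sorted(domains)
        for dataset, domains in domains_by_dataset.items()
        if len(domains) > 1
    }
```

Running multi_domain_datasets(ASSIGNMENTS) flags only "web_analytics", which gives you a short list of cases to review with the stakeholders involved.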


  • Would you provide an example where there is not a 1 to 1 relationship between the data domain and data sub-domain?

    • There should always be a 1 to 1 relationship, but for example, you could have a “Marketing” sub-domain which would encompass marketing-related data (e.g. web analytics, different types of conversion rates and reach, marketing segments, and even various analytical scores). This could be a sub-domain of Customer or Product. It could go under either.



    About the author 

    George Firican

    George Firican is the Director of Data Governance and Business Intelligence at the University of British Columbia, which is ranked among the top 20 public universities in the world. His passion for data led him towards award-winning program implementations in the data governance, data quality, and business intelligence fields. Due to his desire for continuous improvement and knowledge sharing, he founded LightsOnData, a website which offers free templates, definitions, best practices, articles and other useful resources to help with data governance and data management questions and challenges. He also has over twelve years of project management and business/technical analysis experience in the higher education, fundraising, software and web development, and e-commerce industries.
