Data quality and data governance

We often hear that data quality and data governance belong together: that you cannot have good data quality without data governance, and that doing data governance is how we achieve data quality. How so? What does that actually mean?
What is the relationship between data governance and data quality? Are they the same thing? Let's find out.

Data quality management

Let's look at data quality first. Data quality, or data quality management to be more exact, focuses on ensuring that data adheres to our data quality dimensions. In other words, that data is:

  • Complete
  • Valid
  • Accurate
  • Timely
  • Consistent
  • Etc.

There are more dimensions than these, and I will cover them in a separate article. The point here is that data quality management ensures our data adheres to them.
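To make the dimensions a little more concrete, here is a minimal sketch of how three of them (completeness, validity, and timeliness) could be measured on a handful of records. The sample records, field names, email pattern, and the 365-day freshness threshold are all assumptions made for illustration, not a standard or a recommended rule set.

```python
import re
from datetime import date

# Hypothetical sample records; the field names are assumptions.
records = [
    {"id": 1, "email": "ana@example.com", "updated": date(2024, 1, 10)},
    {"id": 2, "email": None, "updated": date(2024, 1, 12)},
    {"id": 3, "email": "not-an-email", "updated": date(2021, 6, 1)},
]

# Deliberately simple format check, for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def validity(rows, field, pattern):
    """Share of non-empty values that match the expected format."""
    values = [r[field] for r in rows if r.get(field) not in (None, "")]
    if not values:
        return 1.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows updated within max_age_days of the as_of date."""
    fresh = sum(1 for r in rows if (as_of - r[field]).days <= max_age_days)
    return fresh / len(rows)


scores = {
    "completeness": completeness(records, "email"),
    "validity": validity(records, "email", EMAIL_PATTERN),
    "timeliness": timeliness(records, "updated", date(2024, 2, 1), 365),
}
```

Each function returns a ratio between 0 and 1, so the results can be tracked over time as data quality metrics. What counts as "valid" or "timely" is exactly the kind of decision that has to come from agreed standards rather than from the code.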

As Dr. Peter Aiken puts it, data quality ensures our data is "fit for purpose". To put it simply, data quality management ensures that we have data of good quality, data that is clean.

Let's expand our understanding of data quality management and look at it from the point of view of DAMA International. Data quality is one of the 11 data management domains identified by the Data Management Association International:

DAMA-DMBOK2 Data Management Framework

  • Data architecture
  • Data modeling & design
  • Data storage & operations
  • Data security
  • Data integration & interoperability
  • Document & content management
  • Reference & master data
  • Data warehousing & business intelligence
  • Metadata
  • Data quality
  • Data governance

[Figure: DAMA wheel, data quality]

According to DAMA, Data Quality Management consists in “the planning, implementation and control of the activities that apply quality management techniques to data, in order to assure it is fit for consumption and meets the needs of data consumers.” 

Data quality actually plays a role in most of the other data management domains, and they play a role in it. Think about it: you can't afford poor data quality if you want to ensure data security; metadata's relationship with data quality is a two-way street; data architecture affects the quality of data, and the quality of data affects architecture; and so on. We can go around the DAMA wheel and find a data quality influence in each of those areas. So you need an enabler, a connector, to ensure all these data management practices come together. That connector is data governance.

Data quality with or without data governance

I think it comes as no surprise that there are different definitions of what data governance is. Check out my other article on data governance to go over them, or the one about data governance and data management for more detail and a better understanding of what each of these is. But if you'd like to take a shortcut, data governance is “the discipline which provides the necessary policies, processes, standards, roles and responsibilities needed to ensure that data is managed as an asset.”

By now you might say, "Ok, I kind of get it, but we have data quality without having data governance." If that's the case, I think there are two possible realities:

  1. There is no data governance
  2. Data governance exists, but it is undercover

Let's look at these two cases in more detail.

1. There is no data governance

If data governance is nonexistent, then we'll probably encounter one, some, or all of these cases:

1. Data quality is not enterprise-wide: the data quality initiative or program most likely does not span the enterprise. Even if it is focused on an enterprise system such as an ERP or a CRM, that does not make data quality enterprise-wide. Sure, it might have a wide reach, but it is localized to one system, and that is an issue, especially in larger organizations: data quality rules may have been created with input only from that system's stakeholders, yet they can affect people who are not stakeholders or users of the system.

2. Data quality efforts are localized: if data quality is not localized to a system, it is probably localized to a particular department or departments. That creates other issues that I won't detail here, though some of them are outlined in the following points.

3. There's a lack of common standards: data quality standards might be created, but only as they pertain to the needs of a particular department, business unit, or system. As soon as other departments or systems get onboarded into the efforts of the data quality program, there will be conflicts in terms of these standards which will require workarounds or complete changes.

4. There are no clear roles and responsibilities: there might be resources assigned to cleanse and maintain data, but the type of resource differs from one team to another. It might also be unclear who owns what, and who has the final say when conflicts in standards, definitions, and priorities arise.

5. Data quality management is mostly reactive: this is usually the sign of a data quality program in its early stages, but also of one without data governance. Data quality issues are identified and dealt with reactively, without always tackling the root cause of the problem.
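The lack of common standards described in point 3 can be illustrated with a small sketch. It assumes each department keeps its own list of standards per field and shows how conflicts surface once those lists are put side by side. The department names, field names, and chosen standards are hypothetical.

```python
# Hypothetical per-department standards for the same fields.
department_standards = {
    "sales": {"country": "ISO 3166-1 alpha-2", "phone": "E.164"},
    "finance": {"country": "ISO 3166-1 alpha-3", "phone": "E.164"},
    "support": {"country": "ISO 3166-1 alpha-2"},
}


def find_conflicts(standards):
    """Return the fields for which departments declare different standards."""
    by_field = {}
    for dept, rules in standards.items():
        for field, standard in rules.items():
            # Group departments under the standard they use for this field.
            by_field.setdefault(field, {}).setdefault(standard, []).append(dept)
    # A field is in conflict when more than one standard is in use for it.
    return {f: s for f, s in by_field.items() if len(s) > 1}


conflicts = find_conflicts(department_standards)
```

Here `conflicts` contains only the `country` field, because finance uses a different country code standard than sales and support. Deciding which standard wins, and who gets to decide, is not something the code can answer.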

2. Data governance exists, but it is undercover

You might say, "Well, we're not in such a bad place as the one you've just described." You might actually have:

  • A data quality policy
  • Data quality standards
  • Data quality metrics & KPIs
  • Defined roles and responsibilities
  • Defined processes, procedures, etc.

Then you probably have what I like to call "undercover data governance". You have many of the pieces of data governance, but without a defined organizational framework, without data governance being formalized. You have more data governance elements than you think you do, and that is a good place to be: you can use this momentum and the work already done to start formalizing data governance.

Data quality and data governance relationship

I think the previous section has already made it clear that data governance and data quality exist in a symbiotic relationship. They are two sides of the same coin. You can't have good data quality without data governance, and a data governance implementation would have to be very ineffective not to address data quality.

There is actually quite a bit of an overlap between data quality and data governance, such as in:

  • Data rules
  • Data standards
  • Data auditing
  • Data validation
  • Ongoing evaluation
  • Data enhancement
  • Data quality dimensions
  • Metrics & KPIs
  • Reporting
  • Prioritization
  • Ongoing improvement
  • Processes & procedures
  • Communication & change management

[Figure: data quality and data governance overlap]

Data governance describes who needs to do what, to what data, under what conditions, and with what processes, procedures, tools, and overall best practices. Much of this benefits data quality, but not only data quality. The standards, metrics, roles and responsibilities, data rules, and so on benefit data quality, hence the overlap, but they also directly benefit master data management, data accessibility, data integration, metadata management, business intelligence, even data security, and so on.

Of course, there are also areas that pertain only to data quality, such as data profiling, data matching, root cause analysis, and data cleansing, just as there are areas that pertain only to data governance: data accessibility, data compliance, data policies, and roles & responsibilities.

Conclusion

Data quality is often one of the drivers of data governance and the initial focus of a data governance program, which may explain the confusion between the two. But again, they are not the same. They are two sides of the same coin, and you can't have one without the other.



About the author 

George Firican

George Firican is the Director of Data Governance and Business Intelligence at the University of British Columbia, which is ranked among the top 20 public universities in the world. His passion for data led him towards award-winning program implementations in the data governance, data quality, and business intelligence fields. Due to his desire for continuous improvement and knowledge sharing, he founded LightsOnData, a website which offers free templates, definitions, best practices, articles and other useful resources to help with data governance and data management questions and challenges. He also has over twelve years of project management and business/technical analysis experience in the higher education, fundraising, software and web development, and e-commerce industries.
