When starting a data governance program, you need to run a data governance assessment to understand the current status, challenges, and priorities. This as-is analysis will help you put together a business case for data governance and decide what to tackle first.
In this data governance assessment we'll uncover:
- The main pain points
- Who our stakeholders are
- The as-is technical, information, and data landscape
Assessment: main pain points
In the ad-hoc assessment you want to first understand the pain points. I personally like to note these pain points from a data perspective, but at the same time understand their impact on the business. Looking at this from a data perspective, I recommend categorizing your assessment into 3 streams:
- Data acquisition/ creation
- Data maintenance
- Data dissemination
I recommend this order because at a high level this is the process that data goes through. Before I cover each one of these, please note that you can also include "data destruction" and data archival as a stream, but those pain points don't tend to be as dire as the other streams.
When to include "data destruction"
I recommend including a 4th stream, "data destruction/ data archival", if the main driver for your data governance program is regulatory compliance. In that case, your program's immediate focus would most likely need to be on data retention policies and procedures, the right to be forgotten in the case of GDPR, and so on.
How to gather data for your assessment?
I would start informally through meetings and interviews, whether in person, via email, in a Zoom/Teams session, or over a phone call. You can also run a survey, which has a more formal connotation and makes the collected data a lot easier to analyze.
Some might prefer to job shadow someone for a few days. Even though that's insightful, it can take quite a bit of time and it can also be a bit nerve-racking for the person that you're shadowing.
My preferred method is organizing drop-in sessions and workshops where stakeholders come in and share their challenges as they pertain to these streams. Here are some high-level recommendations for a successful workshop series:
- Include stakeholders from different business functions and departments
- Depending on the size of the organization, conduct more than one workshop
- Provide people with an agenda and the necessary materials at least 2 weeks ahead
- Ask them ahead of the workshop to start thinking of their current challenges
Example of main pain points
Here are some examples of main pain points, from a data perspective, that you might gather from your stakeholders:
- Multiple data sources
- Manual processes
- Lack of standards
- No defined processes
- Redundant efforts
- No data validation
- Missed opportunities
- Mostly manual
- No data classification
- Data integration issues
- Little or no data cleansing
- De-duplication issues
- Not timely enough
- Lack of data accessibility
- No "golden record"
- Data outputs are not repeatable
- Invalid, incomplete, inaccurate, inconsistent data
- Data cleansing happens after the data pull
- Lack of definitions
How to use collected information
As part of the same workshop in which this information was collected, ask stakeholders to prioritize the issues that should be tackled first. You can then tally the votes and see which issues the working group thinks should come first. The group is not the ultimate decision maker, but it's good to get them involved, even though you might come back to them and say, "Well, the lack of standards is high on your list, but in order to tackle that we need to address this other thing first". It's all about getting them involved and getting them on board with what the next action items should be.
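As a rough illustration of the tally, here is a minimal Python sketch. The ballots are made up, and the rule that each stakeholder counts once per issue is an assumption for the example, not a prescribed voting method.

```python
from collections import Counter


def tally_votes(ballots):
    """Count how many stakeholders voted for each pain point.

    `ballots` holds one list of pain points per stakeholder.
    A pain point repeated within one ballot is counted once.
    """
    counts = Counter()
    for ballot in ballots:
        counts.update(set(ballot))
    # Highest vote count first; ties are ordered alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ballots gathered during a workshop.
ballots = [
    ["Lack of standards", "Manual processes", "No data validation"],
    ["Lack of standards", "Lack of definitions"],
    ["Manual processes", "Lack of standards"],
]

ranking = tally_votes(ballots)
for pain_point, votes in ranking:
    print(f"{votes} vote(s): {pain_point}")
```

The ranking is only an input to the conversation: an issue with fewer votes may still need to come first if others depend on it.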
Next on the list of the ad-hoc assessment are the people. You should try to gain an understanding of:
- Who the sponsors are
- Who is most affected
- Who the champions are
How will you identify all of these? Through the same methods that I've mentioned before in identifying the pain points. Through those meetings and workshops for the most part.
When you're doing this data governance assessment you might already have a sponsor who has tasked you with putting this together, but there are many instances when that's not the case and the sponsor is to be determined. In order to succeed, a data governance program needs to have a sponsor. But guess what? There are programs that have multiple sponsors and that can be even better. If you have 3 major lines of business, it's great to have a sponsor from each line. It's like having your program endorsed three times. Yes, there could be political challenges between the 3, and that's why securing good sponsors is key.
The most affected
Here I'm referring to identifying the stakeholders most affected by the current status of your data and the lack of a data governance program. For this it's good to look at who is creating the data, who is managing it, and who is ensuring its quality, security, and so on. But also look at who is consuming information that's based on data, who is analyzing it, who has business processes that are dependent on data, and ultimately who is complaining the most.
These are people you want to have in those workshops mentioned above. You should also keep engaging them as part of your data governance implementation and well after.
Ah, the champions. These tend to be the unsung heroes who go above and beyond and care about the quality of the data. Please note that these are not always part of IT; they can be people on the business side who take it upon themselves to manually correct the data they work with. These champions are people who have a lot of business knowledge and understand why things are the way they are and how they should be. These are people who tend to create workarounds for the current situation in order to meet their needs.
Yes, these tend to be data stewards, but without having the title of data steward nor the responsibilities in their job description. These champions are people that are already voicing the importance of managing and governing data. These are your biggest supporters. They don't need any convincing as to why data governance is needed and moreover they will be there to help promote it. So make a list of these individuals.
Assessment: technical, information, and data environment
This last part of the assessment aims to understand the technical, information, and data environment. It's not very important to understand all the details at this point, but it's a nice to have as it can help during your data governance program implementation. I recommend breaking this down by:
- Data sources/ systems
- Data management and data governance tools
- Other artifacts
Data sources/ systems
Here it's good to be aware of what systems and databases are currently in your environment. There are different asset management tools that can track all this, but if there isn't one you can always track it in a spreadsheet and here is a template that I use.
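If you go the spreadsheet route, a minimal Python sketch of such an inventory could look like the one below. The column names and the example systems are hypothetical and chosen for illustration only; adapt them to whatever your organization needs to track.

```python
import csv
import io

# Hypothetical columns; adjust to what your organization tracks.
FIELDS = ["system_name", "system_type", "business_owner",
          "technical_owner", "data_domains"]

# Made-up example entries.
systems = [
    {"system_name": "CRM", "system_type": "SaaS application",
     "business_owner": "Sales Operations", "technical_owner": "IT Applications",
     "data_domains": "Customer; Contact"},
    {"system_name": "Legacy billing DB", "system_type": "Database",
     "business_owner": "", "technical_owner": "IT Infrastructure",
     "data_domains": "Invoice; Payment"},
]


def inventory_to_csv(rows, fields=FIELDS):
    """Serialize the inventory as CSV text that opens in any spreadsheet tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def missing_owners(rows):
    """List systems with no named business owner, a gap worth following up on."""
    return [row["system_name"] for row in rows if not row["business_owner"].strip()]


csv_text = inventory_to_csv(systems)
print(csv_text)
print("No business owner yet:", missing_owners(systems))
```

Keeping the owner columns explicit makes it easy to spot systems where nobody has been named yet.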
Besides this I also recommend a data flow. A high level diagram such as the one below could give you an understanding of the different data sources and systems in your ecosystem and how they are interacting with one another.
If you don't explore this at this point, this is an artifact that you can put together at a later date as it will help you identify some of the technical data stewards, system and business owners that should be engaged for various projects.
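Before drawing the diagram, the data flow can also be captured as a simple list of source-to-target pairs. The Python sketch below uses made-up system names and assumes flows are one-directional; it shows how such a list can answer "which systems sit downstream of this one?", which hints at the owners you may need to engage.

```python
from collections import defaultdict

# Hypothetical flows: (source system, target system).
flows = [
    ("CRM", "Data warehouse"),
    ("ERP", "Data warehouse"),
    ("Data warehouse", "BI reports"),
]


def build_flow_map(flows):
    """Map each source system to the systems it feeds directly."""
    downstream = defaultdict(list)
    for source, target in flows:
        downstream[source].append(target)
    return dict(downstream)


def systems_affected_by(system, flow_map):
    """Return every system reachable downstream from `system`."""
    seen = []
    stack = list(flow_map.get(system, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(flow_map.get(current, []))
    return seen


flow_map = build_flow_map(flows)
affected = systems_affected_by("CRM", flow_map)
print("Downstream of CRM:", affected)
```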
Data management and data governance tools
Next on the list are data management and data governance tools such as the business glossary, data dictionaries, and data catalog. Do they exist? For the most part, probably not, but is there anything on:
- Data lineage
- Data profiling
- Data classification
- Reporting tools
- Data visualization tools
- Data security and so on
This is just for high-level awareness and is a nice to have. Ok, I think this is a good time for me to reinforce the message that data governance is a business function. I know that everything listed above is technical, but these tools just support the application and enforcement of data governance.
Lastly, look for any other artifacts, such as a report catalog, data model, scorecards, or data quality standards. Most likely these won't exist, but it's good to confirm.
You also want to know how much information is documented and how much lives in people's heads. If things are documented, how are they managed and who is maintaining them? Chances are that most of these, if they exist, are being maintained by those champions you've identified, but not shared broadly.
For this ad-hoc data governance assessment, you don't need to do these sequentially, one after the other. You usually start to do all of these in parallel. Lastly, if you have some funding to put into the assessment, I also recommend doing a data profiling exercise to complement all of your findings. This will add that extra level of detail and surface some of the data quality issues at a database level.
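To give a feel for what a basic profiling pass looks at, here is a minimal Python sketch that counts missing and distinct values per column. The sample data and column names are invented, and a real exercise would typically use a dedicated profiling tool against the actual databases.

```python
import csv
import io


def profile_rows(rows):
    """Return per-column counts of total, missing, and distinct values."""
    profile = {}
    for row in rows:
        for column, value in row.items():
            stats = profile.setdefault(
                column, {"total": 0, "missing": 0, "values": set()}
            )
            stats["total"] += 1
            cleaned = (value or "").strip()
            if cleaned == "":
                stats["missing"] += 1
            else:
                stats["values"].add(cleaned)
    return {
        column: {
            "total": s["total"],
            "missing": s["missing"],
            "distinct": len(s["values"]),
        }
        for column, s in profile.items()
    }


# Invented sample extract with a blank email, a blank country,
# and a repeated customer_id.
sample = """customer_id,email,country
1,ana@example.com,US
2,,US
3,li@example.com,
3,li@example.com,CA
"""

rows = list(csv.DictReader(io.StringIO(sample)))
report = profile_rows(rows)
for column, stats in report.items():
    print(column, stats)
```

Fewer distinct values than rows in a column that should be unique, such as customer_id here, points to possible duplicates; missing counts point to incomplete data.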
Another data governance assessment method
This is not the only way of doing a data governance assessment. Another way is through a governance maturity model. If you'd like to learn more about the data governance maturity model, check out the following resources:
- Learn how to select a data governance maturity model
- Uncovering data governance maturity models - a free webinar
- 5 main reasons to leverage a data governance maturity model
- The series of articles on specific data governance maturity models
- Learn all about Data Governance Maturity Models - Online Course