Fixing data quality issues is only part of the solution. To keep them from recurring, you need to identify the root cause and stop it from creating more poor-quality data. Every effect has a cause, and that is the basis of root cause analysis. The chain linking a cause to its effect can vary in length, but as you move along it, the causes and effects become finer and finer until you reach the root. There are different techniques for identifying the root cause, two of which I’ve covered earlier: the 5 whys technique and the fishbone diagram.
This week I’ll go over the barrier analysis.
Barrier analysis is a root cause analysis technique that helps identify both the pathways through which a hazard (in this case, a cause of poor data quality) can affect the quality of the data and the measures through which that quality can be maintained.
There are four basic elements in the barrier analysis:
- Target = In the context of data quality, it represents the desired level of quality for the chosen data set
- Hazard/Threat = This is the way in which the target can be harmed. In the context of data quality, this represents the agent that can adversely affect the desired state of data quality.
- Barrier = A prevention method placed between the hazard and the target so that the hazard does not have an undesired effect on the target. It can be active (i.e. its protective nature needs the actions of an agent, such as a Data Quality Coordinator) or passive (i.e. no additional action on the part of any agent is required, such as an address cleansing API). Note: in some versions, a Barrier is always passive and a Control is a synonym for an active Barrier.
- Pathway = A route or mechanism through which a hazard can undesirably affect a target.
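To make the four elements concrete, here is a minimal sketch of how they could be recorded in Python. The class names, fields and example values are assumptions made for illustration, not part of any standard barrier analysis tooling.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class BarrierKind(Enum):
    ACTIVE = "active"    # needs an agent to act, e.g. a Data Quality Coordinator
    PASSIVE = "passive"  # works without further action, e.g. an address cleansing API


@dataclass
class Barrier:
    name: str
    kind: BarrierKind


@dataclass
class Pathway:
    hazard: str  # the agent that can adversely affect data quality
    target: str  # the desired level of quality for the chosen data set
    barriers: List[Barrier] = field(default_factory=list)

    def unprotected(self) -> bool:
        """True when no barrier stands between the hazard and the target."""
        return not self.barriers


# Hypothetical example values, chosen only to illustrate the structure.
pathway = Pathway(
    hazard="Free-text address entry",
    target="Customer addresses are deliverable",
    barriers=[
        Barrier("Address cleansing API", BarrierKind.PASSIVE),
        Barrier("Data Quality Coordinator review", BarrierKind.ACTIVE),
    ],
)
```

Writing each pathway down this way makes it easy to spot hazards that have no barrier at all.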
Barrier analysis is tied to the Swiss cheese model, which is a barrier analysis with multiple barriers, each represented by a slice of Swiss cheese 🧀. Why Swiss cheese? Because each potential point of failure within a barrier, or control, is like a hole in a slice of Swiss cheese. A hazard reaches the target only when it passes through a hole in every slice.
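The Swiss cheese idea can be sketched in a few lines of Python. This is only an illustration under assumed data: the barrier names and defect types below are hypothetical, and a real analysis would draw them from your own data lineage.

```python
# Each barrier (slice) catches some defect types; whatever it misses is a hole.
# All names below are hypothetical examples.
BARRIERS = {
    "entry form validation": {"missing_email", "bad_date_format"},
    "address cleansing API": {"invalid_postcode"},
    "coordinator review": {"missing_email", "duplicate_customer"},
}


def stopped_by(defect, barriers):
    """Return the names of the barriers that catch the defect, in order."""
    return [name for name, caught in barriers.items() if defect in caught]


def reaches_target(defect, barriers):
    """A defect harms the target only if no slice catches it."""
    return not stopped_by(defect, barriers)


defects = ["missing_email", "invalid_postcode", "stale_phone_number"]
leaks = [d for d in defects if reaches_target(d, BARRIERS)]
print(leaks)
```

In this toy setup, only the defect type that no slice covers gets through, which is where the holes line up.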
When to use
- To determine the causes for poor data quality along with the data lineage
- When you want to create an inventory of the sources of data and data lineage from the perspective of data quality
- When you need to identify what countermeasures failed to prevent undesired change
Benefits
- A simple technique to learn, which does not require training
- Findings can be easily transformed into corrective action recommendations
- Works well with other methods and techniques, such as the Pareto analysis, fishbone diagram and the 5 whys
Disadvantages
- Poses a risk of promoting linear thinking
- Can be subjective and dependent on the views and knowledge of the participants
- The findings might not be repeatable if you go through the same exercise with other stakeholders
Steps to develop it
- Identify the main target: Identify the main data quality issue whose root cause you want to uncover.
- Gather main stakeholders: Once the data quality issue is identified, identify the main stakeholders who are affected by the issue or who take part in the processes that create or prevent it.
- Identify the barriers: Start documenting the barriers you are aware of that prevent the issue from happening, as well as anything in place that might facilitate it. For simplicity, you can also determine the categories they fall under. For example: training, tool, data source, process, standard.
- Determine solutions: Go through each of these barriers and understand its cause and effect. Doing so might reveal further barriers as well as solutions. Here are some of the questions that should be addressed for each barrier:
  - Did the Barrier perform its intended function under normal operating conditions?
  - Did the Barrier perform its intended function under upset or faulted conditions?
  - Did the Barrier mitigate the Hazard severity?
  - Was the Barrier design adequate?
  - Was the Barrier developed to meet the desired specifications?
  - Was the Barrier maintained?
  - Has the Barrier yielded the desired test results?
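The questions above can be run as a simple checklist per barrier. The sketch below is one possible way to record the answers in Python; the short question labels, the barrier names and the answers are hypothetical, and treating an unanswered question as a failure is an assumption of this example.

```python
# Short labels for the seven questions listed above.
QUESTIONS = [
    "performed under normal conditions",
    "performed under upset or faulted conditions",
    "mitigated hazard severity",
    "design adequate",
    "developed to specification",
    "maintained",
    "yielded desired test results",
]


def failed_questions(answers):
    """Return the questions answered 'no'. Unanswered questions count as 'no' (assumption)."""
    return [q for q in QUESTIONS if not answers.get(q, False)]


# Hypothetical assessment of two barriers.
all_yes = {q: True for q in QUESTIONS}
assessments = {
    "address cleansing API": {**all_yes, "maintained": False},
    "coordinator review": all_yes,
}

# Keep only the barriers with at least one failed question.
findings = {
    barrier: failed_questions(answers)
    for barrier, answers in assessments.items()
    if failed_questions(answers)
}
print(findings)
```

Each entry in `findings` points at a weakness in a barrier, which is a natural starting point for a corrective action recommendation.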
Tips
- Review the barrier analysis results independently
- Don’t rely solely on this technique to determine the root causes; use it in conjunction with others
Tools
- Classic whiteboard/flipchart
- Visio diagram
- Word document
Great post on how to use barrier analysis for improved data quality! Here are a few actionable tips to help organizations improve their data quality:
Identify the barriers: Start by identifying the barriers to data quality, such as outdated systems, lack of standardization, and poor data management practices.
Assess the impact: Evaluate the impact of these barriers on the accuracy and reliability of your data. This will help you prioritize which barriers to address first.
Develop a plan: Develop a plan to address the barriers and improve data quality. This may involve implementing new technologies, changing processes, or providing training to employees.
Implement and monitor: Implement the plan and monitor its effectiveness. Continuously monitor and assess the quality of your data to ensure that it remains accurate and reliable.
Collaborate: Collaborate with other departments and stakeholders to ensure that everyone is working towards the same goal of improving data quality.
By following these tips, organizations can effectively use barrier analysis to improve their data quality and make better informed decisions. Thank you for sharing your knowledge and expertise with us!