The importance of identifying and addressing the root cause of a data quality issue should never be overlooked. In this series of articles I will cover the most important techniques that help you uncover the root cause. This week focuses on the 5 whys root cause analysis technique.
Definition
Iterative interrogative technique to determine the root cause of a particular issue.
Description
Very popular in the world of lean development, this questioning technique keeps asking the “Why?” question to go deeper into the causes of the issue until it reaches the root cause. The answer to the fifth “why” usually uncovers a broken process, procedure, or policy.
Fun fact
The technique is generally credited to Sakichi Toyoda, the founder of Toyota Industries, and was popularized in the 1950s by Taiichi Ohno, the architect of the Toyota Production System. Ohno encouraged his team to dig into each problem that arose until the root cause was identified. He is often quoted as saying: “Ask ‘why’ five times about every matter.”
When to use
- While conducting a workshop to identify the possible causes of a simple issue
- If you need to isolate a single root cause, not multiple
- When you can easily identify the stakeholders and subject matter experts tied to the issue
- When you want to get an initial insight and a starting point for at least one cause of an issue
Pros
- A simple tool which does not require training
- Works well with other methods and techniques, such as the fishbone diagram
Cons
- The answers might not be repeatable if you go through the same exercise with other stakeholders
- Less effective when the answers come from people who are not involved in at least one step of the process
- There’s a tendency to single out one root cause, even if there might be multiple
- Does not work well for complex problems – those need a more detailed analysis technique
Steps to develop it
1. Gather main stakeholders: Once the data quality issue is identified, identify the main stakeholders who are affected by the issue or who take part in the processes that create it. For example, if the issue is “there are too few customer emails”, the stakeholders might be data stewards, IT, marketing, finance, etc., depending on the processes through which this data is collected, maintained and disseminated.
2. Select a session leader: Each session should select a leader at random to ask the 5 whys. Asking the same question repeatedly about the same issue can start to seem aggressive, so rotating the leader helps defuse potential tension. The leader only needs to ask the question and take notes, or can designate someone else to take the notes. For some of the more difficult topics, a facilitator can be beneficial.
3. Ask “why?” five times: Each question might yield multiple answers. You can follow each of these answers with the next “why?” question, or select the one that seems to be the biggest culprit. If the data quality issue persists, revisit the other answers in a future session. Make sure the answers are:
- based on facts and knowledge
- based on processes, not people – For example, you don’t want the answer to be “Because John does it that way.”
4. Determine solutions: Go through the answers at the deepest levels and come up with corrective actions. Responsibilities will be assigned as part of your internal data quality stewardship procedures.
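If you want to keep the session notes in a structured form, the steps above can be sketched as a small tree: each answer can branch into several answers to the next “why?”, and step 4 works on the answers at the deepest levels. This is only a minimal illustration. The `Answer` class, the `deepest_answers` function and every answer in the sample session are invented for this sketch; only the issue itself comes from the example in step 1.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    """One answer to a "why?" question, plus the answers to the next "why?"."""
    text: str
    causes: list["Answer"] = field(default_factory=list)

    def add(self, text: str) -> "Answer":
        """Record an answer to "why?" asked about this node and return it."""
        child = Answer(text)
        self.causes.append(child)
        return child


def deepest_answers(node: Answer, depth: int = 0) -> list[tuple[int, str]]:
    """Return (depth, text) for every answer that has no further "why?" below it."""
    if not node.causes:
        return [(depth, node.text)]
    found = []
    for cause in node.causes:
        found.extend(deepest_answers(cause, depth + 1))
    return found


# Hypothetical session: the answers below are invented for illustration only.
issue = Answer("There are too few customer emails")
why1 = issue.add("The email field is often empty in new customer records")
why2 = why1.add("The sign-up form does not require an email")
# A second answer to the same "why?", left for a future session.
why1.add("Emails collected on paper are not entered")
why3 = why2.add("The form was designed without data quality requirements")

# Candidates for corrective actions (step 4).
for depth, text in deepest_answers(issue):
    print(f"why #{depth}: {text}")
```

Keeping the unexplored branch in the tree makes it easy to come back to it if the issue persists after the first corrective action.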
Example
Here is a 5 whys example that identifies the root cause of why there are no more customer emails in your CRM database.
Tips
- You don’t need to stop at 5. You can ask “why?” a few more times until you get to the root of the problem
- Don’t jump to conclusions once you hear each answer. Instead, move quickly to the next “why?”
- Instead of the simple “why?” question, you can ask “why do you think this is happening?”
- Pick stakeholders who know the process very well in order to get the best answers
Tools
You do not need any specific tools to document your findings; simple note-taking is enough. A physical whiteboard and marker are usually recommended. If you identify multiple root causes, you can use the fishbone diagram to visualize them. Furthermore, you can use Pareto analysis to help identify the top portion of causes that need to be addressed.
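As a rough sketch of the Pareto step, the snippet below ranks root causes by how many records each one affects and keeps the largest ones until their cumulative share reaches a threshold. The `pareto` function, the 80% default threshold and the counts are all assumptions made for illustration, not figures from a real analysis.

```python
def pareto(cause_counts: dict[str, int], threshold: float = 0.8) -> list[tuple[str, int, float]]:
    """Return (cause, count, cumulative share) for the largest causes,
    stopping once the cumulative share first reaches the threshold."""
    total = sum(cause_counts.values())
    if total == 0:
        return []
    selected = []
    running = 0
    # Largest causes first.
    for cause, count in sorted(cause_counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        selected.append((cause, count, running / total))
        if running / total >= threshold:
            break
    return selected


# Hypothetical counts of affected records per root cause.
counts = {
    "email optional on sign-up form": 620,
    "paper forms not keyed in": 210,
    "import job drops malformed emails": 120,
    "manual deletion": 50,
}

for cause, count, cumulative in pareto(counts):
    print(f"{cause}: {count} records, cumulative share {cumulative:.0%}")
```

With these made-up numbers, the first two causes already cover more than 80% of the affected records, so they would be the ones to address first.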