As part of a sustainable data quality program, you need to identify the root causes of your bad data. Otherwise you will keep spinning your wheels, spending resources to correct the same issues over and over without ever addressing their cause(s). There are many data quality root cause analysis techniques, and I will cover the most important ones in future articles. This one focuses on the fishbone diagram in the context of data quality.
To get you started, I’m offering a free fishbone diagram template focused on bad address data, at the end of this article.
The fishbone diagram is a tool used to identify the possible causes of an issue, in our case a data quality issue.
It is also known as the Ishikawa diagram, herringbone diagram, cause-and-effect diagram, or Fishikawa.
The diagram focuses on the multiple root causes of a single data quality issue. Each root cause, or reason for bad data quality, is added to the diagram and grouped into a category so the causes can be classified. The end result looks like a fishbone, hence its name. The contour of the fish is optional.
The basic concept was first used in the 1920s, but the actual fishbone diagram was popularized in the 1960s by Kaoru Ishikawa, who pioneered quality management processes in the Kawasaki Shipyards. The main issue is recorded on the right side of the page because traditional Japanese script reads down a vertical column from right to left across the page.
When to use the fishbone diagram
- While conducting a workshop to identify the possible causes of the issue
- When the team’s thinking is not converging on a root cause
Benefits
- It is a highly visual brainstorming tool that can spark further examples of root causes
- You can quickly see whether a root cause appears multiple times in the same or in different causal trees
- It lets you see all causes simultaneously
- It is a good visualization for presenting issues to stakeholders
Drawbacks
- Complex issues might yield so many causes that the diagram becomes visually cluttered
- Interrelationships between causes are not easily identifiable
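Once a diagram is written down as data, spotting a root cause that recurs across categories can be done mechanically. The following is only a minimal sketch: the dictionary layout, the `repeated_causes` helper, and the placement of "No training" under two categories are assumptions made for illustration, not part of any fishbone standard.

```python
from collections import defaultdict

# Hypothetical diagram content: category -> causes noted under it.
fishbone = {
    "Tools": ["Free text fields", "No training"],
    "Employees": ["No training"],
    "Standards": ["No ISO adoption"],
}


def repeated_causes(diagram):
    """Return {cause: [categories]} for causes that appear more than once."""
    seen = defaultdict(list)
    for category, causes in diagram.items():
        for cause in causes:
            # Normalize case and spacing so near-identical wording still matches.
            seen[" ".join(cause.lower().split())].append(category)
    return {cause: cats for cause, cats in seen.items() if len(cats) > 1}


print(repeated_causes(fishbone))  # {'no training': ['Tools', 'Employees']}
```

A cause that shows up under several categories is often a good candidate to investigate first, although that judgment still belongs to the workshop participants.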
Steps to develop it
1. State the data quality issue: This is the issue for which you will determine the root causes. Note it on the right of the diagram (i.e. the head of the fish). Draw a horizontal arrow from the left of the diagram to the right, pointing to the data quality issue. This is the fishbone spine.
2. Determine the categories: Brainstorm the main categories of the root causes. If you need a helping hand, start with the following: tools, employees, processes, standards, data sources. Draw each category as a branch pointing into the fishbone spine.
3. Add root causes: Brainstorm the main reasons for bad data quality and attach each one to the category it pertains to. If a cause pertains to multiple categories, add it to each one.
4. Add root sub-causes: Identify why each cause happens and add those reasons as sub-causes.
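The four steps above map naturally onto a small nested data structure, which can be handy for capturing workshop output before drawing it. This is a rough sketch under stated assumptions: the `Cause` and `Fishbone` classes and the text-outline rendering are illustrative inventions, and the sub-cause shown is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cause:
    """A root cause; sub_causes explain why this cause happens (step 4)."""
    text: str
    sub_causes: list[Cause] = field(default_factory=list)


@dataclass
class Fishbone:
    """The issue is the fish head (step 1); categories are the main bones (step 2)."""
    issue: str
    categories: dict[str, list[Cause]] = field(default_factory=dict)

    def add_cause(self, category: str, cause: Cause) -> None:
        # Step 3: attach the cause to its category, creating the category if needed.
        self.categories.setdefault(category, []).append(cause)

    def outline(self) -> str:
        """Render the diagram as an indented text outline."""
        lines = [f"Issue: {self.issue}"]

        def walk(cause: Cause, depth: int) -> None:
            lines.append("  " * depth + "- " + cause.text)
            for sub in cause.sub_causes:
                walk(sub, depth + 1)

        for category, causes in self.categories.items():
            lines.append(f"[{category}]")
            for cause in causes:
                walk(cause, 1)
        return "\n".join(lines)


diagram = Fishbone(issue="Bad address data")
diagram.add_cause(
    "Tools",
    Cause("Free text fields", [Cause("No address validation (hypothetical)")]),
)
diagram.add_cause("Employees", Cause("No training"))
diagram.add_cause("Standards", Cause("No ISO adoption"))
print(diagram.outline())
```

Because sub-causes are themselves `Cause` objects, the structure can go as many levels deep as the analysis requires.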
Here is a fishbone diagram example covering the different root causes of bad addresses. The boxed red words represent the groupings (e.g. Tools, Employees, Data sources) and the “boxless” red words represent the main root causes (e.g. Free text fields, No training, No ISO adoption).
Tips
- Make sure the causes and sub-causes are based on facts, not opinions or speculation
- Even though you can have as many categories as you need, it is recommended not to go beyond 10, if possible
- You can go more than 2 levels deep with your causes
- To identify each root cause, you can apply the “5 whys” technique
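The “5 whys” technique amounts to asking “why?” repeatedly, typically up to five times, and treating the last answer as the candidate root cause. A minimal sketch of that chain follows; the `five_whys` helper and the workshop answers are hypothetical and only meant to show how the answers nest as sub-causes.

```python
def five_whys(issue, answers, max_depth=5):
    """Chain successive 'why?' answers; the last one is the candidate root cause."""
    chain = [issue]
    # Stop after max_depth answers, or earlier if the team runs out of answers.
    for answer in answers[:max_depth]:
        chain.append(answer)
    return chain


# Hypothetical answers gathered in a workshop.
chain = five_whys(
    "Bad address data",
    [
        "Addresses are typed into free text fields",
        "The entry form has no structured address fields",
        "No address standard was adopted when the form was designed",
    ],
)

for depth, step in enumerate(chain):
    prefix = "Issue: " if depth == 0 else "  " * depth + "Why? "
    print(prefix + step)

candidate_root_cause = chain[-1]
```

Each answer in the chain would sit one level deeper on the diagram, and each one should still be backed by facts rather than speculation.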
There are several tools you can use to draw a fishbone diagram, besides the classic whiteboard or flipchart and marker:
- MS Visio: free template/example provided above
- Realtime board