What is fault tree analysis?
This quick guide provides an overview of the basic concepts of the fault tree analysis technique as it applies to data quality.
Fault tree analysis is a top-down, deductive failure analysis that examines an undesirable state of a system, using Boolean logic to combine a series of lower-level events.
Also known as: Fault Tree Diagram, Negative Analytical Tree (though technically, the diagram/tree is the output of the fault tree analysis).
The technique is used mainly in aerospace, engineering and other high-hazard industries, but also in software engineering for debugging and for identifying data quality issues and their causes. Its main output is a Fault Tree Diagram (FTD): a top-down view of the pathways within a system that can lead to a foreseeable, undesirable failure, which in our case is a data quality issue. The pathways connect contributory events and conditions using standard logic symbols (AND, OR, etc.). At the most basic level, the constructs in a fault tree diagram are:
- gates/ conditions/ logic gates (all synonyms), and
This is oversimplified, but you get the idea. By the way, there are a few other gate types and diagram elements that can be used. If you’re interested in learning more, you can check out the Fault Tree Analysis: A Bibliography from the NASA Scientific and Technical Information (STI) Program.
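To make the idea of events combined through AND/OR gates more concrete, here is a minimal sketch in Python. The class names (`Event`, `Gate`), the `is_triggered` helper and the missing-email scenario are all hypothetical, chosen only for illustration; they do not come from any standard fault tree library.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Event:
    """A basic event: a condition that either occurred or did not."""
    name: str
    occurred: bool = False


@dataclass
class Gate:
    """A logic gate that combines lower-level events or gates."""
    name: str
    kind: str  # "AND" or "OR" (other gate types are not modelled here)
    inputs: List[Union["Gate", Event]] = field(default_factory=list)


def is_triggered(node: Union[Gate, Event]) -> bool:
    """Return True if the event occurred or the gate's logic is satisfied."""
    if isinstance(node, Event):
        return node.occurred
    results = [is_triggered(child) for child in node.inputs]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"Unsupported gate type: {node.kind}")


# Hypothetical data quality issue: customer records with a missing email.
no_validation = Event("Web form does not validate email", occurred=True)
field_optional = Event("Email column is nullable", occurred=True)
etl_drops_field = Event("ETL job drops the email column", occurred=False)

# Both entry conditions are needed for an empty email to be accepted.
entry_fault = Gate("Empty email accepted at entry", "AND",
                   [no_validation, field_optional])
# Either pathway is enough to cause the top event.
top_event = Gate("Customer records missing email", "OR",
                 [entry_fault, etl_drops_field])

print(is_triggered(top_event))  # True
```

Reading the tree from the top down, the OR gate says the issue can arise through either pathway, while the AND gate says the entry pathway only fails when both of its conditions hold at the same time.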
The basic concept was developed at Bell Telephone Laboratories in 1962 by H.A. Watson, under contract to the US Air Force, for use with the Minuteman system. Fun fact within a fun fact: the Minuteman system refers to the Minuteman I Intercontinental Ballistic Missile (ICBM) Launch Control System, so every fail-safe needed to be in place. The technique was later adopted and used extensively by Boeing, and the rest is history.
When to use the fault tree analysis
- When you need to understand the logic leading to the data quality issue
- When you need to show compliance with the data quality requirements
- When you need to prioritize the resolution of the causes leading to the top event, i.e. the data quality issue
- When you need to create a diagnostic process for resolving a data quality issue
Advantages
- A highly structured, graphical representation of the causes and events leading to the data quality issue
- Effective for analyzing recurrent and persistent data quality issues, because such issues tend to have common causes
- Good visualization for presenting issues to stakeholders
Disadvantages
- If a wrong cause is identified, the subsequent causes in that branch may be erroneous or invalid, and time is wasted exploring it
- If there are too many branches and levels, the tree can become hard to keep track of
Steps to develop it
1. State the data quality issue: This is the issue for which you will determine the causes. Ideally, you will have a separate fault tree diagram for each system/process that you want to examine.
2. Determine top-level faults: Brainstorm the main categories with the subject matter experts for the system/process.
3. Identify causes for top-level faults: Brainstorm the main reasons for bad data quality and attach each one to the relevant top-level fault.
4. Identify next levels: For each cause, check whether the tree goes deeper, and keep adding levels until it does not.
- To see what to focus on the most, try to add a probability of occurrence to each event
- Besides the tree, make sure you also take plenty of notes in order to capture the details and the context of each finding
- To go deeper and take a more granular approach, you can apply the "5 Whys" technique
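As a rough illustration of the tip on probabilities of occurrence, the sketch below combines event probabilities through AND and OR gates. It assumes the events are statistically independent, which real causes often are not. The event names and probability values are made up for the example.

```python
from math import prod


def and_gate(probabilities):
    """All inputs must occur: multiply the probabilities (assumes independence)."""
    return prod(probabilities)


def or_gate(probabilities):
    """At least one input occurs: 1 minus the chance that none occur."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical probabilities of each basic event, per data load.
p_no_validation = 0.10    # web form does not validate the email
p_field_optional = 0.50   # email column accepts empty values
p_etl_drops_field = 0.02  # ETL job drops the email column

# An empty email is accepted at entry only if both conditions hold.
p_entry_fault = and_gate([p_no_validation, p_field_optional])

# The top event occurs if either pathway occurs.
p_top = or_gate([p_entry_fault, p_etl_drops_field])

print(f"Entry fault: {p_entry_fault:.3f}")  # Entry fault: 0.050
print(f"Top event:   {p_top:.3f}")          # Top event:   0.069
```

With these made-up numbers, the entry pathway (0.05) contributes more to the top event than the ETL pathway (0.02), so it would be the first branch to investigate.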
There are several tools you can use to draw a fault tree diagram, besides the classic whiteboard or flipchart and marker:
- MS Visio – FREE template/ example provided above