GenSynth Documentation

Root Cause Analysis

The Dynamic Confusion Matrix acts as the gateway for Root Cause Analysis. Here, you can navigate to different scenarios and analyze the root causes identified by GenSynth Explain for the decisions the model made about data samples it has clustered within a given scenario. It provides a high-level overview showing you:

  • Scenarios where your model has made correct decisions (along the diagonal elements of the matrix).

  • Scenarios where your model made incorrect decisions (in the non-diagonal elements of the matrix), making it easy to identify scenarios where the model does not perform as well.

Each element of the matrix is associated with a specific case scenario where the model makes a particular decision (the predicted class) while a human makes a given decision (the true class).

image21.png

On the right side of the matrix you can see the percentage of correct decisions made by the model with respect to the total number of decisions made by a human for each ground truth ('true') class (i.e., recall). Along the bottom of the matrix you can see the percentage of correct decisions made by the model with respect to the total number of decisions made by the model for each predicted class (i.e., precision). Ideally, you will have a perfect diagonal (100% correct decisions). You can tell immediately which scenarios have the best accuracy scores, as their elements are the most darkly coloured (the darker the colour, the higher the accuracy). Darkly coloured non-diagonal elements indicate scenarios with higher concentrations of incorrect decisions, making it easy to identify and investigate case scenarios where the model does poorly.
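The per-class percentages described above are the standard recall and precision metrics. A minimal sketch of how they can be derived from a confusion matrix (the matrix values and class indices here are illustrative, not GenSynth output):

```python
# Rows = true class (human decision), columns = predicted class (model decision).
# cm[i][j] = number of samples with true class i that the model predicted as class j.
cm = [
    [50,  2,  3],   # true class 0
    [ 4, 40,  6],   # true class 1
    [ 1,  5, 44],   # true class 2
]

def recall(cm, i):
    """Correct decisions / total human decisions toward true class i (row sum)."""
    return cm[i][i] / sum(cm[i])

def precision(cm, j):
    """Correct decisions / total model decisions toward predicted class j (column sum)."""
    return cm[j][j] / sum(row[j] for row in cm)

for k in range(3):
    print(f"class {k}: recall={recall(cm, k):.2f}, precision={precision(cm, k):.2f}")
```

A perfect diagonal corresponds to every recall and precision value being 1.0 (100% correct decisions).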

Note

If your matrix is big enough that you can’t see all of it within the presentation window, you can pan through the elements by holding down the left or right mouse button and dragging the content into the window.

In the Dynamic Confusion Matrix on the Root Cause Analysis tab, hover over an element in the matrix to see a pop-up with details about the predicted class, the true class, and how accurately the model makes decisions in this case scenario.

image23.png

If you click on the element, you are shown the group of samples GenSynth Explain has clustered into that particular case scenario. The sidebar gives you the True Class, the Predicted Class, and the number of samples shown. You can also choose the size of the samples you’d like to view.

Tip

If you attempt to select an element in the Dynamic Confusion Matrix and receive this error message, move the slider in the Explain section of the New Job tab to a higher number or percentage:

No samples were selected for root cause analysis for this element. To improve the number of samples that undergo analysis, please increase the sample ratio for root cause analysis.

Root Cause Analysis for Classification Models

When you select the scenario you want to analyze from the Dynamic Confusion Matrix, you will see a grid of samples GenSynth Explain has grouped into the selected scenario. This example shows samples pertaining to the scenario of ‘cats (class 13) correctly identified as cats’.

image18.png

If you turn on Show overlay of Root Causes, GenSynth will place a visual overlay onto the samples that highlights the key factors (i.e., root causes) that were most influential to the decision made by the model.

image13.png

Clicking on one of the displayed samples will lead you to a detailed analysis screen for that particular sample. This example shows the detailed analysis screen for a misidentified sample for class 13 (cats). The sample was incorrectly predicted by the model as class 48 (toasters).

image14.png

With the root cause visual overlay enabled (via the Show overlay of Root Causes toggle), you can see that the model focused on the region of the sample with glare on the sort-of-toaster-shaped container, and not on the cat at all, leading to the misclassification.

image12.png

Knowing the root causes helps you mitigate errors and biases inherent in the model and build one that is more accurate and dependable.

Using Prediction Confidences

The detailed analysis screen also provides you with more quantitative information about the model’s prediction confidences for an individual sample.

You can also sort the confidence values: alphabetically by class name, by confidence when including the root cause, or by confidence when excluding the root cause. If you have a large number of classes, you can choose to view the top 5, 10, or 25 classes the model considered for identification.
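The three sort orders amount to ordering the same per-class records on different keys. A small sketch of the idea (the class names and confidence values here are hypothetical, for illustration only):

```python
# Each record: (class_name, confidence_with_root_cause, confidence_without_root_cause).
# Values are made up for illustration; they are not GenSynth output.
preds = [
    ("cat",  0.62, 0.10),
    ("frog", 0.21, 0.55),
    ("bird", 0.12, 0.30),
]

by_name       = sorted(preds, key=lambda p: p[0])                 # alphabetical
by_with_rc    = sorted(preds, key=lambda p: p[1], reverse=True)   # incl. root cause
by_without_rc = sorted(preds, key=lambda p: p[2], reverse=True)   # excl. root cause

top_k = 2  # analogous to viewing only the top 5, 10, or 25 classes
print([p[0] for p in by_with_rc[:top_k]])
```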

In this image, where the bird was incorrectly identified, you can see that when the model focuses on the identified root causes while making a decision, it thought the sample was a cat (purple bar) with fairly high prediction confidence; when the model does not focus on the identified root causes, it predicted a frog (red bar) with slightly higher prediction confidence.

image25.png

In this image, where the bird is correctly identified, you can see that when the root cause is included when making a decision, it is correctly identified as a bird with a very high confidence level. However, without the root cause, it was even more confident, incorrectly, that it was a dog.

image17.png
Root Cause Analysis for Object Detection Models

GenSynth Explain’s root cause analysis for object detection models is quite similar to that for classification models. However, while root cause analysis for classification models uses a visual overlay and prediction confidence visualizations to illustrate the key factors with the greatest influence over a model’s decision for a sample, root cause analysis for object detection models uses bounding box overlays to visually identify and illustrate the various scenarios where the model generates different results than those identified by a human labeller. This gives greater insight into the model’s decision about where the objects of interest are located in the sample (along with what those objects are), and how that decision compares to a human’s.

image7.png

Tip

The red boxes are the objects a human has identified, with their ‘true classes’; the purple boxes are the objects the model has identified, with their ‘predicted classes’.

When you select the scenario you want to analyze from the Dynamic Confusion Matrix, you will see a grid of samples that have been clustered by GenSynth Explain for this particular scenario.

Note

Since a single sample may contain multiple scenarios, GenSynth Explain will automatically display the sample with all exhibited scenarios.

You will also see the option Show Bounding Boxes. If you turn this option on, GenSynth will visually overlay ground truth bounding boxes for the objects of interest identified by a human, as well as the predicted bounding boxes for the objects identified by the model. This will allow you to identify errors and discrepancies between the decisions made by the model and the decisions made by a human for a given scenario.

Tip

If you have selected a cell with “Predicted Class: Unmatched” from the Dynamic Confusion Matrix, turning on the Bounding Boxes option will only show you the ground truth bounding boxes. Similarly, if you have selected a cell with “True Class: Unmatched”, turning on the Bounding Boxes option will only show you the predicted bounding boxes.

This example shows all samples that GenSynth Explain clustered together where the model made mistakes by not detecting and identifying the cats in the scene as a human would, with the ground truth bounding boxes displayed.

image10.png

Clicking on one of the displayed samples will lead you to the detailed root cause analysis where you will find:

  • The ground truth and predicted bounding boxes overlaid on the selected sample.

  • The labels and confidences associated with each bounding box.

  • A list of all the different scenarios occurring in the sample, and options to enable and disable the display of bounding boxes for the individual scenarios.

image11.png

In this example, GenSynth Explain provides insights into three different scenarios happening in the sample:

  • The model correctly detected and identified the laptop.

  • The model did not recognize the cat lying inside the laptop.

  • The model did not recognize the chair in the background of the scene.

In the next example, GenSynth Explain reveals that the model correctly detected and identified the cat with extremely high confidence (0.986), but the bounding box identified by the model for the cat did not include the cat’s tail, and as such differed significantly enough from the bounding box identified by the human to be “unmatched”.
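A common way such a matched/unmatched decision is made in object detection is by thresholding the intersection-over-union (IoU) of the predicted and ground truth boxes. GenSynth's exact matching rule is not specified here, so the sketch below shows the standard IoU computation with an assumed threshold of 0.5:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero area if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

IOU_THRESHOLD = 0.5  # assumed value; the actual threshold is tool-specific

# Hypothetical boxes: the ground truth includes the cat's tail,
# while the predicted box cuts it off.
truth = (10, 10, 110, 60)
pred  = (10, 10, 55, 60)
print(iou(truth, pred))                   # prints 0.45
print(iou(truth, pred) >= IOU_THRESHOLD)  # below the threshold: "unmatched"
```

Under this rule, a confident prediction with a slightly-too-small box can still fall below the threshold and be reported as unmatched, which is the situation described above.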

image15.png

Tip

If the labels overlap each other, you can click on a label to bring it to the front to read it more easily.

Bounding Box Options

To access the bounding box options, click the menu button image22.png in the top-left corner once you have selected the sample you’re interested in.

From here you can choose which of the different scenarios occurring in the sample you’d like to see, and whether or not you want the Predicted Class, the True Class, or both for those particular scenarios.

image24.png

This example shows the seven different scenarios occurring within the sample, with only the bounding boxes for correctly identified cats (i.e., true class: cat, predicted class: cat) being selected for display in the visual overlay.

Note

Even though the Show Box Label boxes are checked for the other options, no labels will appear unless the corresponding Bounding Box option is also selected.

image3.png

Tip

If the model did not identify a class, you will only be able to select the bounding box for the True Class. No option will be available for the Predicted Class since the model did not make a prediction for it.

Using Custom Labels

You can upload a comma-separated-value (CSV) file with the class labels you would like to see in your matrix. The filename must end in .csv or .txt.
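The exact column layout GenSynth expects is not shown here; based on the description below (an identifying number plus a text label per line), a plausible labels file could be generated with a short script like this (the class numbers and names are taken from the examples earlier in this section):

```python
import csv

# Hypothetical class-number-to-label mapping; the one-number-one-label-per-line
# layout is an assumption based on the description of the blank-file format.
labels = {13: "cat", 48: "toaster"}

with open("custom_labels.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for class_id, name in sorted(labels.items()):
        writer.writerow([class_id, name])

# Resulting file contents:
# 13,cat
# 48,toaster
```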

Warning

Uploaded custom labels are shared by all users and will replace any labels uploaded by other users.

image1.png

Tip

Using custom labels makes it easier to remember which class or other type of label is which.

image3.png

Example:

image19.png

At this time, if you wish to remove the custom labels, you must upload a blank .csv or .txt file (one that only has an identifying number and no text labels) to overwrite the labels.

image3.png

Example:

image20.png