
Enhance ML Model Explainability Using the SHAP Visualization Tool

Overview

Machine learning explainability has become increasingly popular because it allows us to understand what happens in a model from start to end. This matters because ML algorithms are increasingly prevalent in decision-making across industries.

Before moving further, we must understand why we need model explainability. To do that, we first need to understand the black box model and the white box model.

Black Box Model vs. White Box Model

A black box model can be viewed only in terms of its inputs and outputs, without any knowledge or understanding of its internal workings. Such models are “opaque”, i.e., we, as clients, cannot understand what is happening behind the scenes.

Black-box models, such as deep neural networks, boosting, and random forest models, are highly non-linear by nature and therefore harder to explain. With black-box models, users can only observe the input-output relationship.

White-box models are models whose behavior can be clearly explained: how they behave, how they produce predictions, and what the influencing variables are. Two key elements make a model white box: the features must be understandable, and the ML process must be transparent. Examples of white-box models are linear models and decision/regression trees.


Model Explainability

Explainability in machine learning means that you can explain what happens in your model throughout, i.e., from input to output. Explainability is the degree to which one can understand the cause of a predicted result or of the decision the model made.

There is a step further, however: we must be able to explain not only what happened but also why, i.e., how causal relationships were picked up between the predictor and the predicted variables.

Suppose we have a prediction model: we must understand what parameters the model takes into account and whether the model contains any bias. This becomes necessary once the model is deployed in the real world, so that the model developers can explain it. Explainability, therefore, is about understanding ML models better: how they make decisions, and why.

Importance of Explainability

Explainability connects technical and non-technical teams, improving knowledge exchange and giving all stakeholders a better understanding of product requirements and limitations.

But there are at least four more reasons why ML explainability is important:

  1. Accountability: Model explanation gives organizations more control over their ML models, as even a non-technical stakeholder can understand which features affect the results.

  2. Trust: In critical domains like healthcare, it is of utmost importance to gain the trust of industry experts who do not know how a model predicts. This is possible with model explainability.

  3. Performance: Understanding how the model behaves can help us improve its performance, because that is how we know what can be done to optimize the model.

  4. Enhanced control: Knowing the decision-making process of the model makes it easy to control, and it provides the ability to act on hidden flaws rapidly.

Let us now look at the SHAP technique.

SHAP

SHAP (SHapley Additive exPlanations) is an effective tool for gaining a deep understanding of what is happening behind the scenes in a black box model by visualizing its output.

[Figure: SHAP overview diagram]

Source: https://shap.readthedocs.io/en/latest/

The Shapley value comes from cooperative game theory: it assumes that we can compute the value of a surplus with or without each analyzed factor, and it estimates each factor’s value by assessing its contribution. In the case of machine learning, the surplus is the output of our algorithm, and the co-operators are the different input features. The goal of SHAP is to explain a prediction by computing the contribution of each feature to the final result.

The SHAP procedure can be applied using the dedicated Python shap library. We can choose from three different explainers (a short usage sketch follows the list):

  • TreeExplainer – for tree-based algorithms.
  • DeepExplainer – for deep learning algorithms.
  • KernelExplainer – for most other algorithms.
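
Here is a minimal sketch of choosing and using an explainer, assuming a scikit-learn random forest trained on synthetic data purely for illustration:

```python
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Toy regression data and a tree-based model, purely for illustration.
X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer suits tree ensembles; DeepExplainer would be the choice
# for deep networks, and KernelExplainer for most other models.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

print(shap_values.shape)  # (200, 5): one value per sample per feature
```

For a model without a dedicated explainer, KernelExplainer takes the model’s predict function plus a background dataset instead; it is model-agnostic but much slower, since it approximates Shapley values by sampling feature coalitions.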

The prediction starts from a baseline. The baseline for Shapley values is the average of all predictions. Each feature value is a force that either increases or decreases the prediction, pushing the model output away from the base value toward the final output. Features pushing the prediction higher are shown in red, and those pushing it lower are shown in blue.
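
Because Shapley values are additive, the base value plus the sum of one sample’s SHAP values reconstructs the model’s prediction exactly for tree ensembles. A quick check, continuing the sketch above:

```python
import numpy as np

# Base value (the average prediction) plus the per-feature contributions
# equals the model output for that sample. Note: expected_value may be a
# scalar or a length-1 array depending on the shap version.
prediction = model.predict(X[:1])[0]
reconstructed = np.ravel(explainer.expected_value)[0] + shap_values[0].sum()
print(np.isclose(prediction, reconstructed))  # True, up to float precision
```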

Advantages of SHAP:

  • The method is solidly grounded in mathematics (the Shapley value from cooperative game theory), so we can be confident that its attributions are fair and consistent.

Disadvantages of SHAP:

  • Computation time: the number of possible feature coalitions grows exponentially with the number of features (2^n subsets for n features), so exact computation quickly becomes infeasible for large feature sets.

Implementation

Step 1 – Installing the package.

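A minimal equivalent of this step, assuming a pip-based Python environment (xgboost is installed alongside shap because the later steps use it):

```
pip install shap xgboost
```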

Step 2 – Importing the required libraries, loading the Boston housing dataset, and passing it to the XGBoost regressor.

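A sketch of this step. Note that the Boston housing loader ships only with older shap and scikit-learn releases; in recent versions it has been removed, and shap.datasets.california() is a near drop-in alternative:

```python
import xgboost
import shap

# Load the Boston housing data: features as a pandas DataFrame,
# target prices as a NumPy array. (Removed in recent shap releases.)
X, y = shap.datasets.boston()

# Fit an XGBoost regressor on the full dataset.
model = xgboost.XGBRegressor(n_estimators=100)
model.fit(X, y)
```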

Step 3 – A glimpse of the dataset.

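Since the features come back as a pandas DataFrame, the usual inspection methods give a quick glimpse:

```python
print(X.head())  # first five rows, with named columns (CRIM, ZN, ...)
print(X.shape)   # (506, 13): 506 houses, 13 features
```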

Step 4 – Using the SHAP functions to get each model parameter’s contribution.

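A sketch of this step: TreeExplainer is the appropriate choice for the XGBoost model trained above, and in recent shap versions the explainer is callable, returning an Explanation object with one contribution per feature per sample:

```python
explainer = shap.TreeExplainer(model)
shap_values = explainer(X)

print(shap_values.shape)         # (506, 13)
print(explainer.expected_value)  # the base value: the average prediction
```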

Step 5 – Features pushing the prediction higher are shown in red, and those pushing it lower are shown in blue. There are several ways to visualize such results; below is how to produce a waterfall plot visualization.

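Assuming shap’s newer plotting API, a waterfall plot for a single prediction looks like this:

```python
# Explain the first sample: red bars push the prediction above the
# base value, blue bars push it below.
shap.plots.waterfall(shap_values[0])
```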

Conclusion

Model explainability is an important technique that can help non-technical teams and domain experts understand how an ML/DL model works. This article focused on understanding the SHAP framework for model explainability. In future articles, we will dig deeper into SHAP and understand how Shapley values are calculated.


About CloudThat

CloudThat is an official AWS (Amazon Web Services) Advanced Consulting Partner and Training Partner, as well as a Microsoft Gold Partner, helping people develop knowledge of the cloud and helping their businesses aim for higher goals using best-in-industry cloud computing practices and expertise. We are on a mission to build a robust cloud computing ecosystem by disseminating knowledge on technological intricacies within the cloud space. Our blogs, webinars, case studies, and white papers enable all the stakeholders in the cloud computing sphere.

Drop a query if you have any questions regarding the SHAP tool, and I will get back to you quickly.

To get started, go through our Consultancy page and Managed Services Package, CloudThat’s offerings.

FAQs

1. Why do we need Model Explainability?

ANS: – With model explainability, we can understand the model inside and out: which of all the features affect the result, and to what extent. It also lets us present the model to industry experts and gain insight into whether the model is working correctly.

2. What are different Model explainability techniques?

ANS: – Different model explainability techniques are listed below:

  1. SHAP: SHapley Additive exPlanations.
  2. LIME: Local Interpretable Model-Agnostic Explanations.
  3. Prediction Difference Analysis (PDA).
  4. TCAV: Testing with Concept Activation Vectors.

WRITTEN BY Parth Sharma
