
# Model Interpretability Part 3: Local Model Agnostic Methods

If you haven’t already had a read of the other parts in this series, check them out:

To recap from Part 1:

Local Interpretability aims to explain individual predictions. It focuses on understanding a specific data point by exploring the feature space around it. This lets us understand the model’s decision for that point, allowing for better interpretability.

## Local Methods

Local methods care little or not at all about the structure of the model, which is treated as a black box. Understanding the distribution of the data and its feature space at a local level, rather than a global one, can give us a more accurate explanation.

In this article, I will be going through three different types of local model agnostic methods.

# Local Surrogate (LIME)

If you read Part 2 of the Model Interpretability series, you will remember Global Surrogate. A global surrogate is an interpretable model that is trained to approximate the predictions of a black-box model.

Local Surrogate, also known as LIME (Local Interpretable Model-agnostic Explanations), is different. Where a global surrogate aims to explain the whole model, a local surrogate trains interpretable models to approximate individual predictions.

The idea of LIME originates from a paper published in 2016, “Why Should I Trust You?”: Explaining the Predictions of Any Classifier, in which the authors perturb the original data point, feed the perturbed points into the black-box model, and observe the outputs.

The method then weights those new data points as a function of their proximity to the original point and trains an interpretable model on the weighted samples. Each original data point can then be explained with its own newly trained explanation model.

The learned model should be a good approximation of the machine learning model’s predictions locally, even if it is not a good approximation globally. This kind of accuracy is called local fidelity.

This can be mathematically expressed as:

$$\text{explanation}(x) = \arg\min_{g \in G} L(f, g, \pi_x) + \Omega(g)$$

• explanation(x) is the explanation for instance x: the model g that minimizes the loss while keeping the complexity low
• L is the loss, such as the mean squared error, which measures how close the explanation is to the predictions of the original model
• f stands for the original model, for example, an XGBoost model
• g stands for the explanation model for instance x
• $\pi_x$ is the proximity measure used to define how large the neighborhood around instance x is that we consider for the explanation
• $\Omega(g)$ is the complexity of the explanation model
• G stands for the family of possible explanations
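To make the recipe concrete, here is a minimal sketch of the perturb, weight, and fit steps using only NumPy. It is not the `lime` library's implementation: the `black_box` function, the Gaussian perturbation, the exponential kernel, and the `kernel_width` value are all assumptions chosen for illustration.

```python
import numpy as np

def black_box(X):
    """Hypothetical stand-in for the black-box model f (nonlinear on purpose)."""
    return np.sin(X[:, 0]) + X[:, 1] ** 2

def lime_explain(f, x, n_samples=5000, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate g around instance x."""
    rng = np.random.default_rng(seed)
    # 1. Perturb: sample new points around x (Gaussian noise is an assumption).
    Z = x + rng.normal(scale=1.0, size=(n_samples, x.size))
    # 2. Query the black box on the perturbed points.
    y = f(Z)
    # 3. Weight each sample by its proximity to x (plays the role of pi_x).
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Weighted least squares: minimise sum_i w_i * (y_i - g(z_i))^2.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])  # intercept + centred features
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]  # local intercept, local feature weights

x = np.array([0.0, 1.0])
intercept, local_weights = lime_explain(black_box, x)
print("local intercept:", intercept)
print("local feature weights:", local_weights)
```

Around this instance the true gradient of the toy function is (1, 2), so the fitted weights should land near those values. Shrinking `kernel_width` makes the neighborhood tighter and the surrogate more local, at the cost of fewer effective samples.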

## An example:

Let’s look at the bike sharing dataset, which can be downloaded from UCI Machine Learning Repository. The dataset contains daily counts of rented bicycles from Capital-Bikeshare, a bicycle rental company in Washington D.C. It also includes data on weather and seasonal information, where the overall goal is to predict how many bikes will be rented depending on the weather and day.

In this example, a random forest with 100 trees has been trained on the classification task, aiming to answer this question: on a particular day, is the number of rental bikes above the trend-free average, based on weather and calendar information?

The results show that warmer temperature and good weather have a positive effect on the prediction. The x-axis shows the ‘effect’: the weight multiplied by the actual feature value.

Advantages:

• The LIME method works for tabular data, text, and images.
• LIME is easy to use and is implemented in Python (the lime library) and in R (the lime package and the iml package).
• The explanations are human-friendly. For example, when using short trees, the results are short, easy to explain, and contrastive.
• The fidelity measure, which measures how well the interpretable model approximates the black-box predictions, tells us how reliable the interpretable model is in explaining the black-box predictions in the neighborhood of the data point.

Disadvantages:

• Data points are currently sampled from a Gaussian distribution, which ignores the correlation between features. This matters because it can produce unlikely data points, which are then used to learn the local models.
• Instability of the explanations. Repeating the sampling process can produce different explanations for the same data point, so it is difficult to trust any single explanation.
• The method is still in development, so there are many problems that need to be solved before it can be safely applied.

# Individual Conditional Expectation (ICE)

Our second local method is Individual Conditional Expectation, which is very similar to the Partial Dependence Plot (PDP). However, instead of plotting an average as PDP does, ICE displays one line per instance, showing how that instance’s prediction changes as a feature changes.

PDP is a global method, as it focuses on the overall average, not on specific instances. ICE is more intuitive than PDP due to its locality: each line represents the predictions for one instance as a feature varies. The overall aim of ICE is to explain what happens to a prediction when the feature changes.

Another way to remember the difference between PDP and ICE is that PDP is the average of the lines of an ICE plot.

An ICE plot shows the dependence between the target function and a particular feature of interest. It visualizes the dependence of the prediction on that feature for each sample, one line per sample. Only one feature of interest is supported for ICE plots.

An ICE plot can unravel relationships that the averaged PDP curve hides. Each ICE curve shows the predictions for one instance as the feature of interest is varied while its other features are held fixed. When the curves are presented in a single plot, we can see relationships between subsets of the instances and differences in how individual instances behave.
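The mechanics can be sketched in a few lines of NumPy. The `model` function and the random data below are hypothetical stand-ins (not the bike rental random forest), chosen so that the effect of feature 0 depends on feature 1.

```python
import numpy as np

def model(X):
    """Hypothetical trained model: feature 0 interacts with feature 1."""
    return 2.0 * X[:, 0] + X[:, 0] * X[:, 1]

def ice_curves(f, X, feature, grid):
    """Return an array of shape (n_instances, len(grid)): one ICE curve per row."""
    curves = np.empty((X.shape[0], len(grid)))
    for k, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value   # set the feature of interest to the grid value
        curves[:, k] = f(X_mod)     # all other features keep their observed values
    return curves

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 2))
grid = np.linspace(-1, 1, 11)

ice = ice_curves(model, X, feature=0, grid=grid)
pdp = ice.mean(axis=0)  # the PDP is the pointwise average of the ICE curves

print(ice.shape)  # one row per instance, one column per grid value
print(pdp)
```

Because of the interaction term, each instance gets a line with its own slope, which is exactly the heterogeneity that the single averaged PDP curve would hide.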

## An example:

To keep this blog consistent, we use the same dataset as in the LIME example: the bike sharing dataset from the UCI Machine Learning Repository. The underlying prediction model is a random forest with 100 trees, trained to predict how many bikes will be rented depending on the weather and the day.

The ICE plots show the predicted bicycle rentals as a function of the weather conditions. For each feature, all the curves follow the same course, with no obvious interactions.

Because the curves behave so similarly, the PDP, which is their average, already summarizes the relationship between these features and the predicted number of bicycles well.

Advantages:

• ICE curves are able to uncover heterogeneous relationships, unlike PDP.
• ICE curves are easier to understand than PDP, as one line represents the predictions for one instance.

Disadvantages:

• ICE curves cannot meaningfully display more than one feature. Two features would require drawing several overlaying surfaces, which would make the plot very difficult to interpret.
• If many ICE curves are drawn, they overlap and the plot becomes overcrowded, which makes it useless for model interpretability.
• Just like PDP, if the feature of interest is correlated with other features, some points along the curves may be invalid data points according to the joint feature distribution.

# Shapley Values

Shapley Values aim to explain why a machine learning model produces the outputs it does. The Shapley value is a concept borrowed from the cooperative game theory literature and is named in honor of Lloyd Shapley.

Shapley Values were originally used to fairly attribute each player’s contribution to the end result of a game. For example, if a set of players collaborate to create some value, we can measure the total outcome of the game, and the Shapley values represent the marginal contribution of each player to that end result. A simpler example is splitting a bill between friends. In machine learning, the features are the players, and the Shapley values tell us how to distribute the “payout” (the prediction) fairly among the features.

The Shapley value of a feature is the contribution its value makes to the payout, weighted and summed over all possible feature value combinations. This can be expressed as:

$$\phi_j(val_x) = \sum_{S \subseteq \{1, \ldots, p\} \setminus \{j\}} \frac{|S|!\,(p - |S| - 1)!}{p!} \left( val_x(S \cup \{j\}) - val_x(S) \right)$$

• $\phi_j$ is the Shapley value of feature j
• S refers to a subset of the features used in the model
• p is the number of features
• x is the vector of feature values that will be explained
• $val_x(S)$ is the prediction for the feature values in set S, marginalized over the features that are not included in set S
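For a handful of features, the sum over subsets can be evaluated directly. The sketch below is an illustration under stated assumptions, not a production method: `model` is a hypothetical toy function, and features outside S are marginalized by averaging over a made-up `background` sample, with the average prediction subtracted so that the empty coalition has value zero.

```python
import numpy as np
from itertools import combinations
from math import factorial

def model(X):
    """Hypothetical model: one additive term and one interaction term."""
    return 3.0 * X[:, 0] + 2.0 * X[:, 1] * X[:, 2]

def value(f, x, S, background):
    """val_x(S): mean prediction with features in S fixed to x, minus the mean prediction."""
    X_mod = background.copy()
    idx = list(S)
    if idx:
        X_mod[:, idx] = x[idx]  # features in S take the values of the explained instance
    return f(X_mod).mean() - f(background).mean()

def shapley_values(f, x, background):
    """Exact Shapley values by enumerating every subset S that excludes feature j."""
    p = x.size
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for S in combinations(others, size):
                gain = value(f, x, S + (j,), background) - value(f, x, S, background)
                phi[j] += weight * gain
    return phi

rng = np.random.default_rng(0)
background = rng.uniform(-1, 1, size=(100, 3))
x = np.array([1.0, 0.5, -0.5])

phi = shapley_values(model, x, background)
print(phi)
print(phi.sum(), model(x[None, :])[0] - model(background).mean())
```

The two numbers on the last line agree: the attributions add up to the difference between the prediction for x and the average prediction. The nested loops visit every subset, so the cost doubles with each additional feature.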

The Shapley value satisfies the following properties: efficiency, symmetry, dummy, and additivity. It is the only attribution method that satisfies all four, which together define a fair payout.

Advantages:

• Efficiency. The difference between the prediction and the average prediction is fairly distributed among the feature values of the instance. Other methods, such as LIME, do not promise a fair distribution between the features.
• Explanation. Shapley values are popular because they give a full explanation grounded in theory, with the effects fairly distributed among the features. They also allow a prediction to be compared with a subset of the data or with a single data point.

Disadvantages:

• Computational power. The Shapley value method is computationally expensive because the number of possible coalitions grows exponentially with the number of features. In addition, the absence of a feature has to be simulated by drawing random instances, which increases the variance of the estimate.
• Features. The Shapley value uses all features, which may not be the explanation some are looking for. Some tasks call for explanations that use only a few selected features, which methods like LIME can provide.
• No prediction model. Shapley values return a value per feature rather than a prediction model, so you cannot use them to predict how the output would change if the input changed.
• Correlated features. Like other permutation-based interpretation methods, the Shapley value method can include unrealistic data instances when features are correlated.
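One common way around the computational cost is to estimate a Shapley value by sampling instead of enumerating every coalition. The sketch below follows the general idea of permutation sampling; the `model` function, the `background` data, and the number of iterations are assumptions for illustration only.

```python
import numpy as np

def model(X):
    """Hypothetical model: one additive term and one interaction term."""
    return 3.0 * X[:, 0] + 2.0 * X[:, 1] * X[:, 2]

def shapley_sampling(f, x, background, j, n_iter=2000, seed=0):
    """Monte Carlo estimate of the Shapley value of feature j for instance x."""
    rng = np.random.default_rng(seed)
    p = x.size
    with_j = np.empty((n_iter, p))
    without_j = np.empty((n_iter, p))
    for m in range(n_iter):
        z = background[rng.integers(background.shape[0])]  # random instance from the data
        order = rng.permutation(p)                         # random feature order
        pos = int(np.where(order == j)[0][0])
        from_x = order[:pos]        # features that precede j in the order come from x
        base = z.copy()
        base[from_x] = x[from_x]
        without_j[m] = base         # feature j (and everything after it) comes from z
        base_with = base.copy()
        base_with[j] = x[j]
        with_j[m] = base_with       # same instance, but feature j comes from x
    # average marginal contribution of feature j over all sampled coalitions
    return float((f(with_j) - f(without_j)).mean())

rng = np.random.default_rng(1)
background = rng.uniform(-1, 1, size=(200, 3))
x = np.array([1.0, 0.5, -0.5])

estimate = shapley_sampling(model, x, background, j=0)
print(estimate)
```

Each iteration draws a random instance to stand in for the "absent" features, which is where the extra variance comes from. Changing `seed` or lowering `n_iter` makes the estimate visibly less stable.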

# SHAP

SHAP (SHapley Additive exPlanations) uses Shapley Values from game theory to explain the output of any machine learning model. It differs from plain Shapley Values in its kernel-based estimation approach. The aim of SHAP is to explain the prediction of an instance by computing the contribution of each feature to that prediction.

Shapley values distribute the prediction fairly among the features. Each player of the game can be an individual feature or a group of features. SHAP combines LIME and Shapley Values by representing the explanation as an additive, linear model, which can be expressed as:

$$g(z') = \phi_0 + \sum_{j=1}^{M} \phi_j z'_j$$

• g refers to the explanation model
• $z' \in \{0,1\}^M$ refers to the coalition vector
• M refers to the maximum coalition size
• $\phi_j \in \mathbb{R}$ refers to the feature attribution for a feature j
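Read as code, the explanation model is just a baseline plus the attributions of the features that are "present" in the coalition. The numbers below are hypothetical and are not taken from any dataset in this article.

```python
import numpy as np

def explanation_model(z, phi0, phi):
    """g(z') = phi0 + sum_j phi_j * z'_j for a binary coalition vector z'."""
    z = np.asarray(z, dtype=float)
    return phi0 + float(np.dot(phi, z))

phi0 = 0.1                          # hypothetical baseline (average prediction)
phi = np.array([0.3, -0.05, 0.15])  # hypothetical feature attributions

all_present = explanation_model([1, 1, 1], phi0, phi)   # every feature in the coalition
none_present = explanation_model([0, 0, 0], phi0, phi)  # empty coalition: baseline only
only_first = explanation_model([1, 0, 0], phi0, phi)    # baseline plus the first attribution

print(all_present, none_present, only_first)
```

With every entry of the coalition vector set to 1, the model returns the baseline plus all attributions, which is how a SHAP plot moves from the average prediction to the prediction for one instance.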

## An example:

This example uses the cervical cancer dataset, which records risk factors for whether a woman will get cervical cancer. The SHAP plots explain how the features contribute to the predictions for two women from the dataset:

The baseline, which is the average predicted probability, is 0.066. The first woman, shown in the first SHAP plot, has a low predicted risk of 0.06. The second woman, shown in the second SHAP plot, has a high predicted risk of 0.71.

For the first woman, factors such as STDs have balanced out the effects of age. For the second woman, factors such as age and the years of smoking have increased her predicted cancer risk.

# Conclusion:

If you have kept up to date with this series of Model Interpretability, we have covered:

• Model Interpretability Part 1: The Importance and Approaches
• Model Interpretability Part 2: Global Model Agnostic Methods
• And now Model Interpretability Part 3: Local Model Agnostic Methods

If you would like to know more about Model Interpretability, I would highly recommend reading Interpretable Machine Learning by Christoph Molnar. His book gave me the guidance and understanding I needed to write this three-part series about Model Interpretability.