v0.4.0: Perturbation-based methods, Int8 backward attribution, contrastive step function and more
What’s Changed
Perturbation-based Attribution Methods (#145)
Thanks to @nfelnlp, this version introduces the PerturbationAttributionRegistry and the OcclusionAttribution (`occlusion`) and LimeAttribution (`lime`) methods, both adapted from Captum's original implementations.
- Our implementation of Occlusion (Zeiler and Fergus, 2014) estimates feature importance by replacing each input token embedding with a baseline (default: UNK) and computing the difference in output, producing coarse-grained attribution scores (one per token).
- LIME (Ribeiro et al., 2016) samples points around a specified input example and uses model evaluations at these points to train a simpler interpretable "surrogate" model, such as a linear model. We adapt the implementation by Atanasova et al. for use in the generative setting.
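The occlusion idea can be illustrated with a small framework-free sketch (the function and the linear "model" below are hypothetical stand-ins for illustration, not Inseq's implementation):

```python
import numpy as np

def occlusion_importance(features, model_fn, baseline=0.0):
    """Toy occlusion: replace each feature with a baseline value and
    record the drop in the model's output (one score per position)."""
    full_score = model_fn(np.array(features, dtype=float))
    scores = []
    for i in range(len(features)):
        occluded = np.array(features, dtype=float)
        occluded[i] = baseline  # stand-in for the UNK embedding baseline
        scores.append(full_score - model_fn(occluded))
    return np.array(scores)

# Toy "model": a weighted sum, so each importance equals weight * value
weights = np.array([0.5, 2.0, -1.0])
model_fn = lambda x: float(np.dot(weights, x))
print(occlusion_importance([1.0, 1.0, 1.0], model_fn))  # importance per position: 0.5, 2.0, -1.0
```

The real method operates on token embeddings and model outputs rather than raw feature vectors, but the replace-and-diff loop is the same.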
Attribute `bitsandbytes` Int8 Quantized Models (#163)
Since the 0.37 release of `bitsandbytes`, an efficient backward pass for matrix multiplications is enabled for all int8-quantized models loaded with 🤗 Transformers. This release adds support for attributing int8 models with attribution methods relying on a backward pass (e.g. `integrated_gradients`, `saliency`). In the following simple example, we attribute the generation steps of a quantized GPT-2 XL (1.5B parameters) model using the `input_x_gradient` method, with the whole process requiring less than 6GB of GPU RAM:
```python
import inseq
from transformers import AutoModelForCausalLM

hf_model = AutoModelForCausalLM.from_pretrained("gpt2-xl", load_in_8bit=True, device_map="auto")
inseq_model = inseq.load_model(hf_model, "input_x_gradient", tokenizer="gpt2-xl")
out = inseq_model.attribute("Hello world, this is the Inseq", generation_args={"max_new_tokens": 20})
out.show()
```
Contrastive and Uncertainty-weighted Attribution (#166)
This release introduces two new pre-registered step functions, `contrast_prob_diff` and `mc_dropout_prob_avg`.
- `contrast_prob_diff` computes the difference in probability between a generation target (e.g. *All the dogs are barking loudly*) and a contrastive alternative (e.g. *All the dogs are crying strongly*) at every generation step, with the constraint of a 1-1 token correspondence between the two strings. If used as `attributed_fn` in `model.attribute`, it corresponds to the contrastive attribution setup of Yin and Neubig (2022).
- `mc_dropout_prob_avg` computes an uncertainty-weighted estimate of each generated token's probability using `n_mcd_steps` of the Monte Carlo Dropout method. If used as an attributed function instead of the vanilla `probability`, it can produce more robust attribution scores at the cost of additional computation.
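As a rough illustration of the quantity `contrast_prob_diff` computes, here is a toy sketch over per-step logits (the helper names and logits below are invented for the example and do not reflect Inseq's internals):

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def contrast_prob_diff(step_logits, target_ids, contrast_ids):
    """For every generation step, the probability of the target token
    minus the probability of the contrastive token (1-1 correspondence)."""
    diffs = []
    for logits, tgt, alt in zip(step_logits, target_ids, contrast_ids):
        probs = softmax(logits)
        diffs.append(probs[tgt] - probs[alt])
    return np.array(diffs)

# Two generation steps over a toy 4-token vocabulary
step_logits = [np.array([2.0, 1.0, 0.0, -1.0]), np.array([0.0, 3.0, 1.0, 0.0])]
print(contrast_prob_diff(step_logits, target_ids=[0, 1], contrast_ids=[1, 2]))
```

In Inseq, these per-step differences become the quantity that gradients are taken with respect to, instead of the plain target probability.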
See this tutorial in the documentation for a reference on how to register and use custom attributed functions.
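For intuition, the averaging behind `mc_dropout_prob_avg` can be sketched as follows, assuming a generic probability function and inverted-dropout masking (all names here are illustrative, not Inseq's API):

```python
import numpy as np

rng = np.random.default_rng(42)

def mc_dropout_prob_avg(prob_fn, x, n_mcd_steps=100, p_drop=0.1):
    """Average the predicted probability over several stochastic forward
    passes with random dropout masks (Monte Carlo Dropout)."""
    probs = []
    for _ in range(n_mcd_steps):
        keep = rng.random(x.shape) >= p_drop              # random dropout mask
        probs.append(prob_fn(x * keep / (1.0 - p_drop)))  # inverted-dropout scaling
    return float(np.mean(probs))

# Toy "model": sigmoid of a weighted sum
weights = np.array([0.4, -0.2, 0.7])
prob_fn = lambda x: 1.0 / (1.0 + np.exp(-np.dot(weights, x)))

x = np.array([1.0, 2.0, 0.5])
print(mc_dropout_prob_avg(prob_fn, x), "vs vanilla", prob_fn(x))
```

In practice the dropout layers of the attributed model itself are kept active at inference time, rather than masking inputs as done in this toy version.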
Multilingual MT and Factual Information Location Examples (#166)
The Inseq documentation contains two new examples:
- Attributing Multilingual MT Models shows how to use Inseq to attribute the generations of multilingual MT models like M2M100 and NLLB, which require setting target language flags before generation.
- Locating Factual Knowledge in GPT-2 shows how layer-specific attribution methods can be used to obtain intermediate attributions of language models like GPT-2. Using the quantized and contrastive attribution approaches described above, the example reproduces some observations made by Meng et al. (2022) on the localization of factual knowledge in large language models.
All Merged PRs
🚀 Features
- Add OcclusionAttribution and LimeAttribution (#145) @nfelnlp
- `bitsandbytes` compatibility (#163) @gsarti
🔧 Fixes & Refactoring
- Demo Release Changes (#166) @gsarti
- Fix EOS baseline for models with `pad_token_id` != 0 (#165) @gsarti
👥 List of contributors
This release wouldn't have been possible without the contributions of these amazing folks. Thank you!