Demo Release Changes (#166)

* Fix layer attribution inputs
* Rename AttentionRegistry -> InternalsRegistry
* Registered contrast_prob_diff and mc_dropout_prob_avg, refactored step functions
* Add aggregate_output argument to CLI attribute
* Added GradientSHAP
* Added attribute docs, general doc fixes
* Fix TensorWrapper __repr__
* small fixes to basic attention docstrings
* more small typo fixes
* Added supported methods to readme
* Added MMT & GPT-2 Fact Probing tutorials
* Bump version to 0.4.0

---------

Co-authored-by: Ludwig Sickert <[email protected]>
inseq-team · Feb 27, 2023 · cb8080c
1 parent 8d1f602
commit cb8080c

Showing 39 changed files with 1,388 additions and 1,012 deletions.
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -27,7 +27,7 @@ repos:
         language: system
 
   - repo: https://github.com/charliermarsh/ruff-pre-commit
-    rev: 'v0.0.240'
+    rev: 'v0.0.252'
     hooks:
       - id: ruff
 

diff --git a/README.md b/README.md
@@ -23,12 +23,21 @@
 
 Inseq is a PyTorch-based hackable toolkit to democratize access to common post-hoc **in**terpretability analyses of **seq**uence generation models.
 
+- Documentation: [https://inseq.readthedocs.io](https://inseq.readthedocs.io)
+- Paper: **Coming soon!**
+- PyPI Package: [https://pypi.org/project/inseq](https://pypi.org/project/inseq)
+- MT Gender Bias Demo: [oskarvanderwal/MT-bias-demo](https://huggingface.co/spaces/oskarvanderwal/MT-bias-demo)
+
 ## Installation
 
 Inseq is available on PyPI and can be installed with `pip`:
 
 ```bash
+# Install latest stable version
 pip install inseq
+
+# Alternatively, install latest development version
+pip install git+https://github.com/inseq-team/inseq.git
 ```
 
 Install extras for visualization in Jupyter Notebooks and 🤗 datasets attribution as `pip install inseq[notebook,datasets]`.
@@ -94,33 +103,88 @@ model.attribute(
 
 ![GPT-2 Attribution in the console](https://raw.githubusercontent.com/inseq-team/inseq/main/docs/source/images/inseq_python_console.gif)
 
-## Current Features
+## Features
 
 - 🚀 Feature attribution of sequence generation for most `ForConditionalGeneration` (encoder-decoder) and `ForCausalLM` (decoder-only) models from 🤗 Transformers
 
-- 🚀 Support for single and batched attribution using multiple gradient-based feature attribution methods from [Captum](https://captum.ai/docs/introduction)
-
-- 🚀 Support for basic single-layer and layer-aggregation attention attribution methods with one or multiple aggregated heads.
+- 🚀 Support for multiple feature attribution methods, sourced in part from [Captum](https://captum.ai/docs/introduction)
 
-- 🚀 Post-hoc aggregation of feature attribution maps via `Aggregator` classes.
+- 🚀 Post-processing of attribution maps via `Aggregator` classes.
 
 - 🚀 Attribution visualization in notebooks, browser and command line.
 
-- 🚀 CLI for attributing single examples or entire 🤗 datasets.
+- 🚀 Attribute single examples or entire 🤗 datasets with the Inseq CLI.
 
 - 🚀 Custom attribution of target functions, supporting advanced use cases such as contrastive and uncertainty-weighted feature attributions.
 
 - 🚀 Extraction and visualization of custom step scores (e.g. probability, entropy) alongside attribution maps.
 
-## Planned Development
+### Supported methods
 
-- ⚙️ Support more attention-based and occlusion-based feature attribution methods (documented in [#107](https://github.com/inseq-team/inseq/issues/107) and [#108](https://github.com/inseq-team/inseq/issues/108)).
+Use the `inseq.list_feature_attribution_methods` function to list all available method identifiers and `inseq.list_step_functions` to list all available step functions. The following methods are currently supported:
 
-- ⚙️ Interoperability with [ferret](https://ferret.readthedocs.io/en/latest/) for attribution plausibility and faithfulness evaluation.
+#### Gradient-based attribution
 
-- ⚙️ Rich and interactive visualizations in a tabbed interface using [Gradio Blocks](https://gradio.app/docs/#blocks).
+- `saliency`: [Saliency](https://arxiv.org/abs/1312.6034) (Simonyan et al., 2013)
+
+- `input_x_gradient`: [Input x Gradient](https://arxiv.org/abs/1312.6034) (Simonyan et al., 2013)
+
+- `integrated_gradients`: [Integrated Gradients](https://arxiv.org/abs/1703.01365) (Sundararajan et al., 2017)
+
+- `deeplift`: [DeepLIFT](https://arxiv.org/abs/1704.02685) (Shrikumar et al., 2017)
+
+- `gradient_shap`: [Gradient SHAP](https://dl.acm.org/doi/10.5555/3295222.3295230) (Lundberg and Lee, 2017)
+
+- `discretized_integrated_gradients`: [Discretized Integrated Gradients](https://aclanthology.org/2021.emnlp-main.805/) (Sanyal and Ren, 2021)
+
+#### Internals-based attribution
+
+- `attention`: [Attention Weight Attribution](https://arxiv.org/abs/1409.0473) (Bahdanau et al., 2014)
+
+#### Perturbation-based attribution
+
+- `occlusion`: [Occlusion](https://link.springer.com/chapter/10.1007/978-3-319-10590-1_53) (Zeiler and Fergus, 2014)
+
+- `lime`: [LIME](https://arxiv.org/abs/1602.04938) (Ribeiro et al., 2016)
+
+#### Step functions
 
-- ⚙️ Baked-in advanced capabilities for contrastive and uncertainty-weighted feature attribution.
+Step functions are used to extract custom scores from the model at each step of the attribution process with the `step_scores` argument in `model.attribute`. They can also be used as targets for attribution methods relying on model outputs (e.g. gradient-based methods) by passing them as the `attributed_fn` argument. The following step functions are currently supported:
+
+- `logits`: Logits of the target token.
+- `probability`: Probability of the target token.
+- `entropy`: Entropy of the predictive distribution.
+- `crossentropy`: Cross-entropy loss between target token and predicted distribution.
+- `perplexity`: Perplexity of the target token.
+- `contrast_prob_diff`: Difference in probability between the target token and a foil token used for contrastive evaluation as in [Contrastive Attribution](https://aclanthology.org/2022.emnlp-main.14/) (Yin and Neubig, 2022).
+- `mc_dropout_prob_avg`: Average probability of the target token across multiple samples using [MC Dropout](https://arxiv.org/abs/1506.02142) (Gal and Ghahramani, 2016).
+
+The following example computes contrastive attributions using the `contrast_prob_diff` step function:
+
+```python
+import inseq
+
+attribution_model = inseq.load_model("gpt2", "input_x_gradient")
+
+# Pre-compute ids and attention mask for the contrastive target
+contrast = attribution_model.encode("Can you stop the dog from crying")
+
+# Perform the contrastive attribution:
+# Regular (forced) target -> "Can you stop the dog from barking"
+# Contrastive target      -> "Can you stop the dog from crying"
+out = attribution_model.attribute(
+    "Can you stop the dog from",
+    "Can you stop the dog from barking",
+    attributed_fn="contrast_prob_diff",
+    contrast_ids=contrast.input_ids,
+    contrast_attention_mask=contrast.attention_mask,
+    # We also visualize the corresponding step score
+    step_scores=["contrast_prob_diff"]
+)
+out.show()
+```
+
+Refer to the [documentation](https://inseq.readthedocs.io/examples/custom_attribute_target.html) for an example including custom function registration.
 
 ## Using the Inseq client
 
@@ -149,6 +213,14 @@ inseq attribute-dataset \
   --hide
 ```
 
+## Planned Development
+
+- ⚙️ Support more attention-based and occlusion-based feature attribution methods (documented in [#107](https://github.com/inseq-team/inseq/issues/107) and [#108](https://github.com/inseq-team/inseq/issues/108)).
+
+- ⚙️ Interoperability with [ferret](https://ferret.readthedocs.io/en/latest/) for attribution plausibility and faithfulness evaluation.
+
+- ⚙️ Rich and interactive visualizations in a tabbed interface using [Gradio Blocks](https://gradio.app/docs/#blocks).
+
 ## Contributing
 
 Our vision for Inseq is to create a centralized, comprehensive and robust set of tools to enable fair and reproducible comparisons in the study of sequence generation models. To achieve this goal, contributions from researchers and developers interested in these topics are more than welcome. Please see our [contributing guidelines](CONTRIBUTING.md) and our [code of conduct](CODE_OF_CONDUCT.md) for more information.
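
Note on the README changes above: the quantities behind a few of the listed step functions (`probability`, `entropy`, `contrast_prob_diff`, `mc_dropout_prob_avg`) can be illustrated with a toy pure-Python sketch. This is not Inseq's implementation. The function signatures, the plain-list logits, and the token indices are hypothetical stand-ins, and the contrastive score here assumes the target and the foil are scored under the same next-token distribution.

```python
import math


def softmax(logits):
    """Turn raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def probability(logits, target_id):
    """Probability assigned to the target token at one generation step."""
    return softmax(logits)[target_id]


def entropy(logits):
    """Entropy (in nats) of the predictive distribution at one step."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0)


def contrast_prob_diff(logits, target_id, foil_id):
    """Toy contrastive score: P(target) - P(foil) under the same distribution."""
    probs = softmax(logits)
    return probs[target_id] - probs[foil_id]


def mc_dropout_prob_avg(sampled_logits, target_id):
    """Average target probability over several sampled logit vectors.

    `sampled_logits` stands in for the outputs of repeated forward passes
    with dropout enabled; here they are simply given as a list of lists.
    """
    probs = [probability(logits, target_id) for logits in sampled_logits]
    return sum(probs) / len(probs)


# Hypothetical 3-token vocabulary: index 0 = "barking", 1 = "crying", 2 = other
step_logits = [2.0, 1.0, 0.0]
print(probability(step_logits, 0))
print(entropy(step_logits))
print(contrast_prob_diff(step_logits, target_id=0, foil_id=1))
print(mc_dropout_prob_avg([[2.0, 1.0, 0.0], [1.5, 1.2, 0.1]], target_id=0))
```

A positive contrastive score means the model prefers the target over the foil at that step. Attributing this difference rather than the raw target probability is what lets a gradient-based method highlight the inputs that separate the two choices.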

diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -25,9 +25,9 @@
 author = "The Inseq Team"
 
 # The short X.Y version
-version = "0.3"
+version = "0.4"
 # The full version, including alpha/beta/rc tags
-release = "0.3.4.dev0"
+release = "0.4.0"
 
 
 # Prefix link to point to master, comment this during version release and uncomment below line