* Add Sphinx Documentation
Showing 25 changed files with 2,007 additions and 644 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
name: Deploy Docs | ||
on: | ||
push: | ||
branches: | ||
- sphinx-documenation | ||
|
||
permissions: | ||
contents: write | ||
|
||
jobs: | ||
docs: | ||
name: Generate Website | ||
runs-on: ubuntu-latest | ||
|
||
steps: | ||
- uses: actions/checkout@v3 | ||
- name: Install Poetry | ||
uses: snok/install-poetry@v1 | ||
- name: Set up Python | ||
uses: actions/setup-python@v4 | ||
with: | ||
python-version: "3.9" | ||
cache: "poetry" | ||
- name: Install dependencies | ||
run: poetry lock && poetry install --extras docs | ||
- name: Add model table | ||
run: | | ||
poetry run python -m transformer_lens.make_docs | ||
mv model_properties_table.md docs/source/ | ||
sed -i '1s/^/# Model Properties Table\n\n/' docs/source/model_properties_table.md | ||
- name: Build | ||
run: poetry run sphinx-build docs/source docs/build | ||
- name: Remove .doctrees | ||
run: rm -r docs/build/.doctrees | ||
|
||
- name: Upload to GitHub Pages | ||
uses: JamesIves/github-pages-deploy-action@v4 | ||
with: | ||
folder: docs/build | ||
clean-exclude: | | ||
*.*.*/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,4 +13,6 @@ _modidx.py | |
.ipynb_checkpoints | ||
env | ||
dist/ | ||
.coverage | ||
docs/build | ||
.coverage | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Minimal makefile for Sphinx documentation | ||
# | ||
|
||
# You can set these variables from the command line, and also | ||
# from the environment for the first two. | ||
SPHINXOPTS ?= | ||
SPHINXBUILD ?= sphinx-build | ||
SOURCEDIR = source | ||
BUILDDIR = build | ||
|
||
# Put it first so that "make" without argument is like "make help". | ||
help: | ||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | ||
|
||
.PHONY: help Makefile | ||
|
||
# Catch-all target: route all unknown targets to Sphinx using the new | ||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | ||
%: Makefile | ||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
|
||
# Transformer-Lens Docs | ||
|
||
|
||
This repo contains the [NEW website]() for [TransformerLens](https://github.com/website_address_add_later.). This site is currently in Beta and we are in the process of adding/editing information. | ||
|
||
The documentation uses Sphinx. However, the documentation is written in regular md, NOT rst. | ||
|
||
If you are modifying a non-environment page or an atari environment page, please PR this repo. Otherwise, follow the steps below: | ||
|
||
## Build the Documentation | ||
|
||
Install the required packages: | ||
|
||
Use Python 3.9: 3.10 has an issue with the napoleon extension, and versions below 3.8 lack Sphinx support. | ||
``` | ||
poetry install --extras docs | ||
``` | ||
|
||
Use sphinx-apidoc to generate the rst files, then build the docs: | ||
|
||
```bash | ||
poetry run sphinx-apidoc -f -o docs/source . | ||
|
||
# make the model tables file | ||
poetry run python -m transformer_lens.make_docs | ||
mv model_properties_table.md docs/source/ | ||
if [[ "$OSTYPE" == "darwin"* ]]; then | ||
sed -i '' '1s/^/# Model Properties Table\n\n/' docs/source/model_properties_table.md | ||
else | ||
sed -i '1s/^/# Model Properties Table\n\n/' docs/source/model_properties_table.md | ||
fi | ||
cd docs | ||
|
||
# build the docs from source | ||
poetry run sphinx-autobuild -b dirhtml ./source build/html | ||
``` |
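The two `sed` branches above exist only to prepend a heading to the generated table on both macOS and Linux. As a minimal sketch, the same step could be done in Python, which behaves the same on every platform. `prepend_heading` is a hypothetical helper for illustration and is not part of `transformer_lens`; unlike the `sed` call, it skips the write when the heading is already present.

```python
from pathlib import Path


def prepend_heading(path: Path, heading: str = "# Model Properties Table") -> None:
    """Insert a Markdown heading plus a blank line at the top of a file."""
    body = path.read_text(encoding="utf-8")
    # Skip the write if the heading is already there, so reruns are harmless.
    if body.startswith(heading):
        return
    path.write_text(f"{heading}\n\n{body}", encoding="utf-8")


if __name__ == "__main__":
    # Path taken from the build steps above; does nothing if the file is absent.
    target = Path("docs/source/model_properties_table.md")
    if target.exists():
        prepend_heading(target)
```

Run from the repository root after `mv model_properties_table.md docs/source/`, this would replace the `if`/`else` around `sed`.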
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
@ECHO OFF | ||
|
||
pushd %~dp0 | ||
|
||
REM Command file for Sphinx documentation | ||
|
||
if "%SPHINXBUILD%" == "" ( | ||
set SPHINXBUILD=sphinx-build | ||
) | ||
set SOURCEDIR=source | ||
set BUILDDIR=build | ||
|
||
%SPHINXBUILD% >NUL 2>NUL | ||
if errorlevel 9009 ( | ||
echo. | ||
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx | ||
echo.installed, then set the SPHINXBUILD environment variable to point | ||
echo.to the full path of the 'sphinx-build' executable. Alternatively you | ||
echo.may add the Sphinx directory to PATH. | ||
echo. | ||
echo.If you don't have Sphinx installed, grab it from | ||
echo.https://www.sphinx-doc.org/ | ||
exit /b 1 | ||
) | ||
|
||
if "%1" == "" goto help | ||
|
||
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% | ||
goto end | ||
|
||
:help | ||
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% | ||
|
||
:end | ||
popd |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
# Configuration file for the Sphinx documentation builder. | ||
# | ||
# For the full list of built-in configuration values, see the documentation: | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html | ||
|
||
# -- Project information ----------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information | ||
|
||
project = 'TransformerLens' | ||
copyright = '2023, Neel Nanda' | ||
author = 'Neel Nanda' | ||
release = '0.0.0' | ||
|
||
# -- General configuration --------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration | ||
|
||
extensions = ['sphinx.ext.autodoc', | ||
'sphinxcontrib.napoleon', | ||
'myst_parser', | ||
"sphinx.ext.githubpages"] | ||
|
||
source_suffix = { | ||
".rst": "restructuredtext", | ||
".md": "markdown", | ||
} | ||
|
||
templates_path = ['_templates'] | ||
exclude_patterns = [] | ||
|
||
|
||
|
||
# -- Options for HTML output ------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output | ||
|
||
html_theme = "furo" | ||
html_title = "TransformerLens Documentation" | ||
html_static_path = ['_static'] |
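Sphinx also ships a built-in `sphinx.ext.napoleon` extension. The fragment below is a sketch of what the `extensions` list might look like if the project switched to it; this swap is an assumption, not something this commit does, and whether it removes the Python 3.10 restriction mentioned in the docs README is untested here.

```python
# Hypothetical variant of the `extensions` list in docs/source/conf.py.
extensions = [
    "sphinx.ext.autodoc",      # pull API docs from docstrings
    "sphinx.ext.napoleon",     # built-in Google/NumPy docstring support (assumed swap)
    "myst_parser",             # lets Sphinx read Markdown sources
    "sphinx.ext.githubpages",  # writes .nojekyll for GitHub Pages
]
```

The other three entries are unchanged from the configuration above.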
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
|
||
## Citation | ||
|
||
Please cite this library as: | ||
``` | ||
@misc{nandatransformerlens2022, | ||
title = {TransformerLens}, | ||
author = {Nanda, Neel}, | ||
url = {https://github.com/neelnanda-io/TransformerLens}, | ||
year = {2022} | ||
} | ||
``` | ||
(This is my best guess for how citing software works, feel free to send a correction!) | ||
Also, if you're actually using this for your research, I'd love to chat! Reach out at [email protected] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
## Local Development | ||
|
||
### DevContainer | ||
|
||
For a one-click setup of your development environment, this project includes a [DevContainer](https://containers.dev/). It can be used locally with [VS Code](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) or with [GitHub Codespaces](https://github.com/features/codespaces). | ||
|
||
### Manual Setup | ||
|
||
This project uses [Poetry](https://python-poetry.org/docs/#installation) for package management. Install as follows (this will also set up your virtual environment): | ||
|
||
```bash | ||
poetry config virtualenvs.in-project true | ||
poetry install --with dev | ||
``` | ||
|
||
Optionally, if you want Jupyter Lab you can run `poetry run pip install jupyterlab` (to install in the same virtual environment), and then run with `poetry run jupyter lab`. | ||
|
||
Then the library can be imported as `import transformer_lens`. | ||
|
||
### Testing | ||
|
||
If adding a feature, please add unit tests for it to the tests folder, and check that it hasn't broken anything major using the existing tests (install pytest and run it in the root TransformerLens/ directory). | ||
|
||
To run tests, you can use the following command: | ||
|
||
``` | ||
poetry run pytest -v transformer_lens/tests | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
## Gallery | ||
|
||
User-contributed examples of the library in action: | ||
* [Induction Heads Phase Change Replication](https://colab.research.google.com/github/ckkissane/induction-heads-transformer-lens/blob/main/Induction_Heads_Phase_Change.ipynb): A partial replication of [In-Context Learning and Induction Heads](https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html) from Connor Kissane | ||
* [Decision Transformer Interpretability](https://github.com/jbloomAus/DecisionTransformerInterpretability): A set of scripts for training decision transformers which uses TransformerLens to view intermediate activations and perform attribution and ablations. A write-up of the initial work can be found [here](https://www.lesswrong.com/posts/bBuBDJBYHt39Q5zZy/decision-transformer-interpretability). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
## Getting Started | ||
|
||
**Start with the [main demo](https://neelnanda.io/transformer-lens-demo) to learn how the library works, and the basic features**. | ||
|
||
To see what using it for exploratory analysis in practice looks like, check out [my notebook analysing Indirect Object Identification](https://neelnanda.io/exploratory-analysis-demo) or [my recording of myself doing research](https://www.youtube.com/watch?v=yo4QvDn-vsU)! | ||
|
||
Mechanistic interpretability is a very young and small field, and there are a *lot* of open problems - if you would like to help, please try working on one! **Check out my [list of concrete open problems](https://docs.google.com/document/d/1WONBzNqfKIxERejrrPlQMyKqg7jSFW92x5UMXNrMdPo/edit) to figure out where to start.** It begins with advice on skilling up, and key resources to check out. | ||
|
||
If you're new to transformers, check out my [what is a transformer tutorial](https://neelnanda.io/transformer-tutorial) and [tutorial on coding GPT-2 from scratch](https://neelnanda.io/transformer-tutorial-2) (with [an accompanying template](https://neelnanda.io/transformer-template) to write one yourself)! | ||
|
||
### Advice for Reading the Code | ||
|
||
One significant design decision made was to have a single transformer implementation that could support a range of subtly different GPT-style models. This has the upside of interpretability code just working for arbitrary models when you change the model name in `HookedTransformer.from_pretrained`! But it has the significant downside that the code implementing the model (in `HookedTransformer.py` and `components.py`) can be difficult to read. I recommend starting with my [Clean Transformer Demo](https://neelnanda.io/transformer-solution), which is a clean, minimal implementation of GPT-2 with the same internal architecture and activation names as HookedTransformer, but is significantly clearer and better documented. | ||
|
||
### Installation | ||
|
||
`pip install git+https://github.com/neelnanda-io/TransformerLens` | ||
|
||
Import the library with `import transformer_lens` | ||
|
||
(Note: This library used to be known as EasyTransformer, and some breaking changes have been made since the rename. If you need to use the old version with some legacy code, run `pip install git+https://github.com/neelnanda-io/TransformerLens@v1`.) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
## Tutorials | ||
|
||
* **Start with the [main demo](https://neelnanda.io/transformer-lens-demo) to learn how the library works, and the basic features**. | ||
|
||
* To see what using it for exploratory analysis in practice looks like, check out [my notebook analysing Indirect Object Identification](https://neelnanda.io/exploratory-analysis-demo) or [my recording of myself doing research](https://www.youtube.com/watch?v=yo4QvDn-vsU)! | ||
|
||
* [What is a Transformer tutorial](https://neelnanda.io/transformer-tutorial) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
easy\_transformer package | ||
========================= | ||
|
||
Module contents | ||
--------------- | ||
|
||
.. automodule:: easy_transformer | ||
:members: | ||
:undoc-members: | ||
:show-inheritance: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
--- | ||
hide-toc: true | ||
firstpage: | ||
lastpage: | ||
--- | ||
|
||
# TransformerLens | ||
(Formerly known as EasyTransformer) [![Pypi](https://img.shields.io/pypi/v/transformer-lens)](https://pypi.org/project/transformer-lens/) | ||
|
||
|
||
## A Library for Mechanistic Interpretability of Generative Language Models | ||
|
||
This is a library for doing [mechanistic interpretability](https://distill.pub/2020/circuits/zoom-in/) of GPT-2-style language models. The goal of mechanistic interpretability is to take a trained model and reverse-engineer the algorithms the model learned during training from its weights. It is a fact about the world today that we have computer programs that can essentially speak English at a human level (GPT-3, PaLM, etc.), yet we have no idea how they work nor how to write one ourselves. This offends me greatly, and I would like to solve this! | ||
|
||
TransformerLens lets you load in an open source language model, like GPT-2, and exposes the internal activations of the model to you. You can cache any internal activation in the model, and add in functions to edit, remove or replace these activations as the model runs. The core design principle I've followed is to enable exploratory analysis. One of the most fun parts of mechanistic interpretability compared to normal ML is the extremely short feedback loops! The point of this library is to keep the gap between having an experiment idea and seeing the results as small as possible, to make it easy for **research to feel like play** and to enter a flow state. Part of what I aimed for is to make *my* experience of doing research easier and more fun, hopefully this transfers to you! | ||
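To make the idea of caching and editing activations concrete, here is a toy sketch in plain Python. It is not the library's implementation or API: `ToyHookedModel`, `run_and_cache`, and the two "layers" are hypothetical names invented for illustration. Each layer's output passes through a named hook point, where registered functions may rewrite it before it is cached and handed to the next layer.

```python
from typing import Callable, Dict, List, Tuple

Activation = List[float]
Hook = Callable[[Activation], Activation]


class ToyHookedModel:
    """Two named 'layers' with a hook point after each one (illustrative only)."""

    def __init__(self) -> None:
        # One list of hook functions per named activation.
        self.hooks: Dict[str, List[Hook]] = {"double": [], "add_one": []}

    def _hook_point(
        self, name: str, value: Activation, cache: Dict[str, Activation]
    ) -> Activation:
        # Let each registered hook edit the activation, then record the result.
        for hook in self.hooks[name]:
            value = hook(value)
        cache[name] = list(value)
        return value

    def run_and_cache(self, xs: Activation) -> Tuple[Activation, Dict[str, Activation]]:
        cache: Dict[str, Activation] = {}
        hidden = self._hook_point("double", [2 * x for x in xs], cache)
        out = self._hook_point("add_one", [x + 1 for x in hidden], cache)
        return out, cache


model = ToyHookedModel()

# Clean run: every intermediate activation ends up in the cache.
out, cache = model.run_and_cache([1.0, 2.0])

# Zero-ablate the first layer's output and rerun to see the downstream effect.
model.hooks["double"].append(lambda act: [0.0 for _ in act])
ablated_out, ablated_cache = model.run_and_cache([1.0, 2.0])
```

Comparing `out` with `ablated_out` is the kind of short feedback loop the paragraph above describes: change one activation, rerun, and inspect what moved.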
|
||
I used to work for the [Anthropic interpretability team](https://transformer-circuits.pub/), and I wrote this library because after I left and tried doing independent research, I got extremely frustrated by the state of open source tooling. There's a lot of excellent infrastructure like HuggingFace and DeepSpeed to *use* or *train* models, but very little to dig into their internals and reverse engineer how they work. **This library tries to solve that**, and to make it easy to get into the field even if you don't work at an industry org with real infrastructure! One of the great things about mechanistic interpretability is that you don't need large models or tons of compute. There are lots of important open problems that can be solved with a small model in a Colab notebook! | ||
|
||
The core features were heavily inspired by the interface to [Anthropic's excellent Garcon tool](https://transformer-circuits.pub/2021/garcon/index.html). Credit to Nelson Elhage and Chris Olah for building Garcon and showing me the value of good infrastructure for enabling exploratory research! | ||
|
||
```{toctree} | ||
:hidden: | ||
:caption: Introduction | ||
content/getting_started | ||
content/gallery | ||
``` | ||
|
||
```{toctree} | ||
:hidden: | ||
:caption: Resources | ||
content/tutorials | ||
content/citation | ||
``` | ||
|
||
```{toctree} | ||
:hidden: | ||
:caption: Code | ||
transformer_lens.rst | ||
``` | ||
|
||
```{toctree} | ||
:hidden: | ||
:caption: Models | ||
model_properties_table.md | ||
``` | ||
|
||
|
||
|
||
```{toctree} | ||
:hidden: | ||
:caption: Development | ||
content/development | ||
Github <https://github.com/neelnanda-io/TransformerLens> | ||
``` | ||
|