Welcome to the StatsBomb American Football Open Data repository.
StatsBomb are committed to sharing new data and research publicly to enhance understanding of the game of football. We want to actively encourage new research and analysis at all levels. Therefore we have made certain subsets of StatsBomb Data freely available for public use for research projects and genuine interest in football analytics.
StatsBomb are hoping that by making data freely available, we will extend the wider football analytics community and attract new talent to the industry.
If you publish, share or distribute any research, analysis or insights based on this data, please state the data source as StatsBomb and use our logo, available in our Media Pack. Please see the StatsBomb Public Data User Agreement for further information regarding use of the data.
The data is provided as CSV, Parquet and compressed JSON files exported from the StatsBomb Data API, in the following structure:
- Play-level data stored in `plays`.
- Event-level data stored in `events`.
- Low-frequency tracking data stored in `lft`.
  These can be accessed as:
  - Individual files for each game with file names referred to by their `game_id`, i.e. `<GAME_ID>.csv`.
  - Individual files for each season as Parquet files.
- High-frequency tracking data stored in `tracking`.
  - Individual files for each game with file names referred to by the date of the game and the teams playing. These are compressed `json` files with a `json.gz` file extension.
    - Meta-data and the `url` for each game can be found in the `games.json` file.
  - Zipped `json` files containing the individual seasons, available via AWS S3:
Some documentation about the meaning of different variables and the format of the files can be found in the `doc` directory.
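As a rough illustration of the layout described above, the sketch below reads one per-game CSV file and one compressed JSON file using only the Python standard library. The helper names (`game_csv_path`, `read_game_csv`, `read_json_gz`) are hypothetical, and the exact position of the `plays`, `events` and `lft` folders relative to the repository root is an assumption; check the `doc` directory for the actual column names and file formats.

```python
import csv
import gzip
import json
from pathlib import Path

# Folder names taken from the structure described above.
GAME_LEVEL_FOLDERS = {"plays", "events", "lft"}


def game_csv_path(root, folder, game_id):
    """Return <root>/<folder>/<GAME_ID>.csv (assumed layout, not confirmed)."""
    if folder not in GAME_LEVEL_FOLDERS:
        raise ValueError(f"unknown folder: {folder!r}")
    return Path(root) / folder / f"{game_id}.csv"


def read_game_csv(root, folder, game_id):
    """Load one per-game CSV file as a list of dicts, one dict per row."""
    path = game_csv_path(root, folder, game_id)
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


def read_json_gz(path):
    """Load one gzip-compressed JSON file, e.g. a file from `tracking`."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)
```

With a local copy of the data, `read_game_csv(".", "plays", game_id)` would return the rows for one game, where `game_id` is a value you supply. Values come back as strings, so numeric columns need converting before analysis.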
Examples of using the data can be found here.
- tb12_passes_python.ipynb - Python guide to load in the play and event data from the #TB12DB release, perform some basic analysis and plot passes on a field. Google Colab version available here.
- Tom Brady R Demo - R guide to load in the play and event data from the #TB12DB release, perform some basic analysis and create a scatter plot and field plot of the data. Google Colab version available here.
- tb12_tracking_python.ipynb - Python guide to load in a game of tracking data from the #TB12DB release and create an animation of an individual play. Google Colab version available here.
- tb12 tracking R Demo - R guide to load in a game of tracking data from the #TB12DB release and create an animation of an individual play.
- tb12_tracking_defense_python.ipynb - Python guide to load in multiple games of tracking data from the #TB12DB release and automatically detect defensive alignments. Google Colab version available here.
If you're interested in football data, StatsBomb is always hiring!