-
Notifications
You must be signed in to change notification settings - Fork 7
/
Tom_Brady_R_Demo.Rmd
147 lines (112 loc) · 5.94 KB
/
Tom_Brady_R_Demo.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
---
title: "Tom Brady R Demo"
author: "By Matt Edwards, Head of Football Analysis, StatsBomb"
output:
pdf_document: default
html_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Tom Brady data release R Introduction
Below we've included some sample code to load in the events and plays datasets, some eploratory analysis of what the data looks like, as well as looking at some of the unique event data that StatsBomb collects. Documentation for the dataset can be found [here](https://github.com/statsbomb/amf-open-data/tree/main/doc).
First we'll import the required libaries - you'll need to install these to your local R environment before you begin if running locally.
```{r}
library(tidyverse)
library(ggdark)
```
Below we load in the events and plays data direct from the [open data repository](https://github.com/statsbomb/amf-open-data).
```{r}
events_2021 <- read_csv("https://raw.githubusercontent.com/statsbomb/amf-open-data/main/data/events/tb12_events_dataset_2021_2022.csv")
events_2022 <- read_csv("https://raw.githubusercontent.com/statsbomb/amf-open-data/main/data/events/tb12_events_dataset_2022_2023.csv")
plays_2021 <- read_csv("https://raw.githubusercontent.com/statsbomb/amf-open-data/main/data/plays/tb12_plays_dataset_2021_2022.csv")
plays_2022 <- read_csv("https://raw.githubusercontent.com/statsbomb/amf-open-data/main/data/plays/tb12_plays_dataset_2022_2023.csv")
events <- rbind(events_2021, events_2022)
plays <- rbind(plays_2021, plays_2022)
```
Let's take a look at the data from these datasets!
```{r}
head(events)
head(plays)
```
Let's put these datasets together to do some quick analysis and plotting!
```{r}
tom_brady_12_passes <- events %>% filter(player_name == "Tom Brady" & grepl("Pass",event_types) & !grepl("Fake Pass",event_types)) %>% merge(plays, by = "play_uuid")
tom_brady_12_passes %>% select(play_uuid, play_down, receiver_player_name) %>% head(5)
```
Let's take a look at Tom Brady's success while throwing to all his different receivers for 2021-2022.
```{r}
receiver_data <- tom_brady_12_passes %>% filter(!is.na(play_down_negated)) %>% group_by(receiver_player_name) %>% drop_na(receiver_player_name) %>%
summarize(Targets = n(),Completions = sum(event_success == T, na.rm=T), Comp_Perc = mean(event_success, na.rm=T),
Yds_Per_Catch = mean(play_yards_gained, na.rm=T), Explosive_Perc = mean(play_yards_gained >= 15, na.rm=T)) %>% arrange(desc(Targets))
receiver_data %>% filter(Targets >= 25)
```
Looks like two of his best targets from his last two years in Tampa were Mike Evans and Rob Gronkowski. They both were really high explosive percentage, and had two of the highest yards per catch on the team.
A good way to look at data is to create a visual plot. Let's create a plot that shows each receiver's yards per catch against their explosive play percentage.
```{r}
ggplot(receiver_data, aes(x=Yds_Per_Catch, y=Explosive_Perc, size = Targets)) +
geom_point(color = "black") +
labs(
title = "Tampa Bay WR Info",
x = "Yds / Pass",
y = "Explosive Perc"
)
```
One area that Tom Brady gets a lot of credit for is throwing over the middle of the field. Whether it was to slot receivers like Wes Welker and Julian Edelman, or his trusty Tight End Gronk. The middle of the field is where he did a lot of damage. Let's plot all of his passes that were targeted between the numbers to see who his main targets were in Tampa.
```{r}
middle_field_passes <- tom_brady_12_passes %>% filter(event_end_y >= 14 & event_end_y <= 39)
middle_field_passes %>% head(5)
```
Let's build a field to plot these passes!
```{r}
left_hashes = data.frame(
x = c(1:99),
xend = c(1:99),
y = c((70*12+9)/36),
yend = c(((70*12+9)/36)+.66)
)
right_hashes = data.frame(
x = c(1:99),
xend = c(1:99),
y = c(29.09),
yend = c(29.75)
)
yard_markers = data.frame(
x=c(0,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100),
xend = c(0,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100),
y=c(0),
yend=c(53.33)
)
FarFieldNumbers = data.frame(x= c(10,20,30,40,50,60,70,80,90), y = c(41.33), Number = c(10,20,30,40,50,40,30,20,10))
CloseFieldNumbers = data.frame(x= c(10,20,30,40,50,60,70,80,90), y = c(12), Number = c(10,20,30,40,50,40,30,20,10))
full_field <- ggplot() +
geom_segment(aes(x=0,y=0,xend=0,yend=53.33)) +
geom_segment(aes(x=-10,y=0,xend=-10,yend=53.33)) +
geom_segment(aes(x=100,y=0,xend=100,yend=53.33)) +
geom_segment(aes(x=110,y=0,xend=110,yend=53.33)) +
geom_segment(aes(x=-10,y=0,xend=110,yend=0)) +
geom_segment(aes(x=-10,y=53.33,xend=110,yend=53.33)) +
geom_segment(data= yard_markers, aes(x=x, xend=xend, y=y, yend=yend)) +
geom_segment(data = left_hashes, aes(x=x, xend=xend, y=y, yend=yend)) +
geom_segment(data = right_hashes, aes(x=x, xend=xend, y=y, yend=yend)) +
geom_text(data = CloseFieldNumbers, mapping = aes(x,y, label = Number), colour = "#FFFFFF", size = 8,) + ##These are the Numbers on the field
geom_text(data = FarFieldNumbers, mapping = aes(x, y, label = Number), colour = "#FFFFFF", size = 8, angle = 180) +
dark_theme_void()
full_field
```
```{r}
top_middle_field_receivers <- middle_field_passes %>% filter(receiver_player_name %in% c("Rob Gronkowski", "Chris Godwin", "Mike Evans", "Cameron Brate", "Russell Gage"))
full_field +
geom_point(data = top_middle_field_receivers, aes(x=event_end_x, y=event_end_y, color = receiver_player_name)) +
coord_flip() +
scale_color_manual(values = c("Rob Gronkowski" = "#Ff8e00", "Chris Godwin" = "#c247ff", "Mike Evans" = "#00f6ff", "Cameron Brate" = "#Fffb00", "Russell Gage" = "#FFFFFF")) +
labs(
title = "Tom Brady Middle of the Field Passes",
color = ""
) +
theme(
plot.title = element_text(hjust = 0.5, size=22, face = "bold")
)
```
There is so much in this dataset, and I am excited to see what everyone puts together!
Remember to share your work online and use the #TB12DB so that everyone can share the data and work together in analyzing one of the best to ever do it!