-
Notifications
You must be signed in to change notification settings - Fork 17
/
p4c1-summary-statistics.qmd
52 lines (38 loc) · 1.76 KB
/
p4c1-summary-statistics.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
# Summary Statistics
Summary statistics (otherwise known as descriptive statistics) are usually where one starts when beginning to develop insights. You may hear the phrase "Exploratory Data Analysis" (sometimes abbreviated "EDA") throughout your career. This is the point where you try to get a high-level understanding of the distributions and relationships within your dataset.
## Quantitative Data
When dealing with continuous data, one of the quickest ways to get a high level view of your data is by using the "summary" function. This function will return your extreme (minimum and maximum) values, your median, mean, 1st quantile, and 3rd quantile.
```{r}
summary(mtcars$mpg)
```
Alternatively, you can use the following eight functions to retrieve specific information about your data.
```{r}
# Returns the average
mean(mtcars$mpg)
# Returns the median
median(mtcars$mpg)
# Returns the standard deviation
sd(mtcars$mpg)
# Returns the sample variance
var(mtcars$mpg)
# Returns the minimum value
min(mtcars$mpg)
# Returns the maximum value
max(mtcars$mpg)
# Returns the minimum and maximum value
range(mtcars$mpg)
# Returns quantile data
quantile(mtcars$mpg)
```
## Qualitative Data
If you're working with data that is categorical and encoded as a factor, you can view all categories by using the "levels" function.
```{r}
levels(iris$Species)
```
However, if you want to count the number of occurrences for each level, you can use the "table" function.
```{r}
table(iris$Species)
```
If you need to keep digging for insights, you can represent your categories however you'd like to using the "group_by" function covered in the last chapter.
## Resources
- "Exploring Data and Descriptive Statistics (using R)" from princeton: <https://www.princeton.edu/~otorres/sessions/s2r.pdf>