Some elements of data visualization in R, mostly with 'ggplot2'
Examples include scatterplots, histograms, kernel density estimators (KDE), time series plots, bar plots, diverging bar plots, and Pareto plots. There is also code for arranging multiple plots in the same window. Below is an excerpt of some of the visualizations; `customization` stands for any further layers, scales, or themes.
- Plots of time data
Time series plot (numerical x time variable)
ggplot(dataset, aes(x = dates, y = value)) +
geom_line() +
customization
Diverging bar plot (e.g. numerical x time variable)
ggplot(dataset, aes(x = dates, y = value)) +
geom_col() +
customization
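As a rough, runnable illustration of the diverging bar plot skeleton above, the sketch below uses simulated monthly values. The data frame, the `sign` column, and the colour choices are assumptions made for this example only. `geom_col()` is used because bar heights come directly from `value`.

```r
library(ggplot2)

# Hypothetical toy data: 12 monthly values scattered around zero
set.seed(7)
dataset <- data.frame(
  dates = seq(as.Date("2024-01-01"), by = "month", length.out = 12),
  value = rnorm(12)
)
# Helper column (an assumption for this sketch) used to colour bars by sign
dataset$sign <- ifelse(dataset$value >= 0, "positive", "negative")

p_div <- ggplot(dataset, aes(x = dates, y = value, fill = sign)) +
  geom_col() +                    # bar height is taken from y
  geom_hline(yintercept = 0) +    # baseline separating the two directions
  scale_fill_manual(values = c(positive = "steelblue", negative = "firebrick")) +
  theme_minimal()
```

Printing `p_div` draws the plot; bars above and below the zero line are coloured differently.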
Plot with values above a threshold (e.g. numerical x time variable)
ggplot(dataset, aes(x = dates, y = value)) +
geom_point() +
geom_text(aes(label = ID), data = dataset %>% filter(value > mean(value)), show.legend = FALSE) +
customization
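The threshold plot above can be sketched end to end as follows. This is a minimal example under stated assumptions: the data are simulated, the `ID` labels are made up, and the threshold is taken to be the sample mean as in the excerpt. Base `subset()` replaces the dplyr pipeline only to keep the sketch free of extra dependencies.

```r
library(ggplot2)

# Hypothetical toy data: 30 daily observations, one ID per row
set.seed(1)
dataset <- data.frame(
  ID    = paste0("obs", 1:30),
  dates = seq(as.Date("2024-01-01"), by = "day", length.out = 30),
  value = rnorm(30)
)

# Rows whose value exceeds the sample mean; only these receive a text label
above <- subset(dataset, value > mean(value))

p_threshold <- ggplot(dataset, aes(x = dates, y = value)) +
  geom_point() +
  geom_hline(yintercept = mean(dataset$value), linetype = "dashed") +  # the threshold
  geom_text(aes(label = ID), data = above, vjust = -0.8, show.legend = FALSE) +
  theme_minimal()
```

Passing a filtered data frame to a single layer through its `data` argument is what restricts the labels to the points above the threshold.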
- Multiple plots
Scatterplot with marginal densities and regression lines by group (e.g. numerical x numerical, by group)
library(cowplot)
Scatterplot with different point sizes
ggplot(data = dataset, aes(x = variable1, y = variable2, colour = group)) +
geom_point(aes(size = variable3)) +
customization
Plot multiple subplots (e.g. a mix of numerical and categorical/nominal variables)
p1 = ggplot(data=dataset) +
customization
p2 = ggplot(data=dataset) +
customization
grid.arrange(p1, p2)  # grid.arrange() comes from the gridExtra package
Plot of histogram with kernel density estimator (e.g. numerical by group)
ggplot(dataset, aes(x = value)) +
geom_histogram(aes(y = after_stat(density))) +
geom_density() +
customization
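A histogram and a kernel density estimate only overlay meaningfully when both are on the density scale. The sketch below shows one way to do that by group; the two-group simulated data, the bin count, and the transparency setting are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical toy data: two groups with different means
set.seed(42)
dataset <- data.frame(
  value = c(rnorm(200, mean = 0), rnorm(200, mean = 3)),
  group = rep(c("a", "b"), each = 200)
)

p_hist <- ggplot(dataset, aes(x = value, fill = group, colour = group)) +
  # after_stat(density) rescales bar heights so each group's bars integrate to 1
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 alpha = 0.4, position = "identity") +
  # KDE drawn as an outline only, on the same density scale as the bars
  geom_density(fill = NA) +
  theme_minimal()
```

With the default count scale, the density curves would sit almost flat along the x axis next to the bars, which is why the `y` mapping is changed.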
Ridge plots (e.g. numerical by group; geom_density_ridges() is from the ggridges package)
ggplot(dataset, aes(x = value, y = group, fill = group)) +
geom_density_ridges()
Plot joint distribution of two categorical or nominal variables (e.g. nominal x categorical)
ggplot(dataset, aes(x = variable1, y = prop, width = cut.count, fill = variable2)) +
geom_col() +
facet_grid(~ variable1, scales = "free_x", space = "free_x") +
customization
- Statistics
Plot a distribution with colored or shaded area
ggplot(dataset, aes(x = value)) +
as_reference(geom_density(), id = "density") +
with_blend(annotate(...), bg_layer = "density", blend_type = "atop") +
customization
Plot multiple theoretical distributions on the same plot (e.g. three Normal densities)
ggplot(data = dataset) +
stat_function(fun = dnorm, args = list(mean = 0, sd = 1)) +
customization
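For the three-density case mentioned in the heading, one possible sketch adds one `stat_function()` layer per distribution. The parameter values, legend labels, and x-range below are assumptions chosen for the example; the two-row data frame exists only to give the plot an x-range.

```r
library(ggplot2)

# stat_function() evaluates `fun` on a grid over the x-range of the plot,
# so the data frame only needs to carry the two endpoints of that range
x_range <- data.frame(x = c(-6, 6))

p_normals <- ggplot(x_range, aes(x = x)) +
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1), aes(colour = "N(0, 1)")) +
  stat_function(fun = dnorm, args = list(mean = 0, sd = 2), aes(colour = "N(0, 4)")) +
  stat_function(fun = dnorm, args = list(mean = 2, sd = 1), aes(colour = "N(2, 1)")) +
  labs(x = "x", y = "density", colour = "Distribution")
```

Mapping a constant string to `colour` inside `aes()` is a common way to get a legend entry for each curve.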
Convergence of MC estimators (e.g. numerical x integer sequence)
ggplot(dataset, aes(x = seq)) +
geom_line(aes(y = cumsum(x1) / seq_along(x1))) +
customization
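The running-mean construction above can be tried on simulated draws. In this sketch the exponential distribution, its rate, and the sample size are assumptions; the dashed line marks the known true mean so the convergence is visible.

```r
library(ggplot2)

set.seed(123)
n  <- 5000
x1 <- rexp(n, rate = 2)  # hypothetical draws; the true mean is 1 / 2

dataset <- data.frame(
  seq         = seq_len(n),
  running_avg = cumsum(x1) / seq_len(n)  # MC estimate after each new draw
)

p_mc <- ggplot(dataset, aes(x = seq, y = running_avg)) +
  geom_line() +
  geom_hline(yintercept = 0.5, linetype = "dashed", colour = "red") +  # true mean
  labs(x = "number of draws", y = "running mean")
```

Computing the running mean once in the data frame, rather than inside `aes()`, keeps the plotted quantity easy to inspect and reuse.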
And much more.