-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
145 lines (108 loc) · 5.28 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# augury <a href='https://github.com/gpw13/augury'><img src='man/figures/logo.png' align="right" height="139" /></a>
<!-- badges: start -->
[![R-CMD-check](https://github.com/gpw13/augury/workflows/R-CMD-check/badge.svg)](https://github.com/gpw13/augury/actions)
<!-- badges: end -->
The goal of augury is to streamline the process of fitting models to infill and
forecast data, particularly for models used within the WHO's Triple Billion
framework.
## Installation
You can install augury from [GitHub](https://github.com/gpw13/augury). The
package depends on the [INLA package](https://www.r-inla.org/home) which is not
available on CRAN. You will need to separately install that prior to installing
augury, following the code below.
``` r
if (!require("INLA")) install.packages("INLA",repos=c(getOption("repos"),INLA="https://inla.r-inla-download.org/R/stable"), dep=TRUE)
remotes::install_github("gpw13/augury")
```
## Overview
Most of the functions available through the package are designed to streamline
fitting a model and replacing missing observations within a single data frame
in one pass.
* `predict_general_mdl()` is a function that takes in a data frame, generic R
modelling function (such as `stats::lm`), and fits a model and returns newly
projected data, error metrics, the fitted model itself, or a list of all 3.
* `predict_inla()` is a function similar to the above, except rather than taking
a generic modelling function it accepts a formula and additional argument
passed to `INLA::inla()` to perform Bayesian analysis of structured additive
models.
* `predict_forecast()` instead uses forecasting methods available from
`forecast::forecast()` to perform exponential smoothing or other time series
models on a response variable.
Given that models might need to be fit separately to different portions of a data
frame, such as fitting time series models for each country individually, these
functions now accept the argument `group_models` that determines whether to fit
separate models to each group in the data frame before joining them back together
into the original data frame.
## Additional model wrappers
Additional functions are provided as requested to streamlining fitting of
specific models. For general modeling, there are grouped and individual functions
for `stats::lm()`, `stats::glm()`, and `lme4::lmer()` to fit linear models,
generalized linear models, and linear mixed-effects models respectively. These
wrappers are:
* `predict_lm()`
* `predict_glm()`
* `predict_lmer()`
As well, wrappers are provided around INLA for models currently in place for the
Triple Billion framework. These are a time series model with no additional
covariates (fit individually by country) and a mixed-effects model using
covariates.
* `predict_inla_me()`
* `predict_inla_ts()`
And for forecasting, generic functions are provided for simple exponential smoothing
and Holt's linear trend exponential smoothing.
* `predict_holt()`
* `predict_ses()`
## Covariates
For convenience, the covariates used within the default INLA modeling are exported
from the package and can be easily joined up to a data frame for use in modeling.
These are available through `augury::covariates_df`.
## Grouped trend development
While we typically want to fit a model to a data set directly, augury has a set of
`predict_..._avg_trend()` functions that allows you to average data by group (e.g.
by region), fit the model to the grouped and averaged data, and extract that trend
to apply to the original dataset. See the section below on **Average trend modeling**
for more details.
## Error metrics
When doing model selection, it is always important to use metrics to evaluate
model fit and predictive accuracy. All `predict_...` functions in augury use the
same evaluation framework, implemented through `model_error()`. If a `test` column
is defined, then the evaluation framework is only applied to this set
## Additional functionality
To help streamline other portions of the modeling process, some other functions
are included in the package.
* `probit_transform()` uses a probit transformation on select columns of a data
frame, and inverses it if specified.
* `scale_transform()` scales a vector by a single number, very simple, and can
inverse the scaling if specified.
* `predict_simple()` performs linear interpolation or flat extrapolation in the
same manner as the other `predict_...` functions, but without modeling or
confidence bounds.
* `predict_average()` performs averaging by groups of columns in the same manner
as the other `predict_...` functions, but without modeling or confidence
bounds.
Together, they can be used to transform and scale data for better modeling, and
then inverse these to get data back into the original feature space.
## INLA modeling examples
```{r child = "vignettes/inla-modeling.Rmd"}
```
## Forecasting examples
```{r child = "vignettes/forecast-modeling.Rmd"}
```
## Simple prediction methods
```{r child = "vignettes/simple-methods.Rmd"}
```
## Average trend modeling
```{r child = "vignettes/average-trend.Rmd"}
```