Completed vignette

timbeechey · Feb 16, 2024 · 0f92f3e · 0f92f3e
1 parent 58c7fec
commit 0f92f3e
Show file tree

Hide file tree

Showing 7 changed files with 160 additions and 34 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: opa
 Type: Package
 Title: An Implementation of Ordinal Pattern Analysis
-Version: 0.8.1.032
+Version: 0.8.2
 Authors@R: person("Timothy", "Beechey", email = "[email protected]",
     role = c("aut", "cre"), comment = c(ORCID = "0000-0001-8858-946X"))
 Description: Quantifies hypothesis to data fit for repeated measures 
@@ -16,7 +16,12 @@ BugReports: https://github.com/timbeechey/opa/issues
 Encoding: UTF-8
 LazyData: true
 LinkingTo: Rcpp, RcppArmadillo
-Imports: Rcpp, graphics, stats, utils
+Imports: 
+    Rcpp,
+    graphics,
+    stats,
+    utils,
+    lattice
 Suggests: 
     knitr,
     rmarkdown,

diff --git a/NAMESPACE b/NAMESPACE
@@ -67,6 +67,7 @@ importFrom(graphics,par)
 importFrom(graphics,plot.new)
 importFrom(graphics,points)
 importFrom(graphics,segments)
+importFrom(lattice,xyplot)
 importFrom(stats,na.omit)
 importFrom(utils,setTxtProgressBar)
 importFrom(utils,txtProgressBar)

diff --git a/R/fitopa.R b/R/fitopa.R
@@ -111,7 +111,7 @@
 opa <- function(dat, hypothesis, group = NULL, pairing_type = "pairwise",
                 diff_threshold = 0, nreps = 1000L) {
 
-  if (class(hypothesis) == "opahypothesis") {
+  if (inherits(hypothesis, "opahypothesis")) {
     hypothesis <- hypothesis$raw
   }
 

diff --git a/R/opa-package.R b/R/opa-package.R
@@ -27,3 +27,8 @@ NULL
 #' @importFrom stats na.omit
 ## usethis namespace: end
 NULL
+
+## usethis namespace: start
+#' @importFrom lattice xyplot
+## usethis namespace: end
+NULL
diff --git a/README.md b/README.md
@@ -3,14 +3,6 @@
 
 # opa <a href="https://timbeechey.github.io/opa/"></a>
 
-<!-- badges: start -->
-
-![](https://www.r-pkg.org/badges/version-ago/opa?color=orange)
-![](https://cranlogs.r-pkg.org/badges/grand-total/opa)
-[![R-CMD-check](https://github.com/timbeechey/opa/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/timbeechey/opa/actions/workflows/R-CMD-check.yaml)
-[![codecov](https://codecov.io/gh/timbeechey/opa/graph/badge.svg?token=Q3ZI7BBMIK)](https://codecov.io/gh/timbeechey/opa)
-<!-- badges: end -->
-
 An R package for ordinal pattern analysis.
 
 ## Installation

diff --git a/man/pituitary.Rd b/man/pituitary.Rd
diff --git a/vignettes/opa.Rmd b/vignettes/opa.Rmd
@@ -12,64 +12,187 @@ knitr::opts_chunk$set(
   collapse = TRUE,
   comment = "#>"
 )
+
+palette(c("#56B4E9", "#E69F00", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7"))
+
+library(lattice)
 ```
 
 ## Background
 
 `opa` is an implementation of methods described in publications including [Thorngate (1987)](https://doi.org/10.1016/S0166-4115(08)60083-7) and [Grice et al. (2015)](https://doi.org/10.1177/2158244015604192). Thorngate (1987) attributes the original idea to:
 
-Parsons, D. (1975). _The directory of tunes and musical themes_. S. Brown.
+Parsons, D. (1975). _The directory of tunes and musical themes_. S. Brown. 
 
-## How ordinal pattern analysis works
+Ordinal pattern analysis may be useful as an alternative, or addition, to other methods of analyzing repeated measures data such as repeated measures ANOVA and mixed-effects models.
 
-Ordinal pattern analysis is a simple non-parametric method, similar to Kendall's Tau. Whereas Kendall's tau is a measure of similarity between two data sets in terms of rank ordering, ordinal pattern analysis is intended to quantify the match between a hypothesis and patterns of individual-level data across conditions or measurement instances.
+## Modeling repeated measures data
 
-Ordinal pattern analysis works by comparing the relative ordering of pairs of observations and computing whether those pairwise relations are matched by a hypothesis. Each pairwise ordered relation is classified as an increase, a decrease, or as no change. These classifications are encoded as 1, -1 and 0, respectively. For example, a hypothesis of a monotonic increase in a response variable across four experimental conditions can be specified as:
+Once installed, you can load `opa` with
 
-```{r hypothesis1}
-(h <- c(1, 2, 3, 4))
+```{r load_opa}
+library(opa)
 ```
 
-Note that the absolute values are not important, only their relative ordering. The hypothesis `h` encodes six pairwise relations, all increases: `1 1 1 1 1 1`.
+### Data
 
-A row of individual data representing measurements across four conditions, such as:
+For this example we will use the childhood growth data reported by Potthoff & Roy (1964) consisting of measures of the distance between the pituitary and the pteryo-maxillary fissure. Distances were recorded in millimeters when each child was 8, 10, 12 and 14 years old.
 
-```{r vector1}
-dat <- c(65.3, 68.8, 67.0, 73.1)
+```{r str_data}
+str(pituitary)
 ```
 
-encodes six ordered pairwise relations `1 1 1 -1 1 1`. The percentage of orderings which are correctly classified by the hypothesis (_PCC_) is the main quantity of interest in ordinal pattern analysis. Comparing `h` and `dat`, the PCC is `5/6 = 0.833` or 83.3%. A hypothesis which generates a greater PCC is preferred over a hypothesis which generates a lower PCC for given data.
 
-It is also possible to calculate a _chance-value_ for a PCC which is equal to the chance that a PCC at least as great as the PCC of the observed data could occur as a result of a random re-ordering of the data. Chance values can be computed using a randomization test.
+```{r plot_data, fig.width=6, fig.height=6}
+xyplot(distance ~ age | individual, pituitary, type = c("g", "p", "l"),
+       groups = sex, cex = 1.2, pch = 21,
+       fill = c("#E69F0070", "#0072B270"),
+       col = c("#E69F00", "#0072B2"),
+       scales = list(x = list(at = c(8, 10, 12, 14))),
+       xlab = "Age (years)",
+       ylab = "Distance (mm)",
+       main = "Pituitary-Pterygomaxillary Fissure Distance",
+       key = list(space = "bottom", columns = 2,
+                  text = list(c("Female", "Male")),
+                  lines = list(col = c("#E69F00", "#0072B2")),
+                  points = list(pch = 21, fill = c("#E69F0070", "#0072B270"),
+                                col = c("#E69F00", "#0072B2"))))
+```
 
-## Modeling repeated measures data
+#### Wide format
 
-Once installed, you can load `opa` with
+`opa` requires data in wide format, with one row per individual and one column per measurement time or condition.
 
-```{r load_opa}
-library(opa)
+```{r reshape_data}
+dat_wide <- reshape(data = pituitary,
+                    direction = "wide",
+                    idvar = c("individual", "sex"),
+                    timevar = "age",
+                    v.names = "distance")
 ```
 
-### Data
+```{r}
+head(dat_wide)
+```
 
-For this example we will use the `sleepstudy` data from the `lme4` package.
+### Specifying a hypothesis
 
-```{r load_data}
+For this analysis we will assume a hypothesis of monotonic increase in distance with increasing age. This monotonic increase hypothesis can be encoded using the `hypothesis()` function.
 
+```{r define_h1}
+h1 <- hypothesis(c(1, 2, 3, 4), type = "pairwise")
 ```
 
-#### Wide format
+Printing the hypothesis object shows that it consists of sub-hypotheses about six ordinal relations. In this case it is hypothesized that each of the six ordinal relations are increases, coded as `1`.
+
+```{r print_h1}
+print(h1)
+```
+
+The hypothesis can also be visualized with the `plot()` function.
+
+```{r plot_h1, fig.width=4, fig.height=4}
+plot(h1)
+```
+
+### Fitting an ordinal pattern analysis model
+
+How well a hypothesis accounts for observed data can be quantified at the individual and group levels using the `opa()` function. The first required argument to `opa()` is a dataframe consisting of only the response variable columns. The second required argument is the hypothesis.
+
+```{r fit_model1}
+m1 <- opa(dat_wide[,3:6], h1)
+```
+
+The results can be summarized using the `summary()` function.
+
+```{r summary_model1}
+summary(m1)
+```
+
+The individual-level results can also be visualized using `plot()`.
+
+```{r plot_model1, fig.width=7, fig.height=5}
+plot(m1)
+```
+
+It is also possible to determine how well the hypothesis accounts for the data at the group level for each of the sub-hypotheses using the `compare_conditions()` function.
 
-## Specifying a hypothesis
+```{r compare_conds_model1}
+compare_conditions(m1)
+```
+
+These results indicate that the hypothesis accounts least well for the relationship between the first two measurement times. 
+
+### Comparing hypotheses
 
-## Fitting an ordinal pattern analysis model
+The output of `compare_conditions()` indicates that it may be informative to consider a second hypothesis:
+
+```{r define_h2}
+h2 <- hypothesis(c(2, 1, 3, 4))
+```
+
+```{r print_h2}
+print(h2)
+```
+
+To assess the adequacy of `h2` we fit a second `opa()` model, this time passing `h2` as the second argument.
+
+```{r fit_model2}
+m2 <- opa(dat_wide[,3:6], h2)
+```
+
+The `compare_hypotheses()` function can then be used to compare how well two hypothese account for the observed data.
+
+```{r compare_hypotheses}
+compare_hypotheses(m1, m2)
+```
+
+These results indicate that the monotonic increase hypothesis encoded by `h1` better acounts for the data than `h2`. However, the difference between these hypotheses is not large. The computed chance value indicates that a difference at least as large could be produced by chance, through random permutations of the data, about one quarter of the time.
+
+### Comparing groups
+
+So far the analyses have not considered any possible differences between males and females in terms of how well the hypotheses account for the data. It is possible that a given hypothesis account better for males than for females, or vice-versa. A model which accounts for groups within the data can be fitted by passing a factor vector or dataframe column to the `group` argument.
+
+```{r fit_model3}
+m3 <- opa(dat_wide[,3:6], h1, group = dat_wide$sex)
+```
+
+The `compare_groups()` function may then be used to quantify the difference in hypothesis performance between any two groups within the data.
+
+```{r compare_groups}
+compare_groups(m3, "M", "F")
+```
+
+In this case, `compare_groups()` indicates that the difference in how well the hypothesis accounts for males and females growth is very small. the chance value shows that a difference at least as great could be produced by chance around 80% of the time.
+
+### Difference thresholds
+
+Each of the above models has treated any numerical difference in the distance data as a true difference. However, we may wish to treat values that differ by some small amount as equivalent. In this way we may define a threshold of practical or clinical significance. For example, we may decide to consider only differences of at least 1 mm. This can be achieved by passing a threshold value of `1` using the `diff_threshold` argument. 
+
+```{r fit_model4}
+m4 <- opa(dat_wide[,3:6], h1, group = dat_wide$sex, diff_threshold = 1)
+```
+
+Setting the difference threshold to 1 mm results in a greater difference in hypothesis performance between males and females.
+
+```{r diff_threshold_results}
+group_results(m4)
+```
+
+However, `compare_groups()` reveals that even this larger difference could occur quite frequently by chance.
+
+```{r compare_groups_with_diff_threshold}
+compare_groups(m4, "M", "F")
+```
 
 ## References
 
 Grice, J. W., Craig, D. P. A., & Abramson, C. I. (2015). A Simple and Transparent Alternative to Repeated Measures ANOVA. SAGE Open, 5(3), 215824401560419. https://doi.org/10.1177/2158244015604192
 
 Parsons, D. (1975). _The directory of tunes and musical themes_. S. Brown.
 
+Potthoff, R. F., & Roy, S. N. (1964). A Generalized Multivariate Analysis of Variance Model Useful Especially for Growth Curve Problems. Biometrika, 51(3/4), 313–326. https://doi.org/10.2307/2334137
+
 Thorngate, W. (1987). Ordinal Pattern Analysis: A Method for Assessing Theory-Data Fit. Advances in Psychology, 40, 345–364. https://doi.org/10.1016/S0166-4115(08)60083-7