diff --git a/DESCRIPTION b/DESCRIPTION index 2f29c16..abd30d7 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Package: opa Type: Package Title: An Implementation of Ordinal Pattern Analysis -Version: 0.8.1.032 +Version: 0.8.2 Authors@R: person("Timothy", "Beechey", email = "tim.beechey@proton.me", role = c("aut", "cre"), comment = c(ORCID = "0000-0001-8858-946X")) Description: Quantifies hypothesis to data fit for repeated measures @@ -16,7 +16,12 @@ BugReports: https://github.com/timbeechey/opa/issues Encoding: UTF-8 LazyData: true LinkingTo: Rcpp, RcppArmadillo -Imports: Rcpp, graphics, stats, utils +Imports: + Rcpp, + graphics, + stats, + utils, + lattice Suggests: knitr, rmarkdown, diff --git a/NAMESPACE b/NAMESPACE index 612bd15..a08cc0f 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -67,6 +67,7 @@ importFrom(graphics,par) importFrom(graphics,plot.new) importFrom(graphics,points) importFrom(graphics,segments) +importFrom(lattice,xyplot) importFrom(stats,na.omit) importFrom(utils,setTxtProgressBar) importFrom(utils,txtProgressBar) diff --git a/R/fitopa.R b/R/fitopa.R index bf2343d..6e3f9b7 100644 --- a/R/fitopa.R +++ b/R/fitopa.R @@ -111,7 +111,7 @@ opa <- function(dat, hypothesis, group = NULL, pairing_type = "pairwise", diff_threshold = 0, nreps = 1000L) { - if (class(hypothesis) == "opahypothesis") { + if (inherits(hypothesis, "opahypothesis")) { hypothesis <- hypothesis$raw } diff --git a/R/opa-package.R b/R/opa-package.R index 90693c9..c1d6061 100644 --- a/R/opa-package.R +++ b/R/opa-package.R @@ -27,3 +27,8 @@ NULL #' @importFrom stats na.omit ## usethis namespace: end NULL + +## usethis namespace: start +#' @importFrom lattice xyplot +## usethis namespace: end +NULL diff --git a/README.md b/README.md index c360e47..de804f0 100644 --- a/README.md +++ b/README.md @@ -3,14 +3,6 @@ # opa - - -![](https://www.r-pkg.org/badges/version-ago/opa?color=orange) -![](https://cranlogs.r-pkg.org/badges/grand-total/opa) -[![R-CMD-check](https://github.com/timbeechey/opa/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/timbeechey/opa/actions/workflows/R-CMD-check.yaml) -[![codecov](https://codecov.io/gh/timbeechey/opa/graph/badge.svg?token=Q3ZI7BBMIK)](https://codecov.io/gh/timbeechey/opa) - - An R package for ordinal pattern analysis. ## Installation diff --git a/man/pituitary.Rd b/man/pituitary.Rd index dd6820d..83c6816 100644 --- a/man/pituitary.Rd +++ b/man/pituitary.Rd @@ -8,7 +8,7 @@ ## `pituitary` A data frame with 108 rows and 4 columns: \describe{ - \item{distance}{distance in milimetres from the pituitary to the pteryo-maxillary fissure} + \item{distance}{distance in mm from the pituitary to the pteryo-maxillary fissure} \item{age}{age in years} \item{individual}{identifier for each individual} \item{sex}{sex of each individual} diff --git a/vignettes/opa.Rmd b/vignettes/opa.Rmd index 267ec70..4f129c8 100644 --- a/vignettes/opa.Rmd +++ b/vignettes/opa.Rmd @@ -12,57 +12,178 @@ knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) + +palette(c("#56B4E9", "#E69F00", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")) + +library(lattice) ``` ## Background `opa` is an implementation of methods described in publications including [Thorngate (1987)](https://doi.org/10.1016/S0166-4115(08)60083-7) and [Grice et al. (2015)](https://doi.org/10.1177/2158244015604192). Thorngate (1987) attributes the original idea to: -Parsons, D. (1975). _The directory of tunes and musical themes_. S. Brown. +Parsons, D. (1975). _The directory of tunes and musical themes_. S. Brown. -## How ordinal pattern analysis works +Ordinal pattern analysis may be useful as an alternative, or addition, to other methods of analyzing repeated measures data such as repeated measures ANOVA and mixed-effects models. -Ordinal pattern analysis is a simple non-parametric method, similar to Kendall's Tau. Whereas Kendall's tau is a measure of similarity between two data sets in terms of rank ordering, ordinal pattern analysis is intended to quantify the match between a hypothesis and patterns of individual-level data across conditions or measurement instances. +## Modeling repeated measures data -Ordinal pattern analysis works by comparing the relative ordering of pairs of observations and computing whether those pairwise relations are matched by a hypothesis. Each pairwise ordered relation is classified as an increase, a decrease, or as no change. These classifications are encoded as 1, -1 and 0, respectively. For example, a hypothesis of a monotonic increase in a response variable across four experimental conditions can be specified as: +Once installed, you can load `opa` with -```{r hypothesis1} -(h <- c(1, 2, 3, 4)) +```{r load_opa} +library(opa) ``` -Note that the absolute values are not important, only their relative ordering. The hypothesis `h` encodes six pairwise relations, all increases: `1 1 1 1 1 1`. +### Data -A row of individual data representing measurements across four conditions, such as: +For this example we will use the childhood growth data reported by Potthoff & Roy (1964) consisting of measures of the distance between the pituitary and the pteryo-maxillary fissure. Distances were recorded in millimeters when each child was 8, 10, 12 and 14 years old. -```{r vector1} -dat <- c(65.3, 68.8, 67.0, 73.1) +```{r str_data} +str(pituitary) ``` -encodes six ordered pairwise relations `1 1 1 -1 1 1`. The percentage of orderings which are correctly classified by the hypothesis (_PCC_) is the main quantity of interest in ordinal pattern analysis. Comparing `h` and `dat`, the PCC is `5/6 = 0.833` or 83.3%. A hypothesis which generates a greater PCC is preferred over a hypothesis which generates a lower PCC for given data. -It is also possible to calculate a _chance-value_ for a PCC which is equal to the chance that a PCC at least as great as the PCC of the observed data could occur as a result of a random re-ordering of the data. Chance values can be computed using a randomization test. +```{r plot_data, fig.width=6, fig.height=6} +xyplot(distance ~ age | individual, pituitary, type = c("g", "p", "l"), + groups = sex, cex = 1.2, pch = 21, + fill = c("#E69F0070", "#0072B270"), + col = c("#E69F00", "#0072B2"), + scales = list(x = list(at = c(8, 10, 12, 14))), + xlab = "Age (years)", + ylab = "Distance (mm)", + main = "Pituitary-Pterygomaxillary Fissure Distance", + key = list(space = "bottom", columns = 2, + text = list(c("Female", "Male")), + lines = list(col = c("#E69F00", "#0072B2")), + points = list(pch = 21, fill = c("#E69F0070", "#0072B270"), + col = c("#E69F00", "#0072B2")))) +``` -## Modeling repeated measures data +#### Wide format -Once installed, you can load `opa` with +`opa` requires data in wide format, with one row per individual and one column per measurement time or condition. -```{r load_opa} -library(opa) +```{r reshape_data} +dat_wide <- reshape(data = pituitary, + direction = "wide", + idvar = c("individual", "sex"), + timevar = "age", + v.names = "distance") ``` -### Data +```{r} +head(dat_wide) +``` -For this example we will use the `sleepstudy` data from the `lme4` package. +### Specifying a hypothesis -```{r load_data} +For this analysis we will assume a hypothesis of monotonic increase in distance with increasing age. This monotonic increase hypothesis can be encoded using the `hypothesis()` function. +```{r define_h1} +h1 <- hypothesis(c(1, 2, 3, 4), type = "pairwise") ``` -#### Wide format +Printing the hypothesis object shows that it consists of sub-hypotheses about six ordinal relations. In this case it is hypothesized that each of the six ordinal relations are increases, coded as `1`. + +```{r print_h1} +print(h1) +``` + +The hypothesis can also be visualized with the `plot()` function. + +```{r plot_h1, fig.width=4, fig.height=4} +plot(h1) +``` + +### Fitting an ordinal pattern analysis model + +How well a hypothesis accounts for observed data can be quantified at the individual and group levels using the `opa()` function. The first required argument to `opa()` is a dataframe consisting of only the response variable columns. The second required argument is the hypothesis. + +```{r fit_model1} +m1 <- opa(dat_wide[,3:6], h1) +``` + +The results can be summarized using the `summary()` function. + +```{r summary_model1} +summary(m1) +``` + +The individual-level results can also be visualized using `plot()`. + +```{r plot_model1, fig.width=7, fig.height=5} +plot(m1) +``` + +It is also possible to determine how well the hypothesis accounts for the data at the group level for each of the sub-hypotheses using the `compare_conditions()` function. -## Specifying a hypothesis +```{r compare_conds_model1} +compare_conditions(m1) +``` + +These results indicate that the hypothesis accounts least well for the relationship between the first two measurement times. + +### Comparing hypotheses -## Fitting an ordinal pattern analysis model +The output of `compare_conditions()` indicates that it may be informative to consider a second hypothesis: + +```{r define_h2} +h2 <- hypothesis(c(2, 1, 3, 4)) +``` + +```{r print_h2} +print(h2) +``` + +To assess the adequacy of `h2` we fit a second `opa()` model, this time passing `h2` as the second argument. + +```{r fit_model2} +m2 <- opa(dat_wide[,3:6], h2) +``` + +The `compare_hypotheses()` function can then be used to compare how well two hypothese account for the observed data. + +```{r compare_hypotheses} +compare_hypotheses(m1, m2) +``` + +These results indicate that the monotonic increase hypothesis encoded by `h1` better acounts for the data than `h2`. However, the difference between these hypotheses is not large. The computed chance value indicates that a difference at least as large could be produced by chance, through random permutations of the data, about one quarter of the time. + +### Comparing groups + +So far the analyses have not considered any possible differences between males and females in terms of how well the hypotheses account for the data. It is possible that a given hypothesis account better for males than for females, or vice-versa. A model which accounts for groups within the data can be fitted by passing a factor vector or dataframe column to the `group` argument. + +```{r fit_model3} +m3 <- opa(dat_wide[,3:6], h1, group = dat_wide$sex) +``` + +The `compare_groups()` function may then be used to quantify the difference in hypothesis performance between any two groups within the data. + +```{r compare_groups} +compare_groups(m3, "M", "F") +``` + +In this case, `compare_groups()` indicates that the difference in how well the hypothesis accounts for males and females growth is very small. the chance value shows that a difference at least as great could be produced by chance around 80% of the time. + +### Difference thresholds + +Each of the above models has treated any numerical difference in the distance data as a true difference. However, we may wish to treat values that differ by some small amount as equivalent. In this way we may define a threshold of practical or clinical significance. For example, we may decide to consider only differences of at least 1 mm. This can be achieved by passing a threshold value of `1` using the `diff_threshold` argument. + +```{r fit_model4} +m4 <- opa(dat_wide[,3:6], h1, group = dat_wide$sex, diff_threshold = 1) +``` + +Setting the difference threshold to 1 mm results in a greater difference in hypothesis performance between males and females. + +```{r diff_threshold_results} +group_results(m4) +``` + +However, `compare_groups()` reveals that even this larger difference could occur quite frequently by chance. + +```{r compare_groups_with_diff_threshold} +compare_groups(m4, "M", "F") +``` ## References @@ -70,6 +191,8 @@ Grice, J. W., Craig, D. P. A., & Abramson, C. I. (2015). A Simple and Transparen Parsons, D. (1975). _The directory of tunes and musical themes_. S. Brown. +Potthoff, R. F., & Roy, S. N. (1964). A Generalized Multivariate Analysis of Variance Model Useful Especially for Growth Curve Problems. Biometrika, 51(3/4), 313–326. https://doi.org/10.2307/2334137 + Thorngate, W. (1987). Ordinal Pattern Analysis: A Method for Assessing Theory-Data Fit. Advances in Psychology, 40, 345–364. https://doi.org/10.1016/S0166-4115(08)60083-7