Skip to content

Commit

Permalink
Fix bug in prior elicitation when background item is seen only once
Browse files Browse the repository at this point in the history
  • Loading branch information
lgaborini committed May 31, 2019
1 parent bf6cb6d commit e631c4c
Show file tree
Hide file tree
Showing 6 changed files with 44 additions and 13 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: bayessource
Type: Package
Title: Bayesian approach to evaluate whether two datasets come from the same population
Version: 0.3
Version: 0.3.1
Authors@R: c(person("Lorenzo", "Gaborini", role = c("aut", "cre"),
email = "[email protected]"),
person("Silvia", "Bozza", role = c("ctb", "cph")))
Expand Down
5 changes: 5 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
# bayessource 0.3.1

* Fig bug in `make_priors_and_init` when a background item is seen only once.
Now warns or fails gracefully.

# bayessource 0.3

* Add prior and init elicitation functions.
Expand Down
36 changes: 29 additions & 7 deletions R/two_level_functions.R
Original file line number Diff line number Diff line change
Expand Up @@ -9,14 +9,16 @@
#'
#' Two-level function: item is chosen with the third parameter.
#'
#' Items who are observed only once cannot contribute to within covariance estimation.
#' In that case, either we skip its contribute (default) or fail.
#'
#' Multivariate functions supporting paper:
#' Bozza, A., Taroni, F., Marquis, R., M. Schmittbuhl
#'
#' This function is RNG-free.
#'
#' @param population matrix or dataframe containing data columns, item column and other columns
#' @param idx.variables data columns (indexes or character vectors)
#' @param idx.item item column (index or character vector)
#' @param singular.action action to take when one of the items is observed once; one of `'skip'` (default) or `'fail'`
#' @importFrom stats cov
#' @importFrom utils modifyList
#' @return list with items:
Expand All @@ -25,7 +27,9 @@
#' - `W` (the within covariance matrix)
#' - `B` (the between covariance matrix)
#' @keywords internal
two.level.multivariate.calculate.UC <- function(population, idx.variables, idx.item) {
two.level.multivariate.calculate.UC <- function(population, idx.variables, idx.item, singular.action = 'skip') {

stopifnot(singular.action %in% c('skip', 'fail'))

# Convert to matrix
mtx_population <- as.matrix(population[, idx.variables])
Expand All @@ -37,10 +41,14 @@ two.level.multivariate.calculate.UC <- function(population, idx.variables, idx.i
all.items <- as.vector(unique(item_population))
n.items <- length(all.items)

if (n.items == 1) {
stop('two.level.multivariate.calculate.UC: only one item in population')
}

variable.names <- colnames(population)[idx.variables]
n.variables <- length(idx.variables)
if (n.variables < 2) {
stop("cannot handle fewer than two variables with this function")
stop("two.level.multivariate.calculate.UC: cannot handle fewer than two variables with this function")
}

# How many replicates per item
Expand All @@ -64,6 +72,7 @@ two.level.multivariate.calculate.UC <- function(population, idx.variables, idx.i
colnames(group.means) <- variable.names

n.pop <- dim(population)[1]
n.items.skip <- 0

# for each unique item
for (i.item in 1:n.items) {
Expand All @@ -75,12 +84,22 @@ two.level.multivariate.calculate.UC <- function(population, idx.variables, idx.i
n.replicates <- length(idx.replicates)

# pick out the measurements for the item
tempdat <- mtx_population[idx.replicates, ]
tempdat <- mtx_population[idx.replicates, , drop = FALSE]


# assign values to the group means for the population
means <- colMeans(tempdat)
group.means[i.item, ] <- means

if (n.replicates == 1) {
if (identical(singular.action, 'skip')) {
message(paste0('two.level.multivariate.calculate.UC: item "', i.item, '" has 1 replicate: cannot contribute to within covariance matrix estimation. Skipping.'))
n.items.skip <- n.items.skip + 1
next
}
stop(paste0('two.level.multivariate.calculate.UC: item "', i.item, '" has 1 replicate: cannot contribute to within covariance matrix estimation. Failing.'))
}

# sum the within and between item means
S.this <- t(means - all.means) %*% (means - all.means)
S <- S + S.this
Expand All @@ -93,11 +112,14 @@ two.level.multivariate.calculate.UC <- function(population, idx.variables, idx.i
Sw <- Sw + Cov.this
}

# Only these items have contributed to S, Sw
n.items.contributing <- n.items - n.items.skip

# convert matrix Sw to matrix U
W <- Sw/(n.pop - n.items)
W <- Sw/(n.pop - n.items.contributing)

# convert matrix S to matrix C
B <- S/(n.items - 1)
B <- S/(n.items.contributing - 1)

op <- list(group.means = group.means, all.means = all.means, W = W, B = B)
return(op)
Expand Down
2 changes: 1 addition & 1 deletion man/inv_pd.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/samesource_C.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 7 additions & 3 deletions man/two.level.multivariate.calculate.UC.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

0 comments on commit e631c4c

Please sign in to comment.