pkg_score returns different value if metic is `NA` and weights are set explicitly #293

emilliman5 · 2023-06-02T13:39:13Z

No description provided.

Jeff-Thompson12 · 2023-06-02T15:06:14Z

Small reprex

library(riskmetric)
library(tibble)

dplyr_ref <-
  pkg_ref("dplyr",
          source = "pkg_cran_remote",
          repos = c("https://cran.rstudio.com")) |>
  as_tibble()

dplyr_assess <-
  pkg_assess(dplyr_ref) |>
  as_tibble()
dplyr_assess$covr_coverage # Is NA

weights1 <-
  c(has_vignettes = 1, has_news = 1, news_current = 1, has_bug_reports_url = 1,
    has_website = 1, has_maintainer = 1, has_source_control = 1, export_help = 1,
    bugs_status = 1, license = 1, covr_coverage = 0, downloads_1yr = 1
  )
weights2 <- 
  c(has_vignettes = 1, has_news = 1, news_current = 1, has_bug_reports_url = 1,
    has_website = 1, has_maintainer = 1, has_source_control = 1, export_help = 1,
    bugs_status = 1, license = 1, covr_coverage = 1, downloads_1yr = 1
  )

dplyr_score1 <- pkg_score(dplyr_assess, weights = weights1)
dplyr_score1$pkg_score

dplyr_score2 <- pkg_score(dplyr_assess, weights = weights2)
dplyr_score2$pkg_score # Score does not match

AARON-CLARK · 2023-06-06T14:44:24Z

Whoops, I wrote up a reprex too, not knowing that @Jeff-Thompson12 had already done the same. I'll just share it anyway.

Instead of just toggling one metric with an NA assessment/score to 0, I toggle them all below. As you can see below, the pkg {dplyr} has 7 assessments that are NA given the following pkg_ref call and the score drops from 0.45 to 0.13 when zero'd! I would expect the default score (with NAs) should be 0.13; @emilliman5, do you agree? That is, the existence of an NA shouldn't hurt the riskmetric score by default. It should automatically exclude that metric from the score calculation? That way, the onus would be on the user to tell riskmetric which metrics are important to them by explicitly passing a 1 to the weights vector arg. Is that a adequate summary of our discussion from last Friday?

library(dplyr)
library(riskmetric)
packageVersion("riskmetric")

> [1] ‘0.2.2’

assessed <- "dplyr" %>%
  pkg_ref(source = "pkg_cran_remote", repos = c("https://cran.rstudio.com")) %>%
  as_tibble() %>%
  pkg_assess()

initial_scoring <- assessed %>% pkg_score()
initial_scoring$pkg_score %>% round(2)

> [1] 0.45
  
# riskmetric doesn't appear to be picking up certain metrics
# so we'll set their weights to zero here.
metric_scores <- initial_scoring %>%
  select(-c(package, version, pkg_ref, pkg_score)) %>%
  t

> metric_scores
                           [,1]
bugs_status          0.53333333
covr_coverage                NA
size_codebase                NA
export_help                  NA
r_cmd_check                  NA
dependencies         0.01098694
reverse_dependencies 1.00000000
license                      NA
has_maintainer       1.00000000
remote_checks        0.88461538
exported_namespace           NA
has_website          1.00000000
downloads_1yr        0.99339687
has_news             1.00000000
has_vignettes        1.00000000
has_examples                 NA
has_source_control   1.00000000
has_bug_reports_url  1.00000000
news_current         1.00000000

metric_weights <- ifelse(is.na(metric_scores[,1]), 0, 1)

# re-score the pkg, explicitly setting NAs with a new weight = 0
assessed %>%
  pkg_score(weights = metric_weights) %>%
  pull("pkg_score") %>%
  round(2)

> [1] 0.13

emilliman5 · 2023-06-09T16:27:00Z

To summarize a few things after reproducing this bug as well as digging through the problematic functions:

We expect the above reprex's to produce the same score (which they do not)
a) in fact: omitting weights and supplying uniform weights returns different scores (this is serious IMO)
We need to account for case where a df of metrics (i.e. multiple packages being assessed) has different sets of missing metrics
change management: this fix is going to change scoring across the board, communicating this will be important, we may want to provide some sort of minimal reverse compatibility.
In the above reprex; the correct pkg_score is 0.13.

AARON-CLARK · 2024-02-06T17:43:04Z

Bumping this "squishy spot" in riskmetric with a +1

emilliman5 added Bug Something isn't working High Priority High Priority labels Jun 2, 2023

emilliman5 self-assigned this Jun 2, 2023

AARON-CLARK mentioned this issue Jul 5, 2023

Attempted fix for known riskmetric weighting bug #293 pharmaR/riskassessment#566

Merged

emilliman5 linked a pull request Jul 18, 2023 that will close this issue

Metric weights 293 #304

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pkg_score returns different value if metic is `NA` and weights are set explicitly #293

pkg_score returns different value if metic is `NA` and weights are set explicitly #293

emilliman5 commented Jun 2, 2023

Jeff-Thompson12 commented Jun 2, 2023

AARON-CLARK commented Jun 6, 2023

emilliman5 commented Jun 9, 2023

AARON-CLARK commented Feb 6, 2024 •

edited

Loading

pkg_score returns different value if metic is NA and weights are set explicitly #293

pkg_score returns different value if metic is NA and weights are set explicitly #293

Comments

emilliman5 commented Jun 2, 2023

Jeff-Thompson12 commented Jun 2, 2023

AARON-CLARK commented Jun 6, 2023

emilliman5 commented Jun 9, 2023

AARON-CLARK commented Feb 6, 2024 • edited Loading

pkg_score returns different value if metic is `NA` and weights are set explicitly #293

pkg_score returns different value if metic is `NA` and weights are set explicitly #293

AARON-CLARK commented Feb 6, 2024 •

edited

Loading