Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add security assessment via {oysteR} #272

Open
wants to merge 38 commits into
base: master
Choose a base branch
from
Open

Add security assessment via {oysteR} #272

wants to merge 38 commits into from

Conversation

borgmaan
Copy link
Collaborator

@borgmaan borgmaan commented Feb 3, 2023

This is a draft of what the integration with oysteR could look like (#210). I am looking for feedback on some design decisions for following the Suggests integration approach for metrics from other packages (Option 2 in #209). Hopefully this draft PR is a good place to raise these questions and gather feedback. Please advise if another forum is preferred.


1) Handling calls to assess_* functions when package from Suggests is not present

I added some logic to the generic function to detect whether the Suggests dependency is available and prompt the user to install the dependency if it is not. The package is installed and the method dispatch continues as expected if the user chooses to install. An error is raised if the installation is not performed.

This seemed like reasonable behavior to me but I welcome feedback on what the end-user experience should look like here. Also, I noticed custom exceptions/conditions being used throughout the package. Is this a situation where we would want to raise something other than the default from stop?

2) Hooks to all_assessments-based workflows

Another area to address when extending the package in this manner is the behavior of the all_assessments function. This currently works by grep-ing the packages exported functions to find those starting with assess_*.

I have modified this mechanism to allow assess_* functions with dependencies present in Suggests to be excluded from all_assessments() based on the presence of an attribute. The full list of assess_* objects can be returned by passing include_suggests = TRUE).

This seemed in line with the desire to not impose additional dependencies on the end-user but I think there may be many ways to accomplish this (e.g. different naming conventions, etc.). I am not sure what is best.

What are others thoughts on what the behavior should look like here?

3) pkg_ref_cache.security.default Implementation

This is less integration focused and more just a question on implementing this metric. The current implementation does the following:

  1. Populates the cache of dependencies by calling assess_dependencies().
  2. Build a new list of pkg_refs including the original package and its dependencies.
  3. Uses this list to get vectors of package names and versions to pass to oysteR::audit.

This seemed like a good way to get the package dependencies and versions that were relevant to the "assessment context". It also made it easy to cover a number of pkg_ref sources (source, cran, bioconductor) in a single default method.

Admittedly, I am still building my mental model of all the package internals. Is this seem like a good way to accomplish this? Anything to look out for?


There is still work to be done to clean this up and build out supporting tests but was hoping to get some feedback at this early stage. @elimillera @emilliman5 looking forward to your feedback on the above points. And are there others to include in the design discussion?

@codecov-commenter
Copy link

codecov-commenter commented Feb 3, 2023

Codecov Report

Merging #272 (c16eec5) into master (3060d6a) will increase coverage by 2.66%.
The diff coverage is 80.00%.

❗ Current head c16eec5 differs from pull request most recent head 8e98a05. Consider uploading reports for the commit 8e98a05 to get more accurate results

@@            Coverage Diff             @@
##           master     #272      +/-   ##
==========================================
+ Coverage   59.23%   61.90%   +2.66%     
==========================================
  Files          64       69       +5     
  Lines         991     1063      +72     
==========================================
+ Hits          587      658      +71     
- Misses        404      405       +1     
Impacted Files Coverage Δ
R/pkg_ref_cache.R 9.09% <ø> (ø)
R/pkg_ref_cache_NEWS_urls.R 18.18% <ø> (ø)
R/pkg_ref_cache_archive_release_date.R 0.00% <ø> (ø)
R/pkg_ref_cache_bug_reports.R 64.70% <ø> (ø)
R/pkg_ref_cache_bug_reports_host.R 100.00% <ø> (ø)
R/pkg_ref_cache_bug_reports_url.R 55.55% <ø> (ø)
R/pkg_ref_cache_covr_coverage.R 100.00% <ø> (ø)
R/pkg_ref_cache_description.R 100.00% <ø> (ø)
R/pkg_ref_cache_downloads.R 100.00% <ø> (ø)
R/pkg_ref_cache_expr_coverage.R 0.00% <ø> (ø)
... and 22 more

... and 2 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@emilliman5 emilliman5 self-requested a review February 5, 2023 19:36
@borgmaan
Copy link
Collaborator Author

borgmaan commented Feb 8, 2023

Notes from 2/8 sprint meeting:

  • How to make prompts for package install work smoothly when you are passing multiple pkg_ref objects? Only prompt once? Or store choice?
  • Update this to just be specific to the single package under assessment
    • Can re-use this implementation in the future to assess security for a cohort
  • Is there a way to have all_assessments() return Suggests assessments if the package is present and not if it is not installed?
    • Perhaps have the class attribute set by requireNamespace?

#' @export
assess_security <- function(x, ...) {
# TODO: discuss preferred approach for handling packages within Suggests
if (!requireNamespace("oysteR", quietly = TRUE)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the version of oysteR they have installed matter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, more of a general question is if we need to create some tests for each new metric.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the version of oysteR they have installed matter?

Good catch. Addressed in 1a7e7da

Also, more of a general question is if we need to create some tests for each new metric.

A good discussion to have generally. I did plan on adding them for this metric once the Suggests implementation details were sorted out.

@borgmaan
Copy link
Collaborator Author

Updates based on feedback from 2/8 sprint meeting:

  • How to make prompts for package install work smoothly when you are passing multiple pkg_ref objects? Only prompt once? Or store choice?

Handled via a package-level option that will suppress further attempts to install the package (see 07478e7).

  • Update this to just be specific to the single package under assessment

Only the pkg_ref package is now assessed and dependencies logic has been removed (see c81c54a)

  • Is there a way to have all_assessments() return Suggests assessments if the package is present and not if it is not installed?

Security metric is now included in all_assessments by default if oysteR is installed but excluded otherwise (see 0eeafc4)

@csgillespie
Copy link

If you need anything from {oysteR}, please let me know.

@emilliman5
Copy link
Collaborator

Testing strategy:

  1. Tests to ensure assessments that operate in a plugin manner behave consistently ("API" side tests)
  2. Tests of the actual assessments and metrics can be written in a traditional manner. We will added necessary packages to the test environments
  3. At least 1 environment should not have plugin packages to test usability in absence of plugin packages.

@parmsam-pfizer
Copy link
Collaborator

Ran into this error after installing the branch.

library(riskmetric)

dplyr_pref <- pkg_ref("dplyr")

dplyr_pref$security
#> Error in `dplyr::bind_rows()`:
#> ! Can't combine `version` <character> and `version` <package_version>.

Created on 2023-03-22 by the reprex package (v2.0.1)

@borgmaan
Copy link
Collaborator Author

Ran into this error after installing the branch.

library(riskmetric)

dplyr_pref <- pkg_ref("dplyr")

dplyr_pref$security
#> Error in `dplyr::bind_rows()`:
#> ! Can't combine `version` <character> and `version` <package_version>.

Created on 2023-03-22 by the reprex package (v2.0.1)

@parmsam-pfizer -- thanks for reporting. This is resolved now.

@borgmaan borgmaan marked this pull request as ready for review June 14, 2023 16:07
@borgmaan
Copy link
Collaborator Author

@emilliman5 (and any others) -- this should be ready to review now. I believe I have checked off everything we discussed to add as functionality and have covered the desired testing scenarios.

Note, the oldrel-4 CI job is failing as oysteR has dependencies that rely on R > 4.0.0 (details). All other R CMD check runs are passing (as is code coverage).

Also, as a reminder, there is a wiki page that describes the implementation strategy that may be helpful when reviewing:

https://github.com/pharmaR/riskmetric/wiki/Adding-Metrics-from-External-Packages

@emilliman5 emilliman5 linked an issue Jun 21, 2023 that may be closed by this pull request
#' @return \code{NA} if no vulnerabilities are found, otherwise \code{0}
#'
#' @export
metric_score.pkg_metric_security <- function(x, ...) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if assess_security == 0 wouldn't that mean no vulnerabilities were found? and thus the metric should be 1? but I also seem to remember discussing what 0 vulnerabilities found might mean and that just because none were found doesn't mean non-exist... I think I am back tracking on that line of thought and metric should either be 1 or 0, and we reserve NA for cases when the assessment cannot be performed (for some reason).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can capture the vulnerability overivew then we could set metric to NA if the package is not found in the Sonatype database.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the ref_cache we should capture the oysteR output, if ony the tibble to start, but the entire message would be best (especially the overview).

oysteR::audit("dplyr", version = "1.0.8", type = "cran")
ℹ Using cached results for 1 package

── Vulnerability overview ──

ℹ 1 package was scanned
ℹ 1 package was found in the Sonatype database
ℹ 0 packages had known vulnerabilities
ℹ A total of 0 known vulnerabilities were identified
ℹ See https://github.com/sonatype-nexus-community/oysteR/ for details.
# A tibble: 1 × 8
package version type oss_package description reference vulnerabilities no_of_vulnerabi…

1 dplyr 1.0.8 cran pkg:cran/dplyr@… "dplyr: A… https://… <list [0]> 0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the assessment should be the tibble or possibly the list of vulnerabilities found. and then metric can be binary or presence/absence of vulnerabilities.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integrations with oysteR
5 participants