Update roxygen & websites (#225)
* update roxygen documentation for all functions

---------

Co-authored-by: Will Cornwell <[email protected]>
Co-authored-by: Daniel Falster <[email protected]>
3 people authored May 5, 2024
1 parent 1bf0761 commit 96b8267
Showing 28 changed files with 866 additions and 599 deletions.
76 changes: 63 additions & 13 deletions R/align_taxa.R
@@ -1,19 +1,52 @@
#' @title Align Australian plant scientific names to the APC or APNI
#'
#' @description
#' For a list of Australian plant names, find taxonomic or scientific name
#' alignments to the APC or APNI through standardizing formatting
#' and fixing spelling errors
#' alignments to the APC or APNI through standardizing formatting and fixing
#' spelling errors.
#'
#' Usage case: Users will run this function if they wish to see the details
#' of the matching algorithms, the many output columns that the matching
#' function compares to as it seeks the best alignment. They may also select
#' this function if they want to adjust the “fuzziness” level for fuzzy
#' matches, options not allowed in create_taxonomic_update_lookup. This
#' function is the first half of create_taxonomic_update_lookup.
#'
#' This function finds taxonomic alignments in APC or
#' scientific name alignments in APNI.
#' It uses the internal function `match_taxa` to attempt to match input strings
#' to taxon names in the APC/APNI.
#' It sequentially searches for matches against more than 20 different string
#' @details
#' - This function finds taxonomic alignments in APC or scientific name
#' alignments in APNI.
#' - It uses the internal function `match_taxa` to attempt to match input
#' strings to taxon names in the APC/APNI.
#' - It sequentially searches for matches against more than 20 different string
#' patterns, prioritising exact matches (to accepted names as well as
#' synonyms, orthographic variants) over fuzzy matches.
#' It prioritises matches to taxa in the APC over names in the APNI.
#' It identifies string patterns in input names that suggest a name can only be
#' aligned to a genus (hybrids that are not in the APC/ANI; graded species;
#' synonyms, orthographic variants) over fuzzy matches.
#' - It prioritises matches to taxa in the APC over names in the APNI.
#' - It identifies string patterns in input names that suggest a name can only
#' be aligned to a genus (hybrids that are not in the APC/APNI; graded species;
#' taxa not identified to species), and indicates these names only have a
#' genus-rank match.
#'
#' Notes:
#'
#' - If you will be running the function APCalign::create_taxonomic_update_lookup
#' many times, it is best to load the taxonomic resources separately using
#' resources <- load_taxonomic_resources(), then add the argument
#' resources = resources
#' - The name Banksia cerrata does not align as the fuzzy matching algorithm
#' does not allow the first letter of the genus and species epithet to change.
#' - With this function you have the option of changing the fuzzy matching
#' parameters. The defaults, with fuzzy matches only allowing changes of 3
#' (or fewer) characters AND 20% (or less) of characters, have been carefully
#' calibrated to catch nearly all typos while very rarely mis-aligning
#' a name. If you wish to introduce less conservative fuzzy matching it is
#' recommended you manually check the aligned names.
#' - It is recommended that you begin with imprecise_fuzzy_matches = FALSE (the
#' default), as quite a few of the less precise fuzzy matches are likely to be
#' erroneous. This argument should be turned on only if you plan to check all
#' alignments manually.
#' - The argument identifier allows you to add a fixed text string to all genus-
#' and family-level names; for example, identifier = "Royal NP" would return
#' "Acacia sp. \[Royal NP]".
#'
#' @param original_name A list of names to query for taxonomic alignments.
#' @param output (optional) The name of the file to save the results to.
@@ -121,8 +154,25 @@
#' @export
#'
#' @examples
#' \donttest{align_taxa(c("Poa annua", "Abies alba"))}
#'
#' \donttest{
#' resources <- load_taxonomic_resources()
#'
#' # example 1
#' align_taxa(c("Poa annua", "Abies alba"), resources = resources)
#'
#' # example 2
#' input <- c("Banksia serrata", "Banksia serrate", "Banksia cerrata",
#' "Banksia serrrrata", "Dryandra sp.", "Banksia big red flowers")
#'
#' aligned_taxa <-
#' APCalign::align_taxa(
#' original_name = input,
#' identifier = "APCalign test",
#' full = TRUE,
#' resources = resources
#' )
#'
#' }
#'
#'
#' @seealso
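The default fuzzy-matching rules described in the `align_taxa` notes (at most 3 changed characters, at most 20% of characters changed, and first letters of the genus and species epithet unchanged) can be sketched in base R. This is an illustration only, not APCalign's implementation: the helper name `is_plausible_fuzzy_match`, its argument names, and the choice to measure the 20% against the length of the input string are all assumptions.

```r
# Hypothetical helper, NOT part of APCalign: a sketch of the default
# fuzzy-matching rules described in the align_taxa notes.
is_plausible_fuzzy_match <- function(input, candidate,
                                     max_abs_dist = 3,
                                     max_rel_dist = 0.2) {
  # Levenshtein edit distance between the two strings (base R)
  dist <- as.integer(utils::adist(input, candidate))

  # Assumption: relative distance is measured against the input length
  rel_dist <- dist / nchar(input)

  # First letter of each word (e.g. genus and species epithet)
  first_letters <- function(x) {
    substr(strsplit(x, " ", fixed = TRUE)[[1]], 1, 1)
  }
  same_initials <- identical(first_letters(input), first_letters(candidate))

  # All three conditions must hold for the match to be accepted
  dist <= max_abs_dist && rel_dist <= max_rel_dist && same_initials
}

is_plausible_fuzzy_match("Banksia serrate", "Banksia serrata")  # TRUE
is_plausible_fuzzy_match("Banksia cerrata", "Banksia serrata")  # FALSE
```

Under these assumptions, "Banksia cerrata" is rejected even though it is only one character away from "Banksia serrata", because the first letter of the species epithet differs.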
7 changes: 5 additions & 2 deletions R/create_species_state_origin_matrix.R
@@ -1,5 +1,8 @@
#' Use the taxon distribution data from the APC to determine state level
#' native and introduced origin status
#' @title State level native and introduced origin status
#'
#' @description
#' This function uses the taxon distribution data from the APC to determine
#' state level native and introduced origin status.
#'
#' This function processes the geographic data available in the APC and
#' returns state level native, introduced and more complicated origins status for all taxa.
78 changes: 67 additions & 11 deletions R/create_taxonomic_update_lookup.R
Original file line number Diff line number Diff line change
@@ -1,12 +1,40 @@
#' Create a lookup table with the best-possible scientific name match for a
#' list of Australian plant names
#' @title Create a table with the best-possible scientific name match for
#' Australian plant names
#'
#' This function takes a list of Australian plant names that need to be
#' reconciled with current taxonomy and
#' generates a lookup table of the best-possible scientific name match for
#' each input name.
#' It uses first the function `align_taxa`, then the function `update_taxonomy`
#' to achieve the output.
#' @description
#' This function takes a list of Australian plant names that need to be
#' reconciled with current taxonomy and generates a lookup table of the
#' best-possible scientific name match for each input name.
#'
#' Usage case: This is APCalign’s core function, merging together the alignment
#' and updating of taxonomy.
#'
#' @details
#' - It uses first the function `align_taxa`, then the function `update_taxonomy`
#' to achieve the output. The aligned name is the plant name that has been
#' aligned to a taxon name in the APC or APNI by the align_taxa function.
#'
#' Notes:
#'
#' - If you will be running the function APCalign::create_taxonomic_update_lookup
#' many times, it is best to load the taxonomic resources separately using
#' `resources <- load_taxonomic_resources()`, then add the argument
#' resources = resources
#' - The name Banksia cerrata does not align as the fuzzy matching algorithm
#' does not allow the first letter of the genus and species epithet to change.
#' - The argument taxonomic_splits allows you to choose the outcome for updating
#' the names of taxa with ambiguous taxonomic histories; this applies to
#' scientific names that were once attached to a more broadly circumscribed
#' taxon concept, that was then split into several more narrowly circumscribed
#' taxon concepts, one of which retains the original name. There are three
#' options: most_likely_species returns the name that is retained, with
#' alternative names documented in square brackets; return_all adds additional
#' rows to the output, one for each possible taxon concept;
#' collapse_to_higher_taxon returns the genus with possible names in square
#' brackets.
#' - The argument identifier allows you to add a fixed text string to all genus-
#' and family-level names; for example, identifier = "Royal NP" would return
#' `Acacia sp. \[Royal NP]`.
#'
#' @family taxonomic alignment functions
#'
@@ -93,13 +121,41 @@
#'
#' @seealso \code{\link{load_taxonomic_resources}}
#' @examples
#' \donttest{resources <- load_taxonomic_resources()
#' \donttest{
#' resources <- load_taxonomic_resources()
#'
#' # example 1
#' create_taxonomic_update_lookup(c("Eucalyptus regnans",
#' "Acacia melanoxylon",
#' "Banksia integrifolia",
#' "Not a species"),
#' resources=resources)
#'}
#' resources = resources)
#'
#' # example 2
#' input <- c("Banksia serrata", "Banksia serrate", "Banksia cerrata",
#' "Banksea serrata", "Banksia serrrrata", "Dryandra")
#'
#' create_taxonomic_update_lookup(
#' taxa = input,
#' identifier = "APCalign test",
#' full = TRUE,
#' resources = resources
#' )
#'
#' # example 3
#' taxon_list <-
#' readr::read_csv(
#' system.file("extdata", "test_taxa.csv", package = "APCalign"),
#' show_col_types = FALSE)
#'
#' create_taxonomic_update_lookup(
#' taxa = taxon_list$original_name,
#' identifier = taxon_list$notes,
#' full = TRUE,
#' resources = resources
#' )
#' }
#'
create_taxonomic_update_lookup <- function(taxa,
stable_or_current_data = "stable",
version = default_version(),
20 changes: 13 additions & 7 deletions R/load_taxonomic_resources.R
Original file line number Diff line number Diff line change
@@ -1,11 +1,15 @@
#' Load taxonomic resources from either stable or current versions of APC and APNI
#'
#' @title Load taxonomic reference lists, APC & APNI
#'
#' @description
#' This function loads two taxonomic datasets for Australia's vascular plants,
#' the APC and APNI, into the global environment.
#' It accesses taxonomic data from a dataset using the provided version number
#' the APC and APNI, into the global environment. It creates several data frames
#' by filtering and selecting data from the loaded lists.
#'
#' @details
#' - It accesses taxonomic data from a dataset using the provided version number
#' or the default version.
#' The function creates several data frames by filtering and selecting data
#' from the loaded lists.
#' - The output is several dataframes that include subsets of the APC/APNI based
#' on taxon rank and taxonomic status.
#'
#' @param stable_or_current_data Type of dataset to access.
#' The default is "stable", which loads the dataset from a github archived file.
@@ -21,7 +25,9 @@
#' @export
#'
#' @examples
#' \donttest{load_taxonomic_resources(stable_or_current_data="stable", version="0.0.2.9000")}
#' \donttest{
#' load_taxonomic_resources(stable_or_current_data="stable",
#' version="0.0.2.9000")}
#'

load_taxonomic_resources <-
21 changes: 14 additions & 7 deletions R/match_taxa.R
Original file line number Diff line number Diff line change
@@ -1,15 +1,22 @@
#' Match taxonomic names to accepted names in list
#' @title Match taxonomic names to names in the APC/APNI
#'
#' This function attempts to match input strings to a list of allowable
#' taxonomic names.
#' It cycles through more than 20 different string patterns, sequentially
#' @description
#' This function attempts to match input strings to Australia's reference lists
#' for vascular plants, the APC and APNI. It attempts:
#' 1. perfect matches and fuzzy matches
#' 2. matches to infraspecies, species, genus, and family names
#' 3. matches to the entire input string and subsets thereof
#' 4. searches for string patterns that suggest a specific taxon rank
#'
#' @details
#' - It cycles through more than 20 different string patterns, sequentially
#' searching for additional match patterns.
#' It identifies string patterns in input names that suggest a name can only be
#' - It identifies string patterns in input names that suggest a name can only be
#' aligned to a genus (hybrids that are not accepted names; graded species;
#' taxa not identified to species).
#' It prioritises matches that do not require fuzzy matching (i.e. synonyms,
#' - It prioritises matches that do not require fuzzy matching (i.e. synonyms,
#' orthographic variants) over those that do.
#' If prioritises matches to taxa in the APC over names in the APNI.
#' - It prioritises matches to taxa in the APC over names in the APNI.
#'
#' @param taxa The list of taxa requiring checking
#
17 changes: 10 additions & 7 deletions R/native_anywhere_in_australia.R
Original file line number Diff line number Diff line change
@@ -1,14 +1,17 @@
#' For a vector of taxon names in to the APC, check if the species are
#' native anywhere in Australia
#'
#' @title Native anywhere in Australia
#'
#' @description
#' This function checks which species from a list are thought to be native anywhere in
#' Australia according to the APC.
#' Important caveats: this will not detect within-Australia introductions,
#'
#' @details
#' Important caveats:
#' - This function will not detect within-Australia introductions,
#' e.g. if a species is from Western Australia and is invasive on the east coast.
#' Also, very recent invasions are unlikely to be documented yet in APC.
#' Ideally check spelling and taxonomy updates first via
#' - Very recent invasions are unlikely to be documented yet in APC.
#' - Ideally check spelling and taxonomy updates first via
#' \link{create_taxonomic_update_lookup}.
#' For the complete matrix of species by states that also represents
#' - For the complete matrix of species by states that also represents
#' within-Australia invasions, use \link{create_species_state_origin_matrix}.
#'
#' @family diversity methods
37 changes: 25 additions & 12 deletions R/standardise_names.R
Original file line number Diff line number Diff line change
@@ -1,16 +1,26 @@

#' Standardises taxon names by performing a series of text substitutions to remove common inconsistencies in taxonomic nomenclature.
#'
#' @title Standardise taxon names
#'
#' @description
#' Standardises taxon names by performing a series of text substitutions to
#' remove common inconsistencies in taxonomic nomenclature.
#'
#' The function takes a character vector of taxon names as input and
#' returns a character vector of taxon names using standardised taxonomic syntax as output.
#' In particular it standardises taxon rank abbreviations and qualifiers (subsp., var., f.), as people use many variants of these terms.
#' It also standardises or removes a few additional filler words used within taxon names (affinis becomes aff.; s.l. and s.s. are removed).
#' returns a character vector of taxon names using standardised taxonomic syntax
#' as output.
#'
#' @details
#' - It removes stray punctuation at the start and end of a character string.
#' - It standardises unusual characters and symbols to ASCII equivalents.
#' - It standardises taxon rank abbreviations and qualifiers (subsp., var., f.),
#' as people use many variants of these terms.
#' - It standardises or removes a few additional filler words used within
#' taxon names (affinis becomes aff.; s.l. and s.s. are removed).
#'
#' @param taxon_names A character vector of taxon names that need to be standardised.
#'
#' @return A character vector of standardised taxon names.
#'
#'
#' @examples
#' standardise_names(c("Quercus suber",
#' "Eucalyptus sp.",
@@ -149,15 +159,18 @@ extract_genus <- function(taxon_name) {
}


#' Standardise taxon ranks from latin into english.
#'
#' The function takes a character vector of taxon ranks as input and
#' returns a character vector of taxon ranks using standardised english terms.
#' @title Standardise taxon ranks
#'
#' @description
#' Standardise taxon ranks from Latin into English.
#'
#' @param taxon_rank A character vector of taxon ranks that need to be standardised.
#' @details
#' The function takes a character vector of Latin taxon ranks as input and
#' returns a character vector of taxon ranks using standardised English terms.
#'
#' @return A character vector of standardised taxon names.
#' @param taxon_rank A character vector of Latin taxon ranks.
#'
#' @return A character vector of English taxon ranks.
#'
#' @examples
#' standardise_taxon_rank(c("regnum", "kingdom", "classis", "class"))
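The kinds of text substitution listed in the `standardise_names` details can be illustrated with a few base R `gsub()` calls. This is a rough sketch under stated assumptions, not the package's code: the function name `standardise_names_sketch` and every regular expression below are hypothetical, and the real function handles many more variants.

```r
# Hypothetical sketch, NOT the APCalign implementation: shows the style of
# substitutions described in the standardise_names documentation.
standardise_names_sketch <- function(taxon_names) {
  x <- taxon_names

  # Remove stray punctuation/whitespace at the start, and stray commas,
  # semicolons, colons or whitespace at the end (a trailing "." is kept
  # because abbreviations such as "sp." end with one)
  x <- gsub("^[[:punct:][:space:]]+", "", x)
  x <- gsub("[,;:[:space:]]+$", "", x)

  # Standardise a few common variants of rank abbreviations
  x <- gsub("\\s(ssp|subsp|subspecies)\\.?\\s", " subsp. ", x)
  x <- gsub("\\s(var|variety)\\.?\\s", " var. ", x)

  # "affinis" becomes "aff."
  x <- gsub("\\saffinis\\s", " aff. ", x)

  # "s.l." and "s.s." are removed
  x <- gsub("\\ss\\.(l|s)\\.", "", x)

  # Collapse repeated whitespace and trim the ends
  trimws(gsub("\\s+", " ", x))
}

standardise_names_sketch(c("Eucalyptus globulus ssp globulus",
                           "Acacia affinis longifolia",
                           " Banksia serrata s.l., "))
```

The order of the substitutions matters in this sketch: trailing punctuation is stripped first so that "s.l." sits at the end of the string before it is removed.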
13 changes: 6 additions & 7 deletions R/state_diversity_counts.R
Original file line number Diff line number Diff line change
@@ -1,10 +1,9 @@
#' For Australian states and territories, use data from the APC to calculate
#' state-level diversity for native, introduced,
#' and more complicated species origins
#'
#' This function calculates state-level diversity for native, introduced,
#' and more complicated species origins
#' based on the geographic data available in the APC.
#' @title State- and territory-level diversity
#'
#' @description
#' For Australian states and territories, use geographic distribution data from
#' the APC to calculate state-level diversity for native, introduced,
#' and more complicated species origins.
#'
#' @family diversity methods
#' @param state A character string indicating the Australian state or