Update roxygen & websites #225

Merged
merged 12 commits into from
May 5, 2024
76 changes: 63 additions & 13 deletions R/align_taxa.R
Original file line number Diff line number Diff line change
@@ -1,19 +1,52 @@
#' @title Align Australian plant scientific names to the APC or APNI
#'
#' @description
#' For a list of Australian plant names, find taxonomic or scientific name
#' alignments to the APC or APNI through standardizing formatting
#' and fixing spelling errors
#' alignments to the APC or APNI through standardizing formatting and fixing
#' spelling errors.
#'
#' Usage case: Users will run this function if they wish to see the details
#' of the matching algorithms, including the many output columns that the
#' matching function compares against as it seeks the best alignment. They may
#' also select this function if they want to adjust the “fuzziness” level for
#' fuzzy matches, an option not available in create_taxonomic_update_lookup.
#' This function is the first half of create_taxonomic_update_lookup.
#'
#' This function finds taxonomic alignments in APC or
#' scientific name alignments in APNI.
#' It uses the internal function `match_taxa` to attempt to match input strings
#' to taxon names in the APC/APNI.
#' It sequentially searches for matches against more than 20 different string
#' @details
#' - This function finds taxonomic alignments in APC or scientific name
#' alignments in APNI.
#' - It uses the internal function `match_taxa` to attempt to match input
#' strings to taxon names in the APC/APNI.
#' - It sequentially searches for matches against more than 20 different string
#' patterns, prioritising exact matches (to accepted names as well as
#' synonyms, orthographic variants) over fuzzy matches.
#' It prioritises matches to taxa in the APC over names in the APNI.
#' It identifies string patterns in input names that suggest a name can only be
#' aligned to a genus (hybrids that are not in the APC/APNI; graded species;
#' synonyms, orthographic variants) over fuzzy matches.
#' - It prioritises matches to taxa in the APC over names in the APNI.
#' - It identifies string patterns in input names that suggest a name can only
#' be aligned to a genus (hybrids that are not in the APC/APNI; graded species;
#' taxa not identified to species), and indicates these names only have a
#' genus-rank match.
#'
#' Notes:
#'
#' - If you will be running the function APCalign::create_taxonomic_update_lookup
#' many times, it is best to load the taxonomic resources separately using
#' `resources <- load_taxonomic_resources()`, then add the argument
#' `resources = resources`.
#' - The name Banksia cerrata does not align as the fuzzy matching algorithm
#' does not allow the first letter of the genus and species epithet to change.
#' - With this function you have the option of changing the fuzzy matching
#' parameters. The defaults, which allow fuzzy matches to change only 3
#' (or fewer) characters AND 20% (or less) of characters, have been carefully
#' calibrated to catch nearly all typos while very rarely mis-aligning a name.
#' If you introduce less conservative fuzzy matching, it is recommended that
#' you manually check the aligned names.
#' - It is recommended that you begin with imprecise_fuzzy_matches = FALSE (the
#' default), as quite a few of the less precise fuzzy matches are likely to be
#' erroneous. This argument should be turned on only if you plan to check all
#' alignments manually.
#' - The argument identifier allows you to add a fixed text string to all genus-
#' and family-level names; for example, identifier = "Royal NP" would return
#' "Acacia sp. \[Royal NP]".
#'
#' @param original_name A list of names to query for taxonomic alignments.
#' @param output (optional) The name of the file to save the results to.
@@ -121,8 +154,25 @@
#' @export
#'
#' @examples
#' \donttest{align_taxa(c("Poa annua", "Abies alba"))}
#'
#' \donttest{
#' resources <- load_taxonomic_resources()
#'
#' # example 1
#' align_taxa(c("Poa annua", "Abies alba"), resources = resources)
#'
#' # example 2
#' input <- c("Banksia serrata", "Banksia serrate", "Banksia cerrata",
#' "Banksia serrrrata", "Dryandra sp.", "Banksia big red flowers")
#'
#' aligned_taxa <-
#' APCalign::align_taxa(
#' original_name = input,
#' identifier = "APCalign test",
#' full = TRUE,
#' resources = resources
#' )
#'
#' }
#'
#'
#' @seealso
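The default fuzzy-matching limits described in the `align_taxa` notes (at most 3 changed characters AND at most 20% of characters, with the first letter of the genus and species epithet fixed) can be illustrated with a small base-R sketch. This is not APCalign's implementation: `is_acceptable_fuzzy_match` and its arguments are hypothetical names, and measuring the 20% against the length of the input string is an assumption.

```r
# Hypothetical helper, NOT part of APCalign: checks whether `input` would be
# an acceptable fuzzy match for `candidate` under the documented defaults.
is_acceptable_fuzzy_match <- function(input, candidate,
                                      max_abs_dist = 3,
                                      max_rel_dist = 0.2) {
  # Levenshtein edit distance, computed with base R's utils::adist()
  dist <- as.integer(utils::adist(input, candidate))

  # Assumption: the relative distance is measured against the input length
  rel_dist <- dist / nchar(input)

  # First letter of each of the first two words (genus, species epithet)
  first_letters <- function(x) {
    words <- strsplit(x, " ", fixed = TRUE)[[1]]
    substr(utils::head(words, 2), 1, 1)
  }
  same_initials <- identical(
    tolower(first_letters(input)),
    tolower(first_letters(candidate))
  )

  # All three conditions must hold
  dist <= max_abs_dist && rel_dist <= max_rel_dist && same_initials
}

is_acceptable_fuzzy_match("Banksia serrate", "Banksia serrata")
is_acceptable_fuzzy_match("Banksia cerrata", "Banksia serrata")
```

Under these assumptions the first call passes all three checks, while the second fails only because the first letter of the species epithet differs, which mirrors the Banksia cerrata note above.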
7 changes: 5 additions & 2 deletions R/create_species_state_origin_matrix.R
@@ -1,5 +1,8 @@
#' Use the taxon distribution data from the APC to determine state level
#' native and introduced origin status
#' @title State level native and introduced origin status
#'
#' @description
#' This function uses the taxon distribution data from the APC to determine
#' state level native and introduced origin status.
#'
#' This function processes the geographic data available in the APC and
#' returns state level native, introduced and more complicated origins status for all taxa.
78 changes: 67 additions & 11 deletions R/create_taxonomic_update_lookup.R
Original file line number Diff line number Diff line change
@@ -1,12 +1,40 @@
#' Create a lookup table with the best-possible scientific name match for a
#' list of Australian plant names
#' @title Create a table with the best-possible scientific name match for
#' Australian plant names
#'
#' This function takes a list of Australian plant names that need to be
#' reconciled with current taxonomy and
#' generates a lookup table of the best-possible scientific name match for
#' each input name.
#' It uses first the function `align_taxa`, then the function `update_taxonomy`
#' to achieve the output.
#' @description
#' This function takes a list of Australian plant names that need to be
#' reconciled with current taxonomy and generates a lookup table of the
#' best-possible scientific name match for each input name.
#'
#' Usage case: This is APCalign’s core function, merging together the alignment
#' and updating of taxonomy.
#'
#' @details
#' - It first uses the function `align_taxa`, then the function `update_taxonomy`
#' to achieve the output. The aligned name is the plant name that has been
#' aligned to a taxon name in the APC or APNI by the align_taxa function.
#'
#' Notes:
#'
#' - If you will be running the function APCalign::create_taxonomic_update_lookup
#' many times, it is best to load the taxonomic resources separately using
#' `resources <- load_taxonomic_resources()`, then add the argument
#' `resources = resources`.
#' - The name Banksia cerrata does not align as the fuzzy matching algorithm
#' does not allow the first letter of the genus and species epithet to change.
#' - The argument taxonomic_splits allows you to choose the outcome for updating
#' the names of taxa with ambiguous taxonomic histories; this applies to
#' scientific names that were once attached to a more broadly circumscribed
#' taxon concept, that was then split into several more narrowly circumscribed
#' taxon concepts, one of which retains the original name. There are three
#' options: most_likely_species returns the name that is retained, with
#' alternative names documented in square brackets; return_all adds additional
#' rows to the output, one for each possible taxon concept;
#' collapse_to_higher_taxon returns the genus with possible names in square
#' brackets.
#' - The argument identifier allows you to add a fixed text string to all genus-
#' and family-level names; for example, identifier = "Royal NP" would return
#' `Acacia sp. \[Royal NP]`.
#'
#' @family taxonomic alignment functions
#'
@@ -93,13 +121,41 @@
#'
#' @seealso \code{\link{load_taxonomic_resources}}
#' @examples
#' \donttest{resources <- load_taxonomic_resources()
#' \donttest{
#' resources <- load_taxonomic_resources()
#'
#' # example 1
#' create_taxonomic_update_lookup(c("Eucalyptus regnans",
#' "Acacia melanoxylon",
#' "Banksia integrifolia",
#' "Not a species"),
#' resources=resources)
#'}
#' resources = resources)
#'
#' # example 2
#' input <- c("Banksia serrata", "Banksia serrate", "Banksia cerrata",
#' "Banksea serrata", "Banksia serrrrata", "Dryandra")
#'
#' create_taxonomic_update_lookup(
#' taxa = input,
#' identifier = "APCalign test",
#' full = TRUE,
#' resources = resources
#' )
#'
#' # example 3
#' taxon_list <-
#' readr::read_csv(
#' system.file("extdata", "test_taxa.csv", package = "APCalign"),
#' show_col_types = FALSE)
#'
#' create_taxonomic_update_lookup(
#' taxa = taxon_list$original_name,
#' identifier = taxon_list$notes,
#' full = TRUE,
#' resources = resources
#' )
#' }
#'
create_taxonomic_update_lookup <- function(taxa,
stable_or_current_data = "stable",
version = default_version(),
20 changes: 13 additions & 7 deletions R/load_taxonomic_resources.R
Original file line number Diff line number Diff line change
@@ -1,11 +1,15 @@
#' Load taxonomic resources from either stable or current versions of APC and APNI
#'
#' @title Load taxonomic reference lists, APC & APNI
#'
#' @description
#' This function loads two taxonomic datasets for Australia's vascular plants,
#' the APC and APNI, into the global environment.
#' It accesses taxonomic data from a dataset using the provided version number
#' the APC and APNI, into the global environment. It creates several data frames
#' by filtering and selecting data from the loaded lists.
#'
#' @details
#' - It accesses taxonomic data from a dataset using the provided version number
#' or the default version.
#' The function creates several data frames by filtering and selecting data
#' from the loaded lists.
#' - The output is several data frames that include subsets of the APC/APNI based
#' on taxon rank and taxonomic status.
#'
#' @param stable_or_current_data Type of dataset to access.
#' The default is "stable", which loads the dataset from a GitHub archived file.
@@ -21,7 +25,9 @@
#' @export
#'
#' @examples
#' \donttest{load_taxonomic_resources(stable_or_current_data="stable", version="0.0.2.9000")}
#' \donttest{
#' load_taxonomic_resources(stable_or_current_data="stable",
#' version="0.0.2.9000")}
#'

load_taxonomic_resources <-
21 changes: 14 additions & 7 deletions R/match_taxa.R
Original file line number Diff line number Diff line change
@@ -1,15 +1,22 @@
#' Match taxonomic names to accepted names in list
#' @title Match taxonomic names to names in the APC/APNI
#'
#' This function attempts to match input strings to a list of allowable
#' taxonomic names.
#' It cycles through more than 20 different string patterns, sequentially
#' @description
#' This function attempts to match input strings to Australia's reference lists
#' for vascular plants, the APC and APNI. It attempts:
#' 1. perfect matches and fuzzy matches
#' 2. matches to infraspecies, species, genus, and family names
#' 3. matches to the entire input string and subsets thereof
#' 4. searches for string patterns that suggest a specific taxon rank
#'
#' @details
#' - It cycles through more than 20 different string patterns, sequentially
#' searching for additional match patterns.
#' It identifies string patterns in input names that suggest a name can only be
#' - It identifies string patterns in input names that suggest a name can only be
#' aligned to a genus (hybrids that are not accepted names; graded species;
#' taxa not identified to species).
#' It prioritises matches that do not require fuzzy matching (i.e. synonyms,
#' - It prioritises matches that do not require fuzzy matching (i.e. synonyms,
#' orthographic variants) over those that do.
#' It prioritises matches to taxa in the APC over names in the APNI.
#' - It prioritises matches to taxa in the APC over names in the APNI.
#'
#' @param taxa The list of taxa requiring checking
#
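The prioritisation described in the `match_taxa` details (exact matches before fuzzy matches, APC before APNI) can be sketched as a sequence of match steps that stops at the first hit. This is only an illustration under stated assumptions: `match_one`, the step order, the single edit-distance limit, and the two toy reference vectors are all hypothetical, and the real function cycles through more than 20 patterns. The toy vectors do not imply which list a name actually belongs to.

```r
# Toy reference lists for illustration only; NOT real APC/APNI content
apc_names  <- c("Banksia serrata", "Acacia melanoxylon")
apni_names <- c("Eucalyptus regnans")

# Hypothetical helper: try each match step in priority order and return the
# first hit. The step order shown here is an assumption.
match_one <- function(name, max_dist = 3) {
  steps <- list(
    list(label = "exact APC",  ref = apc_names,  fuzzy = FALSE),
    list(label = "exact APNI", ref = apni_names, fuzzy = FALSE),
    list(label = "fuzzy APC",  ref = apc_names,  fuzzy = TRUE),
    list(label = "fuzzy APNI", ref = apni_names, fuzzy = TRUE)
  )

  for (step in steps) {
    if (step$fuzzy) {
      # Edit distance from the input to every reference name
      d <- as.vector(utils::adist(name, step$ref))
      # Keep the closest reference name(s) within the allowed distance
      hit <- step$ref[d <= max_dist & d == min(d)]
    } else {
      hit <- step$ref[step$ref == name]
    }
    if (length(hit) > 0) {
      return(data.frame(original_name = name,
                        aligned_name = hit[1],
                        match_step = step$label,
                        stringsAsFactors = FALSE))
    }
  }

  # No step produced a match
  data.frame(original_name = name,
             aligned_name = NA_character_,
             match_step = "no match",
             stringsAsFactors = FALSE)
}

do.call(rbind, lapply(c("Banksia serrata", "Banksia serrate", "Not a species"),
                      match_one))
```

Returning at the first successful step is what gives earlier (exact, APC) steps priority over later (fuzzy, APNI) ones; recording the step label alongside the aligned name makes each alignment traceable.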
17 changes: 10 additions & 7 deletions R/native_anywhere_in_australia.R
Original file line number Diff line number Diff line change
@@ -1,14 +1,17 @@
#' For a vector of taxon names in to the APC, check if the species are
#' native anywhere in Australia
#'
#' @title Native anywhere in Australia
#'
#' @description
#' This function checks which species in a list are thought to be native
#' anywhere in Australia according to the APC.
#' Important caveats: this will not detect within-Australia introductions,
#'
#' @details
#' Important caveats:
#' - This function will not detect within-Australia introductions,
#' e.g. if a species is from Western Australia and is invasive on the east coast.
#' Also, very recent invasions are unlikely to be documented yet in APC.
#' Ideally check spelling and taxonomy updates first via
#' - Very recent invasions are unlikely to be documented yet in APC.
#' - Ideally check spelling and taxonomy updates first via
#' \link{create_taxonomic_update_lookup}.
#' For the complete matrix of species by states that also represents
#' - For the complete matrix of species by states that also represents
#' within-Australia invasions, use \link{create_species_state_origin_matrix}.
#'
#' @family diversity methods
37 changes: 25 additions & 12 deletions R/standardise_names.R
Original file line number Diff line number Diff line change
@@ -1,16 +1,26 @@

#' Standardises taxon names by performing a series of text substitutions to remove common inconsistencies in taxonomic nomenclature.
#'
#' @title Standardise taxon names
#'
#' @description
#' Standardises taxon names by performing a series of text substitutions to
#' remove common inconsistencies in taxonomic nomenclature.
#'
#' The function takes a character vector of taxon names as input and
#' returns a character vector of taxon names using standardised taxonomic syntax as output.
#' In particular it standardises taxon rank abbreviations and qualifiers (subsp., var., f.), as people use many variants of these terms.
#' It also standardises or removes a few additional filler words used within taxon names (affinis becomes aff.; s.l. and s.s. are removed).
#' returns a character vector of taxon names using standardised taxonomic syntax
#' as output.
#'
#' @details
#' - It removes stray punctuation at the start and end of a character string.
#' - It standardises unusual characters and symbols to ASCII equivalents.
#' - It standardises taxon rank abbreviations and qualifiers (subsp., var., f.),
#' as people use many variants of these terms.
#' - It standardises or removes a few additional filler words used within
#' taxon names (affinis becomes aff.; s.l. and s.s. are removed).
#'
#' @param taxon_names A character vector of taxon names that need to be standardised.
#'
#' @return A character vector of standardised taxon names.
#'
#'
#' @examples
#' standardise_names(c("Quercus suber",
#' "Eucalyptus sp.",
@@ -149,15 +159,18 @@ extract_genus <- function(taxon_name) {
}


#' Standardise taxon ranks from latin into english.
#'
#' The function takes a character vector of taxon ranks as input and
#' returns a character vector of taxon ranks using standardised english terms.
#' @title Standardise taxon ranks
#'
#' @description
#' Standardise taxon ranks from Latin into English.
#'
#' @param taxon_rank A character vector of taxon ranks that need to be standardised.
#' @details
#' The function takes a character vector of Latin taxon ranks as input and
#' returns a character vector of taxon ranks using standardised English terms.
#'
#' @return A character vector of standardised taxon names.
#' @param taxon_rank A character vector of Latin taxon ranks.
#'
#' @return A character vector of English taxon ranks.
#'
#' @examples
#' standardise_taxon_rank(c("regnum", "kingdom", "classis", "class"))
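The Latin-to-English conversion documented for `standardise_taxon_rank` can be pictured as a named-vector lookup. The sketch below is not the package's code: `standardise_rank_sketch` is a hypothetical name, and the lookup table is a small illustrative subset chosen for this example.

```r
# Illustrative subset of Latin rank terms and their English equivalents
# (an assumption for this sketch, not the package's full table)
latin_to_english <- c(
  regnum  = "kingdom",
  classis = "class",
  ordo    = "order",
  familia = "family"
)

# Hypothetical helper: translate Latin ranks, leaving other values unchanged
standardise_rank_sketch <- function(taxon_rank) {
  # Look up each rank by name; ranks not in the table come back as NA
  translated <- unname(latin_to_english[tolower(taxon_rank)])
  # Fall back to the original value where no translation exists
  ifelse(is.na(translated), taxon_rank, translated)
}

standardise_rank_sketch(c("regnum", "kingdom", "classis", "class"))
```

Values that are already English, such as "kingdom" and "class", are not in the lookup table and so pass through unchanged.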
13 changes: 6 additions & 7 deletions R/state_diversity_counts.R
Original file line number Diff line number Diff line change
@@ -1,10 +1,9 @@
#' For Australian states and territories, use data from the APC to calculate
#' state-level diversity for native, introduced,
#' and more complicated species origins
#'
#' This function calculates state-level diversity for native, introduced,
#' and more complicated species origins
#' based on the geographic data available in the APC.
#' @title State- and territory-level diversity
#'
#' @description
#' For Australian states and territories, use geographic distribution data from
#' the APC to calculate state-level diversity for native, introduced,
#' and more complicated species origins.
#'
#' @family diversity methods
#' @param state A character string indicating the Australian state or