Tobias Schulze committed Mar 15, 2022 (1 parent ae989d7, commit 42d34a2)
Showing 3 changed files with 29 additions and 0 deletions.
@@ -1,2 +1,31 @@
# REcoTox

## Background

Searching for and extracting experimental ecotoxicological information is often tedious. A good and comprehensive data source is the [US EPA ECOTOX Knowledgebase](https://cfpub.epa.gov/ecotox/ "US EPA ECOTOX Knowledgebase"). It contains about 1 million data points for more than 12,000 chemicals and 13,000 single species. However, for a high-throughput hazard assessment, it is not possible to extract all relevant data from the online database. The purpose of REcoTox is to extract the relevant information from the complete database [ASCII files](https://gaftp.epa.gov/ecotox/ecotox_ascii_03_10_2022.zip "ECOTOX Knowledgebase ASCII files") and to aggregate the data according to the user's criteria.

## Introduction

[REcoTox](https://github.com/tsufz/REcoTox) is a semi-automated, interactive workflow that processes the complete [US EPA ECOTOX Knowledgebase](https://cfpub.epa.gov/ecotox/ "US EPA ECOTOX Knowledgebase") database [ASCII files](https://gaftp.epa.gov/ecotox/ecotox_ascii_03_10_2022.zip "ECOTOX Knowledgebase ASCII files") to extract and process ecotoxicological data relevant (but not restricted) to the ecotoxicity groups algae, crustaceans, and fish. The latest version of the [ASCII files](https://gaftp.epa.gov/ecotox/ecotox_ascii_03_10_2022.zip "ECOTOX Knowledgebase ASCII files") is available from the [US EPA ECOTOX Knowledgebase](https://cfpub.epa.gov/ecotox/ "US EPA ECOTOX Knowledgebase"). The focus is aquatic ecotoxicity, and the retrieved data is reported in `mg/L`.

To use [REcoTox](https://github.com/tsufz/REcoTox), clone the repository to your computer:

`git clone https://github.com/tsufz/REcoTox.git`

The workflow expects `R version >4.0.0`. Please also install the R packages `tidyverse`, `data.table`, and `sqldf`.

## Workflow

The file `Query_Ecotox_DB.R` contains the workflow and loads all relevant packages and functions. The workflow allows filtering by endpoints, measurements, and species. The ecotoxicity data is interactively enriched with chemical information (e.g. the average mass), in the best case with data linked to the [US EPA CompTox Chemicals Dashboard](https://comptox.epa.gov/dashboard/ "US EPA CompTox Chemicals Dashboard"), for example by using the output of the [batch search](https://comptox.epa.gov/dashboard/batch-search "US EPA CompTox Chemicals Dashboard Batch Search") as shown in Figure 1 and Figure 2.

![Figure 1: US EPA CompTox Chemicals Dashboard Batch Search - Enter Identifiers to Search](vignettes/figures/Figure_1.png "Figure 1: US EPA CompTox Chemicals Dashboard Batch Search - Enter Identifiers to Search")

![Figure 2: US EPA CompTox Chemicals Dashboard Batch Search - Recommended selection of identifiers and properties](vignettes/figures/Figure_2.png "Figure 2: US EPA CompTox Chemicals Dashboard Batch Search - Recommended selection of identifiers and properties")

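The filtering and enrichment steps can be pictured with a small sketch. It is written in Python purely for illustration (the workflow itself is implemented in R), and every field name, identifier, and value is a hypothetical placeholder rather than a real ECOTOX or Dashboard column. It shows one plausible shape of the logic: keep the selected endpoints and ecotoxicity groups, attach chemical properties by identifier, and report identifiers that still lack a match so they can be completed by hand.

```python
# All field names and values below are hypothetical placeholders; the real
# ECOTOX ASCII tables and the Dashboard export use their own column headers.
records = [
    {"chemical_id": "CHEM-A", "group": "algae", "endpoint": "EC50", "species": "Species one"},
    {"chemical_id": "CHEM-A", "group": "fish", "endpoint": "NOEC", "species": "Species two"},
    {"chemical_id": "CHEM-B", "group": "crustaceans", "endpoint": "EC50", "species": "Species three"},
]
dashboard = {"CHEM-A": {"average_mass": 200.0}}  # properties keyed by identifier


def filter_records(rows, endpoints, groups):
    """Keep rows whose endpoint and ecotoxicity group are both selected."""
    return [r for r in rows if r["endpoint"] in endpoints and r["group"] in groups]


def enrich(rows, chemicals):
    """Attach chemical properties; collect identifiers without a match."""
    enriched, unmatched = [], set()
    for row in rows:
        props = chemicals.get(row["chemical_id"])
        if props is None:
            unmatched.add(row["chemical_id"])
        else:
            enriched.append({**row, **props})
    return enriched, sorted(unmatched)


selected = filter_records(records, endpoints={"EC50"}, groups={"algae", "crustaceans", "fish"})
enriched, unmatched = enrich(selected, dashboard)
print(len(enriched), "enriched;", "still missing:", unmatched)
```

Returning the unmatched identifiers separately mirrors the interactive character of the workflow: missing chemical information is surfaced instead of being silently dropped.
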
At a minimum, the molecular weight or average mass is required to convert water concentrations from molar units to milligrams per litre. The main purpose of this workflow is to generate data for the hazard assessment of chemical pressures on aquatic organisms. Thus, only relevant data is aggregated and all data is converted to `mg/L`.

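The arithmetic behind the unit conversion is short enough to write out. The following is a minimal Python sketch (not the workflow's R code); the function name, the unit labels, and the set of supported units are assumptions made for this example. A molar concentration multiplied by the average mass in g/mol gives g/L, and a factor of 1000 turns that into mg/L.

```python
# Factors from a few molar units to mol/L (the set of units is an assumption).
TO_MOL_PER_L = {"mol/L": 1.0, "mmol/L": 1e-3, "umol/L": 1e-6, "nmol/L": 1e-9}


def to_mg_per_l(value, unit, average_mass):
    """Convert a molar concentration to mg/L.

    average_mass is in g/mol, so mol/L * g/mol gives g/L; * 1000 gives mg/L.
    """
    if average_mass is None or average_mass <= 0:
        raise ValueError("average mass (g/mol) is required for the conversion")
    return value * TO_MOL_PER_L[unit] * average_mass * 1000.0


# 5 umol/L of a hypothetical chemical with an average mass of 200 g/mol
print(to_mg_per_l(5.0, "umol/L", 200.0))
```

The explicit error for a missing mass reflects the requirement stated above: without a molecular weight or average mass, a molar record cannot be expressed in `mg/L`.
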
The data output contains `long pivot` tables with all filtered datasets as the basis for further data processing and aggregation for the user's purposes. It also includes a further pivoting step to `wider pivot` tables containing aggregated information, e.g. the geometric mean and the 5th percentile of the extracted data for each chemical, endpoint, and species.

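The aggregation from long to summarized form can be sketched as follows. This is an illustrative Python example, not the workflow's R implementation; the field names and values are hypothetical, and the percentile uses linear interpolation between order statistics, which is one common definition and may differ from the method the workflow actually applies.

```python
from collections import defaultdict
from math import exp, floor, log


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return exp(sum(log(v) for v in values) / len(values))


def percentile(values, p):
    """Percentile (p in 0..100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def aggregate(long_rows):
    """Collapse long rows to one summary per chemical, endpoint, and species."""
    groups = defaultdict(list)
    for row in long_rows:
        key = (row["chemical_id"], row["endpoint"], row["species"])
        groups[key].append(row["value_mg_l"])
    return {
        key: {"n": len(v), "geomean": geometric_mean(v), "p5": percentile(v, 5)}
        for key, v in groups.items()
    }


# Hypothetical long-format records, already converted to mg/L.
long_rows = [
    {"chemical_id": "CHEM-A", "endpoint": "EC50", "species": "Species one", "value_mg_l": 1.0},
    {"chemical_id": "CHEM-A", "endpoint": "EC50", "species": "Species one", "value_mg_l": 10.0},
    {"chemical_id": "CHEM-A", "endpoint": "EC50", "species": "Species one", "value_mg_l": 100.0},
    {"chemical_id": "CHEM-B", "endpoint": "EC50", "species": "Species two", "value_mg_l": 4.0},
]
for key, summary in aggregate(long_rows).items():
    print(key, summary)
```

The geometric mean suits concentration data that spans orders of magnitude, while a low percentile gives a more conservative summary of the same group.
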
## Note

This workflow will be further developed. Contributions and suggestions are welcome. Please create an [issue](https://github.com/tsufz/REcoTox/issues) to start the discussion.