monarch-initiative/ncbi-gene

NCBI Gene

| Documentation |

NCBI Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.

Data Sources

Gene Information

Genes for all NCBI species (dog, cow, pig, chicken, and others) are loaded from the ingest file, which is filtered to NCBI taxon IDs only.
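As a rough illustration of taxon filtering, the sketch below assumes the ingest file follows the tab-separated layout of NCBI's gene_info file (a #tax_id column alongside GeneID, Symbol, and description). The sample rows, the gene IDs, and the filter_by_taxon helper are hypothetical and are not taken from this repository's code.

```python
import csv
import io

# Hypothetical sample rows; the gene IDs and descriptions are made up.
# Taxon IDs: 9615 = dog, 9913 = cow, 9031 = chicken.
SAMPLE = (
    "#tax_id\tGeneID\tSymbol\tdescription\n"
    "9615\t1001\tGENE1\texample dog gene\n"
    "9913\t1002\tGENE2\texample cow gene\n"
    "9031\t1003\tGENE3\texample chicken gene\n"
)


def filter_by_taxon(rows, taxon_ids):
    """Keep only rows whose #tax_id is in the given set of taxon IDs."""
    return [row for row in rows if row["#tax_id"] in taxon_ids]


# Parse the tab-separated sample into dictionaries keyed by column name.
rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))

# Keep dog and cow genes only (an arbitrary choice for this example).
kept = filter_by_taxon(rows, {"9615", "9913"})
kept_symbols = [row["Symbol"] for row in kept]
```

Taxon IDs are compared as strings here because csv.DictReader yields string values.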

Biolink Captured

  • biolink:Gene
    • id
    • symbol
    • description
    • in_taxon
    • provided_by (["infores:ncbi-gene"])
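To make the captured fields concrete, here is a minimal sketch of how one source row might map onto a biolink:Gene node. The row_to_gene helper, the input column names, the sample values, and the NCBIGene: and NCBITaxon: CURIE prefixes are assumptions made for illustration; the actual transform in this repository may differ.

```python
def row_to_gene(row):
    """Map one source row (a dict) to a biolink:Gene-style node dict.

    Assumes columns named '#tax_id', 'GeneID', 'Symbol', and 'description'.
    """
    return {
        "id": f"NCBIGene:{row['GeneID']}",  # assumed CURIE prefix
        "symbol": row["Symbol"],
        "description": row["description"],
        "in_taxon": [f"NCBITaxon:{row['#tax_id']}"],  # assumed CURIE prefix
        "provided_by": ["infores:ncbi-gene"],
    }


# Hypothetical input row; the gene ID and description are made up.
example_row = {
    "#tax_id": "9615",
    "GeneID": "1001",
    "Symbol": "GENE1",
    "description": "example dog gene",
}

gene = row_to_gene(example_row)
```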

Citation

National Center for Biotechnology Information (NCBI) [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; [1988] – [cited 2024 Dec]. Available from: https://www.ncbi.nlm.nih.gov/

Requirements

Installation

cd ncbi-gene
make install
# or
poetry install

Note that the make install command is just a convenience wrapper around poetry install.

Once installed, you can check that everything is working as expected:

# Run the pytest suite
make test
# Download the data and run the Koza transform
make download
make run

Usage

This project is set up with a Makefile for common tasks.
To see available options:

make help

Download and Transform

Download the data for the ncbi_gene transform:

poetry run ncbi_gene download

To run the Koza transform for NCBI Gene:

poetry run ncbi_gene transform

To see available options:

poetry run ncbi_gene download --help
# or
poetry run ncbi_gene transform --help

Testing

To run the test suite:

make test

This project was generated using monarch-initiative/cookiecutter-monarch-ingest.
To keep this project up to date with the template, occasionally run cruft in the project directory:

cruft update

For more information, see the cruft documentation.