PreHGT runs at the genus level to pull in (pseudo) pangenome information, which is used to estimate contamination vs. real transfer events.
Right now, we don't do a good job of reporting how many and which species are represented in each genus. Below is some code I recently used to get the species (organism name) information from NCBI based on the genome accession (`GCA*`/`GCF*`).
I ran this on all files matching `download/*_genome.csv`.
Install tools
conda install -c conda-forge ncbi-datasets-cli jq
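Collect genome accessions without CSV headers
# Match only the per-genus input files, so a rerun does not append genomes.csv to itself
for infile in download/*_genome.csv
do
    # Skip the header line of each file and append the rest
    tail -n +2 "$infile" >> genomes.csv
done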
Get species (organism name)
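The command for this step is not included here, so the following is only a sketch of one way to do it with the two tools installed above, not necessarily what was originally run. It makes several assumptions: the accession is in the first comma-separated column of `genomes.csv`; the installed `datasets` release emits the newer report layout, with `.accession` and `.organism.organism_name` fields; and the function names and the intermediate file `accessions.txt` are made up for illustration.

```shell
# Print the unique accessions from a headerless CSV.
# Assumption: the accession is the first comma-separated column.
accessions_from_csv() {
    cut -d, -f1 "$1" | tr -d '"\r' | sort -u
}

# Read NCBI Datasets JSON Lines on stdin, print "accession<TAB>organism name".
# Assumption: the report has .accession and .organism.organism_name fields;
# older datasets releases used a different layout and need a different filter.
species_from_jsonl() {
    jq -r '[.accession, .organism.organism_name] | @tsv'
}

# Look up every accession in the given CSV (requires network access).
# accessions.txt is a hypothetical scratch file name.
fetch_species() {
    accessions_from_csv "$1" > accessions.txt
    datasets summary genome accession --inputfile accessions.txt --as-json-lines \
        | species_from_jsonl
}

# Example: fetch_species genomes.csv > species.tsv
```

The resulting two-column table can then be grouped by genus to see how many and which species each genus contains, for example with `cut -f2 species.tsv | sort | uniq -c`.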