SeqExtractor is a script for extracting sequences from a FASTA file based on a list of sequence IDs provided in a separate file.

Prerequisites

Python 3
Biopython library (pip install biopython)

Installation

Clone the repository:

git clone https://github.com/cavalheiromf10/SeqExtractor.git
cd SeqExtractor

Make the script executable:

chmod +x SeqExtractor.py

Usage

./SeqExtractor.py -i input_file -s sequence_file -o output_file

Arguments

-i input_file: Text file with one sequence ID per line.
-s sequence_file: FASTA file containing the sequences to extract.
-o output_file: Name of the output file in which to save the extracted sequences.
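
For readers who want to see how the three flags fit together, here is a minimal sketch of a command-line parser built with Python's standard argparse module. It is not taken from SeqExtractor.py: the function name build_parser, the dest names, and the choice to make every flag required are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the -i / -s / -o flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Extract sequences from a FASTA file by ID."
    )
    # Assumption: all three flags are mandatory.
    parser.add_argument("-i", dest="input_file", required=True,
                        help="file with one sequence ID per line")
    parser.add_argument("-s", dest="sequence_file", required=True,
                        help="FASTA file containing sequences to extract")
    parser.add_argument("-o", dest="output_file", required=True,
                        help="output file for the extracted sequences")
    return parser


# Parse the same arguments used in the Example section.
args = build_parser().parse_args(
    ["-i", "IDs_DUFs.txt",
     "-s", "Esalsugineum_173_v1.0.protein.fa",
     "-o", "output.fasta"]
)
print(args.input_file, args.sequence_file, args.output_file)
```

Passing an explicit list to parse_args keeps the sketch runnable without a real command line; a script would normally call parse_args() with no arguments.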

Example

./SeqExtractor.py -i IDs_DUFs.txt -s Esalsugineum_173_v1.0.protein.fa -o output.fasta

This example will extract sequences from Esalsugineum_173_v1.0.protein.fa based on the sequence IDs listed in IDs_DUFs.txt and save the results to output.fasta.
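
The extraction step described above can be sketched in a few lines of Python. This is a dependency-free illustration, not the script's actual code: SeqExtractor relies on Biopython, while the sketch swaps in a small hand-written FASTA reader so it runs with the standard library alone. The function names (read_ids, iter_fasta, extract) are hypothetical, and matching on the first whitespace-delimited word of each header is an assumption about how IDs are compared.

```python
import io


def read_ids(handle):
    """Collect non-empty IDs, one per line, with whitespace stripped."""
    return {line.strip() for line in handle if line.strip()}


def iter_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract(ids_handle, fasta_handle, out_handle):
    """Write records whose ID is in the ID list; return how many were written."""
    wanted = read_ids(ids_handle)
    written = 0
    for header, seq in iter_fasta(fasta_handle):
        words = header.split()
        # Assumption: the record ID is the first word of the header line.
        if words and words[0] in wanted:
            out_handle.write(f">{header}\n{seq}\n")
            written += 1
    return written


# Tiny in-memory demonstration with made-up records.
ids = io.StringIO("seqB\n\nseqC\n")
fasta = io.StringIO(">seqA desc\nMKV\n>seqB other\nMAA\nTTT\n>seqC\nGG\n")
out = io.StringIO()
count = extract(ids, fasta, out)
print(count)
print(out.getvalue(), end="")
```

Loading the IDs into a set keeps each lookup constant-time, so the FASTA file is read once no matter how many IDs are requested. Note that this sketch writes each sequence on a single line rather than wrapping it.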

Troubleshooting

If you run into file-permission or missing-file problems, read the error message printed by the script and check that the paths passed to -i and -s exist and are readable. Also make sure Biopython is installed (pip install biopython).
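
The checks above can also be run before invoking the script. The following is a minimal sketch, assuming a hypothetical helper named preflight that is not part of SeqExtractor; it uses only the standard library and looks for Biopython under its import name, Bio.

```python
import importlib.util
import os


def preflight(paths, module="Bio"):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module) is None:
        problems.append(f"module '{module}' not found; try: pip install biopython")
    for path in paths:
        if not os.path.isfile(path):
            problems.append(f"missing file: {path}")
        elif not os.access(path, os.R_OK):
            problems.append(f"file not readable: {path}")
    return problems


# Check the input files named in the Example section.
for problem in preflight(["IDs_DUFs.txt", "Esalsugineum_173_v1.0.protein.fa"]):
    print(problem)
```

An empty result only means these particular checks passed; it does not guarantee that the files are well-formed.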
