The FASTA Security toolkit
SecureFASTA is a Python tool that allows you to encrypt and decrypt Fasta, Fastq, Bam and VCF files using AES encryption and RSA Cryptography.
- University of Aveiro, Dept. Electronics, Telecommunications and Informatics (DETI / IEETA), Aveiro, Portugal ↩
- University of A Coruña, Dept. of Information and Communications Technologies, A Coruña, Spain ↩
- There exists 2 modes for encryption:
compression
andsplit
- The first allows for the compression of the file using gzip and later full encryption of the information.
- The second allows for the partial encryption of information from genomic files based on the file type.
- Encryption and decryption is performed using a PGP strategy where the key of symmetric encryption is stored in a separate file ciphered with the public key of the receiver.
- The key file contains further information for decryption. This information includes:
- Ciphered checksum for integrity validation
- Ciphered filetype for which is not currently used but can be necessary if other compression methods bcome available
- Ciphered mode for decryption since the solution must know if it must decompress the information after decryption
- The iv used for encryption (This information can be public)
- The schema of encryption. This tells the tool where ciphered fields can be found (This information can also be public)
- Python 3.6 or higher
- Requirements in requirements.txt
- Requirements in requirements-test.txt if testing is performed
The script has 2 commands: encrypt
or decrypt
which are self explanatory
- The tool will need RSA keys in PEM format
input-file
: The file to be encrypted or decryptedoutput-file
: The file to place information (encrypted or decrypted)key-file
: The file to place the key, checksum, schema, mode and iv informationpublic-key-file
: Public key of the receiver of information for RSA encryption (only needed in encryption)private-key-file
: Private key of the receiver of information for RSA decryption (only needed in decryption)mode
: Mode of exeuction for the tool.compression
compresses the file and encrypts it,split
only encrypts the file but allows for customization on which information to be ciphered (Only needed for encryption).specification
: Only useful forsplit
mode. Allows for the specification of which information is to be ciphered (only needed for encryption).file-type
: Indicates which type of file is going to be encrypted (Fasta, Fastq, VCF or Bam) (onluneeded for encryption).
File | Headers only | Sequences | Sequences w/o Quality Scores | All |
---|---|---|---|---|
Fasta | ✅ | ✅ | - | ✅ |
Fastq | ✅ | ✅ | ✅ | ✅ |
VCF | ✅ | ✅ | - | ✅ |
Bam | - | - | - | ✅ |
Encryption of only the headers of a fasta file:
$ cd src/
$ python3 main.py encrypt \
--input-file ../data/input.fasta \
--output-file ../output/fasta.enc \
--public-key-file ../keys/public_key.pem \
--key-file ../output/key.bin \
--mode split \
--specification headers \
--file_type fasta
Decryption of the same file:
$ cd src/
$ python3 main.py decrypt \
--input-file ../output/fasta.enc \
--output-file ../output/output.fasta \
--private-key-file ../keys/private_key.pem \
--key-file ../output/key.bin
To encrypt (Same example as above):
docker run \
-v ${PWD}/data:/data \
-v ${PWD}/output:/output \
-v ${PWD}/keys:/keys \
<image_name> encrypt \
--input-file /data/input.fasta \
--output-file /output/fasta.enc \
--public-key-file /keys/public_key.pem \
--key-file /output/key.bin \
--mode split \
--specification headers \
--file_type fasta
To decrypt:
docker run \
-v ${PWD}/data:/data \
-v ${PWD}/output:/output \
-v ${PWD}/keys:/keys \
something decrypt \
--input-file /output/fasta.enc \
--output-file /output/output.fasta \
--private-key-file /keys/private_key.pem \
--key-file /output/key.bin
The test with perform all possible combinations of mode, specification and filetype available. The before and after files are compared for equality and the intermediate files can be inspected visually.
Note: Create a test directory for this example to ensure it works. This also helps with the organization as many forders and files will be created.
$ cd src/
$ pytest tests.py
Please cite the following, if you use SecureFASTA in your work:
todo
More details available here.
Please let us know if there are any issues.
This project is licensed under the MIT License - see the LICENSE file for details.