update help

NBISweden · Mar 16, 2020 · 8a6352b
1 parent ff26d3f
Showing 1 changed file with 19 additions and 12 deletions.
diff --git a/bin/agat_convert_sp_gff2gtf.pl b/bin/agat_convert_sp_gff2gtf.pl
@@ -475,21 +475,20 @@ =head1 DESCRIPTION
 
 The script aims to convert any GTF/GFF file into a proper GTF file.
 Full information about the format can be found here: https://github.com/NBISweden/GAAS/blob/master/annotation/knowledge/gxf.md
-The last descrption of the fomat specify only 9 acctepeted feature  type (3rd colum):
-gene, transcript, exon, CDS, Selenocysteine, start_codon, stop_codon, three_prime_utr and five_prime_utr
-Nevertheless if your file contains other type of features they will not be removed,
-as long as the parser can deal with them.
-
-To be fully GTF compliant all feature need to have a gene_id and a transcript_id attribute.
+You can choose among 6 different GTF versions (1, 2, 2.1, 2.2, 2.5, 3).
+Depending on the version selected, the script filters out the feature types that are not accepted.
+For GTF2.5 and GTF3, every level1 feature (e.g. nc_gene, pseudogene) will be converted into
+a gene feature and every level2 feature (e.g. mRNA, ncRNA) will be converted into
+a transcript feature.
+You can even produce a GFF-like GTF using the --relax option, which keeps all
+original feature types (3rd column).
+
+To be fully GTF compliant, all features must have a gene_id and a transcript_id attribute.
 The gene_id is a unique identifier for the genomic source of the transcript, which is
 used to group transcripts into genes.
 The transcript_id is a unique identifier for the predicted transcript,
 which is used to group features into transcripts.
 
-
-Keep in mind that some bioperl versions forget to add the header (##gff-version 2) in the output.
-Check the output to add it if missing, it will avoid you troubles during your downstream analyses.
-
 =head1 SYNOPSIS
 
     agat_convert_sp_gff2gtf.pl --gff infile.gtf [ -o outfile ]
@@ -509,18 +508,26 @@ =head1 OPTIONS
 
 =item B<--gtf_version>
 version of the GTF output. Default 3 (for GTF3)
+
 GTF3 (9 feature types accepted): gene, transcript, exon, CDS, Selenocysteine, start_codon, stop_codon, three_prime_utr and five_prime_utr
+
 GTF2.5 (8 feature types accepted): gene, transcript, exon, CDS, UTR, start_codon, stop_codon, Selenocysteine
+
 GTF2.2 (9 feature types accepted): CDS, start_codon, stop_codon, 5UTR, 3UTR, inter, inter_CNS, intron_CNS and exon
+
 GTF2.1 (6 feature types accepted): CDS, start_codon, stop_codon, exon, 5UTR, 3UTR
+
 GTF2 (4 feature types accepted): CDS, start_codon, stop_codon, exon
+
 GTF1 (5 feature types accepted): CDS, start_codon, stop_codon, exon, intron
 
 =item B<--relax>
 
-Relax option allows to not follow the strict GTF format rules. All feature type will be kept.
-No modification e.g. mRNA to transcript
+The relax option avoids applying the strict GTF format specification. All feature types will be kept.
+No modification, e.g. mRNA is not converted to transcript.
 No filtering, i.e. feature types not accepted by the GTF format are kept.
+gene_id and transcript_id attributes will be added, and the attributes will follow the
+GTF formatting.
 
 =item B<-o> , B<--output> , B<--out> , B<--outfile> or B<--gtf>