GFF/GTF and GFF3 document
GFF - General Feature Format is a tab-delimited text format that describes genomic features, one feature per line. GFF uses 1-based coordinates and has 9 mandatory columns.
GTF - General Transfer Format is identical to GFF version 2.
GFF3 - GFF version 3 is more standardized with a fixed number of columns and stricter syntax.
GFF columns
- seqid - chromosome or scaffold identifier (e.g. an Ensembl chromosome name)
- source - program or data source (database or project name)
- type - exon, gene, mRNA, etc.
- start - 1-based start position
- end - 1-based end position (inclusive)
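- score - a floating point value, or "." if no score is given
- strand - + (forward) or - (reverse); "." for features that are not stranded
- phase - 0, 1 or 2 for CDS features: the number of bases to skip from the start of the feature to reach the first base of the next codon, so the codon starts at the 1st, 2nd or 3rd base; "." for other feature types
- attributes - a list of tag-value pairs (written tag=value in GFF3) separated by ";". Predefined tags: ID, Name, Alias, Parent, etc.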
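As a rough illustration of the nine columns above, here is a minimal Python sketch that splits one GFF3 feature line into named fields. The function names (`parse_attributes`, `parse_gff3_line`) and the example line are hypothetical, made up for this illustration and not taken from Ensembl. The sketch assumes GFF3-style `tag=value` attributes and leaves comma-separated multi-values as a single string.

```python
from urllib.parse import unquote

# Column names in the order they appear on a GFF line.
GFF_COLUMNS = ["seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes"]


def parse_attributes(text):
    """Split a GFF3 attributes column into a dict of tag -> value."""
    attrs = {}
    for pair in text.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing ";"
        tag, _, value = pair.partition("=")
        # GFF3 percent-encodes reserved characters such as ";" and "=".
        attrs[unquote(tag)] = unquote(value)
    return attrs


def parse_gff3_line(line):
    """Parse one tab-delimited feature line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    record = dict(zip(GFF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # "." marks a missing score or phase.
    record["score"] = None if record["score"] == "." else float(record["score"])
    record["phase"] = None if record["phase"] == "." else int(record["phase"])
    record["attributes"] = parse_attributes(record["attributes"])
    return record


# Hypothetical feature line, built with explicit tabs for readability.
example_line = "\t".join(
    ["1", "example_source", "gene", "1000", "2000", ".", "+", ".",
     "ID=gene0001;Name=demo%20gene"]
)
record = parse_gff3_line(example_line)

# Coordinates are 1-based and inclusive, so the length is end - start + 1.
feature_length = record["end"] - record["start"] + 1
```

Because both coordinates are 1-based and inclusive, converting to a 0-based half-open interval (as used by BED or Python slicing) means subtracting 1 from `start` and leaving `end` unchanged.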
GFF3 header: a GFF3 file must start with a directive line that identifies the version, e.g. ##gff-version 3
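A quick way to tell whether a file declares itself as GFF3 is to inspect its first line. The sketch below is one possible check; the helper name `has_gff3_header` is hypothetical, and it assumes the version may be written either as a bare `3` or in a dotted form such as `3.1.26`.

```python
def has_gff3_header(first_line):
    """Return True if the line is a '##gff-version' directive for version 3."""
    parts = first_line.strip().split()
    if len(parts) < 2 or parts[0] != "##gff-version":
        return False
    # Accept "3" as well as dotted forms such as "3.1.26".
    return parts[1].split(".")[0] == "3"


header_ok = has_gff3_header("##gff-version 3\n")
```

In practice this would be applied to the first line read from the file before any feature lines are parsed.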
GFF file example from Ensembl