Meeting Abstract
Next-generation sequencing platforms and improved computational resources make it possible to identify novel genes and pathways in non-model organisms. Although multiple resources are available for de novo assembly, annotation, and differential expression analysis of RNA-seq data, the field has not yet converged on a standard method of analysis. Our goal was to evaluate the tools and databases currently available for a comprehensive and comparative annotation of the transcriptome of the molting gland, or Y-organ, of the blackback land crab, Gecarcinus lateralis. Assembly of Illumina sequence data with Trinity, followed by sequence clustering with CD-HIT-EST, generated a transcriptome of 251,357 contigs. Multiple resources and databases were used to assign functional and pathway annotations to the contig sequences. Both nucleotide sequences and protein sequences, the latter obtained via TransDecoder, were searched with BLAST against the NCBI NR, Swiss-Prot, and UniProt UniRef90 databases, which annotated about 20% of the sequences. Functional annotation was performed using Blast2GO, and pathway annotation was generated using the Kyoto Encyclopedia of Genes and Genomes (KEGG). In addition to assigning BLAST information to nucleotide and protein sequences, the Trinotate suite was used to identify protein domains, transmembrane domains, and signal peptide sequences. Furthermore, we used InterProScan to match the protein sequences against 16 protein signature databases. The analysis yielded genes encoding components of the mTOR, TGFβ, MAP kinase, Wnt, Hedgehog, Jak-STAT, and Notch pathways. Combining these methods is necessary for a comprehensive annotation of sequences generated from non-model organisms. Supported by NSF (IOS-1257732).