Phagocata morgani Transcriptome

Analysis Name: Phagocata morgani Transcriptome
Method: Trinity (trinityrnaseq_r2012-10-05.tgz)
Date performed: 2016-03-03

Uninjured adult planarians of the indicated species were pooled and homogenized in 600 µl of TRIzol Reagent (Ambion/Life Technologies). RNA was isolated from each species as described in the reagent manual. Briefly, phases were separated with chloroform and RNA was precipitated with isopropanol. The RNA pellets were washed with 75% ethanol and resuspended in 100 µl of nuclease-free water to an approximate concentration of 550 ng/µl (P. gracilis), 491 ng/µl (Girardia sp.), 1566 ng/µl (S. mediterranea, asexual), and 764 ng/µl (S. mediterranea, sexual). mRNAseq libraries were generated from 1 to 1.5 µg of high-quality total RNA, as assessed using the Agilent 2100 Bioanalyzer. Libraries were made according to the manufacturer’s directions for the TruSeq RNA Sample Prep Kit v2 (Illumina, RS-122-2101). The resulting short-fragment libraries were checked for quality and quantity using the Bioanalyzer. Equimolar libraries were pooled, requantified, and sequenced as 100 bp paired reads on the Illumina HiSeq 2000 instrument, using HiSeq Control Software 1.5.15. Following sequencing, Illumina Primary Analysis version RTA 1.13.48 and Secondary Analysis version CASAVA-1.8.2 were run to demultiplex reads for all libraries and generate FASTQ files.

RNA was assembled into transcripts with the Trinity software package (version r2012-04-27 for Girardia sp., D. dorotocephala, and P. gracilis; version r2012-10-05 for P. morgani) using default parameters. Following assembly, the seqclean software package (version Oct 17, 2006) was used to trim adapter sequences and remove non-planarian contaminants. These transcriptomes are available at NCBI (accessions: GDGM00000000, x, y, z).

To create a set of non-redundant, likely protein-coding sequences, all sequences were translated into their longest open reading frames using the software package TransDecoder (version 2.01). Sequences with open reading frames of at least 300 nucleotides (100 amino acids) were collected and oriented in the direction of the open reading frame. For each species, these oriented sequences were clustered with the cd-hit software package (version 4.6), and the longest sequence from each cluster was chosen as the representative sequence.
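The length filter and the choice of a representative per cluster can be illustrated with a minimal Python sketch. It does not run TransDecoder or cd-hit; it assumes that ORF lengths and cluster memberships have already been computed by those tools. The function names and the dictionary-based inputs are hypothetical, chosen only for illustration.

```python
# Minimal sketch of the filtering and representative-selection logic.
# Assumption: ORF lengths and cluster memberships come from upstream tools;
# the names and data layout below are illustrative, not the pipeline's own.

MIN_ORF_NT = 300  # threshold from the text: 300 nucleotides = 100 amino acids


def passes_orf_filter(orf_length_nt, min_len=MIN_ORF_NT):
    """Return True if the ORF is at least min_len nucleotides long."""
    return orf_length_nt >= min_len


def pick_representatives(clusters, sequences):
    """Choose the longest member of each cluster as its representative.

    clusters:  dict mapping cluster id -> list of sequence ids
    sequences: dict mapping sequence id -> nucleotide string
    On ties, the first member listed in the cluster is kept.
    """
    return {
        cluster_id: max(members, key=lambda seq_id: len(sequences[seq_id]))
        for cluster_id, members in clusters.items()
    }
```

For example, a cluster containing a 300 nt and a 360 nt sequence would be represented by the 360 nt sequence.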

Sequences were renamed with zero-padded sequential numbers, using a different prefix for each species: Ddo (D. dorotocephala), Gsp (Girardia sp.), Pgr (P. gracilis), and Pmo (P. morgani).
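The renaming scheme can be sketched in a few lines of Python. The padding width and the absence of a separator between prefix and number are assumptions, since the exact identifier format is not stated here. The function name is hypothetical.

```python
# Minimal sketch of prefix + zero-padded sequential renaming.
# Assumptions: a padding width of 6 and no separator between prefix and
# number; the real identifiers may use a different format.

def rename_sequences(seq_ids, prefix, width=6):
    """Map each original id to prefix + zero-padded counter, starting at 1."""
    return {
        old_id: f"{prefix}{index:0{width}d}"
        for index, old_id in enumerate(seq_ids, start=1)
    }
```

Called with the prefix "Pmo" and two input ids, this yields names of the form Pmo000001 and Pmo000002 under the assumed width.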

The resulting sequence files contain 28,547 sequences for D. dorotocephala, 24,750 for Girardia sp., 32,802 for P. gracilis, and 35,237 for P. morgani.



Additional information about this analysis:
Property Name    Value
Analysis Type    bulk_data

