General Information
- Analysis ID
- OEZ008243
- Analysis Name
- RNA-seq data analysis
- Description
- Adaptors and low-quality reads were removed from the sequence data using Trimmomatic v0.36. The cleaned sequence data were aligned to the human reference genome (UCSC hg19 assembly) with STAR (2.7.3a) in two-pass mode (parameters: --chimSegmentMin 30 --chimJunctionOverhangMin 10 --alignSJDBoverhangMin 10 --alignIntronMax 200000 --alignSJstitchMismatchNmax 5 -1 5 5). The cleaned reads were also used to quantify gene expression: transcripts per million (TPM) values for each gene and transcript were calculated by Salmon with the parameters --seqBias --gcBias --posBias. Before variant calling, reads were deduplicated with Picard (2.20.4), and GATK's SplitNCigarReads tool was then applied to the deduplicated reads. Variants were detected with GATK HaplotypeCaller using the parameters --dont-use-soft-clipped-bases --standard-min-confidence-threshold-for-calling 20.0. VariantFiltration was applied in the following step (parameters: -window 35 -cluster 3 --filter-name FS --filter-expression "FS > 30.0" --filter-name QD --filter-expression "QD < 2.000"), and only high-confidence variants were retained. To detect gene fusions in the transcriptome, STAR-Fusion (v1.8.1) was applied to the chimeric junction files generated in the preceding RNA-seq variant-calling procedure. Detected fusions with fewer than 20 supporting reads were filtered out. To identify alternative splicing events, TopHat (2.1.0) was used to generate junction files against the Bowtie2-indexed (2.4.1) reference genome (hg19), with the parameters -g 1 -M -x 1 in fusion search mode.
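The post-calling filters in the description can be sketched in a few lines of Python. This is only an illustration of the thresholds stated above (FS > 30.0, QD < 2.000, at least 20 supporting reads per fusion), not the submitter's code. The `Variant` record, the function names, and the way supporting reads are counted are hypothetical, and the SNP-cluster filter (-window 35 -cluster 3) is not modelled.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the description; everything else is illustrative.
FS_MAX = 30.0           # variants with FS > 30.0 are flagged "FS"
QD_MIN = 2.0            # variants with QD < 2.0 are flagged "QD"
MIN_FUSION_READS = 20   # fusions with fewer supporting reads are dropped


@dataclass
class Variant:
    """Hypothetical minimal variant record (not a GATK type)."""
    chrom: str
    pos: int
    fs: float  # Fisher strand bias annotation
    qd: float  # quality-by-depth annotation


def variant_filters(v: Variant) -> List[str]:
    """Return the names of the hard filters that the variant fails."""
    failed = []
    if v.fs > FS_MAX:
        failed.append("FS")
    if v.qd < QD_MIN:
        failed.append("QD")
    return failed


def keep_fusion(supporting_reads: int) -> bool:
    """Keep a fusion only if it has at least MIN_FUSION_READS supporting reads."""
    return supporting_reads >= MIN_FUSION_READS
```

A variant with an empty filter list would count as passing both expressions. How supporting reads are summed for a fusion (for example junction reads plus spanning fragments) is not stated in the record, so `keep_fusion` takes a single precomputed count.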
Analysis Information
- Analysis Type
- Other
- Pipeline
-
1
Program:
Version:
v0.36
Notes:
Adaptors and low-quality reads were removed from the sequence data using Trimmomatic.
Output:
2
Program:
Version:
v2.7.3a
Notes:
Aligned to the human reference genome (UCSC hg19 assembly)
Output:
3
Program:
Version:
Notes:
Quantification of gene expression: transcripts per million (TPM) values for each gene and transcript
Output:
4
Program:
Version:
v2.20.4
Notes:
Before RNA variant calling, reads were deduplicated with Picard
Output:
5
Program:
Version:
Notes:
Additionally, GATK's SplitNCigarReads tool was applied to the deduplicated reads.
Output:
8
Program:
Version:
v2.1.0
Notes:
To identify alternative splicing events, TopHat (2.1.0) was used to generate junction files against the Bowtie2-indexed (2.4.1) reference genome (hg19).
Output:
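For readers unfamiliar with the TPM unit reported in step 3, the normalisation can be sketched as follows. This is a minimal illustration of the standard TPM definition, not of Salmon's internals: Salmon estimates read counts and effective lengths with its own model (including the bias corrections named in the description), and the function name and inputs here are hypothetical.

```python
from typing import List, Sequence


def tpm(counts: Sequence[float], effective_lengths: Sequence[float]) -> List[float]:
    """Convert per-transcript read counts to transcripts per million.

    Each count is divided by the transcript's effective length to get a
    per-base rate; the rates are then scaled so that they sum to one million.
    """
    if len(counts) != len(effective_lengths):
        raise ValueError("counts and effective_lengths must have the same length")
    rates = [c / l for c, l in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        # No reads assigned anywhere: report zero expression throughout.
        return [0.0] * len(rates)
    return [r / total * 1e6 for r in rates]
```

Because the values are rescaled to a fixed total, TPM is comparable across transcripts within a sample; gene-level TPM is usually obtained by summing the TPM of a gene's transcripts.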
Target
Analysis Data
Submitter Information
- Create Date
- 2021-12-30
- Last Modified
- 2021-12-30
- Submission
- Liangqing Dong