[Array Studio Video Tutorial] RNA-Seq Analysis Basics: Getting Started with RNA-Seq Pipeline Analysis and Data QC
Omicsoft's next-generation sequencing (NGS) analysis includes bioinformatics tools for the entire process, from QC to alignment to post-alignment summarization and analysis. RNA-seq data analysis is a critical part of these tools. In this article, we introduce our tutorial on getting started with RNA-seq pipeline analysis and data QC.
- 1 Running the RNA-seq pipeline for a new project
- 2 Raw Data QC
- 3 Filtering and Trimming Raw Reads
- 4 Aligned Data QC
A typical RNA-seq analysis project consists of steps ranging from data quality control, alignment, and aligned data quality control to data quantification, visualization, and statistical inference. In Array Studio, users can either execute each step of the analysis one by one or use the RNA-seq pipeline function. It only takes a few clicks to create a new RNA-seq project and run the RNA-seq pipeline.
If you choose to perform the analysis step by step, you must first perform quality control (QC) on the raw data before aligning your RNA-seq data, to spot common problems such as adapter or barcode sequence contamination, degraded quality at the ends of reads, or problematic samples. The Array Studio Raw Data QC Wizard reports a number of useful measures of raw NGS quality, and its report can also be generated as part of the RNA-seq pipeline function.
An example QC report includes:
- Base Distribution
- Basic Stats
- Duplication Level
- Kmer Analysis
- Overall/Per-sequence Quality Reports
- Quality Box plot
- Over-represented Sequences
- Per-sequence GC Report
- Sequence Length Report
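To make a few of these measures concrete, the following Python sketch computes basic stats, a sequence length distribution, overall GC content, a simple duplication count, and mean quality per base position from FASTQ records. It is an illustration only and not how Array Studio computes its report: the function names `parse_fastq` and `raw_qc` are hypothetical, and the code assumes four-line FASTQ records with Phred+33 quality encoding.

```python
from collections import Counter
from io import StringIO


def parse_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:          # end of input (or a blank line)
            return
        seq = handle.readline().strip()
        handle.readline()       # '+' separator line, ignored
        qual = handle.readline().strip()
        yield header[1:], seq, qual


def raw_qc(records, phred_offset=33):
    """Summarize raw reads; phred_offset=33 is an assumed encoding."""
    records = list(records)
    total_bases = sum(len(seq) for _, seq, _ in records)
    gc_bases = sum(seq.count("G") + seq.count("C") for _, seq, _ in records)
    seq_counts = Counter(seq for _, seq, _ in records)

    # Accumulate quality scores per base position across all reads.
    qual_sum, qual_n = Counter(), Counter()
    for _, _, qual in records:
        for pos, char in enumerate(qual):
            qual_sum[pos] += ord(char) - phred_offset
            qual_n[pos] += 1

    return {
        "reads": len(records),
        "length_distribution": dict(Counter(len(seq) for _, seq, _ in records)),
        "gc_fraction": gc_bases / total_bases if total_bases else 0.0,
        # Reads whose exact sequence was already seen at least once.
        "duplicate_reads": sum(n - 1 for n in seq_counts.values()),
        "mean_quality_by_position": [
            qual_sum[p] / qual_n[p] for p in range(len(qual_n))
        ],
    }


# Usage with a tiny in-memory FASTQ (made-up reads for illustration).
sample = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nII##\n@r3\nGGCC\n+\nIIII\n"
report = raw_qc(parse_fastq(StringIO(sample)))
print(report)
```

A drop in mean quality toward the last positions, as in the second made-up read, is the kind of pattern that the Quality Box plot is meant to reveal.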
Array Studio's NGS Filter function can trim low-quality bases from raw NGS data, filter out uniformly low-quality reads, and strip away adapter sequences. The RNA-seq pipeline assumes that input reads are pre-filtered and stripped, so only quality-based trimming and filtering will be performed in the pipeline (no adapter stripping). It is a good idea to run the Filter function on your reads, based on the raw data QC results, before running the RNA-seq pipeline.
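The three operations described above (adapter stripping, quality-based trimming, and filtering) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the NGS Filter implementation: the function names, the exact-match adapter search, the 3'-end-only trimming, the thresholds `min_q=20` and `min_len=25`, and the Phred+33 encoding are all hypothetical choices made for the example.

```python
def strip_adapter(seq, qual, adapter):
    """Remove the first exact adapter match and everything after it."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def trim_3prime(seq, qual, min_q=20, phred_offset=33):
    """Drop bases from the 3' end while their quality is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - phred_offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_read(seq, qual, adapter=None, min_q=20, min_len=25, phred_offset=33):
    """Return the cleaned (seq, qual), or None if the read should be discarded.

    With adapter=None only quality-based trimming and filtering are applied,
    which mirrors the behaviour the text describes for the pipeline.
    """
    if adapter:
        seq, qual = strip_adapter(seq, qual, adapter)
    seq, qual = trim_3prime(seq, qual, min_q, phred_offset)
    if len(seq) < min_len:      # too short after trimming (includes empty reads)
        return None
    return seq, qual


# Usage: a made-up read with an adapter and two low-quality bases before it.
read = "ACGTACGTAC" + "AGATCGGAAG"
quality = "IIIIIIII##" + "IIIIIIIIII"
print(filter_read(read, quality, adapter="AGATCGGAAG", min_len=5))
```

Stripping the adapter before trimming matters here: the low-quality bases sit just upstream of the adapter, so they only become the 3' end of the read once the adapter is gone.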
Array Studio automatically generates an Alignment Report after aligning reads to the genome or transcriptome. Additional alignment statistics can be generated by running the Aligned Data QC and RNA-seq 5'->3' Trend modules.
The quickest way to learn how to perform these analysis steps is to watch our short video tutorials, Getting Started with RNA-seq pipeline functions. Please stay tuned for more blog articles on RNA-seq analysis.