[Array Studio Video Tutorial] RNA-Seq Downstream Analysis: Normalization, Visualization and Data Integration

Vivian Zhang

After aligning data, there are a number of downstream analyses that can be done. For instance, the generated RPKM (or FPKM) dataset can be treated like microarray data and used for clustering (a log2 transformation may be necessary). Count data can be used to look for changes between groups of samples through DESeq analysis. A large number of visualization and QC functions are available to analyze feature-level RNA-seq data in Array Studio. In this article, we introduce our video tutorials on RNA-Seq Downstream Analysis.

1 Normalizing and Transforming RNA-seq Data for MicroArray-type analysis

Array Studio has a large number of modules originally designed for gene expression microarray analysis, and these modules are also useful for analyzing feature-level (e.g. gene-level, exon-level) RNA-seq data. However, many of them expect normalized and log-transformed input data. Array Studio provides a number of methods for normalizing RNA-seq data, including Log Geometric Mean, Mean, Median, Quantile, TMM (edgeR), TotalCount, RPKM to TPM, UpperQuartile, and LandNormalization. It also provides methods for normalizing and transforming other omics data.
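
To make the "RPKM to TPM" and log-transformation steps concrete, here is a minimal sketch in plain Python. It is not Array Studio's implementation: the function names, the pseudocount of 1, and the example values are all assumptions chosen for illustration. TPM rescales each sample's RPKM (or FPKM) values so that they sum to one million, and the log2 step compresses the dynamic range for microarray-style modules.

```python
import math

def rpkm_to_tpm(rpkm):
    """Rescale one sample's RPKM (or FPKM) values so they sum to one million."""
    total = sum(rpkm)
    if total == 0:
        return [0.0 for _ in rpkm]
    return [value / total * 1_000_000 for value in rpkm]

def log2_transform(values, pseudocount=1.0):
    """Return log2(x + pseudocount); the pseudocount (an assumption here)
    keeps genes with zero expression finite."""
    return [math.log2(v + pseudocount) for v in values]

# Hypothetical RPKM values for four genes in a single sample.
sample_rpkm = [10.0, 0.0, 30.0, 60.0]
sample_tpm = rpkm_to_tpm(sample_rpkm)
sample_log2 = log2_transform(sample_tpm)
```

In a real dataset the same two functions would be applied to each sample (column) independently before clustering or PCA.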

 

2 Attach new Views to Data

In Array Studio, data can be viewed directly in tables, but can also be displayed in up to 40 Views, depending on the contents of the underlying data. Among its most popular views is the powerful Variable View:

The Variable View allows the user to visualize one chart for each variable in the dataset. The example Variable View shows the log2 FPKM values for gene CLDN18, categorized by tissue and gender.

 

3 Principal Component Analysis on normalized expression data

Principal Component Analysis (PCA) is an effective tool for summarizing data along the components that account for the greatest variance in the dataset. In other words, PCA groups your samples based on variance, which should reflect real differences between them. Problem samples (such as failed samples) will often appear as outliers in the PCA plot.

Both 2D and 3D PCA plots are commonly used to group data or identify outliers. 
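
As an illustration of how a failed sample stands out, the sketch below computes only the first principal component with power iteration in plain Python. It is a teaching example under stated assumptions, not Array Studio's PCA module: the function name, the starting vector, the iteration count, and the sample values are all hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 for rows = samples, columns = genes."""
    n, p = len(rows), len(rows[0])
    # Center each gene (column) on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Power iteration on X^T X, starting from an arbitrary non-uniform vector.
    v = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        new_v = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in new_v))
        if norm == 0:
            break
        v = [x / norm for x in new_v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
    return v, scores

# Hypothetical log2 expression: four similar samples and one failed sample.
names = ["s1", "s2", "s3", "s4", "failed"]
data = [
    [5.0, 8.0, 2.0],
    [5.2, 7.9, 2.1],
    [4.9, 8.1, 1.9],
    [5.1, 8.0, 2.0],
    [0.5, 1.0, 0.2],
]
loadings, pc1_scores = first_principal_component(data)
# The sample furthest from the origin along PC1 is the outlier candidate.
outlier = names[max(range(len(names)), key=lambda i: abs(pc1_scores[i]))]
```

With real data one would normally use an established linear algebra library and inspect at least the first two or three components, as in the 2D and 3D plots.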

 

4 Hierarchical Clustering of normalized expression data

Gene expression data can be clustered hierarchically by Variables (e.g. genes) and by Observations (e.g. samples) to reveal associations in your data. Array Studio can easily handle Hierarchical Clustering of up to 20,000 variables, far more than the capacity of many popular gene clustering programs.

The classic dendrogram is the older version of the dendrogram view. The new version is more interactive and provides more gene annotation information for downstream analysis.
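
The idea behind hierarchical clustering can be sketched in a few lines of plain Python. The example assumes Euclidean distance and average linkage, which may differ from the options chosen in Array Studio, and the gene names and values are hypothetical. The naive loop below is for illustration only and would be far too slow for thousands of variables.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hierarchical_clustering(profiles):
    """Average-linkage agglomerative clustering.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is the information a dendrogram draws."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average pairwise distance between members of the two clusters.
            dist = sum(euclidean(profiles[x], profiles[y])
                       for x in a for y in b) / (len(a) * len(b))
            if best is None or dist < best[0]:
                best = (dist, a, b)
        dist, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, dist))
    return merges

# Hypothetical log2 expression of four genes across four samples.
profiles = {
    "GENE_A": [1.0, 1.1, 5.0, 5.2],
    "GENE_B": [1.1, 1.0, 5.1, 5.0],
    "GENE_C": [6.0, 6.2, 1.0, 0.9],
    "GENE_D": [5.8, 6.1, 1.3, 1.0],
}
merges = hierarchical_clustering(profiles)
```

Passing sample profiles instead of gene profiles clusters the Observations rather than the Variables.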

 

5 RNAseq-MicroArray Integration

Feature-level (genes, transcripts, etc.) results from RNA-seq experiments can be compared directly to microarray data from the same samples, using the Microarray-Microarray Integration module. This module allows the user to create a duplex matrix (two values for each variable in the dataset) for two “microarray” data types. The resulting dataset can also contain correlation information for each variable, making it easy to see which variables correlate well between datasets.

The Microarray-Microarray Integration module provides Variable Views at the gene and sample level, showing how well microarray and RNA-seq data correlate.
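
The following plain-Python sketch illustrates the idea of pairing two values per variable and attaching a per-gene correlation. It assumes Pearson correlation and samples listed in the same order in both datasets; the function names, gene names, and values are hypothetical, and the module's actual output format may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (NaN if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)

def integrate(rnaseq, microarray):
    """For each gene present in both datasets, pair the two values per sample
    and compute the correlation across samples."""
    shared = [gene for gene in rnaseq if gene in microarray]
    return {
        gene: {
            "pairs": list(zip(rnaseq[gene], microarray[gene])),
            "correlation": pearson(rnaseq[gene], microarray[gene]),
        }
        for gene in shared
    }

# Hypothetical log2 values for the same four samples on both platforms.
rnaseq = {"GENE_A": [1.0, 2.0, 3.0, 4.0],
          "GENE_B": [1.0, 2.0, 3.0, 4.0],
          "GENE_C": [2.0, 2.5, 3.0, 3.5]}
microarray = {"GENE_A": [2.0, 4.0, 6.0, 8.0],
              "GENE_B": [5.0, 1.0, 4.0, 2.0]}
integrated = integrate(rnaseq, microarray)
```

Genes with a correlation near 1 agree well between platforms, while low or negative values flag variables worth a closer look.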

 

To learn how to perform these downstream analyses on RNA-seq data, please check out our video tutorials on RNA-Seq Downstream Analysis.