[Omic Data Analysis Tutorial] Microarray Data Visualization, Statistical Inference and Pattern Discovery
Whether you are working with microarray data or with RNA-seq data summarized as FPKM values or read counts, downstream analysis is needed to make sense of the data and to identify interesting patterns, samples, genes or proteins. In this article, we introduce some commonly used visualization and statistical analysis functions that are covered in the second half of our Microarray Analysis video tutorials:
- Visualize Data with Array Studio Views
- Statistical Inference and Pattern Discovery
-Omic Data are read-only data constructs. The most common way to explore -Omic data is to add "Views" onto your data, including a "table" view to directly visualize the numerical data values or a "chart" view, such as the Variable view and Scatter plot.
The view you will use most often is the Table View. Although it looks like a standard spreadsheet, the Table View is actually a visualization of your underlying data. It is dynamically connected to the attached annotation and design metadata, and it can be sorted and filtered without altering the underlying data. Array Studio can handle millions of rows and columns in the Table View.
Example functions introduced in the video tutorial will allow you to:
- Sort and Filter Table Views
- Display context-specific details from metadata
- Convert read-only -Omic data to editable Table data
- Log2-transform your expression data
- Link to public databases through Web Details On-Demand
- Visualize distribution of expression values with Kernel Density
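Two of the operations listed above, log2 transformation and kernel density estimation, can be sketched outside Array Studio to show what they compute. The following is a minimal illustration in plain Python, not Array Studio's implementation. The function names, the offset of 1 (added to avoid taking the log of zero), the Gaussian kernel, and the bandwidth value are all assumptions made for this example.

```python
import math

def log2_transform(values, offset=1.0):
    """Return log2(value + offset); the offset (an assumption) avoids log2(0)."""
    return [math.log2(v + offset) for v in values]

def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

# Made-up raw expression values for one sample.
raw = [0.0, 1.0, 3.0, 7.0, 15.0, 1023.0]
logged = log2_transform(raw)

# Evaluate the density on a grid from -4 to 18 in steps of 0.5.
grid = [i * 0.5 for i in range(-8, 37)]
density = gaussian_kde(logged, grid, bandwidth=1.0)
```

Plotting `density` against `grid` gives the smooth distribution curve that a kernel density view draws. The log2 step compresses the long right tail of raw intensities so that the curve is easier to read.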
Depending on the contents of your -Omic data or table, Array Studio offers about 40 views for displaying your data interactively. This video clip briefly walks through two of the more popular views for gene-level data: the Variable View and the Pairwise Scatter Plot.
Gene expression data can be grouped using Hierarchical Clustering by Variables (e.g. genes) and Observations (e.g. samples) to reveal associations in your data.
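To make the idea of clustering variables concrete, here is a minimal sketch of agglomerative hierarchical clustering in plain Python. It assumes average linkage and a distance of 1 minus the Pearson correlation. These are common choices for expression data, but the tutorial does not state which options Array Studio uses. The gene names and values are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def hierarchical_clusters(profiles, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    names = list(profiles)
    dist = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist[p] for p in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical expression profiles across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.5, 4.0, 2.0],
}
clusters = hierarchical_clusters(profiles, n_clusters=2)
```

Here the two rising genes end up in one cluster and the two falling genes in the other. Clustering observations works the same way after transposing the data, so that each sample is a profile across genes.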
In addition to visualizing the overall clustering pattern, you can use Find Neighbors to search a dataset for variables or observations whose patterns resemble those of a variable or observation of interest. You can display these comparisons in multiple ways, including pairwise correlation/MA plots, heatmaps, and 3D scatter plots.
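A neighbor search of this kind can be approximated by ranking every other variable by its similarity to the query. The sketch below assumes Pearson correlation as the similarity measure. The function name `find_neighbors`, the gene names, and the values are hypothetical and do not reflect Array Studio's internals.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def find_neighbors(profiles, query, top_n):
    """Return the top_n variables most correlated with the query variable."""
    scores = [
        (name, pearson(profiles[query], values))
        for name, values in profiles.items()
        if name != query
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]

# Hypothetical expression profiles across five samples.
profiles = {
    "gene_query": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_1": [2.0, 4.0, 6.0, 8.0, 10.0],
    "gene_2": [5.0, 4.0, 3.0, 2.0, 1.0],
    "gene_3": [1.0, 3.0, 2.0, 5.0, 4.0],
}
neighbors = find_neighbors(profiles, "gene_query", top_n=2)
```

With these values, `gene_1` ranks first because it rises in step with the query, `gene_3` ranks second, and the anti-correlated `gene_2` is left out.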
One-Way ANOVA is used to test the effect of a single factor on expression data, while Two-Way ANOVA is used to test the effects of two factors. Each model generates an inference report, including an automatically generated Report View and VolcanoPlotView. Additionally, the Venn Diagram and Inference Report Summary can help you quickly visualize the differentially expressed genes.
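For readers who want to see what a one-way ANOVA computes for a single gene, the following sketch calculates the F statistic and its degrees of freedom in plain Python. It covers only the textbook calculation with made-up values and says nothing about how Array Studio fits its models. Converting F to a p-value needs the F distribution, which is left out here to keep the example dependency-free.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical log2 expression of one gene under three treatment levels.
control = [1.0, 2.0, 3.0]
low_dose = [2.0, 3.0, 4.0]
high_dose = [5.0, 6.0, 7.0]
f_stat, df_between, df_within = one_way_anova_f([control, low_dose, high_dose])
```

A large F means the group means differ by more than the within-group scatter would suggest. Repeating the test for every gene and plotting effect size against significance gives the kind of picture a volcano plot shows.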
If you are interested in discovering pathways or functionally related genes that are enriched in your data, you can run the Gene Ontology (GO) module. This module performs built-in gene ontology classification on one or more lists of significant variables. Once you have generated such a list, Array Studio goes through all possible GO terms (across different class levels) and counts how many variables in the list are covered by each term. From these counts you can infer biological attributes of the variables in the list, such as their functions and the biological processes they take part in.
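The counting step described above is often paired with an enrichment test that asks whether the overlap with a GO term is larger than chance would predict. The sketch below uses a one-sided hypergeometric test, a common choice for this purpose. The tutorial does not say which test Array Studio applies, so treat this as an illustration only. The gene identifiers, the population size of 20, and the function name are made up.

```python
from math import comb

def enrichment_p_value(population, annotated, list_size, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    population: number of genes measured in total
    annotated:  number of those genes annotated with the GO term
    list_size:  number of genes in the significant list
    overlap:    number of significant genes annotated with the term
    """
    upper = min(annotated, list_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, list_size)

# Hypothetical significant list and GO term membership.
significant = {"g1", "g2", "g3", "g9"}
go_term_genes = {"g1", "g2", "g3", "g4", "g5"}
overlap = len(significant & go_term_genes)

p_value = enrichment_p_value(
    population=20,
    annotated=len(go_term_genes),
    list_size=len(significant),
    overlap=overlap,
)
```

Three of the four significant genes carry the term, and the resulting p-value is about 0.032. In practice the test is repeated for every GO term, so the p-values should be corrected for multiple testing before they are interpreted.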