Bioinformatics research draws on many data sources, including microarray, sequence, CNV, ChIP-chip, and genotype data. In Array Studio, we divide genomic data into two groups: -Omic data and Table data. -Omic data is essentially a data matrix with annotation for both columns and rows; microarray data is a standard example. The microarray tutorial is a great starting point for new users of Array Studio, whether or not you will be working directly with microarray data. This article covers the basics of microarray analysis in Array Studio.
- 1 Getting Started with Array Studio Microarray Analysis
- 2 Preparing Your Data for Downstream Analysis
When Array Studio is first installed, the main window is laid out as follows: projects are organized in the Solution Explorer, generated data and figures are displayed in the middle of the window, and a Legend and Filter window appears on the right side.
After you create a new project, Array Studio guides you through importing your expression microarray datasets. -Omic data is built from three basic tables: the -Omic measurement data table, the design table, and the annotation table.
After data import and downstream analysis, Array Studio organizes the data in a project into four main types: List Data, Table Data, -Omic Data, and NGS Data. -Omic data is read-only table data with annotation and design tables attached (the attached tables can be modified). -Omic data and table data can be converted from one type to the other.
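To make the relationship between the three tables concrete, here is a minimal Python sketch. It is not Array Studio's internal format or API: the `OmicData` class, its field names, and the `to_table` method are all hypothetical, and the sketch assumes that matrix rows are probesets and matrix columns are samples. It only illustrates how a measurement matrix can be tied to a design table (one entry per sample) and an annotation table (one entry per probeset), and how such an object could be flattened into a plain table.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OmicData:
    """Toy stand-in for an -Omic data object (hypothetical, not Array Studio's format)."""

    row_ids: List[str]                     # probeset/gene IDs, one per matrix row
    col_ids: List[str]                     # sample IDs, one per matrix column
    matrix: List[List[float]]              # matrix[i][j] = value for row i, sample j
    design: Dict[str, Dict[str, str]]      # sample ID -> sample attributes
    annotation: Dict[str, Dict[str, str]]  # probeset ID -> probeset attributes

    def __post_init__(self) -> None:
        # The matrix shape must agree with the row and column identifiers.
        if len(self.matrix) != len(self.row_ids):
            raise ValueError("matrix must have one row per probeset ID")
        if any(len(row) != len(self.col_ids) for row in self.matrix):
            raise ValueError("every matrix row needs one value per sample ID")
        # Every sample needs a design entry; every probeset needs an annotation entry.
        if set(self.col_ids) - set(self.design):
            raise ValueError("design table is missing some samples")
        if set(self.row_ids) - set(self.annotation):
            raise ValueError("annotation table is missing some probesets")

    def to_table(self) -> List[Dict[str, object]]:
        """Flatten into plain table rows: ID, annotation columns, then one column per sample."""
        rows = []
        for row_id, values in zip(self.row_ids, self.matrix):
            row: Dict[str, object] = {"ID": row_id}
            row.update(self.annotation[row_id])
            row.update(zip(self.col_ids, values))
            rows.append(row)
        return rows


# Illustrative values only.
omic = OmicData(
    row_ids=["probe_1", "probe_2"],
    col_ids=["sample_A", "sample_B"],
    matrix=[[7.1, 7.3], [5.0, 9.2]],
    design={"sample_A": {"Group": "control"}, "sample_B": {"Group": "treated"}},
    annotation={"probe_1": {"Gene": "TP53"}, "probe_2": {"Gene": "EGFR"}},
)
for table_row in omic.to_table():
    print(table_row)
```

Keeping the checks in `__post_init__` means a mismatched design or annotation table is caught when the object is built rather than during a later analysis step.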
Array Studio tracks all analysis steps performed in a project using its Audit Trail feature, and provides several ways to reproduce them. The Omicsoft script (Oscript) for an analysis function can be viewed in every function window, by right-clicking an object name, or by viewing the full Audit Trail. This supports data integrity requirements and lets individual users track changes and reproduce procedures.
Array Studio contains modules that, before downstream analysis, identify samples that deviate significantly from the rest of the data set, possibly indicating a failed sample that should be excluded.
Principal Component Analysis (PCA) can identify sources of variance in a data set. That variance can come from real differences between sample groups, or from a failed microarray chip. Failed experiments can quickly be removed from your -Omic data objects before downstream analysis.
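The idea behind using PCA for quality control can be sketched outside Array Studio in a few lines of plain Python. This is a generic illustration, not Array Studio's implementation: the function name `pc1_scores` and the example values are hypothetical. It computes only the first principal component, using power iteration on the sample-by-sample Gram matrix of the centered data. A sample with an unusually large score on that component is a candidate for closer inspection.

```python
import math


def pc1_scores(samples, iterations=200):
    """Return each sample's score on the first principal component.

    samples: list of equal-length lists, one list of expression values per sample.
    """
    n = len(samples)
    n_features = len(samples[0])
    # Center each feature (gene/probeset) across samples.
    means = [sum(s[j] for s in samples) / n for j in range(n_features)]
    centered = [[s[j] - means[j] for j in range(n_features)] for s in samples]
    # Sample-by-sample Gram matrix; its top eigenvector gives the PC1 scores.
    gram = [[sum(a * b for a, b in zip(r, c)) for c in centered] for r in centered]
    # Power iteration for the dominant eigenvector. The start vector is
    # arbitrary; it only must not be orthogonal to the true eigenvector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(gram[i][k] * vec[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return [0.0] * n  # no variance at all in the data
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue; scores = eigenvector * sqrt(eigenvalue).
    eigenvalue = sum(
        vec[i] * sum(gram[i][k] * vec[k] for k in range(n)) for i in range(n)
    )
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * v for v in vec]


# Illustrative values: four similar samples and one that looks like a failed chip.
expression = [
    [5.0, 6.0, 7.0, 8.0],
    [5.1, 6.1, 6.9, 8.0],
    [4.9, 5.9, 7.1, 8.1],
    [5.0, 6.0, 7.0, 7.9],
    [9.0, 2.0, 3.0, 12.0],
]
scores = pc1_scores(expression)
most_extreme = max(range(len(scores)), key=lambda i: abs(scores[i]))
print("sample index with the largest |PC1 score|:", most_extreme)
```

The sign of a principal component is arbitrary, so the sketch compares absolute scores. A large score alone does not prove a chip failed, since real biological differences between groups also drive the leading components.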
Array Studio can also identify samples that deviate significantly from the others in your data set by calculating correlation coefficients between samples across their genes/probesets. Samples that correlate unusually poorly with the rest are flagged as possible failed samples and can be excluded from downstream analysis.
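A correlation-based check of this kind can be sketched as follows. This is a simplified illustration rather than Array Studio's actual algorithm: the function names, the use of the median Pearson correlation against all other samples, and the `min_median_corr` cutoff of 0.8 are all assumptions made for the example.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant profile; treat as none
    return cov / math.sqrt(var_x * var_y)


def flag_poorly_correlated(samples, min_median_corr=0.8):
    """Return names of samples whose median correlation with the others is low.

    samples: dict mapping sample name -> list of expression values
             (same genes/probesets, in the same order, for every sample).
    Requires at least two samples.
    """
    flagged = []
    for name, values in samples.items():
        correlations = [
            pearson(values, other_values)
            for other, other_values in samples.items()
            if other != name
        ]
        # The median is used so that one bad sample does not drag down
        # the summary value of every good sample.
        if statistics.median(correlations) < min_median_corr:
            flagged.append(name)
    return flagged


# Illustrative values: S4 does not follow the pattern shared by S1 to S3.
profiles = {
    "S1": [5.0, 6.0, 7.0, 8.0, 9.0],
    "S2": [5.1, 6.2, 6.9, 8.1, 9.0],
    "S3": [4.9, 5.8, 7.2, 7.9, 9.1],
    "S4": [9.0, 5.0, 8.0, 6.0, 7.0],
}
print(flag_poorly_correlated(profiles))
```

A fixed cutoff is the simplest choice for a sketch; in practice the threshold would depend on the platform and on how similar the samples are expected to be.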
For step-by-step instructions, please check out our video tutorial: Getting Started with Array Studio Microarray Analysis.