


Omicsoft is the leading provider of Next Generation Sequencing, Cancer Genomics, Immunology, and Bioinformatics solutions for Next Generation Sequencing Data and Gene Expression Analysis.

Exciting Updates and Latest News

Keeping you up-to-date with the latest in NGS, Bioinformatics Analysis, and cancer genomics with blogs on Array Suite, OncoLand (TCGA and more), ImmunoLand, and more.


[New Feature] Manage Land Sample Clinical Data

Vivian Zhang

Omicsoft has been working diligently over the past few months both to strengthen our ability to incorporate clinical data and to grow our list of curated clinical measurements from public datasets. Currently, OncoLand and DiseaseLand contain more than 1000 different clinical measurement variables in total, including sample demographics, survival data, symptoms, treatments and more. Moreover, users often have their own sets of internal clinical data that they wish to add to the system. If you have not started leveraging the power of our clinical data subsystem, please take a look at OncoLand Case Study - Clinical Variables for a quick 10-minute video tutorial on how to use clinical data to identify novel associations.

To help users better manage Land clinical data, we recently implemented the Manage Sample Clinical Data function in Land. This function can be accessed through:



This function allows users to add clinical data, manage clinical variable metadata, remove samples and remove clinical variables:

Add Clinical Data


Adding clinical data is straightforward. In addition, "Metadata" for clinical data columns can be controlled by adding a second table. For example, clinical data column grouping can be controlled by a table where the first column contains Clinical Data column names, and the second column contains category:
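As a rough illustration of the two tables described above, the sketch below writes a small clinical data table and a matching metadata table as tab-delimited text. The tab-delimited layout, the column names (SampleID, Age, Gender, OverallSurvivalMonths), the category labels, and the ColumnName/Category headers are all assumptions made for this example, not a documented Land import format.

```python
import csv
import io

# Hypothetical clinical data: one row per sample, one column per clinical variable.
clinical_rows = [
    {"SampleID": "S001", "Age": "54", "Gender": "F", "OverallSurvivalMonths": "31"},
    {"SampleID": "S002", "Age": "61", "Gender": "M", "OverallSurvivalMonths": "12"},
]

# Hypothetical grouping of clinical columns into categories.
column_categories = {
    "Age": "Demographics",
    "Gender": "Demographics",
    "OverallSurvivalMonths": "Survival",
}


def to_tsv(header, rows):
    """Serialize a header and rows (lists of strings) as tab-delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


clinical_header = list(clinical_rows[0].keys())
clinical_tsv = to_tsv(
    clinical_header,
    [[row[col] for col in clinical_header] for row in clinical_rows],
)

# Metadata table: first column is the clinical data column name,
# second column is the category used to group that column.
metadata_tsv = to_tsv(
    ["ColumnName", "Category"],
    [[name, category] for name, category in column_categories.items()],
)

# Sanity check: every column described in the metadata exists in the clinical table.
missing = [name for name in column_categories if name not in clinical_header]

print(clinical_tsv)
print(metadata_tsv)
print("Columns missing from clinical table:", missing)
```

Checking the metadata table against the clinical table before upload is a cheap way to catch misspelled column names.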

Add Clinical Variable Metadata



The function is straightforward to use and lets users manage their clinical data efficiently. For more details on the function, please refer to our wiki page.

Stay tuned for additional functionality coming at the end of this year, including support for CDISC-formatted files so that time-series measurement data can be included.

[Important Land Update] Land Filter Now Carries Over Across Multiple Searches

Vivian Zhang

Omicsoft's Lands are known for being comprehensive, powerful and integrated, allowing users to navigate across samples, genes, data types, datasets and platforms. As comprehensive and flexible as it is, the system may appear complex to some users, given the growing numbers of samples, datasets and data types. To help users apply filters more easily and efficiently, Omicsoft recently improved the filtering logic in the Lands.

Previously, filters applied to one search did not carry over from the main Land tab, requiring users to apply the filters all over again for any new search.

For example, if the user wants to compare gene expression FPKM for EGFR in KIRC (Kidney Renal Clear Cell Carcinoma) and KIRP (Kidney Renal Papillary Cell Carcinoma), the first step might be to filter the tumor types in the TCGA_B37 main tab (to see sample numbers and understand the overall distribution of samples). Next, the user can search for EGFR and go to the Gene FPKM view (Step 2). If the user then wanted to see the gene expression of TP53, the Land previously did not carry over the filter, and the user had to apply it again (Step 3).

Step 1: Filter for Tumor Type KIRC and KIRP, and check sample statistics (data availability in this example search):

Step 2: Search for Gene EGFR and open the Gene FPKM view:

Step 3: Search for Gene TP53; the sample filter does not carry over and the user has to apply the filter all over again:


Now, with the new version, the filter carries over without any additional steps:


Imagine having already applied many filter steps to reach a group of samples or genes that look intriguing: having all of those filters applied directly to the new search is both easy and time-saving.

This filter logic applies to all left-hand side filter tabs including Sample, Comparison and all data type filter tabs. 

If the user does not want to apply the filters, simply click the Clear All Filters button to reset everything:
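The carry-over behavior described in this post can be pictured with a small toy model. This is only a conceptual sketch: the `LandSession` class, its method names, the sample records and the FPKM values are all invented for illustration and say nothing about how the Land software is actually implemented.

```python
class LandSession:
    """Toy model of a Land tab whose filters persist across gene searches."""

    def __init__(self, samples):
        self.samples = samples   # list of dicts describing samples
        self.filters = {}        # column name -> set of allowed values

    def set_filter(self, column, allowed_values):
        """Apply (or replace) a filter on one sample column."""
        self.filters[column] = set(allowed_values)

    def clear_all_filters(self):
        """Reset everything, like the Clear All Filters button."""
        self.filters = {}

    def search(self, gene):
        """Return (sample id, FPKM) pairs for a gene, honoring the current filters."""
        hits = []
        for sample in self.samples:
            passes = all(
                sample.get(column) in allowed
                for column, allowed in self.filters.items()
            )
            if passes:
                hits.append((sample["id"], sample["fpkm"].get(gene)))
        return hits


# Made-up samples and expression values, for illustration only.
samples = [
    {"id": "A", "tumor_type": "KIRC", "fpkm": {"EGFR": 12.0, "TP53": 8.5}},
    {"id": "B", "tumor_type": "KIRP", "fpkm": {"EGFR": 7.1, "TP53": 9.9}},
    {"id": "C", "tumor_type": "LUAD", "fpkm": {"EGFR": 30.2, "TP53": 4.0}},
]

session = LandSession(samples)
session.set_filter("tumor_type", ["KIRC", "KIRP"])   # Step 1: filter tumor types
egfr = session.search("EGFR")                        # Step 2: first search
tp53 = session.search("TP53")                        # Step 3: filter still applies
session.clear_all_filters()
tp53_all = session.search("TP53")                    # after reset, all samples return

print(egfr)
print(tp53)
print(tp53_all)
```

Because the filter state lives on the session rather than on an individual search, every new search sees the same filtered samples until the filters are cleared.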

[New Feature] Geneset Analysis Functionality: integrated with Omicsoft Land databases

Vivian Zhang

Gene Set Analysis is a powerful tool for users who have their own gene signatures and would like to identify comparisons or other signatures with similar gene set enrichment, drawing on the tens of thousands of comparisons in the Lands as well as, for on-premises customers, their own custom gene sets. Recently, Omicsoft officially released our new GeneSet Analysis function. For more details, check out our webinar recording, Announcing GeneSet Analysis Functionality, integrated with Omicsoft’s Land databases, presented by Matt Newman, VP of Business Development at Omicsoft, on September 28th, 2016.

Previously, Omicsoft's Land system offered a simplified GeneSet Enrichment Analysis. It allowed users to compare their own gene sets with those contained in the Lands: 

Although this was powerful enough to identify comparisons with similar gene sets, it had several limitations:

1. It was restricted to a specific Land of choice and not shared across Lands.

2. It did not take directionality into account.

3. It was not able to include genesets beyond Land data as target gene sets.

4. It required the user to be familiar with the Land system, and not just the analysis sub-system of Array Suite.

Omicsoft's Array Studio also provides a Molecular Signature module that allows users to compare against Broad's molecular signature database. However, that module also does not take directionality into account, and it requires users to add plain gene lists to Array Studio Projects, with no ability to incorporate inference reports or any of the important data stored within the Lands, and no easy way to incorporate customer gene sets.


In order to more fully leverage Omicsoft's data assets, we have officially released our new GeneSet Analysis module. The new GeneSet Analysis allows the users to query across OncoLand, DiseaseLand, Molecular Signatures, and more. 

GeneSet Analysis Wizard


In addition to the geneset databases included, the new GeneSet Analysis also provides directional results -- up and down p-values and directions.
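To give a feel for what "directional" results mean, here is a minimal sketch that tests a query gene list separately against the up-regulated and down-regulated genes of a target comparison using a one-sided hypergeometric test. This is a generic textbook approach chosen for illustration; it is an assumption, not a description of the statistics used inside the GeneSet Analysis module, and all gene names and function names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n_query, n_target, n_universe):
    """P(overlap >= k) when n_query genes are drawn at random from a
    universe of n_universe genes that contains n_target target genes."""
    total = comb(n_universe, n_query)
    upper = min(n_query, n_target)
    favorable = sum(
        comb(n_target, i) * comb(n_universe - n_target, n_query - i)
        for i in range(k, upper + 1)
    )
    return favorable / total


def directional_enrichment(query_genes, target_up, target_down, universe):
    """Compute separate enrichment p-values against up and down gene lists."""
    universe = set(universe)
    query = set(query_genes) & universe
    result = {}
    for label, target in (("up", target_up), ("down", target_down)):
        target = set(target) & universe
        overlap = len(query & target)
        result[label] = {
            "overlap": overlap,
            "p_value": hypergeom_upper_tail(
                overlap, len(query), len(target), len(universe)
            ),
        }
    up_p = result["up"]["p_value"]
    down_p = result["down"]["p_value"]
    # Report the direction with the smaller p-value; ties are reported as "none".
    result["direction"] = "up" if up_p < down_p else "down" if down_p < up_p else "none"
    return result


# Hypothetical 20-gene universe and gene lists.
universe = [f"G{i}" for i in range(1, 21)]
query = ["G1", "G2", "G3", "G4"]
target_up = ["G1", "G2", "G3", "G10", "G11"]
target_down = ["G15", "G16", "G17", "G18", "G19"]

res = directional_enrichment(query, target_up, target_down, universe)
print(res)
```

Testing the up and down lists separately is what lets a result say not only that two signatures overlap, but in which direction.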

GeneSet Analysis result


We are still actively developing the GeneSet Analysis module, constantly improving its content, functions and visualizations. Here are a couple of examples of what we are working on:

1. Multi-species data support in addition to human and mouse data

2. Additional visualizations based on table results

If you have any comments or suggestions, please let us know.


Want to give it a try? Please check out our latest webinar, Announcing GeneSet Analysis Functionality, integrated with Omicsoft’s Land databases, and our GeneSet Analysis wiki for a detailed illustration.



[Land Update] Omicsoft OncoLand 2016 Q2 Update

Vivian Zhang

We've reached the time for our OncoLand Quarterly Update, and we're excited about what we have to tell you about!

In our Q1 2016 release, following our kick-off User Group Meeting, we made a major update to the Lands, including CCLE_B37, CGCI_B37, , Hematology_B37, ICGC_B37, OncoGEO_B37, TARGET_B37, TCGA_B37, and TumorMutation_B37, and added two new Lands, ClinicalOutcome_B37 and expO_B37. In the Q2 update, we provided updates for Hematology_B37, ICGC_B37, TCGA_B37 and OncoGEO_B37.

Here are the sample statistics for the updated Lands. For details, please refer to the OncoLand 2016 Q2 Release Whitepaper.



•    60 samples (two cell lines under different conditions) with RNA-Seq data; based on SRA SRP041036
•    5484 samples with Affymetrix (U133 Plus 2.0) expression data; based on GEO GSE15695, GSE19784, GSE6891, GSE12417, GSE13159, GSE17855 and MMGP
•    767 samples with CNV data; based on GEO, MMRC Collection, HMCL69 cell line and Corral2012 study
•    203 samples with DNA-Seq somatic mutation data; based on MMRC Reference Collection
•    68 samples with DNA-Seq mutation data; based on HMCL69 cell line collection



•    577 samples with RNA-Seq data
•    779 samples with Methylation450 data
•    5587 samples with DNA-Seq Somatic Mutation data
•    2869 samples with CNV data



•    2001 samples with RNA-Seq data
•    4786 samples with expression data



•    22301 samples with CNV data
•    9677 samples with DNA-Seq Somatic Mutation data
•    2377 samples with Expression Ratio (Agilent) data
•    9793 samples with Methylation450 data
•    11022 samples with miRNA-Seq data
•    7933 samples with RPPA (protein array) data
•    4735 samples with RPPA_RBN (protein array) data
•    11291 samples with RNA-Seq data


Most users should have already been contacted about this release update, and if not, we will work with you to update your servers in the near future.


[OncoLand Case Study] Summarize per-sample and per-tumor mutations across multiple genes

Vivian Zhang

Summarizing mutation frequencies within a protein complex, among members of a pathway, or even across the genome can give insights into differences between tumors. By combining OncoLand and Array Studio functions, you can explore these mutation frequencies. As an example, let's take the Swi/Snf complex, which regulates chromatin remodeling.

The Swi/Snf complex is a multi-subunit, ATP-dependent chromatin-remodeling complex. Early studies have suggested that the Swi/Snf complex plays a role in cancer development, with its members likely acting as tumor suppressors (Nature Reviews Cancer article: The SWI/SNF complex — chromatin and cancer). Mutations in the members of this complex have been linked to various cancers. You can leverage OncoLand to query samples containing those mutations. Please check out the detailed OncoLand case study video tutorials.


Identify samples with mutations in the Swi/Snf complex

To find out how often the genes from the Swi/Snf complex are mutated in tumors, you can use Summarize Sample Mutation Count, available through the Analytics tab, to generate a SampleSet and use this SampleSet for downstream analysis:

SampleSet results from the Summarize Sample Mutation Count analysis, obtained by inputting all gene names from the Swi/Snf complex as the GeneSet and grouping by Tumor Type. The mutation count is sorted by the number of mutations in each sample.
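The idea behind this kind of summary can be sketched in a few lines: count, for each sample, how many mutations fall in genes from the gene set, then group samples by tumor type and sort by count. The mutation records below are invented, the gene set lists only a few well-known Swi/Snf members rather than the full complex, and the function name simply mirrors the menu item; none of this reflects the actual Land implementation.

```python
from collections import Counter, defaultdict

# Hypothetical mutation calls: (sample id, tumor type, mutated gene).
mutations = [
    ("S1", "KIRC", "PBRM1"),
    ("S1", "KIRC", "ARID1A"),
    ("S2", "KIRC", "PBRM1"),
    ("S3", "OV", "ARID1A"),
    ("S3", "OV", "TP53"),   # TP53 is not in the gene set, so it is not counted
    ("S4", "OV", "TP53"),
]

# A few well-known Swi/Snf subunit genes (not the complete complex).
gene_set = {"ARID1A", "PBRM1", "SMARCA4", "SMARCB1"}


def summarize_sample_mutation_count(mutations, gene_set):
    """Count gene-set mutations per sample, grouped by tumor type.

    Returns {tumor type: [(sample id, count), ...]} with each list sorted
    by descending mutation count, then by sample id.
    """
    counts = Counter()
    tumor_of = {}
    for sample, tumor, gene in mutations:
        tumor_of[sample] = tumor
        if gene in gene_set:
            counts[sample] += 1

    by_tumor = defaultdict(list)
    for sample, tumor in tumor_of.items():
        by_tumor[tumor].append((sample, counts[sample]))
    for rows in by_tumor.values():
        rows.sort(key=lambda row: (-row[1], row[0]))
    return dict(by_tumor)


summary = summarize_sample_mutation_count(mutations, gene_set)
for tumor, rows in summary.items():
    print(tumor, rows)
```

Samples with no mutation in the gene set are kept with a count of zero, so per-tumor mutation frequencies can be derived from the same table.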



Visualize differences in Swi/Snf complex mutations using TCGALand Views

There are multiple ways to visualize differences in Swi/Snf mutation frequency. Even without Land views, this can be done in Array Studio, which supports hundreds of different types of analysis and can save biologists from waiting weeks for a bioinformatician to return results. With OncoLand, however, we can visualize the mutation frequency in minutes. The following analysis pipeline demonstrates the difference between using Array Studio and using OncoLand.

OncoLand makes cancer genomics research easy. Again, please check out our case study video tutorials for more details.

[OncoLand Case Study] Find genes that are frequently co-mutated with your gene-of-interest: Co-mutation of TP53 and ATRX when IDH1-R132 is mutated

Vivian Zhang

The IDH1 gene encodes isocitrate dehydrogenase, which is involved in NADPH production, especially in the brain. Mutations in IDH1 are frequently found in low grade and high grade gliomas: low grade (grade II), anaplastic (grade III), and glioblastoma (GBM, grade IV) (Research Article: IDH1 and IDH2 Mutations in Gliomas). These mutations play an important role in gliomagenesis and are therefore of clinical interest. We can query OncoLand to learn about IDH1 mutations and the other genes that are frequently co-mutated with it. For details, please refer to our OncoLand case study wiki:

Identify mutation hotspots in a gene of interest

In several cancers, IDH1 is frequently mutated at arginine 132, which alters the enzyme's active site. We can visualize the frequencies of mutations at different sites in each tumor. As we can see, our data confirms that IDH1 arginine 132 is frequently mutated in low grade gliomas (LGG) and glioblastoma (GBM):

TCGALand DNA-Seq Somatic Mutation Site Distribution View. 


The user can create a SampleSet, for example the IDH1_mutation SampleSet shown below, from Analytics | Generate Sample Set | Generate Site Mutation Status SampleSet.

SampleSet: IDH1_mutation


Identify other genes that are co-mutated with your gene of interest

With the SampleSet, we can identify the gene mutations that are correlated with it through Analytics | Integration Analysis | Sample Grouping to Mutation. The test may take a few minutes if all genes are queried, and the results will be available from the Analytics | Open Result Set menu. From the results table, we can rank genes by the PValue from the Fisher Exact Test to identify the correlated genes, for instance ATRX and TP53 in LGG and GBM:

Analytics | Integration Analysis | Sample Grouping to Mutation Test results. Rank by PValue, filter by only co-occurring gene in LGG and GBM.
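The logic of such a test can be sketched as follows: split samples by whether they carry the site mutation, build a 2x2 table per gene, and rank genes by a two-sided Fisher exact p-value. The sample data are invented, the function names are hypothetical, and the p-value convention (summing all tables no more probable than the observed one) is a common choice assumed here; the Land software may compute its PValue differently.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


def rank_comutated_genes(site_status, gene_mutations):
    """Rank genes by association with the site-mutation sample grouping.

    Returns rows of (gene, mutated-group count, wild-type-group count, p-value).
    """
    mutated = {s for s, hit in site_status.items() if hit}
    wild_type = set(site_status) - mutated
    genes = set().union(*gene_mutations.values())
    rows = []
    for gene in genes:
        a = sum(gene in gene_mutations[s] for s in mutated)
        c = sum(gene in gene_mutations[s] for s in wild_type)
        p = fisher_exact_two_sided(a, len(mutated) - a, c, len(wild_type) - c)
        rows.append((gene, a, c, p))
    return sorted(rows, key=lambda row: (row[3], row[0]))


# Hypothetical SampleSet: True means the sample carries the IDH1 R132 mutation.
site_status = {"S1": True, "S2": True, "S3": True, "S4": True,
               "S5": False, "S6": False, "S7": False, "S8": False}

# Hypothetical mutated genes per sample (IDH1 itself excluded).
gene_mutations = {
    "S1": {"TP53", "ATRX"}, "S2": {"TP53", "ATRX"},
    "S3": {"TP53"}, "S4": {"TP53", "ATRX"},
    "S5": {"EGFR"}, "S6": {"EGFR", "PTEN"},
    "S7": {"PTEN"}, "S8": {"TP53"},
}

ranked = rank_comutated_genes(site_status, gene_mutations)
for gene, in_mutated, in_wild_type, p_value in ranked:
    print(gene, in_mutated, in_wild_type, round(p_value, 4))
```

Comparing the two count columns tells you whether a low p-value reflects co-occurrence (more hits in the mutated group) or mutual exclusivity.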


Visualize Co-mutation patterns with the Alteration Omicprint

There are several ways to visualize co-mutation frequencies of multiple genes. While the "Alteration Distribution" displays the number of samples mutated in any gene of the GeneSet, "Somatic Co-mutation Frequencies" displays the distribution of samples with different mutation loads. The "Alteration Omicprint" efficiently displays the per-sample mutation status of one, ten, or even hundreds of genes. You can also generate custom Omicprints based on custom queries if you want to query mutation status. Please check out our case study tutorial videos to learn how to perform the analysis.

Alteration Omicprint displays gene alteration status for multiple genes for corresponding samples. Custom queries for IDH1 and TP53 somatic mutation status, and BMP2 RNA-Seq FPKM are created. Next, check out the Custom Query Omicprint view. For each custom query, sample status is displayed. As we can see, samples with mutated IDH1 and TP53 frequently over-express BMP2 in GBM.


[Feature Review] Save Customized Views for Future Usage and Sharing in the "Lands"

Vivian Zhang

In Omicsoft's Lands (ImmunoLand and OncoLand), we have pre-configured over 40 views for different data types, including RNA-Seq, DNA-Seq, miRNA-Seq, Copy Number Variation, Gene Expression Chip, Protein Expression, Methylation and hundreds of clinical measurements. While we design our Lands to be extremely powerful in providing visualizations with customizable gene, sample and project filters along with customizable graphical designs, we acknowledge that it sometimes takes time to explore the data. For some of our customers, admins or super users want to configure views specific to their company or group. Or, users in a specific research group may want to set up the customized views that are most commonly used for a specific disease or project. All of these customizations can be done through the Land custom view format.

How many steps does it take to display the expression of gene POLR3A at different stages of systemic sclerosis compared to normal control in study GSE58095? To draw a plot like the one below, the user needs to search for the gene POLR3A, click on the Expression | Expression Intensity view, filter for project GSE58095, change the grouping to disease category and make sure the color and scale use the preferred settings.

Expression Intensity of POLR3A in different stages of systemic sclerosis in study GSE58095.

After the user has made it to this view and feels it can be potentially very informative for his or her project, the user can save the view: 

Furthermore, if the user wants to share the view with the whole team and replicate this query for other genes or projects, the user can ask the admin to create custom views. To see how to create custom views in a Land, please check out our wiki page, Custom Views in Land, or contact us. For a standard user, the query can then be set up as a Custom View, or a selection of custom views can be grouped into a specific project folder, like this one for Scleroderma Projects:

Creating your own Lands for integration with OncoLand or ImmunoLand

Matt Newman

Free Land Creation

While many of our users are aware of the OncoLand and ImmunoLand datasets, not everyone might be aware of how easy it is to create your own Lands, and further integrate these with the public Lands (for instance with TCGA).

Omicsoft provides easy-to-use command line tools that can be used to import your own mutation data, copy number data, and RNA-Seq data into a Land created specifically for you or your dataset. These can then easily be combined with "virtual" lands to create a Land that allows visualization and querying of your data side-by-side with the public data. In order to do so, curation is the key requirement, as you must choose two columns for integration. In most cases, for OncoLand-based Lands, this will be Tumor Type and Sample Type (Primary Tumor, Normal, etc.), and for ImmunoLand this might be DiseaseState and Tissue.
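Since integration hinges on two curated columns, it can help to check your own sample annotations against the public vocabulary before building a Land. The sketch below is an illustrative pre-check only: the function name, the SampleID field, and the sample records are hypothetical, and it does not use or represent Omicsoft's command line tools.

```python
def check_integration_columns(private_samples, public_samples,
                              columns=("Tumor Type", "Sample Type")):
    """For each integration column, report private samples with no value and
    private values that have no counterpart in the public Land."""
    report = {}
    for column in columns:
        public_values = {row[column] for row in public_samples if row.get(column)}
        samples_missing_value = [
            row["SampleID"] for row in private_samples if not row.get(column)
        ]
        private_values = {row[column] for row in private_samples if row.get(column)}
        report[column] = {
            "samples_missing_value": samples_missing_value,
            "unmatched_values": sorted(private_values - public_values),
        }
    return report


# Hypothetical public annotations (for example, exported from a public Land).
public_samples = [
    {"SampleID": "PUB-1", "Tumor Type": "KIRC", "Sample Type": "Primary Tumor"},
    {"SampleID": "PUB-2", "Tumor Type": "KIRP", "Sample Type": "Normal"},
]

# Hypothetical internal annotations that still need curation.
private_samples = [
    {"SampleID": "P1", "Tumor Type": "KIRC", "Sample Type": "Primary Tumor"},
    {"SampleID": "P2", "Tumor Type": "kidney clear cell", "Sample Type": "Normal"},
    {"SampleID": "P3", "Tumor Type": "KIRP", "Sample Type": ""},
]

report = check_integration_columns(private_samples, public_samples)
for column, findings in report.items():
    print(column, findings)
```

Here "kidney clear cell" would need to be curated to the public term (KIRC) and sample P3 would need a Sample Type before the two datasets line up side by side.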

These tools are available for free with your subscription to either OncoLand or ImmunoLand. If you'd like to try building the Lands yourself, contact us with any questions on getting started.

Paid Land Creation

Many of our users prefer to have Omicsoft do the Land creation, including processing of their data through our pipelines (using either Omicsoft resources or the customer resources via VPN access).  This can be a way to get the benefit of internal Land creation, without having to invest any time in gaining expertise on how the process works.  If you're interested in seeing how we can help process your data, be it WGS, WXS, Targeted Sequencing, RNA-Seq, and more, contact us at

Choice of Gene Annotation on RNA-Seq Results

Matt Newman

Many users (and potential users) have asked us about our choice of gene annotation (or gene model, in Omicsoft lingo) in the OncoLand and ImmunoLand products. For the past three years, we've used an implementation of the UCSC gene model that we refer to as the "Omicsoft Gene Model". It consists of UCSC gene annotation + miRBase (for miRNAs) + the mitochondrial genes from Ensembl.

It was recently announced that UCSC will be moving to the GENCODE basic gene annotation for future incarnations of the gene annotation for their GRCh38 reference library. This is really good news for everyone, as it will hopefully simplify and standardize the reporting of transcript and gene IDs across publications, tools, etc.  It's something we are actively looking at for next year's releases of OncoLand and ImmunoLand as well (in addition to likely maintaining our current B37.3 and Omicsoft gene model results).

An interesting read on the effect of the gene annotation source on RNA-Seq can be found here:  In it, the author found that the source of gene annotation does have a profound effect on RNA-Seq alignment, gene expression calculations, and differential expression results.