Omicsoft is the leading provider of Next Generation Sequencing, Cancer Genomics, Immunology, and Bioinformatics solutions for Next Generation Sequencing Data and Gene Expression Analysis.

Exciting Updates and Latest News

Keeping you up-to-date with the latest in NGS, Bioinformatics Analysis, and cancer genomics with blogs on Array Suite, OncoLand (TCGA and more), ImmunoLand, and more.

[Land] Behind the Scenes: Omicsoft Land processing and curation

Vivian Zhang

As Omicsoft continues to release new and updated Land products to meet high demand from our clients, we keep overcoming challenges, accumulating unique expertise and establishing our leading role as a disease genomics data service and content provider. (If you are not familiar with our Lands, check out our ImmunoLand and CVMLand.)

Unlike the field of cancer genomics, where quite a few institutes and consortia provide large amounts of data (for example TCGA, CCLE, CGCI, ICGC and TARGET; for details, please check out our OncoLand), immunological, cardiovascular and metabolic disease genomics research has most of its data scattered across individual research studies. Public data repositories such as GEO (Gene Expression Omnibus) and SRA (Sequence Read Archive) collect data from the research community. Problems easily arise:

  • Querying data across studies is time-consuming
  • Data formats vary from study to study, making it difficult to understand the data or to perform cross-comparison and meta-analysis
  • Data accuracy is uncertain, with human error in the process of data uploading, archiving, processing and more
  • Data processing and analysis are not standardized and are often left to the choice of individual investigators

The complexity of the process means that there is a real need for someone to come in and clean up the data that is out there, and to do it properly. At Omicsoft, thanks to our experienced data curation and processing team, we aim to become the leading provider and data hub for public disease genomics research. 

At Omicsoft, we have a team of more than 10 domain experts handling projects manually, from dataset selection and data processing to analysis:

[Figure: Omicsoft Land Data Processing Workflow]

We carefully select which projects to include, filtering out unrelated projects and any project that doesn't pass our curation standards.


We use controlled vocabularies and extract sample metadata into standardized fields and inputs. Very often, our curation team members need to go back to either the public data repository or the author of the primary article to clarify content or report errors, all to ensure the accuracy of our content.
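As a rough illustration of the controlled-vocabulary idea described above, the sketch below maps free-text sample metadata onto standard terms and collects anything it cannot resolve for follow-up with the repository or author. It is not Omicsoft's curation tool: the field names, the vocabulary entries and the `curate_sample` function are all hypothetical and chosen only for illustration.

```python
# Hypothetical controlled vocabulary (illustrative only): each field maps
# lower-cased free-text variants to a single standardized term.
CONTROLLED_VOCAB = {
    "tissue": {
        "pbmc": "PBMC",
        "peripheral blood mononuclear cells": "PBMC",
        "blood": "Blood",
        "whole blood": "Blood",
    },
    "gender": {
        "m": "Male",
        "male": "Male",
        "f": "Female",
        "female": "Female",
    },
}


def curate_sample(raw):
    """Standardize one sample's metadata.

    Returns (curated, issues): `curated` holds the fields that matched the
    vocabulary, `issues` lists fields that are missing or unrecognized and
    would need manual review.
    """
    curated = {}
    issues = []
    for field, vocab in CONTROLLED_VOCAB.items():
        value = raw.get(field)
        if value is None:
            issues.append(f"{field}: missing")
            continue
        # Normalize case and collapse runs of whitespace before lookup.
        key = " ".join(str(value).lower().split())
        if key in vocab:
            curated[field] = vocab[key]
        else:
            issues.append(f"{field}: unrecognized value {value!r}")
    return curated, issues


if __name__ == "__main__":
    sample = {"tissue": " Peripheral  Blood Mononuclear Cells", "gender": "F"}
    print(curate_sample(sample))
```

Values that fail the lookup are reported rather than guessed, which mirrors the manual step of going back to the data source when metadata is ambiguous.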

We perform iterative editing to minimize processing errors.

Our proprietary curation tool is a unique asset that enables fast, accurate, efficient and standardized curation of large volumes of data.

For more details, please refer to our wiki page on the Omicsoft DiseaseLand Curation Pipeline.

Please contact us if you have questions or suggestions.