As Omicsoft continues to release new and updated Land products to meet high demand from our clients, we keep overcoming challenges, accumulating unique expertise, and establishing our leading role as a disease genomics data service and content provider. (If you are not familiar with our Lands, check out our ImmunoLand and CVMLand.)
Unlike the field of cancer genomics, where quite a few institutes and consortia provide large amounts of data (for example TCGA, CCLE, CGCI, ICGC and TARGET; for details, please check out our OncoLand), immunological, cardiovascular and metabolic disease genomics research has most of its data scattered across individual research studies. Public data repositories such as GEO (Gene Expression Omnibus) and SRA (Sequence Read Archive) collect data from the research community. Problems easily arise:
- Querying data across studies is time-consuming
- Data formats vary from study to study, making it difficult to understand the data or to perform cross-comparison and meta-analysis
- Data accuracy is uncertain, because human error can creep in during data uploading, archiving, processing and more
- Data processing and analysis are not standardized, but are instead often left to the choice of individual investigators
The complexity of the process means that there is a real need for someone to come in and clean up the data that is out there, and to do it properly. At Omicsoft, thanks to our experienced data curation and processing team, we aim to become the leading provider and data hub for public disease genomics research.
At Omicsoft, we have a team of more than 10 domain experts handling projects manually, from dataset selection and data processing to analysis:
We carefully select which projects to include, filtering out unrelated projects and any project that doesn't pass our curation standards.
We use controlled vocabularies and extract sample metadata into standardized fields and values. Very often, a member of our curation team needs to go back to either the public data repository or the authors of the primary article to clarify content or report errors, all to ensure the accuracy of our content.
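To give a rough feel for what mapping sample metadata onto a controlled vocabulary can look like, here is a minimal Python sketch. It is an illustration only, not Omicsoft's curation tool: the vocabulary `TISSUE_VOCAB`, the field name `tissue`, the sample records, and the helper names are all hypothetical. Values that cannot be matched are not guessed at; they are set aside for a curator to check against the repository or the original authors.

```python
# Hypothetical controlled vocabulary: canonical term -> accepted raw spellings.
TISSUE_VOCAB = {
    "whole blood": {"whole blood", "blood", "peripheral blood"},
    "liver": {"liver", "hepatic tissue"},
}


def build_lookup(vocab):
    """Invert the vocabulary into a raw-spelling -> canonical-term map."""
    lookup = {}
    for canonical, synonyms in vocab.items():
        for synonym in synonyms:
            lookup[synonym.lower()] = canonical
    return lookup


def standardize(samples, field, vocab):
    """Split samples into curated records and records needing manual review."""
    lookup = build_lookup(vocab)
    curated, needs_review = [], []
    for sample in samples:
        # Normalize whitespace and case before looking the value up.
        raw = str(sample.get(field) or "").strip().lower()
        if raw in lookup:
            curated.append({**sample, field: lookup[raw]})
        else:
            # Unknown value: leave it untouched and flag it for a curator.
            needs_review.append(sample)
    return curated, needs_review


# Made-up sample records, as they might appear across different studies.
samples = [
    {"sample_id": "S1", "tissue": "Peripheral Blood"},
    {"sample_id": "S2", "tissue": "hepatic tissue "},
    {"sample_id": "S3", "tissue": "PBMC?"},
]

curated, needs_review = standardize(samples, "tissue", TISSUE_VOCAB)
print(curated)
print(needs_review)
```

In this toy example, the first two samples are mapped to canonical terms, while the third is held back for manual follow-up rather than being forced into a category.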
We perform iterative editing to minimize and eliminate processing errors.
Our proprietary curation tool is a unique asset that keeps large-volume curation fast, accurate, efficient, and standardized.