For pricing and inquiries, send an email to

5001 Weston Parkway, Suite 201
Cary, NC 27513



Omicsoft is a leading provider of bioinformatics solutions for Next Generation Sequencing data and gene expression analysis, with a focus on cancer genomics and immunology.

Genetic Variant Annotation

Pharmacogenomics (PGx) and genetics pipelines, data management, and a variant annotation and search engine (gene- or position-based).

GeneticsLand's variant annotation and management functionality provides a turnkey solution for gathering and annotating your genetics results data using a variety of high-impact public resources. The system can handle hundreds of millions of rows and quickly filter your input dataset down to the SNPs and genetic variants of interest.

Example queries:

  • Show genetic variants classified as likely pathogenic in the ClinVar database, limiting the search to variants in coding exons.
  • Show rare genetic variants with population frequencies below 5% in the 1000 Genomes, ExAC, and ESP6500 projects, limiting the search to variants that show some association in public GWAS results (GRASP2).
  • Annotate association analysis (GWAS) results, quickly showing particular genetic variants in a locus of interest along with linkage disequilibrium (LD) plots.
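
The first two example queries can be pictured as simple filters over annotated variant records. The sketch below is only an illustration in plain Python and is not GeneticsLand's query interface; the `Variant` record, its field names, and both filter functions are hypothetical, and treating a missing frequency as 0.0 (unobserved) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical annotated-variant record; field names are illustrative only.
@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    clinvar: Optional[str] = None      # ClinVar significance, e.g. "Likely pathogenic"
    region: Optional[str] = None       # e.g. "exonic" or "intronic"
    freqs: Dict[str, float] = field(default_factory=dict)  # project -> allele frequency
    gwas_hit: bool = False             # True if some public GWAS association exists

FREQ_PROJECTS = ("1000Genomes", "ExAC", "ESP6500")

def likely_pathogenic_coding(variants: List[Variant]) -> List[Variant]:
    """Keep variants marked likely pathogenic in ClinVar that fall in coding exons."""
    return [
        v for v in variants
        if v.clinvar == "Likely pathogenic" and v.region == "exonic"
    ]

def rare_with_gwas(variants: List[Variant], max_af: float = 0.05) -> List[Variant]:
    """Keep GWAS-associated variants whose frequency is below max_af in every project.

    A project with no recorded frequency is treated as 0.0 (assumption).
    """
    return [
        v for v in variants
        if v.gwas_hit
        and all(v.freqs.get(p, 0.0) < max_af for p in FREQ_PROJECTS)
    ]
```

A production system handling hundreds of millions of rows would push these predicates into an indexed store rather than scanning a Python list; the sketch only shows the logic of each query.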

Variant Annotation Sources

  • Basic Information
    • Genomic coordinates and reference/alt
    • dbSNP/dbVar
    • Gene
    • Protein/Domain
  • Function, Regulation, and Conservation
    • Functional predictions
    • Conservation
    • RegulomeDB
    • HaploReg
    • eQTL from GTEx
  • Allele frequencies
    • 1000 Genomes
    • ExAC
    • ESP6500
    • Internal/Customer sources
  • Clinical, Disease, Drug
    • GWAS findings
    • ClinVar

Data Types

  • VCF files
  • Association reports
  • PLINK bed files
  • eQTL data
  • RS IDs

Technological Breakthrough

  • Each annotation source uses a different ID, so there is no "gold standard" way to normalize variants across all data sources
    • RS ID is not comprehensive; it does not contain both the reference and alternative alleles, is not unique, and is not stable (it can change from version to version)
    • HGVS genomic ("g.") notation is comprehensive but inefficient, and is not normally included in variant data sources
    • Chromosome + position + reference + alternative (as in a VCF) is comprehensive, but not efficient for searching
  • Unified variant coding
    • Internally stored code representing genomic coordinates, reference and alternative allele
    • Allows matching of all variant sources to VCF/input files
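
The idea of a unified variant code can be sketched in a few lines of Python. The actual internal encoding used by GeneticsLand is not described here, so everything below is an assumption made for illustration: the `variant_key` and `annotate` functions are hypothetical, the byte layout (1-byte chromosome code, 4-byte big-endian position, then the alleles) is one possible choice, and the sample records are made-up toy data. The sketch also does not left-align or trim indels, which a real normalizer would need to do.

```python
import struct
from typing import Dict, Iterable, List, Tuple

# Numeric chromosome codes so that keys sort in genomic order (assumed scheme).
_CHROM_CODES = {str(i): i for i in range(1, 23)}
_CHROM_CODES.update({"X": 23, "Y": 24, "MT": 25})

def variant_key(chrom: str, pos: int, ref: str, alt: str) -> bytes:
    """Build a compact key from genomic coordinates plus ref/alt alleles.

    Big-endian packing makes byte-wise ordering match (chromosome, position)
    ordering, so sorted keys support range scans over a locus.
    """
    name = chrom.upper()
    if name.startswith("CHR"):
        name = name[3:]            # "chr1" and "1" map to the same key
    if name == "M":
        name = "MT"
    code = _CHROM_CODES[name]      # raises KeyError for unknown contigs
    alleles = ref.upper().encode("ascii") + b">" + alt.upper().encode("ascii")
    return struct.pack(">BI", code, pos) + alleles

def annotate(
    variants: Iterable[Tuple[str, int, str, str]],
    sources: Dict[str, Dict[bytes, str]],
) -> List[Dict[str, str]]:
    """Look up each input variant in every annotation source by its key."""
    rows = []
    for chrom, pos, ref, alt in variants:
        key = variant_key(chrom, pos, ref, alt)
        rows.append({name: table[key] for name, table in sources.items() if key in table})
    return rows

# Made-up toy annotation tables, each keyed by the same unified code.
toy_sources = {
    "clinvar": {variant_key("1", 1000, "A", "G"): "Likely pathogenic"},
    "gwas": {variant_key("chr1", 1000, "A", "G"): "associated"},
}
toy_result = annotate([("chr1", 1000, "a", "g"), ("2", 5, "C", "T")], toy_sources)
```

Because every source is re-keyed once at load time, matching an input VCF row against all sources becomes a single exact-key lookup per source instead of a comparison across differing ID schemes.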