In RNA-Seq datasets, fusion genes can be detected from both paired-end connectivity and single-end reads that span the fusion junction. In a paired-end NGS dataset, a discordant read pair is one whose mates do not align to the reference genome with the expected distance or orientation. If a set of discordant read pairs maps to two different genes, a fusion gene is suggested (Fusion PE detection). Single-end reads that span the fusion junction, on the other hand, provide base-pair evidence for the fusion event (Fusion SE detection). ArrayStudio/ArrayServer provides functions for both fusion detection algorithms, and the Land contains fusion views for both detection types.
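To make the paired-end idea concrete, here is a minimal Python sketch of how read pairs whose mates land in two different genes can be counted as fusion candidates. It is not the ArrayStudio or FusionMap implementation. The gene coordinates, the read pairs, the `min_pairs` threshold, and every name (`Gene`, `ReadPair`, `gene_at`, `candidate_fusions_pe`) are hypothetical and chosen only for illustration. The sketch checks only the "two different genes" condition and ignores distance and orientation.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


class ReadPair(NamedTuple):
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def gene_at(genes, chrom, pos) -> Optional[str]:
    """Return the name of the first gene containing (chrom, pos), or None."""
    for g in genes:
        if g.chrom == chrom and g.start <= pos <= g.end:
            return g.name
    return None


def candidate_fusions_pe(genes, pairs, min_pairs=2):
    """Count read pairs whose mates fall in two different genes.

    Returns {(gene_x, gene_y): n_pairs} for gene pairs supported by at
    least min_pairs read pairs. Gene pairs are stored in sorted order,
    so this step does not yet distinguish fusion direction.
    """
    support = Counter()
    for p in pairs:
        g1 = gene_at(genes, p.chrom1, p.pos1)
        g2 = gene_at(genes, p.chrom2, p.pos2)
        if g1 and g2 and g1 != g2:
            support[tuple(sorted((g1, g2)))] += 1
    return {pair: n for pair, n in support.items() if n >= min_pairs}


# Toy annotation and alignments (made-up coordinates, not real data).
GENES = [
    Gene("GENE_A", "chr1", 1000, 5000),
    Gene("GENE_B", "chr2", 20000, 30000),
]
PAIRS = [
    ReadPair("chr1", 1500, "chr2", 21000),  # mates in GENE_A and GENE_B
    ReadPair("chr2", 25000, "chr1", 4000),  # mates in GENE_B and GENE_A
    ReadPair("chr1", 1200, "chr1", 1400),   # both mates in GENE_A
    ReadPair("chr1", 3000, "chr3", 500),    # second mate outside any gene
]

print(candidate_fusions_pe(GENES, PAIRS))
```

In practice the supporting pairs would come from an aligner's output, and base-pair evidence from junction-spanning single-end reads would be used to confirm each candidate.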
In our recent analysis of the CCLE and TCGA datasets, we found a substantial number of gene pairs for which the fusions in both directions are expressed. Taking the BCR-ABL1 fusion as an example: after stringent filtering, eight cell lines express the BCR-ABL1 fusion and five cell lines express ABL1-BCR among CCLE's 780 RNA-Seq samples in CGHub. Here are the fusion RPKM views for both fusions in the CCLE-Land:
Inspection of the fusion junction-spanning reads in the Omicsoft Genome Browser shows that the fusion from ABL1 to BCR is very convincing. The cell line NALM1, for example, expresses both BCR-ABL1 and ABL1-BCR.
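Samples that express a fusion in both directions can be picked out of a table of fusion calls with a few lines of code. The following Python sketch assumes each call is a `(sample, 5' gene, 3' gene)` tuple. The function name `reciprocal_fusions`, the input layout, and the sample labels are hypothetical, and the toy calls are illustrative rather than actual CCLE or TCGA results.

```python
from collections import defaultdict


def reciprocal_fusions(calls):
    """Find samples in which a gene pair is fused in both directions.

    calls: iterable of (sample, five_prime_gene, three_prime_gene).
    Returns {sample: [(gene_x, gene_y), ...]} with each reciprocal gene
    pair listed once, in alphabetical order.
    """
    by_sample = defaultdict(set)
    for sample, g5, g3 in calls:
        by_sample[sample].add((g5, g3))

    result = {}
    for sample, fusions in by_sample.items():
        pairs = sorted(
            {
                tuple(sorted((g5, g3)))
                for g5, g3 in fusions
                if g5 != g3 and (g3, g5) in fusions
            }
        )
        if pairs:
            result[sample] = pairs
    return result


# Toy fusion calls (hypothetical sample labels).
CALLS = [
    ("sample_1", "BCR", "ABL1"),
    ("sample_1", "ABL1", "BCR"),
    ("sample_2", "BCR", "ABL1"),
    ("sample_3", "FGFR3", "TACC3"),
]

print(reciprocal_fusions(CALLS))
```

Only `sample_1` is reported here, because it is the only toy sample with calls in both orientations for the same gene pair.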
There are papers reporting bidirectional fusion gene expression, such as the following one for BCR and ABL1:
We also found expression of the ERG-TMPRSS2 fusion in TCGA samples, as well as the FGFR3-TACC3 and TACC3-FGFR3 fusions. In inter-chromosomal cases, fusions are formed in both directions by translocation (such as BCR-ABL1). Within the same chromosome, or between nearby genes, fusions in both directions are formed by tandem duplication (such as the FGFR3-TACC3 example). Here is a literature figure showing the FGFR3-TACC3 fusion:
- FusionMap: detecting fusion genes from next-generation sequencing data at base-pair resolution, Bioinformatics 27.14 (2011): 1922-1928.
- Wiki page for the best practice in Fusion gene detection in RNA-Seq