software

Multi-Dendrix: (Multiple Pathway De novo Driver Exclusivity)
Multi-Dendrix project page

Multi-Dendrix is an algorithm for the simultaneous discovery of multiple driver pathways using only somatic mutation data from a cohort of samples. Multi-Dendrix uses an integer linear program to identify pathway sets such that each pathway contains genes with approximately mutually exclusive mutations and high coverage of the sample set. We describe Multi-Dendrix in a paper in submission:

M.D.M. Leiserson, D. Blokh, R. Sharan, B.J. Raphael. (2012) Simultaneous identification of multiple driver pathways in cancer. [In submission]

We have released Multi-Dendrix as a Python package that includes functions for subtype and network analysis of Multi-Dendrix results.
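
To make the objective concrete, here is a minimal sketch that scores a collection of gene sets on a toy mutation table and finds the best pair of disjoint sets by exhaustive search. The per-set score (samples covered minus overlapping mutations), the function names, and the toy data are illustrative assumptions, not the package's API. The released software solves the problem with an integer linear program rather than by enumeration.

```python
from itertools import combinations

# Toy data (hypothetical): gene -> set of samples in which it is mutated.
MUTATIONS = {
    "A": {"s1", "s2"},
    "B": {"s3", "s4"},
    "C": {"s1", "s3"},
    "D": {"s2", "s4"},
    "E": {"s1", "s2", "s3"},
}


def set_weight(genes, mutations):
    """Assumed score: samples covered minus overlapping (co-occurring) mutations."""
    covered = set().union(*(mutations[g] for g in genes))
    total = sum(len(mutations[g]) for g in genes)
    overlap = total - len(covered)
    return len(covered) - overlap


def best_disjoint_pair(mutations, k):
    """Exhaustively search for two disjoint gene sets of size k with maximal summed weight."""
    genes = sorted(mutations)
    best, best_score = None, float("-inf")
    for first in combinations(genes, k):
        rest = [g for g in genes if g not in first]
        for second in combinations(rest, k):
            if first > second:  # skip the mirrored ordering of the same pair
                continue
            score = set_weight(first, mutations) + set_weight(second, mutations)
            if score > best_score:
                best, best_score = (first, second), score
    return best, best_score


pair, score = best_disjoint_pair(MUTATIONS, 2)
```

Enumeration like this is only feasible for a handful of genes; it is meant to show what is being optimized, not how to optimize it at scale.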

Download the release on GitHub: Multi-Dendrix (Version 1.0, January 28, 2013)

Dendrix: (De novo Driver Exclusivity)
Dendrix project page

Dendrix web server

Dendrix is an algorithm for discovery of mutated driver pathways in cancer using only mutation data. It finds sets of genes, domains, or nucleotides whose mutations exhibit both high coverage and high exclusivity in the analyzed samples. This algorithm is described in the paper:

F. Vandin, E. Upfal, B.J. Raphael. (2012) De novo Discovery of Mutated Driver Pathways in Cancer. Genome Research. 22(2):375-85. Epub 2011 Jun 7. PDF Preprint Publisher Link [Preliminary version accepted at 15th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2011)]
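
As a rough illustration of the two properties named above, the sketch below measures coverage and exclusivity for a candidate gene set. The definitions used here (fraction of samples covered, and fraction of covered samples hit by exactly one gene in the set), the function name, and the toy cohort are assumptions made for this example; they are not taken from the Dendrix code.

```python
def coverage_and_exclusivity(genes, mutations, n_samples):
    """Return (coverage, exclusivity) for a gene set.

    coverage    = fraction of all samples with at least one mutated gene in the set
    exclusivity = fraction of covered samples with exactly one mutated gene in the set
    """
    hits = {}  # sample -> number of genes in the set mutated in that sample
    for gene in genes:
        for sample in mutations.get(gene, ()):
            hits[sample] = hits.get(sample, 0) + 1
    covered = len(hits)
    exclusive = sum(1 for count in hits.values() if count == 1)
    coverage = covered / n_samples
    exclusivity = exclusive / covered if covered else 0.0
    return coverage, exclusivity


# Hypothetical toy cohort of five samples (s5 carries no mutation in these genes).
toy = {"g1": {"s1", "s2"}, "g2": {"s3"}, "g3": {"s2", "s4"}}
result = coverage_and_exclusivity(["g1", "g2", "g3"], toy, n_samples=5)
```

A set that scores well on both measures covers most samples while rarely hitting the same sample twice.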

To download Dendrix see the Dendrix project page

 

HotNet: Finding Altered Subnetworks
HotNet project page

HotNet is an algorithm for finding significantly altered subnetworks in a large gene interaction network. This algorithm is described in the paper:

F. Vandin, E. Upfal, B.J. Raphael. (2011) Algorithms for Detecting Significantly Mutated Pathways in Cancer. Journal of Computational Biology. 18(3):507-22.

[PDF] Publisher Link.

[A preliminary version of the paper appeared in the Proceedings of the 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2010). [PDF]]

To download HotNet, see the HotNet project page.

 

HotNet and Dendrix Visualization (Cytoscape plug-in)

A Cytoscape plug-in for viewing HotNet and Dendrix results.

Download coming soon.

NBC: Neighborhood Breakpoint Conservation

This software finds recurrent rearrangement breakpoints in DNA copy number data. The algorithm is described in the paper:

A. Ritz, P.L. Paris, M.M. Ittmann, C. Collins, and B.J. Raphael. (2011) Detection of Recurrent Rearrangement Breakpoints from Copy Number Data. BMC Bioinformatics. Publisher Link

Gremlin: Genome Rearrangement Explorer with Multi-Scale, Linked Interactions:

This is an interactive visualization model for the comparative analysis of structural variation in human and cancer genomes. The model is described in the following paper:

T.M. O'Brien, A. Ritz, B.J. Raphael, and D.H. Laidlaw. (2010) Gremlin: An Interactive Visualization Model for Analyzing Genomic Rearrangements. IEEE Transactions on Visualization and Computer Graphics. vol.16, no.6, pp.918-926. Publisher Link

Geometric Analysis of Structural Variants (GASV and GASVPro)

Software for analysis of structural variation from paired-end sequencing and/or array-CGH data. This software has been used to find structural variation in both normal and cancer genomes using data from a variety of next-generation sequencing platforms. It can be used to predict structural variants directly from aligned reads in SAM/BAM format.

GASVPro

GASVPro is a probabilistic version of our original GASV algorithm. GASVPro combines two common signals of structural variation, read depth information and discordant paired-read mappings, into a single probabilistic model. When multiple alignments of a read are given, GASVPro uses a Markov Chain Monte Carlo procedure to sample over the space of possible alignments.
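
The discordant paired-read signal mentioned above can be illustrated with a small classifier. This is a simplified sketch under stated assumptions: the `PairedRead` record, the `classify_pair` function, the label names, and the `lmin`/`lmax` fragment-length bounds are hypothetical and are not GASV's or GASVPro's interface. It also assumes a library in which a concordant pair maps forward then reverse on the same chromosome.

```python
from typing import NamedTuple


class PairedRead(NamedTuple):
    """Simplified paired-read mapping (hypothetical record, not the SAM/BAM schema)."""
    chrom1: str
    pos1: int
    forward1: bool
    chrom2: str
    pos2: int
    forward2: bool


def classify_pair(pair, lmin, lmax):
    """Label a mapping as concordant, or by the kind of discordance it suggests."""
    if pair.chrom1 != pair.chrom2:
        return "translocation"
    # Order the two ends by position so the left end comes first.
    (left_pos, left_fwd), (right_pos, right_fwd) = sorted(
        [(pair.pos1, pair.forward1), (pair.pos2, pair.forward2)]
    )
    if left_fwd == right_fwd:
        return "inversion"  # both ends on the same strand
    if not left_fwd and right_fwd:
        return "divergent"  # ends point away from each other
    span = right_pos - left_pos
    if span > lmax:
        return "deletion"  # ends map farther apart than the library allows
    if span < lmin:
        return "insertion"  # ends map closer together than the library allows
    return "concordant"
```

For example, `classify_pair(PairedRead("chr1", 100, True, "chr1", 5000, False), 200, 500)` is labeled a candidate deletion because the mapped span exceeds `lmax`.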

GASVPro is available at the GASV GoogleCode site (Download). We also provide an Example Data Set for analysis with GASVPro.

The GASVPro algorithm is described in the following paper.

S. Sindi, S. Onal, L. Peng, H. Wu and B.J. Raphael. (2012) An Integrative Model for Identification of Structural Variation in Sequencing Data. Genome Biology (In Press)

GASV

The original GASV method is described in the following paper:

S. Sindi, E. Helman, A. Bashir, B.J. Raphael. (2009) A Geometric Approach for Classification and Comparison of Structural Variants. Bioinformatics. 25: i222-i230. (Special issue for the Joint 17th Annual International Conference on Intelligent Systems in Molecular Biology and 8th Annual International European Conference on Computational Biology (ISMB/ECCB 09)). Publisher Link

Old versions. These are for archival purposes. It is recommended to download the latest version from the link above.

  • Version 1.4 (3/5/2010). Download
  • Version 1.3 (1/19/2010). Download
  • Example BAM file
  • Version 1.2 (11/30/2009). Download: software
  • New in Version 1.4: Release notes.
  • New in Version 1.3: New output formats, streamlining of BAM file handling, bug fixes.
  • New in Version 1.2 (11/30/2009): Improved handling of SAM/BAM alignment files, speed improvements, maxCliqueSize option.
  • New in Version 1.1: a preprocessor for SAM/BAM files, aCGH comparison, fusion gene detection, and more.

Motif Description Length (MoDL):

MoDL finds multiple motifs in a set of phosphorylated peptides, and is described in the following paper:

A. Ritz, G. Shakhnarovich, A.R. Salomon, and B. Raphael. (2009) Discovery of Phosphorylation Motif Mixtures in Phosphoproteomics Data. Bioinformatics. 25(1):14-21. Publisher Link

Paired-End Reconstruction of Genome Organization (PREGO):
Structural Variation Project Page

This algorithm reconstructs a cancer genome as a rearrangement of segments, or intervals, from the reference genome using paired-end sequencing data. The algorithm is described in the following paper:

L. Oesper, A. Ritz, S.J. Aerni, R. Drebin, and B.J. Raphael. (2012) Reconstructing cancer genomes from paired-end sequencing data. BMC Bioinformatics. 13(Suppl 6):S10. Publisher Link.

[Preliminary version accepted at 2nd Annual RECOMB Satellite Workshop on Massively Parallel Sequencing (RECOMB-seq)]

Old versions. These are for archival purposes. It is recommended to download the latest version from the link above.

Tumor Heterogeneity Analysis (THetA)

This algorithm estimates tumor purity and clonal/subclonal copy number aberrations directly from high-throughput DNA sequencing data. We describe this algorithm in a paper currently in submission:

L. Oesper, A. Mahmoody, and B.J. Raphael. (2013) Inferring Intra-Tumor Heterogeneity from High-Throughput DNA Sequencing Data. [In submission]

[Preliminary version accepted at 17th Annual International Conference on Research in Computational Molecular Biology (RECOMB). Extended Abstract]
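
The relationship between tumor purity and copy number can be sketched with a simple two-population mixture. This is an illustrative assumption, not the THetA model or code: the function names are hypothetical, the sample is treated as a mix of normal cells and a single tumor population, and normalization for overall genome content is ignored.

```python
def expected_depth_ratio(purity, tumor_copy_number, normal_copy_number=2):
    """Expected tumor/normal read-depth ratio for one interval in a mixed sample."""
    mixed = (1 - purity) * normal_copy_number + purity * tumor_copy_number
    return mixed / normal_copy_number


def purity_from_ratio(ratio, tumor_copy_number, normal_copy_number=2):
    """Invert the mixture for purity, assuming the tumor copy number is known."""
    if tumor_copy_number == normal_copy_number:
        raise ValueError("purity is not identifiable when copy numbers are equal")
    return normal_copy_number * (ratio - 1) / (tumor_copy_number - normal_copy_number)


# A single-copy loss in a sample that is 60% tumor cells (hypothetical values).
ratio = expected_depth_ratio(purity=0.6, tumor_copy_number=1)
recovered = purity_from_ratio(ratio, tumor_copy_number=1)
```

In practice neither purity nor copy number is known in advance, which is why both must be inferred jointly from the data.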

We offer a pre-release Beta version of this software for download. The full release with source code will be available upon paper acceptance. Contact us for further information.

Download the pre-release: pre-release, January 22, 2013

Old versions. These are for archival purposes. It is recommended to download the latest version from the link above.