National Cancer Institute U.S. National Institutes of Health

The caBIG program has been retired, and while this website is being maintained temporarily to prevent broken links and provide access to information on the subset of caBIG projects that were transitioned into the new NCIP program, it will be archived in the near future. For information on the NCI’s biomedical informatics program, please visit

caBIG® Community Code Resource Directory

This directory is intended to promote the exchange of community-developed digital capabilities supporting cancer research. The resources in this compilation are recommended by the community and will grow quickly to include capabilities across many biomedical domains. We encourage you to check back frequently as new resources are added.


| Name | Category | Attribution | Description |
| --- | --- | --- | --- |
| Annotation Imaging Markup (AIM) Application Programming Interface (API) | Imaging | Daniel Rubin, Stanford | The Annotation Imaging Markup (AIM) API allows users of commercial and open-source imaging workstations to collect, manage, and store structured information and to associate that information with imaging features in a platform-independent manner. It enables researchers to collect case report information in the context of radiology studies, and to integrate that data with images as the basis for analysis and Computer Aided Diagnosis (CAD). |
| Bayesian Analysis of COpy number Mixtures (BACOM) Application Programming Interface (API) | Integrative Cancer Research | Yue Wang, Virginia Tech | Bayesian Analysis of COpy number Mixtures (BACOM) is a statistically principled in silico approach to accurately estimate genomic deletions and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells. We have developed a cross-platform, open-source Java application that implements the whole pipeline of copy number analysis of heterogeneous cancer tissues and other relevant processing steps. We also provide an R interface, bacomR, for running BACOM within the R environment, through which users can smoothly incorporate BACOM into their specific analyses. |
| BioMixer | Infrastructure | Margaret-Anne Storey and Bo Fu, University of Victoria, British Columbia, Canada | BioMixer supports ontology and ontology mapping visualizations in collaborative settings. In particular, BioMixer supports social interaction around the visualization. A user can send an existing visualization workspace to collaborators via email, as well as initiate discussions by adding notes to the visualization. BioMixer is accessible using a web browser and does not require the download or installation of any software. The tool supports the publication of visualizations by providing interactive visualizations that can be easily inserted into external websites. BioMixer supports users with diverse backgrounds and preferences by presenting multiple coordinated views, which aim to engage the audience from different viewpoints. |
| caArray Importer | Integrative Cancer Research | Matthew Eldridge, Cancer Research UK, Cambridge Research Institute | The caArray Importer extends the caArray Java API to provide programmatic access to retrieval of imported array designs; creation, modification, and deletion of projects; upload, validation, and import of array data files; modification of file type; and file removal. |
| caRuby | Biobanking | Fred Loney, Oregon Health & Science University | The goal of caRuby is to reduce barriers to adopting caBIG tools. caRuby simplifies interaction with caBIG® application services by presenting a JRuby caBIG façade that supports the following: migration from legacy systems, incremental update from source applications, extraction from a caBIG database, utility administrative tasks, workflow data transformations, lightweight web services, and site-specific user interfaces. There is a current implementation for caTissue. |
| Copy Number Modules | Integrative Cancer Research | Lee Cooper and Carlos Moreno, Emory University | This collection of software modules, developed by the Emory In Silico Research Centers of Excellence (ISRCE), defines a pipeline for calculating copy number data for the REMBRANDT project. The data created by this pipeline was used to perform a GISTIC analysis to define significant copy number alterations in the transcriptionally defined tumor subtypes identified by TCGA. This pipeline operates on the raw Affymetrix 100K SNP CEL files and produces a list of altered regions for each sample. |
| Differential Dependency Network (DDN) | Integrative Cancer Research | Yue Wang, Virginia Tech | DDN (Differential Dependency Network) is an analytical tool for detecting and visualizing statistically significant topological changes in transcriptional networks representing two biological conditions. Developed under caBIG's In Silico Research Centers of Excellence (ISRCE) Program, DDN enables differential network analysis and provides an alternative way of defining network biomarkers predictive of phenotypes. DDN also serves as a useful systems biology tool for users across biomedical research communities to infer how genetic, epigenetic, or environmental variables affect biological networks and clinical phenotypes. Besides the standalone Java application, we have also developed a Cytoscape plug-in, CytoDDN, to integrate network analysis and visualization seamlessly. |
| GLU: Genotype Library and Utilities | Integrative Cancer Research | Kevin Jacobs, National Cancer Institute | Whole-genome association studies are generating unprecedented amounts of genotype data, frequently billions of genotypes per study, and require new and scalable computational approaches to address challenges involving storage, management, quality control, and genetic analysis. GLU is a framework and a software package that was designed around a set of novel conceptual approaches. GLU addresses the need for general and powerful tools that can scale to effectively handle trillions of genotypes. |
| LabKey Server | Integrative Cancer Research | LabKey Software Foundation | LabKey Server is open source software that helps scientists organize, analyze, and share biomedical research data. It is a secure, web-based data management platform that provides a flexible and scalable foundation for building applications customized to researchers' protocols, analysis tools, and data sharing requirements. |
| Microscopy Image Segmentation | Imaging | Jun Kong and Lee Cooper, Emory University | Developed by the Emory In Silico Research Center of Excellence (ISRCE), these tools support the segmentation of nuclei and angiogenesis regions in digitized pathology slide images. |
| Microscopy Image Segmentation - Grid Services | Imaging | Tahsin Kurc and Jun Kong, Emory University | Developed by the Emory In Silico Research Center of Excellence (ISRCE), these caGrid services support the segmentation of nuclei and angiogenesis regions in digitized pathology slide images. |
| Pathology Analytical Imaging Standards (PAIS) | Imaging | Fusheng Wang, Emory University | As part of Emory's In Silico Research Centers of Excellence (ISRCE) project, we have designed a data model and a database, PAIS, to address the data management requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines. The data model represents virtual slide-related image, annotation, markup, and feature information. This set of information includes: a) metadata about images; b) context relating to specimens; c) human observations involving pathology classification and characteristics; d) algorithm- and human-described segmentations, features, and classifications; and e) a description of the computation being carried out as well as identification of input and output datasets. |
| Phenotypic Up-regulated Gene Support Vector Machine (PUG-SVM) | Integrative Cancer Research | Yue Wang, Virginia Tech | PUGSVM (Phenotypic Up-regulated Gene Support Vector Machine) is an analytical tool for multiclass gene selection and classification. Developed under caBIG's In Silico Research Centers of Excellence (ISRCE) Program, PUGSVM addresses the problems of imbalanced class separability, small sample size, and high gene space dimensionality. Multiclass gene markers are defined by the union of one-versus-everyone phenotypic up-regulated genes and used by a well-matched one-versus-rest support vector machine. PUGSVM provides a simpler yet more accurate strategy to identify statistically reproducible mechanistic marker genes for characterization of heterogeneous diseases. |
| QI-Bench | Imaging | Andrew Buckler, Buckler Biomedical Sciences | We provide open-source informatics tooling used to characterize the performance of quantitative medical imaging as needed to advance the field. These tools may be deployed internal to an organization or used for collaborative work across organizations. The data on which they work may be accessible only to identified individuals, or more broadly in an open archive, to suit the specific project purpose. |
| Region Classification Algorithms and Tools | Imaging | Sharath Cholleti, Emory University | Developed by the Emory In Silico Research Centers of Excellence, these tools use a texton-based classification algorithm to classify an RGB image into tumor and normal regions. |
| Region Classification Algorithms and Tools - Parallel Machines | Imaging | Sharath Cholleti, Emory University; Patrick Widener, Sandia National Lab | Developed by the Emory In Silico Research Centers of Excellence, these tools use a texton-based classification algorithm to classify an RGB image into tumor and normal regions. |
| swBIG | Infrastructure | James McCusker, Yale University | swBIG is a web service that lets users treat caBIG data services as Linked Data. swBIG is available as a prototype RESTful service that converts requests for resources from linked data URIs to caGrid service calls to requisite grid endpoints. This service uses a representation of NCI Thesaurus converted to a SKOS representation using OWLtoSKOS. This representation provides the ability to reason over concepts as instances in property value sets as well as in conceptual models. |
| XcaCORE | Biobanking | Gunther Schadow, Regenstrief Institute | The XcaCore interface can be used with any UML model. It is the UML model that determines the XML schema in a straightforward manner, and the XcaCore functions can be added to any such schema. For use with caTissue, XcaCore is used to programmatically move data in and out of the caTissue Core system by transforming data into the structure defined by the caTissue UML model. The XSLT script in xcaCORE addresses the API calls. |

The U.S. Government, NIH, NCI CBIIT, and their employees and contractors do not make any warranty, express or implied, including warranties of fitness for a particular purpose, with respect to resources or tools listed in the caBIG® Community Code Resource Directory. Links to other Internet sites are provided only for the convenience of users; NIH and NCI CBIIT are not responsible for the availability or content of external sites. See the complete disclosure statement.

NCI CBIIT does not provide support for the resources and tools listed in the caBIG® Community Code Resource Directory.  Users should contact the code submitter for support via the relevant link listed in the Directory. To recommend a community-code resource for inclusion in this directory or to provide comments on this page, please e-mail Application Support at