caBIG® Data Sharing Information
National Cancer Institute   U.S. National Institutes of Health www.cancer.gov

The caBIG program has been retired, and while this website is being maintained temporarily to prevent broken links and provide access to information on the subset of caBIG projects that were transitioned into the new NCIP program, it will be archived in the near future. For information on the NCI's biomedical informatics program, please visit http://ncip.nci.nih.gov.

The information and links on this website are no longer being updated and are provided for reference purposes only.

caBIG® Data Sharing Information

The caBIG® program facilitates the integration of diverse research and data across the "bench to bedside continuum" through an interoperable infrastructure and collaborative leadership. Appropriate data sharing is a key element of this collaboration. Benefits of data sharing include:

  • The large volumes of research data created by high-throughput genomics and proteomics technologies can best be harvested by teams of individuals, rather than by single PIs.
  • Collaboration across and within disciplines is required to leverage the broad knowledge and skill bases needed to realize the scientific and public health benefits of translational and personalized medicine.
  • Data sharing raises the visibility of individual studies and data collections; it opens new avenues of data dissemination and validation, and points to opportunities for future collaboration.
  • NIH grant applications requesting $500,000 or more in direct costs in any single year are expected to include a plan for data sharing.

Data Sharing Elements

Data sharing using caBIG® tools and infrastructure requires attention to three primary elements. These three elements apply regardless of whether you adopt a caBIG® compatible tool or adapt your tool to be interoperable. They include:

  • Semantic Interoperability: Ensuring that controlled standard vocabularies are used to define and describe the data to be shared in ways that other systems can process and understand. Visit caBIG compatibility for more.
  • Syntactic Interoperability: Connecting to caGrid through well-defined and interoperable application programming interfaces (APIs) so that both analytical and data services can be advertised and accessed by others as appropriate. Visit caGrid for more.
  • Legal, Regulatory, and Security Requirements: Implementation of appropriate policies, procedures and protections to both define and govern the sharing of data given a range of variables such as privacy requirements, intellectual property considerations, and institutional/sponsor restrictions.

The rest of this page focuses on Data Sharing Legal, Regulatory, and Security Requirements.


The caBIG® Data Sharing and Security Framework

The basic tool for caBIG® Data Sharing is the Data Sharing and Security Framework (DSSF), designed to facilitate appropriate data sharing between and among organizations by addressing legal, regulatory, policy, ethical, proprietary, contractual, and socio-cultural barriers. The current version of the framework is available from the DSIC Knowledge Center. The DSSF was established and is being further developed by the caBIG® Data Sharing and Intellectual Capital (DSIC) Workspace.

Understanding the Framework

  • You can use the caBIG® Data Sharing and Security Framework (DSSF) as a decision support tool to facilitate data sharing at your Center by determining which data can be shared and under which type of access and data security controls. To do so, you will need to assess the sensitivity of the data by using the Framework's four elements:
    • Economic/Proprietary/IP Value
    • Privacy/Confidentiality/Security Considerations
    • IRB or Institutional Restrictions
    • Sponsor Restrictions
  • The organization assesses the data to be shared along each of the four elements and assigns a low, medium, or high sensitivity rating for each element; these ratings drive the selection of the sharing mechanism.
  • The organization arrives at an overall level of sensitivity for the data by weighting the outcomes of the four elements according to its own judgment.
  • The outcome, that is, a low, medium or high sensitivity rating, determines how the organization wants to control access to that data.
  • The organization offering to share data determines the controls on access to that data by determining:
    • The level of certainty needed regarding the authentication of the identity of data users, and
    • Whether particular authenticated groups or individuals are authorized to access the particular data.
  • The levels of security attached to data sensitivities of various levels are informed by guidance from the National Institute of Standards and Technology (NIST).

Note: The assessment is discussed in more detail below.
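
The weighting step described above can be sketched in a few lines of Python. This is only an illustration: the Framework leaves the weighting to each organization's judgment, so the weighted-average rule, the element keys, and the function name overall_sensitivity below are all assumptions, not part of the DSSF.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical keys for the Framework's four elements.
ELEMENTS = (
    "economic_proprietary_ip_value",
    "privacy_confidentiality_security",
    "irb_or_institutional_restrictions",
    "sponsor_restrictions",
)


def overall_sensitivity(ratings, weights=None):
    """Combine the four element ratings into one overall rating.

    Assumed rule: a weighted average of the ratings, rounded half up
    to the nearest level. An organization could just as reasonably
    pick a different rule, such as taking the highest rating.
    """
    missing = [e for e in ELEMENTS if e not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    if weights is None:
        weights = {e: 1.0 for e in ELEMENTS}  # equal weights by default
    total = sum(weights[e] for e in ELEMENTS)
    score = sum(int(ratings[e]) * weights[e] for e in ELEMENTS) / total
    return Sensitivity(min(3, max(1, int(score + 0.5))))


ratings = {
    "economic_proprietary_ip_value": Sensitivity.LOW,
    "privacy_confidentiality_security": Sensitivity.HIGH,
    "irb_or_institutional_restrictions": Sensitivity.LOW,
    "sponsor_restrictions": Sensitivity.LOW,
}
print(overall_sensitivity(ratings).name)
```

An organization that treats privacy as the dominant concern could pass a larger weight for that element, which would pull the overall rating toward the privacy rating.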

Organizational Assessment Elements

Economic/Proprietary/IP Value

  • Relates to the Need for Protection. Sample Questions: Are the data subject to a restrictive license? Do the data relate to an invention report you have, or intend to file, with your institution? Is the study closed? Are the data or study findings awaiting publication?
  • The Framework asks you to select the category of proprietary value that best describes your data:
    • None/Low – Not subject to restrictive license or invention report
    • Medium – Data not yet submitted for publication
    • High – PHI, nonpublic intellectual property or other significant restriction

Privacy/Confidentiality/Security Considerations

  • Relates to Privacy and Security. Sample Question: Do federal or state laws or your institution's policies prohibit or restrict disclosure?
  • The Framework asks you to select the category of sensitivity that best describes your data:
    • Low Sensitivity - De-Identified/Anonymized Data Set
    • Medium - Coded/Limited Data Set
    • High Sensitivity - Identifiable Data

IRB or Institutional Restrictions

  • Relates to Human Subjects Research Considerations. Sample Question: Do your institution's or IRB's policies or the applicable informed consent documents explicitly or implicitly restrict or permit disclosure (e.g., "no commercial use")?
  • The Framework asks you to select the level of restriction that best describes your data:
    • Low - Explicit Permission for Registry Participation
    • Medium - Policy or Consent Limitations
    • High - Explicit Consent Limitations/Restrictions

Sponsor Restrictions

  • Relates to Restrictions from Sponsors in Grants and Contracts. Sample Question: Do the terms and conditions in any sponsored agreements prohibit or restrict disclosure outside the institution or to the Grid?
  • The Framework asks you to select the level of sponsor restriction that best describes your data:
    • Low - No Restrictions
    • Medium - Delays or Other Moderate Restrictions
    • High - Classified Research/Major Restrictions
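
For organizations that record assessments electronically, the four elements and their categories could be held in a simple lookup table. The Python sketch below copies the category descriptions from the lists above; the names RUBRIC and describe_assessment are hypothetical and are not part of any caBIG® tool.

```python
# Category descriptions taken from the four assessment elements above.
RUBRIC = {
    "Economic/Proprietary/IP Value": {
        "low": "Not subject to restrictive license or invention report",
        "medium": "Data not yet submitted for publication",
        "high": "PHI, nonpublic intellectual property or other significant restriction",
    },
    "Privacy/Confidentiality/Security Considerations": {
        "low": "De-identified/anonymized data set",
        "medium": "Coded/limited data set",
        "high": "Identifiable data",
    },
    "IRB or Institutional Restrictions": {
        "low": "Explicit permission for registry participation",
        "medium": "Policy or consent limitations",
        "high": "Explicit consent limitations/restrictions",
    },
    "Sponsor Restrictions": {
        "low": "No restrictions",
        "medium": "Delays or other moderate restrictions",
        "high": "Classified research/major restrictions",
    },
}


def describe_assessment(assessment):
    """Return one summary line per element for a completed assessment.

    `assessment` maps each element name to "low", "medium" or "high".
    A missing element raises KeyError; an unknown level raises ValueError.
    """
    lines = []
    for element, levels in RUBRIC.items():
        level = assessment[element].lower()
        if level not in levels:
            raise ValueError(f"unknown level {level!r} for {element}")
        lines.append(f"{element}: {level} ({levels[level]})")
    return lines


example = {
    "Economic/Proprietary/IP Value": "low",
    "Privacy/Confidentiality/Security Considerations": "medium",
    "IRB or Institutional Restrictions": "medium",
    "Sponsor Restrictions": "low",
}
for line in describe_assessment(example):
    print(line)
```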

Using the Framework as a Decision Support Tool

  • Using the Framework as a decision support tool can help you determine the structures and mechanisms needed to share the data under consideration:
    • General Website Terms of Use - No Restrictions on Access
    • Standardized Click-Through Terms and Conditions - Some Limitations on Access to the Data
    • Individually Negotiated Bi-Lateral or Multi-Lateral Agreement - More Restricted Access Conditions
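
As a rough illustration, the three mechanisms above can be treated as defaults for low, medium, and high overall sensitivity. That pairing is an assumption read off the parallel ordering of the lists on this page, and the names DEFAULT_MECHANISM and suggest_mechanism are hypothetical; the sketch therefore lets an organization's own choice override the default.

```python
# Assumed default pairing of overall sensitivity with a sharing mechanism.
DEFAULT_MECHANISM = {
    "low": "General Website Terms of Use",
    "medium": "Standardized Click-Through Terms and Conditions",
    "high": "Individually Negotiated Bi-Lateral or Multi-Lateral Agreement",
}


def suggest_mechanism(overall, local_choice=None):
    """Suggest a sharing mechanism for an overall sensitivity rating.

    `local_choice`, if given, always wins: the organization sharing
    the data decides which type of agreement fits its needs.
    """
    if local_choice is not None:
        return local_choice
    try:
        return DEFAULT_MECHANISM[overall.lower()]
    except KeyError:
        raise ValueError(f"unknown sensitivity rating: {overall!r}") from None


print(suggest_mechanism("medium"))
```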

Note: Each organization must select the type of agreement that best fits its needs. The Data Sharing and Security Framework is not a strict policy or guideline.


last modified 01-25-2013 01:23 AM