MSigDB v5.0 (Mar 2015)

New collection H: Hallmark signatures#

H: Hallmarks is a new collection of 50 gene sets. These gene sets represent specific, well-defined biological states or processes and display coherent expression. The hallmark gene sets were generated by a computational methodology based on identifying gene set overlaps and extracting coherent representatives of them. Details of the procedure will become available after the manuscript describing it is accepted for publication. The hallmark gene sets reduce noise and redundancy and provide a better biological space for GSEA and other gene set-based analyses of genomic data.

We envision this collection as the starting point for exploring the MSigDB resource and GSEA. This initial release of 50 hallmarks condenses information from over 4,000 original overlapping gene sets from v4.0 MSigDB collections C1 through C6. We refer to the original gene sets as “founder” sets.

Hallmark gene set pages provide links to the corresponding founder sets for more in-depth exploration. In addition, hallmark gene set pages include links to the microarray data that were used to refine and validate the hallmark signatures.

Updates to C2 Collection#

C2:CP Matrisome Gene Sets#

The CP (Canonical Pathways) sub-collection has 10 new gene sets from the Matrisome Project. The "matrisome" refers to the ensemble of genes encoding extracellular matrix (ECM) and ECM-associated proteins (as defined by Naba and collaborators). The Matrisome Project is a collaborative effort between the laboratory of Richard Hynes at MIT, researchers at the Barbara K. Ostrom (1978) Bioinformatics & Computing Facility at the Koch Institute at MIT and the Broad Institute, pursuing extensive in silico and experimental characterization of ECM components.

Updates to C2:CGP Collection#

In response to requests from multiple users of our resource, we removed all 7 gene sets based on the publication in Nat Med 2006 by Potti et al., which has been retracted.

Alerted by sharp-eyed users of MSigDB, we redefined four gene sets based on the publication in Cancer Cell 2010 by Verhaak et al.

At the request of Dr. Durand, we have updated the records of two gene sets he contributed earlier.

Fixed errors in a number of other gene sets.

Changes in the XML File Format#

To accommodate new features in the Hallmarks collection, we have introduced additional attributes for gene set description in the database XML format. The new attributes are:

Viewing Previous Versions of MSigDB#

Files from previous versions of MSigDB (v4.0, v3.1, v3.0, v2.5, v2.1 and v1.0) are archived and available on the Downloads page. You can view them through the MSigDB Browser tool in the GSEA desktop application.

Gene Set Corrections#

The following table only includes gene sets from v4.0 with direct counterparts in v5.0, or sets that became deprecated in v5.0.

C2 Collection#

| v4.0 (old) | v5.0 (new) | notes |
| --- | --- | --- |
| PECE_MAMMARY_STEM_CELL_UP | PECE_MAMMARY_STEM_CELL_UP | changed members |
| PECE_MAMMARY_STEM_CELL_DN | PECE_MAMMARY_STEM_CELL_DN | changed members |
| KOINUMA_COLON_CANCER_MSI_UP | KOINUMA_COLON_CANCER_MSI_UP | changed members |
| VERHAAK_GLIOBLASTOMA_PRONEURAL | VERHAAK_GLIOBLASTOMA_PRONEURAL | changed members |
| VERHAAK_GLIOBLASTOMA_NEURAL | VERHAAK_GLIOBLASTOMA_NEURAL | changed members |
| VERHAAK_GLIOBLASTOMA_CLASSICAL | VERHAAK_GLIOBLASTOMA_CLASSICAL | changed members |
| VERHAAK_GLIOBLASTOMA_MESENCHYMAL | VERHAAK_GLIOBLASTOMA_MESENCHYMAL | changed members |
| HOLLEMAN_PREDNISOLONE_RESISTANCE_ALL_UP | HOLLEMAN_PREDNISOLONE_RESISTANCE_ALL_UP | changed members |
| HOLLEMAN_PREDNISOLONE_RESISTANCE_ALL_DN | HOLLEMAN_PREDNISOLONE_RESISTANCE_ALL_DN | changed members |
| AZARE_NEOPLASTIC_TRANSFORMATION_BY_STAT3_UP | AZARE_NEOPLASTIC_TRANSFORMATION_BY_STAT3_UP | changed members |
| AZARE_NEOPLASTIC_TRANSFORMATION_BY_STAT3_DN | AZARE_NEOPLASTIC_TRANSFORMATION_BY_STAT3_DN | changed members |
| DURAND_STROMA_MAX_DN | DURAND_STROMA_S_UP | changed members, brief description and name |
| DURAND_STROMA_NS_UP | DURAND_STROMA_NS_UP | changed members, brief description and name |
| POTTI_5FU_SENSITIVITY | - | deprecated |
| POTTI_ADRIAMYCIN_SENSITIVITY | - | deprecated |
| POTTI_CYTOXAN_SENSITIVITY | - | deprecated |
| POTTI_DOCETAXEL_SENSITIVITY | - | deprecated |
| POTTI_ETOPOSIDE_SENSITIVITY | - | deprecated |
| POTTI_PACLITAXEL_SENSITIVITY | - | deprecated |
| POTTI_TOPOTECAN_SENSITIVITY | - | deprecated |
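
If you keep lists of C2 gene set names from v4.0 analyses, the corrections in the table above can be applied programmatically. The following is a minimal sketch in Python, not an official MSigDB tool. The names `RENAMED`, `DEPRECATED`, `MEMBERS_CHANGED` and `migrate_gene_set_names` are hypothetical and introduced only for illustration; the gene set names themselves are copied from the table.

```python
# Hypothetical helper for carrying v4.0 C2 gene set names over to v5.0.
# All gene set names below are copied from the corrections table.

# v4.0 name -> v5.0 name, for sets whose name changed.
RENAMED = {
    "DURAND_STROMA_MAX_DN": "DURAND_STROMA_S_UP",
}

# v4.0 sets with no v5.0 counterpart (removed after the retraction).
DEPRECATED = {
    "POTTI_5FU_SENSITIVITY",
    "POTTI_ADRIAMYCIN_SENSITIVITY",
    "POTTI_CYTOXAN_SENSITIVITY",
    "POTTI_DOCETAXEL_SENSITIVITY",
    "POTTI_ETOPOSIDE_SENSITIVITY",
    "POTTI_PACLITAXEL_SENSITIVITY",
    "POTTI_TOPOTECAN_SENSITIVITY",
}

# v4.0 names of sets whose members changed in v5.0; results computed
# with the v4.0 versions of these sets may differ under v5.0.
MEMBERS_CHANGED = {
    "PECE_MAMMARY_STEM_CELL_UP",
    "PECE_MAMMARY_STEM_CELL_DN",
    "KOINUMA_COLON_CANCER_MSI_UP",
    "VERHAAK_GLIOBLASTOMA_PRONEURAL",
    "VERHAAK_GLIOBLASTOMA_NEURAL",
    "VERHAAK_GLIOBLASTOMA_CLASSICAL",
    "VERHAAK_GLIOBLASTOMA_MESENCHYMAL",
    "HOLLEMAN_PREDNISOLONE_RESISTANCE_ALL_UP",
    "HOLLEMAN_PREDNISOLONE_RESISTANCE_ALL_DN",
    "AZARE_NEOPLASTIC_TRANSFORMATION_BY_STAT3_UP",
    "AZARE_NEOPLASTIC_TRANSFORMATION_BY_STAT3_DN",
    "DURAND_STROMA_MAX_DN",
    "DURAND_STROMA_NS_UP",
}


def migrate_gene_set_names(names):
    """Translate a list of v4.0 gene set names to v5.0.

    Returns a tuple (kept, dropped, changed):
      kept    - v5.0 names, in input order, with renames applied
      dropped - v4.0 names that are deprecated in v5.0
      changed - v4.0 names whose members changed in v5.0
    """
    kept, dropped, changed = [], [], []
    for name in names:
        if name in DEPRECATED:
            dropped.append(name)
            continue
        if name in MEMBERS_CHANGED:
            changed.append(name)
        kept.append(RENAMED.get(name, name))
    return kept, dropped, changed
```

Names that do not appear in the table pass through unchanged, so the function can be run over a complete list of v4.0 gene set names. Sets reported in `changed` are worth re-analyzing against the v5.0 files, since their membership differs.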