MSigDB v7.2 (Sep 2020)

This release includes a substantial reorganization of C5 to accommodate the addition of the Human Phenotype Ontology, the addition of gene sets from WikiPathways to C2:CP, and the promotion of SCSig to C8, among other minor updates and additions.

Note: Due to substantial changes introduced in MSigDB 7.0, using GSEA 4.0.0+ is recommended when utilizing MSigDB 7.0+ resources.

Advisory: Users of MSigDB 7.2 are strongly advised to '''always''' use the GSEA "Collapse/Remap to gene symbols" feature with the provided Symbol Remapping chip file if their dataset was generated with a transcriptome other than '''Ensembl v101/GENCODE v35'''.
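
As a rough illustration of what remapping to gene symbols involves, the sketch below applies a CHIP-style lookup table to a small expression table and collapses rows that map to the same symbol by keeping the per-sample maximum. It is not GSEA's implementation: the identifiers, the values, and the `collapse_to_symbols` helper are hypothetical, and taking the maximum is only one possible collapse rule.

```python
from collections import defaultdict

def collapse_to_symbols(expression, id_to_symbol):
    """Remap row identifiers to gene symbols, keeping the per-sample maximum
    when several identifiers map to the same symbol.

    expression   : dict of identifier -> list of per-sample values
    id_to_symbol : dict of identifier -> gene symbol (a CHIP-style lookup)
    Identifiers missing from the lookup are dropped.
    """
    grouped = defaultdict(list)
    for identifier, values in expression.items():
        symbol = id_to_symbol.get(identifier)
        if symbol is not None:
            grouped[symbol].append(values)
    # zip(*rows) walks the grouped rows sample by sample
    return {s: [max(col) for col in zip(*rows)] for s, rows in grouped.items()}

# Hypothetical data: two identifiers share a symbol, one identifier is unmapped.
id_to_symbol = {"ID_A1": "GENE_A", "ID_A2": "GENE_A", "ID_B1": "GENE_B"}
expression = {
    "ID_A1": [1.0, 5.0],
    "ID_A2": [3.0, 2.0],
    "ID_B1": [4.0, 4.0],
    "ID_X9": [9.0, 9.0],
}
collapsed = collapse_to_symbols(expression, id_to_symbol)
```

Identifiers that are absent from the lookup (here `ID_X9`) simply disappear from the result, which is why a lookup table that matches the release matters for datasets built on a different transcriptome version.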

New Additions and Changes to Collection Organization


Beginning in MSigDB 7.2, the WikiPathways analysis subset gene sets are included as a canonical pathway subset in C2. This initial release reflects the WikiPathways September 2020 release.


60 gene sets have been curated from literature or contributed by users and are now available in C2:CGP.

36 of these gene sets, drawn from two publications (prefixed with "MANNE" and "BLANCO_MELO"), are derived from research related to the ongoing COVID-19 global pandemic.

The remaining sets consist of data contributed by the following individuals:

C5 Ontology

C5 has been renamed from "C5 GO gene sets" to "C5: ontology gene sets". This change reflects the addition of a new sub-collection of gene sets from the Human Phenotype Ontology project. This initial release is categorized under C5:HPO and reflects the August 2020 release of the Human Phenotype Ontology. This sub-collection has been redundancy filtered through a procedure comparable to that of the GO and Reactome sub-collections.

C8: Cell Type Signature Gene Sets

The gene sets for single cell identities, previously provided as a supplemental release, have been updated and promoted to a full MSigDB collection. The new C8 differs from the previous supplemental release in the following ways:

Updates to Existing Gene Sets by Collection

C1 (Positional Gene Sets)

C1 has been updated to reflect the primary assembly of the current release of the human genome as present in Ensembl 101 and GENCODE 35 (GRCh38) (+0 gene sets). Gene annotations for this collection are derived from the Chromosome and Karyotype band tracks from the Ensembl BioMart (version 101) and reflect the gene architecture as represented on the primary assembly.


C3 (Regulatory Target Gene Sets)

C3:GTRD has been updated to GTRD v20.04. The source database added a large amount of new content, which pushed many gene sets above the MSigDB maximum size threshold for inclusion. As a result, the collection shrank overall (-176 gene sets).
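
How a maximum-size threshold can shrink a collection even as its source grows can be sketched as follows. The threshold value, the set names, and the `filter_by_size` helper are assumptions made for illustration; these notes do not state the actual MSigDB limit.

```python
def filter_by_size(gene_sets, max_size):
    """Split gene sets into those kept and those excluded for exceeding max_size.

    gene_sets : dict of set name -> set of gene symbols
    max_size  : hypothetical inclusion limit (not the real MSigDB value)
    """
    kept = {name: genes for name, genes in gene_sets.items() if len(genes) <= max_size}
    excluded = sorted(name for name in gene_sets if name not in kept)
    return kept, excluded

# Hypothetical collection before and after a source-database update.
before = {"TF1_TARGETS": {"A", "B"}, "TF2_TARGETS": {"A", "C", "D"}}
after = {"TF1_TARGETS": {"A", "B", "E", "F"}, "TF2_TARGETS": {"A", "C", "D"}}

MAX_SIZE = 3  # illustrative only
kept_before, _ = filter_by_size(before, MAX_SIZE)
kept_after, excluded_after = filter_by_size(after, MAX_SIZE)
net_change = len(kept_after) - len(kept_before)
```

In this toy case `TF1_TARGETS` gains members in the update, crosses the limit, and drops out, so the collection ends up one set smaller.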

C5:GO (Gene Ontology)

Gene sets in these sub-collections are derived from the controlled vocabulary of the Gene Ontology (GO) project (The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nature Genet 2000). The gene sets are named by GO term and contain genes annotated by that term. This collection has been updated to the most recent GO annotations as present in the GO-basic obo file released on 2020-08-11 and NCBI gene2go annotations downloaded on 2020-09-03.

This collection is divided into three sub-collections:

These updates were generated in accordance with the procedure described in the GO release notes for MSigDB 7.0.
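
The basic idea of naming a gene set by GO term and filling it with the genes annotated to that term can be sketched from gene2go-style rows. This is a minimal illustration only and does not reproduce the MSigDB procedure referenced above: the sample rows and GO identifiers are hypothetical, the `group_genes_by_term` helper is an assumption, and no ontology structure or filtering is applied.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the gene2go column layout (tab-separated).
SAMPLE = (
    "#tax_id\tGeneID\tGO_ID\tEvidence\tQualifier\tGO_term\tPubMed\tCategory\n"
    "9606\t1\tGO:9999991\tIDA\t-\texample process\t-\tProcess\n"
    "9606\t2\tGO:9999991\tIEA\t-\texample process\t-\tProcess\n"
    "9606\t2\tGO:9999992\tIDA\t-\texample function\t-\tFunction\n"
    "10090\t3\tGO:9999991\tIDA\t-\texample process\t-\tProcess\n"
)

def group_genes_by_term(handle, tax_id="9606"):
    """Return {GO_ID: set of GeneIDs} for one taxon from gene2go-style rows."""
    sets = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip the header and blank lines
        if row[0] == tax_id:
            sets[row[2]].add(row[1])
    return dict(sets)

go_sets = group_genes_by_term(io.StringIO(SAMPLE))
```

Rows for other taxa (the `10090` row here) are ignored, so each resulting set holds only the human gene identifiers annotated to its term.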

CHIP File Updates

All CHIP files previously provided in the standard MSigDB 7.1 release have been updated for MSigDB 7.2 in accordance with previously described procedures.

Gene orthology annotations for mapping mouse and rat genes to their best-match human orthologs have been updated to release 3.1.1 of the Alliance of Genome Resources orthology database.


Hallmark founder gene sets in the MSigDB XML file have had their identifiers adjusted to reflect their internal "systematic name". This change enables more precise tracking of Hallmark founder gene sets across releases. Previously these gene sets were identified by their standard name as represented in the initial release of the MSigDB Hallmarks collection.