Researchers from the National Cancer Institute’s (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC) have produced a resource of global proteomic and post-translational modification data, whole genome and whole exome sequencing, miRNA and total RNA sequencing, DNA methylation, imaging, and clinical information for more than 1,000 cancer patients across 10 tumor types. A description of the effort to harmonize and disseminate this resource, along with four subsequent studies that utilize it, has just been published online by Cell Press. These analyses, probing post-translational modifications (PTMs), oncogenic drivers, DNA methylation, and histopathology images, are the first of what are expected to be many to leverage this rich, now publicly available resource.
A Perspective in Cancer Cell, co-led by Samuel H. Payne from Brigham Young University, Bing Zhang from Baylor College of Medicine, and Ana I. Robles from NCI, addresses the challenges encountered by CPTAC researchers as they integrated diverse data types from various cancer cohorts and details the methodology they used to re-process the data into a unified resource for dissemination. In addition to highlighting several computational resources aimed at integrating nucleotide sequencing and mass spectrometry proteomics data, the team developed interactive websites to engage scientists with the data, aiming to inspire similar approaches in other molecular studies.
“The CPTAC pan-cancer data set is the largest proteogenomic resource for cancer research, linking genotypes to phenotypes through comprehensive genomic, proteomic and phosphoproteomic characterization of tumors,” said Dr. Payne. “We hope that releasing this newly harmonized dataset, along with numerous interactive web-resources, will engage new audiences and assist in their research for better diagnosis and treatment of cancer.”
Two Research Articles in Cell, led by Gad Getz from Broad Institute, Li Ding from Washington University in St. Louis, and Lewis Cantley from Weill Cornell Medical College, provide a first glimpse at the value of this resource. In one, focused on the links between PTMs and cancer processes, researchers identified 33 molecular signatures that grouped tumors into new classes through shared and divergent biological processes. Notably, they found that tumors with similar genomic mutation patterns, such as colon and endometrial cancers, displayed distinct protein phosphorylation patterns, potentially explaining diverse treatment responses. Additionally, the study found that tumors with reduced acetylation on metabolic proteins were more likely to respond to immunotherapy. This comprehensive analysis of PTMs highlights their role in cancer biology and response to therapy.
“There’s a lot more to be done with this data, but the biggest thing for the community is the resource we’ve created and the availability of the data,” Dr. Getz said. “We hope this will be a useful example for how other kinds of studies can integrate genomic and proteomic data, and that this will be a rich dataset for many years to come.”
In a separate study, the team investigated the impact of oncogenic driver mutations on RNA, proteins, and PTMs. The team discovered genetic changes that rewired inferred protein interactions, identified gene pairs leading to tumor cell death with potential therapeutic value, and unveiled drivers of DNA methylation. The research linked driver genes with proteomic patterns, revealing cis-effects in 59 genes at RNA, protein, and phosphoprotein levels. Mutations and copy number alterations were also found to ‘rewire’ protein-protein interactions. Comparative analysis between tumor and normal adjacent tissue identified key protein changes for oncogenic pathways, while neoantigen burden and T-cell infiltration suggested targeted therapy vulnerabilities. The study underscores how comprehensive proteomics across tissues yields insights into oncogenic drivers beyond individual cancer types.
On the impact of this achievement, Dr. Ding wrote, “This set of Pan-Cancer papers represents an extraordinary effort by the CPTAC consortium to extend existing knowledge on genetic drivers to the understanding of their regulations at the epigenetic level and their functional impacts modulated through protein abundance and modifications on biological processes involved in carcinogenesis.”
In a study appearing in Cancer Cell, another working group, also led by Li Ding, built a pan-cancer catalog of DNA methylation events associated with RNA transcript and protein changes. The team mined this catalog to uncover epigenetic alterations that have broad effects on the tumor microenvironment, can inform tumor lineage, heterogeneity, and other phenotypes, and reveal potential new therapeutic avenues. The publication describes novel methylation subtypes with distinct RNA and protein signatures and identifies FGFR2 hypomethylation and STAT5A hypermethylation as critical for further investigation.
Lastly, in Cell Reports Medicine, a team co-led by David Fenyö from NYU Grossman School of Medicine and Alexander Lazar from The University of Texas MD Anderson Cancer Center leveraged the availability of histopathological slides to develop innovative prediction models that use machine learning tools to integrate pathology imaging with genomics and proteomics, arriving at classification systems with potential clinical utility. Latent image features extracted by the models were found to directly correlate pathway-level perturbations in protein expression with observed and interpreted pathologic images, and to identify molecular signatures driving phenotype differences.
This resource, encompassing proteomic, genomic, transcriptomic, and clinical data, offers a unified platform for deep analysis and has already yielded important insights. The integration of diverse data types and the development of interactive tools that encourage their use make clear the transformative nature of this achievement. On the impact of these publications, Dr. Robles wrote, “This suite of Pan-Cancer articles demonstrates the power of multidisciplinary team science and showcases the value of investigating cancer through a proteogenomic lens.” As this resource becomes publicly available, it holds the promise of unlocking new avenues for targeted therapies, enhancing our ability to classify tumors, and ultimately advancing the field of cancer medicine.
This research was supported by the Clinical Proteomic Tumor Analysis Consortium at the National Cancer Institute and the National Institutes of Health.