Large-scale datasets generated via high-throughput sequencing and multi-omics technologies are shaping how researchers uncover features of cancer biology. The National Cancer Institute's (NCI) Proteomic Data Commons (PDC), a component of the Cancer Research Data Commons (CRDC), enables discoveries by providing a cloud-based platform that centralizes access to proteomic data and facilitates its integration with other data types.
The PDC serves as a hub for high-quality proteomic data, genomic data, imaging data, and associated clinical annotations within the larger CRDC ecosystem. By connecting these data types (housed in the Genomic Data Commons, Imaging Data Commons, and The Cancer Imaging Archive), the PDC allows researchers to correlate protein-level changes with genomic mutations, imaging characteristics, and clinical outcomes in the same cancer samples.
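To give a rough sense of what correlating protein-level changes with genomic mutations in the same samples can look like in practice, here is a minimal Python sketch. It assumes that proteomic and genomic records share a case identifier. The identifiers, values, and the `compare_by_mutation` helper are hypothetical placeholders for illustration only; they are not real PDC or GDC data, schemas, or API calls.

```python
from statistics import mean

# Hypothetical toy records: identifiers and values are illustrative
# placeholders, not real PDC or GDC data.
protein_abundance = {  # case ID -> log2 abundance ratio for one protein
    "case-01": 1.2,
    "case-02": -0.4,
    "case-03": 0.9,
    "case-04": -0.1,
}
# Case IDs assumed to carry a mutation in one gene of interest.
mutated_cases = {"case-01", "case-03", "case-05"}


def compare_by_mutation(abundance, mutated):
    """Group abundance values by mutation status, matching on case ID."""
    mut = [v for case, v in abundance.items() if case in mutated]
    wt = [v for case, v in abundance.items() if case not in mutated]
    return {
        "n_mutated": len(mut),
        "n_wild_type": len(wt),
        "mean_mutated": mean(mut) if mut else None,
        "mean_wild_type": mean(wt) if wt else None,
    }


summary = compare_by_mutation(protein_abundance, mutated_cases)
print(summary)
```

Cases that appear in only one of the two sources (here, "case-05") simply drop out of the comparison, which is why a shared identifier across data commons matters. A real analysis would also need many more samples and an appropriate statistical test.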
Since its official launch in 2020, the PDC has grown to contain over 160 proteomic datasets spanning 19 cancer types. Its main data contributors are consortia that use standardized workflows to profile tumor proteomes, including the NCI-supported Clinical Proteomic Tumor Analysis Consortium (CPTAC), the International Cancer Proteogenome Consortium (ICPC), and the Applied Proteogenomics Organizational Learning and Outcomes (APOLLO) network.
Furthermore, the PDC offers a suite of cloud-based tools that make the data easier to analyze and visualize. Researchers can run advanced statistical analyses, use APIs for data retrieval, and generate visualizations to better understand proteome dynamics.
The PDC is continuously improved with FAIR data sharing principles (Findability, Accessibility, Interoperability, and Reusability) in mind to ensure all datasets are available and easily accessible to any researcher with an internet connection.
With more datasets, tools, and capabilities being added over time, the PDC is set to become an integral part of the cancer research ecosystem. Researchers can access the PDC and explore its resources via the PDC portal at pdc.cancer.gov.
The PDC was recently featured in the American Association for Cancer Research's (AACR) journal, Cancer Research Communications. The article, titled “NCI’s Proteomic Data Commons: A Cloud-Based Proteomics Repository Empowering Comprehensive Cancer Analysis through Cross-Referencing with Genomic and Imaging Data,” highlights the PDC’s role in advancing cancer research.