This article is the second in the Investigator Spotlight Series, the brainchild of Dr. Dawn Hayward, an NCI Communications Fellow at OCCPR, which highlights our CPTAC researchers and their work. In case you missed it the first time, here she features CPTAC scientists Dr. David Fenyö and researcher Emily Kawaler!
Feature: Proteomic Data Analysis
The Fenyö lab at NYU Langone Medical Center takes large CPTAC data sets and looks at changes in protein and gene expression levels in different cancer types, such as endometrial cancer. Here they describe their work and discuss Ph.D. student Emily Kawaler's talk at the CPTAC Scientific Symposium, which took place on October 16, 2019.
Investigator Spotlight (IS): Your lab is focused on computational proteomics. How would you describe the field?
David Fenyö (DF): My lab focus has shifted; it’s definitely still proteomics, but it’s much more on integrating genomics and proteomics and we’ve been working with medical images. The Cancer Genome Atlas (TCGA) and other efforts have been very successful for us in characterizing the genomic changes that drive cancer. But proteomics adds another dimension in that it studies the functional gene product and is closer to the phenotype. So that gives additional information.
IS: How does being part of CPTAC help you achieve your research goals?
DF: For us, being part of CPTAC has been very good because we get access to high-quality data and we get early access to this data. Being a computational group, we are dependent on data and especially on high-quality data.
Emily Kawaler (EK): We don’t have a wet lab, we only have what’s called a dry lab, which is purely computational. There are some people who work partly in the wet lab, partly in the dry lab. But for somebody like me who did all of their training in computer science, I can pipette and other than that I'm a disaster in the wet lab! I can’t really generate my own data. I had two options - I could work with publicly available datasets or I could work with people who can generate data, like our collaborators in CPTAC. Working with people is generally the preferable option because then you get to interact with them - you can help develop the study, tell them what you need, and they can tell you what they need. Having a collaboration like CPTAC where we’re involved in the whole process is really the best thing somebody like me can hope for.
IS: What is an example of the computational workflow you would go through once you have a data set for a project?
DF: There are pipelines for whole exome sequencing and processing copy number variants, as well as for RNA-seq, proteomics and phosphoproteomics. Those pipelines are all different and specialized for each data type. Ultimately, we end up with data tables that contain quantitative information for each gene: copy number, transcript level, protein level and phosphopeptide level. However, the key is not so much each specific pipeline… rather that we do different types of exploratory analysis on the aggregate results.
EK: Less like pipelines, more like a sandbox at that point. We have various strategies that we might use to look at the data but some of it is dependent on what we know about the cancer type. For instance, we know that TCGA identified unique genomic features, such as genomic subtypes, in the cancers they studied. Because we know they're different on a clinical and a genomic level, one of the things we can do is look at differential protein levels between these subtypes to learn more about what makes them different on a molecular level. That’s just one example of the things we might do, but there's lots we can play with, like looking at the downstream effects of certain genes. So, some of it is totally data-driven, but a lot of it is working to build on the knowledge that we already have.
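As a rough illustration of the subtype comparison Emily describes, the sketch below ranks genes by how much their protein levels differ between two subtypes. It is not the lab's actual analysis: the gene names, subtype labels, and abundance values are invented, and Welch's t-statistic is simply one common choice for this kind of two-group comparison.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical log2 protein abundances per sample, grouped by genomic subtype.
# Gene names, subtype labels, and values are made up for illustration only.
protein_levels = {
    "GENE_A": {"subtype_1": [1.2, 1.5, 1.1, 1.4], "subtype_2": [0.2, 0.1, 0.4, 0.3]},
    "GENE_B": {"subtype_1": [0.5, 0.7, 0.6, 0.4], "subtype_2": [0.6, 0.5, 0.7, 0.4]},
}


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def compare_subtypes(levels, group_1, group_2):
    """Return (gene, mean log2 difference, t-statistic), largest |t| first."""
    rows = []
    for gene, groups in levels.items():
        a, b = groups[group_1], groups[group_2]
        rows.append((gene, mean(a) - mean(b), welch_t(a, b)))
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)


results = compare_subtypes(protein_levels, "subtype_1", "subtype_2")
for gene, diff, t_stat in results:
    print(f"{gene}\tdiff={diff:+.2f}\tt={t_stat:+.2f}")
```

In a real study there would be thousands of genes and many more samples, so the resulting statistics would also need correction for multiple testing before any gene is treated as a candidate for follow-up.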
IS: What are a few key messages from your talk during the CPTAC Scientific Symposium (October 16th)?
EK: One of the things I want the audience to come away with from this talk is that [by] combining all of these data types and working with this big collaboration we are able to generate so many exciting new hypotheses that can be tested about ways to bring the work that we’ve done into the clinic. So I’m going to talk about different things we’ve found at the acetylome level, the proteome level and the phosphoproteome level that can now be taken into the wet lab and tested - do these possible biomarkers that we’ve found help us do a better job of predicting the course of a disease? Or maybe these biomarkers can help us decide what treatment to use or maybe even find some new treatments that haven’t been used in endometrial cancer before. Essentially, I want to give [the audience] a sense of the number of great translational possibilities that have opened up from the work that we’ve done.
IS: What’s your favorite part about working in the lab? Are there any fun things you do?
EK: David's got a little place upstate [New York], and last summer we had a lab retreat there which was a lot of fun. We went hiking, played games, and ate some good meals. It was actually also the setting for an episode of Elementary!