Connecting genotype with proteome





Proteogenomics is an emerging field in mass spectrometry (MS)-based proteomics that combines proteomics data with sample-specific genomics and transcriptomics data. Initial proteogenomics efforts focused on organisms with small genomes, where MS-based peptide data are used to discover novel protein-coding regions and genes and to provide protein-level evidence of gene expression, thereby improving genome annotation. The rapid development of methods for generating DNA and RNA sequence data has created an enormous resource that, when coupled with proteomics data, can be used for proteogenomics research. This combined analysis makes it possible to detect novel protein species (mutated proteins, fusion proteins, pseudogenic proteins, etc.) that are missed in conventional proteomics. It can also be used to analyze the influence of genomic variants and aberrations on the molecular phenotype, for example their effects on protein levels, pathway activation, splice variants and post-translational modification (PTM) status.

In the Lehtiö group, we develop and use proteogenomics methods to understand how cancer genome alterations affect the proteome, with the aim of using this knowledge to improve future therapies. Moreover, we have developed methods to improve genome annotation and the analysis of splice variants at the protein level.

Selected papers:

  1. Zhu Y., Hultin-Rosenberg L., Forshed J., Branca R.M., Orre L.M., Lehtiö J. SpliceVista, a tool for splice variant identification and visualization in shotgun proteomics data. Mol Cell Proteomics. 2014 Jun;13(6):1552-62.

  2. Branca R.M., Orre L.M., Johansson H.J., Granholm V., Huss M., Pérez-Bercoff Å., Forshed J., Käll L., Lehtiö J. HiRIEF LC-MS enables deep proteome coverage and unbiased proteogenomics. Nature Methods. 2014 Jan;11(1):59-62.

  3. Boekel J., Chilton J.M., Cooke I.R., Horvatovich P.L., Jagtap P.D., Käll L., Lehtiö J., Lukasse P., Moerland, Griffin T.J. Multi-omic data analysis using Galaxy. Communication, Nature Biotechnol. 2015 Feb 6;33(2):137-9. doi: 10.1038/nbt.3134.



Our HiRIEF LC-MS/MS-based proteogenomics method (Branca R. et al., Nature Methods 2014) combines experimental and bioinformatics workflows to improve the detection of novel coding regions and protein variants.