Different Annotation Transcripts used between cell lines


For some genes, the reported Protein Change differs between cell lines even though the genome change is identical. This is likely caused by mapping of the mutation to different Annotation Transcripts. I suppose the use of different transcripts is related to the different genomics datasets included in DepMap?

The different Protein Change annotations can hamper selection of relevant cell lines and functional annotation of mutations based on knowledge bases such as OncoKB. Below is an example to illustrate this issue.

A well-known oncogenic driver mutation in EGFR is the p.L858R missense mutation. Based on the Protein Change column in DepMap, only NCIH2172 is reported to harbor this mutation. Two other cell lines (NCIH1975 & NCIH3255) harbor a p.L813R mutation. However, the genome changes of these mutations are identical to that of the NCIH2172 cell line and would thus also result in the p.L858R mutation when the other transcript is used.
Interestingly, this difference in Protein Change also seems to affect the Cosmic Hotspot Count.
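
One possible workaround is to group mutations by genome change instead of Protein Change and flag the groups that carry more than one protein annotation. Below is a minimal sketch in plain Python, not an official DepMap tool. The field names (`cell_line`, `gene`, `genome_change`, `protein_change`) and the genome change string are placeholders I made up for illustration; the real column names and coordinates should be taken from the mutation file of the release in use.

```python
from collections import defaultdict

# Toy records mirroring the EGFR example above.
# NOTE: the field names and "GENOME_CHANGE_PLACEHOLDER" are hypothetical;
# substitute the real columns and values from the DepMap mutation file.
records = [
    {"cell_line": "NCIH2172", "gene": "EGFR",
     "genome_change": "GENOME_CHANGE_PLACEHOLDER", "protein_change": "p.L858R"},
    {"cell_line": "NCIH1975", "gene": "EGFR",
     "genome_change": "GENOME_CHANGE_PLACEHOLDER", "protein_change": "p.L813R"},
    {"cell_line": "NCIH3255", "gene": "EGFR",
     "genome_change": "GENOME_CHANGE_PLACEHOLDER", "protein_change": "p.L813R"},
]


def find_inconsistent_annotations(rows):
    """Return genome changes that are annotated with more than one protein change.

    The result maps (gene, genome_change) to a dict of
    protein_change -> sorted list of cell lines carrying that annotation.
    """
    by_genome = defaultdict(lambda: defaultdict(list))
    for row in rows:
        key = (row["gene"], row["genome_change"])
        by_genome[key][row["protein_change"]].append(row["cell_line"])

    # Keep only genome changes with conflicting protein-level annotations.
    return {
        key: {protein: sorted(lines) for protein, lines in proteins.items()}
        for key, proteins in by_genome.items()
        if len(proteins) > 1
    }


inconsistent = find_inconsistent_annotations(records)
for (gene, genome_change), proteins in inconsistent.items():
    for protein_change, lines in sorted(proteins.items()):
        print(gene, genome_change, protein_change, ", ".join(lines))
```

Selecting cell lines on the `(gene, genome_change)` key returns all three EGFR lines in this toy example, regardless of which transcript was used for the protein annotation.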

I have a similar question: I found that Cell Model Passports and DepMap report different mutations for the same gene in the same cell model.

Maybe this is caused by mapping of different Annotation Transcripts…


Apologies for the late reply!

Prior to the 22Q4 release, we had been concatenating a small amount of “legacy data” onto the mutation data that is actively reprocessed every release. As a result, there were some discrepancies because the legacy data had not been run through our most up-to-date pipeline. Starting with 22Q4, however, we no longer include this legacy data in our mutation data set, so that all of our cell lines are run through the same version of the pipeline, which should resolve the issue here.

Hope this helps!