Inconsistent Annotation_Transcript in CCLE_mutations file across different versions

Hello,

I noticed that for some genes, entries in the Annotation_Transcript column of the CCLE_mutations file have changed between versions for the same cell line.
For example, the transcript of the NFE2L2 gene for the ACH-000488 cell line was ENST00000397062.3 in 20Q3, but in 20Q4 it is ENST00000446151.2. As a result, the same amino acid change is annotated differently: for ACH-000488, it is p.D29G in 20Q3 but p.D13G in 20Q4.

What is the best way to interpret this change?

Thank you,
Tyler

Hello Tyler,

The mutation MAF file is, for now, an aggregate of many MAF files. Some of these files are fairly old, and their annotations are based on older gene annotations, so the annotations in each MAF file may come from a different gene annotation file. When the same mutation appears in several files, we merge the entries and have to use the annotations from one of the files. We were previously using the oldest annotations as the reference.
We recently changed the way we aggregate mutations and now use the annotations from the file with the most recent annotations (here CGA_WES_AC). So p.D29G and p.D13G describe the same mutation, numbered against different transcripts.

We hope to soon release unaggregated MAF files so that our users have better control over this.
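
For anyone comparing releases in the meantime, here is a minimal sketch of one way to find mutations whose transcript changed. It matches rows on genomic position rather than on Protein_Change, since the position does not depend on the chosen transcript. The column names (DepMap_ID, Hugo_Symbol, Chromosome, Start_position, Annotation_Transcript, Protein_Change) are assumed to match the CCLE_mutations header, the helper transcript_changes is hypothetical, and the toy rows use placeholder positions and a placeholder GENE_A row, not real coordinates.

```python
import pandas as pd

# Join key: genomic coordinates are independent of the annotated transcript.
# Column names are assumed to match the CCLE_mutations header.
KEY = ["DepMap_ID", "Hugo_Symbol", "Chromosome", "Start_position"]


def transcript_changes(old: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Return mutations found in both releases whose transcript differs."""
    cols = KEY + ["Annotation_Transcript", "Protein_Change"]
    merged = old[cols].merge(new[cols], on=KEY, suffixes=("_old", "_new"))
    changed = (
        merged["Annotation_Transcript_old"] != merged["Annotation_Transcript_new"]
    )
    return merged[changed].reset_index(drop=True)


# Toy data. The NFE2L2 transcripts and protein changes come from this thread;
# Start_position values and the GENE_A row are placeholders for illustration.
old = pd.DataFrame([
    {"DepMap_ID": "ACH-000488", "Hugo_Symbol": "NFE2L2", "Chromosome": "2",
     "Start_position": 1000, "Annotation_Transcript": "ENST00000397062.3",
     "Protein_Change": "p.D29G"},
    {"DepMap_ID": "ACH-000488", "Hugo_Symbol": "GENE_A", "Chromosome": "1",
     "Start_position": 2000, "Annotation_Transcript": "ENST_PLACEHOLDER.1",
     "Protein_Change": "p.A1V"},
])
new = pd.DataFrame([
    {"DepMap_ID": "ACH-000488", "Hugo_Symbol": "NFE2L2", "Chromosome": "2",
     "Start_position": 1000, "Annotation_Transcript": "ENST00000446151.2",
     "Protein_Change": "p.D13G"},
    {"DepMap_ID": "ACH-000488", "Hugo_Symbol": "GENE_A", "Chromosome": "1",
     "Start_position": 2000, "Annotation_Transcript": "ENST_PLACEHOLDER.1",
     "Protein_Change": "p.A1V"},
])

print(transcript_changes(old, new))
```

With real data, old and new would instead be loaded with pd.read_csv from the two releases of the CCLE_mutations file. Only the NFE2L2 row is reported here, because the placeholder GENE_A row keeps the same transcript in both tables.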

Best,
