Duplicate genes in the RNA seq data in 20Q4

Dear DepMap team,

I noticed that PINX1 (54984) and TBCE (6905) each have duplicate entries in the RNA-seq data matrix (https://ndownloader.figshare.com/files/25494389). Could you please confirm this?


Yes, this duplication was introduced by our process for remapping gene IDs. The RNA-seq pipeline we run emits Ensembl IDs for genes, which we then map to Entrez IDs to be consistent with the CRISPR data. However, for these two genes, the mapping we got from BioMart assigned two Ensembl IDs to a single Entrez ID, so each gene appears twice in the matrix.
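
For illustration, here is a minimal sketch of how a many-to-one mapping like this could be detected before it reaches the output matrix. It assumes the mapping is available as (Ensembl ID, Entrez ID) pairs. The function name `find_many_to_one` and the placeholder Ensembl IDs are hypothetical, and this is not the pipeline's actual code:

```python
from collections import defaultdict


def find_many_to_one(pairs):
    """Return {entrez_id: [ensembl_ids]} for Entrez IDs mapped from more than one Ensembl ID."""
    by_entrez = defaultdict(list)
    for ensembl_id, entrez_id in pairs:
        # Record each distinct Ensembl ID once per Entrez ID.
        if ensembl_id not in by_entrez[entrez_id]:
            by_entrez[entrez_id].append(ensembl_id)
    # Keep only the Entrez IDs that collected several Ensembl IDs.
    return {e: ids for e, ids in by_entrez.items() if len(ids) > 1}


# Hypothetical mapping rows: the Entrez IDs 54984 (PINX1) and 6905 (TBCE)
# come from the report above; the Ensembl IDs are placeholders, not real ones.
mapping = [
    ("ENSG_PLACEHOLDER_1", 54984),
    ("ENSG_PLACEHOLDER_2", 54984),
    ("ENSG_PLACEHOLDER_3", 6905),
    ("ENSG_PLACEHOLDER_4", 6905),
    ("ENSG_PLACEHOLDER_5", 12345),  # arbitrary one-to-one row for contrast
]

collisions = find_many_to_one(mapping)
for entrez_id, ensembl_ids in sorted(collisions.items()):
    print(entrez_id, ensembl_ids)
```

Any Entrez ID reported by a check like this would produce a duplicated gene after remapping, unless the corresponding entries are merged or one of them is dropped.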

We’re investigating how we will fix this for future data updates.