Retired Ensembl IDs

mw222 · April 5, 2023, 9:06pm

I use the Ensembl IDs to analyze the expression datasets (specifically OmicsExpressionGenesExpectedCountProfile.csv) so that I can merge them with my own RNA-seq datasets, which were aligned and quantified with STAR + RSEM. With the latest DepMap 22Q4 release, I have noticed that a significant number of the Ensembl IDs are retired (when I search for them on the Ensembl website, they are listed as retired and no longer in the new builds). Examples are SOD2 (ENSG00000112096), whose new ID is ENSG00000291237, and HOMEZ (ENSG00000215271), whose new ID is ENSG00000290292.

Is there a reason I am seeing this? I assumed that, since the new omics data were realigned with the latest STAR + RSEM versions, the alignment builds would have been updated as well.

Thank you!

mw222 · April 5, 2023, 9:47pm

I did a little more digging: most of the unmatched ones are novel proteins and snoRNA/lncRNA/etc. Only a handful are retired Ensembl IDs like the ones I mentioned before, so it doesn’t seem to be a huge issue.
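For anyone wanting to repeat this check, here is a minimal sketch of how the unmatched IDs could be listed. It assumes the DepMap column labels look like `SOD2 (ENSG00000112096)` and that the RSEM gene IDs carry a version suffix; the sample labels and the helper names `extract_ensembl_id` and `compare_id_sets` are hypothetical, not part of any DepMap tooling.

```python
import re

# Matches an unversioned human Ensembl gene ID, e.g. ENSG00000112096.
ENSG_PATTERN = re.compile(r"ENSG\d{11}")


def extract_ensembl_id(label):
    """Return the unversioned Ensembl gene ID inside a label, or None."""
    match = ENSG_PATTERN.search(label)
    return match.group(0) if match else None


def compare_id_sets(depmap_labels, own_labels):
    """Split IDs into those shared by both sources and those unique to each."""
    depmap_ids = {extract_ensembl_id(label) for label in depmap_labels} - {None}
    own_ids = {extract_ensembl_id(label) for label in own_labels} - {None}
    return {
        "shared": depmap_ids & own_ids,
        "depmap_only": depmap_ids - own_ids,
        "own_only": own_ids - depmap_ids,
    }


# Hypothetical labels: DepMap-style column names and versioned RSEM gene IDs.
# The version suffixes (.1, .18) are made up for illustration.
depmap_columns = [
    "SOD2 (ENSG00000112096)",
    "HOMEZ (ENSG00000215271)",
    "TP53 (ENSG00000141510)",
]
rsem_gene_ids = ["ENSG00000291237.1", "ENSG00000290292.1", "ENSG00000141510.18"]

result = compare_id_sets(depmap_columns, rsem_gene_ids)
print(sorted(result["depmap_only"]))
```

The IDs in `depmap_only` are the candidates to look up on the Ensembl website to see whether they are retired or simply absent from the other annotation.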

simz · April 21, 2023, 8:36pm

Hi,

Thanks for reaching out!

We are currently using indices generated with GENCODE v38 for STAR and RSEM, and the Nov 2020 version of BioMart to map Ensembl IDs to gene names. Neither is the most up-to-date version at the moment, which should be why there are unmatched novel proteins and retired IDs in our datasets. We hope to update them in the future to minimize the number of mismatches.

Best,
Simone

mw222 · April 21, 2023, 8:52pm

Oh I see, that makes sense. That was very helpful, thank you! I will alter my ID:gene mapping.

Thank you,
MW
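One possible way to adjust the mapping is sketched below. The two retired-to-current pairs are the examples given earlier in this thread; the count values, the `remap_counts` helper, and the choice to sum counts when two IDs collapse into one are assumptions made for illustration only.

```python
# Retired -> current Ensembl IDs, taken from the two examples in this thread.
RETIRED_TO_CURRENT = {
    "ENSG00000112096": "ENSG00000291237",  # SOD2
    "ENSG00000215271": "ENSG00000290292",  # HOMEZ
}


def remap_counts(counts, retired_to_current):
    """Rename retired IDs; sum counts if two IDs end up under the same key."""
    remapped = {}
    for gene_id, count in counts.items():
        # IDs without an entry in the mapping are kept unchanged.
        current_id = retired_to_current.get(gene_id, gene_id)
        remapped[current_id] = remapped.get(current_id, 0.0) + count
    return remapped


# Hypothetical expected counts for one sample, keyed by unversioned Ensembl ID.
sample_counts = {
    "ENSG00000112096": 1500.0,
    "ENSG00000215271": 42.0,
    "ENSG00000141510": 310.0,
}

updated = remap_counts(sample_counts, RETIRED_TO_CURRENT)
print(sorted(updated))
```

Going the other way (annotating your own data with GENCODE v38 so it matches the DepMap indices) would avoid the remapping step entirely, at the cost of using an older annotation.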
