RNAseq transcript annotation

Hi, I’d like to check gene expression levels using the RNAseq data from CCLE. However, the file lists only gene IDs, without the corresponding transcript IDs, so it is hard to tell which isoform was used for the analysis. What are the criteria for picking the transcript in this analysis? Many thanks!

Hi,

Thanks for reaching out!

I am assuming you are using OmicsExpressionProteinCodingGenesTPMLogp1.csv from the “primary files” section on the download page. It contains RSEM’s gene-level abundance estimates, which are essentially a summary of all isoforms for each gene rather than values for a single chosen transcript. If you want transcript-level abundance estimates, you should be able to download OmicsExpressionTranscriptsExpectedCountProfile.csv from the “all files” section, which contains expected counts at the transcript level. Hope this helps!

Simone
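
To illustrate what "a summary of all isoforms" means, here is a minimal sketch on toy data. It assumes the gene-level TPM is the sum of the TPMs of that gene's transcripts, which is how RSEM reports gene-level TPM. The transcript IDs, gene names, values, and the `gene_level_tpm` helper are all hypothetical and are not taken from the DepMap files.

```python
from collections import defaultdict

# Toy transcript-level TPM values for one sample (made-up IDs and numbers).
transcript_tpm = {
    "ENST_A1": 12.0,
    "ENST_A2": 3.5,
    "ENST_B1": 7.25,
}

# Hypothetical transcript-to-gene mapping.
transcript_to_gene = {
    "ENST_A1": "GENE_A",
    "ENST_A2": "GENE_A",
    "ENST_B1": "GENE_B",
}


def gene_level_tpm(tx_tpm, tx_to_gene):
    """Sum transcript TPMs per gene, so every isoform contributes."""
    totals = defaultdict(float)
    for transcript_id, tpm in tx_tpm.items():
        totals[tx_to_gene[transcript_id]] += tpm
    return dict(totals)


gene_tpm = gene_level_tpm(transcript_tpm, transcript_to_gene)
print(gene_tpm)
```

In this toy case GENE_A gets the combined abundance of both of its isoforms, so the gene-level value does not correspond to any one transcript.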

Hi, thank you so much for the clarification! I did find “OmicsExpressionTranscriptsExpectedCountProfile.csv”, which gives transcript-level RNA-seq values. Could I double-check what the values in this file are? I assume they are TPM values calculated by RSEM. Is that correct? Thanks!

Hi,

“OmicsExpressionTranscriptsExpectedCountProfile.csv” contains expected counts, not TPM. If you are looking for transcript-level TPM specifically, you might want to check out OmicsExpressionTranscriptsTPMLogp1Profile.csv.

Best,
Simone
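
For readers who need plain TPM from a "TPMLogp1" file, a minimal sketch of the conversion follows. It assumes the "Logp1" suffix means log2(TPM + 1); this thread does not state the base, so check the release notes for the file before relying on it. The function names are hypothetical.

```python
import math


def tpm_to_logp1(tpm):
    """Forward transform, assuming the files store log2(TPM + 1)."""
    return math.log2(tpm + 1.0)


def logp1_to_tpm(value):
    """Invert the assumed log2(TPM + 1) transform to recover TPM."""
    return 2.0 ** value - 1.0


# Round trip on a toy value: a TPM of 15 is stored as 4.0 under this assumption.
stored = tpm_to_logp1(15.0)
recovered = logp1_to_tpm(stored)
print(stored, recovered)
```

Note that this conversion applies only to the TPM files. Expected counts are a different quantity and cannot be turned into TPM this way.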