Pseudocount TPM Normalization Gene Expression

mavergara · April 8, 2021, 9:41pm

Dear all,
I have a question about the normalization applied to the RNAseq gene expression data. For gene expression, the download portal says: "Log2 transformed, using a pseudo-count of 1" (CCLE)

On the other hand, for the “Genomics of Drug Sensitivity in Cancer” (GDSC) database, accessed through the web tool “Orcestra” (ORCESTRA), the documentation says:

Gene TPM Values: After estimation by the tool detailed above, gene TPM values are transformed by log2(x + 0.001).

Can we therefore conclude that the only difference is that, for the RNAseq expression data, you add 1 before the log2 transform, whereas GDSC adds 0.001?

Thanks!

jnoorbak · April 9, 2021, 3:45pm

Hi, that is correct. The only difference, as you mentioned, is that the pseudocount is 1 in our data reports. The choice has no significant analytical meaning; it was made mainly for historical reasons and to reduce the dynamic range of expression while avoiding negative and/or infinite values. -thanks
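
For anyone who wants to put the two datasets on the same scale, here is a minimal sketch of the two transforms and of converting between them. It assumes both sources start from the same kind of TPM values; the function names and the example TPM numbers are made up for illustration and do not come from either portal.

```python
import math

def log2_pseudocount(tpm, pseudocount):
    """Return log2(tpm + pseudocount) for a single TPM value."""
    return math.log2(tpm + pseudocount)

def convert_pseudocount(logged, old_pc, new_pc):
    """Undo log2(x + old_pc), then re-apply log2(x + new_pc)."""
    tpm = 2.0 ** logged - old_pc
    tpm = max(tpm, 0.0)  # guard against tiny negative values from rounding
    return math.log2(tpm + new_pc)

# Hypothetical TPM values, for illustration only
tpm_values = [0.0, 0.5, 10.0, 1000.0]

# Pseudocount of 1, as described for the CCLE download
ccle_style = [log2_pseudocount(x, 1.0) for x in tpm_values]

# Pseudocount of 0.001, as described for the ORCESTRA GDSC data
gdsc_style = [log2_pseudocount(x, 0.001) for x in tpm_values]

# Convert the pseudocount-1 values to the pseudocount-0.001 scale
converted = [convert_pseudocount(v, 1.0, 0.001) for v in ccle_style]

for tpm, a, b in zip(tpm_values, ccle_style, gdsc_style):
    print(f"TPM={tpm:8.1f}  log2(x+1)={a:8.3f}  log2(x+0.001)={b:8.3f}")
```

With a pseudocount of 1, a TPM of 0 maps to exactly 0 and no value is negative. With 0.001, a TPM of 0 maps to about -9.97, so low-expression genes spread over a much wider range. For highly expressed genes the two transforms give nearly the same number.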
