Comparison between raw counts and log TPM values

Hello there,

I first downloaded the log TPM values and plotted them for a specific gene, and found that it is expressed. Then, because I need to use a different normalization method, I downloaded the raw counts. When I checked the raw counts for this exact same gene in the same cell line, the value was zero.

How can this be explained?

The files I downloaded are OmicsExpressionTPMLogp1HumanAllGenes.csv and OmicsExpressionRawReadCountHumanAllGenesStranded.csv.
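
For reference, this is roughly the kind of lookup I am doing (a minimal sketch; the gene column name and ModelID below are placeholders, not the actual ones I checked, and it assumes the usual DepMap layout of ModelID rows and "SYMBOL (EntrezID)" columns):

```python
import pandas as pd

# Placeholder identifiers; substitute the actual gene column and cell line of interest.
GENE = "TP53 (7157)"      # assumed "SYMBOL (EntrezID)" column naming
MODEL_ID = "ACH-000001"   # assumed DepMap ModelID used as the row index

# Load both DepMap expression matrices (rows = cell lines, columns = genes).
log_tpm = pd.read_csv("OmicsExpressionTPMLogp1HumanAllGenes.csv", index_col=0)
raw_counts = pd.read_csv("OmicsExpressionRawReadCountHumanAllGenesStranded.csv", index_col=0)

# Compare the two values for the same gene and cell line.
print("log TPM  :", log_tpm.loc[MODEL_ID, GENE])
print("raw count:", raw_counts.loc[MODEL_ID, GENE])
```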

Thank you in advance.

In my opinion, this is because the TPM expression data was calculated using Salmon 1.10.0 (please refer to the 25Q2 release notes), which employs a probabilistic model. However, since this explanation is based on the latest version of the DepMap data, it would help to clarify which release of the dataset you are using.

Thank you for your reply.

I am using the current release data (25Q2).
Do you mean the TPM values are calculated from raw data generated with the Salmon pipeline?

I checked the release notes for the current and previous releases, but I could not find a clear explanation of how the TPM values were generated. The current release notes only state how the read counts were generated, namely with STAR. Do you think the TPM values are generated from Salmon quantification?

Maybe, yes. It would be better to check the original paper.

Hi,

As of 25Q2, our TPM values are generated using Salmon. The discrepancy you are seeing is likely because Salmon allocates ambiguous (multi-mapping) reads using an EM algorithm, while the STAR raw counts ignore them, since they only include uniquely mapping reads.
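
To give a sense of the idea, here is a toy sketch (not Salmon's actual implementation, just a minimal EM over made-up read compatibilities, assuming equal effective lengths): a gene whose reads all multi-map to a paralog has zero uniquely mapping reads, yet an EM quantifier still distributes those ambiguous reads and assigns it a nonzero abundance.

```python
from collections import defaultdict

# Toy read-to-transcript compatibilities: 30 reads map ambiguously to
# GENE_X and its paralog; 50 reads map uniquely to an unrelated transcript.
reads = [["GENE_X", "PARALOG_Y"]] * 30 + [["OTHER"]] * 50
transcripts = ["GENE_X", "PARALOG_Y", "OTHER"]

# Unique-only counting (roughly what a uniquely-mapping-read gene count does):
unique = {t: sum(1 for r in reads if r == [t]) for t in transcripts}
print("unique counts:", unique)           # GENE_X gets 0

# EM: start from a uniform abundance estimate and iterate.
theta = {t: 1.0 / len(transcripts) for t in transcripts}
for _ in range(50):
    counts = defaultdict(float)
    for r in reads:
        z = sum(theta[t] for t in r)
        for t in r:                        # E-step: split each read by current theta
            counts[t] += theta[t] / z
    total = sum(counts.values())
    theta = {t: counts[t] / total for t in transcripts}   # M-step

em_counts = {t: round(theta[t] * len(reads), 1) for t in transcripts}
print("EM-allocated counts:", em_counts)   # GENE_X gets ~15
```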

We are happy to look into this further if you’d be willing to share some of the genes that show this kind of behavior.

Thanks!
Simone

Thank you for your reply!

No need; I just wanted to clarify the discrepancy between the two datasets.