I am working with the DepMap expression datasets, and I have a question regarding the log-transformed values in the OmicsExpressionProteinCodingGenesTPMLogp1.csv file.
According to the dataset documentation, the expression values are inferred using RSEM (unstranded mode) and reported after a log2 transformation with a pseudo-count of 1, i.e. log2(TPM + 1). However, I have noticed that some of the values in the file are negative, which I would not expect from this transformation.
Since TPM values are non-negative, TPM + 1 is always at least 1, so log2(TPM + 1) should never be below 0. A negative value would imply a negative TPM before transformation. Could you kindly clarify why negative values appear in the log-transformed expression data? Are there any additional steps or adjustments (for example, a correction or normalization applied before or after the transformation) that would explain this?
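For anyone who wants to reproduce the check, a minimal Python sketch follows. The function name `find_negative_entries` and the toy data are purely illustrative, and the assumed layout (samples as rows, genes as columns, first CSV column used as the row index) should be verified against the actual file.

```python
import numpy as np
import pandas as pd


def find_negative_entries(log_expr: pd.DataFrame) -> pd.DataFrame:
    """List every negative entry with the TPM it would imply.

    Assumes log_expr holds log2(TPM + 1) values, samples as rows and
    genes as columns (an assumption about the file layout).
    """
    values = log_expr.to_numpy(dtype=float)
    rows, cols = np.nonzero(values < 0)  # NaN compares as False, so it is skipped
    return pd.DataFrame({
        "sample": log_expr.index[rows],
        "gene": log_expr.columns[cols],
        "log2_tpm_plus_1": values[rows, cols],
        # Invert the transformation: TPM = 2**x - 1
        "implied_tpm": 2.0 ** values[rows, cols] - 1.0,
    })


# Toy data (made up): non-negative TPMs always give non-negative log2(TPM + 1).
tpm = pd.DataFrame(
    {"GENE_A": [0.0, 3.0], "GENE_B": [1.0, 0.5]},
    index=["sample_1", "sample_2"],
)
log_expr = np.log2(tpm + 1.0)
assert (log_expr >= 0).all().all()

# Inject one negative value to mimic what I see in the file.
log_expr.loc["sample_2", "GENE_A"] = -0.25
negatives = find_negative_entries(log_expr)
print(negatives)

# With the real file (layout assumed, not confirmed):
# log_expr = pd.read_csv("OmicsExpressionProteinCodingGenesTPMLogp1.csv", index_col=0)
# print(find_negative_entries(log_expr))
```

Any row this returns corresponds to an implied TPM below 0, which is the part I cannot reconcile with the documented transformation.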
I would greatly appreciate any insights you can provide on this matter.
Thank you for your time and assistance.