In the recent 25Q3 release, many new genes were added that only have data in ~50 cell lines. These new genes have corrupted the precomputed correlations feature in the data explorer: Pearson correlations for the existing genes, which have data across 1,100 cell lines, are ranked alongside those for the new genes, and the new genes get artifactually better scores because they have so much less data. For example, when I look at the profile for CNOT9, the top ~30-40 correlated genes have very sparse data across ~50 cell lines (e.g. NCAM1, SHOX, PCNX4) and the actual correlations are poor. I have to click on each one going down the list until I get to the first real correlation, with RNASEK.
New data for more genes is great, but this precomputed correlation feature should be revised to either correct the correlation scores for the number of cell lines behind them or remove the sparse-data genes from the analysis. It used to be a great feature and now it’s very broken.
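To illustrate the second option (dropping sparse-data genes), here is a rough sketch of a Pearson correlation that refuses to score a gene pair unless enough cell lines have data for both genes. This is only my guess at one possible fix, not how the data explorer actually computes things: the function name `pearson_min_overlap`, the use of `None` for missing values, and the `min_n=100` cutoff are all assumptions made up for the example.

```python
import math

def pearson_min_overlap(x, y, min_n=100):
    """Pearson r over cell lines where both genes have data.

    x, y  : equal-length lists of values, with None marking missing data
    min_n : hypothetical minimum number of shared cell lines (assumed cutoff)
    Returns None when the overlap is too small or either gene is constant.
    """
    # Keep only cell lines where both genes were measured.
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < min_n:
        return None  # too sparse to rank against well-covered genes

    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)
```

With something like this, a gene measured in only ~50 cell lines would return `None` and could be left out of the ranked list (or shown separately), so it would not be sorted ahead of genes with full coverage.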
Thanks