Precomputed correlations feature was ruined in the last release

In the recent 25Q3 release, many new genes were added that only appear in ~50 cell lines. These new genes have corrupted the precomputed correlations feature in the data explorer: Pearson correlations for the existing genes, which have data across 1,100 cell lines, are ranked alongside those for the new genes, and the new genes with less data get artifactually better scores. For example, when I look at the profile for CNOT9, the top ~30-40 correlated genes have very sparse data across ~50 cell lines (e.g. NCAM1, SHOX, PCNX4) and the correlations are poor. I have to click on each one going down the list until I get to the first real correlation, with RNASEK.

New data for more genes is great. But this precomputed correlation feature should be revised to either correct the correlation scores somehow or remove the sparse-data genes from the analysis. It used to be a great feature and now it’s very broken.
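To illustrate the second option, here is a minimal sketch on simulated data, not the portal's actual implementation. It only ranks a gene pair if the two genes share at least `min_overlap` cell lines. The gene names (`TARGET`, `PARTNER`, `SPARSE_i`), the 0.3 effect size, and the threshold of 100 are all made-up assumptions for the example.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def correlate(profile_a, profile_b, min_overlap):
    """Correlate two {cell_line: score} dicts over their shared cell lines.

    Returns None when fewer than min_overlap cell lines are shared.
    """
    shared = sorted(set(profile_a) & set(profile_b))
    if len(shared) < min_overlap:
        return None
    return pearson([profile_a[c] for c in shared],
                   [profile_b[c] for c in shared])


def top_correlated(target, profiles, min_overlap):
    """Rank all other genes by |r| against target, skipping sparse pairs."""
    hits = []
    for gene, profile in profiles.items():
        if gene == target:
            continue
        r = correlate(profiles[target], profile, min_overlap)
        if r is not None and not math.isnan(r):
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)


# Simulated data (hypothetical): one genuinely correlated gene measured in
# all 1,100 cell lines, plus 200 pure-noise genes measured in only 50 each.
rng = random.Random(0)
cell_lines = [f"line_{i}" for i in range(1100)]
target = {c: rng.gauss(0, 1) for c in cell_lines}
profiles = {
    "TARGET": target,
    "PARTNER": {c: 0.3 * target[c] + rng.gauss(0, 1) for c in cell_lines},
}
for i in range(200):
    sparse_lines = rng.sample(cell_lines, 50)
    profiles[f"SPARSE_{i}"] = {c: rng.gauss(0, 1) for c in sparse_lines}

unfiltered = top_correlated("TARGET", profiles, min_overlap=0)
filtered = top_correlated("TARGET", profiles, min_overlap=100)

print("unfiltered top 5:", unfiltered[:5])
print("filtered:", filtered)
```

With only 50 points, a pure-noise correlation has a standard error of roughly 0.14, so among hundreds of sparse genes a few will reach large |r| by chance and can outrank a modest but real correlation estimated from 1,100 points. Requiring a minimum overlap removes those pairs from the ranking entirely.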

Thanks

Hello, I believe you’re correct in your diagnosis of the problem. This morning we rolled out a change that filters the genes with coverage in only ~50 cell lines out of the correlation analysis.

(See also @CRISPR co-depency top hits obscured by newly added screens for more information on this change)

Thanks,

Phil