How should I interpret changes in top co-dependencies across different datasets?

Hello everyone,

TL;DR: I see different top co-dependencies for the gene CHEK1 between the 20Q2 and 24Q2 releases. One possible reason is that the 20Q2 version was calculated using CERES, whereas 24Q2 is calculated using Chronos. Another is that the underlying dataset itself changed between 20Q2 and 24Q2. Either way, what does it mean when the top co-dependencies of a gene change drastically? Thanks!

Full Post:
I hope you’re doing well. As my title says, how would one interpret changes in the top co-dependencies of a gene across different versions of the dataset (e.g. 20Q2 vs 23Q4)?

For a more concrete example, consider the gene CHEK1. Taking the top 5 positive co-dependencies of CHEK1 in the 20Q2 dataset, I compared where each of those genes ranks in the 24Q2 dataset:

  1. PSMD14 (from 1st to not observed)
  2. WEE1 (from 2nd to 20th)
  3. SMU1 (from 3rd to 22nd)
  4. PSMD7 (from 4th to not observed)
  5. SNRNP200 (from 5th to not observed)

As a reference, the top 5 positive co-dependencies of CHEK1 in the 24Q2 dataset are now:

  1. HSPA5
  2. ANAPC4
  3. THOC2
  4. YTHDC1
  5. AQR
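
The rank comparison above could be tabulated with something like the sketch below. It assumes the co-dependency lists from each release have been loaded as pandas Series mapping gene name to correlation with CHEK1. The function name `rank_shift` and the toy values are hypothetical and only illustrate the bookkeeping; they are not real DepMap numbers.

```python
import pandas as pd

def rank_shift(old_corr: pd.Series, new_corr: pd.Series, top_n: int = 5) -> pd.DataFrame:
    """For the top_n positive co-dependencies in old_corr, look up their rank in new_corr."""
    # Rank 1 = highest positive correlation; ties are broken by order of appearance.
    old_rank = old_corr.dropna().rank(ascending=False, method="first").astype(int)
    new_rank = new_corr.dropna().rank(ascending=False, method="first")
    top_genes = old_rank.sort_values().index[:top_n]
    return pd.DataFrame({
        "old_rank": old_rank.loc[top_genes],
        # NaN here means the gene is absent from the new list ("not observed").
        "new_rank": new_rank.reindex(top_genes),
    })

# Toy correlations (made-up values, not taken from either release).
old = pd.Series({"A": 0.9, "B": 0.8, "C": 0.7})
new = pd.Series({"B": 0.5, "C": 0.9, "D": 0.6})
table = rank_shift(old, new, top_n=2)
print(table)
```

A gene that shows NaN in `new_rank` may simply fall outside the truncated list shown on the portal rather than being missing from the release, so this is best run on the full correlation vector if it is available.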

Something I did observe is that the portal stopped using CERES scores around 2021. Consequently, the correlations in the 20Q2 and 24Q2 datasets are computed from differently modeled gene effect scores and might not be directly comparable. Would the new scoring system change the rankings so significantly?

To test this idea, I could look at the 24Q2 dataset calculated using CERES scores, but I can’t find such a file anywhere on the data download page.

Secondly, this could just be due to the change in the underlying data (e.g. more cell lines/genes are added in the 24Q2 set). However, should that change the order that significantly, too?
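
One way to separate the effect of added cell lines from the effect of the scoring change would be to recompute the co-dependencies on only the cell lines and genes shared by both releases. The sketch below assumes that co-dependencies are Pearson correlations between gene effect profiles, and that each release's gene effect matrix is loaded as a DataFrame with cell lines as rows and genes as columns. The helper names and the toy matrices are hypothetical.

```python
import pandas as pd

def restrict_to_shared(old: pd.DataFrame, new: pd.DataFrame):
    """Keep only the cell lines (rows) and genes (columns) present in both releases."""
    lines = old.index.intersection(new.index)
    genes = old.columns.intersection(new.columns)
    return old.loc[lines, genes], new.loc[lines, genes]

def codependencies(gene_effect: pd.DataFrame, gene: str) -> pd.Series:
    """Pearson correlation of every other gene's profile with `gene`, highest first."""
    corr = gene_effect.corrwith(gene_effect[gene])
    return corr.drop(gene).sort_values(ascending=False)

# Toy gene effect matrices (made-up values; rows = cell lines, columns = genes).
old = pd.DataFrame(
    {"G1": [1.0, 2.0, 3.0, 5.0], "G2": [2.0, 4.0, 6.1, 10.0], "G3": [-1.0, -2.0, -3.0, -5.5]},
    index=["a", "b", "c", "d"],
)
new = pd.DataFrame(
    {"G1": [2.1, 2.9, 5.2, 4.0], "G2": [4.0, 6.0, 9.8, 8.1], "G4": [0.3, -0.2, 0.1, 0.0]},
    index=["b", "c", "d", "e"],
)

old_shared, new_shared = restrict_to_shared(old, new)
old_top = codependencies(old_shared, "G1")
new_top = codependencies(new_shared, "G1")
print(old_top)
print(new_top)
```

If the rankings still differ substantially on the shared subset, the added cell lines alone are unlikely to explain the change, which would point toward the scoring model or other processing differences.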

A quick comparison of 23Q4 and 24Q2 (the two most recent sets) does show changes in the top co-dependencies. In this instance, however, the top co-dependencies are only reshuffled in order, rather than genes dropping out of the list entirely (as observed in 20Q2 vs 24Q2).

Any insights would be helpful here, and thank you so much for taking the time to read this.

Respectfully yours,
Andrew