First of all, thank you so much for this amazing resource.
I would be happy if someone clarified the difference between the GeCKO dataset and the other later CRISPR datasets. Does the major difference lie in the release date and number of cell lines?
I am trying to study the dependencies between the essentiality of some mRNAs and the copy numbers of some microRNAs of interest, in order to support the hypothesis that they interact. Strangely, the correlations with the microRNAs are much stronger in the CRISPR GeCKO dataset than in the latest release (CRISPR 21Q3). What does it mean if the significant effect I see in the GeCKO dataset is not significant in the CRISPR 21Q3 dataset? Can this effect still be true?
Thanks a lot in advance!
The most important difference with the GeCKO dataset is the library. The GeCKO library is far inferior to Avana and other second-generation CRISPR libraries. However, there are also many processing differences, as we have not revisited this data in a long time.
We are always interested in knowing about cases where we might have lost real biological signal when we updated our data processing, so these could be informative. However, I would first check whether the relationships you saw were mostly with common essential genes, as early datasets included a lot of artifactual signal in these genes.
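As a minimal sketch of that first check: given the genes whose dependency correlated with your microRNA copy number and a set of common essential genes, you can compute what fraction of your hits are common essential. The function name and the gene labels below are placeholders for illustration, and loading the actual gene lists is left out.

```python
def common_essential_fraction(hit_genes, common_essentials):
    """Return (fraction, overlap) for hits that are also common essential genes."""
    hits = set(hit_genes)
    if not hits:
        return 0.0, []
    overlap = sorted(hits & set(common_essentials))
    return len(overlap) / len(hits), overlap


# Placeholder labels only, not real results.
frac, overlap = common_essential_fraction(
    ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    {"GENE_A", "GENE_C", "GENE_X"},
)
print(frac, overlap)
```

If most of the hits fall in the overlap, the correlations seen in the older dataset deserve extra caution.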
I would also check if the relationships exist in the Achilles 21Q3 public dataset, the 20Q2 CERES Achilles dataset, the Project Score dataset, or the most recent RNAi dataset. If you don’t see the relationships in any of those, they’re probably a library artifact.
Feel free to email me if you would like to discuss details outside a public forum.