Hello, do you have any thoughts / plans for how one might calculate a “combined” dependency score that takes into account both the RNAi and the CRISPR dependency data?
Whether it’s helpful to combine CRISPR and RNAi gene effects depends on the use case. CRISPR knockout generally results in a much stronger effect on cell viability than RNAi partial suppression. This means that for a given gene, selectivity or variation in dependency across cell lines is usually observed with one of the perturbation types (CRISPR or RNAi), but not both. The exceptions where we see meaningful variation in both CRISPR and RNAi datasets are typically classic oncogenes or tumor suppressors, where the viability effect is large enough to detect using RNAi and occurs in a clear subset of cell lines harboring an activating or damaging mutation. Therefore, if you are interested in studying oncogene dependencies, we expect reasonable agreement between perturbation types and you might benefit from integrating CRISPR and RNAi data.
If you are interested in identifying synthetic lethality or dependencies associated with more complex cancer cell states, you might want to first determine which dataset is better at capturing biologically relevant variation for your gene of interest. For example, when you look at correlations between a gene dependency and other dependency data or CCLE gene-level features, it’s a good sign if the top correlations are enriched for genes with prior evidence of association (PPI, CORUM, MSigDB). It’s a bad sign if the top correlates are confounding variables, such as Cas9 activity, media type, or other screen quality metrics. You will usually find that one perturbation type is superior by these metrics, and that combining it with data from the other perturbation type only adds noise.
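As a rough illustration of the check described above, here is a minimal Python sketch that ranks features by their correlation with a gene's dependency profile and reports what fraction of the top correlates are in a set of known partners. Everything in it is an assumption made for illustration: the function names `top_correlates` and `fraction_known`, the simulated data, and the feature names (`PARTNER_A`, `SCREEN_QUALITY`, and so on) are hypothetical and do not come from any DepMap file or API. In practice the known-partner set would come from a resource such as PPI, CORUM, or MSigDB.

```python
import numpy as np
import pandas as pd


def top_correlates(profile: pd.Series, features: pd.DataFrame, k: int = 10) -> pd.Series:
    """Return the k features (columns) with the largest absolute Pearson
    correlation to `profile`, using only cell lines present in both inputs."""
    shared = profile.dropna().index.intersection(features.index)
    corr = features.loc[shared].corrwith(profile.loc[shared]).dropna()
    order = corr.abs().sort_values(ascending=False).index[:k]
    return corr.loc[order]


def fraction_known(top: pd.Series, known: set) -> float:
    """Fraction of the top correlates that appear in the known-partner set."""
    if len(top) == 0:
        return float("nan")
    return sum(name in known for name in top.index) / len(top)


# Simulated data only: 50 hypothetical cell lines, no real DepMap values.
rng = np.random.default_rng(0)
lines = [f"LINE_{i}" for i in range(50)]
crispr_profile = pd.Series(rng.normal(size=50), index=lines)

features = pd.DataFrame(
    {
        # Two features built to track the CRISPR profile (stand-ins for known partners).
        "PARTNER_A": crispr_profile + 0.1 * rng.normal(size=50),
        "PARTNER_B": -crispr_profile + 0.1 * rng.normal(size=50),
        # A stand-in for a confounder such as a screen quality metric.
        "SCREEN_QUALITY": rng.normal(size=50),
        "NOISE_1": rng.normal(size=50),
        "NOISE_2": rng.normal(size=50),
    },
    index=lines,
)
# The simulated RNAi profile is built to track the confounder instead.
rnai_profile = features["SCREEN_QUALITY"] + 0.1 * pd.Series(rng.normal(size=50), index=lines)

known_partners = {"PARTNER_A", "PARTNER_B"}  # hypothetical prior-evidence set

top_crispr = top_correlates(crispr_profile, features, k=2)
top_rnai = top_correlates(rnai_profile, features, k=2)
frac_crispr = fraction_known(top_crispr, known_partners)
frac_rnai = fraction_known(top_rnai, known_partners)
print("CRISPR top correlates:", list(top_crispr.index), "fraction known:", frac_crispr)
print("RNAi top correlates:", list(top_rnai.index), "fraction known:", frac_rnai)
```

In this toy setup the CRISPR profile would be the better choice for this gene, because its top correlates are the known partners while the RNAi profile is dominated by the confounder. A real analysis would likely use a proper enrichment test rather than a simple fraction.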
Unlike with datasets of the same perturbation type (DRIVE & Achilles RNAi, Project Score & Project Achilles CRISPR), we don’t necessarily expect the same cell response to different perturbation types (partial mRNA suppression vs. knockout). One might consider combining the data using per-gene weightings based on data confidence, but defining a confidence metric that applies to all use cases would be challenging. For example, CRISPR knockout shows that PRMT5 is a common essential gene. RNAi knockdown shows that there is variation in PRMT5 dependency and more clearly identifies that the most dependent cell lines have MTAP loss. They are both accurate. Whether you assign higher confidence to the CRISPR or RNAi data for PRMT5 depends on the question you’re asking.
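To make the per-gene weighting idea concrete, here is a minimal sketch in Python. It is not a DepMap method: the function name `combine_gene_effects`, the weights, and all numeric values are invented for illustration, and the sketch assumes the two matrices have already been put on comparable scales, which is itself a nontrivial step given how much stronger CRISPR effects tend to be. Each gene gets a weight between 0 and 1 for the CRISPR data, with the remainder going to RNAi.

```python
import numpy as np
import pandas as pd


def combine_gene_effects(
    crispr: pd.DataFrame, rnai: pd.DataFrame, crispr_weight: pd.Series
) -> pd.DataFrame:
    """Per-gene weighted average of two cell-line x gene matrices.

    crispr_weight[gene] is the weight given to CRISPR (clipped to [0, 1]);
    RNAi gets 1 - weight. Where only one dataset has a value, that value is used.
    """
    genes = crispr.columns.intersection(rnai.columns).intersection(crispr_weight.index)
    lines = crispr.index.intersection(rnai.index)
    c = crispr.loc[lines, genes]
    r = rnai.loc[lines, genes]
    w = crispr_weight.loc[genes].clip(0, 1)
    combined = c.mul(w, axis="columns") + r.mul(1 - w, axis="columns")
    # Fall back to whichever dataset has a measurement when the other is missing.
    return combined.fillna(c).fillna(r)


# Invented toy values, not real gene effect scores.
cell_lines = ["LINE_A", "LINE_B"]
crispr = pd.DataFrame({"PRMT5": [-1.2, -1.0], "KRAS": [-1.5, -0.1]}, index=cell_lines)
rnai = pd.DataFrame({"PRMT5": [-0.8, -0.2], "KRAS": [-0.9, np.nan]}, index=cell_lines)

# Hypothetical weights for a question about selective dependency:
# rely entirely on RNAi for PRMT5, weight both equally for KRAS.
weights = pd.Series({"PRMT5": 0.0, "KRAS": 0.5})

combined = combine_gene_effects(crispr, rnai, weights)
print(combined)
```

The weights here encode the question being asked rather than a universal confidence score. Someone asking whether PRMT5 is essential at all might reasonably set its CRISPR weight to 1 instead, which is why a single weighting scheme for all use cases is hard to define.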
Thank you for the extremely detailed answer, very much appreciated!