Technical questions regarding the utilization of two-group comparison in custom analysis section

Mbgventer · January 19, 2022, 11:30pm

Dear DepMap community,

I would like to ask some methodological/technical questions about the two-class comparison in the custom analysis section. Using pre-defined groups of cancer cell lines, I noticed the following “differences” in the results:

while using the CRISPR (DepMap 21Q4 Public, Chronos) dataset

and with the RNAi dataset:

1. Is the difference in the results (the top hits and the set of cell lines profiled) due to the different methodological pipelines used to score dependencies in each dataset?
2. What is the main difference between the above CRISPR dataset and CRISPR (DepMap 21Q4 Public+Score, Chronos)? Is it mainly the profiled cell lines and the batch-effect correction? And does the latter correspond to the CRISPR_gene_effect.csv file in the downloads section?
3. Regarding the statistical test: since the two-group comparison uses empirical-Bayes moderated effect size estimates, could one use the raw p-values instead of the q-values for hypothesis generation, even with small effect sizes? Including all available genes might result in a stricter multiple-testing correction.
4. Finally, when comparing Group 1 vs Group 2, would a positive effect estimate indicate higher essentiality of that gene in the second group?

Thank you in advance for your overall help and support

Best,

Efstathios

Mustafa_Kocak · January 31, 2022, 2:56pm

Hi Efstathios,

Below are my brief responses to the last two questions. I hope they help, but please feel free to reach out for any clarification. For the first two, I refer to @Joshua_Dempster and @mburger.

I think that is possible, but from a statistical perspective I would avoid it unless you have a good biological/scientific reason to do so. I personally suggest filtering relevant genes primarily on effect size and then using q-values as a secondary criterion.
A positive effect size implies that group one has a higher score than group two; in other words, cell lines in the second group are more vulnerable to the perturbation (either gene knock-out or knock-down).
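As a rough illustration of the two points above, here is a minimal Python sketch. It is not the portal's implementation: it uses a plain difference in means rather than the empirical-Bayes moderated estimate, it assumes the q-values are Benjamini-Hochberg adjusted p-values, and the function names and thresholds (`min_abs_effect`, `max_q`) are hypothetical.

```python
from statistics import mean


def effect_size(group1_scores, group2_scores):
    """Difference in mean gene effect, group 1 minus group 2.

    Assumed sign convention: a positive value means group 1 has the
    higher (less negative) score, so group 2 is the more dependent group.
    """
    return mean(group1_scores) - mean(group2_scores)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def select_hits(results, min_abs_effect=0.3, max_q=0.05):
    """Filter on effect size first, then on q-value as a secondary criterion.

    `results` is a list of dicts with keys "gene", "effect" and "p_value".
    The thresholds are illustrative placeholders, not recommended values.
    """
    q_values = benjamini_hochberg([r["p_value"] for r in results])
    hits = [
        {**r, "q_value": q}
        for r, q in zip(results, q_values)
        if abs(r["effect"]) >= min_abs_effect and q <= max_q
    ]
    return sorted(hits, key=lambda r: -abs(r["effect"]))
```

With gene effect scores where more negative means more dependent, `effect_size([-0.1, -0.2], [-1.0, -1.2])` is positive, which matches the reading that the second group is the more vulnerable one.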

Warmly,
Mustafa

Joshua_Dempster · February 1, 2022, 4:15pm

With regard to question 2, I think you are correct, if I understand you correctly. Public+Score is just the integrated dataset and corresponds to CRISPR_gene_effect.csv.

jkmak · February 2, 2022, 3:18pm

To address your first question, I would say that differences in results between the RNAi and CRISPR datasets most likely reflect the cell lines included (the RNAi dataset is a combination of several sub-genome libraries with many missing values) or the difference between gene suppression (RNAi) and gene knockout (CRISPR). However, you are correct in assuming that the computational data processing pipelines also differ (DEMETER2 for RNAi, Chronos for CRISPR).
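One way to check how much of the discrepancy comes from cell line coverage alone is to restrict both groups to lines that have a score for the gene in both datasets before comparing. The sketch below is only an illustration under assumptions: scores are held in plain dictionaries keyed by cell line, missing values are stored as `None`, and the helper names and cell line IDs are hypothetical.

```python
def lines_with_data(group, scores):
    """Cell lines in `group` that have a non-missing score for the gene."""
    return {line for line in group if scores.get(line) is not None}


def shared_comparison_groups(group1, group2, crispr_scores, rnai_scores):
    """Restrict both groups to cell lines scored in BOTH datasets.

    Re-running the comparison on these reduced groups removes coverage
    differences, leaving the perturbation type and the processing
    pipeline as the remaining explanations.
    """
    shared1 = lines_with_data(group1, crispr_scores) & lines_with_data(group1, rnai_scores)
    shared2 = lines_with_data(group2, crispr_scores) & lines_with_data(group2, rnai_scores)
    return sorted(shared1), sorted(shared2)


# Toy scores for one gene; LINE_* identifiers are made up.
crispr = {"LINE_A": -0.9, "LINE_B": -0.1, "LINE_C": -1.1, "LINE_D": 0.0}
rnai = {"LINE_A": -0.4, "LINE_B": None, "LINE_D": 0.1}

g1, g2 = shared_comparison_groups(
    {"LINE_A", "LINE_B"}, {"LINE_C", "LINE_D"}, crispr, rnai
)
```

In this toy case only `LINE_A` and `LINE_D` remain, which shows how quickly the effective group sizes can shrink once missing RNAi values are taken into account.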
