I am new to DepMap (and to bioinformatics in general) and would greatly appreciate some help.
My goal is to identify specific vulnerabilities of cell lines that have lost one copy of our gene of interest, and I have several questions on how to do this:
- First, how do we define whether a cell line is heterozygous for our gene of interest? I have seen two methods in papers: some people define heterozygosity based on a range of CNV values (for example, a CNV value between -0.3 and -0.7); others use the GISTIC algorithm to assign a discrete status (heterozygous loss/homozygous loss/normal/amplified). Does it make a difference, and which method do you think is best?
- My second question is which score to use. For the CRISPR screen there is a choice between the “CRISPRGeneEffect” and the “CRISPRGeneDependency” files. Which one would be best suited to our purpose of identifying differential vulnerabilities between two groups of cell lines? Which one corresponds to the “CRISPR (DepMap Public 23Q2+Score, Chronos)” dataset in the Data Explorer? And finally, for the RNAi screen there is only one file available, named “dependency”; does it follow the same logic as the “CRISPRGeneDependency” file?
- Last, how could we statistically compare our two groups of cell lines in terms of vulnerabilities? Our plan is to look at the difference in dependency value (or effect value, depending on which of these two files is the better choice) between the two groups. But from what I understand, these scores are normalised values (0 corresponding to non-essential genes and -1 to the average of essential genes), so I am not sure how to handle these data properly to get statistics. Our goal is to get a p-value (to know whether the difference is significant) and an effect size (to know how strong the difference is). Would it be appropriate to do that on these data and, if so, would you use parametric or non-parametric tests?
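
To make the last question concrete, here is a minimal Python sketch of the kind of comparison we have in mind. Everything in it is an assumption for illustration: the cell line names and values are made up, the -0.7 to -0.3 range is just the example threshold mentioned above, the helper names (`is_het_loss`, `cohens_d`, `permutation_p_value`) are hypothetical, and the permutation test is only one possible non-parametric option, not necessarily the right one.

```python
import random
import statistics


def is_het_loss(cn_value, low=-0.7, high=-0.3):
    """True if the CNV value falls in the assumed single-copy-loss range."""
    return low <= cn_value <= high


def cohens_d(a, b):
    """Effect size: difference in means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5


def permutation_p_value(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    na = len(a)
    observed = abs(sum(a) / na - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:na]) / na - sum(pooled[na:]) / (len(pooled) - na))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up toy data for ONE target gene: name -> (CNV value, gene effect score).
cell_lines = {
    "LINE_A": (-0.45, -0.90),
    "LINE_B": (-0.60, -0.80),
    "LINE_C": (-0.35, -1.00),
    "LINE_D": (-0.50, -0.70),
    "LINE_E": (0.02, -0.20),
    "LINE_F": (-0.05, -0.10),
    "LINE_G": (0.10, -0.30),
    "LINE_H": (0.00, 0.00),
    "LINE_I": (-0.10, -0.25),
}

# Split into "one copy lost" versus everything else (a simplification:
# a real analysis would probably exclude deep deletions and amplifications).
het = [eff for cn, eff in cell_lines.values() if is_het_loss(cn)]
other = [eff for cn, eff in cell_lines.values() if not is_het_loss(cn)]

mean_diff = statistics.mean(het) - statistics.mean(other)
d = cohens_d(het, other)
p = permutation_p_value(het, other)
print(f"mean difference = {mean_diff:.2f}, Cohen's d = {d:.2f}, p = {p:.4f}")
```

In practice we would repeat this for every gene in the screen, so I assume some multiple-testing correction would also be needed on the resulting p-values.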
Thank you so much in advance for your replies; they will really be a huge help!
All the best,