I just downloaded “RNAi_(Achilles+DRIVE+Marcotte,_DEMETER2).csv” from the DepMap portal via the “custom download” function, and found that it contains many NA values. What should I do with these NAs to get results similar to the online analysis in the DepMap portal? Should I manually remove all cell lines that contain an NA, or all genes that contain an NA? I also wonder whether the Pearson coefficient is suitable for finding co-dependencies, or whether I should use Spearman coefficients instead, since gene expression does not seem to follow a normal distribution.
I would be very grateful for any advice you could give me.
The code we use for generating co-dependencies is linked to in this post:
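We use pairwise complete observations when computing correlations: if a sample has an NA for a given gene, that sample is left out only when correlating that gene with other genes. It is still used for every gene pair where both values are present, so there is no need to drop whole cell lines or whole genes beforehand.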
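To make "pairwise complete observations" concrete, here is a minimal sketch in Python with pandas. It is not the DepMap code itself: the toy matrix, the gene and cell line names, the choice of Pearson, and the `min_periods` value are all assumptions for illustration. `DataFrame.corr` excludes NA values pair by pair, which is the same idea as pairwise complete observations.

```python
import numpy as np
import pandas as pd

# Toy dependency matrix (hypothetical values): rows are cell lines, columns are genes.
scores = pd.DataFrame(
    {
        "GENE_A": [0.1, -0.5, np.nan, -1.2, 0.3],
        "GENE_B": [0.2, -0.4, -0.9, np.nan, 0.4],
        "GENE_C": [-0.3, 0.6, 0.8, 1.0, -0.2],
    },
    index=["line1", "line2", "line3", "line4", "line5"],
)

# Pairwise complete: for each gene pair, use every cell line where BOTH genes
# have a value. min_periods=3 is an arbitrary illustrative threshold; pairs with
# fewer shared cell lines would come back as NaN.
pairwise = scores.corr(method="pearson", min_periods=3)


def pairwise_complete_pearson(x: pd.Series, y: pd.Series) -> float:
    """Pearson correlation over the samples where neither x nor y is NA."""
    keep = x.notna() & y.notna()
    return float(np.corrcoef(x[keep].to_numpy(), y[keep].to_numpy())[0, 1])


# The same calculation done by hand for one pair of genes.
manual_ab = pairwise_complete_pearson(scores["GENE_A"], scores["GENE_B"])

# For contrast: dropping every cell line that has any NA (listwise deletion)
# discards data that the pairwise approach keeps.
listwise = scores.dropna().corr(method="pearson")

print(pairwise.round(3))
print(listwise.round(3))
```

In this toy example, GENE_A and GENE_C share four cell lines under the pairwise approach but only three after dropping every row with an NA, so the two matrices give different values for that pair. Passing `method="spearman"` to `DataFrame.corr` gives a rank-based variant with the same pairwise NA handling, if you want to compare the two coefficients.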