Hi,
I downloaded the gene dependency scores calculated using DEMETER2 (D2_Achilles_gene_dep_scores.csv). There are many “NA” values in the matrix. I checked the NC paper but still don’t understand the reason. By the way, there are no “NA” values in the LogFoldChange matrix.
Looking forward to your reply!
Hi Yajing,
By far the dominant source of missing values in the DEMETER2 Achilles gene effect outputs is the nature of the Achilles RNAi data. Specifically, the dataset is constructed from 3 different batches of data that used two different shRNA libraries. One of the libraries has 98k hairpins, the other has 55k. The smaller library thus targets a smaller set of genes, so for cell lines screened with that library (216, I believe) the model outputs NA for genes that weren’t covered by the library; this lets it report all cell lines in a single matrix. Some other rules in the post-processing of the model output can also lead to NA values (e.g. genes where the model assigns low efficacies to the shRNAs, or genes that are part of ‘gene families’). Those are more edge cases, though.
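If you want to check this on your own copy of the file, here is a minimal sketch (not part of the DEMETER2 pipeline). It assumes the CSV has genes in rows and cell lines in columns, with the gene name in the first column, and that missing values are written as NA or left empty; please verify both against your download. The function names are made up for illustration.

```python
import csv

# Strings treated as missing; this set is an assumption about the file.
NA_TOKENS = {"", "NA", "NaN", "nan"}


def count_na_per_cell_line(rows):
    """Count missing scores per cell line.

    `rows` is an iterable of lists: the first row is the header
    (corner cell followed by cell line names), every later row is
    a gene name followed by one score per cell line.
    """
    rows = iter(rows)
    cell_lines = next(rows)[1:]
    counts = {name: 0 for name in cell_lines}
    for row in rows:
        for name, value in zip(cell_lines, row[1:]):
            if value.strip() in NA_TOKENS:
                counts[name] += 1
    return counts


def group_by_na_count(counts):
    """Group cell line names by how many genes are missing for them."""
    groups = {}
    for name, n_missing in counts.items():
        groups.setdefault(n_missing, []).append(name)
    return groups


def summarize_file(path):
    """Print how many cell lines share each NA count in the CSV at `path`."""
    with open(path, newline="") as fh:
        counts = count_na_per_cell_line(csv.reader(fh))
    for n_missing, names in sorted(group_by_na_count(counts).items()):
        print(f"{n_missing} missing genes: {len(names)} cell lines")
```

Calling `summarize_file("D2_Achilles_gene_dep_scores.csv")` should then show whether a large group of cell lines has a much higher NA count than the rest, which is what you would expect for the lines screened with the smaller library.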
Hope that clarifies things!
Hi,
Very clear! I got it! Thanks for your prompt reply!
YJ