NormLRT code availability

Hi,

I am looking for the code used to generate NormLRT scores as described in all recent DepMap publications. A post from July '20 says the code is “not publicly available”, but this contradicts the statements in publications that “code is available on request”, going back as far as the McDonald 2017 paper published 3.5 years ago.

Can you please upload the code somewhere? NormLRT is a necessary input for some of the seemingly helpful posts your team has published about analyzing the data (Assessing Confidence in Achilles Gene Profiles - Cancer Data Science Blog), and showing how to calculate it would take these tools from the theoretical to the practical.

Thank you!

After playing around with the MASS package in R, this is what I’ve got:

(dmf$`MYCN (4613)|GEscore` is a vector containing the gene effect scores for MYCN in all cell lines; the backticks are needed because the column name contains spaces and special characters.)

library(MASS)
2 * (logLik(fitdistr(dmf$`MYCN (4613)|GEscore`, "t")) - logLik(fitdistr(dmf$`MYCN (4613)|GEscore`, "normal")))

Do I have it right or am I missing something here?
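
In case it helps anyone following along, here is a minimal sketch of the same t-versus-normal comparison wrapped in a function and run on simulated data. The helper name norm_lrt_t and the simulated vector are made up for illustration, and this is not the DepMap portal's code; it only assumes that the statistic is twice the difference in maximized log-likelihoods, as in the one-liner above.

```r
library(MASS)

# Hypothetical helper: likelihood ratio statistic comparing a
# Student's t fit against a normal fit for one numeric vector.
norm_lrt_t <- function(x) {
  x <- x[is.finite(x)]  # drop NA/NaN/Inf, which fitdistr() rejects
  ll_t    <- as.numeric(logLik(fitdistr(x, "t")))
  ll_norm <- as.numeric(logLik(fitdistr(x, "normal")))
  2 * (ll_t - ll_norm)
}

# Simulated heavy-tailed scores standing in for a real gene effect column
set.seed(1)
heavy_tailed <- c(rt(500, df = 3), NA)

lrt <- norm_lrt_t(heavy_tailed)
print(lrt)
```

fitdistr() may print warnings while optimizing the t fit. A larger value means the t-distribution describes the data better than the normal does.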

So the above works with a t-distribution but not with a skewed t, which might be a better fit but is not available in the MASS package. I’m looking at the “sn” package now, but am having some trouble. Any advice?

Ok, using the sn and MASS packages in R:

library(sn)
library(MASS)
2 * (logLik(selm(dmf$`MYCN (4613)|GEscore` ~ 1, family = "ST")) - logLik(fitdistr(dmf$`MYCN (4613)|GEscore`, "normal")))

Also, this way you have a choice between skew-t and skew-normal ("ST" or "SN", respectively, for the family argument).
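
To run this over many genes at once, something like the following rough sketch might work. The helper norm_lrt_st, the toy data frame, and its column names are all invented for illustration; it assumes each column holds one gene's scores, and it returns NA for any column where a fit fails rather than stopping. It is not the DepMap portal's implementation.

```r
library(sn)
library(MASS)

# Hypothetical helper: skew-t vs normal likelihood ratio statistic
# for one numeric vector; returns NA if either fit throws an error.
norm_lrt_st <- function(x, family = "ST") {
  x <- x[is.finite(x)]  # drop missing or infinite values
  tryCatch({
    ll_skew <- as.numeric(logLik(selm(x ~ 1, family = family)))
    ll_norm <- as.numeric(logLik(fitdistr(x, "normal")))
    2 * (ll_skew - ll_norm)
  }, error = function(e) NA_real_)
}

# Toy data: one roughly normal column, one with a left tail of outliers
set.seed(1)
toy <- data.frame(
  `GENEA (1)` = rnorm(300),
  `GENEB (2)` = c(rnorm(270), rnorm(30, mean = -3)),
  check.names = FALSE  # keep the spaces and parentheses in the names
)

# One statistic per column, named by column
scores <- vapply(toy, norm_lrt_st, numeric(1))
print(scores)
```

Passing family = "SN" switches the comparison to skew-normal versus normal.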

I’ve posted the code that the DepMap portal uses to compute this on GitHub: “LRT calculation code used by the DepMap portal”.

Thanks,
Phil
