Some measure of # of selective genes

Hi, I’m trying to get a rough number of selectively dependent genes, analogous to the definition of “strongly selective” in the “Agreement between two large pan-cancer CRISPR-Cas9 gene dependency data sets” paper. I’m guessing one way to do this would be to use a lower threshold on NormLRT. Have you all done this? Can you share any rough estimates of the number of genes that are selective?

We have a definition of “strongly selective” that we use in the portal: a gene is called strongly selective when its skewed-LRT value is greater than 100.

If you want to see which genes are classified as “strongly selective” you can find that information in the table at https://depmap.org/portal/api/download/gene_dep_summary
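For a rough count from that table, a minimal sketch along these lines may help. It assumes the download is a CSV with columns named “Gene”, “Dataset”, and “Strongly Selective”; those column names, the helper name, and the sample rows are assumptions for illustration, so check them against the actual file header first.

```python
import csv
import io


def strongly_selective_genes(csv_text, flag_column="Strongly Selective",
                             gene_column="Gene", dataset_column="Dataset",
                             dataset=None):
    """Return the sorted gene names whose flag column is truthy.

    Column names are assumptions about the downloaded table, not confirmed.
    If `dataset` is given, only rows from that dataset are considered.
    """
    truthy = {"true", "1", "yes"}
    genes = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if dataset is not None and row.get(dataset_column) != dataset:
            continue
        flag = (row.get(flag_column) or "").strip().lower()
        if flag in truthy:
            genes.add(row[gene_column])
    return sorted(genes)


# Made-up rows standing in for the real download (hypothetical gene names).
sample = """Gene,Dataset,Strongly Selective
GENE_A,Dataset1,True
GENE_B,Dataset1,False
GENE_C,Dataset2,True
"""

selective = strongly_selective_genes(sample)
print(len(selective), selective)
```

Passing `dataset=...` restricts the count to one dataset, which matters if the same gene appears once per dataset in the table.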

Thanks,
Phil

Thank you! Feature request, could you share the skewed-LRT values for each gene as well / someday / somehow?

Going to answer my own question now that I’ve finally RTFM’d: in “Agreement between two large pan-cancer CRISPR-Cas9 gene dependency data sets”, supplementary 4 is an Excel sheet containing NormLRT values calculated for Broad and Sanger. Thank you!

Is the file downloadable at https://depmap.org/portal/api/download/gene_dep_summary kept up to date?

It’s worth noting that those values are out of date/incomplete.

You can find many examples of genes labeled as LRT >= 100 on the portal website that have lower values in that dataset. Some genes are missing entirely.

What I did was simply re-run the analysis myself on the most recent release, using the R code shared by @pmontgom. There’s still a problem I haven’t been able to pin down: errors/warnings from the try() call in LRT_test() aren’t being properly caught, which lets bad skewed-t fits on certain genes that are clearly not skewed (HSPA5 and a few RNA polymerase genes) return a value instead of NA. I was able to band-aid this with the “testit” R library and its has_warning() function, but it’s fairly computationally intensive.
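One possible explanation, not confirmed against the shared code: in R, try() only intercepts errors, so a fit that merely warns still returns its value. Treating warnings as failures needs a warning handler (in R, tryCatch(..., warning = ...)), and that runs the fit only once. Below is a minimal sketch of the same pattern in Python; `fit_or_none` and the two toy fit functions are hypothetical names for illustration, not part of the DepMap code.

```python
import warnings


def fit_or_none(fit, values):
    """Run fit(values) once; return None if it raises or emits any warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        try:
            result = fit(values)
        except Exception:
            return None  # a hard failure maps to "no value"
    # Any recorded warning marks the fit as unreliable.
    return None if caught else result


def noisy_fit(values):
    # Stand-in for a fit that "succeeds" but warns (e.g. non-convergence).
    warnings.warn("optimizer did not converge", RuntimeWarning)
    return 123.4


def clean_fit(values):
    # Stand-in for a well-behaved fit.
    return sum(values) / len(values)


print(fit_or_none(noisy_fit, [1.0, 2.0]))
print(fit_or_none(clean_fit, [1.0, 3.0]))
```

Because the warning is recorded during the single call, there is no need to evaluate the fit a second time just to test whether it warns, which may be where the extra cost of a separate has_warning() check comes from.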