Nomenclature of essential genes

I am trying to understand the “gene essentiality” classification and the best way to identify essential genes.

Genes with a probability of dependency > 0.5 in a given cell line are classified as essential for that cell line, and the others as non-essential.
For each gene, a normality likelihood ratio test (normLRT) is performed on its dependency scores across all cell lines. If a gene has a normLRT score > 100 (i.e. its distribution is closer to a Student distribution than to a Gaussian one, i.e. there are outliers), it is called a selective dependency (independently of the cell line).
Within a cell line, an essential gene is called strongly selective if it is a selective dependency, and common essential otherwise.
Am I right?
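
As a minimal sketch, the three rules above could be written as follows. The two thresholds come from the description above; the function name, gene names, cell line name, and numeric values are hypothetical and only illustrate the logic being asked about, not an official definition.

```python
# Thresholds as stated in the question; everything else is illustrative.
PROB_THRESHOLD = 0.5
NORMLRT_THRESHOLD = 100.0


def classify(prob_dependency, norm_lrt):
    """Label one (gene, cell line) pair using the rules described above."""
    # Rule 1: not essential in this cell line.
    if prob_dependency <= PROB_THRESHOLD:
        return "non-essential"
    # Rules 2 and 3: essential here, and the gene is a selective dependency.
    if norm_lrt > NORMLRT_THRESHOLD:
        return "strongly selective"
    # Essential here, but the gene's profile is not selective.
    return "common essential"


# Made-up (probability of dependency, normLRT) values for one cell line.
examples = {
    ("GENE_A", "LINE_1"): (0.92, 250.0),
    ("GENE_B", "LINE_1"): (0.88, 12.0),
    ("GENE_C", "LINE_1"): (0.10, 300.0),
}

labels = {key: classify(p, lrt) for key, (p, lrt) in examples.items()}
for key, label in labels.items():
    print(key, label)
```

Note that normLRT is a per-gene quantity, so in this sketch it is the same for a given gene in every cell line; only the probability of dependency changes from one cell line to another.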

In the Tsherniak paper (2017), as I understand it, they try to identify an equivalent of the strongly selective class, the differential dependency class, by selecting cell line-gene dependency effects that are smaller than mu - 6 * sigma (where mu and sigma are the mean and the standard deviation of the dependency effects across all cell lines for the gene considered).
What do you recommend using to identify these interesting genes within a cell line? Probability > 0.5 and normLRT > 100, or a deviation from the mean > 6 * sigma?
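
For comparison, the mu - 6 * sigma rule described above could be sketched like this for a single gene. The function name, cell line names, and effect values are hypothetical, and using the population standard deviation is an assumption (the sample standard deviation could be used instead); this is not taken from the paper's code.

```python
from statistics import mean, pstdev


def differential_dependencies(effects, n_sigma=6.0):
    """Return the cell lines whose effect for one gene is below mu - n_sigma * sigma.

    `effects` maps cell line name -> dependency effect for a single gene.
    mu and sigma are computed across all cell lines, outliers included.
    """
    values = list(effects.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    cutoff = mu - n_sigma * sigma
    return [line for line, effect in effects.items() if effect < cutoff]


# Made-up data: 100 cell lines with small effects and one strong outlier.
effects = {f"LINE_{i}": (0.05 if i % 2 == 0 else -0.05) for i in range(100)}
effects["LINE_OUTLIER"] = -2.0

hits = differential_dependencies(effects)
print(hits)
```

Because the outlier itself contributes to mu and sigma, a single cell line can only fall 6 standard deviations below the mean when the gene has been screened in enough cell lines (a few dozen at least), so this rule is only meaningful on a large panel.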

Thanks a lot!


I’m not familiar with a selective designation. Strongly selective means the gene’s profile has normLRT > 100. Either method you suggest for identifying interesting dependencies within a cell line would work, although I would not generally suggest trying to identify interesting genes within a single cell line.

Thanks for your answer.

However, I don’t understand what you mean here. Is it not appropriate to say that gene X is essential for cell line Y if its probability of dependency P_{X,Y} is greater than 0.5 (or another threshold)? Did I miss something?