I am trying to understand the “gene essentiality” classification and the best way to identify essential genes.
Genes (within a cell line) with a probability of dependency > 0.5 are classified as essential for that cell line, and the others as non-essential.
For each gene, a normality likelihood ratio test (normLRT) is performed on its dependency scores across all cell lines. If a gene has a normLRT score > 100 (i.e. its distribution is closer to a Student distribution than to a Gaussian one, meaning there are outliers), it is called a selective dependency (independently of the cell line).
Within a cell line, an essential gene is called strongly selective if it is a selective dependency, and common essential otherwise.
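To make my understanding concrete, here is a minimal sketch of the rule as I read it. The function name, the gene names and the values are all made up for illustration; only the two thresholds (0.5 and 100) come from the description above, and I am assuming they are applied exactly like this.

```python
# Thresholds as I understand them (my assumption, not confirmed).
PROB_THRESHOLD = 0.5       # probability of dependency, per gene and cell line
NORMLRT_THRESHOLD = 100.0  # normLRT score, per gene across all cell lines


def classify_gene(prob_dependency, norm_lrt):
    """Label one gene in one cell line following the rule described above."""
    if prob_dependency <= PROB_THRESHOLD:
        return "non-essential"
    if norm_lrt > NORMLRT_THRESHOLD:
        return "strongly selective"
    return "common essential"


# Hypothetical (probability of dependency, normLRT) pairs for one cell line.
examples = {
    "GENE_A": (0.92, 250.0),
    "GENE_B": (0.88, 12.0),
    "GENE_C": (0.10, 300.0),
}
labels = {gene: classify_gene(p, lrt) for gene, (p, lrt) in examples.items()}
```

Note that in this reading GENE_C has a high normLRT but is still non-essential in this cell line, because its probability of dependency is below 0.5.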
Am I right?
In the Tsherniak paper (2017), as I understand it, they try to identify an equivalent of the strongly selective class, the differential dependency class, by selecting cell line-gene dependency effects that are smaller than mu - 6 * sigma (where mu and sigma are the mean and the standard deviation of the dependency effects across all cell lines for the gene considered).
What do you recommend for identifying these interesting genes within a cell line? Probability > 0.5 and normLRT > 100, or a deviation from the mean of more than 6 * sigma?
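In code, I read the 6 * sigma rule roughly as follows. This is only a sketch of my interpretation: the function name and the data are invented, and I am assuming the population standard deviation, which may not be what the paper uses.

```python
from statistics import mean, pstdev


def differential_dependencies(effects, n_sigma=6.0):
    """Return the cell lines whose effect for one gene is below mu - n_sigma * sigma.

    `effects` maps cell line name -> dependency effect for a single gene.
    """
    values = list(effects.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (my assumption)
    cutoff = mu - n_sigma * sigma
    return [line for line, effect in effects.items() if effect < cutoff]


# Invented data: 100 cell lines with effects near zero, plus one strong outlier.
effects = {f"LINE_{i}": (0.05 if i % 2 == 0 else -0.05) for i in range(100)}
effects["LINE_OUTLIER"] = -1.5

hits = differential_dependencies(effects)
```

With this made-up data, only LINE_OUTLIER falls below the cutoff.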
Thanks a lot!