Is there a reference that explains why 0.5 and -1 are the threshold values for the probability of dependency and gene effect, respectively? What is the mathematical background to these values? I understand the processing steps that are taken, but I still cannot make sense of why these values are the thresholds.

We try to explain this under the Dependent Cell lines tile on the gene page, and so I’ve taken some quotes from that in my explanation below.

We describe the 0.5 threshold as:

A cell line is considered dependent if it has a probability of dependency greater than 0.5.

Note, this is a threshold on the measurement “Probability of Dependency”, which we define as:

Probabilities of dependency are calculated for each gene score in a cell line as the probability that score arises from the distribution of essential gene scores rather than nonessential gene scores. See here for details.

“Probability of dependency” essentially models each gene effect score as being sampled from one of two distributions: the one arising from “essential” genes or the one arising from “non-essential” genes. Being a probability, its values range from 0 to 1.

The reason for the 0.5 threshold is that we’re saying that if there is more than a 50% chance that this gene was sampled from the “essential” distribution and not from the “non-essential” distribution, we’ll call that gene “essential”.
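To make the 0.5 cutoff concrete, here is a minimal Python sketch of the idea, not the actual DepMap pipeline. It assumes each of the two distributions can be approximated by a normal distribution fitted to a handful of made-up scores, and that the prior weight of the essential component (`prior_essential`) is 0.5; the function names and numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def probability_of_dependency(score, essential_scores, nonessential_scores,
                              prior_essential=0.5):
    """Posterior probability that `score` came from the essential distribution.

    Assumption: both distributions are modelled as Gaussians, and
    `prior_essential` is a hypothetical mixing weight.
    """
    p_ess = normal_pdf(score, mean(essential_scores), stdev(essential_scores))
    p_non = normal_pdf(score, mean(nonessential_scores), stdev(nonessential_scores))
    weighted_ess = prior_essential * p_ess
    return weighted_ess / (weighted_ess + (1 - prior_essential) * p_non)


# Made-up example scores for the two reference gene sets.
essential = [-1.2, -0.9, -1.0, -1.1, -0.8]
nonessential = [0.1, -0.1, 0.0, 0.2, -0.2]

for score in (-1.0, -0.7, -0.3, 0.0):
    p = probability_of_dependency(score, essential, nonessential)
    print(f"score={score:+.1f}  probability={p:.3f}  dependent={p > 0.5}")
```

A probability above 0.5 simply means the essential distribution explains the score better than the non-essential one under these assumptions, which is why 0.5 is the natural place to draw the line.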

As far as what a gene effect of -1 represents: that value is not intended to be used as a threshold, but rather as a reference point for comparing gene scores. To put gene effect scores on a comparable range with one another, we’ve scaled the values so that 0 and -1 act as anchors. This normalization is described on the page as follows:

Outcome from DEMETER2 or CERES. A lower score means that a gene is more likely to be dependent in a given cell line. A score of 0 is equivalent to a gene that is not essential whereas a score of -1 corresponds to the median of all common essential genes.

Again, we’re using the distributions of essential and non-essential genes. We’re scaling gene effect such that 0 is placed at the median of the non-essential distribution and -1 is placed at the median of the essential distribution.
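As a rough illustration of that scaling, the sketch below shifts and rescales a set of raw scores so that the median of the non-essential genes lands at 0 and the median of the essential genes lands at -1. The function name, gene labels, and raw values are hypothetical, and the real pipeline may differ in its details.

```python
from statistics import median


def scale_gene_effect(raw_scores, nonessential_genes, essential_genes):
    """Linearly rescale scores so the non-essential median maps to 0
    and the essential median maps to -1.

    raw_scores: dict mapping gene name -> unscaled score (hypothetical input).
    """
    med_non = median(raw_scores[g] for g in nonessential_genes)
    med_ess = median(raw_scores[g] for g in essential_genes)
    span = med_non - med_ess  # distance between the two anchors
    return {gene: (score - med_non) / span for gene, score in raw_scores.items()}


# Made-up unscaled scores: A-C play the essential set, D-F the non-essential set.
raw = {"A": -2.0, "B": -1.6, "C": -1.8, "D": 0.2, "E": 0.4, "F": 0.3, "G": -0.75}
scaled = scale_gene_effect(raw, nonessential_genes=["D", "E", "F"],
                           essential_genes=["A", "B", "C"])

for gene, value in scaled.items():
    print(f"{gene}: {value:+.2f}")
```

After scaling, a gene such as the hypothetical "G" sits at -0.5, halfway between the two anchors, so -1 reads as "as strong as a typical common essential gene" rather than as a cutoff.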

