Hi,
I want to build a binary matrix A in which each row represents a gene, each column represents a cell line, and the (i, j)-th element represents the essentiality of the i-th gene in the j-th cell line (essential: 1, nonessential: 0).
Q1: Which original dataset should I use, CRISPRGeneEffect.csv or CRISPRGeneDependency.csv?
Q2: What cutoff value should I select? For example, if I choose -0.5 as the cutoff, then for the (i, j)-th element of the original dataset B, B_{ij} < -0.5 means A_{ij} = 1, and B_{ij} >= -0.5 means A_{ij} = 0.
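To make the fixed-cutoff rule concrete, here is a minimal sketch in plain Python of what I have in mind. The toy scores and the function name `binarize_fixed` are made up for illustration, and -0.5 is only the example cutoff from above, not a value I am sure is appropriate. B[i][j] is the score of gene i in cell line j.

```python
# Toy scores B[i][j]: gene i in cell line j (made-up values, not real data).
B = [
    [-1.2, -0.7, -0.1],
    [-0.4,  0.1, -0.6],
    [ 0.2, -0.5,  0.0],
]


def binarize_fixed(scores, cutoff=-0.5):
    """Return A with A[i][j] = 1 if scores[i][j] < cutoff, else 0."""
    return [[1 if value < cutoff else 0 for value in row] for row in scores]


A = binarize_fixed(B)
print(A)
```

With this rule a score exactly equal to the cutoff is labeled nonessential, because the comparison is strict.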
Another option is to label the top 20% of genes in each cell line as essential. In this case there is no single cutoff value, and the threshold differs for each cell line. Which option do you think is better?
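A minimal sketch of this per-cell-line rule, again in plain Python with made-up scores. The function name `binarize_top_fraction` is hypothetical, and I am assuming that a more negative score means more essential, as in the cutoff example above. If the scores were probabilities where higher means more essential, the ranking would need to be reversed.

```python
# Toy scores B[i][j]: gene i in cell line j (made-up values, 5 genes x 2 cell lines).
B = [
    [-1.5, -0.2],
    [-0.3, -0.9],
    [ 0.1, -0.1],
    [-0.8,  0.3],
    [ 0.0, -0.4],
]


def binarize_top_fraction(scores, fraction=0.2):
    """Label the `fraction` of genes with the lowest scores in each cell line as 1."""
    n_genes = len(scores)
    n_cells = len(scores[0])
    # Number of genes labeled essential per cell line (at least one).
    k = max(1, round(fraction * n_genes))
    result = [[0] * n_cells for _ in range(n_genes)]
    for j in range(n_cells):
        # Rank gene indices in this cell line from most to least negative score.
        ranked = sorted(range(n_genes), key=lambda i: scores[i][j])
        for i in ranked[:k]:
            result[i][j] = 1
    return result


A = binarize_top_fraction(B)
print(A)
```

In this toy example the implied threshold is -1.5 in the first cell line and -0.9 in the second, which is the sense in which the threshold differs per cell line. Every cell line also ends up with the same number of essential genes, regardless of how its scores are distributed.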
Best,
C