Hi, I was trying to obtain the gene effect score for each cell line for the gene RPS14 using the CRISPR_gene_effect.csv download. However, I see that the data for that gene is not included in the csv file, but there are values available on the portal. So I just wanted to flag it. Thank you.
Unfortunately, it sounds like you have stumbled into a very common source of confusion.
The heart of the issue:
- We have multiple versions of the CRISPR data. The portal goes through a prioritized list of datasets and shows the profile from the first one that contains the gene of interest.
- Our filenames are not consistent with how datasets are labeled in the portal.
In the case of RPS14, the profile shown comes from the dataset containing only CRISPR screens performed at the Broad:
Note the label at the top which reads “CRISPR (DepMap 22Q2 Public, Chronos)” and contrast this with what you’d see for BRAF:
This one is labeled “CRISPR (DepMap 22Q2 Public+Score, Chronos)”
These are two different datasets, stored in two different files in the downloads section.
There, the first example (RPS14) comes from the file “Achilles_gene_effect.csv” and the second example (BRAF) comes from “CRISPR_gene_effect.csv”.
(The readme in the data release explains the differences between these two datasets in more detail, but basically, CRISPR_gene_effect is a combination of data produced at the Broad and at Sanger, whereas Achilles_gene_effect is just data produced at the Broad.)
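If you want to reproduce the portal's fallback behavior on the downloaded files, something like the following rough sketch with pandas might help. It checks each file in order and returns the gene's column from the first one that has it. A few things here are my assumptions rather than a description of the portal's actual code: the priority order (combined file first, then Broad-only), that the first column of each file holds the cell line IDs, and that gene columns are named either with the bare symbol or as "SYMBOL (some ID)". The helper names are made up for illustration.

```python
import pandas as pd

# Filenames are the ones mentioned above; the order is an assumed priority
# (combined Broad+Sanger file first, Broad-only file as the fallback).
PRIORITY = ["CRISPR_gene_effect.csv", "Achilles_gene_effect.csv"]


def find_gene_column(columns, symbol):
    """Return the first column named `symbol` or starting with 'symbol ('.

    The 'SYMBOL (ID)' column naming is an assumption; adjust if your
    files use a different header format.
    """
    for col in columns:
        if col == symbol or col.startswith(symbol + " ("):
            return col
    return None


def first_dataset_with_gene(datasets, symbol):
    """Walk an ordered list of (label, DataFrame) pairs.

    Returns (label, Series of scores) for the first dataset that has the
    gene, or (None, None) if no dataset contains it.
    """
    for label, df in datasets:
        col = find_gene_column(df.columns, symbol)
        if col is not None:
            return label, df[col]
    return None, None


def load_datasets(paths=PRIORITY):
    """Read each CSV, assuming the first column is the cell line ID."""
    return [(path, pd.read_csv(path, index_col=0)) for path in paths]
```

With both files in the working directory, `first_dataset_with_gene(load_datasets(), "RPS14")` would tell you which file the RPS14 scores come from, and give you the scores indexed by cell line.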
I’ve seen a fair number of people ask questions like yours, so I think it’s important to change how the portal refers to datasets to avoid this confusion. That will take some time, though: you can expect it to improve in the future, but I don’t yet have a timeline for when we’ll make the change.
Thanks,
Phil