Get the graphs "Dependent Cell Lines", "Mutations" and "Enriched Lineages" of a gene

Hi,

I am trying to download the graphs “Dependent Cell Lines”, “Mutations” and “Enriched Lineages” of a gene (for instance from this page: https://depmap.org/portal/gene/PTGER4?tab=overview ).
I have found a way to download them using these Linux commands:

curl https://depmap.org/portal/tile/gene/essentiality/PTGER4 | jq -r '.html' > dependent_cell_lines.html
curl https://depmap.org/portal/tile/gene/mutations/PTGER4 | jq -r '.html' > mutations.html
curl https://depmap.org/portal/tile/gene/selectivity/PTGER4 | jq -r '.html' > enriched_lineages.html

Is there any problem with downloading the graphs this way? These URLs are not documented (I found them by inspecting the requests made when loading https://depmap.org/portal/gene/PTGER4?tab=overview ), so I would like to know whether there are any restrictions on using them. Also, will these URLs continue to work in the future?

The reason I want to download them is that there is no API to access the data directly.

Instead of downloading the graphs, I also considered downloading the data directly (DepMap Data Downloads). The problem then is knowing which files contain the source data.
About “Dependent Cell Lines”: do the CRISPR data come from the file “gene_dependency.csv” (release “Sanger CRISPR (CERES)”), and the RNAi data from the file “Achilles_gene_dependency.csv” (release “DepMap Public 20Q4”)?
About “Mutations”: do the data come from the file “CCLE_mutations.csv” (release “DepMap Public 20Q4”)?
About “Enriched Lineages”: do the data come from the file “Achilles_gene_effect.csv” (release “DepMap Public 20Q4”)? Also, how do I know whether the data come from CRISPR or from RNAi? Could you also tell me what kind of graph is shown here, and how to reproduce it from the data source?

Thank you very much

Hi,

Could the administrators or people in charge of this website reply to at least my first questions? If not through the forum, how can I contact them?

Is there any problem with downloading the graphs this way? These URLs are not documented (I found them by inspecting the requests made when loading https://depmap.org/portal/gene/PTGER4?tab=overview ), so I would like to know whether there are any restrictions on using them. Also, will these URLs continue to work in the future?

Thank you very much

Hello,

I’m sorry that I missed your earlier post somehow.

Here’s where we currently stand: scraping is okay as long as it doesn’t disrupt other users. We’ve had trouble in the past where the site was flooded with requests and unable to keep up. As a result, we now have a throttle in place that temporarily blocks requests when it detects a high rate coming from a single client.

If you want to get a few hundred genes, as long as you make the requests ~1/second, you shouldn’t have any problems with the throttle. However, if you want all genes, it’ll probably take a prohibitively long time to download them one by one.
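As a rough illustration of pacing requests at about one per second, here is a minimal Python sketch. The tile URLs and the "html" JSON field are the ones quoted in the first post; the function names, the `delay` parameter, and the injectable `fetch`/`sleep` arguments are hypothetical helpers added for illustration, and since the URLs are undocumented they may change.

```python
import json
import time
from urllib.request import urlopen

# Tile names taken from the URLs quoted earlier in this thread.
TILES = ("essentiality", "mutations", "selectivity")
BASE_URL = "https://depmap.org/portal/tile/gene"


def fetch_tile_html(tile, gene):
    """Fetch one tile and return its 'html' field (like `jq -r '.html'`)."""
    with urlopen(f"{BASE_URL}/{tile}/{gene}") as response:
        return json.load(response)["html"]


def download_tiles(genes, delay=1.0, fetch=fetch_tile_html, sleep=time.sleep):
    """Return {(gene, tile): html}, pausing `delay` seconds after each request."""
    results = {}
    for gene in genes:
        for tile in TILES:
            results[(gene, tile)] = fetch(tile, gene)
            sleep(delay)  # keep the rate near one request per second
    return results
```

Calling `download_tiles(["PTGER4"])` would issue three requests over roughly three seconds. The `fetch` and `sleep` arguments exist only so the loop can be exercised without touching the network.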

That being said, the ideal scenario would be that you should not have to scrape this data. Philosophically, we’d like to make everything the portal shows available for download. I realize there are some gaps today, so we’re exploring how to address that in general.

In your particular case, it sounds like you’re having trouble lining up the download files with the data shown on the tiles, so I’ll try to clarify:

About “Dependent Cell Lines”, do the CRISPR data come from the file “gene_dependency.csv” (release “Sanger CRISPR (CERES)”), and the RNAi data come from the file “Achilles_gene_dependency.csv” (release “DepMap Public 20Q4”) ?

In the tiles, when we refer to “CRISPR” we’re referring to the latest Achilles CRISPR data. The histogram shows the data from Achilles_gene_effect.csv. However, the count of the number of dependencies at the top of the tile comes from Achilles_gene_dependency.csv.
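To make the dependency count concrete, here is a minimal sketch of tallying dependent cell lines for one gene from a file shaped like Achilles_gene_dependency.csv. It rests on assumptions not stated in this thread: that rows are cell lines, that gene columns are labeled like "PTGER4 (5734)", and that a line counts as dependent when its probability exceeds 0.5. Check all three against the actual file and the tile before relying on the numbers. `count_dependent_lines` is a hypothetical helper name.

```python
import csv


def count_dependent_lines(csv_file, gene, threshold=0.5):
    """Return (dependent, total) cell-line counts for `gene`.

    Assumes one row per cell line and one column per gene, with column
    labels of the form "SYMBOL (EntrezID)". The 0.5 cutoff is an assumption.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    col = next(
        (i for i, name in enumerate(header)
         if name == gene or name.startswith(gene + " (")),
        None,
    )
    if col is None:
        raise KeyError(gene)
    dependent = total = 0
    for row in reader:
        value = row[col].strip()
        if value == "" or value.upper() == "NA":
            continue  # skip cell lines with no score for this gene
        total += 1
        if float(value) > threshold:
            dependent += 1
    return dependent, total
```

With the real file this could be called as `count_dependent_lines(open("Achilles_gene_dependency.csv", newline=""), "PTGER4")`. The same column lookup applied to Achilles_gene_effect.csv would give the values behind the histogram.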

About “Mutations”, do the data come from the file “CCLE_mutations.csv” (release “DepMap Public 20Q4”) ?

Yes, mutations refers to the latest CCLE_mutations file.

About “Enriched Lineages”, do the data come from the file “Achilles_gene_effect.csv” (release “DepMap Public 20Q4”) ?

Yes

Also, how do I know that the data come from CRISPR or from RNAi ?

There are two enrichment plots on the tile, one for Achilles CRISPR and one for RNAi. See the screenshot below, which I’ve labeled with filenames.

Also, could you please tell me what kind of graph is shown here ? How can it be reproduced from the data source ?

In the sample info file, we have disease categorizations for each cell line. The box-and-whisker plots show the distribution of the gene effect scores across cell lines. The first plot (in gray) shows the distribution for that gene across all lines. The black plots show the distributions for the subset of lines with the labeled disease type.

Enrichment is measured by a t-test; see the bottom of the tile for the threshold used to determine which disease types are included on the tile.
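As a sketch of what such a comparison could look like, the snippet below groups one gene's effect scores by disease type and computes a two-sample t statistic for each group. The thread does not say which t-test variant the portal uses or what each group is compared against, so the Welch form and the "lines in the group versus all other lines" comparison are assumptions, as are the function names. It returns only the statistic; a p-value and the tile's threshold would still need to be applied.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic (assumed variant); None if undefined."""
    spread = variance(a) / len(a) + variance(b) / len(b)
    if spread == 0:
        return None
    return (mean(a) - mean(b)) / sqrt(spread)


def lineage_t_statistics(gene_effect, lineage_of):
    """Map each disease type to the t statistic of its lines vs. all others.

    gene_effect: {cell_line_id: gene effect score} for a single gene
    lineage_of:  {cell_line_id: disease type} from the sample info file
    """
    groups = {}
    for line, score in gene_effect.items():
        groups.setdefault(lineage_of[line], []).append(score)
    stats = {}
    for lineage, inside in groups.items():
        outside = [s for line, s in gene_effect.items()
                   if lineage_of[line] != lineage]
        if len(inside) < 2 or len(outside) < 2:
            continue  # a sample variance needs at least two values
        t = welch_t(inside, outside)
        if t is not None:
            stats[lineage] = t
    return stats
```

A strongly negative statistic would mean the gene effect is lower (more dependent) in that disease type than in the remaining lines.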

Thank you very much.

Following up on this thread, I wonder if it is possible to get the information (for a single gene) displayed as
CRISPR (DepMap 21Q3 Public+Score, Chronos): 0/1032 (from the CEACAM5 DepMap Gene Summary page)
from an API?
I have gone through the “API - The Cancer Dependency Map at Sanger” page, but this data does not seem to be among the options.
Thanks,
MS

We don’t have a single API for downloading the content of all the things reported on the gene summary.

If there are specific things you are looking for, I might be able to point you to the source. (For example, the essentiality tile uses the data which can be fetched at https://depmap.org/portal/api/download/gene_dep_summary )
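For instance, pulling one gene's rows out of that download might look like the sketch below. The URL is the one given above, but the file's format is not described in this thread: the code assumes it is a CSV with a header row and matches the gene symbol against every field rather than guessing a column name. `fetch_summary_text` and `rows_for_gene` are hypothetical helper names.

```python
import csv
import io
from urllib.request import urlopen

SUMMARY_URL = "https://depmap.org/portal/api/download/gene_dep_summary"


def fetch_summary_text(url=SUMMARY_URL):
    """Download the summary file as text (assumes UTF-8 encoded CSV)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def rows_for_gene(csv_text, gene):
    """Return the rows (as dicts) in which any field equals `gene`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if gene in row.values()]
```

Usage would be `rows_for_gene(fetch_summary_text(), "CEACAM5")`, after which the returned dictionaries show which columns the file actually provides.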

Thanks,
Phil