Get the graphs “Dependent Cell Lines” and “Enriched Lineages” of a gene

Hi,

I am trying to recreate the graphs and text of the “Dependent Cell Lines” and “Enriched Lineages” tiles (for instance on the PIK3CA DepMap Gene Summary page) using only files downloaded from DepMap Data Downloads.

1) Dependent Cell Lines
a) About the text “CRISPR (DepMap 21Q3 Public+Score, Chronos): X/Y”
I can get the X value by counting the rows where the gene’s value is > 0.5 in CRISPR_gene_dependency.csv (DepMap Public 21Q3).
I can get the Y value by counting the rows in the same file.
Is this the correct way to do it?

b) About the text “RNAi (Achilles+DRIVE+Marcotte, DEMETER2): X/Y”
For X, I tried counting the columns where the value is > 0.5 in D2_combined_gene_dep_scores.csv (DEMETER2 Data v6), but all the values are below 0.5, so I get X=0 instead of 62 for PIK3CA.
Is my method wrong? Should I use another file?
I can get the Y value by counting the columns where the value is not NA in D2_combined_gene_dep_scores.csv (DEMETER2 Data v6).
Is this the correct way to do it?

c) About the graph on the top (dependency)
For CRISPR, I use the file CRISPR_gene_dependency.csv (DepMap Public 21Q3).
For RNAi, I use the file D2_combined_gene_dep_score_SDs.csv (DEMETER2 Data v6).
Is this the correct way to do it?

d) About the graph on the bottom (Gene effect)
For CRISPR, I use the file CRISPR_gene_effect.csv (DepMap Public 21Q3).
For RNAi, I use the file D2_combined_gene_dep_scores.csv (DEMETER2 Data v6).
Is this the correct way to do it?

2) Enriched Lineages
a) About CRISPR graph
I plan to use CRISPR_gene_effect.csv (DepMap Public 21Q3), but I still have problems recreating the graph:

  • I don’t know how to link the cell line ID (DepMap_ID) with its lineage. I know that sample_info.csv (DepMap Public 21Q3) can be used for this, but I don’t know which column to use for the lineage: lineage, lineage_subtype, lineage_sub_subtype or lineage_molecular_subtype. Could you tell me which column to choose, or is the information in another file?
  • I don’t know the p-value of each lineage for a specific gene, so I can’t filter for lineages with p-values < 0.0005. I know I can get this information from the Skin DepMap Context Summary page (“Dependencies enriched in Skin” section), for instance, but I would prefer to download it as a file if possible. Which file contains this information?

b) About RNAi graph
I plan to use D2_combined_gene_dep_scores.csv (DEMETER2 Data v6), but I still have problems recreating the graph:

  • I don’t know how to link the cell line ID (CCLE_Name) with its lineage. I know that sample_info.csv (DepMap Public 21Q3) can be used for this, but I don’t know which column to use for the lineage: lineage, lineage_subtype, lineage_sub_subtype or lineage_molecular_subtype. Could you tell me which column to choose, or is the information in another file?
  • I don’t know the p-value of each lineage for a specific gene, so I can’t filter for lineages with p-values < 0.0005. I know I can get this information from the Skin DepMap Context Summary page (“Dependencies enriched in Skin” section), for instance, but I would prefer to download it as a file if possible. Which file contains this information?

Thank you very much

Hello,

1a) Yes this should be correct
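
The counting described in 1a can be sketched as below. This is a minimal, standard-library-only illustration, not the portal's own code. It assumes the file has one row per cell line and one column per gene, and that gene columns are labelled like "PIK3CA (5290)"; the function name `count_dependent` and the tiny demo table are made up for the example.

```python
import csv
import io


def count_dependent(csv_file, gene_symbol, threshold=0.5):
    """Return (X, Y) for one gene from a cell-line-by-gene probability table.

    X = cell lines with probability > threshold, Y = cell lines with a value.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    # Assumption: gene columns are labelled "SYMBOL (EntrezID)".
    matches = [i for i, name in enumerate(header)
               if name.split(" ")[0] == gene_symbol]
    if len(matches) != 1:
        raise ValueError(
            f"expected one column for {gene_symbol}, found {len(matches)}"
        )
    col = matches[0]

    dependent = total = 0
    for row in reader:
        cell = row[col].strip()
        if cell in ("", "NA", "NaN"):  # skip missing values
            continue
        total += 1
        if float(cell) > threshold:
            dependent += 1
    return dependent, total


# Tiny made-up table standing in for CRISPR_gene_dependency.csv.
DEMO_CRISPR_CSV = (
    "DepMap_ID,PIK3CA (5290),TP53 (7157)\n"
    "ACH-000001,0.91,0.02\n"
    "ACH-000002,0.12,0.01\n"
    "ACH-000003,0.75,\n"
)
print(count_dependent(io.StringIO(DEMO_CRISPR_CSV), "PIK3CA"))  # (2, 3)
```

With the real download, the same function could be called on a file opened with `open("CRISPR_gene_dependency.csv", newline="")`. If the gene column has no missing values, Y equals the number of data rows in the file.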

1b) This is unfortunate file naming on our part. For DEMETER2, the “gene_dep_scores” files are analogous to “gene_effect” for CRISPR: they measure effect size, with more negative scores indicating stronger viability effects. We’ve computed analogous ‘dependency probabilities’ for hit-calling from D2 scores internally, but it looks like we haven’t made those downloadable yet. I’ll look into that; we should be able to make those files available.
You are also correct that the Y values should reflect only the non-NA values. If this is not the case it’s probably an error on our end that we’ll need to fix.
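
For the RNAi file, where each gene is a row and each cell line is a column, the non-NA count for Y can be sketched as below. Again this is only an illustration under assumptions: row labels are taken to look like "PIK3CA (5290)", missing values are taken to be empty, "NA" or "NaN", and the function name `gene_scores` and the demo table are made up. Because these scores are effect sizes, not probabilities, the > 0.5 threshold from 1a does not apply to them.

```python
import csv
import io


def gene_scores(csv_file, gene_symbol):
    """Return {cell_line: score} for one gene, skipping missing values."""
    reader = csv.reader(csv_file)
    cell_lines = next(reader)[1:]  # first header cell labels the gene column
    for row in reader:
        # Assumption: row labels look like "SYMBOL (EntrezID)".
        if row[0].split(" ")[0] == gene_symbol:
            return {
                name: float(value)
                for name, value in zip(cell_lines, row[1:])
                if value.strip() not in ("", "NA", "NaN")
            }
    raise KeyError(gene_symbol)


# Tiny made-up table standing in for D2_combined_gene_dep_scores.csv.
DEMO_RNAI_CSV = (
    ",A549_LUNG,HT29_LARGE_INTESTINE,A375_SKIN\n"
    "PIK3CA (5290),-0.8,NA,-0.1\n"
    "TP53 (7157),0.2,0.3,NA\n"
)
rnai_scores = gene_scores(io.StringIO(DEMO_RNAI_CSV), "PIK3CA")
print(len(rnai_scores))  # Y: number of cell lines with a score, here 2
```

The returned dictionary also holds the gene effect values themselves, keyed by cell line name, so it can feed a density plot as well as the Y count.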

1c) If by “graph on the top” you mean just the X/Y numbers that are printed, then yes. For the density plot, it’s the gene effect values.

1d) Yes this is correct.

2a) I believe we’re currently testing for enrichment across all these group levels. So some of the hits will be ‘top-level’ lineages (like “skin”) but others will be molecular subtypes, etc.
You can download all the enriched dependencies for a given context on the page you point to (there’s a button to download the data in the table). Perhaps you’re wondering about downloading all the contexts where a given gene dependency is enriched? If so I don’t believe we have that info yet but we could make it available as a downloadable file.

2b) I think this is the same question as 2a, so hopefully the info above will help with this. It should be the same, but using the D2_combined_gene_dep_scores file you point to.
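
The link between cell lines and lineage annotations asked about in 2a and 2b can be sketched as below. This is a hedged illustration: it groups per-cell-line scores by one chosen sample_info.csv column and can be repeated for each of the four lineage columns named in the question. The thread does not say which statistical test produces the p-values, so the sketch stops at grouping. The function name `group_by_annotation`, the demo rows and the demo scores are made up.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def group_by_annotation(scores, sample_info_file, id_column, group_column):
    """Group per-cell-line scores by one sample_info column.

    scores: {cell line id: score}
    id_column: "DepMap_ID" (CRISPR) or "CCLE_Name" (RNAi)
    group_column: e.g. "lineage" or "lineage_subtype"
    """
    groups = defaultdict(list)
    for record in csv.DictReader(sample_info_file):
        cell_id = record[id_column]
        label = record[group_column]
        if cell_id in scores and label:  # skip lines without an annotation
            groups[label].append(scores[cell_id])
    return dict(groups)


# Tiny made-up table standing in for sample_info.csv.
DEMO_SAMPLE_INFO = (
    "DepMap_ID,CCLE_Name,lineage,lineage_subtype\n"
    "ACH-000001,LINE1_SKIN,skin,melanoma\n"
    "ACH-000002,LINE2_SKIN,skin,melanoma\n"
    "ACH-000003,LINE3_LUNG,lung,NSCLC\n"
)
demo_scores = {"ACH-000001": -1.0, "ACH-000002": -0.5, "ACH-000003": 0.0}
by_lineage = group_by_annotation(
    demo_scores, io.StringIO(DEMO_SAMPLE_INFO), "DepMap_ID", "lineage"
)
for lineage, values in by_lineage.items():
    print(lineage, len(values), mean(values))
```

Passing `"CCLE_Name"` as `id_column` gives the same grouping for RNAi scores keyed by cell line name.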

Thanks for your questions, and hope that helps!

Thank you very much for your answer.

About 1b), I meant D2_combined_gene_dep_score_SDs.csv, not D2_combined_gene_dep_scores.csv. I think the gene dependency file already exists as D2_combined_gene_dep_score_SDs.csv, so could you please recheck my corrected question below?
For X, I tried counting the columns where the value is > 0.5 in D2_combined_gene_dep_score_SDs.csv (DEMETER2 Data v6), but all the values are below 0.5, so I get X=0 instead of 62 for PIK3CA.
Is my method wrong? Should I use another file?

About 1c), this means that the density plot and the plot on the bottom use the same file (CRISPR_gene_effect.csv for CRISPR, D2_combined_gene_dep_scores.csv for RNAi), am I right?

About 2a), I would prefer to download all the data if possible. Could you let me know when it will be available as a downloadable file?

Thank you very much for this.

D2_combined_gene_dep_score_SDs actually contains uncertainty estimates (posterior standard deviations) for the gene scores, so this would not be the right file to use. It looks like the calculation of dependency probabilities for RNAi data is still internal and we don’t yet provide those files, but we should be able to do that soon.

Yes, the density plot and ‘rug plots’ use those files I believe.

I don’t have a good sense of when we’ll have this as a downloadable file, but I’d guess we might have it by the 22Q1 release.

Hope this helps

Hi,

In this case I’ll wait until the next release.

Thank you very much.