GDSC or CTD2 data in Data Explorer?

Hi

As always, great to have DepMap and the Data Explorer tool! Question - in Data Explorer it seems I’m able to view drug screen data just for the PRISM screens (which are great) but not GDSC or CTD2. Is that intended or an accident?

Cheers,
Dave

It’s certainly not intentional.

I was fairly confident that GDSC and CTD2 data are available in Data Explorer, but when I went to look, I was a little surprised by what I found. The short answer is that the data is technically there, but the UI is not currently working as originally intended.

Compound datasets behave a little differently in the UI than all other types of data in Data Explorer. The data is indexed by “sample IDs”, but people typically aren’t going to know the ID of a particular sample; instead they want to find things given the name of some compound. Also, users may not know which drugs were screened in which datasets. Lastly, to make things more complicated, the same compound was sometimes screened multiple times within a single dataset. This created a few edge cases for us, so we added some special cases in the UI to gloss over these details.

Anyway, the way it’s supposed to work is that if you choose “Compound” as the feature, you can enter a compound name, and it shows you the list of all samples for that compound across all datasets.
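
To make that intended behavior concrete, here is a minimal sketch in Python. It is not the actual Data Explorer code: the sample IDs, compound names, and function names are all made up for illustration, and it only assumes what is described above, namely that each dataset is indexed by sample ID and that one compound can map to several samples, even within a single dataset.

```python
from collections import defaultdict

# Hypothetical records: (dataset, sample_id, compound_name).
# These IDs and names are invented for illustration only.
SAMPLES = [
    ("Repurposing", "REP-0001", "compound A"),
    ("GDSC1", "GDSC1-0042", "compound A"),
    ("CTD2", "CTD2-0007", "compound A"),
    ("CTD2", "CTD2-0008", "compound A"),  # same compound screened twice in one dataset
    ("GDSC1", "GDSC1-0099", "compound B"),
]


def build_compound_index(samples):
    """Map a case-insensitive compound name to every (dataset, sample_id) pair."""
    index = defaultdict(list)
    for dataset, sample_id, compound in samples:
        index[compound.casefold()].append((dataset, sample_id))
    return index


def find_samples(index, compound_name):
    """Return all samples for a compound across all datasets (empty list if none)."""
    return list(index.get(compound_name.casefold(), []))


if __name__ == "__main__":
    index = build_compound_index(SAMPLES)
    for dataset, sample_id in find_samples(index, "Compound A"):
        print(dataset, sample_id)
```

The point of the sketch is only that a name-based lookup has to fan out to several sample IDs, which is the detail the “Compound” feature is meant to hide.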

However, today it appears that it’s only showing the option for repurposing, which is probably why you think the CTD2 and GDSC data are not available. (This is a bug, and I’m not sure when it was introduced.)

However, instead of using this “special case” in Data Explorer, you can still view CTD2 and GDSC data by selecting “Ctd2 Sample” or “Gdsc1 Sample” under “Feature”.

With “Ctd2 Sample” selected, for example, this does not use the special-case logic that attempts to search across all drug screens, but instead searches for CTD2 sample IDs specifically (and therefore bypasses the problem with the special case feature = “Compound”). You can then type the name of any compound in the CTD2 dataset and it should come up in the autocomplete.

That being said, what I described above is a bit of a workaround and it’s clear that we have a bug that we should do something about.

Somewhat related: we’re planning to change our general handling of compound data in an upcoming release, which should hopefully give us the best of both worlds: compounds will no longer be a special case in the UI, but you will still be able to find compounds across all datasets. So you can expect this UI flow for compounds to change a bit in the not-too-distant future.

Thanks for letting us know about this issue.

Thanks,
Phil

Thank you for the detailed response, including a way for us to get at the data - very much appreciated! We’ll use that in the short term and are happy to try out the new version when it’s ready.

Edit: PS it worked, thank you! I was quickly able to generate the plots I was interested in!