Will the DepMap portal code become open source?

Is there any plan to make the DepMap portal open sourced? Similarly useful portals, such as the cBioPortal, have such an offering.

I'll say we don't have concrete plans as yet. We're interested in ultimately doing so, but we're trying to figure out how to balance that with other short term goals.

There's also a wide range of definitions for what it means for the portal to be 'open sourced', each with different costs and pros and cons.

One key challenge, however, will be rethinking how we handle data updates. Since we're committed to releasing data quarterly, we'll need to create a mechanism that makes it easy for people to import new releases of data, which may make it more difficult to add new data types or change which data we track in the future.

So the short answer is "yes, we'd like to", but it's going to take a fair amount of work for us to get there, and we need to map out a path before we can commit or get a sense of a timeline.

It'd be helpful to hear from the community: if the portal were open sourced, what would that enable you to do that is difficult or impossible to do today?

Thanks for detailing the potential difficulties, especially around data updates. I wonder whether the community could build some consensus on defining an extensible schema for these updates. A lot of users of the DepMap data have their own in-house systems to ingest each update, so it could be that the userbase could help identify potential pain points.

The biggest use case within my company of having the DepMap portal be open-sourced would be to have an in-house instance which we could extend to include analyses that are not performed as part of the DepMap data. There are some very common questions we are asked by our non-computational collaborators which we could offer immediate answers to alongside the provided data already in the public portal, without having to burden the DepMap development team with answering every possible question we might come up with.

Thanks @pmontgom

As my colleague @sean mentioned, the DepMap portal allows for a large fraction of what we want to do, but not everything. The most common case is that we have some additional data across cell lines that we'd like to explore in the same style (including all the cross-correlations). So the solution would seem to be to build up something very similar to the DepMap portal (and in fact we did exactly this in the early days) but we'd love to avoid repeating 90% of the fantastic work that you have done.

Sometimes the data we want to add is not even private data; it's CCLE datasets that haven't been added to the portal (e.g. metabolomics) or aren't available as all-by-all correlations. Or it's metrics that have been computed from the public data (e.g. group mutations in gene A or gene B together).
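
To make the "gene A or gene B" case concrete, here is a minimal sketch of that kind of derived metric. The cell line IDs, gene names, and the dictionary layout are all placeholders invented for illustration; this is not the DepMap file format or anyone's actual pipeline.

```python
# Hypothetical mutation calls: cell line -> set of mutated genes.
# Identifiers and gene names are placeholders, not real DepMap data.
mutations = {
    "LINE_1": {"GENE_A"},
    "LINE_2": {"GENE_B", "GENE_C"},
    "LINE_3": {"GENE_C"},
    "LINE_4": set(),
}


def grouped_mutation(mutations, genes):
    """Return cell line -> True if at least one gene in `genes` is mutated."""
    wanted = set(genes)
    return {line: bool(called & wanted) for line, called in mutations.items()}


# One combined "A or B" feature per cell line.
a_or_b = grouped_mutation(mutations, ["GENE_A", "GENE_B"])
```

The resulting boolean feature could then be explored against other per-cell-line data in the same way as any single-gene mutation column.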

Additional data is the most common case, but additional analysis modes are also something we've explored. For instance, while Pearson correlation is certainly our go-to metric for exploring trends, we've found that non-linear models or small multivariate models can also be quite helpful in some cases.
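
As a rough illustration of why a second analysis mode can help, the sketch below compares Pearson correlation with Spearman rank correlation on a monotonic but non-linear relationship. Spearman is swapped in here only as a simple stand-in; it is not one of the non-linear or multivariate models mentioned above, and the data is synthetic.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Synthetic example: y grows exponentially with x.
x = [1, 2, 3, 4, 5, 6]
y = [math.exp(v) for v in x]

linear_score = pearson(x, y)
rank_score = spearman(x, y)
```

On this toy data the rank-based score treats the relationship as perfectly monotonic, while the Pearson score is lower because the trend is not a straight line.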

So I think there's a world where the public portal actually does 99% of what we want to do, and if there are ways we can help move that forward, we'd be happy to contribute. From where we are now, it seems to make sense to develop our own portal, and so building off what you have already built seemed like a potentially good intermediate solution.

I see that a lot of the functionality we mentioned (and a lot more) is available in the new release! Thanks guys, looks great.
