I’m curious if there are any plans to generate/release whole genome sequencing for additional CCLE cell lines (there are ~300 currently available as described in the 2019 paper).
Sorry if general CCLE questions (not DepMap per se) are out of scope for this forum!
We are considering the possibility of a transition from WES to WGS in DepMap. Please stay tuned.
Can I ask what the plan is for releasing the additional WGS? I saw an announcement in November that DepMap will be transitioning to WGS for cell line characterization. Will new data be added to the existing SRA record PRJNA523380 as it is generated, or will there be a single release as part of a publication? Are there any details you can provide about the cadence with which new data will become available?
We released several WGS in 21Q1, and moving forward there will be more WGS instead of WES. So far we haven’t had the bandwidth to provide derivatives of this data such as SVs, noncoding mutations, etc., but we plan to generate a subset of WGS derivatives in the future.
DepMap is in the process of submitting samples to dbGaP for regular upload to SRA (or possibly GDC), but the process is quite lengthy. At the moment we therefore do not have a projection for when the WGS BAM files will be available through dbGaP. We will update you once we have a more accurate timeline.
Thank you @jnoorbak, that is exciting to hear. When you say that WGS was released in 21Q1, what does that mean? What can the community access or download now that the WGS is released?
From what I can tell there are ~600-700 lines for which there is no WGS data. Given it will take time to sequence them, and more time for the dbGaP/SRA upload, do you have a rough idea of how long until files are available through dbGaP? Is 1 year a good upper bound, or will it take multiple years until all the lines have WGS?
The copy number and exonic mutations are derived from WGS. Unfortunately, at the moment users will not notice much of a difference, but hopefully we will soon be able to release more data.
My guess is that releasing all of those BAM files would take multiple years, as going back to old cell lines would take a lot of time. I should also add that we have not made a definitive decision to re-sequence ‘all’ the old lines using WGS, but this may become our de facto policy depending on the feedback we receive from the user community.
Thank you @jnoorbak. There are a handful of lines we are particularly interested in, and we are trying to decide whether to do the WGS ourselves or to wait in case it becomes available soon through the DepMap project. Your insight into the timing and scope of the re-sequencing is very useful.