Announcing the 23Q2 Release

We are excited to share new CRISPR and omics data, a Chronos update and a new Target Discovery app in this 23Q2 release! You’ll also notice an update to our cell line page. We hope this new format will allow users to more easily navigate the information presented for each cell line.

In this release, you will likely notice some changes in the data, because the integrated CRISPR data is now produced by a joint Chronos 2.0 run. This change reduces the number of false positives in the integrated dataset by about 5%. Please read the full announcement for more information.

Additionally, we are aware that there is an artifact in the CRISPR data which causes background correlation in dependency between genes located on the same chromosome arm. To account for this, we’ve aligned the mean gene effect of each chromosome arm to be the same in every cell line following the original copy number correction. Overall we see an improvement in data quality, as well as a reduction in clustering by chromosome arm in UMAP embeddings.
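To make the chromosome arm alignment concrete, here is a minimal sketch of the idea, not the actual pipeline code. The function name, the nested-dictionary input layout, and the choice of target value (the average of the per-line arm means) are all assumptions made for illustration; the announcement only states that each arm's mean gene effect is made the same in every cell line.

```python
from statistics import mean

def align_arm_means(gene_effect, gene_to_arm):
    """Shift gene effects so each chromosome arm has the same mean in every cell line.

    gene_effect: {cell_line: {gene: effect}} (hypothetical layout)
    gene_to_arm: {gene: arm label}; genes without a label are left unchanged.
    """
    # Mean gene effect of each arm within each cell line.
    arm_means = {}
    for line, effects in gene_effect.items():
        by_arm = {}
        for gene, value in effects.items():
            arm = gene_to_arm.get(gene)
            if arm is not None:
                by_arm.setdefault(arm, []).append(value)
        arm_means[line] = {arm: mean(vals) for arm, vals in by_arm.items()}

    # Common target per arm: average of the per-line arm means (an assumption).
    arms = {arm for means in arm_means.values() for arm in means}
    target = {
        arm: mean(m[arm] for m in arm_means.values() if arm in m)
        for arm in arms
    }

    # Subtract each line's own arm mean and add back the shared target.
    aligned = {}
    for line, effects in gene_effect.items():
        aligned[line] = {}
        for gene, value in effects.items():
            arm = gene_to_arm.get(gene)
            if arm is None:
                aligned[line][gene] = value
            else:
                aligned[line][gene] = value - arm_means[line][arm] + target[arm]
    return aligned
```

After this shift, differences between genes on the same arm within a cell line are preserved; only the arm-wide offset that differs between cell lines is removed.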

Metadata
DepMap has added new patient and model metadata:

Patient metadata

  • Age category - Adult, Pediatric, Fetus or Unknown will now be listed for all models.
  • Treatment status - indicates whether a research subject was given a clinical therapeutic agent and when, relative to tissue collection.

Model metadata

  • Plate coating - indicates whether a model was grown on a coated plate and, if so, what the coating material is.

DepMap has added a new table in this release that references media formulations. All DepMap models with known media formulations are now annotated using a Media ID in the format MF-XXX-XXX. Media formulations are annotated at the model and model condition levels as a model may be grown in a different condition than it was received in.

Annotations for some normal lines have been updated where the cell model represents a non-cancerous syndrome.

The portal now shows the vendor and catalog number, or other information about where a cell model was derived from, when this information is available.

In response to feedback, we have added back the legacy sub-subtype and legacy molecular feature fields in Data Explorer so you can categorize models more granularly. We are continually working to improve our annotations and will be updating these fields in future releases.

CRISPR Screens

The integrated CRISPR data (files beginning with “CRISPR”) is now being produced by a joint Chronos run rather than being integrated post hoc via Harmonia. This reduces false positives in the data by about 5%. Please be aware when using this data that this will produce significant changes for some genes.

We are preparing a more detailed description of the changes to Chronos since its publication and the specific changes to the data caused by using Chronos to integrate the data. More information will be shared soon.

In the previous release, we switched our NNMD calculation to be (median(essentials) - median(nonessentials)) / MAD(nonessentials). We’ve now updated several other QC metrics to parallel this NNMD calculation, including: ScreenMedianEssentialDepletion, ScreenMedianNonessentialDepletion, ScreenMADEssentials, and ScreenMADNonessentials.
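As a rough illustration of the NNMD calculation, the sketch below takes the difference of the two medians and divides it by the MAD of the nonessentials. The function names and the input format (plain lists of depletion values for the essential and nonessential control sets) are assumptions, and the MAD here is the unscaled median absolute deviation; the pipeline may apply a scaling constant.

```python
from statistics import median

def mad(values):
    """Unscaled median absolute deviation of a list of numbers."""
    center = median(values)
    return median(abs(v - center) for v in values)

def nnmd(essentials, nonessentials):
    """Difference of medians, normalized by the spread of the nonessentials."""
    return (median(essentials) - median(nonessentials)) / mad(nonessentials)

# Toy values, not real screen data.
essentials = [-1.2, -1.0, -0.8]
nonessentials = [-0.1, 0.0, 0.1]
score = nnmd(essentials, nonessentials)
```

A more negative score means the essential controls are better separated from the nonessential controls, relative to the noise in the nonessentials.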

Omics

We are currently working on improving our mutation calling pipeline and creating a more robust annotation and filtering strategy with updated documentation. We are planning to roll out these updates in the next release.

We now provide OmicsSomaticMutationsMAFProfile.maf, a reformatted version of OmicsSomaticMutationsProfile.csv that can be imported directly into downstream analysis packages such as maftools.

WES capture kit information (Agilent/ICE) can now be found in the OmicsProfiles table.
