I am comparing multiple releases and was curious: why are the releases scheduled according to that specific timetable? Is there a particular reason for it? Also, how do you decide which cell lines to include in the next release? Finally, would you say that for a certain gene, the results for a specific cell line are becoming more precise and improving with each release?
Our release schedule has changed over time. Initially, we released data every quarter.
There’s quite a lot of work behind the scenes to coordinate the people and teams whose data ultimately becomes part of a DepMap release. To reduce the overhead of all that coordination, we decreased the frequency to twice a year. (Also, we now have a substantial number of lines screened, so each update increases the number of models by only a small fraction. This led us to conclude that the cost of quarterly releases wasn’t worth the benefit, so we started releasing only in Q2 and Q4.)
We are now changing the schedule once again and will release data in Q1 and Q3 instead, because we found that the Q4 release in particular is at risk of delays from holiday closures and people’s schedules at the end of the year.
As for which lines we release: we release everything that we can, which is to say all unembargoed data. Data is usually embargoed because of either the funding mechanism that generated it or agreements with other researchers who helped produce it. The details of our embargo policies are too complicated to describe here, but they are actively managed by our excellent operations team, and we strive to release all data that we are legally permitted to release.
As for your last question: “would you say that for a certain gene, the results for a specific cell line are becoming more precise and improving with each release?”
I can only offer my personal opinion, which is: yes. With every release we reprocess all CRISPR screens with Chronos and review the results. As additional artifacts are identified and understood, Chronos has evolved to account for and correct them. While I’m not directly involved in that process, what I’ve seen from that team gives me the confidence to say that, on average, the accuracy of our gene effect estimates has improved over time.
That being said, I will add a disclaimer: when deciding on changes, we generally evaluate the Chronos results across all genes. Any change affects the results for all genes, and while we make changes that are beneficial for most genes, it’s not possible to guarantee that a change improves the accuracy for every gene.
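If you want to check how much a particular gene moved between two releases, here is a minimal sketch of one way to do it. It is not how our team evaluates Chronos. The function name `per_gene_change` and the toy values are made up for illustration, and it assumes you have already loaded each release's gene effect table into a nested dict of model ID -> gene -> score. Keep in mind that a large change only tells you the estimate moved, not which release is closer to the truth.

```python
from statistics import mean


def per_gene_change(old, new):
    """Mean absolute change in gene effect per gene, over models in both releases.

    `old` and `new` are hypothetical in-memory tables:
    dict of model ID -> dict of gene -> gene effect score.
    """
    changes = {}
    # Only compare models that appear in both releases.
    for model in old.keys() & new.keys():
        # Only compare genes scored for this model in both releases.
        for gene in old[model].keys() & new[model].keys():
            delta = abs(new[model][gene] - old[model][gene])
            changes.setdefault(gene, []).append(delta)
    return {gene: mean(deltas) for gene, deltas in changes.items()}


# Toy values for illustration only, not real DepMap data.
old_release = {
    "M1": {"GENE_A": -1.0, "GENE_B": 0.1},
    "M2": {"GENE_A": -0.5, "GENE_B": 0.2},
    "M3": {"GENE_A": 0.0, "GENE_B": 0.3},
}
new_release = {
    "M1": {"GENE_A": -1.1, "GENE_B": 0.1, "GENE_C": 0.0},
    "M2": {"GENE_A": -0.6, "GENE_B": 0.2, "GENE_C": 0.0},
    "M3": {"GENE_A": 0.0, "GENE_B": 0.3, "GENE_C": 0.0},
    "M4": {"GENE_A": -0.2, "GENE_B": 0.4, "GENE_C": 0.0},  # new model, skipped
}

summary = per_gene_change(old_release, new_release)
for gene in sorted(summary):
    print(gene, round(summary[gene], 3))
```

Genes with a small mean change were largely unaffected by the reprocessing, while genes with a large one are worth a closer look before you draw conclusions that depend on a single release.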