Thanks for these very insightful comments.
We agree that highlighting non-independence between models more clearly would improve certain statistical analyses of the data. Indeed, this was part of our motivation for adding ‘patient_IDs’ for each cell line in the 22Q2 sample_info file, which delineate isogenic relationships between models. We also added a column “parent_depmap_id” to indicate parental/derivative model relationships. We plan to highlight these relationships more clearly on the portal going forward.
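As a rough sketch of how these two columns could be used to flag non-independent models before an analysis: the snippet below groups models by patient and lists parent/derivative pairs. The exact column spellings (`DepMap_ID`, `patient_id`, `parent_depmap_id`) and the toy rows are assumptions for illustration only, so please check them against the header of the actual sample_info file.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for the sample_info file. The IDs are made up and the
# column spellings are assumptions; check the real file's header.
TOY_SAMPLE_INFO = """DepMap_ID,patient_id,parent_depmap_id
ACH-A,PT-1,
ACH-B,PT-1,ACH-A
ACH-C,PT-2,
ACH-D,PT-3,
"""


def group_by_patient(rows, id_col="DepMap_ID", patient_col="patient_id"):
    """Map each patient identifier to the list of models sharing it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[patient_col]].append(row[id_col])
    return dict(groups)


def derivative_pairs(rows, id_col="DepMap_ID", parent_col="parent_depmap_id"):
    """Return (parent, derivative) pairs for rows with a non-empty parent."""
    return [(row[parent_col], row[id_col]) for row in rows if row[parent_col]]


rows = list(csv.DictReader(io.StringIO(TOY_SAMPLE_INFO)))

# Patients with more than one model mark sets of non-independent cell lines.
patient_groups = group_by_patient(rows)
related = {p: ids for p, ids in patient_groups.items() if len(ids) > 1}

pairs = derivative_pairs(rows)
```

One possible use is to keep a single representative per patient group, or to treat the patient identifier as a grouping factor, when a test assumes independent samples.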
As for unfiltered genetic variants, you can access these, along with the raw data, for CCLE cell lines in our Terra workspace (see “Where can I find the raw genomics sequencing data?”). For newer cell lines, we are working to share the data in an access-controlled manner. The table of germline similarity is an interesting idea as well, which we will discuss.
Thanks!