The sample_info.csv download shows 18 cell lines annotated as MSI in the lineage_molecular_subtype field. However, the data explorer provides an option to color cell lines by Microsatellite Instability and offers two sources for these data, CCLE (NGS) and GDSC (PCR), both of which include ~60 cell lines. Why the discrepancy? Will the lineage_molecular_subtype field eventually show 60 cell lines with an MSI annotation?
Thanks for bringing this to our attention. The sample_info file has not been updated for the new datasets. We will provide this information as a downloadable file in the near future, but for now please rely on the data from the data explorer.
Thanks for all your work on this project! I was wondering if updated MSI annotations are present in the "DepMap Public 22Q4 Primary Files"? The "Model.csv" file only has 18 cell lines classified as "MSI".
It has been two years since the original question, and the Model.csv file has not been updated. Is there another file we can download with the complete MSI annotation of cell lines?
MSIscore is now a feature in OmicsSignatures.csv that gets updated for new cell lines every release. MSIscore > 20 is considered MSI, otherwise MSS. Since it is inferred from omics data, we do not include it in model.csv as part of model metadata.
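For anyone who wants an MSI/MSS label per cell line, here is a minimal sketch of applying that cutoff with pandas. It rests on assumptions that should be checked against the actual file: that OmicsSignatures.csv has one row per model with the model ID in the first column, and that the score column is named exactly `MSIscore`. The `classify_msi` helper and the toy model IDs are hypothetical and only serve as illustration. Leaving missing scores unlabeled rather than calling them MSS is a choice made here, not something stated in the reply.

```python
import pandas as pd

# Cutoff from the reply above: MSIscore > 20 is MSI, otherwise MSS.
MSI_THRESHOLD = 20


def classify_msi(signatures: pd.DataFrame, score_col: str = "MSIscore") -> pd.Series:
    """Return an 'MSI'/'MSS' label per row; rows without a score stay missing.

    `score_col` is an assumed column name; adjust it to match the file.
    """
    scores = pd.to_numeric(signatures[score_col], errors="coerce")
    status = pd.Series("MSS", index=signatures.index, dtype=object, name="msi_status")
    status[scores > MSI_THRESHOLD] = "MSI"
    # Assumption: a missing score means "unknown", not MSS.
    status[scores.isna()] = pd.NA
    return status


# Toy data with made-up model IDs and scores, standing in for the real file.
toy = pd.DataFrame(
    {"MSIscore": [35.2, 20.0, 4.1, None]},
    index=["model_A", "model_B", "model_C", "model_D"],
)
result = classify_msi(toy)
print(result)

# With the real download (path and index column are assumptions):
# signatures = pd.read_csv("OmicsSignatures.csv", index_col=0)
# msi_models = classify_msi(signatures).loc[lambda s: s == "MSI"].index
```

Note that a score of exactly 20 is labeled MSS, because the stated rule is a strict "greater than".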