Question about segtab annotations in ABSOLUTE CN file

Hi there,
I am looking at ABSOLUTE CN calls from this file: DepMap Data Downloads

I am a little confused about how to interpret the columns labeled “Cancer_cell_frac_a1” and “Cancer_cell_frac_a2” and how they relate to the “Modal_HSCN_1” and “Modal_HSCN_2” columns, respectively. “Cancer_cell_frac_a1” is described as “Max. posterior estimate of fraction of sample’s cells containing somatic copy number change of allele 1”, while “Modal_HSCN_1” is described as “minor (alt) allele ABSOLUTE copy number call.” So does “Cancer_cell_frac_a1” describe the fraction of a sample’s cells that have a copy number equal to “Modal_HSCN_1”, or the fraction whose copy number differs from “Modal_HSCN_1”?

I was also wondering why the range of values for the “Subclonal_HSCN_a1” and “Subclonal_HSCN_a2” columns is only 0-1 while the range of values goes from 0-7 for the “Modal_HSCN_1” and “Modal_HSCN_2” columns?

My goal is to identify genomic regions that have an ABSOLUTE copy number of 1 for both the major and minor alleles (i.e. diploid regions). Previously, I identified those regions as rows that have “Modal_HSCN_1” == 1 AND “Modal_HSCN_2” == 1, but I am wondering if I need to take some of the other columns into consideration.
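For concreteness, the filter described above can be sketched in Python roughly as follows. Only the “Modal_HSCN_*” and “Subclonal_HSCN_*” column names come from the file; the Chromosome/Start/End columns and every value in the sample are made up for illustration. The optional `require_clonal` switch rests on an assumption that is not confirmed here, namely that a value of 1 in the “Subclonal_HSCN_*” columns flags a subclonal segment.

```python
import csv
import io

# Hypothetical three-row excerpt. Chromosome/Start/End and all values are
# invented for illustration; only the HSCN column names come from the file.
SAMPLE = """\
Chromosome,Start,End,Modal_HSCN_1,Modal_HSCN_2,Subclonal_HSCN_a1,Subclonal_HSCN_a2
1,1000,5000,1,1,0,0
1,5001,9000,0,2,0,0
2,1000,7000,1,1,1,0
"""


def diploid_segments(rows, require_clonal=False):
    """Return rows where both alleles have a modal copy number of 1."""
    kept = []
    for row in rows:
        # float() tolerates values written as "1" or "1.0".
        is_one_one = (
            float(row["Modal_HSCN_1"]) == 1
            and float(row["Modal_HSCN_2"]) == 1
        )
        # Assumption: Subclonal_HSCN_* == 1 marks a subclonal segment.
        is_clonal = (
            float(row["Subclonal_HSCN_a1"]) == 0
            and float(row["Subclonal_HSCN_a2"]) == 0
        )
        if is_one_one and (is_clonal or not require_clonal):
            kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
current_filter = diploid_segments(rows)
stricter_filter = diploid_segments(rows, require_clonal=True)
print(len(current_filter), len(stricter_filter))
```

With `require_clonal=False` this reproduces the filter I am using now; the stricter variant only makes sense if the assumption about the subclonal columns holds.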

I appreciate your help and all the work done!

Best,
Elaine

Hi Elaine,

I’ll preface this by saying that I’m not an expert on the ABSOLUTE algorithm or this dataset, but from my understanding, Cancer_cell_frac_a1 should describe the fraction of cells that carry the copy number given by Modal_HSCN_1. It is also unclear to me why Subclonal_HSCN_a1 and Subclonal_HSCN_a2 only range from 0 to 1… I’d recommend referring to the ABSOLUTE paper if you haven’t already.

I’d also like to point out that we are going to release absolute copy number data from PureCN in our upcoming 24Q2 release. It will contain more cell lines than the legacy CCLE ABSOLUTE data, as it is going to be part of the routine biannual releases from now on.

Thanks,
Simone

Hi Simone,
Thank you for your help! I previously looked at the ABSOLUTE paper, but it still wasn’t clear to me what the columns meant since it appears their software page is no longer online.

The PureCN data will be super helpful for me! Is there any way to get access to the PureCN data prior to the official 24Q2 release? Unfortunately, I am on a tight deadline and need CN data for a pre-processing step, so I need to use whatever I can get access to at the moment. However, I understand if this is not possible and appreciate your response!

Best,
Elaine

Hi Elaine,

This is the repo for ABSOLUTE in case that’s helpful.

Regarding PureCN, unfortunately we cannot make the data public prior to the 24Q2 release. The release should be available in the coming month or so.

Best,
Simone