How to calculate absolute copy number from relative copy number?

Hello!

I’ve downloaded a dataset with gene relative copy numbers. As suggested in (https://forum.depmap.org/t/what-is-relative-copy-number-copy-number-ratio/), I calculated absolute copy numbers from relative copy numbers using the formula:

Absolute CN = Line ploidy * relative CN = Line ploidy * ((2 ** relative CN from the dataset) - 1)

I then rounded the results and compared some of them to the data from Absolute Copy Number section in the data explorer. While most of my results (est. 90%) agreed with the data explorer, other results were off. For example:

SKBR3 line, ploidy = 4.04 (Ploidy data from CCLE_ABSOLUTE_combined_20181227):
Gene EFCAB5, RCN = 1.626; ACN = 4.04 * ((2 ** 1.626) - 1) = 4.04 * 2.087 = 8.430 (Data Explorer, ACN = 7)
Gene NLRP3, RCN = 0.872; ACN = 4.04 * ((2 ** 0.872) - 1) = 4.04 * 0.830 = 3.356 (Data Explorer, ACN = 4)

SKMEL28 line, ploidy = 4.02
Gene C1orf194, RCN = 1.484; ACN = 4.02 * ((2 ** 1.484) - 1) = 4.02 * 1.797 = 7.225 (Data Explorer, ACN = 6)
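
For anyone who wants to reproduce the arithmetic above, here is a minimal Python sketch. It assumes, as the formula in this post does, that the dataset stores relative CN as log2(ratio + 1). The function name `absolute_cn` and the `examples` list are only illustrative and are not part of any DepMap tool.

```python
def absolute_cn(relative_cn: float, ploidy: float) -> float:
    """Estimate absolute CN from a relative CN value and the line ploidy.

    Assumption: relative_cn is log2(ratio + 1), so the ratio is 2**x - 1.
    """
    ratio = 2 ** relative_cn - 1  # undo the assumed log2(ratio + 1) transform
    return ploidy * ratio


# (cell line, gene, relative CN, ploidy, ACN shown in the Data Explorer)
examples = [
    ("SKBR3", "EFCAB5", 1.626, 4.04, 7),
    ("SKBR3", "NLRP3", 0.872, 4.04, 4),
    ("SKMEL28", "C1orf194", 1.484, 4.02, 6),
]

for line, gene, rcn, ploidy, explorer_acn in examples:
    estimate = absolute_cn(rcn, ploidy)
    print(
        f"{line} {gene}: estimate = {estimate:.3f}, "
        f"rounded = {round(estimate)}, Data Explorer = {explorer_acn}"
    )
```

This is only the simple ploidy-scaled estimate, so it is not expected to match the Data Explorer values for every gene.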

Am I using an inaccurate formula? How is the ACN calculated? I haven’t found a relevant dataset or discussion about it.

Thanks in advance!

Hello Georgiy,

I don’t think you are making any mistake. The values you are finding are pretty close. ABSOLUTE is an algorithm made by CGA:

https://software.broadinstitute.org/cancer/cga/absolute
ABSOLUTE performs a more complex computation than this simple estimate to arrive at its values.

Additionally, all ABSOLUTE calls were made on the CCLE2 data. It is likely that some lines have since received more up-to-date sequencing (going from WES → WGS or Affymetrix → WES/WGS), which would also create differences between the ABSOLUTE values and the ones you are getting from the latest datasets.

Overall, CN values are not very precise measurements, and finding this kind of variance in CN measurements of cell line data is completely normal.

I hope it helps!

Best,

Right, thank you!

Would it be possible by any chance to publish a dataset with absolute copy numbers, considering DepMap already has the data? It would be very much appreciated. In the meantime, I’ll check the article and the algorithm.

Sincerely,

The file you are using (CCLE_ABSOLUTE_combined_20181227) is the absolute dataset.

We do not create new ABSOLUTE data because there is a manual step involved, choosing between the possible profiles, which is quite arbitrary. Also, ABSOLUTE is not really made for cell lines (where purity is expected to always be 100%).

We are working on a simple replacement to infer karyotype based on relative CN, but there is no deadline set yet.

Best,

Thanks for the answer and good luck with the work!

Sincerely,