What is relative copy number/copy number ratio?

Users have asked us: What is relative copy number/copy number ratio?

Since we do not have matched normals, the output is a “copy ratio” or relative copy number. It is relative to the rest of the genome for that cell line. For example, if the cell line is tetraploid, we would not be able to see that from the relative copy number. These values are reported as log2(relative CN + 1) in the portal.
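
As a rough illustration of the two points above, here is a minimal Python sketch. The function names are made up for this example, and the relation relative CN = absolute CN / ploidy is a simplifying assumption used only to show why a genome-wide doubling is invisible; it is not the actual pipeline.

```python
import math


def portal_value(relative_cn: float) -> float:
    """Apply the log2(relative CN + 1) transform described in the post."""
    return math.log2(relative_cn + 1)


def idealized_relative_cn(absolute_cn: float, ploidy: float) -> float:
    """Simplifying assumption for illustration: copy number relative to ploidy."""
    return absolute_cn / ploidy


# A locus at the genome-wide average in a diploid and in a tetraploid line.
diploid = idealized_relative_cn(absolute_cn=2, ploidy=2.0)
tetraploid = idealized_relative_cn(absolute_cn=4, ploidy=4.0)

# Both give a relative CN of 1.0, so both are reported as log2(1 + 1) = 1.0.
print(portal_value(diploid), portal_value(tetraploid))
```

Under this assumption, a value of 1.0 in the portal means "at the average level for this cell line", whatever the ploidy is.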

Hi,

Trying to understand in detail how to obtain relative copy number from absolute copy number values using the log2(relative CN + 1) formula:

Example:
A375 cell line, ploidy = 2.76
ENO4 gene, absolute CN = 3

Why is the relative copy number of the ENO4 gene in A375 equal to 1.04?

All values are taken from DepMap and CCLE portals


Thank you

The ‘relative CN’ is the Segment_Mean from the segment file that covers the gene (padded and averaged across the coordinates). For the cell line you’re pointing to (ACH-000219) ENO4 is covered by the following segment in the segment file:

DepMap_ID	Chromosome	Start	End	Num_Probes	Segment_Mean	Source
ACH-000219	10	77116708	133797422	3634	1.060220	Sanger WES

The copy number value is log2(1.060220+1) = 1.04

To get the Segment_Mean we calculate 2^MEAN_LOG2_COPY_RATIO, where MEAN_LOG2_COPY_RATIO is the output from GATK4. MEAN_LOG2_COPY_RATIO is not exactly the log-transformed ‘relative CN to ploidy’ but rather the output of the GATK4 tool, which includes panel-of-normals (PoN) correction through SVD and median adjustment plus a shift to 1.

The absolute copy numbers are from the ABSOLUTE package. The inconsistency between the relative and absolute numbers can be due to incorrect inferences made by either or both algorithms.
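
The arithmetic in this reply can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch uses the single quoted segment directly; it ignores the padding and averaging across gene coordinates mentioned above.

```python
import math


def segment_mean_from_log2_ratio(mean_log2_copy_ratio: float) -> float:
    """Segment_Mean = 2^MEAN_LOG2_COPY_RATIO, as described in the reply."""
    return 2.0 ** mean_log2_copy_ratio


def portal_value_from_segment_mean(segment_mean: float) -> float:
    """Reported gene-level value = log2(Segment_Mean + 1)."""
    return math.log2(segment_mean + 1.0)


# The segment covering ENO4 in ACH-000219, copied from the row quoted above.
segment = {
    "DepMap_ID": "ACH-000219",
    "Chromosome": "10",
    "Start": 77116708,
    "End": 133797422,
    "Num_Probes": 3634,
    "Segment_Mean": 1.060220,
    "Source": "Sanger WES",
}

value = portal_value_from_segment_mean(segment["Segment_Mean"])
print(round(value, 2))  # 1.04
```

Note that the relative copy number itself is the Segment_Mean (about 1.06); 1.04 is that number after the log2(x + 1) transform.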

Hello, thank you for the explanations. To be clear - what is the equation relating relative copy number and absolute copy number? Is it

absolute copy number = ploidy * (relative copy number)

In which case, in the example above, the discrepancy is between the reported relative copy number of 1.04 and the value calculated from the absolute copy number and ploidy:

3.0 / 2.76 = 1.09

Am I understanding that correctly?

How would you recommend getting the absolute copy number for a gene? Would it be best to get the ploidy of the cell lines and use those in combination with the relative copy number? Or is it better to go with the absolute copy number?

Related, where would you recommend getting the ploidy for cell lines?

Thank you!

Can anyone help with this question?

Hi, your interpretation seems correct to me. The ABSOLUTE results should be a better measure of ploidy, rather than using the equation that you have used here. So if you need to know the absolute copy numbers I would recommend using the results from ABSOLUTE. However, at the moment we do not update these values on a quarterly basis, so the list only contains a subset of our lines.
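
For anyone who wants to redo the comparison, here is a small Python sketch. It assumes the portal value is log2(relative CN + 1), as stated earlier in the thread, so the value is back-transformed before it is compared with absolute CN / ploidy. The function names are made up for this example, and the ploidy relation is the approximate one discussed above, not something the pipeline guarantees.

```python
def relative_cn_from_portal_value(portal_value: float) -> float:
    """Invert log2(relative CN + 1) to recover the relative copy number."""
    return 2.0 ** portal_value - 1.0


def expected_relative_cn(absolute_cn: float, ploidy: float) -> float:
    """Approximate relation from the question: absolute CN = ploidy * relative CN."""
    return absolute_cn / ploidy


# Numbers quoted in the thread for ENO4 in A375.
observed = relative_cn_from_portal_value(1.04)
expected = expected_relative_cn(absolute_cn=3.0, ploidy=2.76)

print(round(observed, 2))  # 1.06
print(round(expected, 2))  # 1.09
```

On the back-transformed scale the two numbers are 1.06 and 1.09, so the gap is slightly smaller than the 1.04 versus 1.09 comparison suggests, but they still do not match exactly, which is consistent with the two methods making independent inferences.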

@jnoorbak, what does the Segment_Mean value represent when the Source is SNP array?

The Segment_Mean for SNP array is based on a different pipeline. These are our legacy data which we have not reprocessed. The method can be found in SI2 (page 3) of the original CCLE paper.

Hi, jnoorbak,

You stated that the CN profile in DepMap is log2(relative CN + 1). However, in the source code of your copy number processing pipeline (key command: genecn = toGeneMatrix(gapmergedsegs, gene_mapping)), the toGeneMatrix function does not apply a log transformation (the source code of this function can be found under toGeneMatrix).

Could you please tell me the reason for this conflict?

Hi, the transformation happens here.

We’re in the process of refactoring and cleaning up our code, so hopefully it will be more accessible to users in the coming months.

Thank you for your help!

Hi. How should the Segment_Mean column in the segment file be modified before running ABSOLUTE? Or can the segment file downloaded from DepMap be used to run ABSOLUTE directly?

Hi. Our ABSOLUTE calls are from an older pipeline that we no longer use. I am not quite sure what input format ABSOLUTE accepts, but my understanding is that it requires allele-specific copy numbers (e.g. produced by GATK ACNV). Our segment files are from GATK CNV and are transformed as 2**x, where x is the log2 copy ratio. We also do some padding at the ends of chromosomes and interpolate the regions between copy number segments. We haven’t tested this, so I’m not sure whether ABSOLUTE would be able to handle these files.
