Users have asked us: What is relative copy number/copy number ratio?
Since we do not have matched normals, the output is a “copy ratio” or relative copy number. It is relative to the rest of the genome for that cell line. E.g. if the cell line is tetraploid we would not be able to see it from the relative copy number. These values are reported as log2(relative CN + 1) in the portal.
Trying to understand in detail how to obtain relative copy number from absolute copy number values using the log2(relative CN + 1) formula:
A375 cell line, ploidy = 2.76
ENO4 gene, absolute CN = 3
Why is the relative copy number of the ENO4 gene in A375 equal to 1.04?
All values are taken from DepMap and CCLE portals…
The ‘relative CN’ is the Segment_Mean from the segment file that covers the gene (padded and averaged across the coordinates). For the cell line you’re pointing to (ACH-000219), ENO4 is covered by the following segment in the segment file:
DepMap_ID   Chromosome  Start     End        Num_Probes  Segment_Mean  Source
ACH-000219  10          77116708  133797422  3634        1.060220      Sanger WES
The copy number value is log2(1.060220+1) = 1.04
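For anyone who wants to check the arithmetic, here is a minimal sketch of the transform described above and its inverse. The function names are made up for illustration and are not part of the DepMap pipeline; only the Segment_Mean value comes from the segment row quoted in this thread.

```python
import math


def to_portal_value(segment_mean: float) -> float:
    """Apply log2(relative CN + 1), the transform described in this thread."""
    return math.log2(segment_mean + 1)


def to_relative_cn(portal_value: float) -> float:
    """Undo the transform to recover the linear relative copy number."""
    return 2 ** portal_value - 1


# Segment_Mean for ENO4 in ACH-000219, taken from the segment row above.
segment_mean = 1.060220

value = to_portal_value(segment_mean)
print(round(value, 2))                  # 1.04
print(round(to_relative_cn(value), 5))  # 1.06022
```

Note that the portal value (1.04) and the linear relative copy number (1.06) are different quantities, so the transform has to be undone before comparing against a ratio such as absolute CN / ploidy.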
To get the Segment_Mean, we calculate 2^MEAN_LOG2_COPY_RATIO, where MEAN_LOG2_COPY_RATIO is the output from GATK4. MEAN_LOG2_COPY_RATIO is not exactly the log-transformed ‘relative CN to ploidy’; it is the output of the GATK4 tool, which includes panel-of-normals (PoN) correction through SVD, median adjustment, and a shift to 1.
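As a rough sketch of that step only: the function name below is hypothetical, and the MEAN_LOG2_COPY_RATIO value is back-calculated from the Segment_Mean quoted in this thread rather than read from real GATK4 output.

```python
import math


def segment_mean_from_log2_ratio(mean_log2_copy_ratio: float) -> float:
    """Segment_Mean = 2 ** MEAN_LOG2_COPY_RATIO, as described above."""
    return 2 ** mean_log2_copy_ratio


# Assumption: back-calculated for illustration, not an actual GATK4 value.
mean_log2 = math.log2(1.060220)
print(round(mean_log2, 3))  # 0.084

# A log2 copy ratio of 0 corresponds to a Segment_Mean of 1.
print(segment_mean_from_log2_ratio(0.0))  # 1.0
```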
The absolute copy numbers are from the ABSOLUTE package. The inconsistency between the relative and absolute numbers can be due to incorrect inferences made by either or both algorithms.
Hello, thank you for the explanations. To be clear - what is the equation relating relative copy number and absolute copy number? Is it
absolute copy number = ploidy * (relative copy number)
In which case, in the example above, the discrepancy is between the reported relative copy number of 1.04 and the value calculated from the absolute copy number and ploidy:
3.0 / 2.76 = 1.09
Am I understanding that correctly?
How would you recommend getting the absolute copy number for a gene? Would it be best to get the ploidy of the cell lines and use those in combination with the relative copy number? Or is it better to go with the absolute copy number?
Related, where would you recommend getting the ploidy for cell lines?
Can anyone help with this question?
Hi, your interpretation seems correct to me. The ABSOLUTE results should be a better measure of ploidy than the equation you have used here. So if you need to know the absolute copy numbers, I would recommend using the results from ABSOLUTE. However, at the moment we do not update these values on a quarterly basis, so the list only contains a subset of our lines.
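To make the comparison concrete, here is a small sketch using the A375 / ENO4 numbers from this thread. It assumes the relation proposed above (absolute copy number = ploidy * relative copy number) and that the reported 1.04 is log2(relative CN + 1). Since the relative and absolute values come from different algorithms, this is only an approximation, and the variable names are illustrative.

```python
ploidy = 2.76          # A375 ploidy quoted in the thread
absolute_cn = 3        # ENO4 absolute CN from ABSOLUTE
reported_value = 1.04  # portal value, i.e. log2(relative CN + 1)

# Relative CN expected from the absolute call and ploidy.
expected_relative = absolute_cn / ploidy

# Relative CN recovered from the portal value by undoing the log transform.
relative_from_portal = 2 ** reported_value - 1

# Absolute CN implied by the portal value under the assumed relation.
implied_absolute = ploidy * relative_from_portal

print(round(expected_relative, 2))     # 1.09
print(round(relative_from_portal, 2))  # 1.06
print(round(implied_absolute, 2))      # 2.92
```

On the linear scale the gap is between roughly 1.06 and 1.09, which is smaller than comparing 1.04 with 1.09 directly.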
@jnoorbak , what does the Segment_Mean value represent when the Source is SNP array?
You stated that the CN profile in DepMap is log2(relative CN + 1). However, in the source code of your copy number processing pipeline (key command: genecn = toGeneMatrix(gapmergedsegs, gene_mapping)), the toGeneMatrix function doesn’t apply a log transformation (see the source code of toGeneMatrix).
Could you please tell me the reason for this conflict?
Hi, the transformation happens here.
We’re in the process of refactoring and cleaning up our code, so hopefully it will be more accessible to users in the coming months.
Thank you for your help!