Users have asked us: What is relative copy number/copy number ratio?
Since we do not have matched normals, the output is a “copy ratio”, or relative copy number. It is relative to the rest of the genome for that cell line. For example, if the cell line is tetraploid, we would not be able to see that from the relative copy number. These values are reported as log2(relative CN + 1) in the portal.
Hi,
Trying to understand in detail how to obtain relative copy number from absolute copy number values using the log2(relative CN + 1) formula:
Example:
A375 cell line, ploidy = 2.76
ENO4 gene, absolute CN = 3
Why is the relative copy number of the ENO4 gene in A375 equal to 1.04?
All values are taken from the DepMap and CCLE portals…
Thank you
The “relative CN” is the Segment_Mean from the segment file that covers the gene (padded and averaged across the coordinates). For the cell line you’re pointing to (ACH-000219), ENO4 is covered by the following segment in the segment file:
DepMap_ID Chromosome Start End Num_Probes Segment_Mean Source
ACH-000219 10 77116708 133797422 3634 1.060220 Sanger WES
The copy number value is log2(1.060220+1) = 1.04
To get the Segment_Mean, we calculate 2^MEAN_LOG2_COPY_RATIO, where MEAN_LOG2_COPY_RATIO is the output from GATK4. MEAN_LOG2_COPY_RATIO is not exactly the log-transformed “relative CN to ploidy” but rather the result of the GATK4 tool, which includes PoN correction through SVD and median adjustment plus a shift to 1.
The absolute copy numbers are from the ABSOLUTE package. The inconsistency between the relative and absolute numbers can be due to incorrect inferences made by either or both algorithms.
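As a rough illustration of the two transformations described above, here is a minimal Python sketch. The function names are hypothetical and are not taken from the DepMap pipeline; only the formulas (2^MEAN_LOG2_COPY_RATIO for Segment_Mean, and log2(relative CN + 1) for the portal value) come from this thread.

```python
import math


def segment_mean_from_gatk(mean_log2_copy_ratio):
    """Hypothetical helper: Segment_Mean = 2 ** MEAN_LOG2_COPY_RATIO."""
    return 2.0 ** mean_log2_copy_ratio


def portal_value(relative_cn):
    """Hypothetical helper: value shown in the portal, log2(relative CN + 1)."""
    return math.log2(relative_cn + 1.0)


def relative_cn_from_portal(value):
    """Inverse of portal_value: recover the relative CN (Segment_Mean scale)."""
    return 2.0 ** value - 1.0


# Segment_Mean for ENO4 in ACH-000219, taken from the segment row quoted above.
segment_mean = 1.060220
print(round(portal_value(segment_mean), 2))  # 1.04
```

Going the other way, `relative_cn_from_portal` undoes the portal transformation, which may be useful when comparing portal values with ratios on the Segment_Mean scale.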
Hello, thank you for the explanations. To be clear - what is the equation relating relative copy number and absolute copy number? Is it
absolute copy number = ploidy * (relative copy number)
In which case, in the example above, the discrepancy is between the reported relative copy number of 1.04 and the value calculated from the absolute copy number and ploidy:
3.0 / 2.76 = 1.09
Am I understanding that correctly?
How would you recommend getting the absolute copy number for a gene? Would it be best to get the ploidy of the cell lines and use those in combination with the relative copy number? Or is it better to go with the absolute copy number?
Related, where would you recommend getting the ploidy for cell lines?
Thank you!
Can anyone help with this question?
Hi, your interpretation seems correct to me. The ABSOLUTE results should be a better measure of ploidy, rather than using the equation that you have used here. So if you need to know the absolute copy numbers I would recommend using the results from ABSOLUTE. However, at the moment we do not update these values on a quarterly basis, so the list only contains a subset of our lines.
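For readers who want to reproduce the comparison discussed above, a small Python sketch follows. It assumes the relationship from the question (absolute copy number = ploidy × relative copy number), and the function and variable names are illustrative only. Because the portal reports log2(relative CN + 1), the sketch also puts the ploidy-derived ratio on that scale so both numbers can be compared like for like.

```python
import math


def expected_relative_cn(absolute_cn, ploidy):
    """Relative CN implied by ABSOLUTE, assuming absolute = ploidy * relative."""
    return absolute_cn / ploidy


def to_portal_scale(relative_cn):
    """Portal transformation quoted in this thread: log2(relative CN + 1)."""
    return math.log2(relative_cn + 1.0)


# Values quoted in this thread for ENO4 in A375 (ACH-000219).
ploidy = 2.76
absolute_cn = 3.0
segment_mean = 1.060220  # relative CN from the segment file

implied = expected_relative_cn(absolute_cn, ploidy)

print(round(implied, 2))                        # 1.09 (implied by ABSOLUTE)
print(round(segment_mean, 2))                   # 1.06 (Segment_Mean)
print(round(to_portal_scale(implied), 2))       # 1.06 (implied, portal scale)
print(round(to_portal_scale(segment_mean), 2))  # 1.04 (portal value)
```

On either scale the two estimates differ slightly, which is consistent with the earlier remark that the relative and absolute numbers come from different algorithms.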
@jnoorbak, what does the Segment_Mean value represent when the Source is SNP array?
The Segment_Mean for SNP array is based on a different pipeline. These are our legacy data which we have not reprocessed. The method can be found in SI2 (page 3) of the original CCLE paper.
Hi, jnoorbak,
You stated that the CN profile in DepMap is log2(relative CN + 1). However, in the source code of your copy number processing pipeline (key command: genecn = toGeneMatrix(gapmergedsegs, gene_mapping)), the toGeneMatrix function doesn’t apply any log operation (the source code of this function can be found at toGeneMatrix).
Could you please tell me the reason for this conflict?
Hi, the transformation happens here.
We’re in the process of refactoring and cleaning up our code, so hopefully it will be more accessible for users in the upcoming months.
Thank you for your help!
Hi. How should the Segment_Mean column in the segment file be modified before running ABSOLUTE? Or can the segment file downloaded from DepMap be used to run ABSOLUTE directly?
Hi. Our ABSOLUTE calls are from an older pipeline that we no longer use. I am not quite sure what input format ABSOLUTE accepts, but my understanding is that it requires allele-specific copy numbers (e.g. produced by GATK ACNV). Our segment files are from GATK CNV and are transformed as (2**x). We also do some padding at the ends of chromosomes and interpolate the regions between copy number segments. We haven’t tested this, so I’m not sure whether ABSOLUTE would be able to handle these files.