Defining "deep" deletions and amplifications

kingkongrc · July 5, 2021, 3:26pm

Hello,

I had previously used the CN calls from CCLE and earlier versions of DepMap (i.e. 18Q), where the scores were centered around 0. In those data, it was common to define “deep” deletions as < -1.28, while amplifications were set as > +0.75.

Given that the CN scores are now instead “relative copy number” (What is relative copy number/copy number ratio?), are there any general recommendations for defining a “deep” deletion or amplification, analogous to previous versions of the data?

Thanks, and appreciate this awesome resource!
Ryan

kingkongrc · July 5, 2021, 3:52pm

I should add that I am aware of the “+,-,0” scoring in the segment-level CN data, but I’m finding that this three-level scheme is very generous in defining a deletion or amplification (e.g., it flags regions that would previously have been called het loss or gain, rather than del/amp).

Thanks!

cjlin · February 6, 2022, 4:55am

Hi. Did you find the answer to this question?

kingkongrc · February 6, 2022, 5:20am

Hi,

No official answer from the DepMap team as far as I know, but I saw that Mina et al, 2020 (https://pubmed.ncbi.nlm.nih.gov/32989323/) used the following cutoffs for CN:
Amp: CN > 2^0.75
Del: CN < 2^-1.2
where CN = 1 means diploid.

Last I checked, the CN values from DepMap are expressed as log2(CN+1) = x, so to get the “CN” from the DepMap values, you’ll first need to transform with 2^(x) - 1.
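For anyone who wants to try this, here is a minimal sketch of the back-transformation and the two cutoffs quoted above. It assumes the gene-level value really is x = log2(CN + 1); the function names are made up for illustration and are not part of any DepMap tooling.

```python
AMP_CUTOFF = 2 ** 0.75   # about 1.68 on the relative CN scale (1 = diploid)
DEL_CUTOFF = 2 ** -1.2   # about 0.435 on the same scale


def depmap_to_relative_cn(x: float) -> float:
    """Invert x = log2(CN + 1) to recover relative CN, where 1 means diploid."""
    return 2 ** x - 1


def call_amp_del(x: float) -> str:
    """Apply the Amp/Del cutoffs quoted above to a log2(CN + 1) value."""
    cn = depmap_to_relative_cn(x)
    if cn > AMP_CUTOFF:
        return "amp"
    if cn < DEL_CUTOFF:
        return "del"
    return "neither"
```

A diploid gene has CN = 1, so x = log2(2) = 1 and `call_amp_del(1.0)` returns "neither".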

Hope that helps – again, this is not an official answer from the DepMap team, so take it with a grain of salt.

cjlin · February 6, 2022, 5:30am

Thanks! I asked Dr. Marco Mina about this question before, and here is his reply (hope this helps other users):

For our purposes, we wish to classify each gene-level CNV the same way GISTIC and cBioPortal do, that is, in 5 categories (deep deletion, het loss, diploid, gain and amplified). If you had CNV data on a real linear scale y, you would expect:

a diploid gene to have CNV level of y=2 (2 copies)

a homozygous loss (that is, a deep deletion in cBioPortal jargon) to have y=0 (0 copies)

a gain to have roughly 3 copies (y=3)

an amp to have >= 4 copies (y>=4)

Now, considering that there is some noise in the estimation of the CNV, we set the following thresholds on scale y: [deep del < 0.87 < het loss < 1.32 < diploid < 2.64 < gain < 3.36 < amp]

The transformation CCLE applied to derive their log-scaled CNV values ( x ) was:

x = log2(y/2)

Following this formula, the thresholds we have to apply are: (deep del < -1.2 < het loss < -0.6 < diploid < 0.4 < gain < 0.75 < amp).

Indeed, you can see that log2(0.87/2) = -1.2, log2(1.32/2) = -0.599 … and so on.
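The five-category scheme in this reply can be sketched in a few lines. This assumes input values on the older CCLE scale x = log2(y/2), not the current log2(ratio + 1) scale. The names are illustrative, and putting a value that sits exactly on a cutoff into the higher category is my own choice, since the reply does not say how ties are handled.

```python
import math
from bisect import bisect_right

# Cutoffs on the linear copy scale y, as given in the reply above.
COPY_THRESHOLDS = [0.87, 1.32, 2.64, 3.36]
LABELS = ["deep del", "het loss", "diploid", "gain", "amp"]


def log_thresholds(copies=COPY_THRESHOLDS):
    """Map linear-scale cutoffs y to the log scale x = log2(y / 2)."""
    return [math.log2(y / 2) for y in copies]


def classify(x, cuts=None):
    """Return the category for a value x on the log2(y / 2) scale."""
    cuts = log_thresholds() if cuts is None else cuts
    # bisect_right counts how many cutoffs are <= x, which indexes the label.
    return LABELS[bisect_right(cuts, x)]
```

Rounding `log_thresholds()` reproduces the -1.2, -0.6, 0.4 and 0.75 quoted above, and `classify(0.0)` returns "diploid".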

I wonder if these thresholds can only be applied to diploid cell lines. So, following Marco’s reply, I think the thresholds for defining the categorical CN status should be:
deep del < log2(0.87/ploidy + 1) < het loss < log2(1.32/ploidy + 1) < diploid < log2(2.64/ploidy + 1) < gain < log2(3.36/ploidy + 1) < amp
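To make the proposal concrete, here is a small sketch that computes these ploidy-adjusted cutoffs. It only encodes the formula as written in this post, log2(y/ploidy + 1); it is a suggestion from this thread, not something validated or endorsed by DepMap, and the function name is hypothetical.

```python
import math

# Linear-scale copy cutoffs from Marco's reply, in absolute copies.
COPY_THRESHOLDS = (0.87, 1.32, 2.64, 3.36)


def ploidy_adjusted_cutoffs(ploidy: float):
    """Cutoffs on the log2(ratio + 1) scale, where ratio = copies / ploidy."""
    if ploidy <= 0:
        raise ValueError("ploidy must be positive")
    return [math.log2(y / ploidy + 1) for y in COPY_THRESHOLDS]


# Compare a diploid line with a hypothetical near-tetraploid line.
for ploidy in (2.0, 4.0):
    print(ploidy, [round(c, 3) for c in ploidy_adjusted_cutoffs(ploidy)])
```

Under this formula, a higher ploidy shifts every cutoff downward, because the same absolute copy count corresponds to a smaller ratio.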

jnoorbak · February 7, 2022, 11:25pm

Thanks for commenting on this thread. Unfortunately we don’t have a specific threshold recommendation for this.

One possible option, besides what has been suggested here, is to use the thresholds used by TCGA. The values seem to be log2(ratios), with cutoffs at -0.3 and 0.3 separating loss, neutral and gain.

We have internally used the following as well but I cannot comment on its reliability for a general use case:
“Copy number calls are used to identify focal deletions, deep deletions, and gene amplifications. All these calculations start with segment level relative copy number from the CCLE dataset. A gene is considered to be focally deleted if any of its exons have a copy number of less than 0.2. The weighted copy number of the exons is also calculated and if this is less than 0.4, the gene is considered to have undergone deep deletion. A gene is considered amplified if its weighted copy number is greater than 3.”

Please note that our gene-level copy number is log2(CN ratio + 1), whereas the segment-level copy number is approximately the CN ratio.
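Because the two files are on different scales, the ±0.3 cutoffs on log2(ratio) cannot be applied directly to gene-level values. A rough sketch of the conversion follows, assuming the gene-level value is exactly log2(CN ratio + 1). The function names are invented for this example, and the treatment of values exactly at ±0.3 as neutral is an assumption.

```python
import math


def gene_level_to_log2_ratio(x: float) -> float:
    """Convert a gene-level value x = log2(ratio + 1) to log2(ratio)."""
    ratio = 2 ** x - 1
    if ratio <= 0:
        # A ratio of zero has no finite log; treat it as minus infinity.
        return float("-inf")
    return math.log2(ratio)


def three_level_call(log2_ratio: float, cutoff: float = 0.3) -> str:
    """Gain / loss / neutral call using symmetric cutoffs on log2(ratio)."""
    if log2_ratio > cutoff:
        return "gain"
    if log2_ratio < -cutoff:
        return "loss"
    return "neutral"
```

For segment-level values, which are approximately the ratio itself, `math.log2(ratio)` would be passed to `three_level_call` directly.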

cjlin · February 9, 2022, 5:31am

Hi. Are the weighted copy number and the corresponding thresholds (0.2, 0.4) you mentioned here in log2(Ratio + 1) or log2(Ratio)? What does the weighted CN mean here? That is, can these two values be applied to the CCLE CNV data without any transformation?

jnoorbak · February 9, 2022, 9:06pm

This is still experimental, so please take it with a grain of salt (we may update some of the thresholds and make a post). These are relative copy numbers weighted by genomic ranges. The values come from the segment file, which gives approximate CN ratios (not log-transformed). The segment file from the DepMap portal can be used directly for this.
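One way to read "weighted by genomic ranges" is an overlap-length-weighted mean of segment CN over a gene's exons. The sketch below follows that reading with the 0.2 / 0.4 / 3 thresholds quoted earlier in the thread. The weighting details, the half-open coordinates, and all names here are assumptions for illustration, not the DepMap team's actual implementation.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]        # half-open [start, end) on one chromosome
Segment = Tuple[int, int, float]  # (start, end, relative CN ratio)


def exon_copy_number(exon: Interval, segments: List[Segment]) -> Optional[float]:
    """Overlap-weighted mean CN across segments covering the exon."""
    covered, weighted = 0, 0.0
    for start, end, cn in segments:
        overlap = max(0, min(exon[1], end) - max(exon[0], start))
        covered += overlap
        weighted += overlap * cn
    return weighted / covered if covered else None


def call_gene(exons: List[Interval], segments: List[Segment],
              focal: float = 0.2, deep: float = 0.4, amp: float = 3.0) -> dict:
    """Apply the quoted thresholds to one gene's exons."""
    per_exon = []
    for exon in exons:
        cn = exon_copy_number(exon, segments)
        if cn is not None:  # skip exons with no segment coverage
            per_exon.append((exon[1] - exon[0], cn))
    if not per_exon:
        return {"weighted_cn": None, "focal_deletion": False,
                "deep_deletion": False, "amplified": False}
    total_length = sum(length for length, _ in per_exon)
    weighted_cn = sum(length * cn for length, cn in per_exon) / total_length
    return {
        "weighted_cn": weighted_cn,
        "focal_deletion": any(cn < focal for _, cn in per_exon),
        "deep_deletion": weighted_cn < deep,
        "amplified": weighted_cn > amp,
    }
```

For example, a gene with one exon at CN 1.0 and an equally long exon at CN 0.1 gets a weighted CN of about 0.55, so it is flagged as focally deleted but not deeply deleted.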

JulesL · June 3, 2023, 7:14pm

Hello,

I am looking for a way to classify cell lines into 3 categories (homozygously deleted / heterozygously deleted / WT, for my gene of interest) using the OmicsCNGene file, and I found this conversation instructive. However, I am not sure which thresholds you are suggesting when using the OmicsCNGene file rather than the segment profile file. Have you updated these thresholds since this conversation took place? (If so, and I missed it on the forum, I apologise.)

Thanks a lot in advance for your feedback !

rahma_ijaz · March 22, 2024, 12:59pm

Hi, any update on the thresholds that should be used to classify the CNAs?
