Best practices for interpreting copy number values in CN WGS data 25Q3

nehatalluri · February 5, 2026, 5:29pm

I’m currently working with the 25Q3 WGS copy number data and am noticing some extremely high copy number values. Is there recommended guidance on how these extremes should be interpreted and/or handled?

I also want to do quantitative calling of amplifications and deletions. What are the recommended best practices for defining amp/del calls (for example, suggested thresholds)? One challenge I’m running into is that the copy number distributions vary widely across cell lines, so a single global threshold seems questionable. Is there a recommended way to handle this variability?

simz · February 5, 2026, 7:29pm

Hi,

In general, we don’t have specific threshold recommendations for determining amp/del using relative copy number data since it can be very context-dependent, but perhaps the discussion in this thread could be helpful.

The variability of copy numbers across cell lines is expected. If there are extremes that look concerning, do you mind sharing some specific examples so we can look into it?

Thanks,

Simone

nehatalluri · February 5, 2026, 10:17pm

Here are the distributions of log2 copy-number values for cell lines ACH-000017 and ACH-000022. These include examples of extremely high CN values (I can share more if helpful). Interpreting log2 values like 15 or 9 as copy number (2^15 or 2^9) doesn’t seem plausible to me. I assumed these are artifacts.

simz · February 6, 2026, 3:22pm

If these plots were generated using OmicsCNGeneWGS.csv, the values are in fact on a linear scale, not log2 (the minimum is 0 rather than a negative number).

Simone
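For readers who want to run the same sanity check on their own values, here is a minimal sketch of the "minimum is 0 instead of negative" test described above. The function name `scale_hint` and the sample numbers are hypothetical and not part of any DepMap tooling. The check is only a hint: a transform such as log2(x + 1) is also non-negative, so the absence of negative values does not prove the data are linear.

```python
import math


def scale_hint(values):
    """Give a rough hint about the scale of copy number values.

    Heuristic only: plain log2 ratios go negative for losses, while
    linear relative copy number cannot be below 0.
    """
    # Drop missing entries before inspecting the range.
    finite = [v for v in values if v is not None and not math.isnan(v)]
    if not finite:
        raise ValueError("no finite values to inspect")
    if min(finite) < 0:
        return "log-transformed (negative values present)"
    return "possibly linear (no negative values)"


# Hypothetical values for illustration, not taken from any release file.
sample = [0.0, 0.8, 1.0, 1.3, 15.0]
print(scale_hint(sample))

# Reading 15 as a log2 value would imply this linear value:
print(2.0 ** 15)
```

If the values are linear, 15 is simply a 15-fold relative copy number rather than 2^15.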

nehatalluri · February 9, 2026, 4:01pm

Would you be able to point to the documentation or release notes stating that the OmicsCNGeneWGS.csv file is now in linear scale?

nehatalluri · February 11, 2026, 3:51pm

I found that the Announcing the 24Q2 Release post states, “to be consistent with absolute copy number data, the relative copy number matrix is no longer log2 transformed.” However, confirmation would still be helpful.

simz · February 13, 2026, 9:57pm

You are correct that it is not explicit in the file description. We will clarify it in future releases. Thanks for bringing it to our attention.

Simone
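Since the matrix is linear, anyone who prefers to reason about amp/del calls on a log2 scale has to convert the values first. The sketch below shows one way that could look. Everything in it is an assumption for illustration: the helper names `to_log2` and `call_alteration` are hypothetical, a relative copy number of 1 is assumed to mean "no change", and the thresholds are placeholders supplied by the caller, not DepMap recommendations (the thread above notes that no specific thresholds are recommended).

```python
import math


def to_log2(rel_cn):
    """Convert a linear relative copy number to a log2 ratio.

    A value of 0 (complete loss) has no finite log2, so return -inf.
    """
    if rel_cn <= 0:
        return float("-inf")
    return math.log2(rel_cn)


def call_alteration(rel_cn, amp_log2, del_log2):
    """Label one value as 'amp', 'del', or 'neutral'.

    amp_log2 and del_log2 are caller-chosen thresholds on the log2
    scale; they are placeholders here, not recommended cutoffs.
    """
    log_ratio = to_log2(rel_cn)
    if log_ratio >= amp_log2:
        return "amp"
    if log_ratio <= del_log2:
        return "del"
    return "neutral"


# Hypothetical linear values and placeholder thresholds of +1 / -1.
for value in [0.0, 0.25, 1.0, 4.0]:
    print(value, call_alteration(value, amp_log2=1.0, del_log2=-1.0))
```

Keeping the thresholds as parameters makes it easy to try context-specific cutoffs, for example per lineage or per cell line, instead of hard-coding a single global value.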
