Are ASCAT (allele-specific copy number) results available for CCLE data, and how should I choose a control sample for CNV detection with WES data?

Hi DepMap team,

I have two questions.

I want to obtain allele-specific copy number results (which can typically be calculated with the ASCAT software), but I could not find them. Is such data available?

If such data is not available, I would like to compute the results on my own from WES data. I found that all the raw CCLE data are available in the NCBI SRA database. That’s awesome, and thanks for sharing. However, I cannot find any normal cell line in the CCLE sample list that could be used as a control in CNV calling. Do you have any suggestions?

Thanks in advance.



I searched NCBI and found the SF8657 cell line.

I can’t tell whether it is a normal cell line from a patient with melanoma or a tumor cell line…

Hello Shixiang,

We do not generate allele-specific copy number yet. It is in our plans, but we do not have the bandwidth for it yet.

Since most of the historical lines do not have a matched normal, we run the GATK CNV pipeline in tumor-only mode, with a good-quality panel of normals (PON).

I would recommend against using a pseudo-normal as your matched normal.

Not many tools work effectively without a matched normal, which is why getting allele-specific CNV for cell lines is hard.

Maybe look into the GATK documentation?

I hope it helps.


Thanks for your kind reply. Is it possible to generate a pseudo-normal from CCLE cell lines? I want to call CNVs, so would it be possible to use cell lines that have few or no CNVs according to the existing CCLE CNV data?
Or should I use WES data from a patient’s normal tissue, such as a TCGA normal sample?

As I said, I don’t think using a pseudo-normal would help… However, all these algorithms expressly require a normal, because they are designed for tumor samples with purity issues. The same goes for the somatic aspect, which we don’t really care about for cell lines, since the samples are pure (although they can still contain subclonal cancer populations).

To get allele-specific CNV, then, you would want to compare the abundance of phased mutations in your gene. If you are interested in a specific mutation in a specific cell line, you can look at the allele frequency (AF) of that mutation, compare it to the gene-level copy number, and try to estimate how many copies carry it. But I don’t think using low-CNV samples as normals in a somatic caller would work.
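
To make the AF idea concrete, here is a minimal Python sketch of that back-of-the-envelope estimate. It is not something DepMap runs. It assumes a pure sample (no normal contamination), a clonal mutation, and that you already have an absolute total copy number for the gene (not a relative or log2 ratio). The function name `estimate_mutant_copies` is just made up for illustration.

```python
def estimate_mutant_copies(allele_fraction, total_copy_number):
    """Rough estimate of how many gene copies carry a mutation.

    Assumptions (illustrative only):
      - the sample is pure, so AF is roughly mutant_copies / total_copies
      - the mutation is clonal (present in all cells)
      - total_copy_number is an absolute copy number, not a log2 ratio
    Returns (mutant_copies, other_copies) as integers.
    """
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be between 0 and 1")
    if total_copy_number < 0:
        raise ValueError("total_copy_number must be non-negative")

    # Round the total to the nearest integer number of copies.
    total = round(total_copy_number)
    # Expected mutant copies under the pure-sample assumption, capped at total.
    mutant = min(round(allele_fraction * total_copy_number), total)
    return mutant, total - mutant
```

For example, a mutation with AF 0.66 in a gene with 3 total copies would come out as 2 mutant copies and 1 other copy. Subclonal mutations or noisy AF values will break this simple rounding, so treat the result as a hint, not a call.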

For more accuracy or scalability, you might want to create your own algorithm for this, perhaps using phased mutations from RNA-seq and then using them to compute allele-specific CN on DNA-seq.

Hope it helps.

