Hello dear DepMap developers.
I am using the .seg copy number file for my project. In the .seg (21Q1) file there is a column called “Status”, which gives the copy number state of each segment (-1, 0, 1 for a loss, WT, or a gain, respectively). However, some segments have the value “U” (for example, all the segments in cell line ACH-000003). What is the meaning of “U”?
Notably, when I log2 transform the segment_mean column values for an IGV plot, the segments marked “U” look as normal as those of the other cell lines.
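For reference, the log2 transform mentioned above can be sketched roughly as follows. This is only an illustration: it assumes segment_mean is a relative copy ratio (about 1 for a normal segment), the `floor` parameter is a hypothetical guard against taking the log of zero, and the toy rows are made up rather than taken from the real file.

```python
import math

def to_log2(segment_mean, floor=1e-4):
    """Return log2 of a relative copy ratio, clamped at `floor` to avoid log2(0)."""
    return math.log2(max(float(segment_mean), floor))

# Toy rows for illustration only; these values are NOT from the DepMap file.
rows = [
    {"Status": "U", "segment_mean": "1.0"},
    {"Status": "0", "segment_mean": "1.0"},
    {"Status": "+", "segment_mean": "2.0"},
]

# The transform depends only on segment_mean, so a "U" status does not change it.
log2_values = [to_log2(r["segment_mean"]) for r in rows]
```

Since the transform never looks at the Status column, a “U” segment and an annotated segment with the same segment_mean end up at the same height in the plot.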
If you go to the download page, you might be able to find the README file.
As explained in the README:
- [CN] We are now adding amplification status (-, 0, +) for each segment in the segment copy number. X chromosome amplification status was removed due to its bias in female samples, caused by our purely male PoNs.
X chromosome amplification was removed. We have added a “U” symbol to indicate that.
As a side note, this value could be recomputed by taking note of the difference between male/female samples. We are not doing this for now and only removing the erroneous amplification annotation for all X chromosome segments.
Thank you Jkobect.
The “U” symbol in the “Status” column is present on many chromosomes, not just chromosome X (see chromosome 1 in cell line ACH-000003 in my attached picture). There are many examples of the “U” symbol across the file that are not on the X chromosome.
Did I miss anything?
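In case anyone wants to check this on their own copy of the file, here is a minimal sketch that tallies the “Status” values per chromosome for one cell line. The column names `DepMap_ID` and `Chromosome` and the tab delimiter are assumptions about the file layout (only “Status” is named above), and the toy input is made up for illustration.

```python
import csv
import io
from collections import Counter

def status_by_chromosome(seg_file, cell_line, delimiter="\t"):
    """Count Status values per chromosome for a single cell line.

    Assumes columns named DepMap_ID, Chromosome and Status (hypothetical names).
    """
    counts = {}
    for row in csv.DictReader(seg_file, delimiter=delimiter):
        if row["DepMap_ID"] != cell_line:
            continue
        counts.setdefault(row["Chromosome"], Counter())[row["Status"]] += 1
    return counts

# Toy input for illustration only; not real DepMap data.
toy = io.StringIO(
    "DepMap_ID\tChromosome\tStatus\n"
    "ACH-000003\t1\tU\n"
    "ACH-000003\t1\tU\n"
    "ACH-000003\tX\tU\n"
    "ACH-000001\t1\t0\n"
)
counts = status_by_chromosome(toy, "ACH-000003")
```

With a real file you would pass an open file handle instead of the `io.StringIO` object. Any chromosome other than X that reports a non-zero “U” count is a case the README explanation does not cover.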
You are right.
Not all samples have had their status computed. We aggregate data from other sources, such as Broad SNP, which do not have it. We do not process the Broad SNP data ourselves: it comes directly from the CCLE2 dataset and we just append it to our dataset, so it was not computed with the same pipeline.
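If it helps, one way to spot the samples whose status was never computed is to look for cell lines where every segment is “U”. The sketch below is only an illustration: the column name `DepMap_ID` is an assumption about the file layout, the helper `lines_without_status` is hypothetical, and the toy rows are made up.

```python
from collections import defaultdict

def lines_without_status(rows, unknown="U"):
    """Return cell lines whose segments all carry the `unknown` status."""
    statuses = defaultdict(set)
    for row in rows:
        statuses[row["DepMap_ID"]].add(row["Status"])
    return sorted(cl for cl, seen in statuses.items() if seen == {unknown})

# Toy rows for illustration only; not real DepMap data.
toy_rows = [
    {"DepMap_ID": "ACH-000003", "Chromosome": "1", "Status": "U"},
    {"DepMap_ID": "ACH-000003", "Chromosome": "X", "Status": "U"},
    {"DepMap_ID": "ACH-000001", "Chromosome": "1", "Status": "0"},
    {"DepMap_ID": "ACH-000001", "Chromosome": "X", "Status": "U"},
]
unprocessed = lines_without_status(toy_rows)
```

Here the second cell line has “U” only on chromosome X, which matches the README behaviour, so it is not flagged. The first is “U” everywhere, which is what you would expect for a sample appended from a source without status.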