Issues with copy number data

I have downloaded the copy number data for ovarian cancer cell lines. The table that I get has all of the genes listed as columns and the cell lines with the ACH designations as the rows. The problem is that the actual copy number value is the same for all genes across each row. Different rows for different cell lines have different values for cn, but for every cell line, the value is the same for all 15,000 or so genes.

Did I do something wrong? I notice this same problem when I download copy number data for all of the cell lines.

SPE

I tried reproducing this issue and I think I see what you mean. At first glance, it looks like all genes have the same CN value:

> cn <- read.csv("~/Downloads/Copy_Number_22Q2_Public_subsetted.csv", row.names=1)
> cn <- as.matrix(cn)
> str(cn)
 num [1:73, 1:24352] 1.168 1.001 1.038 0.823 1.159 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:73] "ACH-000520" "ACH-000657" "ACH-001278" "ACH-000713" ...
  ..$ : chr [1:24352] "DDX11L2" "WASH7P" "MIR6859.1" "MIR1302.2" ...

> cn[1,1:20]
  DDX11L2    WASH7P MIR6859.1 MIR1302.2   FAM138A     OR4F5    WASH9P MIR6859.2
 1.168424  1.168424  1.168424  1.168424  1.168424  1.168424  1.168424  1.168424
   OR4F29    OR4F16 LINC01409    FAM87B LINC01128 LINC00115    FAM41C LINC02593
 1.168424  1.168424  1.168424  1.168424  1.168424  1.168424  1.168424  1.168424
   SAMD11     NOC2L    KLHL17   PLEKHN1
 1.168424  1.168424  1.168424  1.168424

However, I don’t think the values are actually all the same.

> length(unique(cn[1,]))
[1] 482

It looks like there are 482 distinct CN values across the 24,352 genes for this cell line, so many genes share the same CN value.

This is actually not surprising: copy number amplifications and deletions typically span segments much larger than individual genes, so many neighboring genes end up with identical values.
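
If you want to check this yourself, here is a rough sketch of how I would look at it. It uses a small made-up matrix (`cn_toy`, with invented cell line names and values) rather than the real file, and it assumes the gene columns are in genomic order, which I have not verified for the downloaded table. It counts the distinct CN values per cell line and then uses `rle()` to find runs of consecutive genes that share a value.

```r
# Toy stand-in for the CN matrix (values are made up): 2 cell lines x 12 genes,
# with genes assumed to be in genomic order along the columns.
cn_toy <- rbind(
  "ACH-TOY-1" = c(rep(1.168424, 5), rep(0.823, 3), rep(1.038, 4)),
  "ACH-TOY-2" = c(rep(1.001, 8), rep(1.159, 4))
)
colnames(cn_toy) <- paste0("gene", 1:12)

# Number of distinct CN values per cell line
# (the same unique() check as above, applied to every row).
n_distinct <- apply(cn_toy, 1, function(x) length(unique(x)))

# Runs of consecutive identical values in the first cell line; each run is a
# stretch of neighboring genes that share one CN value.
runs <- rle(unname(cn_toy[1, ]))
run_lengths <- runs$lengths

print(n_distinct)
print(run_lengths)
```

On the real data you would replace `cn_toy` with the `cn` matrix from above. If most runs cover many genes, that fits the picture of a few large segments rather than a problem with the download.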

Thanks,
Phil