Conflicting values in CN data

I’m using the Copy Number Public 24Q4 data. Shouldn’t these values be the linear (non-log-transformed) versions of the values in Copy Number Public 24Q4 (Log2 transformed)?

I’ve loaded both linear and log2-transformed data in toy data frames called test1 and test2, respectively.

Some values agree, like MYCN in the cell line KELLY:

> test1 %>% filter(cell_line_display_name=="KELLY") %>% pull("MYCN")
[1] 232.3561

> test2 %>% filter(cell_line_display_name=="KELLY") %>% pull("MYCN")
[1] 7.86639

#Checks out
> log2(232.3561)
[1] 7.860194

But others do not agree, like MYCN in CHLA90:

> test1 %>% filter(cell_line_display_name=="CHLA90") %>% pull("MYCN")
[1] 1.413915

> test2 %>% filter(cell_line_display_name=="CHLA90") %>% pull("MYCN")
[1] 1.271375

#Obviously wrong
> log2(1.413)
[1] 0.4987615

I don’t know which value in which gene/cell line is correct.

Hi,

Copy Number Public 24Q4 (Log2 transformed) is in fact log2(CN + 1). If you add the pseudocount to the equation, the numbers should line up.
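As a quick check, here is a minimal sketch that applies log2(CN + 1) to the two MYCN values quoted in the question. The vector names (cn_linear, cn_log2_reported, and so on) are made up for illustration, and the tolerance is an assumption chosen to absorb rounding in the printed values.

```r
# Linear copy-number values quoted in the question (MYCN)
cn_linear <- c(KELLY = 232.3561, CHLA90 = 1.413915)

# Values quoted from the "Log2 transformed" dataset for the same cell lines
cn_log2_reported <- c(KELLY = 7.86639, CHLA90 = 1.271375)

# Transform described above: log2(CN + 1), i.e. with a pseudocount of 1
cn_log2_recomputed <- log2(cn_linear + 1)

# Compare with a small tolerance (assumed) to allow for rounding in the printout
agrees <- abs(cn_log2_recomputed - cn_log2_reported) < 1e-4
print(agrees)

# Inverse transform, to go from the log2 dataset back to linear CN
cn_linear_recovered <- 2^cn_log2_reported - 1
print(cn_linear_recovered)
```

With the pseudocount included, both cell lines agree. KELLY only appeared to match without it because adding 1 to a value above 200 barely changes the log, whereas for CHLA90 the pseudocount is of the same order as the value itself.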

Best,
Simone
