I am planning to use CRISPR knock-in to add GFP to the C-terminus of the SPDEF gene in the T-47D cell line. Before proceeding, I want to check the gene copy number of SPDEF to assess the likelihood of obtaining homozygous clones.
In the DepMap database, I found two different results: the Copy Number (Absolute) database reports 3 copies, while the Copy Number Public 24Q2 (Log2 transformed) indicates 2 copies. I’m confused about which one accurately reflects the gene copy number.
I’m not sure I can say which is more accurate if there is a discrepancy. However, I can offer some context about these two datasets:
The “Copy Number (Absolute)” dataset was published as part of a CCLE paper many years ago, and it uses the ABSOLUTE algorithm to determine the number of copies.
The “Copy Number Public 24Q2 (Log2 transformed)” dataset is one that continues to grow over time. However, that dataset is in units of log2 relative to ploidy. So, assuming this genome is still diploid, you’re right, it’d be 2 copies. What I don’t know is whether this line has experienced any genome doublings.
Looking at the reported ploidy from the CCLE paper: yes, they report a ploidy of around 3 for this line.
So perhaps these two datasets are consistent. It’s just that one is a value relative to ploidy and the other is an absolute count.
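To make the relative-versus-absolute point concrete, here is a minimal sketch of the conversion. It assumes the log2-transformed value is log2(relative CN + 1), which is my understanding of how the portal transforms it; set `pseudocount=0` if the value is a plain log2 ratio. The function name and the example inputs are mine, not something taken from the portal.

```python
def relative_to_absolute_cn(log2_value, ploidy, pseudocount=1.0):
    """Estimate an absolute copy number from a log2-transformed relative value.

    Assumption: log2_value = log2(relative_cn + pseudocount), where
    relative_cn is the gene's copy number divided by the sample ploidy.
    """
    relative_cn = 2.0 ** log2_value - pseudocount
    return relative_cn * ploidy


# A gene sitting exactly at the sample's average copy number (relative CN = 1)
# has a transformed value of log2(1 + 1) = 1.
print(relative_to_absolute_cn(1.0, ploidy=2.0))  # diploid assumption
print(relative_to_absolute_cn(1.0, ploidy=3.0))  # ploidy of about 3
```

The same relative value reads as 2 copies under a diploid assumption and as 3 copies once a ploidy of about 3 is taken into account, which is why the two datasets can agree.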
Looking around in the data explorer, I realize that the latest absolute CN calls (absolute as in non-relative, not the algorithm) are not visible there. However, if I go to the download section I can find “OmicsAbsoluteCNGene.csv”, which has absolute CN calls as computed via PureCN (we’ve switched from the ABSOLUTE tool). In that file, this gene in this cell line is reported as “3”.
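If you want to check the downloaded file yourself, something like the sketch below should work. It assumes the first column holds the model ID (the `ACH-...` identifier) and that gene columns are named either `SYMBOL` or `SYMBOL (EntrezID)`; verify both against the header of the file you actually download. The function name is mine, and the IDs and values in the demo are made up.

```python
import csv
import io


def lookup_gene_cn(csv_file, model_id, gene_symbol):
    """Return the copy-number value for one gene in one model, or None if blank.

    Assumptions: column 0 is the model ID, and gene columns are named
    "SYMBOL" or "SYMBOL (EntrezID)".
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    matches = [
        i for i, name in enumerate(header)
        if i > 0 and name.split(" ")[0] == gene_symbol
    ]
    if not matches:
        raise KeyError(f"gene {gene_symbol!r} not found in header")
    col = matches[0]
    for row in reader:
        if row and row[0] == model_id:
            return float(row[col]) if row[col] != "" else None
    raise KeyError(f"model {model_id!r} not found")


# Made-up data in the assumed layout, only to show the call.
demo = io.StringIO(
    ",GENE_A (111),GENE_B (222)\n"
    "MODEL-1,2.0,3.0\n"
    "MODEL-2,4.0,\n"
)
print(lookup_gene_cn(demo, "MODEL-1", "GENE_B"))
```

For the real file you would open it with `open("OmicsAbsoluteCNGene.csv", newline="")` and pass the model ID for T-47D together with `"SPDEF"`.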
Given all of this, it seems that all the data on the portal consistently points to 3 copies of this gene in that line.