Genes that overlap with segmental duplication regions and/or are flagged by repeatMasker are masked in this matrix

Hello,

I have a question regarding your DepMap Public 24Q2 dataset, specifically about the file OmicsAbsoluteCNGene.csv.

You mentioned the following:

“Genes that overlap with segmental duplication regions and/or are flagged by repeatMasker are masked in this matrix.”

However, upon reviewing some of the masked genes (such as RPL10), I found that they do not appear to meet this criterion. I would like to understand why these genes are absent in OmicsAbsoluteCNGene.csv.

Additionally, I examined some genes that are present (such as RPS9) and noticed they have a significant amount of repeatMasker overlap. I am curious to know why they remain in the dataset given your filtering criteria.

I look forward to your response.

Thank you very much.


Hi,

Currently, PureCN, our absolute copy number caller, does not predict copy number on sex chromosomes for our cell lines. Because RPL10 is located on the X chromosome, it is absent from our absolute copy number datasets for that reason rather than because of masking. We are currently investigating this issue.
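
For anyone who wants to check which genes are present in the matrix, here is a minimal sketch. It assumes the gene columns are named in the form "SYMBOL (EntrezID)", which you should verify against the header of your own copy of OmicsAbsoluteCNGene.csv. The function names and the example header are illustrative only and are not part of the DepMap pipeline.

```python
import csv
import re

# Assumed header format: "SYMBOL (EntrezID)", e.g. "RPS9 (6203)".
# Columns that do not match (such as an index column) are ignored.
HEADER_RE = re.compile(r"^(?P<symbol>\S+) \((?P<entrez>\d+)\)$")


def read_header(path):
    """Return only the first row of a CSV file, without loading the matrix."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle))


def symbols_in_header(columns):
    """Collect the gene symbols found in a list of column names."""
    symbols = set()
    for column in columns:
        match = HEADER_RE.match(column.strip())
        if match:
            symbols.add(match.group("symbol"))
    return symbols


def check_genes(columns, genes):
    """Map each queried gene symbol to True if it has a column, else False."""
    present = symbols_in_header(columns)
    return {gene: gene in present for gene in genes}


# Illustrative header only, not copied from the real file.
example_columns = ["", "RPS9 (6203)", "TP53 (7157)"]
print(check_genes(example_columns, ["RPL10", "RPS9"]))
```

With the real file, you would pass `read_header("OmicsAbsoluteCNGene.csv")` in place of `example_columns`. A gene reported as absent could be missing either because of masking or because it lies on a sex chromosome, so this check alone does not say which.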

As for RPS9, it is possible that the amount of repeatMasker overlap is below our filtering threshold. Please see our documentation on gene masking for more details: docs/source/dna.md on the master branch of the broadinstitute/depmap_omics repository on GitHub.
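
To illustrate how a threshold-based filter can leave a gene unmasked despite visible repeat overlap, here is a rough sketch that computes the fraction of a gene interval covered by repeat intervals. This is not the actual DepMap masking code: the interval convention (half-open coordinates), the function names, and the threshold value used in the example are all assumptions made for illustration. The real criteria are described in the documentation referenced above.

```python
def covered_fraction(gene_start, gene_end, repeats):
    """Fraction of [gene_start, gene_end) covered by the union of repeats.

    repeats is an iterable of (start, end) half-open intervals on the same
    chromosome. Overlapping repeats are merged so no base is counted twice.
    """
    if gene_end <= gene_start:
        raise ValueError("gene_end must be greater than gene_start")

    # Keep only repeats that intersect the gene, clipped to the gene bounds.
    clipped = sorted(
        (max(start, gene_start), min(end, gene_end))
        for start, end in repeats
        if start < gene_end and end > gene_start
    )

    covered = 0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # A new, disjoint block begins: bank the previous one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    return covered / (gene_end - gene_start)


def is_masked(gene_start, gene_end, repeats, threshold):
    """Hypothetical rule: mask when covered fraction reaches the threshold."""
    return covered_fraction(gene_start, gene_end, repeats) >= threshold


# Made-up coordinates, purely to show the arithmetic.
fraction = covered_fraction(100, 200, [(90, 120), (110, 130), (180, 250)])
print(fraction)
```

Under a rule like this, a gene with substantial repeat overlap stays in the matrix whenever its covered fraction is below the chosen threshold, which is one possible explanation for a case like RPS9.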

Simone