Calculation of frequency mutated genes

CXL · April 15, 2022, 4:09am

Hi. I am trying to calculate the frequency of mutated genes from the CCLE_mutations.csv file of the 22Q1 release, and then rank genes by that frequency.

But I am confused by the data: the dataset seems to be organized by mutation site, and I don’t know the right way to combine those rows per gene.

I wonder if someone could explain a little bit about it.
Or are there other databases that could provide such information?

Thank you for your time,
Best regards,
CXL

jkobject · April 15, 2022, 9:17pm

Hello CXL,

The data is aggregated over all available sequencing types for a given sample. Some samples have more sequencing types than others. (Also, note that we are only releasing somatic coding mutations.)

So a simple method would be to just take any available mutation in our dataset, regardless of the sequencing type.

Whatever you do, your analysis will be biased by the fact that different samples have different sets of sequencing types, each covering a specific set of genes more or less well.
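For readers who want to try the simple method, here is a minimal pandas sketch. It collapses the site-level rows so that each (gene, cell line) pair counts once, then ranks genes by the fraction of cell lines with at least one mutation. The column names `Hugo_Symbol` and `DepMap_ID` are assumptions about the file layout, and the function name is made up for illustration; check them against your copy of CCLE_mutations.csv. Note also that the denominator is the number of cell lines that appear in the table, so lines with no reported mutation are not counted.

```python
import pandas as pd


def gene_mutation_frequency(mutations: pd.DataFrame,
                            gene_col: str = "Hugo_Symbol",    # assumed column name
                            sample_col: str = "DepMap_ID"     # assumed column name
                            ) -> pd.DataFrame:
    """Rank genes by the fraction of cell lines carrying >= 1 mutation."""
    # Denominator: cell lines present in the table (an approximation,
    # since lines without any reported mutation do not appear at all).
    n_lines = mutations[sample_col].nunique()

    # Several rows can describe different sites in the same gene of the
    # same cell line; nunique() makes each (gene, line) pair count once.
    counts = mutations.groupby(gene_col)[sample_col].nunique()

    result = pd.DataFrame({
        "n_mutated_lines": counts,
        "frequency": counts / n_lines,
    })
    return result.sort_values("frequency", ascending=False)


# Toy data standing in for the real file (values are invented).
toy = pd.DataFrame({
    "Hugo_Symbol": ["TP53", "TP53", "TP53", "KRAS", "BRAF"],
    "DepMap_ID":   ["ACH-1", "ACH-1", "ACH-2", "ACH-2", "ACH-3"],
})
ranked = gene_mutation_frequency(toy)
print(ranked)
```

With the real file you would replace `toy` with something like `pd.read_csv("CCLE_mutations.csv")`.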

Hope this helps.

Best,

CXL · April 16, 2022, 3:42pm

Thank you so much for your reply. That really helps.

And I also have a question about ALT:REF. If I understand correctly, that ratio is the number of reads with the mutant allele to the number with the normal/reference allele, right?
So when I calculate the mutation frequency of a gene, should I sum the total ALT and REF counts over all entries of a gene, or should I just count the number of mutation entries for a gene? Which way is more reasonable and unbiased?

And when REF=0, does that really mean no REF allele was found at that position in sequencing?

Best regards.

jkobject · April 21, 2022, 1:33pm

Yes, this is right. And REF = 0 really means no reads supporting the reference allele were found at that site.

I think the two metrics represent different things, and which one to use depends on your underlying question and the point you want to make. But from what you are saying, I would be inclined to compute mutation frequency as the sum of all mutations that have a high enough allele frequency.
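A rough sketch of that suggestion, assuming the read counts are stored as "ALT:REF" strings in a per-sequencing-type column. The column name `CGA_WES_AC`, the other column names, the function names, and the 0.25 cutoff are all assumptions for illustration; no specific threshold is given in this thread, so pick one that fits your question.

```python
import pandas as pd


def allele_fraction(alt_ref: pd.Series) -> pd.Series:
    """Turn 'ALT:REF' read-count strings into ALT / (ALT + REF)."""
    # Rows that are missing or do not look like 'int:int' become NaN.
    parts = alt_ref.str.extract(r"^\s*(\d+):(\d+)\s*$")
    alt = pd.to_numeric(parts[0], errors="coerce")
    ref = pd.to_numeric(parts[1], errors="coerce")
    total = alt + ref
    # Avoid dividing by zero when no reads cover the site.
    return alt / total.where(total > 0)


def high_af_gene_frequency(mutations: pd.DataFrame,
                           ac_col: str = "CGA_WES_AC",      # assumed column name
                           min_af: float = 0.25,            # hypothetical cutoff
                           gene_col: str = "Hugo_Symbol",   # assumed column name
                           sample_col: str = "DepMap_ID"    # assumed column name
                           ) -> pd.Series:
    """Fraction of cell lines with >= 1 mutation at or above min_af, per gene."""
    af = allele_fraction(mutations[ac_col])
    kept = mutations[af >= min_af]   # NaN compares False, so those rows drop out
    n_lines = mutations[sample_col].nunique()
    counts = kept.groupby(gene_col)[sample_col].nunique()
    return (counts / n_lines).sort_values(ascending=False)


# Toy data (values are invented). "30:0" is a REF = 0 case.
toy = pd.DataFrame({
    "Hugo_Symbol": ["TP53", "TP53", "KRAS", "BRAF", "KRAS"],
    "DepMap_ID":   ["ACH-1", "ACH-2", "ACH-2", "ACH-3", "ACH-3"],
    "CGA_WES_AC":  ["30:0", "2:48", "10:10", "5:45", None],
})
freq = high_af_gene_frequency(toy)
print(freq)
```

Because the real file has one such column per sequencing type, you would probably need to decide how to combine them (for example, taking the first non-missing column per row) before applying the cutoff.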

Best,

CXL · April 22, 2022, 2:09am

I understand. Your help is much appreciated. Thanks for providing this helpful opinion!
Best regards,
