Dear DepMap team,
Greetings. I hope this problem can be resolved as soon as possible.
Although a similar problem was raised in "Number of genes mutated in cell lines" and I have already asked about it on that page, I am creating a new topic to get an answer.
I noticed a mismatch in the number of genes between the dataset (19536) and the main download page (18784).
When I manually compared the data, I found genes with Entrez_ID = 0.
These genes have since been updated to other HGNC IDs.
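For reference, the manual comparison can be sketched in Python roughly as follows. This is only an illustration: the column names `Hugo_Symbol`, `Entrez_ID` and `Start_position`, the helper `symbols_without_entrez_id`, and the `GENE_A` row are placeholders assumed for the example, not the actual DepMap file layout.

```python
import csv
import io

# Hypothetical excerpt of a mutation table. The column names and the GENE_A
# row are illustrative assumptions, not the real DepMap file.
SAMPLE = """Hugo_Symbol,Entrez_ID,Start_position
DARC,0,159176106
GENE_A,1234,1000
DARC,0,159176106
"""


def symbols_without_entrez_id(handle):
    """Return the sorted, de-duplicated gene symbols whose Entrez_ID is 0."""
    reader = csv.DictReader(handle)
    return sorted(
        {row["Hugo_Symbol"] for row in reader if int(row["Entrez_ID"]) == 0}
    )


flagged = symbols_without_entrez_id(io.StringIO(SAMPLE))
print(flagged)
```

With the real file, the same function could be given an open file handle instead of the in-memory sample.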
So I tried to convert them to the newest version of the IDs, as below:
However, I realized that the locations of the mutations were also updated.
As you can see, the start point listed for DARC is 159176106, but the current location of DARC is 159204875…159206500.
As a result, according to the original data, ACKR1 (= DARC) was not mutated in any cancer cell line.
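The position check behind this conclusion can be sketched as below. It assumes that both sets of coordinates are on the same chromosome and use the same reference assembly, which I have not verified; `within_gene` is a hypothetical helper written for this example.

```python
def within_gene(position, gene_start, gene_end):
    """Return True if position lies inside the inclusive range [gene_start, gene_end]."""
    return gene_start <= position <= gene_end


# Values quoted in this post.
mutation_start = 159176106                         # start point recorded for DARC
current_start, current_end = 159204875, 159206500  # current location of DARC (ACKR1)

inside = within_gene(mutation_start, current_start, current_end)
print(inside)  # False: the recorded start lies before the current gene range
```

If the two coordinate sets turn out to come from different assemblies, the positions would need to be converted to a common assembly before a comparison like this is meaningful.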
In summary: “Is it okay to simply update the gene symbol and ID without considering the changed location?”
Thank you for reading this post.
Sincerely,
Songyeon