I have downloaded the “Methylation_(1kb_upstream_TSS)_subsetted_NAsdropped.csv” file for a particular cell line and tried to open it in Excel. It shows up as continuous text, and when I use Text to Columns (comma-delimited), it still does not look correctly formatted. Is there a way to get this data in two columns only, one with the gene/coordinates and the other with the methylation values? Also, which genome build was used to obtain the coordinates of the CpG sites?
Thanks a lot in advance.
No, today the portal doesn’t have the ability to format the file in that way.
If “Custom Downloads” doesn’t support the transform you want, you might find it easier to produce the format you’d like by starting from the original file, which can be downloaded from the “Download Files” section: DepMap Data Downloads
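If it helps, here is a minimal sketch of that kind of transform in Python, using only the standard library. It assumes the downloaded CSV is "wide": one header row of gene/coordinate names and one data row per cell line, with the cell line identifier in the first column. That layout is an assumption on my part, so please check it against your file; the function name, the output column names, and the sample values are made up for illustration.

```python
import csv
import io


def wide_row_to_two_columns(src, dst, row_index=0):
    """Write one data row of a wide CSV as two columns: feature, methylation.

    Assumes (not verified against the real file) that the first column
    holds the cell line identifier and every other column is one
    gene/coordinate feature.
    """
    reader = csv.reader(src)
    header = next(reader)          # feature names
    rows = list(reader)            # one row per cell line
    row = rows[row_index]          # the cell line to export

    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(["feature", "methylation"])
    # Skip the first column (assumed cell line identifier).
    for name, value in zip(header[1:], row[1:]):
        writer.writerow([name, value])


# Tiny made-up example standing in for the real download.
sample = "cell_line,GENE_A,GENE_B\nCELL_LINE_X,0.12,0.87\n"
out = io.StringIO()
wide_row_to_two_columns(io.StringIO(sample), out)
result = out.getvalue()
print(result)
```

For a real file, you would pass file objects opened with `open(path, newline="")` instead of the `io.StringIO` objects; the resulting two-column CSV should then open cleanly in Excel.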
Thanks a lot for your reply.
I have already downloaded the original file and formatted it according to our needs. It is fine now.
Could you please help clarify one issue for me? Some coordinate entries in the original file (CCLE_RRBS_TSS1Kb_20181022.txt) are associated with more than one CpG site. Is the methylation frequency in these entries the average of the methylation frequencies of those sites? Is there a paper that describes these datasets and how they were produced?
Thank you very much in advance.
I believe these data were published as part of the paper “Next-generation characterization of the Cancer Cell Line Encyclopedia” (Nature), and the generation of the data is described in the supplemental text.