Hi,
I have some questions about how the proteomic data are processed and how the q-values are calculated in the custom analysis tool.
Regarding the data, some gene symbols are associated with multiple UniProt accession numbers. In particular, there is a set of genes (for example AMY1B, AMY1C, ERFL…) that have ~24 UniProt accession numbers. As expected, AMY1B (A4QPH2.3), AMY1C (A4QPH2.3), ERFL (A4QPH2.3) and so on all show the same values. According to UniProt, A4QPH2.3 corresponds to PI4KAP2. In my downloaded data there are 134 entries of the type: <gene symbol (A4QPH2.3)>.
Is this expected behaviour?
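In case it helps to reproduce the count, here is a minimal sketch of how entries sharing one accession can be grouped. It assumes the column labels have the form `<gene symbol> (<accession>)` as in the examples above; the function names and the `GENE_X (P00000.1)` label are hypothetical and only there for illustration.

```python
import re
from collections import defaultdict

# Assumed label format: "<gene symbol> (<UniProt accession>)".
LABEL_RE = re.compile(r"^(?P<symbol>.+?)\s*\((?P<accession>[^()]+)\)$")


def symbols_by_accession(labels):
    """Map each accession to the list of gene symbols labelled with it."""
    groups = defaultdict(list)
    for label in labels:
        match = LABEL_RE.match(label.strip())
        if match is None:
            continue  # skip labels that do not follow the assumed format
        groups[match.group("accession")].append(match.group("symbol"))
    return dict(groups)


def shared_accessions(labels):
    """Keep only accessions that appear under more than one gene symbol."""
    grouped = symbols_by_accession(labels)
    return {acc: syms for acc, syms in grouped.items() if len(syms) > 1}


# Illustrative labels: three from the post plus one hypothetical unique entry.
example_labels = [
    "AMY1B (A4QPH2.3)",
    "AMY1C (A4QPH2.3)",
    "ERFL (A4QPH2.3)",
    "GENE_X (P00000.1)",
]
shared = shared_accessions(example_labels)
for accession, symbols in shared.items():
    print(accession, len(symbols), symbols)
```

Passing the real column labels of the downloaded table instead of `example_labels` would list every accession that is repeated across gene symbols, together with how many symbols carry it.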
Another somewhat related point: when running a Pearson correlation between MetMap500 and the proteomic data, most of the q-values are identical to the p-values. At a glance, it seemed to me that the correlations that had values for all queried cell lines (numCellLines) did have a reasonable q-value, while the rest (although not always) just had q-values exactly the same as the p-value.
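To make the comparison concrete, here is a minimal sketch of a Benjamini-Hochberg adjustment. That the tool uses this procedure is an assumption, not something confirmed here, and the p-values below are made up for illustration. Under this procedure a q-value is never smaller than its p-value and equals it only for the largest p-value in the tested set (or when the set contains a single test), which is why so many identical values look surprising.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Made-up p-values, only to show the behaviour of the adjustment.
example_p = [0.001, 0.01, 0.04, 0.5]
example_q = benjamini_hochberg(example_p)
for p, q in zip(example_p, example_q):
    print(f"p={p:.4f}  q={q:.4f}")

# With a single test in the set, the adjustment leaves the p-value unchanged.
single_q = benjamini_hochberg([0.03])
```

If the q-values in the tool were computed within small groups, for example per value of numCellLines, a group of size one would give q equal to p. This is only a guess at one possible cause.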
Thank you!