Details on methodology of “Two Class Comparison”

Would it be possible to include some details on the two class comparison custom analysis method? I assume it’s a two sample hypothesis test with some form of multiple hypothesis correction, but the exact method that is used would be useful for reporting and reproducibility.

It’s a great tool overall though, really has sped up the process of working through a lot of datasets quickly.

Hi Shovik,

As you correctly guessed, the two class comparison consists of a simple linear hypothesis test followed by a multiple hypothesis correction step.

In particular, for each feature/column of the selected dataset, we fit a simple linear regression model to the chosen phenotype of interest. The estimated regression coefficient and its standard error are then fed into the adaptive shrinkage method described in to obtain moderated effect sizes (posterior mean estimates) and corresponding q-values (FDR). For binary features, this methodology is roughly equivalent to a t-test with a pooled variance estimate; the only difference is the shrinkage step.
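To make the t-test equivalence concrete, here is a minimal sketch in Python. It is not the code from our repo. It assumes NumPy and SciPy are available, uses simulated data, and the helper name `regression_summary` is made up for illustration. It only covers the regression step; the adaptive shrinkage step is not reproduced here.

```python
import numpy as np
from scipy import stats


def regression_summary(group, values):
    """Fit values ~ group by ordinary least squares.

    Returns the slope, its standard error, and the two-sided p-value.
    (Hypothetical helper, for illustration only.)
    """
    fit = stats.linregress(group, values)
    return fit.slope, fit.stderr, fit.pvalue


# Simulated example: 12 samples in class 0 and 15 samples in class 1.
rng = np.random.default_rng(0)
group = np.repeat([0, 1], [12, 15])
values = rng.normal(loc=0.5 * group, scale=1.0)

# Regression on the 0/1 class indicator: the slope is the difference in class means.
slope, stderr, p_reg = regression_summary(group, values)
t_reg = slope / stderr

# Classic two-sample t-test with a pooled variance estimate.
t_test = stats.ttest_ind(values[group == 1], values[group == 0], equal_var=True)

print("regression t:", t_reg, "p:", p_reg)
print("pooled t-test t:", t_test.statistic, "p:", t_test.pvalue)
```

The two t-statistics and p-values agree up to floating-point error. In the actual analysis, the slope and standard error for every feature would then go through the shrinkage step instead of being used directly.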

Also, for the sake of reproducibility, we keep the code used in this analysis in this GitHub repo: