Better multiple testing: using multivariate co-data for hypothesis weighting

Speaker: 
Daniel Fridljand (Yale University)
Event time: 
Wednesday, February 15, 2023 - 1:00pm
Location: 
AKW 200
Event description: 

Consider a multiple testing task in which we have access to a p-value and informative covariates for each test. Independent hypothesis weighting (IHW) uses these covariates to stratify the tests into bins, which are then assigned different weights. Currently, the stratification is performed by quantile-slicing the covariates. However, this does not take full advantage of the data. First, quantiles cannot capture the heterogeneity among tests. Second, the procedure becomes infeasible for high-dimensional covariates. We address this gap with IHW-Forest, a random-forest-based approach in which the leaves of the trees replace the bins. The trees can handle high-dimensional covariates, and an orthonormal series expansion makes growing the trees computationally efficient. The objective function is chosen such that the splits are sensitive to the shape of the postulated conditional density. This yields homogeneous bins and hence increases power while still controlling the false discovery rate. We demonstrate the power increase over competing methods in simulations. We further apply IHW-Forest to an scRNA-seq data set used to understand the biological differences between healthy and diseased states.
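For readers unfamiliar with the baseline being improved, the quantile-slicing step and the weighted testing step can be sketched as follows. This is only a minimal illustration, not the speaker's implementation: the function names `quantile_bins` and `weighted_bh`, the example p-values, and the per-bin weights are all hypothetical. IHW learns the per-bin weights from the data, a step that is not shown here. The sketch assumes a single covariate and applies the standard weighted Benjamini-Hochberg procedure.

```python
import numpy as np

def quantile_bins(covariate, n_bins):
    """Assign each test to one of n_bins groups by covariate rank (quantile slicing)."""
    covariate = np.asarray(covariate, dtype=float)
    # Double argsort yields ranks 0..m-1; ties are broken by position.
    ranks = np.argsort(np.argsort(covariate, kind="stable"), kind="stable")
    return (ranks * n_bins) // len(covariate)

def weighted_bh(pvalues, weights, alpha=0.1):
    """Weighted Benjamini-Hochberg: apply BH to p_i / w_i, where the weights average to 1."""
    p = np.asarray(pvalues, dtype=float)
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or not np.isclose(w.mean(), 1.0):
        raise ValueError("weights must be non-negative and average to 1")
    m = len(p)
    # A hypothesis with weight zero can never be rejected.
    q = np.full(m, np.inf)
    positive = w > 0
    q[positive] = p[positive] / w[positive]
    order = np.argsort(q)
    passed = q[order] <= alpha * np.arange(1, m + 1) / m
    rejected = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting the BH threshold
        rejected[order[: k + 1]] = True
    return rejected

# Hypothetical toy data: four tests, one covariate, two bins.
pvalues = np.array([0.01, 0.04, 0.03, 0.5])
covariate = np.array([1.0, 2.0, 3.0, 4.0])
bins = quantile_bins(covariate, n_bins=2)

# Hypothetical per-bin weights (IHW would learn these from the data).
bin_weights = np.array([0.5, 1.5])
weights = bin_weights[bins]
weights = weights / weights.mean()  # enforce the mean-one constraint

rejected = weighted_bh(pvalues, weights, alpha=0.1)
```

In the approach described in the abstract, the bin label of each test would come from the leaf of a tree it falls into rather than from `quantile_bins`; the weighted testing step is the part this sketch shares with it.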

Event Type: 
Applied Mathematics