PI: HaiYing Wang · University of Connecticut
As data volumes grow rapidly in the big data era, we develop a strategy for selecting the most informative data points for model building, which reduces the computational burden. In contrast to previous studies, which focused on parametric models, our research explores the efficacy of optimal subsampling methods for gradient boosting trees, a semi-parametric method.
Cumulative usage, Jul 2, 2025 – Jul 2, 2026: data delivered over the OSDF, jobs, files via OSDF, CPU hours, GPU hours.