Abstract
An accurate blood test for early-stage cancer (a “liquid biopsy”) is arguably the most important open problem in oncology, and the race to a solution is tantalizingly close to the finish. In this talk, we will discuss the state of this race as of 2023, and how optimization will play a role in reaching the finish line. In particular, we address a set of active learning problems that occur in the development of liquid biopsies via the lens of active sequential hypothesis testing (ASHT). This is a classic problem in which a learner seeks to identify the "true" hypothesis from among a known set of hypotheses, given a set of actions. Motivated by applications in which the number of hypotheses or actions is massive, we propose efficient (greedy, in fact) algorithms and provide the first approximation guarantees for ASHT, under two types of adaptivity. Both of our guarantees are independent of the number of actions and logarithmic in the number of hypotheses. We numerically evaluate the performance of our algorithms using both synthetic and real-world DNA mutation data, demonstrating that our algorithms outperform previously proposed heuristic policies by large margins. This is joint work with Kyra Gan, Su Jia, and Sridhar Tayur.