Is prediction enough?
This question animated, and gave its title to, a lively, debate-style colloquium held April 25 on the promises and perils of machine learning. Organized by the Department of Health Research and Policy’s Division of Epidemiology, the event drew some 200 data wranglers and scientists, who heard an interdisciplinary lineup of speakers present their best arguments for and against various machine-learning strategies.
A video of this half-day event can be viewed here.
Value of expert knowledge
Miguel Hernan, MD, DrPH, professor of epidemiology and biostatistics at the Harvard T.H. Chan School of Public Health, advocated for pairing algorithms with expert knowledge. “For causal questions, we need data and a good algorithm, but we also need expert knowledge,” Hernan said.
Nigam Shah, MBBS, PhD, associate professor of biomedical data science, described a project at the medical school called the “green button,” which enables Stanford physicians to submit a clinical question to a bioinformaticist and receive a quick-and-dirty answer in a few seconds, culled from 150 million patient records. He asserted that only a handful of these clinical questions could be answered with a medical guideline or a randomized clinical trial, so health-care providers shouldn’t let the perfect be the enemy of the good. Rather, physicians should be allowed to apply their expert knowledge to this kind of data-driven evidence to make the best clinical decisions.
The colloquium featured tales of investigators led astray by biased data, woefully misguided causal inferences and “black box” algorithms, so called because their decision-making processes are inscrutable to outside observers. Presentations included “Data science is science’s second chance to get causal inference right: A taxonomy of data science tasks and its implications,” “Avoiding discrimination through causal reasoning,” “Learning objectives for causal inference,” “An informatics consult service for using aggregate patient data at the bedside” and “Offline policy evaluation for algorithmic decisions.”
Michael Baiocchi, PhD, a statistician and assistant professor of medicine at the School of Medicine, issued a cri de coeur, urging attendees not to let algorithmic learning obliterate what we know about how to practice good science: “Being a statistician means we are defenders of the scientific method. We are the ones, through several generations, who have codified it, mathematized it and quantified it. We guide the content experts beyond the gate to help them find new knowledge and bring it back.”