Today, diagnosing rare genetic diseases requires a slow process of educated guesswork. Gill Bejerano, PhD, associate professor of developmental biology and of computer science at Stanford, is working to speed it up.
In a paper published July 12 in Genetics in Medicine, Bejerano and his colleagues describe an algorithm they’ve developed that automates the most labor-intensive part of genetic diagnosis: matching a patient’s genetic sequence and symptoms to a disease described in the scientific literature. Without computer help, this matching process takes 20-40 hours per patient. The expert looks at a list of around 100 of the patient’s suspicious-looking mutations, makes an educated guess about which one might cause disease, checks the scientific literature, then moves on to the next one.
The algorithm developed by Bejerano’s team cuts the time needed by 90 percent.
“Clinicians’ time is expensive; computer time is cheap,” said Bejerano, who worked with experts in computer science and pediatrics to develop the new technique. “If I’m a busy clinician, before I even open a patient’s case, the computer needs to have done all it can to make my life easier.”
A Phrank approach
The algorithm’s name, Phrank, a mashup of “phenotype” and “rank,” hints at how it works: Phrank compares a patient’s symptoms and gene data to a knowledge base of medical literature, generating a ranked list of which rare genetic diseases are most likely to be responsible for the symptoms. This gives the clinician a logical starting point for making a diagnosis, which can then be confirmed with one to four hours of effort per case instead of 20-40 hours.
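To make the idea of phenotype-based ranking concrete, here is a minimal sketch in Python. It is a toy illustration, not Phrank's actual scoring method, which the article does not describe in detail. The knowledge base, the disease and gene names, and the functions `term_weights` and `rank_diseases` are all hypothetical. The sketch assumes that rarer symptoms should count for more, and that only diseases whose gene appears among the patient's candidate genes are considered.

```python
import math

# Hypothetical toy knowledge base: disease -> (causal gene, set of symptom terms).
KNOWLEDGE_BASE = {
    "disease_A": ("GENE1", {"seizures", "short stature", "hearing loss"}),
    "disease_B": ("GENE2", {"seizures", "cleft palate"}),
    "disease_C": ("GENE3", {"hearing loss", "short stature"}),
}


def term_weights(knowledge_base):
    """Weight each symptom by rarity: -log(fraction of diseases listing it)."""
    n = len(knowledge_base)
    counts = {}
    for _gene, symptoms in knowledge_base.values():
        for symptom in symptoms:
            counts[symptom] = counts.get(symptom, 0) + 1
    return {symptom: -math.log(c / n) for symptom, c in counts.items()}


def rank_diseases(patient_symptoms, patient_genes, knowledge_base):
    """Return (disease, score) pairs, best match first.

    Assumption for this sketch: a disease is a candidate only if its gene
    is among the patient's suspicious genes, and its score is the summed
    weight of the symptoms it shares with the patient.
    """
    weights = term_weights(knowledge_base)
    scored = []
    for disease, (gene, symptoms) in knowledge_base.items():
        if gene not in patient_genes:
            continue
        score = sum(weights[s] for s in symptoms & patient_symptoms)
        scored.append((disease, score))
    # Sort by descending score, breaking ties alphabetically by disease name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    ranked = rank_diseases(
        {"seizures", "cleft palate"}, {"GENE1", "GENE2"}, KNOWLEDGE_BASE
    )
    for disease, score in ranked:
        print(f"{disease}: {score:.3f}")
```

In this toy case, the patient shares a rare symptom ("cleft palate") with disease_B, so it ranks above disease_A, which matches only on a more common symptom. A clinician would start by reviewing the top of such a list.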
The mathematical workings of Phrank aren’t tied to a specific database, a first for this type of algorithm. This makes it much more flexible to use.
Phrank also dramatically outperforms earlier algorithms that have tried to do the same thing, according to the paper. Bejerano’s team validated Phrank on medical and genetic data from 169 patients, an important advance over earlier studies in the field. Prior studies had instead tested algorithms on made-up patients, because real-patient data for this research is hard to come by.
“The problem is that this test [using synthetic patients] is just too easy,” Bejerano said. “Real patients don’t look exactly like a textbook description.” On data from real patients, one older algorithm ranked the patient’s true diagnosis 33rd, on average, on the list of potential diagnoses it generated; Phrank, on average, ranked the true diagnosis fourth.
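The comparison above rests on a simple metric: the position of the true diagnosis in each patient's ranked list, averaged over all patients. A minimal sketch of that calculation follows. The function name `mean_true_rank` and the sample data are hypothetical, and the paper's exact evaluation procedure may differ.

```python
def mean_true_rank(ranked_lists, true_diagnoses):
    """Average 1-based position of the true diagnosis across patients.

    ranked_lists: one list of candidate diagnoses per patient, best first.
    true_diagnoses: the confirmed diagnosis for each patient, same order.
    Assumes every true diagnosis appears somewhere in its patient's list.
    """
    ranks = [
        ranked.index(truth) + 1
        for ranked, truth in zip(ranked_lists, true_diagnoses)
    ]
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    # Hypothetical data for three patients.
    lists = [["d1", "d2", "d3"], ["d2", "d1", "d3"], ["d3", "d2", "d1"]]
    truths = ["d1", "d1", "d1"]
    print(mean_true_rank(lists, truths))
```

A lower value is better: a mean rank of four means the clinician would, on average, reach the correct diagnosis after reviewing four candidates.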
Phrank also holds potential for helping doctors identify new genetic diseases, Bejerano said. For example, if a patient’s symptoms can’t be matched to any known human diseases, the algorithm could check for clues in a broader knowledge base. “You might get the result that mouse experiments cause phenotypes similar to your patient, that you may have found the first human patient that suffers from this disease,” Bejerano said.
Ultimately, “nobody is going to replace a clinician making a diagnosis,” he said. But new technology could help experts use their time more efficiently, allowing many more patients to get diagnosed, he added.
The lead authors of the paper are graduate students Karthik Jagadeesh, MS, and Johannes Birgmeier, MS. Other Stanford co-authors are Jon Bernstein, MD, PhD, associate professor of pediatrics; undergraduate student Cole Deisseroth; and former graduate students Harendra Guturu, PhD, and Aaron Wenger, PhD.