MIT developing cancer-diagnosis AI software

Chuck Bednar for redOrbit – @BednarChuck

Researchers from the MIT Computer Science and Artificial Intelligence Laboratory are working on a new software system capable of correctly diagnosing cancer in a patient and even determining which type he or she has, potentially speeding up the treatment process.

According to Gizmodo, Ph.D. student Yuan Luo, MIT Professor Peter Szolovits, and a team of experts from Massachusetts General Hospital have developed a software system that analyzes data from existing medical records, then suggests potential cancer diagnoses to doctors.

Using the system to improve lymphoma diagnoses

In a paper published in the Journal of the American Medical Informatics Association earlier this month, the authors demonstrate how their system can be used to identify the 50 different types of the difficult-to-diagnose cancer lymphoma. Up to 15 percent of lymphoma cases are initially misdiagnosed, the MIT team explained, which could cause unnecessary delays in treatment.

The software accessed a massive collection of pathology reports, Gizmodo said, gathering data that can be linked to the relationships the World Health Organization (WHO) uses to define the subtypes of the cancer. Furthermore, the system links words that appear frequently in the medical records to each of those data points in order to provide another layer of information.

“It is important to ensure that classification guidelines are up-to-date and accurately summarized from a large number of patient cases,” Luo, the first author of the study, explained in a statement. “Our work combs through detailed medical cases to help doctors more comprehensively capture the subtle distinctions between lymphomas.”

Making doctors’ jobs easier

The researchers also emphasize that the AI models need to be not only accurate but also doctor-friendly, meaning that clinical workers and medical personnel must be able to easily interpret the findings. The information their system collects is converted into a graph representation that features medical concepts as the nodes and semantic dependencies as the edges.
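To make the graph idea concrete, here is a minimal sketch in Python of how a sentence from a pathology report might be stored with medical concepts as nodes and semantic dependencies as edges. The example triples, the relation names, and the build_graph helper are all hypothetical illustrations, not taken from the MIT team's paper or code.

```python
from collections import defaultdict


def build_graph(dependencies):
    """Build an adjacency map from (head, relation, dependent) triples.

    Each concept becomes a node; each semantic dependency becomes a
    labeled, directed edge from the head concept to the dependent one.
    """
    graph = defaultdict(list)
    for head, relation, dependent in dependencies:
        graph[head].append((relation, dependent))
        graph.setdefault(dependent, [])  # make sure leaf concepts are nodes too
    return dict(graph)


# Hypothetical dependencies for a sentence such as
# "large atypical cells express CD30" (illustrative only).
triples = [
    ("cells", "modifier", "large"),
    ("cells", "modifier", "atypical"),
    ("express", "subject", "cells"),
    ("express", "object", "CD30"),
]

graph = build_graph(triples)
for node, edges in graph.items():
    print(node, "->", edges)
```

A plain dictionary keeps the sketch dependency-free; a real system would more likely rely on a parser's output and a dedicated graph library.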

“Clinicians’ diagnostic reasoning is based on multiple test results simultaneously,” Luo said. “Thus it is necessary for us to automatically group subgraphs in a way that corresponds to the panel of test results. This makes the model interpretable to clinicians instead of being a black-box, as they often complain about many other machine-learning models.”

Their work uses a technique known as Subgraph Augmented Non-negative Tensor Factorization (SANTF), which organizes data from roughly 800 medical cases into a three-dimensional table that can easily link test results to lymphoma subtypes. In their paper, the authors reported that SANTF was 10 percent more effective than similar methods.
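The general idea behind non-negative tensor factorization can be sketched with NumPy. The code below is not SANTF itself: it is a generic non-negative CP decomposition using multiplicative updates, run on a small synthetic tensor. The axis meanings (patients, subgraphs, words), the tensor sizes, the rank, and every function name here are assumptions made for illustration.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    rank = a.shape[1]
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, rank)


def unfold(x, mode):
    """Flatten a 3-way tensor into a matrix whose rows index `mode`."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def nonneg_cp(x, rank, iters=200, seed=0, eps=1e-9):
    """Approximate x by a sum of `rank` non-negative rank-one tensors."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in x.shape]
    for _ in range(iters):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numer = unfold(x, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            # Multiplicative update keeps every entry non-negative.
            factors[mode] *= numer / denom
    return factors


def relative_error(x, factors):
    """Frobenius-norm reconstruction error relative to the norm of x."""
    a, b, c = factors
    approx = np.einsum("ir,jr,kr->ijk", a, b, c)
    return np.linalg.norm(x - approx) / np.linalg.norm(x)


# Synthetic stand-in data: 6 "patients" x 5 "subgraphs" x 4 "words",
# built from two hidden non-negative groups (purely illustrative).
rng = np.random.default_rng(42)
true_factors = [rng.random((dim, 2)) for dim in (6, 5, 4)]
tensor = np.einsum("ir,jr,kr->ijk", *true_factors)

factors = nonneg_cp(tensor, rank=2)
print("relative error:", relative_error(tensor, factors))
```

Each column of the first factor matrix can be read as one patient group, with the matching columns of the other two matrices showing which subgraphs and words characterize that group. Groupings of that kind are what the researchers describe as making the model interpretable.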

“The promise of Luo’s work, if applied to very large data sets, is that the criteria that would then help to identify these clusters can inform doctors about how to understand the range of lymphomas and their clinical relationships to each other,” said co-author Peter Szolovits, adding that he is confident that the model could lead to more accurate lymphoma diagnoses.

“Our ultimate goal is to be able to focus these techniques on extremely large amounts of lymphoma data, on the order of millions of cases,” he noted. “If we can do that, and identify the features that are specific to different subtypes, then we’d go a long way towards making doctors’ jobs easier – and, maybe, patients’ lives longer.”

