Technology
11 August 2020
By Chris Stokel-Walker
A new way of training medical artificial intelligence (AI) systems has proven significantly more accurate at diagnosing illnesses than previous efforts.
The AI system developed by researchers at University College London and Babylon Health, a medical service provider in the UK, relies on causation rather than correlation to pinpoint what could be wrong with people. It proved more accurate than pre-existing AI systems and even outperformed real-life doctors in a small, controlled test.
Unlike traditional AI systems, which identify the most probable disease given the symptoms a patient presents with, the causal AI system more closely mimics the way a doctor diagnoses patients: by asking counterfactual questions, such as whether a symptom would disappear if a suspected disease were absent, to narrow the range of possible conditions.
The difference between correlation and causation matters in medicine. A patient could present at a hospital with shortness of breath. A correlation-based AI may link shortness of breath with being overweight, and being overweight with having type-2 diabetes, and so recommend insulin. A causation-based system might instead focus on the link between shortness of breath and asthma, and so explore other treatment options.
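To make the distinction concrete, here is a minimal sketch in Python of the two ways of ranking diseases, using a toy "noisy-OR" model. This is not Babylon Health's code, and the diseases, priors, leak and link strengths are invented purely for illustration. The associative score is the usual posterior probability of each disease given the symptoms; the counterfactual score, which the paper calls "expected disablement", counts how many of the observed symptoms would be expected to disappear if that disease were switched off.

```python
# A minimal sketch, NOT the authors' code: diseases, priors, leak and
# link strengths below are invented purely for illustration.
import itertools

DISEASES = ["asthma", "type2_diabetes"]
SYMPTOMS = ["short_of_breath", "wheeze", "thirst"]
PRIOR = {"asthma": 0.02, "type2_diabetes": 0.30}  # hypothetical prevalence
LEAK = 0.10  # chance a symptom appears with no modelled cause
# P(causal link fires | disease present): noisy-OR parameters
LINK = {
    ("asthma", "short_of_breath"): 0.9, ("asthma", "wheeze"): 0.7,
    ("asthma", "thirst"): 0.0,
    ("type2_diabetes", "short_of_breath"): 0.5,
    ("type2_diabetes", "wheeze"): 0.0, ("type2_diabetes", "thirst"): 0.6,
}

def worlds():
    """Enumerate every latent state: disease values, one noise bit per
    disease-symptom link, one leak bit per symptom, with its probability."""
    n_links = len(DISEASES) * len(SYMPTOMS)
    n_bits = len(DISEASES) + n_links + len(SYMPTOMS)
    for bits in itertools.product([0, 1], repeat=n_bits):
        d = dict(zip(DISEASES, bits))
        links = dict(zip(itertools.product(DISEASES, SYMPTOMS),
                         bits[len(DISEASES):len(DISEASES) + n_links]))
        leaks = dict(zip(SYMPTOMS, bits[len(DISEASES) + n_links:]))
        p = 1.0
        for dis in DISEASES:
            p *= PRIOR[dis] if d[dis] else 1 - PRIOR[dis]
        for key, b in links.items():
            p *= LINK[key] if b else 1 - LINK[key]
        for sym, b in leaks.items():
            p *= LEAK if b else 1 - LEAK
        yield d, links, leaks, p

def symptom_on(sym, d, links, leaks):
    # Noisy-OR: a symptom fires if its leak fires or any present
    # disease's link to it fires.
    return leaks[sym] or any(d[dis] and links[(dis, sym)] for dis in DISEASES)

# The patient from the article: short of breath, and also wheezing.
evidence = {"short_of_breath": 1, "wheeze": 1}

posterior = {dis: 0.0 for dis in DISEASES}    # associative score
disablement = {dis: 0.0 for dis in DISEASES}  # counterfactual score
z = 0.0
for d, links, leaks, p in worlds():
    if any(symptom_on(s, d, links, leaks) != v for s, v in evidence.items()):
        continue  # latent state inconsistent with the observed symptoms
    z += p
    for dis in DISEASES:
        posterior[dis] += p * d[dis]
        # Counterfactual: remove this disease, keep the same noise bits,
        # and count how many observed symptoms would switch off.
        d_cf = {**d, dis: 0}
        cured = sum(v and not symptom_on(s, d_cf, links, leaks)
                    for s, v in evidence.items())
        disablement[dis] += p * cured

for dis in DISEASES:
    print(f"{dis}: posterior {posterior[dis] / z:.3f}, "
          f"expected disablement {disablement[dis] / z:.3f}")
```

With these invented numbers the two rankings disagree: the more common disease gets the higher posterior probability, but the rarer disease that explains both symptoms scores higher on the counterfactual measure, which is the kind of ranking flip the causal approach is designed to produce.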
“We set out to put causation back into the picture, so we could really find the diseases causing the symptoms in the patient, and help them based on that,” says Ciarán Gilligan-Lee at University College London, one of the authors of the paper.
The system was given 1671 realistic medical case summaries developed by more than 20 doctors, covering symptoms of around 350 different illnesses. A group of 44 doctors from the UK’s National Health Service (NHS) also tackled an average of 159 of these cases each, to see if they could work out what was wrong. On average, the doctors diagnosed cases correctly 71.4 per cent of the time, while an older correlation-based AI was right 72.5 per cent of the time. The causal AI was correct 77.3 per cent of the time.
When it came to a subset of particularly rare diseases such as non-Hodgkin’s lymphoma, the new AI still outperformed doctors. For these cases, it was roughly 30 per cent better than the older AI system. However, doctors were better at identifying more common problems – because they encounter them frequently, reckons Gilligan-Lee. He plans to seek regulatory approval and clinical validation for the system, with the goal of placing it within an app where patients can get advice on symptoms and treatment.
“They’re very much describing a new technical approach to a problem,” says Xiaoxuan Liu of the University Hospitals Birmingham NHS Foundation Trust, UK, who has conducted research into medical deep learning systems. “The methodology in the paper is very good, and the technique does seem to show some promise.”
The fact that the system outperforms doctors at diagnosing rare diseases is exciting, says Liu, though she cautions that the work is at an early stage and the number of case summaries was comparatively small. “We need to see how it works in real world cases, where the history isn’t very clear and sometimes you have multiple diseases interacting.”
Journal reference: Nature Communications, DOI: 10.1038/s41467-020-17419-7