Assessing the merit of scientific papers can be a challenging task, even for experts. The peer review process can be lengthy and is often subjective.
The existence of published studies that researchers have been unable to replicate has also raised concerns about the review process.
One survey found that more than 70% of researchers have failed to reproduce another scientist’s experiments, with more than half failing to reproduce their own research findings. Some have even described this issue as a crisis.
With no consistent method to detect which papers are reproducible and which are not, many of the latter continue to circulate through the scientific literature.
To help scientists determine which research is the most promising, a team from the Kellogg School of Management at Northwestern University in Evanston, IL, has developed a machine learning tool that takes subjective opinion out of the process and drastically shortens the review period.
The details of the model feature in PNAS.
Explaining the limits of peer review, Prof. Brian Uzzi, who led this study, says: “The standard process is too expensive, both financially and in terms of opportunity costs. First, it takes too long to move on to the second phase of testing, and second, when experts are spending their time reviewing other people’s work, it means they are not in the lab conducting their own research.”
Uzzi and his team have developed a form of artificial intelligence (AI) to help the scientific community make quicker decisions on which studies are most likely to yield benefits.
One of the most important tests of a study's quality is its reproducibility: whether other scientists obtain the same findings when they repeat its experiments. The algorithm that Uzzi and his team developed predicts this property.
The model, which combines real human input with machine intelligence, makes this prediction by analyzing the words that scientific papers use and recognizing patterns that indicate that the findings have value.
“There is a lot of valuable information in how study authors explain their results,” explains Uzzi. “The words they use reveal their own confidence in their findings, but it is hard for the average human to detect that.”
The model can pick up on word choice patterns that may be hidden to a human reviewer, who might instead focus on the strength of the statistics in a paper, the developers say. There is also a risk that reviewers may be biased toward the topic or the journal that published the paper, or that persuasive words such as “remarkable” might influence them.
The researchers first trained the model using a set of studies that were known to be reproducible and a set of those known not to be. They then tested the model on a group of studies that it had never seen before.
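The paper itself is not a code tutorial, but the general recipe it describes (label past studies as replicable or not, extract signals from their text, fit a model, then evaluate it on papers it has never seen) resembles a standard text-classification workflow. The sketch below is purely illustrative, using TF-IDF features and logistic regression from scikit-learn as placeholders; the team's actual model is more sophisticated and blends machine and human input.

```python
# Illustrative sketch only: a generic text classifier fit to papers labeled
# as replicable (1) or not (0). The data, features, and model here are
# placeholders, not the architecture described in the PNAS paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training corpus: text of studies with known replication outcomes.
train_texts = [
    "We observed a robust, significant effect across all three experiments.",
    "Results were suggestive of an effect, though it may depend on the sample.",
]
train_labels = [1, 0]  # 1 = findings later replicated, 0 = they did not

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # captures word-choice patterns
    LogisticRegression(),
)
model.fit(train_texts, train_labels)

# Score a paper the model has never seen: estimated probability of replication.
new_paper = "We report a striking and remarkable effect in a single small study."
print(model.predict_proba([new_paper])[0, 1])
```

In a real pipeline, the held-out evaluation that the researchers describe would use a separate set of labeled papers rather than a single new document.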
They compared the output with that of the Defense Advanced Research Projects Agency's Systematizing Confidence in Open Research and Evidence (DARPA SCORE) program, which relies on subject experts to review and rate scientific studies, a process that, on average, takes the best part of a year to complete.
When the team used the model on its own, its accuracy was similar to that of the DARPA SCORE, but it was much quicker, taking minutes instead of months.
Combined with the DARPA SCORE, the model predicted which findings would replicate with even greater accuracy than either method alone. In practice, scientists are likely to use it this way, as a complement to human assessment.
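The article does not specify how the model's prediction and the expert ratings are combined. One simple way to illustrate blending a fast machine estimate with a slower human one is a weighted average; the function and the 50/50 weighting below are illustrative assumptions, not values from the study.

```python
def combined_replication_score(model_prob: float, expert_score: float,
                               model_weight: float = 0.5) -> float:
    """Blend a machine-predicted replication probability with a human
    expert rating (both on a 0-1 scale). The equal weighting is an
    illustrative assumption, not the method used in the PNAS paper."""
    return model_weight * model_prob + (1.0 - model_weight) * expert_score

# Example: the model estimates 0.72, expert reviewers rate the study 0.60.
print(combined_replication_score(0.72, 0.60))  # 0.66
```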
“This tool will help us conduct the business of science with greater accuracy and efficiency,” Uzzi says. “Now, more than ever, it’s essential for the research community to operate lean, focusing only on those studies which hold real promise.”
The team says that the rollout of the model could be immediate, so it could analyze the raft of COVID-19-related research that is currently emerging.
“In the midst of a public health crisis, it is essential that we focus our efforts on the most promising research,” says Prof. Uzzi. “This is important not only to save lives but also to quickly tamp down the misinformation that results from poorly conducted research.”
Research is taking place at an unprecedented rate, and policymakers around the world are planning to accelerate clinical trials to find a treatment or vaccine for the disease. The Northwestern researchers say that their tool could help policymakers prioritize the most promising studies when allocating resources.
“This tool is particularly useful in this crisis situation where we can’t act fast enough. It can give us an accurate estimate of what’s going to work and not work very quickly. We’re behind the ball, and this can help us catch up,” concludes Uzzi.