To apply artificial intelligence to the task, the authors of the Nature report used mammograms from about 76,000 women in Britain and 15,000 in the United States, whose diagnoses were already known, to train computers to recognize cancer.
Then they tested the computers on images from about 25,000 other women in Britain and 3,000 in the United States, and compared the system’s performance with that of the radiologists who had originally read the X-rays. The mammograms had been taken in the past, so the women’s outcomes were known, and the researchers could tell whether the initial diagnoses were correct.
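In broad strokes, this kind of retrospective comparison amounts to scoring the A.I.’s reads and the radiologists’ original reads against the known outcomes. A minimal sketch of that scoring, with invented toy data rather than anything from the study:

```python
# Illustrative sketch only -- not the study's actual pipeline or data.
# Scores reads (AI or human) against known outcomes on past mammograms.

def sensitivity_specificity(truth, reads):
    """Sensitivity: fraction of actual cancers flagged.
    Specificity: fraction of cancer-free cases correctly cleared."""
    tp = sum(1 for t, r in zip(truth, reads) if t and r)
    tn = sum(1 for t, r in zip(truth, reads) if not t and not r)
    pos = sum(truth)
    neg = len(truth) - pos
    return tp / pos, tn / neg

# Toy labels: 1 = cancer present / flagged, 0 = no cancer / cleared.
truth       = [1, 1, 1, 0, 0, 0, 0, 1]
radiologist = [1, 0, 1, 0, 1, 0, 0, 1]
ai_model    = [1, 1, 1, 0, 0, 1, 0, 0]

for name, reads in [("radiologist", radiologist), ("AI", ai_model)]:
    sens, spec = sensitivity_specificity(truth, reads)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

Note that even when two readers score identically over all, they can miss different cases, which is exactly the pattern the researchers went on to observe.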
“We took mammograms that already happened, showed them to radiologists and asked, ‘Cancer or no?’ and then showed them to A.I. and asked, ‘Cancer or no?’” said Dr. Mozziyar Etemadi, an author of the study from Northwestern University.
This was the test that found A.I. more accurate than the radiologists.
Unlike humans, computers do not get tired, bored or distracted toward the end of a long day of reading mammograms, Dr. Etemadi said.
In another test, the researchers pitted A.I. against six radiologists in the United States, presenting 500 mammograms to be interpreted. Over all, A.I. again outperformed the humans.
But in some instances, A.I. missed a cancer that all six radiologists found — and vice versa.
“There’s no denying that in some cases our A.I. tool totally gets it wrong and they totally get it right,” Dr. Etemadi said. “Purely from that perspective it opens up an entirely new area of inquiry and study. Why is it that they missed it? Why is it that we missed it?”
Dr. Lehman, who is also developing A.I. for mammograms, said the Nature report was strong, but she had some concerns about the methods, noting that the patients studied might not be a true reflection of the general population. A higher proportion had cancer, and the racial makeup was not specified. She also said that “reader” analyses involving a small number of radiologists — this study used six — were not always reliable.