DeepMind scientists began with old-school Atari games.1 Using deep learning and reinforcement learning, they developed an artificial intelligence (AI) program that learned to play Breakout, Pong, and other games at superhuman levels after being provided only display-pixel data and scores; see how remarkably quickly it learned.1,2 They caught the eyes of Google executives, who acquired the company and encouraged the team to tackle the 3000-year-old game of Go. Go had been an elusive goal for AI researchers, whose game-tree-search approaches could not handle Go's complexity. The number of possible board configurations in this game of deceptively simple rules is estimated to be 10^170, about 10^90 times greater than the number of atoms in the known universe. Using machine learning similar to that of the Atari project, DeepMind developed AlphaGo, which learned first from human-expert games and then from self-play.3 In 2015, AlphaGo became the first AI to defeat a human Go master.1 In 2016, AlphaGo defeated 18-time world champion Lee Sedol in an AI landmark event that inspired a movie.4
In this issue of JAMA Ophthalmology, Coyner et al5 report an AI milestone of their own: the first validated autonomous AI algorithm for retinopathy of prematurity (ROP) screening. They developed and validated a deep learning model, i-ROP-DL, that assesses plus-like posterior-pole vascular changes in retinal photographs to determine whether an infant requires examination by an ophthalmologist. In development, they applied robust AI methodology to 2530 images from 843 infants in the US. In validation, i-ROP-DL flagged in advance 100% of the infants who required treatment among 2244 infants from existing ROP telemedicine programs at 59 hospitals in the US and India. The broader context is important. Although details of treatment timing and modality remain to be studied, we generally know how to treat severe ROP to acutely reduce the risk of retinal detachment. Globally, most blindness due to ROP results from inadequate screening infrastructure and too few ophthalmologists who manage ROP, particularly in low- and middle-income countries. Telemedicine can help address resource limitations, but what if there are not even enough ophthalmologists to read the images? The e-ROP study showed that trained nonphysicians could do the job,6 but given steady technological progress, it is likely that AI algorithms will eventually do it better than humans. This report is a fundamental step in that direction.
On the 37th move of its second game against Lee Sedol, AlphaGo made what experts at first thought was a mistake, until they realized it was a completely novel strategy. One Go master commented, "It's not a human move. I've never seen a human play this move. So beautiful. Beautiful…. Beautiful…."4 As the technology advanced, AlphaGo evolved into AlphaZero, which learned Go better from scratch by playing itself repeatedly, without first studying human games. The AI became simpler, stronger, and more generalizable. After only a few hours of training, AlphaZero was able to defeat all of the best Go programs, and it taught itself chess just as quickly.7
As with Go, maybe AI will teach us something new about ROP and evolve to interpret images better than humans. At this point, however, how much of the screening process should we entrust to AI? Consider some current limitations. The i-ROP-DL algorithm does not determine stage, just plus-like change, though the two are closely related. It also does not assess zone or signal mature vasculature. It has not yet been trained to identify a retinal detachment, retinoblastoma, or other unexpected finding that a human would recognize. More generally, telemedicine requires available cameras and photographers, image quality and resolution must be high, and, from the infant's point of view, obtaining photographs is still a repeated intervention. Finally, there is a profound issue related to the way AI represents things, because it turns out that deep learning models do not understand things deeply.
Go continued to be dominated by AI, which only got faster and stronger, until early 2023, when researchers led by Stuart Russell at the University of California, Berkeley, surprised the world when their amateur Go player defeated all of the leading Go programs using a telling "circular group sandwich" strategy.8 They reasoned that the algorithms could detect patterns but had limited conceptual representations of what was happening; the AI did not understand, for example, the basic strategic concept of a group of pieces, and this weakness could be exploited.8 Without conceptual understanding, deep learning models, which express things using giant, circuit-type designs, are vulnerable to small aberrations, which can result in mistakes or failures; and because we really do not understand what the circuits do, the mistakes are unpredictable.8 This characteristic has been seen in image recognition, where, for example, an AI algorithm changed an initially correct answer of "panda" to a highly confident but incorrect answer of "gibbon" after minor "noise" pixels were introduced into the image.8 What if noise enters a retinal photograph, changing the diagnosis? i-ROP-DL does not understand that there are blood vessels, or a retina, or an infant who could go blind. And we fundamentally cannot predict what mistakes might happen, or when. Even immense amounts of training data cannot address this inherent limitation.8
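For readers curious how such fragility arises in practice, the panda-to-gibbon result is typically produced with the fast gradient sign method: each pixel is nudged slightly in the direction that most increases the model's loss, leaving an image that looks unchanged to a human. The following is a minimal sketch using PyTorch and a generic pretrained image classifier, not the authors' model; the function name and the epsilon value are illustrative only.

```python
# Minimal fast-gradient-sign-method (FGSM) sketch: perturb each pixel in the
# direction that increases the classifier's loss, then re-classify.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, ResNet18_Weights

weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights).eval()  # generic pretrained classifier
preprocess = weights.transforms()         # matching input preprocessing

def fgsm_perturb(image: torch.Tensor, label: torch.Tensor,
                 epsilon: float = 0.01) -> torch.Tensor:
    """Return image + epsilon * sign(gradient of loss w.r.t. the image)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()  # gradient flows back to the input pixels
    return (image + epsilon * image.grad.sign()).detach()

# Usage (hypothetical): x is a preprocessed batch of shape (1, 3, 224, 224)
# and y its true class index. The perturbed copy often receives a different,
# highly confident label even though it looks identical to a human observer.
# x_adv = fgsm_perturb(x, y)
# print(model(x).argmax(1), model(x_adv).argmax(1))
```

The unsettling point for screening is that the perturbation is invisible to a clinician reviewing the same photograph; the mistake, if it occurs, gives no warning.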
So is ROP AI ready to go? Humans are still the ROP masters, and until AI methodology moves beyond data-driven deep learning design, a fundamental vulnerability to unpredictable mistakes in new circumstances will persist. Therefore, the safest approach is for humans to remain as involved as possible in each step of ROP management. However, where ophthalmologist supply is limited but retinal camera availability is not, i-ROP-DL could be used to ration examinations to the babies who need them most. It will help to continue updating i-ROP-DL, to add training to identify detachments, retinoblastomas, and other unexpected findings, and to incorporate human checks wherever possible. Ultimately, however, as AI keeps advancing, it will eventually surpass us in diagnostic and prognostic abilities, justifiably be used more broadly, and perhaps teach us something new about ROP. I myself will welcome our new AI overlords with open arms. Whatever is best for the babies.
Corresponding Author: Gil Binenbaum, MD, MSCE, Division of Ophthalmology, Children's Hospital of Philadelphia, 3500 Civic Center Blvd, 11th Floor, Philadelphia, PA 19104 (binenbaum@chop.edu).
Published Online: March 7, 2024. doi:10.1001/jamaophthalmol.2024.0218
Conflict of Interest Disclosures: Dr Binenbaum is a co-investigator on National Institutes of Health grant R21EY034179 that relates to AI and ROP work.
1. Mnih V, Kavukcuoglu K, Silver D, et al. Playing Atari with deep reinforcement learning. arXiv. Published online December 19, 2013.
2. DeepMind artificial intelligence @FDOT14. YouTube. Accessed February 3, 2024.
3. Silver D, Huang A, Maddison CJ, et al. Mastering the game of Go with deep neural networks and tree search. Nature. 2016;529(7587):484-489.
4. AlphaGo. IMDb. Accessed January 16, 2024.
5. Coyner AS, Murickan T, Oh MA, et al. Multinational external validation of autonomous retinopathy of prematurity screening. JAMA Ophthalmol. Published online March 7, 2024.
6. Quinn GE, Ying GS, Daniel E, et al; e-ROP Cooperative Group. Validity of a telemedicine system for the evaluation of acute-phase retinopathy of prematurity. JAMA Ophthalmol. 2014;132(10):1178-1184.
7. Silver D, Hubert T, Schrittwieser J, et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science. 2018;362(6419):1140-1144.
8. Marcus G. David beats Go-liath. Substack. 2023. Accessed January 15, 2024.