| Title | AI in Neuro-Ophthalmology: Current Practice and Future Opportunities |
| Creator | Rachel C. Kenney, PhD; Tim W. Requarth, PhD; Alani I. Jack, BA; Sara W. Hyman, BS, BA; Steven L. Galetta, MD; Scott N. Grossman, MD |
| Affiliation | Departments of Neurology (RCK, AJ, SH, SG, SNG), Population Health (RCK), and Ophthalmology (SG), New York University Grossman School of Medicine, New York, New York; and Vilcek Institute of Graduate Biomedical Sciences (TR), New York University Grossman School of Medicine, New York, New York |
| Abstract | Background: Neuro-ophthalmology frequently requires a complex and multi-faceted clinical assessment supported by sophisticated imaging techniques to assess disease status. The current approach to diagnosis requires substantial expertise and time. The emergence of AI has brought forth innovative solutions to streamline and enhance this diagnostic process, which is especially valuable given the shortage of neuro-ophthalmologists. Machine learning algorithms, in particular, have demonstrated significant potential in interpreting imaging data, identifying subtle patterns, and aiding clinicians in making more accurate and timely diagnoses while also supplementing nonspecialist evaluations of neuro-ophthalmic disease. Evidence Acquisition: Electronic searches of published literature were conducted using PubMed and Google Scholar. A comprehensive search of the following terms was conducted within the Journal of Neuro-Ophthalmology: AI, artificial intelligence, machine learning, deep learning, natural language processing, computer vision, large language models, and generative AI. Results: This review aims to provide a comprehensive overview of the evolving landscape of AI applications in neuro-ophthalmology. It delves into the diverse applications of AI, from the analysis of optical coherence tomography (OCT) and fundus photography to the development of predictive models for disease progression. Additionally, the review explores the integration of generative AI into neuro-ophthalmic education and clinical practice. Conclusions: We review the current state of AI in neuro-ophthalmology and its potentially transformative impact. The inclusion of AI in neuro-ophthalmic practice and research not only holds promise for improving diagnostic accuracy but also opens avenues for novel therapeutic interventions. We emphasize its potential to improve access to scarce subspecialty resources while examining the current challenges associated with the integration of AI into clinical practice and research. |
| Subject | Artificial Intelligence / trends; Diagnostic Techniques, Ophthalmological / trends; Eye Diseases / diagnosis; Humans; Machine Learning; Neurology / trends; Ophthalmology; Tomography, Optical Coherence / methods |
| OCR Text | State-of-the-Art Review

AI in Neuro-Ophthalmology: Current Practice and Future Opportunities

Rachel C. Kenney, PhD, Tim W. Requarth, PhD, Alani I. Jack, BA, Sara W. Hyman, BS, BA, Steven L. Galetta, MD, Scott N. Grossman, MD

Background: Neuro-ophthalmology frequently requires a complex and multi-faceted clinical assessment supported by sophisticated imaging techniques to assess disease status. The current approach to diagnosis requires substantial expertise and time. The emergence of AI has brought forth innovative solutions to streamline and enhance this diagnostic process, which is especially valuable given the shortage of neuro-ophthalmologists. Machine learning algorithms, in particular, have demonstrated significant potential in interpreting imaging data, identifying subtle patterns, and aiding clinicians in making more accurate and timely diagnoses while also supplementing nonspecialist evaluations of neuro-ophthalmic disease. Evidence Acquisition: Electronic searches of published literature were conducted using PubMed and Google Scholar. A comprehensive search of the following terms was conducted within the Journal of Neuro-Ophthalmology: AI, artificial intelligence, machine learning, deep learning, natural language processing, computer vision, large language models, and generative AI. Results: This review aims to provide a comprehensive overview of the evolving landscape of AI applications in neuro-ophthalmology. It delves into the diverse applications of AI, from the analysis of optical coherence tomography (OCT) and fundus photography to the development of predictive models for disease progression. Additionally, the review explores the integration of generative AI into neuro-ophthalmic education and clinical practice. Conclusions: We review the current state of AI in neuro-ophthalmology and its potentially transformative impact. The inclusion of AI in neuro-ophthalmic practice and research not only holds promise for improving diagnostic accuracy but also opens avenues for novel therapeutic interventions. We emphasize its potential to improve access to scarce subspecialty resources while examining the current challenges associated with the integration of AI into clinical practice and research.

Journal of Neuro-Ophthalmology 2024;44:308–318. doi: 10.1097/WNO.0000000000002205

Departments of Neurology (RCK, AJ, SH, SG, SNG), Population Health (RCK), and Ophthalmology (SG), New York University Grossman School of Medicine, New York, New York; and Vilcek Institute of Graduate Biomedical Sciences (TR), New York University Grossman School of Medicine, New York, New York. Supported in part by the NYU Grossman School of Medicine. The authors report no conflicts of interest. R. C. Kenney and T. Requarth contributed equally to the work. Address correspondence to Rachel C. Kenney, PhD, Department of Neurology, Department of Population Health, NYU Grossman School of Medicine, 222 E. 41st Street, 14th Floor, New York, NY 10017; E-mail: rachel.kenney@nyulangone.org. © 2024 by North American Neuro-Ophthalmology Society

Artificial Intelligence (AI) has shown considerable promise in aiding the diagnosis and management of neuro-ophthalmological conditions, such as optic neuropathies, papilledema, and visual pathway lesions.
Years of work with AI techniques such as machine learning (ML) have succeeded in automating image analysis of complex visual data, enabling predictive modeling of disease progression, and facilitating early detection and characterization of neurological disorders affecting the visual system. More recently, generative AI models, such as ChatGPT, have shown potential in aiding clinical reasoning, improving patient communication, and streamlining administrative work. This review will examine the various applications of AI in neuro-ophthalmology, focusing on AI studies using imaging techniques and on possible roles for generative AI, and highlighting the transformative potential of AI in enhancing diagnostic capabilities and patient care within neuro-ophthalmology, particularly in clinical settings lacking eye-care providers or subspecialty physicians like neuro-ophthalmologists.

AI AND IMAGING IN NEURO-OPHTHALMOLOGY

Fundus Photography

Fundus photos are widely used in clinical neuro-ophthalmic practice; however, current interpretation is primarily descriptive and limited by human analysis. AI-driven support could fill numerous gaps, enhancing disease classification and identifying clinically relevant features previously undetectable by human interpretation. Several studies underscore AI's proficiency in analyzing fundus images, particularly in differentiating papilledema from other optic disc diseases and normal controls.1 Deep learning (DL) models, which are a type of ML model using neural networks modeled after human brain networks, have been trained to detect and grade papilledema using fundus images and have shown performance comparable to that of expert neuro-ophthalmologists.2–4 An algorithm developed by the Brain and Optic Nerve Study with Artificial Intelligence (BONSAI) using 15,000 fundus images detected papilledema with 87.5% accuracy, 96.4% sensitivity, and 84.7% specificity.2

AI has also shown potential in distinguishing between glaucomatous optic neuropathy (GON) and nonglaucomatous optic neuropathy (NGON), a distinction crucial because their clinical management differs. A study using a convolutional neural network (CNN) to analyze over 3,000 fundus images achieved an overall accuracy of 99.1% for detecting healthy controls, GON, and NGON.5 This performance is markedly better than that reported in a study in which glaucoma specialists and neuro-ophthalmologists correctly identified only 75% of glaucoma cases from fundus photos.6 In some cases, AI may be able to detect clinically relevant features that elude human perception. One study developed a DL model that had higher sensitivity than a senior glaucoma expert (85.53% vs 71.05%) in differentiating GON from NGON.7 A recently developed AI model trained on 1.6 million unlabeled retinal images (fundus photos and optical coherence tomography) not only outperformed comparable DL models in diagnosing sight-threatening eye diseases but also, remarkably, was able to predict systemic disorders such as heart failure and myocardial infarction from eye images alone.8 Another DL model showed excellent performance in predicting cardiovascular risk factors such as age, gender, smoking status, blood pressure, and cardiac events using features such as the optic disc or blood vessels.9 This is notable because these risk factors are not discernible from fundus examination by clinicians.
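The imaging studies above share a common recipe: fine-tune a convolutional network pretrained on general images to classify labeled fundus photographs. The sketch below illustrates that transfer-learning workflow in PyTorch; it is a minimal illustration under assumed inputs (the folder path, class labels, and hyperparameters are hypothetical), not the BONSAI pipeline or any published model.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Standard preprocessing for ImageNet-pretrained backbones.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: fundus/train/{normal,papilledema,other_disc_disease}/*.png
train_set = datasets.ImageFolder("fundus/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Transfer learning: reuse a pretrained DenseNet (the classifier family used
# in the BONSAI study) and replace its final layer for the disc classes.
model = models.densenet121(weights="IMAGENET1K_V1")
model.classifier = nn.Linear(model.classifier.in_features, len(train_set.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):                       # a small epoch count, for a sketch
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Published systems layer image quality control, external test sets, and expert-adjudicated labels on top of this skeleton, and much of their reported performance likely depends on those steps rather than on the network architecture alone.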
Taken together, these studies suggest that AI-enabled detection of subtle biomarkers can supplement clinician expertise. Promising future applications of AI in fundus photography include algorithms to refine clinical trajectories in optic neuropathy, predict visual field loss in papilledema, or differentiate causes of monocular optic disc swelling by etiology.10 Given the excellent performance of the small number of AI neuro-ophthalmic studies to date, these models hold significant potential to reshape the landscape of neuro-ophthalmic diagnosis when applied to these and other new areas.

Optical Coherence Tomography

Optical coherence tomography (OCT) is a powerful tool for disease detection, monitoring disease progression, assessing treatment efficacy, and quantifying retinal layer thinning or optic disc swelling in neuro-ophthalmic conditions. However, distinguishing among neuro-ophthalmic diseases with similar characteristics presents a clinical challenge, and AI has substantial potential to revolutionize the diagnostic landscape for these conditions. A leading area of AI research utilizing OCT involves multiple sclerosis (MS), a complex and poorly understood autoimmune disorder characterized by central nervous system (CNS) demyelination, which is often evidenced by thinning of the axonal and neuronal retinal layers on OCT. The differentiation of MS from other CNS demyelinating disorders is a diagnostic challenge, yet numerous studies underscore AI's potential in distinguishing these diseases. One study demonstrated that DL models could distinguish between disease types using MRI.11 Additionally, researchers have found evidence supporting the use of ML on OCT to differentiate MS in children from other demyelinating disorders.12 Moreover, AI-driven OCT evaluation could assist in differentiating demyelinating optic neuritis, which is common in MS, from other etiologies of optic neuropathy. ML classifiers have demonstrated excellent performance in distinguishing nonarteritic anterior ischemic optic neuropathy (NAION) from optic neuritis, suggesting potential for separating these conditions in the acute setting, when neuro-ophthalmology referral capacity may be limited.13 Intriguingly, ML models using OCT could even be used more broadly as a diagnostic tool for MS. One study, using support vector machine analysis incorporating OCT measures and low-contrast visual acuity, showed excellent performance in classifying patients with MS versus healthy controls (area under the receiver operating characteristic [AUROC] curve = 0.89).14 Furthermore, a DL model based on a dilated residual convolutional neural network demonstrated the feasibility of detecting prior optic neuritis in patients with MS using peripapillary ring scan raw image files (AUROC = 0.90), outperforming retinal thickness measurements alone.15
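In contrast to image-based DL, the classical ML classifiers in several of these OCT studies operate on tabular features such as retinal layer thicknesses and low-contrast acuity. A minimal sketch of that style of analysis with scikit-learn follows; the feature set and synthetic data are illustrative assumptions, not the published models' actual inputs.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical feature matrix: one row per eye, with columns such as
# [pRNFL thickness (um), GCIPL thickness (um), low-contrast letter acuity].
X = rng.normal(loc=[92.0, 75.0, 35.0], scale=[12.0, 8.0, 10.0], size=(200, 3))
y = rng.integers(0, 2, size=200)   # toy labels: 1 = MS, 0 = healthy control

# Support vector machine with probability scores for ROC analysis,
# mirroring the SVM/logistic-regression setup described above.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
scores = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]

# Near 0.5 here because the toy labels are random; published models report ~0.89.
print(f"AUROC: {roc_auc_score(y, scores):.2f}")
```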
In addition to applications in demyelinating disorders such as MS, AI has shown promise in detecting other neuro-ophthalmic conditions using OCT. DL algorithms have been able to detect changes on prediagnostic OCT consistent with papilledema and glaucoma that were not observed by clinicians.16 AI models have also demonstrated improved classification performance for distinguishing glaucomatous from healthy eyes17 and for grading glaucoma severity using OCT-angiography, suggesting potential for improving glaucoma progression staging.18 One study using a DL model on OCT images of the optic nerve head classified papilledema, optic disc drusen, and healthy controls with excellent performance (AUROC = 0.99).19

ML algorithms have shown promise in disease detection, yet the utilization of OCT in clinical practice presents significant challenges and opportunities for further AI applications. Future studies could focus on distinguishing between papilledematous and pseudopapilledematous optic nerve heads, including in children. Beyond disease detection and classification tasks, ML algorithms can identify and segment retinal layers on OCT,20 which may save clinicians and researchers time in evaluating images. In conclusion, AI applications in OCT-based neuro-ophthalmology have shown remarkable promise in enhancing disease diagnosis and differentiation, but further research is needed to address the challenges of integrating these technologies into clinical practice and to explore additional transformative applications.

Ocular Motility Testing

Formal eye movement recording, including scleral search coil, video-oculography (VOG), and smartphone-based systems, has become increasingly prevalent in neuro-ophthalmology clinics. These tools provide objective, quantifiable data, helping to diagnose neurological conditions and detect subtle abnormalities not identified during standard clinical examination. Saccadic parameters such as velocity, trajectory, amplitude, accuracy, and latency could be usefully analyzed with AI to help refine the differential diagnosis, in collaboration with movement disorders specialists. However, inaccurate measurement, the need for calibration, difficulty with interpretation, and a scarcity of clinical expertise pose significant challenges. Thus, neuro-ophthalmologists are currently limited in their capacity to answer referral questions, including whether saccadic changes represent abnormal or normal findings. DL and computer interpretation offer a powerful solution for detecting neurological conditions that manifest with ocular motility abnormalities. Recent work has validated a CNN-based tool, ConVNG, enabling smartphone video nystagmography to assess ocular motility with accuracy comparable to conventional VOG.21 This technology could enable more precise diagnosis, while the ubiquity of smartphones extends access to clinicians without subspecialty expertise. Future studies could focus on further validation and optimization of AI-based tools, their integration into real-world health care systems, and the training of clinicians in their use.
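The quantitative saccadic parameters mentioned above are derived from eye-position traces by straightforward signal processing before any AI classification. As a hedged illustration of that step (not the ConVNG algorithm itself), the sketch below differentiates a position trace and extracts peak velocity and amplitude with NumPy; the sampling rate and the 30°/s onset threshold are assumed, conventional values.

```python
import numpy as np

def saccade_metrics(position_deg: np.ndarray, fs_hz: float = 250.0,
                    onset_thresh_deg_s: float = 30.0):
    """Estimate peak velocity and amplitude of a saccade in a position trace.

    position_deg: horizontal eye position in degrees, sampled at fs_hz.
    The 30 deg/s onset threshold is a common convention, assumed here.
    """
    velocity = np.gradient(position_deg) * fs_hz       # deg/s
    moving = np.abs(velocity) > onset_thresh_deg_s     # samples inside the saccade
    if not moving.any():
        return None
    idx = np.flatnonzero(moving)
    peak_velocity = np.max(np.abs(velocity[idx]))
    amplitude = abs(position_deg[idx[-1]] - position_deg[idx[0]])
    return {"peak_velocity_deg_s": peak_velocity, "amplitude_deg": amplitude}

# Toy trace: fixation, a 10-degree rightward saccade, then fixation again.
t = np.arange(0, 0.5, 1 / 250.0)
trace = 10.0 / (1 + np.exp(-(t - 0.25) * 120.0))       # sigmoid position profile
print(saccade_metrics(trace))
```

Parameters such as these (velocity, amplitude, latency) are the inputs an AI classifier could then use to flag abnormal ocular motility.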
AI has made significant strides in neuro-ophthalmology, particularly in imaging and diagnostics, offering valuable support to clinicians and enhancing patient outcomes (Table 1). However, it is important to acknowledge the limitations inherent in these studies, such as small sample sizes, which reduce statistical power and can lead to overfitting and inaccurate results, and biases in collected data, which may be amplified by ML models and limit the generalizability of results. For example, many studies to date have been single-center, and the neuro-ophthalmology community would do well to expand to multicenter studies to reduce bias where possible. Despite these common challenges, the application of AI in neuro-ophthalmic imaging may significantly enhance disease prediction and diagnostic accuracy, promising to transform this field with more precise and efficient diagnostic tools.

GENERATIVE AI IN NEURO-OPHTHALMOLOGY

Introduction

Large language models (LLMs), a type of generative AI that produces human-like text, have demonstrated remarkable abilities within specialized domains despite being developed as general-use technologies. Popular LLMs include OpenAI's ChatGPT (GPT-3.5 and GPT-4), Microsoft's Bing, Google's Bard and Gemini, and Anthropic's Claude. Although performance varies, these LLMs all work in fundamentally the same way. LLMs undergo training using vast datasets of both public and nonpublic texts, learning to predict the subsequent word from a sequence of words and the surrounding context. It is important to emphasize that LLMs are not information retrieval systems that access their training data directly, but probabilistic prediction engines that produce syntactically and semantically plausible text. This distinction is crucial because less desirable qualities of LLMs, such as "hallucinations" (the generation of plausible-sounding but false information), may be inherent to the technology and difficult to detect without human review. Despite their shortcomings, LLMs have exhibited potential in many areas of health care, and the following section will review LLM studies relevant to the subspecialty field of neuro-ophthalmology.

Assessment of Large Language Model Knowledge of Neuro-Ophthalmology

Large language models have demonstrated impressive performance on ophthalmology board-style examination questions, with frontier models such as GPT-4 achieving accuracy comparable to human test-takers.22–36 However, LLMs generally perform worse on neuro-ophthalmology subsections than on other ophthalmic subspecialties (Table 2). This discrepancy may be attributed to a lack of sufficient specialized neuro-ophthalmological content in the training data or to the types of reasoning required. Providing more context beyond typical board-style questions may improve LLM performance.37–39 Neuro-ophthalmological questions also appear on the neurology board examination. In this context, the opposite pattern is observed, with LLMs answering more questions correctly on the neuro-ophthalmological subsections than on the test overall.40–42 The reasons for these differences in board performance across subspecialties remain unclear. Most studies assessing LLMs' neuro-ophthalmic knowledge were unable to upload images to the models, either omitting image-based questions or providing text descriptions. LLMs performed worst without image descriptions, slightly better with text descriptions, and closest to human performance when image-based questions were excluded, suggesting that the human-AI performance gap in some studies may be partly attributed to the inclusion of image-based questions.
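Most of the board-examination studies compared in Table 2 reduce to the same evaluation loop: send each multiple-choice stem to a model and score the returned letter against the answer key. A minimal sketch of such a harness with the OpenAI Python SDK appears below; the model name, temperature, and sample item are illustrative assumptions rather than any study's actual protocol.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical board-style item; the real studies used licensed banks such
# as BCSC or OphthoQuestions, which are not reproduced here.
questions = [
    {
        "stem": "A 28-year-old woman has painful monocular vision loss and "
                "an RAPD. Which diagnosis is most likely?",
        "choices": {"A": "Optic neuritis", "B": "NAION",
                    "C": "Papilledema", "D": "Retinal detachment"},
        "answer": "A",
    },
]

def ask(item: dict) -> str:
    prompt = item["stem"] + "\n" + "\n".join(
        f"{k}. {v}" for k, v in item["choices"].items()
    ) + "\nAnswer with a single letter."
    resp = client.chat.completions.create(
        model="gpt-4",        # model and temperature are illustrative choices
        temperature=0.3,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()[:1].upper()

correct = sum(ask(q) == q["answer"] for q in questions)
print(f"Accuracy: {correct}/{len(questions)}")
```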
TABLE 1. Comparison of AI models and performance in neuro-ophthalmology studies using imaging

OCT
- Ciftci Kavaklioglu et al, 2022. Objective: to identify and differentiate structural retinal features in MS, MOGAD, NMOSD, and monoADS using ML on OCT measures in children. Algorithm(s): multiple supervised machine learning classifiers. Findings: best performance from a Random Forest classifier with recursive feature elimination: 75% accuracy for DDs, 80% accuracy for MS.
- Motamedi et al, 2022. Objective: to distinguish eyes with prior ON from healthy control eyes using DL on peripapillary ring scans. Algorithm(s): dilated residual convolutional neural network. Findings: classification network, AUC of 0.86 and accuracy of 0.85; pRNFL thickness alone, AUC of 0.77.
- Girard et al, 2023. Objective: to develop a DL algorithm to identify ONH structures using OCT and to differentiate ODD, papilledema, and healthy ONHs. Algorithm(s): deep learning segmentation algorithm (U-Net++) and classification algorithm (random forest). Findings: segmentation, Dice coefficient of 0.93 ± 0.03 on the test set; classification, AUC of 0.99 ± 0.001 for ODD detection, 0.99 ± 0.005 for papilledema detection, and 0.98 ± 0.01 for detection of healthy ONHs.
- Li, Tandon, Sun, Dinkin, & Oliveira, 2024. Objective: to predict disease progression toward papilledema or glaucoma using AI on OCT. Algorithm(s): modified convolutional neural network, VGG-19 model. Findings: papilledema, 0.714 precision and 0.769 recall when the model was trained with the RNFL thickness map, AUC of 0.826; glaucoma, 0.682 precision and 0.857 recall when trained with the extracted vertical tomogram, AUC of 0.785.
- Bhargava et al, 2015. Objective: determination of cross-sectional and longitudinal agreement of retinal layer thicknesses derived by a machine learning segmentation algorithm applied to 2 spectral-domain OCT devices. Algorithm(s): multilayer segmentation, Random Forest. Findings: mean differences between the 2 scanners of −2.16 to 0.26 µm cross-sectionally and −0.195 to 0.21 µm longitudinally.
- Kenney et al, 2022. Objective: machine learning classification of MS versus normal controls using SD-OCT and visual measure-derived scores. Algorithm(s): logistic regression, support vector machine. Findings: AUC of 0.89 (95% CI 0.85–0.93), sensitivity of 81%, specificity of 80%; the composite score performed better than individual OCT measurements.
- Bowd et al, 2022. Objective: comparison of convolutional neural network analysis of vessel density images with GBC analysis of OCT-A vessel density measurements and OCT retinal nerve fiber layer thickness measurements for classifying healthy and glaucomatous eyes. Algorithm(s): convolutional neural network; Gradient Boosting Classifier (GBC). Findings: CNN AUPRC of 0.97 (95% CI 0.95–0.99); GBC AUPRCs ≥0.87 for vessel density and RNFL thickness measurements.
- Andrade De Jesus et al, 2020. Objective: development of an artificial intelligence-based procedure for classifying glaucomatous vascular damage based on Zeiss Cirrus 5000 HD-OCT imaging. Algorithm(s): Support Vector Machine, Random Forest, Gradient Boosting. Findings: AUROC of 0.89 ± 0.06, 0.86 ± 0.06, and 0.85 ± 0.07, respectively.
- Jalili et al, 2024. Objective: evaluation of the classification performance of machine learning algorithms based on vessel density features of OCT-A images for classifying healthy, nonarteritic anterior ischemic optic neuropathy, and optic neuritis eyes. Algorithm(s): Support Vector Machine (SVM), Random Forest (RF), Gaussian Naive Bayes. Findings: all models achieved an AUC of 1 and accuracy of 1 at a 50% threshold in discriminating ON from NAION; SVM and RF achieved an AUC of 1 and accuracy of 1 at a 100% threshold in discriminating ON from normal; SVM and RF achieved an AUC of 1 and accuracy of 1 at a 50% threshold in discriminating NAION from normal.

Fundus photography
- Milea et al, 2020. Objective: to classify normal optic discs, papilledema, and abnormal optic discs using DL on fundus images. Algorithm(s): segmentation network (U-Net) and classification network (DenseNet). Findings: validation set, AUC of 0.99 (95% CI 0.98–0.99) for discriminating papilledema from normal and abnormal discs and AUC of 0.99 (95% CI 0.99–0.99) for discriminating normal from abnormal discs; external testing dataset, AUC of 0.96 (95% CI 0.95–0.97), 96.4% sensitivity (95% CI 93.9–98.3), and 84.7% specificity (95% CI 82.3–87.1) for detecting papilledema.
- Yang et al, 2020. Objective: to differentiate between NGON and GON using DL on fundus images. Algorithm(s): convolutional neural network (ResNet-50 architecture). Findings: accuracy to detect GON of 93.4% sensitivity and 81.8% specificity; AUPRC showed 0.874 average precision.
- Vali et al, 2023. Objective: to differentiate GON from NGON using DL on fundus images. Algorithm(s): optic disc segmentation network and 6 classification networks: VGG, ResNet, Inception, MobileNet, DenseNet, and Vision Transformer. Findings: best-performing algorithm, DenseNet121: 95.36% sensitivity, 95.35% precision, 92.19% specificity, 95.40% F1 score; external validation dataset: 85.53% sensitivity and 89.02% specificity for the DLS vs 71.05% sensitivity and 82.21% specificity for a glaucoma specialist.
- Poplin et al, 2018. Objective: prediction of cardiovascular risk factors, such as age and blood pressure, from retinal images through deep learning. Algorithm(s): deep-neural-network model encompassing 3 models, including 2 classification models and 1 regression model. Findings: age, mean absolute error within 3.26 years; systolic blood pressure, mean absolute error within 11.23 mm Hg; AUC of 0.97 for gender, 0.71 for smoking status, and 0.70 for major adverse cardiac events.

Retinal imaging
- Zhou et al, 2023. Objective: development of a foundation model for retinal images (RETFound) that learns from unlabeled retinal images and provides the basis for a label-efficient model adapted for disease detection. Algorithm(s): self-supervised learning model, RETFound. Findings: AUROC of 0.943 (95% CI 0.941–0.944), 0.822 (95% CI 0.815–0.829), and 0.884 (95% CI 0.880–0.887) on several datasets for diabetic retinopathy classification.

Brain imaging
- Seok et al, 2023. Objective: to differentiate MS from NMOSD using DL on brain MRI data. Algorithm(s): deep learning model. Findings: AUC of 0.85, accuracy of 76.1%, sensitivity of 77.3%, specificity of 74.8%, 76.9% PPV, and 78.6% NPV.

Pupil tracking
- Friedrich et al, 2023. Objective: to assess a framework based on DL and computer vision for calculating SPV from smartphone-based nystagmography. Algorithm(s): convolutional neural network (modified ResNet18), ConVNG. Findings: tracking accuracy within 9%–15% of average pupil diameter; median precision of 0.30°/s; for all SPV calculations, ConVNG was equivalent to VOG.

This table includes objectives, model design, and performance results of AI studies focusing on imaging applications in neuro-ophthalmology. AI, artificial intelligence; AUC, area under the curve; AUPRC, area under the precision recall curve; AUROC, area under the receiver operating characteristic; CI, confidence interval; CNN, convolutional neural network; DD, demyelinating disorders; DL, deep learning; DLS, deep learning system; FCN, fully convolutional network; GBC, gradient boosting classifier; GON, glaucomatous optic neuropathy; ML, machine learning; MOGAD, myelin oligodendrocyte glycoprotein antibody-associated disease; monoADS, monophasic acquired demyelinating syndromes; MS, multiple sclerosis; NAION, nonarteritic anterior ischemic optic neuropathy; NGON, nonglaucomatous optic neuropathy; NMOSD, neuromyelitis optica spectrum disorder; NPV, negative predictive value; OCT, optical coherence tomography; OCT-A, optical coherence tomography angiography; ODD, optic disc drusen; ON, optic neuritis; ONH, optic nerve head; PPV, positive predictive value; pRNFL, peripapillary retinal nerve fiber layer; RNFL, retinal nerve fiber layer; SA, self-attention; SD-OCT, spectral domain optical coherence tomography; SPV, slow-phase velocity; VOG, video-oculography.

TABLE 2. Comparison of accuracy scores on neuro-ophthalmological board exam-style questions for various large language models (LLMs) and human performance across different content sources

Studies and models compared: Antaki et al22 (GPT-3.5; GPT-4 at temperature 0.3; human) and Antaki et al23 (GPT-3.5; GPT-4), each on BCSC and OphthoQuestions; Cai et al24 (Bing; GPT-3.5; GPT-4; human) on BCSC and OphthoQuestions; Haddad et al,25 Lin,26 Mihalache et al,27 Mihalache et al,28 Mihalache et al,29 Moshirfar et al,30 Raimondi et al,31 Singer et al,32 Taloni et al,33 Teebagy et al,34 Thirunavukarasu et al,35 and Sakai et al,36 variously evaluating GPT-3.5, GPT-4, Google Bard, Google Gemini, Bing Chat, Aeyeconsult, LLaMA, and PaLM 2 against human comparators on BCSC, OphthoQuestions, Ophthalmology Board Review Q&A (Glass), EyeQuiz, StatPearls, the FRCOphth exam part 2, and Japanese past ophthalmology board examination questions (in Japanese); Chen et al40 (GPT-4; human) and Schubert et al41 (GPT-3.5; GPT-4; human) on Board Vitals neurology board review questions; and Fonseca et al42 (GPT-3.5; human) on the American Academy of Neurology's Question of the Day application, where GPT-3.5 scored 71.3% overall and 80%† on the neuro-ophthalmology items versus 69.2% overall for humans (human neuro-ophthalmology accuracy not reported).

†"Neuro-ophthalmology" subsection grouped with "neuro-otology" subsection. The table includes results from both ophthalmology and neurology board questions. The performance of GPT-3.5, GPT-4, and other LLMs such as Bing, Google Bard, Google Gemini, LLaMA, and PaLM 2 is compared against either historical or respondent human accuracy on different question sets, including BCSC, OphthoQuestions, EyeQuiz, StatPearls, FRCOphth exam part 2, and Japanese past ophthalmology board exam questions. The accuracy scores are further broken down into overall performance and specific neuro-ophthalmology performance where available. The last 3 studies focus on neurology boards. BCSC, American Academy of Ophthalmology's Basic and Clinical Science Course series; FRCOphth, Fellowship of The Royal College of Ophthalmologists.

Although a few studies have performed image analysis of ocular pathologies using the vision capabilities of newer LLMs,43–45 the results have been mixed. Mihalache et al found that GPT-4's performance on ophthalmic multiple-choice questions was 82% on non-image-based questions but only 65% on image-based questions, with a similar trend in the neuro-ophthalmological subsection (69% for non-image-based questions, 54% for image-based questions).45 This suggests that the chatbot's image-analysis capabilities are inferior to its text-analysis capabilities, implying limitations in a heavily exam-dependent subspecialty like neuro-ophthalmology.
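Vision-capable models accept an image alongside the text prompt, which is how image-based items can be posed at all. A hedged sketch of such a call with the OpenAI Python SDK follows; the model choice, image URL, and prompt are illustrative assumptions, and outputs of this kind are unvalidated for clinical use.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical, publicly hosted fundus photograph URL, for illustration only.
image_url = "https://example.org/fundus.png"

resp = client.chat.completions.create(
    model="gpt-4o",  # an illustrative vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Describe the optic disc findings in this fundus photograph."},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }],
)
print(resp.choices[0].message.content)
```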
Despite the near-human performance of LLMs on board-style questions, their substantial hallucination rates, ranging from 18% in GPT-4 to 42.4% in GPT-3.5,26 and the difficulty nonspecialists face in differentiating fact from fiction raise concerns about the risks of over-reliance on LLMs in clinical settings.

One area of promise is custom models. One such model, Aeyeconsult, is powered by GPT-4 with retrieval augmented generation, a technique that allows GPT to access reliable information from ophthalmology textbooks.32 Aeyeconsult demonstrated better performance than GPT-4 overall on ophthalmic questions, including those related to neuro-ophthalmology. However, this system still has limitations. Despite the custom model being provided with all the information needed to respond correctly, it did not achieve a perfect score, owing to failures either in retrieving the right information or in accurately synthesizing that information into its output. In addition, the model still gave inconsistent answers when asked the same question multiple times. Future research should focus on understanding whether custom models can overcome the limitations of LLMs, or whether such limitations are unavoidable when using generative AI technology, especially in a complicated and abstract field like neuro-ophthalmology.

Another area of promise is using LLMs as triage tools rather than diagnostic aids. Zandi et al46 found that GPT-4 and Google's Bard were significantly better at providing appropriate triage recommendations than correct diagnoses for common ophthalmology scenarios. GPT-4 displayed an impressive 85% rate of appropriate triage, while Bard achieved 68.75%. In contrast, GPT-4's diagnostic accuracy was only 54.75%, and Bard's was even lower at 43.75%, likely falling below the range of clinical acceptability. This is consistent with a previous study that found GPT-4's diagnostic accuracy and especially its triage recommendations across ophthalmic conditions to be superior to those of ophthalmology trainees (18 residents and 4 fellows).47 Although these studies did not report neuro-ophthalmological results separately, they may be particularly relevant for identifying cases that could benefit from neuro-ophthalmological care among an undifferentiated population of patients with visual complaints. They also raise the intriguing possibility that LLM use by nonspecialist physicians could help prioritize referrals to neuro-ophthalmologists, who are scarce in most states, a shortage associated with delayed care and incorrect diagnoses.48,49 One final resource, Glass.health, provides AI-driven clinician support for expanding the differential diagnosis across clinical specialties, including neuro-ophthalmology, and may provide practical support during busy clinical days.
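Retrieval augmented generation, the technique behind custom tools such as Aeyeconsult, grounds the model's answer in passages retrieved from a trusted corpus rather than relying on the model's internal associations. The sketch below shows the pattern in minimal form; the corpus snippets, model names, and retrieval depth are illustrative assumptions, not Aeyeconsult's actual implementation.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

# Tiny stand-in corpus; a real system would index licensed textbook passages.
corpus = [
    "Papilledema is optic disc swelling caused by raised intracranial pressure.",
    "NAION typically presents with sudden, painless monocular vision loss.",
    "Optic neuritis often causes pain with eye movement and an RAPD.",
]

def embed(texts: list[str]) -> np.ndarray:
    out = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in out.data])

corpus_vecs = embed(corpus)

def answer(question: str, k: int = 2) -> str:
    q_vec = embed([question])[0]
    # Cosine similarity ranks passages; the top-k become grounding context.
    sims = corpus_vecs @ q_vec / (
        np.linalg.norm(corpus_vecs, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n".join(corpus[i] for i in np.argsort(sims)[::-1][:k])
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": f"Answer using only this context:\n{context}\n\n"
                              f"Question: {question}"}],
    )
    return resp.choices[0].message.content

print(answer("What symptom distinguishes optic neuritis from NAION?"))
```

Even with this grounding, as the Aeyeconsult results above show, retrieval can miss the relevant passage and the generation step can still synthesize it incorrectly, so human review remains necessary.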
Potential of Large Language Models in Patient Communication and Education

Patients often turn to the internet for health advice50 and are likely turning to AI chatbots for health-related queries as well. A study by the Mayo Clinic posed common patient questions across various ophthalmic subspecialties to GPT-4 and asked expert raters to judge the appropriateness of GPT-4's responses in the context of a patient information site.51 GPT-4 performed well overall, with expert raters judging its responses appropriate 79% of the time. However, it performed more poorly with nuanced queries, such as those related to neuro-ophthalmology, where its responses were deemed appropriate only 67% of the time (Fig. 1). The model made errors like overgeneralizing treatments, listing rare causes first instead of prevalent ones, and omitting critical information. Another study found comparable results with GPT-3.5 (which patients may be most likely to use, as it is free), although this study did not report results on neuro-ophthalmology separately.52 These studies highlight the promise of GPT in assisting with patient questions but also suggest that its use can lead to patient misinformation and potential harm.

FIG. 1. LLM performance in answering patient-style questions. A. A patient-style question input to GPT-4, yielding an appropriate response. B. A patient-style question input to GPT-4, resulting in a less appropriate response due to the absence of a critical inquiry about whether the symptom occurs in one eye alone or only when both eyes are open. Question stems from Tailor et al.

Even if LLMs have shortcomings for direct patient queries, AI chatbots may be suitable for generating draft replies for patient portal messaging systems. This capability would be a boon because health care providers are already overwhelmed with answering patient questions,53 often involving providing medical advice outside of clinical hours.54 The increasing volume of in-basket messages has been associated with burnout.55,56 One 2023 study drew attention when it showed that ChatGPT responses to patient questions on a social media medical forum were judged higher quality and more empathetic than answers by medical professionals.57 A study by Tailor et al58 compared the quality and empathy of responses to neuro-ophthalmology questions provided by expert-edited LLM responses, human experts, and various commercially available LLMs. In terms of quality and empathy, expert-edited responses outperformed both LLMs and human experts alone. Interestingly, this study found that using LLMs did not save time, as editing AI-generated drafts took as long as composing new responses. This finding is counterintuitive but aligns with other recent studies suggesting that LLM use may not increase efficiency.59,60 However, these same studies found that LLM use did decrease cognitive burden and perceived burnout. The use of LLMs in patient communication must be carefully monitored. One study found that physician responses systematically changed when physicians used LLMs to draft patient replies, indicating automation bias and anchoring.61 Some LLM responses could even lead to severe harm or death, highlighting the risks of over-reliance on AI in patient care. As LLM outputs influence everything from draft replies to operative notes and discharge summaries,62 these studies remind us that human + AI collaboration can challenge assumptions and demonstrate the need for careful study in real-world settings.

Large language models also show promise in patient education. Tao et al63 evaluated the potential of GPT-3.5 to automate patient education handouts on common neuro-ophthalmic diseases. A fellowship-trained neuro-ophthalmologist assessed 51 generated handouts across 17 conditions using the "Quality of Generated Language Outputs for Patients" (QGLOP) tool. The mean QGLOP score of 11.9 of 16 points (74.4%) suggests a moderate level of satisfaction with the write-up quality; however, the handouts still require final review and editing before dissemination. Further, their mean readability score (Simple Measure of Gobbledygook, SMOG) of 10.9 years of education exceeded the accepted upper limit of a grade 8 reading level for health-related patient handouts. Improved GPT models and prompting techniques may enhance these metrics, as shown by GPT-4's ability to transform scientific abstracts in neuro-ophthalmology from a 12th-grade to an 8th-grade reading level without significant loss of content.64
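The SMOG grade cited above has a simple closed form: it converts the count of words with three or more syllables in a text sample into a US school-grade level via SMOG = 1.0430 × sqrt(polysyllables × 30 / sentences) + 3.1291. The sketch below implements it with a rough vowel-group syllable heuristic, which is an assumption; published calculators count syllables more carefully.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels (assumption).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_grade(text: str) -> float:
    """SMOG grade = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * (polysyllables * 30 / len(sentences)) ** 0.5 + 3.1291

handout = ("Papilledema means swelling of the optic nerve. "
           "It is often caused by high pressure around the brain. "
           "Your doctor may order a brain scan.")
print(f"Estimated SMOG grade: {smog_grade(handout):.1f}")
```

Running a drafted handout through a check like this before expert review is one way to enforce the grade 8 reading-level target discussed above.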
While the promise of generating custom patient education materials at any reading level and in any language is on the horizon, validation will be needed, especially for languages with limited training data, as highlighted by a study of ChatGPT's performance in diagnosing retinal vascular diseases using Chinese prompts.65

CONCLUSIONS

Large language models have shown both impressive potential and significant limitations in neuro-ophthalmology. While demonstrating basic ophthalmology knowledge, they struggle with the subspecialty's complex, multifaceted nature and may not aid specialists' clinical reasoning. However, with further validation, LLMs could be valuable for nonspecialists, streamlining health care by efficiently directing patients to neuro-ophthalmologists while triaging those who would be better treated by more generalist clinicians. LLMs also show promise in patient communication and education. Because human + AI interactions are often unpredictable, and risks including data confabulation are unlikely to be completely eliminated, rigorous research in real-world settings is necessary for safe and effective integration into practice. Looking ahead, generative AI may have unforeseen applications, such as using text-to-image models to help visualize patient-reported neuro-ophthalmic phenomena. Such knowledge may enable physicians to better understand and empathize with their patients.66 As LLMs rapidly evolve, neuro-ophthalmologists must stay informed and involved to harness their benefits and avoid their pitfalls.

STATEMENT OF AUTHORSHIP

Conception and design: R. C. Kenney, T. Requarth, S. N. Grossman; Acquisition of data: R. C. Kenney, T. Requarth, A. Jack, S. Hyman, S. Galetta, S. N. Grossman; Analysis and interpretation of data: R. C. Kenney, T. Requarth, A. Jack, S. Hyman, S. N. Grossman. Drafting the manuscript: R. C. Kenney, T. Requarth, A. Jack, S. Hyman, S. Galetta, S. N. Grossman; Revising the manuscript for intellectual content: R. C. Kenney, T. Requarth, A. Jack, S. Hyman, S. Galetta, S. N. Grossman. Final approval of the completed manuscript: R. C. Kenney, T. Requarth, A. Jack, S. Hyman, S. Galetta, S. N. Grossman.

REFERENCES
1. Bouthour W, Biousse V, Newman NJ. Diagnosis of optic disc oedema: fundus features, ocular imaging findings, and artificial intelligence. Neuroophthalmology. 2023;47:177–192.
2. Milea D, Najjar RP, Zhubo J, et al, BONSAI Group. Artificial intelligence to detect papilledema from ocular fundus photographs. N Engl J Med. 2020;382:1687–1695.
3. Vasseneix C, Najjar RP, Xu X, et al, BONSAI Group. Accuracy of a deep learning system for classification of papilledema severity on ocular fundus photographs. Neurology. 2021;97:e369–e377.
4. Echegaray S, Zamora G, Yu H, Luo W, Soliz P, Kardon R. Automated analysis of optic nerve images for detection and staging of papilledema. Invest Ophthalmol Vis Sci. 2011;52:7470–7478.
5. Yang HK, Kim YJ, Sung JY, Kim DH, Kim KG, Hwang J-M.
Efficacy for differentiating nonglaucomatous versus glaucomatous optic neuropathy using deep learning systems. Am J Ophthalmol. 2020;216:140–146.
6. O'Neill EC, Danesh-Meyer HV, Kong GX, et al, Optic Nerve Study Group. Optic disc evaluation in optic neuropathies: the optic disc assessment project. Ophthalmology. 2011;118:964–970.
7. Vali M, Mohammadi M, Zarei N, et al. Differentiating glaucomatous optic neuropathy from non-glaucomatous optic neuropathies using deep learning algorithms. Am J Ophthalmol. 2023;252:1–8.
8. Zhou Y, Chia MA, Wagner SK, Ayhan MS, Williamson DJ, Struyven RR, Liu T, Xu M, Lozano MG, Woodward-Court P, Kihara Y, UK Biobank Eye & Vision Consortium, Altmann A, Lee AY, Topol EJ, Denniston AK, Alexander DC, Keane PA. A foundation model for generalizable disease detection from retinal images. Nature. 2023;622:156–163.
9. Poplin R, Varadarajan AV, Blumer K, et al. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nat Biomed Eng. 2018;2:158–164.
10. Tisavipat N, Stiebel-Kalish H, Palevski D, et al. Acute optic neuropathy in older adults: differentiating between MOGAD optic neuritis and nonarteritic anterior ischemic optic neuropathy. Neurol Neuroimmunol Neuroinflamm. 2024;11:e200214.
11. Seok JM, Cho W, Chung YH, et al. Differentiation between multiple sclerosis and neuromyelitis optica spectrum disorder using a deep learning model. Sci Rep. 2023;13:11625.
12. Ciftci Kavaklioglu B, Erdman L, Goldenberg A, et al. Machine learning classification of multiple sclerosis in children using optical coherence tomography. Mult Scler. 2022;28:2253–2262.
13. Jalili J, Nadimi M, Jafari B, et al. Vessel density features of optical coherence tomography angiography for classification of optic neuropathies using machine learning. J Neuroophthalmol. 2024;44:41–46.
14. Kenney RC, Liu M, Hasanaj L, et al. The role of optical coherence tomography criteria and machine learning in multiple sclerosis and optic neuritis diagnosis. Neurology. 2022;99:e1100–e1112.
15. Motamedi S, Yadav SK, Kenney RC, et al. Prior optic neuritis detection on peripapillary ring scans using deep learning. Ann Clin Transl Neurol. 2022;9:1682–1691.
16. Li A, Tandon AK, Sun G, Dinkin MJ, Oliveira C. Early detection of optic nerve changes on optical coherence tomography using deep learning for risk-stratification of papilledema and glaucoma. J Neuroophthalmol. 2024;44:47–52.
17. Bowd C, Belghith A, Zangwill LM, et al. Deep learning image analysis of optical coherence tomography angiography measured vessel density improves classification of healthy and glaucoma eyes. Am J Ophthalmol. 2022;236:298–308.
18. Andrade De Jesus D, Sánchez Brea L, Barbosa Breda J, et al. OCTA multilayer and multisector peripapillary microvascular modeling for diagnosing and staging of glaucoma. Transl Vis Sci Technol. 2020;9:58.
19. Girard MJA, Panda S, Tun TA, et al. Discriminating between papilledema and optic disc drusen using 3D structural analysis of the optic nerve head. Neurology. 2023;100:e192–e202.
20. Bhargava P, Lang A, Al-Louzi O, et al. Applying an open-source segmentation algorithm to different OCT devices in multiple sclerosis patients and healthy controls: implications for clinical trials. Mult Scler Int. 2015;2015:136295–136310.
21. Friedrich MU, Schneider E, Buerklein M, et al. Smartphone video nystagmography using convolutional neural networks: ConVNG. J Neurol. 2023;270:2518–2530.
22. Antaki F, Milad D, Chia MA, et al. Capabilities of GPT-4 in ophthalmology: an analysis of model entropy and progress towards human-level medical question answering. Br J Ophthalmol. 2023:bjo-2023-324438. doi: 10.1136/bjo-2023-324438.
23. Antaki F, Touma S, Milad D, El-Khoury J, Duval R. Evaluating the performance of ChatGPT in ophthalmology: an analysis of its successes and shortcomings. Ophthalmol Sci. 2023;3:100324.
24. Cai LZ, Shaheen A, Jin A, et al. Performance of generative large language models on ophthalmology board-style questions. Am J Ophthalmol. 2023;254:141–149.
25. Haddad F, Saade JS. Performance of ChatGPT on ophthalmology-related questions across various examination levels: observational study. JMIR Med Educ. 2024;10:e50842.
26. Lin JC, Younessi DN, Kurapati SS, Tang OY, Scott IU. Comparison of GPT-3.5, GPT-4, and human user performance on a practice ophthalmology written examination. Eye. 2023;37:3694–3695.
27. Mihalache A, Grad J, Patil NS, et al. Google Gemini and Bard artificial intelligence chatbot performance in ophthalmology knowledge assessment. Eye. 2024:1–6.
28. Mihalache A, Huang RS, Popovic MM, Muni RH. Performance of an upgraded artificial intelligence chatbot for ophthalmic knowledge assessment. JAMA Ophthalmol. 2023;141:798–800.
29. Mihalache A, Popovic MM, Muni RH. Performance of an artificial intelligence chatbot in ophthalmic knowledge assessment. JAMA Ophthalmol. 2023;141:589–597.
30. Moshirfar M, Altaf AW, Stoakes IM, Tuttle JJ, Hoopes PC. Artificial intelligence in ophthalmology: a comparative analysis of GPT-3.5, GPT-4, and human expertise in answering StatPearls questions. Cureus. 2023;15:e40822.
31. Raimondi R, Tzoumas N, Salisbury T, Di Simplicio S, Romano MR, North East Trainee Research in Ophthalmology Network (NETRiON). Comparative analysis of large language models in the Royal College of Ophthalmologists fellowship exams. Eye. 2023;37:3530–3533.
32. Singer MB, Fu JJ, Chow J, Teng CC. Development and evaluation of Aeyeconsult: a novel ophthalmology chatbot leveraging verified textbook knowledge and GPT-4. J Surg Educ. 2024;81:438–443.
33. Taloni A, Borselli M, Scarsi V, et al. Comparative performance of humans versus GPT-4.0 and GPT-3.5 in the self-assessment program of American Academy of Ophthalmology. Sci Rep. 2023;13:18562.
34. Teebagy S, Colwell L, Wood E, Yaghy A, Faustina M. Improved performance of ChatGPT-4 on the OKAP examination: a comparative study with ChatGPT-3.5. J Acad Ophthalmol (2017). 2023;15:e184–e187.
35. Thirunavukarasu AJ, Mahmood S, Malem A, et al. Large language models approach expert-level clinical knowledge and reasoning in ophthalmology: a head-to-head cross-sectional study. PLoS Digit Health. 2024;3:e0000341.
36. Sakai D, Maeda T, Ozaki A, et al. Performance of ChatGPT in board examinations for specialists in the Japanese ophthalmology society. Cureus. 2023;15:e49903.
37. Madadi Y, Delsoz M, Lao PA, et al. ChatGPT assisting diagnosis of neuro-ophthalmology diseases based on case reports. medRxiv. 2023:2023.09.13.23295508.
38. Shukla R, Mishra AK, Banerjee N, Verma A. The comparison of ChatGPT 3.5, Microsoft Bing, and Google Gemini for diagnosing cases of neuro-ophthalmology. Cureus. 2024;16:e58232.
39. Jiao C, Edupuganti NR, Patel PA, Bui T, Sheth V. Evaluating the artificial intelligence performance growth in ophthalmic knowledge. Cureus. 2023;15:e45700.
40. Chen TC, Multala E, Kearns P, et al. Assessment of ChatGPT's performance on neurology written board examination questions. BMJ Neurol Open. 2023;5:e000530.
41. Schubert MC, Wick W, Venkataramani V. Performance of large language models on a neurology board-style examination. JAMA Netw Open. 2023;6:e2346721.
42. Fonseca Â, Ferreira A, Ribeiro L, Moreira S, Duque C. Embracing the future: is artificial intelligence already better? A comparative study of artificial intelligence performance in diagnostic accuracy and decision-making. Eur J Neurol. 2024;31:e16195.
43. Sorin V, Kapelushnik N, Hecht I, et al. GPT-4 multimodal analysis on ophthalmology clinical cases including text and images. medRxiv. 2023.
44. Waisberg E, Ong J, Masalkhi M, et al. Automated ophthalmic imaging analysis in the era of Generative Pre-Trained Transformer-4. Pan Am J Ophthalmol. 2023;5:46.
45. Mihalache A, Huang RS, Popovic MM, et al. Accuracy of an artificial intelligence chatbot's interpretation of clinical ophthalmic images. JAMA Ophthalmol. 2024;142:321–326.
46. Zandi R, Fahey JD, Drakopoulos M, et al. Exploring diagnostic precision and triage proficiency: a comparative study of GPT-4 and Bard in addressing common ophthalmic complaints. Bioengineering. 2024;11:120.
47. Lyons RJ, Arepalli SR, Fromal O, Choi JD, Jain N. Artificial intelligence chatbot performance in triage of ophthalmic conditions. Can J Ophthalmol. 2023.
48. DeBusk A, Subramanian PS, Scannell Bryan M, Moster ML, Calvert PC, Frohman LP. Mismatch in supply and demand for neuro-ophthalmic care. J Neuroophthalmol. 2022;42:62–67.
49. Stunkel L, Mackay DD, Bruce BB, Newman NJ, Biousse V. Referral patterns in neuro-ophthalmology. J Neuroophthalmol. 2020;40:485–493.
50. Thapa DK, Visentin DC, Kornhaber R, West S, Cleary M. The influence of online health information on health decisions: a systematic review. Patient Educ Couns. 2021;104:770–784.
51. Tailor PD, Xu TT, Fortes BH, et al. Appropriateness of ophthalmology recommendations from an online chat-based artificial intelligence model. Mayo Clin Proc Digit Health. 2024;2:119–128.
52. Bernstein IA, Zhang Y, Govil D, et al. Comparison of ophthalmologist and large language model chatbot responses to online patient eye care questions. JAMA Netw Open. 2023;6:e2330320.
53. North F, Luhman KE, Mallmann EA, et al. A retrospective analysis of provider-to-patient secure messages: how much are they increasing, who is doing the work, and is the work happening after hours? JMIR Med Inform. 2020;8:e16521.
54. Akbar F, Mark G, Warton EM, et al. Physicians' electronic inbox work patterns and factors associated with high inbox work duration. J Am Med Inform Assoc. 2021;28:923–930.
55. Tai-Seale M, Dillon EC, Yang Y, et al. Physicians' well-being linked to in-basket messages generated by algorithms in electronic health records. Health Aff (Millwood). 2019;38:1073–1078.
56. Adler-Milstein J, Zhao W, Willard-Grace R, Knox M, Grumbach K. Electronic health records and burnout: time spent on the electronic health record after hours and message volume associated with exhaustion but not with cynicism among primary care clinicians. J Am Med Inform Assoc. 2020;27:531–538.
57. Ayers JW, Poliak A, Dredze M, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med. 2023;183:589–596.
58. Tailor PD, Dalvin LA, Starr MR, et al.
A comparative study of large language models, human experts, and expert-edited large language models to neuro-ophthalmology questions. J Neuroophthalmol. 2024.
59. Garcia P, Ma SP, Shah S, et al. Artificial intelligence-generated draft replies to patient inbox messages. JAMA Netw Open. 2024;7:e243201.
60. Tai-Seale M, Baxter SL, Vaida F, et al. AI-generated draft replies integrated into health records and physicians' electronic communication. JAMA Netw Open. 2024;7:e246565.
61. Chen S, Guevara M, Moningi S, et al. The effect of using a large language model to respond to patient messages. Lancet Digit Health. 2024;6:e379–e381.
62. Singh S, Djalilian A, Ali MJ. ChatGPT and ophthalmology: exploring its potential with discharge summaries and operative notes. Semin Ophthalmol. 2023;38:503–507.
63. Tao BK, Handzic A, Hua NJ, Vosoughi AR, Margolin EA, Micieli JA. Utility of ChatGPT for automated creation of patient education handouts: an application in neuro-ophthalmology: response. J Neuroophthalmol. 2024;44:119–124.
64. Spina A, Tang J, Picton B, Spiegel S. Using ChatGPT to improve patient accessibility to neuro-ophthalmology research. J Neurol Sci. 2023;455.
65. Xiaocong L, Jiageng W, An S, et al. Transforming retinal vascular disease classification: a comprehensive analysis of ChatGPT's performance and inference abilities on non-English clinical environment. medRxiv. 2023:2023.06.28.23291931.
66. Waisberg E, Ong J, Masalkhi M, et al. Text-to-image artificial intelligence to aid clinicians in perceiving unique neuro-ophthalmic visual phenomena. Ir J Med Sci. 2023;192:3139–3142. |
| Date | 2024-09 |
| Date Digital | 2024-09 |
| References | Bouthour W, Biousse V, Newman NJ. Diagnosis of optic disc oedema: fundus features, ocular imaging findings, and artificial intelligence. Neuroophthalmol. 2023;47:177-192. Milea D, Najjar RP, Zhubo J, et al., BONSAI Group. Artificial intelligence to detect papilledema from ocular fundus photographs. N Engl J Med. 2020;382:1687-1695. Vasseneix C, Najjar RP, Xu X, et al., BONSAI Group. Accuracy of a deep learning system for classification of papilledema severity on ocular fundus photographs. Neurology. 2021;97:e369-e377. Echegaray S, Zamora G, Yu H, Luo W, Soliz P, Kardon R. Automated analysis of optic nerve images for detection and staging of papilledema. Invest Ophthalmol Vis Sci. 2011;52:7470-7478. Yang HK, Kim YJ, Sung JY, Kim DH, Kim KG, Hwang J-M. Efficacy for differentiating nonglaucomatous versus glaucomatous optic neuropathy using deep learning systems. Am J Ophthalmol. 2020;216:140-146. |
| Language | eng |
| Format | application/pdf |
| Type | Text |
| Publication Type | Journal Article |
| Source | Journal of Neuro-Ophthalmology, September 2024, Volume 44, Issue 3 |
| Collection | Neuro-Ophthalmology Virtual Education Library: Journal of Neuro-Ophthalmology Archives: https://novel.utah.edu/jno/ |
| Publisher | Lippincott, Williams & Wilkins |
| Holding Institution | North American Neuro-Ophthalmology Association. NANOS Executive Office 5841 Cedar Lake Road, Suite 204, Minneapolis, MN 55416 |
| Rights Management | © North American Neuro-Ophthalmology Society |
| ARK | ark:/87278/s64dr5ev |
| Setname | ehsl_novel_jno |
| ID | 2901236 |
| Reference URL | https://collections.lib.utah.edu/ark:/87278/s64dr5ev |



