Evaluación de los conocimientos de varios sistemas de inteligencia artificial sobre una subespecialidad de la medicina de urgencias y emergencias: la toxicología clínica

Santiago Nogué Xarau; Montserrat Amigó Tadín; José Ríos Guillermo

Nogué-Xarau S [1]; Amigó-Tadín M [2]; Ríos-Guillermo J [3]
1. Fundación Española de Toxicología Clínica, Barcelona, Spain.
2. Emergency Department, Hospital Clínic, Barcelona, Spain.
3. Department of Clinical Pharmacology, Hospital Clínic, Barcelona, Spain.
Source: Revista Española de Urgencias y Emergencias, ISSN-e 2951-6552, ISSN 2951-6544, Vol. 3, No. 1, 2024, pp. 15-19
Language: Spanish
DOI: 10.55633/s3me/reue.a005.2024
Parallel titles:
- Assessing 4 artificial intelligence systems’ knowledge of a subspecialty of emergency medicine: clinical toxicology

Dialnet Metrics: 4 citations

Abstract
- Spanish (translated)
  OBJECTIVE. Artificial intelligence (AI) is a discipline of computer science concerned with building systems capable of performing tasks attributed to human intelligence. The main objective of this study was to evaluate the answers given by several AI systems to questions in the field of clinical toxicology (CT).
  
  MATERIAL AND METHODS. Four AI applications were assessed: ChatGPT, Bing, LuzIA, and Bard. To evaluate their knowledge of CT, they were asked 30 questions on various aspects of the field. Each question offered five answer options, only one of which was correct. Answers were scored as correct or incorrect, and the presence of bibliographic support was recorded. When an answer was wrong, the same question was rephrased in different wording and the new answer was evaluated to see whether the response was sensitive to the quality of the question. The data were entered into an SPSS database for statistical analysis. A value of p < 0.05 was considered significant.
  
  RESULTS. The percentages of correct answers were 70% (Bing), 67% (ChatGPT and LuzIA), and 57% (Bard), with no statistically significant differences. When questions that had been answered incorrectly were rephrased, the percentage of correct answers rose in all four systems, but without significant differences. In their answers, Bing offered direct access to three bibliographic references and Bard to four, but few of them were found in PubMed (7.2% and 0.85%, respectively).
  
  CONCLUSIONS. The four AI systems answered more than 50% of the CT questions correctly. However, the bibliographic support they provide is scarce and of very low quality.
- English
  BACKGROUND AND OBJECTIVE. Artificial intelligence (AI) is a branch of computer technology that develops systems able to perform tasks associated with human intelligence. The main objective of this study was to evaluate AI answers to questions related to clinical toxicology.
  
  MATERIALS AND METHODS. We evaluated 4 AI applications: ChatGPT, Bing, LuzIA, and Bard. Thirty multiple-choice test questions in Spanish about various aspects of clinical toxicology were presented to the applications, and the answers were assessed. Each question included 5 possible answers, 1 of which was correct. In addition to correctness, we evaluated the bibliographic support each application provided. If the application gave an incorrect answer, we rephrased the question, presented it again, and reevaluated the new answer to detect whether question quality influenced performance. Data were recorded for analysis with SPSS. The level of statistical significance was set at P < .05.
  
  RESULTS. The scores achieved by the AI applications were as follows: Bing, 70%; ChatGPT and LuzIA, 67% each; and Bard, 57% (P > .05). The scores improved after the incorrect questions were rephrased, but the differences were not significant. Bing included direct access to 3 references per question and Bard to 4. However, only 7.2% and 0.85% of the references, respectively, were to PubMed-indexed sources.
  
  CONCLUSIONS. All 4 AI applications were able to correctly answer more than half the questions about clinical toxicology. After rephrasing some questions, each system achieved more correct answers. The supporting references the applications provided were few and of poor quality.
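
The comparison of scores reported in the abstract can be illustrated with a short sketch. The abstract does not name the statistical test, so a Pearson chi-square test on a 4 x 2 table (system by correct/incorrect) is used here purely as an assumption. The counts of correct answers (21, 20, 20, and 17 out of 30) are back-calculated from the reported percentages and are not taken from the study's data. Because the same 30 questions were put to every system, a paired test may be more appropriate; the per-question results needed for that are not given in the abstract.

```python
import math

# Assumed counts of correct answers out of 30, back-calculated from the
# reported percentages (70%, 67%, 67%, 57%); not the study's raw data.
N_QUESTIONS = 30
correct = {"Bing": 21, "ChatGPT": 20, "LuzIA": 20, "Bard": 17}


def chi_square_statistic(correct_counts, n_questions):
    """Pearson chi-square statistic for a (systems x correct/incorrect) table.

    Every system answered the same number of questions, so the expected
    counts are identical across systems.
    """
    k = len(correct_counts)
    expected_correct = sum(correct_counts.values()) / k
    expected_wrong = n_questions - expected_correct
    stat = 0.0
    for hits in correct_counts.values():
        misses = n_questions - hits
        stat += (hits - expected_correct) ** 2 / expected_correct
        stat += (misses - expected_wrong) ** 2 / expected_wrong
    return stat


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


stat = chi_square_statistic(correct, N_QUESTIONS)
# Degrees of freedom: (4 systems - 1) * (2 outcomes - 1) = 3
p_value = chi_square_sf_df3(stat)
print(f"chi2 = {stat:.3f}, p = {p_value:.3f}")
```

Under these assumptions the p-value is well above 0.05, in line with the abstract's statement that the differences between systems were not statistically significant.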