Can co-designed educational interventions help consumers think critically about asking ChatGPT health questions? Results from a randomised-controlled trial
npj Digital Medicine
Julie Ayre, Melody Taba, Brooke Nickel, Geoffrey Edlund, Trang Vu, Julia Yan, Lorna Butters, Ivan C. K. Ma, Kirsten J. McCaffery
Abstract
This randomised controlled trial evaluated two brief co-designed health literacy educational interventions (animation; images) to help people critically reflect on asking ChatGPT health questions. Australian adults with experience of ChatGPT, and without university education, were recruited via an online panel. Primary outcomes were intention to ask ChatGPT questions in ‘lower’ and ‘higher’ risk health scenarios. The analysis sample comprised 592 of 619 participants. The animation group (n = 191) reported lower intention to use ChatGPT for higher risk scenarios (M = 2.42/5, 95%CI: 2.27 to 2.56) compared to the images group (n = 203, M = 2.69/5, 95%CI: 2.54 to 2.83, p = 0.010). Both reported lower intentions compared to a control group who had not viewed the educational content (n = 198, M = 3.12/5, 95%CI: 2.98 to 3.27, p < 0.001). There was no effect on intentions to use ChatGPT for lower risk scenarios (p = 0.800). This study represents an initial step towards addressing health literacy skills for navigating AI tools safely. Australia and New Zealand Clinical Trials registration: ACTRN 12624000641594.
Use of ChatGPT to obtain health information in Australia, 2024: insights from a nationally representative survey
The Medical Journal of Australia
Julie Ayre, Erin Cvejic, Kirsten J McCaffery
Since the launch of ChatGPT in 2022,1 people have had easy access to a generative artificial intelligence (AI) application that can provide answers to most health-related questions. Although ChatGPT could massively increase access to tailored health information, the risk of inaccurate information is also recognised, particularly with early ChatGPT versions, and its accuracy varies by task and topic.2 Generative AI tools could be a further problem for health services and clinicians, adding to the already large volume of medical misinformation.3 Discussions of the benefits and risks of the new technology for health equity, patient engagement, and safety need reliable information about who is using ChatGPT, and the types of health information they are seeking.
To examine the use of ChatGPT in Australia for obtaining health information, we surveyed a nationally representative sample of adults (18 years or older) drawn from the June 2024 wave of the Life in Australia panel.4 Participants who completed the Life in Australia survey online or by telephone were asked how often they used ChatGPT for health information purposes during the preceding six months, the type of questions they asked, and their trust in the responses. Participants who were aware of ChatGPT but had not used it for health information purposes were asked about their intentions to do so in the following six months. Health literacy was assessed using a validated single-item screener: “If you need to go to the doctor, clinic or hospital, how confident are you filling out medical forms by yourself?”5 Demographic information was derived from previously collected panel data. Residential postcode-based socio-economic standing was classified according to the Index of Relative Socio-economic Advantage and Disadvantage (IRSAD; by quintile).6 Participant responses were weighted to the Australian population using propensity scores. Associations between respondent characteristics and survey responses were assessed using simple logistic regression; we report odds ratios (ORs) with 95% confidence intervals (CIs). Analyses were conducted in SPSS 26. Unless otherwise stated, we report unweighted results (further study details: Supporting Information, part 1). Our study was approved by the University of Sydney Human Research Ethics Committee (2024/HE000247).
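As a rough illustration of the reported statistic, a simple logistic regression with a single binary predictor gives the same odds ratio as the cross-product of a 2×2 table, and its Wald 95% CI matches the interval computed from the standard error of the log odds ratio. The sketch below assumes unweighted counts; the function name `odds_ratio_ci` and the example counts are hypothetical, and this is not the authors' SPSS analysis.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a = group 1 with the outcome, b = group 1 without it,
    c = group 2 with the outcome, d = group 2 without it.
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not survey data):
# 20 of 100 respondents in one group and 10 of 100 in another report the outcome.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f} (95% CI: {lo:.2f} to {hi:.2f})")
```

An interval that includes 1 would indicate no clear association at the conventional 5% level. A survey-weighted analysis would need weighted estimation rather than raw counts.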
The quality and safety of using generative AI to produce patient-centred discharge instructions
Abstract
Patient-centred instructions on discharge can improve adherence and outcomes. Using GPT-3.5 to generate patient-centred discharge instructions, we evaluated responses for safety, accuracy and language simplification. When tested on 100 discharge summaries from MIMIC-IV, potentially harmful safety issues attributable to the AI tool were found in 18%, including 6% with hallucinations and 3% with new medications. AI tools can generate patient-centred discharge instructions, but careful implementation is needed to avoid harms.
New Frontiers in Health Literacy: Using ChatGPT to Simplify Health Information for People in the Community
Abstract
Background
Most health information does not meet the health literacy needs of our communities. Writing health information in plain language is time-consuming but the release of tools like ChatGPT may make it easier to produce reliable plain language health information.
Objective
To investigate the capacity for ChatGPT to produce plain language versions of health texts.
Design
Observational study of 26 health texts from reputable websites.
Methods
ChatGPT was prompted to ‘rewrite the text for people with low literacy’. Researchers captured three revised versions of each original text.
Main Measures
Objective health literacy assessment, including Simple Measure of Gobbledygook (SMOG), proportion of the text that contains complex language (%), number of instances of passive voice and subjective ratings of key messages retained (%).
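For readers unfamiliar with SMOG, the grade is conventionally estimated as 1.0430 × sqrt(polysyllabic words × 30 / sentences) + 3.1291, where a polysyllabic word has three or more syllables. The sketch below is a minimal approximation: the syllable counter is a simple vowel-group heuristic (an assumption, not a dictionary lookup), the function names are hypothetical, and it is not the assessment tool used in the study.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic only)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent (e.g. "make"), except in "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def smog_grade(text: str) -> float:
    """Estimate the SMOG reading grade of a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        raise ValueError("text must contain at least one sentence")
    words = re.findall(r"[A-Za-z']+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    # The factor of 30 scales the count to SMOG's 30-sentence sample size.
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


complex_text = "Medication adherence is important."
simple_text = "Take your pills each day."
print(round(smog_grade(complex_text), 1), round(smog_grade(simple_text), 1))
```

SMOG was designed for samples of 30 sentences, so estimates for very short texts like these are unstable and serve only to show the direction of the effect.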
Key Results
On average, original texts were written at grade 12.8 (SD = 2.2) and revised to grade 11.0 (SD = 1.2), p < 0.001. Original texts were on average 22.8% complex (SD = 7.5%) compared to 14.4% (SD = 5.6%) in revised texts, p < 0.001. Original texts had on average 4.7 instances (SD = 3.2) of passive voice compared to 1.7 (SD = 1.2) in revised texts, p < 0.001. On average, 80% of key messages were retained (SD = 15.0). The more complex original texts showed greater improvement than the less complex original texts. For example, when original texts were ≥ grade 13, revised versions improved by an average of 3.3 grades (SD = 2.2), p < 0.001. Simpler original texts (< grade 11) improved by an average of 0.5 grades (SD = 1.4), p < 0.001.
Conclusions
This study used multiple objective assessments of health literacy to demonstrate that ChatGPT can simplify health information while retaining most key messages. However, the revised texts typically did not meet health literacy targets for grade reading score, and improvements were marginal for texts that were already relatively simple.