Combining AI and Human Support in Mental Health

The study by Palmer et al. (2024) evaluates a digital intervention that combines an AI-driven conversational agent with human clinical oversight and user support to alleviate symptoms of generalized anxiety. Participants experienced a large reduction in anxiety symptoms that was statistically superior to a waiting control and non-inferior to face-to-face and typed cognitive behavioral therapy, while requiring less clinician time. The research highlights the potential of scalable digital solutions to improve access to mental healthcare globally amid rising demand.


medRxiv preprint doi: https://doi.org/10.1101/2024.07.17.24310551; this version posted July 17, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Palmer et al., (2024) – Combining AI and human support in mental health

Clare E Palmer1, Emily Marshall1, Edward Millgate1, Graham Warren1, Michael P. Ewbank1, Elisa Cooper1, Samantha Lawes1, Malika Bouazzaoui1, Alastair Smith1, Chris Hutchins-Joss1, Jessica Young1, Morad Margoum2, Sandra Healey2, Louise Marshall1, Shaun Mehew1, Ronan Cummins1, Valentin Tablan1, Ana Catarino1, Andrew E Welchman1 and Andrew D Blackwell1

1 ieso Digital Health, The Jeffrey's Building, Cowley Road, Cambridge, CB4 0DS, UK
2 Dorset HealthCare University NHS Foundation, Sentinel House, Nuffield Industrial Estate, Nuffield Road, Poole, UK

Escalating global mental health demand exceeds existing clinical capacity. Scalable digital solutions will be essential
to expand access to high-quality mental healthcare. This study evaluated the effectiveness of a digital intervention
to alleviate mild, moderate and severe symptoms of generalized anxiety. This structured, evidence-based program
combined an Artificial Intelligence (AI) driven conversational agent to deliver content with human clinical oversight
and user support to maximize engagement and effectiveness. The digital intervention was compared to three
propensity-matched real-world patient comparator groups: i) waiting control; ii) face-to-face cognitive behavioral
therapy (CBT); and iii) remote typed-CBT. Endpoints for effectiveness, engagement, acceptability, and safety were
collected before, during and after the intervention, and at one-month follow-up. Participants (n=299) used the
program for a median of 6 hours over 53 days. There was a large clinically meaningful reduction in anxiety symptoms
for the intervention group (per-protocol (n=169): change on GAD-7 = –7.4, d = 1.6; intention-to-treat (n=299): change
on GAD-7 = –5.4, d = 1.1) that was statistically superior to the waiting control, non-inferior to human-delivered care,
and was sustained at one-month follow-up. By combining AI and human support, the digital intervention achieved
clinical outcomes comparable to human-delivered care while significantly reducing the required clinician time.
These findings highlight the immense potential of technology to scale effective evidence-based mental healthcare,
address unmet need, and ultimately impact quality of life and economic burden globally.

NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.


Mental health conditions are the economic and healthcare challenge of our time. Globally, one in eight people live with a mental health condition (1), yet only one in four who require treatment receive it (2). Advances in technology and widespread internet access have been pivotal in increasing access to high-quality mental healthcare. However, one-to-one mental healthcare is inherently limited in its ability to meet the rising mental health demand, and there remains a significant shortage of therapists: there are only four psychiatrists per 100,000 people globally (3), and 58% of the US population live within a health workforce shortage area (4). Technology is primed to enable massive scaling of mental health interventions to increase both access and quality of support worldwide (5).

Rapid advances in computing and Artificial Intelligence (AI) in recent years have led to a rise in the development of digital interventions aiming to solve this scalability problem, and there are an estimated 10,000–20,000 smartphone applications available for mental health support (6,7). These solutions have the potential to enable timely access to support when needed, negate the logistical challenges of attending regular appointments, offer greater patient choice, and reduce burden on therapists and healthcare services (8). However, real-world usage, and in turn effectiveness, of many digital mental health solutions (most of which are self-led) has been poor (9–11). Despite a reported willingness of patients to adopt smartphone applications (12), one-month retention rates are typically under 6% (13). Moreover, a recent meta-analysis of mental health applications for symptoms of anxiety and depression found a small pooled clinical effect size (g = 0.26) and highlighted that only 48% delivered content based on Cognitive Behavioral Therapy (CBT) principles (14), a “gold-standard” evidence-based approach for anxiety and depression (15). Improving access is crucial, but equally vital is ensuring the support available to patients is engaging and effective.

NHS Talking Therapies (NHS TT, formerly IAPT) is a world-leading initiative designed to increase access to and improve delivery of mental health treatment in the UK. Fundamental to the success of NHS TT is systematic outcomes-monitoring, use of evidence-based treatment protocols, and an appropriately trained and supervised workforce (16). The acceleration of telehealth and expansion of care delivery through digital platforms (e.g. typed conversations) has also enabled insights into the relationship between the active components of evidence-based treatments and clinical outcomes (17,18). Combining this approach with the scalable, systematic delivery of evidence-based protocols through digital tools offers the opportunity to reduce heterogeneity across the provision of mental healthcare worldwide, and accelerate large-scale scientific research to further enhance treatment quality and personalization (19). High-quality, accessible digital mental healthcare has the potential to maximize impact globally by both improving patient quality of life and reducing the growing economic burden of mental health on health systems and society (20,21).

In this study, we evaluated a digital program that uses this approach to alleviate mild, moderate and severe symptoms of generalized anxiety in adults. The program was designed to maximize engagement and effectiveness by using i) a structured evidence-based program drawing on principles from traditional CBT (22) including third-wave approaches, i.e. Acceptance and Commitment Therapy (ACT) (23), and ii) an AI-powered conversational agent to deliver the program content in a personalized way. In addition, a dedicated human clinical and user support service was designed to wrap around the digital program, following previous research showing that human support significantly improves engagement with digital interventions (12,24). This service was developed to provide appropriate support while maintaining the scalability of the digital solution.

This study aimed to measure engagement, clinical effectiveness, acceptability and safety of this digital intervention. Evidence of the effectiveness of a digital intervention is often established through the comparison between the intervention and a waitlist control or self-led non-digital treatment only. However, if digital programs are to provide a scalable solution to global mental health need, we should expect them to provide comparable effectiveness to current standards of care. In this pragmatic, prospective single-intervention arm study, we compared the digital program against propensity-matched external control data from three groups of real-world NHS patients: i) a waiting control with no intervention; ii) patients receiving human-delivered face-to-face CBT; and iii) patients receiving human-delivered typed-CBT. While 1:1 face-to-face therapy serves as the gold-standard for comparison, 1:1 typed-therapy provides a more analogous comparison to the digital program under evaluation where content is predominantly delivered through written communication with the conversational agent. This study design allowed us to evaluate the comparative clinical effectiveness of the digital intervention to human-delivered standard care.

This was a pragmatic, single-intervention arm study with multiple external control groups to measure the engagement, clinical effectiveness, acceptability and safety of a digital program to alleviate symptoms of generalized anxiety in a sample of 300 UK participants. This study was conducted by ieso Digital Health (“ieso”, https://www.iesogroup.com/), an outpatient service provider within NHS TT delivering 1:1 human-delivered CBT via a typed modality to treat patients with common mental health disorders. The digital program evaluated here (software name: IDH-DP2-001) was developed by ieso as part of a clinical innovation program creating new scalable digital solutions for mental health support. This was an externally controlled trial meaning comparator arms (sometimes referred to as synthetic control arms) were generated through 1:1 propensity-matching of participants with real-world patients. External propensity-matched control groups were generated to evaluate the digital intervention in comparison to no intervention (i.e. waiting control), face-to-face CBT (gold-standard benchmark), and typed-CBT. This latter group provides an important comparator as it is an example of human-delivered care that closely mirrors the written content delivery within the digital program.

The intervention was delivered via a smartphone application (iPhone & Android). Following an initial clinical assessment with a qualified clinician, eligible participants downloaded the software on their personal smartphone and completed the program in their own time and according to a defined schedule. Participants were required to complete the six-module program within nine weeks. Clinical outcomes were collected as part of the program prior to each module (up to six time-points). Additional study endpoints were collected using validated questionnaires prior to the intervention, during the intervention (following completion of module three activities), after the intervention (completion or when nine-week time limit reached) and at one-month follow-up.

At the point of consent, all participants were asked if they were willing to participate in interviews with additional compensation offered. The sub-sample (based on first-come-first-served sign-up for available interview slots) attended a semi-structured interview pre- and post-intervention to gather qualitative insights into the experience, acceptability and perceived safety of the digital program. Findings on acceptability of the digital program from detailed qualitative analysis of these interviews are reported in a separate publication.

The study was pre-registered (ISRCTN ID: 52546704) and obtained ethical approval prior to recruitment (IRAS ID: 327897, NHS Research Ethics Committee: West of Scotland REC 4). The trial design and participant CONSORT flowchart (25) are summarized in Figure 1. In line with the Declaration of Helsinki, all participants provided signed informed consent and were debriefed following the study.

Anxiety and mood symptoms were measured before and after the intervention, as well as at the beginning of each module within the program (maximum 6 symptom check-ins) using the Generalized Anxiety Disorder-7 scale (GAD-7) (26) and the Patient Health Questionnaire (PHQ-9) (27) scale. The Work and Social Adjustment Scale (WSAS) (28) and the inflexibility scale (30 items) of the Multidimensional Psychological Flexibility Inventory (MPFI) (29) were collected pre-intervention, during (at the program mid-point) and post-intervention, as measures of functioning and psychological inflexibility, respectively. The following validated self-report measures were collected only at post intervention: the User Engagement Scale (UES) (30), the System Usability Scale (SUS) (31), and the Service-User Technology Acceptability Questionnaire (SUTAQ) (32). A qualitative feedback survey was also administered post-intervention and at one-month follow-up. Demographic data were collected at enrolment and are summarized in Table 1. Findings from the SUS, UES, SUTAQ, MPFI, feedback surveys and qualitative data from pre- and post-intervention semi-structured interviews are reported in a separate publication.

The intervention consisted of a six-module digital program (‘ieso Digital Program’; software name: IDH-DP2-001) that used a conversational agent to guide participants through a pre-defined set of activities with human clinical oversight and user support. The program was intended as a first-line intervention for people primarily presenting with anxiety symptoms. The program was designed based on cognitive behavioral principles from traditional CBT and third wave approaches, such as ACT (33,34) (see Supplementary Table 1 for module details). All of the cognitive and behavioral processes, analogies and examples within the intervention were selected for their specificity in targeting symptoms of generalized anxiety.
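The assessment schedule described in this section can be summarised as a small lookup table. The sketch below is only an illustration in Python: the time-point labels (`pre`, `module_check_in`, `mid`, `post`, `follow_up`) and the helper `measures_at` are hypothetical names introduced here, and the follow-up entry for the GAD-7 is taken from the abstract's statement that outcomes were sustained at one-month follow-up. Whether other scales were also collected at follow-up is not stated, so they are left out.

```python
# Assessment schedule as described in the text.
# Time-point labels are illustrative assumptions, not the authors' terms.
SCHEDULE = {
    "GAD-7": {"pre", "module_check_in", "post", "follow_up"},
    "PHQ-9": {"pre", "module_check_in", "post"},
    "WSAS": {"pre", "mid", "post"},
    "MPFI inflexibility": {"pre", "mid", "post"},
    "UES": {"post"},
    "SUS": {"post"},
    "SUTAQ": {"post"},
    "Feedback survey": {"post", "follow_up"},
}

# One symptom check-in at the start of each of the six modules.
MAX_CHECK_INS = 6


def measures_at(time_point):
    """Return the names of all measures collected at a given time point."""
    return sorted(name for name, points in SCHEDULE.items() if time_point in points)
```

For example, `measures_at("mid")` lists the two scales collected at the program mid-point, and `measures_at("post")` lists everything collected after the intervention.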


Figure 1. CONSORT diagram. Prior to assessment, enrolment avenues differed for external recruits (left) and patients referred to ieso for typed therapy (either from an NHS Provider or via a self-direct referral; right). External recruits signed up specifically for the study via an external webpage following social media or email advertisements. All potential participants, irrespective of enrolment avenue, were triaged for suitability based on a Self-Assessment Questionnaire (SAQ). For patients, only those deemed to be potentially eligible were invited to participate. Participants were withdrawn either actively (requested to withdraw), passively (dropped out or disengaged from study procedures), clinician-led (withdrawn based on clinician recommendation), or other (due to reasons such as technical issues). TAU = treatment as usual.


The six modules consisted of an introduction module, three core modules, and two consolidation modules (Figure 2). The three core modules each consisted of three sessions that followed the pattern of i) learning, ii) activity, and iii) practice. The two consolidation modules consisted of two sessions. There were 16 sessions total. The introduction and consolidation modules consisted of sessions designed for onboarding and learning consolidation, respectively. All modules began with a symptom “check-in” consisting of the GAD-7 and PHQ-9 within the software immediately before the first session within that module. Sessions were made available on a timed schedule subject to completing the prior session (Figure 2).

Within each session, the software used a conversational agent to guide participants through a combination of videos, educational content, conversations, and worksheets written by accredited clinicians. The software used AI models for Natural Language Understanding, specific and tailored elements of Natural Language Generation and a dialogue management system. Part way through enrolment, with agreement from the overseeing NHS Research Ethics Committee, the software was updated to fix bugs, improve the user experience within the introductory module, and update select AI models. The final 60 participants enrolled were offered the updated software. Software version was controlled for in statistical analyses. The digital program was built in accordance with ISO 13485. Prior to the study, the program was registered as a UKCA marked Class 1 medical device.

To ensure participant safety and maximize engagement and acceptability of the program, a dedicated human user and clinical support service was provided. Prior to enrolment, as part of the screening process, all participants received a standardized clinical assessment by a trained clinician with an accredited postgraduate qualification via typed modality. The clinician assessed the individual’s needs, determined if they were eligible for the study and obtained informed consent. Research coordinators provided fortnightly check-in calls to all participants throughout the program and sent weekly emails or SMSs to remind participants only if they deviated from the program schedule. Risk could be flagged through symptom monitoring of GAD-7 and PHQ-9 scores or through interaction with the research coordinators during check-in calls or ad hoc communication. Flagged risk was escalated to a clinician for review. Where appropriate the participant would then be contacted for further risk assessment by a clinician to ensure their safety. Participants could also request an appointment with a clinician at any point to discuss their journey, particularly if they were unsure the program was working for them. At the end of the study, all participants were offered a further discharge appointment with a study clinician to discuss the next steps for their care.

Figure 2. Schematic of ieso Digital Program with human clinical and user support service and study procedures. All participants received a clinical assessment prior to enrolment and were offered a discharge appointment with a clinician following the program. Clinicians were available via asynchronous messaging or for a review appointment whenever needed. To maximize engagement, all participants received email or SMS reminders and fortnightly check-in calls from the research team throughout the program. The ieso Digital Program included 6 modules with a total of 16 sessions. Each module started with a symptom check-in consisting of the GAD-7 and PHQ-9.
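The rule that sessions become available "on a timed schedule subject to completing the prior session" can be sketched as a simple gating check. This is a minimal illustration, not the program's actual logic: the function name `session_available`, the 0-based session indexing, and the three-day release interval are all assumptions made for the example, since the text does not give the release intervals.

```python
from datetime import datetime, timedelta


def session_available(index, completed, release_times, now):
    """Return True if session `index` (0-based) can be started at `now`.

    A session is available only when (a) its scheduled release time has
    passed and (b) the previous session has been completed.
    """
    if index < 0 or index >= len(release_times):
        return False
    prior_done = index == 0 or (index - 1) in completed
    return prior_done and now >= release_times[index]


# Hypothetical schedule: 16 sessions released at an ASSUMED 3-day interval.
start = datetime(2024, 1, 1)
release_times = [start + timedelta(days=3 * i) for i in range(16)]
```

Under these assumptions, the second session stays locked until the first is completed, even if its release date has passed, and also stays locked before its release date even if the first session is done.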


The support service and study procedures are illustrated in Figure 2. In total, delivering the intervention required an average of 97 minutes (1.6 hours) of clinician time (i.e. time spent in sessions with participants) per participant. This included 299 assessments (mean 66 mins; range 31–105 mins), 46 review appointments (mean 32 mins; range 14–60 mins) and 174 discharge appointments (mean 44 mins; range 13–76 mins).

Adults with mild to severe symptoms of anxiety, consistent with Generalized Anxiety Disorder (GAD), were invited to participate either following referral to ieso’s typed therapy service (either referred to ieso from the NHS Provider or via self-referral direct to ieso) or in response to online advertisements or email invitation through the NIHR BioResource for Translational Research (https://bioresource.nihr.ac.uk/). Only participants with a main problem descriptor of GAD were eligible as established through a clinician assessment in line with the NHS TT manual (16).

All participants met the following eligibility criteria:

• over the age of 18 years at point of recruitment;
• GAD-7 total score > 7;
• PHQ-9 total score < 16;
• access to a smartphone and internet connection;
• registered with a General Practitioner in the UK;
• not currently receiving psychological therapy;
• did not have PTSD, OCD or Panic Disorder;
• did not have a change in psychiatric medication in the past 1 month;
• did not display significant risk of harm to self, to others or from others (as established with the clinical assessment).

Any individuals who had previously participated in user research for the digital program were excluded. Participants were recruited between 10th October 2023 and 2nd February 2024. Financial incentive up to a total of £60 was provided in the form of vouchers based on study assessments and completion of modules within the digital program. For a sub-sample that participated in additional interviews, an additional £15 voucher per semi-structured interview was provided.

Previous studies have reported up to a 70% attrition rate when measuring engagement and adherence in mental health digital programs (35–37), therefore we aimed to enroll 300 participants with the expectation of a 40–70% attrition rate, resulting in a final sample of 90–180 participants. A non-inferiority power analysis was conducted prior to retrospective analysis of external control data to estimate the total sample size needed to quantify clinical effectiveness (i.e. change in GAD-7 total score) compared to an active external control. Clinical effectiveness was defined as a change in GAD-7 score over either the course of six treatment sessions or until recovery was reached (if sooner than 6 sessions). A non-inferiority margin of a 1.8 change in GAD-7 total score was chosen based on previous literature (38–40; see Supplementary Methods for more details). Using data from patients being treated for GAD via typed-CBT, with at least six sessions or recovery, we estimated an expected standard deviation of GAD-7 change of 5.14. To estimate a sample size, we used the following equation:

$$n = 2\left(\frac{Z_\alpha + Z_\beta}{(\delta + \Delta)/\sigma}\right)^2$$

(see (41)), where Zα and Zβ are the standard normal scores for the one-sided significance level of 2.5% (1.96) and power of 90% (1.28) respectively, δ is the non-inferiority level 1.8 and σ is the standard deviation 5.14. A sample size of 172 was estimated for the study intervention to enable a non-inferiority analysis of clinical effectiveness compared to human-delivered care.

At ieso, experts by lived experience are involved in research and development work as members of a PPI panel and as partners advising on ongoing work. For this study, all participant-facing documents were reviewed by members of the PPI panel. In addition, focus groups with members of the PPI panel during study conceptualization aimed to understand participant needs and expectations in the context of “keeping safe” whilst using the digital program, and helped develop recruitment marketing campaigns.

External comparator data were taken from two NHS TT service providers: i) ieso typed therapy data where a patient receives CBT through 1:1 communication with a qualified therapist using real-time text-based messaging; and ii) Dorset Healthcare University NHS Foundation Trust (DHC) delivering face-to-face routine therapy appointments. The information captured through the dataset of NHS TT is intended to support the monitoring of the implementation and effectiveness of national policy and legislation, policy development, performance analysis and benchmarking, national analysis and statistics, and national audit of NHS TT services. At registration, patients agree to the services’ terms and conditions, including the use of deidentified data for research and audit purposes, including academic publications or conference presentations. External control data were obtained from patients referred to: a) ieso’s typed therapy service between January 2022 and
December 2023, and b) DHC between January 2017 and December 2021.

Analyses were conducted in R (42). A statistical analysis plan was defined prior to final analyses being conducted.

The per-protocol (PP) sample (n=169) was defined as participants who completed the minimum meaningful clinical dose of the program (MMCD) and the final post-intervention GAD-7 and PHQ-9 questionnaires. This dose was defined a priori by three accredited cognitive behavioral therapists who evaluated the content of the program to determine the amount of content required to deliver meaningful clinical improvement on the GAD-7 scale based on their clinical experience (mean experience of 14 years delivering psychological therapy). Based on this evaluation, the MMCD was defined as completing modules 1 to 3 in the digital program and the module 4 check-in.

The intention-to-treat (ITT) sample (n=299) included all participants who completed questionnaires at enrolment irrespective of adherence to the digital program except for one participant who requested that their data be deleted. Due to missing data for the pre-intervention WSAS (external recruits only), the ITT sample for all WSAS analyses was n=295.

Metrics of adherence were primarily assessed with descriptive statistics of in-software usage metrics: median and distribution of time spent in the digital program in hours, days since initialization of the program (defined based on the date that the software was downloaded); and proportion of participants completing each session, module, and check-in. An “engaged” patient is defined as an individual who has received the minimum amount of therapy such that pre- and post-treatment measures can be collected, and clinical outcomes estimated (16). Here we used a comparable definition of engagement based on usage of the program (including time in the program, content delivered, and number of outcomes measured) defined as completing session 1 of module 2 in the program. This is in contrast to the MMCD definition which is defined based on both usage and expected improvement in symptoms.

Clinical effectiveness was quantified by calculating the change in anxiety symptoms, measured using the GAD-7, from baseline to final score, and estimating a within-subject effect size (Cohen’s d). The threshold for a clinically meaningful reduction in symptoms was defined as a change greater than the reliable change index of the GAD-7 scale (minimum of a 4-point reduction; Toussaint et al., 2020). Clinical outcomes were calculated using the following definitions: a) improvement was defined as a reduction on the PHQ-9 or GAD-7 scales greater than or equal to the reliable change index (≥4 for GAD-7; ≥6 for PHQ-9) and no reliable increase on either measure; b) recovery was defined as reduction on both scales to below the clinical cutoff (GAD-7 score <8; PHQ-9 score <10); c) reliable recovery was defined as having both improved and recovered; d) responder rate was defined as an improvement of either ≥4 on the GAD-7 or ≥6 on the PHQ-9; and e) remission rate was defined as having either a final GAD-7 score <8 or final PHQ-9 score <10 for those only having started above the clinical cut-off. Definitions for improvement, recovery and reliable recovery are equivalent to those used in NHS TT (44). A within-subjects effect-size for mean change in GAD-7 scores from post-intervention to one month follow-up was calculated to determine the short-term durability of any effects of the digital intervention. We also measured effectiveness by calculating the change in PHQ-9 and WSAS between baseline and final score, as well as between comparator groups. For the ITT sample, when calculating GAD-7 and PHQ-9 effectiveness, missing post-intervention scores were imputed using last observation carried forward, such that the final score collected prior to disengagement or withdrawal was used.

To determine whether any demographic or study variables were associated with adherence or effectiveness, a series of regression analyses were conducted. All regression models included age, gender, highest qualification, employment status, religion, presence of a chronic physical health condition, ethnicity, reported disability, sexuality, baseline GAD-7 severity, software version, and enrolment path (referred to ieso’s typed therapy service or externally recruited) as predictors. Linear regression models were used to predict continuous dependent variables: i) number of sessions completed; ii) change in GAD-7 score from baseline to final score. A logistic regression model was used to predict non-adherence (i.e. participants who did not complete the necessary program sessions or study assessments to be in the PP sample; non-adherence coded as 1). Due to unequal sample sizes within demographic sub-categories (e.g. sexuality), groups were truncated to aid in the interpretability of findings and power of analyses. To determine if adherence across sessions differed between groups, adherence rates were compared between the digital program, face-to-face CBT
and typed CBT by estimating the slope and confidence intervals of the association between the proportion of sample that completed each GAD-7 assessment (either symptom check-in within the digital program or prior to each therapy session as standard within NHS TT) and session number. Sessions were aligned such that each symptom check-in within the program was associated with a treatment session for the control group.

Safety was assessed using reported serious adverse events, software deficiencies, and number of cases withdrawn based on clinician assessment of suitability to continue with the program. Software deficiencies include malfunctions or errors of the software that could result in issues related to safety or software performance.

Three propensity-matched external control groups were created using real-world historic patient data (see External comparator data source) to compare the clinical effectiveness of the intervention to no intervention and standard of care. All propensity-matched control patients had a main problem descriptor of GAD as established through a clinician assessment.

The control groups consisted of:
i. waiting controls (total available sample n=576); patients referred for typed-CBT with two GAD-7 scores between 4-10 weeks apart without having started treatment during that time (same sample used for PP and ITT analyses),
ii. therapist delivered typed CBT (total available sample n=2,210); patients referred for typed-CBT with at least two scores on the GAD-7, who had completed a course of typed CBT - defined by the discharge code of ‘completed treatment’ - and discharged with a maximum of twelve treatment sessions (PP sample), or any patient who had entered treatment, regardless of completion (ITT sample), and
iii. therapist delivered face-to-face CBT (total available sample n=753); NHS TT patients referred to DHC who received face-to-face CBT and had a minimum of two and a maximum of twelve treatment sessions (PP sample), or any patient who attended treatment (ITT sample). Unlike the typed-CBT comparator, due to unavailability of discharge codes it was not possible to use the ‘completed treatment’ to define the PP sample for this group.

Enrolled participants in the intervention group were propensity-matched to patients from these control groups using baseline GAD-7 scores, baseline PHQ-9 scores, age, and the presence of a chronic physical health condition (yes/no/not known). Propensity-matching was conducted using the ‘MatchIT’ package (45) in R with ‘nearest neighbor’ methodology (average treatment effect in treated patients). For the waitlist control only participants in the PP sample were matched (n=169) due to limited available data for matching. For the human-delivered therapy control groups all participants were matched (n=299). Supplementary Table 2 illustrates the matching of comparator groups to the intervention sample.

In line with the a priori defined statistical analysis plan, a superiority analysis was conducted to test the hypothesis that the clinical effectiveness of the intervention was greater than a propensity-matched waiting control group. A non-inferiority analysis was conducted to test the hypothesis that the clinical effectiveness of the intervention was not inferior to the effectiveness of typed CBT or face-to-face CBT in comparison to waiting-list. Within and between-subject effect sizes were also estimated for the change in total score on the PHQ-9 and the WSAS to estimate the effectiveness of the intervention on low mood and work and social functioning relative to the waiting control.

The final sample for analysis included 299 participants of whom 80% were female (n=240) with a mean age at baseline of 39.8 years (range: 18 – 75 years). Table 1 provides an overview of demographics and baseline severity for participants in the intervention group for both the ITT and PP samples.

Participants (n=299) completed a median of 6.1 hours of program interaction over 53.1 days. This was higher for the PP sample in which participants completed a median of 8.7 hours over 59.6 days. In total, 232 participants (78%) were engaged in the program (i.e. completed session 1 of module 2) involving a median of 2 hours interacting with the program content over 14 days. Out of those engaged participants, 78% (n=180) reached the minimum meaningful clinical dose (i.e. completing up to check-in 4 out of 6 in the program). The overall study attrition rate (defined as the proportion of participants who did not complete the final study questionnaires) was


Table 1. Sample characteristics of the digital intervention group for both ITT and PP samples.

| Demographic | Category | ITT (N=299) | PP (N=169) |
|---|---|---|---|
| Age, Mean (SD) | - | 39.8 (12.8) | 41.7 (11.8) |
| Gender, n (%) | Female | 240 (80.3) | 137 (81.1) |
| | Male | 46 (15.4) | 26 (15.4) |
| | Other | 4 (1.3) | 2 (1.2) |
| | Not Known | 9 (3.0) | 4 (2.4) |
| Ethnicity, n (%) | White | 266 (89.0) | 155 (91.7) |
| | Mixed | 5 (1.7) | 2 (1.2) |
| | Asian | 14 (4.7) | 6 (3.6) |
| | Black/African/Caribbean/Black British | 3 (1.0) | 1 (0.6) |
| | Other | 2 (0.7) | 1 (0.6) |
| | Prefer not to say | 9 (3.0) | 4 (2.4) |
| Highest Qualification, n (%) | Post-graduate degree level qualification | 103 (34.4) | 65 (38.5) |
| | Degree level qualification | 100 (33.4) | 59 (34.9) |
| | Qualifications below degree level | 84 (28.1) | 41 (24.3) |
| | No formal qualifications | 2 (0.7) | 1 (0.6) |
| | Don’t know | 7 (2.3) | 2 (1.2) |
| | Other | 1 (0.3) | - |
| | Prefer not to say | 2 (0.7) | 1 (0.6) |
| Disability, n (%) | Disability | 56 (18.7) | 33 (19.5) |
| | No Perceived Disability | 232 (77.6) | 132 (78.1) |
| | Prefer not to say | 11 (3.7) | 4 (2.4) |
| Long Term Condition, n (%) | LTC | 114 (38.1) | 70 (41.4) |
| | No LTC | 167 (55.9) | 91 (53.8) |
| | Not Known | 18 (6.0) | 8 (4.7) |
| Religion, n (%) | No religion | 187 (62.5) | 104 (61.5) |
| | Christian | 71 (23.7) | 45 (26.6) |
| | Buddhist | 1 (0.3) | 1 (0.6) |
| | Hindu | 5 (1.7) | 3 (1.8) |
| | Jewish | 3 (1.0) | 1 (0.6) |
| | Muslim | 5 (1.7) | - |
| | Sikh | 1 (0.3) | 1 (0.6) |
| | Other | 11 (3.7) | 7 (4.1) |
| | Prefer not to say | 15 (5.0) | 7 (4.1) |
| Sexual Orientation, n (%) | Heterosexual | 237 (79.3) | 132 (78.1) |
| | Gay/Lesbian | 7 (2.3) | 5 (3.0) |
| | Bi-sexual | 32 (10.7) | 22 (13.0) |
| | Other sexual orientation not listed | 7 (2.3) | 2 (1.2) |
| | Don’t know | 11 (3.7) | 4 (2.4) |
| | Prefer not to say | 5 (1.7) | 4 (2.4) |
| Employment Status, n (%) | Employed | 241 (80.6) | 144 (85.2) |
| | Unemployed and actively seeking work | 7 (2.3) | 2 (1.2) |
| | Not working and not actively seeking work | 39 (13.0) | 19 (11.2) |
| | Prefer not to say | 12 (4.0) | 4 (2.4) |

32%. Descriptive statistics of engagement with the program across modules are outlined in Table 2.

To determine if adherence across sessions differed between the groups, we compared adherence rates between the digital program, face-to-face therapy and typed therapy based on session completion. Confidence intervals for estimates of the adherence rate were overlapping indicating no difference in adherence rates across groups (Figure 3; Supplementary Table 3).

To investigate potential drivers of program adherence, demographic and study factors were associated with i) adherence, defined as number of completed sessions, and ii) non-adherence, defined as not being included in the PP sample. The number of completed sessions in the program did not show any significant associations with demographic or study factors (linear regression: F(24, 274) = 1.4, p = .11, adjusted R² = 0.03; Supplementary Table 4). Age was associated with non-adherence, with younger participants less likely to be in the PP sample (logistic regression: OR = 0.97, p = .009; Supplementary Table 5).

On average, across the intervention sample, there was a large, clinically meaningful reduction in anxiety symptoms from baseline to final score (PP: mean GAD-7 change = –7.4, d = 1.6; ITT: mean GAD-7 change = –5.4, d = 1.1). This reduction was significantly greater than that found for the waiting control (mean GAD-7 change = –1.9; PP between-subject effect: p <.001, d = 1.3; ITT between-subject effect: p <.001, d = 0.8), and statistically non-inferior to the propensity-matched face-to-face therapy control (PP: mean GAD-7 change = –6.4, non-inferiority effect p <.001; ITT: mean GAD-7 change = –6.0, non-inferiority effect p = .002). For the propensity matched typed-therapy control, the intervention was significantly non-inferior for the PP sample (mean GAD-7 change: –7.5; non-inferiority p <.001), and for the ITT sample the effect was approaching significance (mean GAD-7 change = –6.6, p = .06; Figure 4, Table 3). Clinical outcomes for all groups are reported in Supplementary Table 6.

The trajectory for mean reduction in anxiety symptoms was steeper following the earlier program modules (Figure 5; Supplementary Table 7). When stratified by baseline GAD-7 severity into mild, moderate, and severe groups, the severe group showed the greatest reduction in anxiety symptoms (PP (n=48): mean change on GAD-7 = –10.7, d = 2.0; ITT (n=87): mean change on GAD-7 = –7.9, d = 1.3; Supplementary Table 8). By the end of the program, the moderate and severe groups showed a mean GAD-7 score in the mild range. The clinical effect was sustained at one-month follow-up (Figure 5). Between final score and one month follow-up, there was no change in GAD-7 mean score for the PP (n=166) and ITT (n=210) samples with follow-up data (both sample mean difference = 0.0; Supplementary Table 7).

Table 2. Engagement metrics for the digital program.

| Sample | N | Median time since initialization (days) | Median time interacting in program (hours) |
|---|---|---|---|
| Per-protocol sample total | 169 | 59.6 | 8.7 |
| Intention-to-treat sample total | 299 | 53.1 | 6.1 |
| Engaged sample total (up to module 2 session 1) | 232 | 14.0 | 2.0 |
| All participants by milestone | | | |
| Module 1 check-in | 284 | 0.0 | 0.03 |
| Module 2 check-in | 240 | 13.6 | 1.5 |
| Module 3 check-in | 209 | 23.9 | 2.7 |
| Module 4 check-in | 180 | 35.0 | 4.1 |
| Module 5 check-in | 138 | 42.9 | 5.0 |
| Module 6 check-in | 113 | 49.5 | 5.4 |

Median days since initialization was calculated as number of days since software download at onboarding for each sample: PP, ITT and the engaged sample (i.e. those who completed up to session 1 of module 2). Metrics are also shown for each symptom check-in at the beginning of each module.

Figure 3. Adherence with program progression overlaid with adherence across therapy sessions for the control groups. For each group, adherence was defined based on the proportion of participants who completed each GAD-7 assessment (“symptom check”) throughout their journey. Baseline was 100%, i.e. all participants/patients attended a clinical assessment and had a baseline GAD-7 score. For the ieso Digital Program group, each symptom check-in was at the beginning of each module within the program software (total 6 instances in program). To complete each symptom check-in within the program, participants had to finish the previous module. For the therapy control groups, patients completed each GAD-7 assessment as part of each attended treatment session (either face-to-face or typed) up to 6 treatment sessions. Within NHS TT every attended treatment session includes a GAD-7 assessment. Adherence rates across sessions were not significantly different between groups (Supplementary Table 3).
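As an illustration of the adherence comparison described for Figure 3, the sketch below computes the proportion of participants completing each symptom check-in and fits a least-squares slope against check-in number. The counts are the Table 2 milestone values; using the ITT sample (N=299) as the denominator and a plain ordinary least-squares fit are assumptions of this sketch, since the exact model and confidence-interval method used in the study are not specified here.

```python
# Check-in completion counts taken from Table 2 ("All participants by milestone").
# Using the ITT sample (N=299) as the denominator is an assumption of this sketch.
ITT_N = 299
checkin_counts = [284, 240, 209, 180, 138, 113]  # module 1..6 check-ins


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


sessions = list(range(1, len(checkin_counts) + 1))
adherence = [count / ITT_N for count in checkin_counts]
slope = ols_slope(sessions, adherence)

for session, rate in zip(sessions, adherence):
    print(f"check-in {session}: {rate:.1%} completed")
print(f"slope: {slope:.3f} per check-in")
```

Under these assumptions the fitted slope is roughly -0.11, i.e. about eleven percentage points fewer participants completing each successive check-in. Comparing groups would require the same calculation on the comparator data, which is not reproduced in this section.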


Figure 4. Change in anxiety symptoms from baseline to final score for all groups. A) Mean change (final score – baseline) in GAD-7 scores for the
PP sample (n=169), propensity-matched waiting control group, face-to-face CBT group, and typed CBT group. B) Mean change in GAD-7 scores
for the ITT sample (n=299) and a propensity-matched waiting control group, face-to-face CBT group, and typed CBT group. C) Mean GAD-7
scores at baseline and final score with 95% confidence intervals for the PP sample (n=169) and propensity-matched waiting control group, face-
to-face CBT group, and typed CBT group. D) Mean GAD-7 scores at baseline and final score with 95% confidence intervals for the ITT sample
(n=299) and propensity-matched waiting control group, face-to-face CBT group, and typed CBT group. *** = p < .001, ** = p <.005
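The logic of a non-inferiority comparison such as those summarised in Figure 4 can be sketched with a one-sided test on summary statistics. This is not the authors' analysis: the margin below is hypothetical (the pre-specified margin is not stated in this section), the groups are treated as independent rather than matched, and a normal approximation is used, so the resulting p-value should not be expected to reproduce the reported ones. The input values are the ITT means, SDs and sample sizes for the ieso Digital Program and face-to-face CBT from Table 3.

```python
import math


def noninferiority_z_test(mean_new, sd_new, n_new, mean_ref, sd_ref, n_ref, margin):
    """One-sided z-test of H0: (mean_new - mean_ref) >= margin.

    Change scores are negative for improvement, so a positive difference
    means the new treatment reduced symptoms less than the reference.
    Rejecting H0 supports non-inferiority within the chosen margin.
    """
    diff = mean_new - mean_ref
    se = math.sqrt(sd_new**2 / n_new + sd_ref**2 / n_ref)
    z = (diff - margin) / se
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))  # lower-tail normal probability
    return diff, z, p_value


# Hypothetical margin in GAD-7 points, chosen only for illustration.
HYPOTHETICAL_MARGIN = 2.0

diff, z, p_value = noninferiority_z_test(
    mean_new=-5.4, sd_new=5.1, n_new=299,   # ITT ieso Digital Program (Table 3)
    mean_ref=-6.0, sd_ref=4.9, n_ref=299,   # ITT face-to-face CBT (Table 3)
    margin=HYPOTHETICAL_MARGIN,
)
print(f"difference in mean change: {diff:.1f}, z = {z:.2f}, one-sided p = {p_value:.4f}")
```

The choice of margin drives the conclusion: a smaller margin makes non-inferiority harder to demonstrate, which is why it is normally fixed in the statistical analysis plan before the data are examined.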

The associations between participant demographics, study factors and change in GAD-7 score were explored with a linear regression: F(24, 274) = 3.45, p < .001, adjusted R² of 0.16. Greater reductions in GAD-7 scores were associated with higher baseline GAD-7 scores (β = 0.70, SE = 0.09, t = 7.6, p < .001), and higher baseline age (β = 0.07, SE = 0.02, t = 3.0, p = .003) (Supplementary Table 9), such that more severe, older participants saw a larger change in GAD-7 score.

As intended, given the specificity of the program for targeting symptoms of generalized anxiety, there was a statistically significant yet smaller effect for low mood symptoms as measured with the PHQ-9 (PP: mean PHQ-9 change = –3.1, d = 0.7; ITT: mean PHQ-9 change = –1.6, d = 0.3) (Table 4). This mean change was significantly greater than the mean change in the waiting control group for the PP sample (mean PHQ-9 change = –1.0, between-subject effect, p < .001, d = 0.5), but not for the ITT sample (p = .11, d = 0.1). Despite this, PHQ-9 remission rate (based on n=80 above the clinical cut-off at baseline) was 78.8% (Supplementary Table 6). Participants with severe and moderate baseline GAD-7 scores experienced the largest improvement in PHQ-9 scores (Supplementary Table 8). There was minimal mean change in scores between post intervention and follow-up for both PP and ITT samples (PP mean difference = 0.5; ITT mean difference = 0.4) (Supplementary Table 10).


Table 3. Change in GAD-7 score from baseline to final score for all groups.

| Sample | Comparator | N | Baseline mean | Baseline SD | Change mean | Change SD | Lower 95% CI | Upper 95% CI | Within-subject effect size (d) |
|---|---|---|---|---|---|---|---|---|---|
| Per-protocol | Waiting control | 169 | 12.5 | 3.3 | –1.9 | 4.0 | –1.3 | –2.5 | 0.5 |
| Per-protocol | ieso Digital Program | 169 | 12.4 | 3.4 | –7.4 | 4.6 | –6.7 | –8.1 | 1.6 |
| Per-protocol | Face-to-face CBT | 253 | 13.0 | 3.1 | –6.4 | 4.8 | –5.8 | –7.0 | 1.3 |
| Per-protocol | Typed CBT | 229 | 12.5 | 3.4 | –7.5 | 4.1 | –7.0 | –8.0 | 1.8 |
| Intention-to-treat | ieso Digital Program | 299 | 12.5 | 3.3 | –5.4 | 5.1 | –4.8 | –6.0 | 1.1 |
| Intention-to-treat | Face-to-face CBT | 299 | 12.9 | 3.1 | –6.0 | 4.9 | –5.5 | –6.6 | 1.2 |
| Intention-to-treat | Typed CBT | 299 | 12.6 | 3.5 | –6.6 | 4.6 | –6.1 | –7.1 | 1.4 |

Mean difference in GAD-7 score was calculated between baseline and final score for the intervention group (“ieso Digital Program”) and all propensity-matched comparator arms: waiting control; face-to-face CBT; and typed-CBT. A negative mean difference denotes a reduction in GAD-7 total scores. Within-subject effect sizes (Cohen’s d) were estimated for the mean change in GAD-7 scores for each group. Change scores were calculated for PP and ITT samples.
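The within-subject effect sizes in Table 3 can be approximately recovered from the tabulated change scores. The sketch below assumes that d is the absolute mean change divided by the standard deviation of the change scores; the exact estimator used in the study is not stated here, but this assumption agrees with every reported value to one decimal place.

```python
# (group label, mean change, SD of change, reported d), copied from Table 3.
table3_rows = [
    ("PP waiting control", -1.9, 4.0, 0.5),
    ("PP ieso Digital Program", -7.4, 4.6, 1.6),
    ("PP face-to-face CBT", -6.4, 4.8, 1.3),
    ("PP typed CBT", -7.5, 4.1, 1.8),
    ("ITT ieso Digital Program", -5.4, 5.1, 1.1),
    ("ITT face-to-face CBT", -6.0, 4.9, 1.2),
    ("ITT typed CBT", -6.6, 4.6, 1.4),
]


def within_subject_d(mean_change, sd_change):
    """Cohen's d for change scores, assuming d = |mean change| / SD of change."""
    return abs(mean_change) / sd_change


for label, mean_change, sd_change, reported_d in table3_rows:
    estimate = within_subject_d(mean_change, sd_change)
    print(f"{label}: computed d = {estimate:.2f}, reported d = {reported_d}")
```

Because the table reports means and SDs to one decimal place, small discrepancies from the published d values would be expected from rounding alone.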

Figure 5. Mean reduction in anxiety symptoms across digital program. Mean GAD-7 score for each time-point for all participants that completed the questionnaires at each time-point. Trajectories split by GAD-7 baseline severity: mild, moderate and severe (Supplementary Table 7).

There was a significant improvement in work and social functioning measured using the WSAS from baseline to final score for the intervention group (PP: mean WSAS change = –5.3, d = 0.9; ITT (n=295): mean WSAS change = –4.7, d = 0.7) (Table 4). This mean change was significantly greater than the mean change in the waiting control group (mean WSAS change = –0.1; PP between-subject effect p < .001, d = 1.2; ITT between-subject effect, p < .001, d = 0.8). The largest changes in functioning were for severe and moderate anxiety groups (Supplementary Table 8).

The digital program was well tolerated, and no serious adverse events were identified during the study. There was one report of migraine and two reports of insomnia. There were 10 software deficiencies that occurred (for 7 participants; 90% prior to the software update) for reasons such as technical issues or difficulties with the conversational agent understanding users. In all instances participants were offered an appointment to discuss any potential impact of this on their mental health and reminded of their right to withdraw. These instances resulted in one active participant withdrawal. Across the study, 10 participants were withdrawn by a study clinician following a conversation between the participant and a study clinician. These withdrawals were linked to the study exclusion criteria and suitability for the program rather than the safety of the intervention.

This study demonstrates that an evidence-based, human-supported digital intervention for adults with mild, moderate and severe anxiety produced a large clinically meaningful reduction in anxiety symptoms significantly greater than a propensity-matched waiting control and non-inferior to real-world human delivered therapy. Engagement with the program was high and participants adhered to the intervention at a similar rate to the external therapy control groups. The intervention achieved comparable outcomes to human-delivered care with significantly reduced clinician time. By integrating technology and human support, this study demonstrates

Table 4. Change in PHQ-9 and WSAS score from baseline to final score for all groups.

| Measure | Sample | Comparator | N | Baseline mean | Baseline SD | Change mean | Change SD | Lower 95% CI | Upper 95% CI | Within-subject effect size (d) |
|---|---|---|---|---|---|---|---|---|---|---|
| PHQ-9 | Per-protocol | Waiting control | 169 | 8.4 | 3.4 | –1.0 | 3.6 | –0.4 | –1.5 | 0.3 |
| PHQ-9 | Per-protocol | ieso Digital Program | 169 | 8.0 | 3.8 | –3.1 | 4.5 | –2.4 | –3.8 | 0.7 |
| PHQ-9 | Per-protocol | Face-to-face CBT | 253 | 8.5 | 3.7 | –3.0 | 4.8 | –2.4 | –3.6 | 0.6 |
| PHQ-9 | Per-protocol | Typed CBT | 229 | 8.1 | 3.5 | –4.1 | 3.9 | –3.6 | –4.6 | 1.1 |
| PHQ-9 | Intention-to-treat | ieso Digital Program | 299 | 8.0 | 3.7 | –1.6 | 4.8 | –1.1 | –2.1 | 0.3 |
| PHQ-9 | Intention-to-treat | Face-to-face CBT | 299 | 8.4 | 3.6 | –2.7 | 4.8 | –2.2 | –3.3 | 0.6 |
| PHQ-9 | Intention-to-treat | Typed CBT | 299 | 8.1 | 3.6 | –3.3 | 4.2 | –2.9 | –3.8 | 0.8 |
| WSAS | Per-protocol | Waiting control | 153 | 10.6 | 6.1 | –0.1 | 1.3 | 0.1 | –0.3 | 0.1 |
| WSAS | Per-protocol | ieso Digital Program | 169 | 15.3 | 6.4 | –5.3 | 6.2 | –4.4 | –6.2 | 0.9 |
| WSAS | Per-protocol | Face-to-face CBT | 253 | 14.1 | 7.6 | –4.3 | 8.6 | –3.3 | –5.4 | 0.5 |
| WSAS | Per-protocol | Typed CBT | 223 | 10.8 | 6.4 | –4.6 | 5.5 | –3.8 | –5.3 | 0.8 |
| WSAS | Intention-to-treat | ieso Digital Program | 295 | 14.9 | 6.6 | –4.7 | 6.5 | –3.8 | –5.6 | 0.7 |
| WSAS | Intention-to-treat | Face-to-face CBT | 299 | 14.1 | 7.6 | –3.9 | 8.3 | –2.9 | –4.8 | 0.5 |
| WSAS | Intention-to-treat | Typed CBT | 291 | 10.8 | 6.3 | –3.9 | 5.7 | –3.2 | –4.5 | 0.7 |

Mean differences in PHQ-9 and WSAS scores were calculated between baseline and final score for the intervention group (“ieso Digital Program”) and all propensity-matched comparator arms: waiting control; face-to-face CBT; and typed-CBT. A negative mean difference denotes a reduction in scores. Within-subject effect sizes (Cohen’s d) were estimated for the mean change for each group. Change scores were calculated for PP and ITT samples.

the potential to expand global access to high-quality, effective mental healthcare.

The large clinical effect of the digital intervention across participants with moderate or severe symptoms highlights the clinical value of the combined program content and human support. Here, the PP (d = 1.3) and ITT (d = 0.8) effect sizes relative to waitlist are larger than the pooled effect size reported in a recent meta-analysis (n comparisons = 96, g = 0.26) (14). Unlike the PP sample, which is designed to demonstrate the clinical effectiveness of an intervention when the intervention is adhered to, the ITT sample provides an estimate of effectiveness more reflective of the real-world context by accounting for disengagement. The large ITT effect was significantly non-inferior to face-to-face therapy, and approaching significance for non-inferiority to typed-therapy (p = 0.06). Human-delivered care enables greater flexibility to respond to patient concerns and adapt content compared to a digital program, therefore the comparable clinical effects and adherence rates across groups indicate the potential of this digital intervention to significantly impact real-world patient outcomes.

It is important to note that high relapse and recurrence rates have implications for both patient quality of life and economic healthcare costs, therefore ensuring effects are durable is imperative (46–48). Incorporating cognitive and behavioral principles into daily life through practical exercises can enable meaningful behavioral change that persists beyond treatment end (16). Here, both the persistent clinical effect at one month follow-up, and the significant improvement in the impact of anxiety on participants’ day-to-day functioning (as measured with the WSAS) highlight the potential of the intervention to instigate long-lasting behavioral change. Retrospective analysis of recurrence data from electronic health records is needed to accurately measure the persistence of the clinical effect in the real world over a longer follow-up time-period.

The engagement rate of the digital program (78%) and time to reach “engaged” (~2 hours of program interaction over 2 weeks) are comparable to engagement rates and time in therapy observed in NHS TT services for treatment of GAD (70%; 2022-2023) (49). Adherence rates across groups in the study were also similar. Average program interaction time (median 6.1 hours) across the ITT sample was greater than that reported for similar app-based interventions (e.g. median 3.4 hours) (50), indicating high engagement with the program. Study

attrition (32%) is higher than previous reports from studies of conversational agent-delivered mental health interventions (22%) (51), yet similar to real-world global treatment drop-out rates (~20-40%) (52,53). This may be due to the pragmatic design of the study: 30% of the sample recruited through ieso’s therapy referrals could choose to withdraw at any time and immediately access 1:1 human-delivered therapy; and participants had the option to discuss their progress or any issues with the clinical team at any point. These factors could have increased withdrawal rates more than previous studies, but more readily reflect real-world patient choice and clinical decision-making.

To our knowledge, this study is the first to compare the effectiveness of a digital intervention to standard of care using external propensity-matched comparator groups from real-world patient data. There is increasing acceptability for the use of externally controlled clinical trials (54–57) made possible by the availability of large-scale, standardized datasets. Generating external comparator groups reduces patient burden and study costs, and avoids delaying treatment for the comparator group receiving no intervention (58). However, creating standard of care control arms that are directly comparable to a novel intervention remains difficult due to differences in how to define comparable doses, treatment completion and account for study-specific assessments. Moreover, the lack of randomization means selection bias and study effects are not controlled for in the current study. Nevertheless, this is more reflective of real-world care where treatment outcomes are biased by a patient’s preference and choice over their treatment.

The clinical effect and engagement rate reported in the current study could have been driven by a combination of the three key features of the digital intervention: i) a curated and structured evidence-based program, ii) a conversational agent to deliver the program content, and iii) a human user and clinical support model akin to standard healthcare delivery. First, the structured evidence-based program was curated by a team of accredited cognitive behavioral therapists with an average of 14 years direct clinical experience. The program used principles from traditional CBT (22) including third wave approaches, such as ACT. This approach encourages individuals to accept their thoughts and feelings while committing to actions aligned with their values. There is a growing body of evidence indicating that ACT demonstrates comparable effectiveness to other forms of CBT for anxiety disorders (59–61), and has been shown to be acceptable and engaging within a digital program for GAD (62,63).

Second, a conversational agent was used to personalize the content delivery and enhance engagement. Despite rapid growth in the development of AI conversational agents, use of this technology remains rare in digital mental health interventions, with only ~5% using this technology (14). The majority of these systems employ a tree-based dialogue approach, where natural language processing analyzes user input, and responses are selected from a predefined set of pre-written answers. However, previous research has shown users find this frustrating, particularly when it feels the agent does not understand them (64,65). Recent advances in the development of large language models now make it possible to flexibly generate personalized language for a more engaging user experience. In the current study, the digital program primarily used a tree-based dialogue system with controlled use of natural language generation in specific instances to enhance engagement. Increased use of generative technology and reduced reliance on tree-based approaches will continually improve the capability of conversational agents to create a personalized and engaging experience. However, allowing fully autonomous language generation within the context of mental health, where patient problems can be nuanced, complex and require the consideration of social and cultural contexts, poses a high risk for patient harm and misuse (66). Stringent validation of these new AI technologies with a phased roll out alongside human oversight will be essential to ensure patient safety (67).

Finally, a ‘blended’ design of human support and conversational technology has been suggested to be key for maximizing real-world engagement (51). Previous research has highlighted lack of trust, lack of user-centric design, privacy concerns, poor usability, and being unhelpful in emergencies as key drivers of poor engagement with digital interventions (12). To address these concerns, we mirrored a real-world treatment model including user support services, clinician referral to the program, proactive symptom monitoring and clinician availability for collaborative decision-making with each participant. This service created a credible and trustworthy patient experience that we believe positively impacted patient outcomes. Although this study was not designed to demonstrate the economic value of the intervention, the average clinician time spent per participant was <2 hours, which is significantly lower than current standards of care globally: approximately 4 times less than an average episode of treatment in the UK for GAD (~8 appointments between 45-60 mins; NHS Digital 2021-2022) (49) and approximately 8 times less globally (~15 appointments; mean across reported naturalistic studies in (68)). This new model, combining an AI-driven program with clinical support, allows the current, limited supply of trained therapists to help more people than current standards of care.

Limitations of the current study include the use of compensation for time for those who volunteered to participate, and the selection of a sample with limited low mood symptoms. In particular, in line with the study exclusion criteria, individuals with severe depression symptoms were not included. Nevertheless, the propensity-matching across groups controls for this, i.e. all groups included patients with similar baseline anxiety and depression symptoms. Differences in PP sample sizes across the control groups were likely driven by the definition of PP in each context rather than engagement, given similar adherence rates across the groups. Defining a comparable PP sample across groups is challenging due to differences in dose intensity, delivery mechanism and data collected, as well as significant variation across patients in both clinical presentation of generalized anxiety and response to treatment. The PP samples for the therapy control groups were based on completed episodes of care, therefore were agnostic of therapy dose and would have included those who received a low number of sessions and recovered quickly. Those individuals would not have been included in the intervention PP sample which was conservatively defined based on minimum program interaction.

There were also limitations in terms of the diversity of the intervention sample, with enrolled participants predominantly white, highly educated, and female. This sample is reflective of the typical profile of GAD patients in the UK and US (49,69). Although we attempted to increase diversity in this sample through focused marketing campaigns, these efforts were not successful. Needs differ across individuals, conditions and contexts, and a greater understanding of the barriers to research participation is required to fully understand these needs, particularly where groups have been systematically excluded from research, and where there is stigma around mental health. Increasing access to mental health support could play a substantial role in addressing unmet need in underserved groups, therefore future work will aim to evidence the inclusivity of this digital intervention and its potential to counter existing health inequalities.

flexible human language will soon be widely accessible. This accessibility will radically change how individuals seek mental health support. Our responsibility lies in leveraging these advances, addressing the ethical and social challenges inherent with AI, and combining the best of technology with the best of clinical care to increase access to effective, safe and engaging mental health support for everyone. Rigorous evidence, particularly to understand the optimal blend of human and computer support for different individuals, will be key to accelerate precision treatment, maintain scalability, maximize uptake and adherence, and integrate digital interventions into health systems.

We extend our gratitude to the patients who participated in the study and to the dedicated clinicians and support staff involved. We would like to thank Gerald Chan, Stephen Bruso, Andy Richards, Ann Hayes, David Icke, Michael Black, Clare Hurley, Florian Erber, Richard Marsh, Sam Williams, Jo Parfrey for their support and encouragement. We are grateful to Prof Thalia Eley for introducing us to NIHR BioResource. We thank NIHR BioResource volunteers for their participation, and gratefully acknowledge NIHR BioResource centers, NHS Trusts and staff for their contribution. We thank the National Institute for Health and Care Research, NHS Blood and Transplant, and Health Data Research UK as part of the Digital Innovation Hub Program. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. We thank Dorset Healthcare University NHS Foundation Trust (DHC) for providing external data for comparison.

This research was funded by ieso Digital Health Ltd.

Chief Investigator (EMa) and other investigators (CEP, EMi, GW, MPE, EC, SL, AS, CH, JY, MB, LM, SM, RC, VT, AC, AW, AB) are employees of ieso Digital Health Limited (the company funding this research) or its subsidiaries. None of these authors had a direct financial incentive related to the results of this study or the publication of the manuscript.

CEP, EMa, AW & AB conceptualized the study. CEP and EMi drafted the paper. GW, EMi, MPE, EC, MB, AS & AC contributed to data analyses and interpretation. SL, JY, CH, EMa & CEP conducted the study. All authors contributed to the interpretation of results and paper revision, and approved the final version.

In conclusion, this study demonstrates that a digital
intervention, designed for adults with symptoms of
generalized anxiety, produces comparable outcomes to
human-delivered CBT while significantly reducing the Owing to the potential risk of patient identification, and following
data privacy policies at ieso and DHC, individual-level data are not
required clinician time. This result indicates the potential
available. Aggregated data are available upon request, subject to a
for digital interventions to provide high quality, evidence- data-sharing agreement with ieso and DHC. Data requests should
based care at scale to address unmet need worldwide. As be sent to the corresponding author and will be responded to within
AI technologies rapidly progress, it is evident that 30 days.
generative dialogue systems that emulate creative and

1. World Health Organisation. Mental Health | Key facts [Internet]. 2022 [cited 2024 Jun 24]. Available from: https://www.who.int/news-room/fact-sheets

2. Alonso J, Liu Z, Evans-Lacko S, Sadikova E, Sampson N, Chatterji S, et al. Treatment gap for anxiety disorders is global: Results of the World Mental Health Surveys in 21 countries. Depress Anxiety. 2018 Mar 1;35(3):195–208.

3. Our World in Data. Psychiatrists per 100,000 people [Internet]. 2024 [cited 2024 Jul 3]. Available from: https://ourworldindata.org/grapher/psychiatrists-working-in-the-mental-health-sector

4. Health Resources & Services Administration. Health Workforce Shortage Areas [Internet]. 2024 [cited 2024 Jul 3]. Available from: https://data.hrsa.gov/topics/health-workforce/shortage-areas

5. Roland J, Lawrance E, Insel T, Christensen H. THE DIGITAL MENTAL HEALTH REVOLUTION TRANSFORMING CARE THROUGH INNOVATION AND SCALE-UP: WISH 2020 Forum on Mental Health and Digital Technologies. 2020.

6. Clay R. Mental health apps are gaining traction [Internet]. 2021 [cited 2024 Jul 3]. Available from: https://www.apa.org/monitor/2021/01/trends-mental-health-apps

7. Torous J, Roberts LW. Needed innovation in digital health and smartphone applications for mental health transparency and trust. Vol. 74, JAMA Psychiatry. American Medical Association; 2017. p. 437–8.

8. Lattie EG, Stiles-Shields C, Graham AK. An overview of and recommendations for more accessible digital mental health services. Vol. 1, Nature Reviews Psychology. Nature Publishing Group; 2022. p. 87–100.

9. Borghouts J, Eikey E, Mark G, De Leon C, Schueller SM, Schneider M, et al. Barriers to and facilitators of user engagement with digital mental health interventions: Systematic review. J Med Internet Res. 2021 Mar 1;23(3).

10. M. Ng M, Firth J, Minen M, Torous J. User engagement in mental health apps: A review of measurement, reporting, and validity. Psychiatric Services. 2019;70(7):538–44.

11. Michie S, Yardley L, West R, Patrick K, Greaves F. Developing and evaluating digital interventions to promote behavior change in health and health care: Recommendations resulting from an international workshop. J Med Internet Res. 2017;19(6).

12. Torous J, Nicholas J, Larsen ME, Firth J, Christensen H. Clinical review of user engagement with mental health smartphone apps: Evidence, theory and improvements. Vol. 21, Evidence-Based Mental Health. BMJ Publishing Group; 2018. p. 116–9.

13. Tafradzhiyski N. Business of Apps | Mobile App Retention. [Internet]. 2023 [cited 2024 Jun 24]. Available from: https://www.businessofapps.com/guide/mobile-app-retention/

14. Linardon J, Torous J, Firth J, Cuijpers P, Messer M, Fuller-Tyszkiewicz M. Current evidence on the efficacy of mental health smartphone apps for symptoms of depression and anxiety. A meta-analysis of 176 randomized controlled trials. World Psychiatry. 2024 Feb 1;23(1):139–49.

15. David D, Cristea I, Hofmann SG. Why cognitive behavioral therapy is the current gold standard of psychotherapy. Front Psychiatry. 2018 Jan 29;9(JAN).

16. The National Collaborating Centre for Mental Health. The Improving Access to Psychological Therapies Manual The Improving Access to Psychological Therapies Manual - Appendices and helpful resources. 2018.

17. Ewbank MP, Cummins R, Tablan V, Catarino A, Buchholz S, Blackwell AD. Understanding the relationship between patient language and outcomes in internet-enabled cognitive behavioural therapy: A deep learning approach to automatic coding of session transcripts. Psychotherapy Research. 2020;1–13.

18. Ewbank MP, Cummins R, Tablan V, Bateup S, Catarino A, Martin AJ, et al. Quantifying the Association between Psychotherapy Content and Clinical Outcomes Using Deep Learning. JAMA Psychiatry. 2019 Jan 1;77(1):35–43.

19. Huckvale K, Venkatesh S, Christensen H. Toward clinical digital phenotyping: a timely opportunity to consider purpose, quality, and safety. Vol. 2, npj Digital Medicine. Nature Publishing Group; 2019.

20. Catarino A, Harper S, Malcolm R, Stainthorpe A, Warren G, Margoum M, et al. Economic evaluation of 27,540 patients with mood and anxiety disorders and the importance of waiting time and clinical effectiveness in mental healthcare. Nature Mental Health. 2023 Aug 31;1(9):667–78.

21. Taylor HL, Menachemi N, Gilbert A, Chaudhary J, Blackburn J. Economic Burden Associated with Untreated Mental Illness in Indiana. JAMA Health Forum. 2023 Oct 13;4(10):E233535.

22. Fenn K, Byrne M. The key principles of cognitive behavioural therapy. InnovAiT: Education and inspiration for general practice. 2013 Sep;6(9):579–85.

23. Wilson K, Hayes S, Strosahl K. Acceptance and commitment therapy: an experiential approach to behavior change. New York: Guilford Press; 2003.

24. Gilbody S, Brabyn S, Lovell K, Kessler D, Devlin T, Smith L, et al. Telephone-supported computerised cognitive-behavioural therapy: REEACT-2 large-scale pragmatic randomised controlled trial. British Journal of Psychiatry. 2017 May 1;210(5):362–7.

25. Schulz KF, Altman DG, Moher D. CONSORT 2010 Statement: Updated guidelines for reporting parallel group randomised trials. BMJ (Online). 2010 Mar 27;340(7748):698–702.

26. Spitzer RL, Kroenke K, Williams JBW, Löwe B. A Brief Measure for Assessing Generalized Anxiety Disorder The GAD-7. Arch Intern Med. 2006;166:1092–7.

27. Kroenke K, Spitzer RL. The PHQ-9: A New Depression Diagnostic and Severity Measure. Psychiatr Ann. 2002 Sep;32(9):509–15.

28. Mundt JC, Marks IM, Shear MK, Greist JH. The Work and Social Adjustment Scale: A simple measure of impairment in functioning. British Journal of Psychiatry. 2002;180(MAY):461–4.

29. Rolffs JL, Rogge RD, Wilson KG. Disentangling Components of Flexibility via the Hexaflex Model: Development and Validation of the Multidimensional Psychological Flexibility Inventory (MPFI). Assessment. 2018 Jun 1;25(4):458–82.

30. O’Brien HL, Cairns P, Hall M. A practical approach to measuring user engagement with the refined user engagement scale (UES) and new UES short form. International Journal of Human Computer Studies. 2018 Apr 1;112:28–39.

31. Brooke J. Usability Evaluation in Industry. 1st Edition. 1996.

32. Hirani SP, Rixon L, Beynon M, Cartwright M, Cleanthous S, Selva A, et al. Quantifying beliefs regarding telehealth: Development of the Whole Systems Demonstrator Service User Technology Acceptability Questionnaire. J Telemed Telecare. 2017;23(4):460–9.

33. Hayes S, Follette V, Linehan M. Mindfulness and Acceptance: Expanding the Cognitive-Behavioral Tradition. Guilford Press; 2011.

34. Berg H, Akeman E, McDermott TJ, Cosgrove KT, Kirlic N, Clausen A, et al. A randomized clinical trial of behavioral activation and exposure-based therapy for adults with generalized anxiety disorder. Journal of Mood and Anxiety Disorders. 2023 Jun;1:100004.

35. Beatty C, Malik T, Meheli S, Sinha C. Evaluating the Therapeutic Alliance With a Free-Text CBT Conversational Agent (Wysa): A Mixed-Methods Study. Front Digit Health. 2022 Apr 11;4.

36. Boucher E, Honomichl R, Ward H, Powell T, Stoeckl SE, Parks A. The Effects of a Digital Well-being Intervention on Older Adults: Retrospective Analysis of Real-world User Data. JMIR Aging. 2022 Jul 1;5(3).

37. Cliffe B, Croker A, Denne M, Stallard P. Supported Web-Based Guided Self-Help for Insomnia for Young People Attending Child and Adolescent Mental Health Services: Protocol for a Feasibility Assessment. JMIR Res Protoc. 2018 Dec 1;7(12).

38. Robinson E, Titov N, Andrews G, McIntyre K, Schwencke G, Solley K. Internet treatment for generalized anxiety disorder: A randomized controlled trial comparing clinician vs. technician assistance. PLoS One. 2010;5(6).

39. Titov N, Andrews G, Robinson E, Schwencke G, Johnston L, Solley K, et al. Clinician-assisted Internet-based treatment is effective for generalized anxiety disorder: randomized controlled trial. Australian & New Zealand Journal of Psychiatry [Internet]. 2009;43(10):905–12. Available from: www.virtualclinic.org.au

40. Titov N, Dear BF, Johnston L, Lorian C, Zou J, Wootton B, et al. Improving Adherence and Clinical Outcomes in Self-Guided Internet Treatment for Anxiety and Depression: Randomised Controlled Trial. PLoS One. 2013 Jul 3;8(7).

41. Rothmann MD, Wiens BL, Chan ISF. Design and Analysis of Non-Inferiority Trials. Chapman and Hall/CRC; 2016.

42. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016.

43. Toussaint A, Hüsing P, Gumz A, Wingenfeld K, Härter M, Schramm E, et al. Sensitivity to change and minimal clinically important difference of the 7-item Generalized Anxiety Disorder Questionnaire (GAD-7). J Affect Disord. 2020 Mar 15;265:395–401.

44. Clark DM. Realizing the Mass Public Benefit of Evidence-Based Psychological Therapies: The IAPT Program. Annu Rev Clin Psychol. 2018 May 7;14:159–83.

45. Ho DE, Imai K, King G, Stuart EA. MatchIt: Nonparametric Preprocessing for Parametric Causal Inference [Internet]. Vol. 42, JSS Journal of Statistical Software. 2011. Available from: http://www.jstatsoft.org/

46. Ali S, Rhodes L, Moreea O, McMillan D, Gilbody S, Leach C, et al. How durable is the effect of low intensity CBT for depression and anxiety? Remission and relapse in a longitudinal cohort study. Behaviour Research and Therapy. 2017 Jul 1;94:1–8.

47. Delgadillo J, Rhodes L, Moreea O, McMillan D, Gilbody S, Leach C, et al. Relapse and Recurrence of Common Mental Health Problems after Low Intensity Cognitive Behavioural Therapy: The WYLOW Longitudinal Cohort Study. Psychother Psychosom. 2018 Mar 1;87(2):116–7.

48. Shallcross AJ, Willroth Aaron Fisher EC, Dimidjian S, Gross JJ, Visvanathan Manhattan Mindfulness-Based Cognitive Behavioral Therapy Iris B Mauss PD. Relapse/Recurrence Prevention in Major Depressive Disorder: 26-Month Follow-Up of Mindfulness-Based Cognitive Therapy Versus an Active Control ScienceDirect. Behav Ther [Internet]. 2018; Available from: www.sciencedirect.comwww.elsevier.com/locate/bt

49. NHS Digital. NHS Digital. 2024 [cited 2024 May 24]. NHS Talking Therapies, for anxiety and depression, Annual reports, 2022-23. Available from: https://digital.nhs.uk/data-and-information/publications/statistical/nhs-talking-therapies-for-anxiety-and-depression-annual-reports/2022-23#resources

50. Richards D, Enrique A, Eilert N, Franklin M, Palacios J, Duffy D, et al. A pragmatic randomized waitlist-controlled effectiveness and cost-effectiveness trial of digital interventions for depression and anxiety. NPJ Digit Med. 2020 Dec 1;3(1).
51. Jabir AI, Lin X, Martinengo L, Sharp G, Theng YL, Car LT. Attrition in Conversational Agent-Delivered Mental Health Interventions: Systematic Review and Meta-Analysis. J Med Internet Res. 2024 Jan 1;26(1).

52. Olfson M, Marcus S. National Trends in Outpatient Psychotherapy. Am J Psychiatry. 2010;167(12):1456–63.

53. Wells JE, Browne MO, Aguilar-Gaxiola S, Al-Hamzawi A, Alonso J, Angermeyer MC, et al. Drop out from out-patient mental healthcare in the World Health Organization’s World Mental Health Survey initiative. British Journal of Psychiatry. 2013 Jan;202(1):42–9.

54. U.S. Department of Health and Human Services Food and Drug Administration. Considerations for the Design and Conduct of Externally Controlled Trials for Drug and Biological Products Guidance for Industry DRAFT GUIDANCE [Internet]. 2023. Available from: https://www.fda.gov/vaccines-blood-biologics/guidance-compliance-regulatory-information-biologics/biologics-guidances

55. National Institute for Health and Care Excellence. NICE real-world evidence framework [Internet]. 2022. Available from: www.nice.org.uk/corporate/ecd9

56. Thorlund K, Dron L, Park JJH, Mills EJ. Synthetic and external controls in clinical trials – A primer for researchers. Clin Epidemiol. 2020;12:457–67.

57. Corrigan-Curay J, Sacks L, Woodcock J. Real-world evidence and real-world data for evaluating drug safety and effectiveness. Vol. 320, JAMA - Journal of the American Medical Association. American Medical Association; 2018. p. 867–8.

58. Patterson B, Boyle MH, Kivlenieks M, Van Ameringen M. The use of waitlists as control conditions in anxiety disorders research. J Psychiatr Res. 2016 Dec 1;83:112–20.

59. American Psychological Association. DIAGNOSIS: Mixed Anxiety Conditions TREATMENT: Acceptance And Commitment Therapy For Mixed Anxiety Disorders [Internet]. 2015 [cited 2024 Jun 24]. Available from: https://div12.org/treatment/acceptance-and-commitment-therapy-for-mixed-anxiety-disorders/

60. Han A, Kim TH. Efficacy of Internet-Based Acceptance and Commitment Therapy for Depressive Symptoms, Anxiety, Stress, Psychological Distress, and Quality of Life: Systematic Review and Meta-analysis. J Med Internet Res. 2022 Dec 1;24(12).

61. Papola D, Miguel C, Mazzaglia M, Franco P, Tedeschi F, Romero SA, et al. Psychotherapies for Generalized Anxiety Disorder in Adults: A Systematic Review and Network Meta-Analysis of Randomized Clinical Trials. JAMA Psychiatry. 2024 Mar 6;81(3):250–9.

62. Kelson J, Rollin A, Ridout B, Campbell A. Internet-delivered acceptance and commitment therapy for anxiety treatment: Systematic review. Vol. 21, Journal of Medical Internet Research. JMIR Publications Inc.; 2019.

63. Hemmings NR, Kawadler JM, Whatmough R, Ponzo S, Rossi A, Morelli D, et al. Development and feasibility of a digital acceptance and commitment therapy-based intervention for generalized anxiety disorder: Pilot acceptability study. JMIR Form Res. 2021 Feb 1;5(2).

64. Coghlan S, Leins K, Sheldrick S, Cheong M, Gooding P, D’Alfonso S. To chat or bot to chat: Ethical issues with using chatbots in mental health. Digit Health. 2023 Jan;9.

65. Huang YS (Sandy), Dootson P. Chatbots and service failure: When does it lead to customer aggression. Journal of Retailing and Consumer Services. 2022 Sep 1;68.

66. Nuffield Council on Bioethics. The role of technology in mental healthcare [Internet]. 2022 [cited 2024 Jun 24]. Available from: https://www.nuffieldbioethics.org/assets/pdfs/The-role-of-technology-in-mental-healthcare.pdf

67. Stade EC, Stirman SW, Ungar LH, Boland CL, Schwartz HA, Yaden DB, et al. Large language models could change the future of behavioral healthcare: a proposal for responsible development and evaluation. npj Mental Health Research. 2024 Apr 2;3(1).

68. Flückiger C, Wampold BE, Delgadillo J, Rubel J, Vîslǎ A, Lutz W. Is There an Evidence-Based Number of Sessions in Outpatient Psychotherapy? - A Comparison of Naturalistic Conditions across Countries. Psychother Psychosom. 2020 Aug 1;89(5):333–5.

69. Terlizzi E, Villarroel M. Symptoms of Generalized Anxiety Disorder Among Adults: United States, 2019 [Internet]. 2020 Sep [cited 2024 Jul 4]. Available from: https://www.cdc.gov/nchs/products/databriefs/db378.htm

70. Food and Drug Administration. Non-Inferiority Clinical Trials to Establish Effectiveness. Guidance for Industry [Internet]. 2016 Sep [cited 2024 Jul 11]. Available from: https://www.fda.gov/media/78504/download

