A Generic Review of Integrating Artificial Intelligence in Cognitive Behavioral Therapy
3 School of Nursing, Joint Research Centre for Primary Health Care, The
Abstract
Cognitive Behavioral Therapy (CBT) is a well-established intervention for mit-
igating psychological issues by modifying maladaptive cognitive and behavioral
patterns. However, delivery of CBT is often constrained by resource limitations
and barriers to access. Advancements in artificial intelligence (AI) have pro-
vided technical support for the digital transformation of CBT. Particularly, the
emergence of pre-training models (PTMs) and large language models (LLMs)
holds immense potential to support, augment, optimize and automate CBT deliv-
ery. This paper reviews the literature on integrating AI into CBT interventions.
We begin with an overview of CBT. Then, we introduce the integration of AI
into CBT across various stages: pre-treatment, therapeutic process, and post-
treatment. Next, we summarize the datasets relevant to some CBT-related
tasks. Finally, we discuss the benefits and current limitations of applying AI to
CBT. We suggest key areas for future research, highlighting the need for fur-
ther exploration and validation of the long-term efficacy and clinical utility of
AI-enhanced CBT. The transformative potential of AI in reshaping the practice
of CBT heralds a new era of more accessible, efficient, and personalized mental
health interventions.
Keywords: Artificial intelligence, Mental health, Cognitive behavioral therapy, Large
language model, Machine learning, Deep learning
1 Introduction
Mental health issues have become increasingly prevalent, especially in the post-
pandemic era Penninx et al (2022). Psychological distress brought on by COVID-19
has added to the existing mental health crisis, leading to higher rates of depression,
anxiety, and suicidal ideation. Consequently, there is an urgent need for effective and
accessible mental health interventions. Cognitive Behavioral Therapy (CBT) is widely
recognized as one of the most important psychological interventions for addressing
a range of mental health issues Foreman and Pollard (2016); David et al (2018).
As the gold standard for treating depression and anxiety, CBT has been integrated
into healthcare systems worldwide. Its primary goal is to identify and restructure
patients’ maladaptive cognitive frameworks to help them develop coping skills, address
behavioral issues, and alleviate symptoms. However, the delivery of CBT, as it is tra-
ditionally conducted individually and face-to-face by a therapist, faces barriers in real
life, including social stigma and limited access to qualified therapists, particularly in
underserved areas. Only 27% of those receiving psychological therapy actually access
standardized care Bandelow et al (2017). To address these challenges, technological
advancements have led to the development of Computer-Based Cognitive Behavioral
Therapy (CCBT) and Internet-Based Cognitive Behavioral Therapy (ICBT) Webb
et al (2017). These variations use computer software and the Internet as a medium
to deliver CBT interventions based on established theories and techniques. Several
established online CBT platforms, such as MoodGYM Australian National Univer-
sity (2024) and 30-Day Self-Service Psychological Expert China Cognitive Behavioral
therapy professional organization (2024), have emerged. These online platforms have
partially alleviated some of the issues related to the shortage of mental health profes-
sionals and poor accessibility Christ et al (2020). However, they still face challenges,
such as limited interactivity and high dropout rates. In addition, they offer interven-
tions that address only general cognitive issues rather than individual specific needs.
Given this challenge, there is an urgent need for more flexible and adaptive forms of
CBT to better meet the diverse needs of different patient populations.
In recent years, rapid advancements in artificial intelligence (AI) technology have
led various industries to explore its innovative applications. AI’s exceptional capabil-
ities in data analysis, pattern recognition, and automation offer vast potential in the
delivery of mental health services, providing new tools and methods for assessment,
diagnosis, and treatment of mental health conditions Lee et al (2021); Malgaroli et al
(2023); Demszky et al (2023a). For instance, deep learning (DL) models can auto-
mate mental health assessments and diagnostic processes, alleviating the workload of
professionals while enhancing the accuracy and efficiency of diagnoses Vuyyuru et al
(2023). Furthermore, AI can also aid in detecting and understanding patients’ unique
emotional expressions, thereby offering recommendations for personalized psycholog-
ical interventions and support Assunção et al (2022). Given the shortage of skilled
mental health professionals, innovative AI approaches have been developed to guide
peer-to-peer mental health support Sharma et al (2023a). Additionally, AI-based psy-
chological counseling support systems, such as those utilizing large language models
(LLMs), have also been developed to assist junior counselors in providing online psy-
chological support Fu et al (2023). These studies demonstrate that AI not only expands
the boundaries of traditional mental health services, but also brings new opportunities
for improving their accessibility and utility globally Graham et al (2019); Demszky
et al (2023b). In the realm of CBT, the integration of AI has led to revolution-
ary advancements, particularly with the rise of LLMs. Notable research in this area
includes the development of CBT-specific prompts and tailored datasets, leading to
models like CBT-LLM, which are specifically designed for CBT delivery Na (2024).
Other efforts have resulted in conversational counseling agents based on LLMs, such
as utilizing GPT-2 in CBT to generate human-like textual narratives to provide psy-
chological support Rajagopal et al (2021), providing psychological counseling using
CBT techniques Lee et al (2024), and delivering dialogue modules focused on Socratic
questioning using LLMs like OsakaED and GPT-4 Izumi et al (2024). These advance-
ments have made cognitive behavioral interventions more personalized and precise,
better addressing the needs of diverse patients.
Given the abundance of research emerging in this field, several review articles have
summarized innovative developments in CBT. However, the existing reviews focused
mainly on CBT applications Huguet et al (2016); Denecke et al (2022) and the devel-
opment of ChatGPT in redesigning CBT for different age groups and genders Chandra
et al (2023). A comprehensive review summarizing AI’s role across the various stages
of CBT delivery is still lacking. To address this, we provide a detailed
literature review of AI’s role in enhancing CBT in this paper.
2 Background
Cognitive behavioral therapy (CBT) stands as a structured, time-limited psycholog-
ical treatment that focuses on the interplay between cognitive processes, emotional
responses, and behavioral patterns. This therapeutic approach operates on the premise
that thoughts, emotions, and behaviors are interconnected, and aims to enhance psy-
chological well-being by addressing maladaptive cognitive patterns and behaviors Beck
and Beck (2011). The theoretical foundation of CBT encompasses both cognitive and
behavioral aspects. Regarding the origins of CBT, there are varying views in the aca-
demic community. However, it is generally accepted that the popularization of
“cognition” began with Ellis’s Rational Emotive Therapy (RET) in the 1950s, followed
by Beck’s Cognitive Therapy (CT) in the 1960s. Since then, CBT has continuously
integrated various behavioral therapies and theories, evolving into the widely used
psychological treatment seen today.
CBT interventions typically target three domains: cognition, behavior, and emo-
tion McGinn and Sanderson (2001). In the cognitive domain, CBT focuses on
individuals’ cognitive processes, including thoughts, beliefs, and interpretations. This
involves aiding clients in identifying and understanding negative or distorted thought
patterns, such as negative automatic thoughts and cognitive distortions (CD) (e.g.,
jumping to conclusions, all-or-nothing thinking and mental filtering). Techniques like
cognitive restructuring (CR) are used to adjust these patterns, fostering a more objec-
tive and positive perspective toward oneself and the world. In the behavioral domain,
CBT addresses individuals’ maladaptive behavioral patterns and habits. Therapists
collaborate with clients to explore their unhealthy behaviors, such as social with-
drawal, avoidance, and substance or alcohol misuse. Techniques such as behavioral
experiments, graded exposure, and behavioral activation (BA) are then employed. Fur-
ther, CBT acknowledges the interplay between the body and the mind. Physiological
responses such as tension, fear, and anxiety can impact cognition and behavior. Thus,
CBT also focuses on regulating physiological responses through techniques such as
deep breathing, progressive muscle relaxation, and physical activity to enhance emo-
tional regulation and psychological well-being. In the emotional domain, CBT targets
clients’ awareness and management of their emotions by helping them recog-
nize negative emotions as they arise, identify them accurately, and use appropriate
cognitive and behavioral strategies to support better emotional well-being. The three
domains are intertwined, and CBT typically tailors a range of techniques and strategies
based on the individual’s specific circumstances and needs, aiding in problem-solving,
improving mental health, and building resilience Dobson and Dozois (2021).
CBT has been employed to address a variety of mental health issues, including
anxiety disorder Olatunji et al (2010), depression Tymofiyeva et al (2019), schizophre-
nia Turkington et al (2004), attention-deficit/hyperactivity disorder Pan et al
(2019), insomnia Benard and Lukandu (2018), eating disorders Linardon et al (2017),
bipolar disorder Driessen and Hollon (2010), substance use disorders McHugh et al
(2010), and obsessive-compulsive disorder Moody et al (2017). Beyond mental health,
CBT has been explored as a tool for managing chronic health conditions like low
back pain Piette et al (2016); Heapy et al (2017), asthma Parry et al (2012), and
tinnitus Parry et al (2012). Prior to initiating CBT, therapists typically conduct assess-
ments to determine symptom severity and treatment goals. These assessments may
involve dialogue between therapist and client or self-administered tools like the Beck
Depression Inventory (BDI-II) or the Beck Anxiety Inventory (BAI) Bech et al (1996);
Beck et al (1988); Foa et al (1993). Based on this assessment, therapists work with
the clients to establish treatment goals and offer personalized cognitive and behavioral
strategies tailored to their needs.
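As a small illustration of how a self-administered scale can be scored automatically, the sketch below sums the 21 item ratings of a BDI-II-style questionnaire (each rated 0 to 3) and maps the total to a severity band. The cutoffs used (0-13 minimal, 14-19 mild, 20-28 moderate, 29-63 severe) are the commonly cited BDI-II ranges; the function name is our own, and the code is a sketch under these assumptions rather than a clinical tool.

```python
def bdi2_severity(item_scores):
    """Sum 21 item ratings (each 0-3) and return (total, severity band)."""
    if len(item_scores) != 21:
        raise ValueError("expected 21 item ratings")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each rating must be 0, 1, 2, or 3")
    total = sum(item_scores)
    # Commonly cited BDI-II cutoffs (an assumption of this sketch).
    if total <= 13:
        band = "minimal"
    elif total <= 19:
        band = "mild"
    elif total <= 28:
        band = "moderate"
    else:
        band = "severe"
    return total, band
```

A score computed this way would only inform, not replace, the goal-setting dialogue between therapist and client described above.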
cognitive restructuring, and exposure therapy) and technology (e.g., artificial intelli-
gence, machine learning, deep learning, natural language processing, large language
model, chatbot, and virtual reality). Articles that enhance, support, or implement
CBT through AI technology were selected. Results were categorized based on the
stage at which AI was integrated in the CBT delivery process and were synthesized. In
this section, we present the synthesized findings describing the role of AI in different
stages of the CBT treatment process, as illustrated in Figure 1.
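The keyword strategy described above pairs therapy-related terms with technology-related terms. A minimal sketch of how such a Boolean query could be assembled is given below. Only the terms visible in the text are used (the therapy list is partial), the function names are hypothetical, and the exact query syntax accepted by any particular database is an assumption.

```python
def or_group(terms):
    """Join quoted terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(therapy_terms, technology_terms):
    """Require at least one term from each group."""
    return f"{or_group(therapy_terms)} AND {or_group(technology_terms)}"


# Partial term lists taken from the search description; not exhaustive.
therapy_terms = ["cognitive restructuring", "exposure therapy"]
technology_terms = [
    "artificial intelligence",
    "machine learning",
    "deep learning",
    "natural language processing",
    "large language model",
    "chatbot",
    "virtual reality",
]
query = build_query(therapy_terms, technology_terms)
```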
To be clear, this article only covers the stages of CBT that current
AI technology has addressed. Due to the complexity of CBT treatment,
some details and specific treatment stages have not yet been covered by
AI, and thus are not discussed here. Future research and technological
advancements may further expand AI’s role in CBT delivery, encompassing
more stages of the treatment process.
3.1 Integration of AI in the Pre-Treatment Stage
A comprehensive assessment is a foundational step in all psychological treatments,
including CBT. This initial assessment involves gathering information on the client’s
history, current issues, and therapeutic goals through structured or semi-structured
interviews, questionnaires, and standardized scales. In CBT, particular emphasis is
placed on assessing the client’s cognitive and behavioral patterns, as well as their
emotional responses. The insights from this detailed assessment guide therapists in
crafting tailored CBT treatment plans that effectively address the client’s concerns,
thereby maximizing therapeutic outcomes. Traditional CBT assessment methods,
while thorough, often rely heavily on manual processes, which can be subjective and
time-consuming. The reliance on the therapist’s clinical skills and professional acumen
may also limit diagnostic accuracy and treatment efficiency. In contrast, AI tech-
nology, with its advanced data processing and analysis capabilities, offers a powerful
alternative. AI can rapidly analyze vast amounts of client data to identify patterns
and correlations, thereby assisting therapists in assessing patient symptoms more
accurately and swiftly, and in identifying adverse emotions and cognitive distortions.
client in identifying and understanding potential cognitive distortions, which were
categorized into ten types in Burns’s theory Burns and Beck (1999). They include:
all-or-nothing thinking, overgeneralization, mental filter, disqualifying the positive,
jumping to conclusions, magnification and minimization, emotional reasoning, should
statements, labeling and mislabeling, and blaming oneself or others.
Currently, AI technology facilitates the identification of cognitive distortions pri-
marily through text classification techniques. Researchers utilize various textual data
as inputs to build models and algorithms that automatically identify and categorize
these cognitive distortions. However, due to the lack of publicly available structured
datasets specifically designed for detecting cognitive distortions, researchers often turn
to alternative data sources such as social media data Alhaj et al (2022); Wang et al
(2023b), personal blogs Simms et al (2017), and everyday narratives Xing et al (2017);
Shickel et al (2020); Mostafa et al (2021). Shreevastava and Foltz (2021) compared
five classification algorithms for detecting cognitive distortions using a therapist Q&A
dataset obtained from Kaggle. Tauscher et al (2023) applied NLP methods to iden-
tify cognitive distortions from text messages exchanged between patients with severe
mental illnesses and their clinical therapists. Challenges in this task include dealing
with short texts that lack contextual information and imbalanced data where certain
distortion types are underrepresented, leading to poorer classification performance.
To address these issues, researchers have proposed various solutions. Alhaj et al
(2022) suggest enriching short text representations to improve the classification
of cognitive distortions in Arabic Twitter content. Their approach uses a transformer-based
topic modeling algorithm (BERTopic) built on a pre-trained language model
(AraBERT). Ding et al (2022) tackled data imbalance with approaches like data
augmentation and domain-specific models, demonstrating the effectiveness of the pre-
trained language model MentalBERT. Recent advances have expanded the scope
of cognitive distortion detection to include multimodal datasets. This cross-modal
research approach provides a more holistic perspective, enabling a more comprehensive
detection of cognitive distortions. For instance, Singh et al (2023) proposed a multitask
framework that integrates text, audio, and visual data to detect cognitive distortions,
achieving significant performance improvements over existing state-of-the-art models.
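To make the class-imbalance problem concrete, the sketch below shows two generic remedies on a toy label set: inverse-frequency class weights and random oversampling of minority classes. These are simple stand-ins chosen for illustration and are not the augmentation methods used in the cited studies; the label counts and function names are made up.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples until every class matches the largest one."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up distribution: one frequent class and two rare distortion types.
labels = ["no_distortion"] * 6 + ["labeling"] * 3 + ["emotional_reasoning"] * 1
texts = [f"example {i}" for i in range(len(labels))]
weights = balanced_class_weights(labels)
aug_texts, aug_labels = random_oversample(texts, labels)
```

With these counts the rarest class receives the largest weight, so errors on it cost more during training; oversampling instead equalizes the class counts before training.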
Recently, LLMs have also shown promise in complex psychological tasks such as
identifying cognitive distortions Qi et al (2023); Nazarova (2023); Chen et al (2023);
Lim et al (2024). Qi et al (2023) conducted experiments to compare LLMs and super-
vised learning in cognitive distortion identification and suicide risk classification. The
experimental results indicate that LLMs struggle with accurately identifying complex
cognitive distortions in Chinese social media data, suggesting that deep learning algo-
rithms remain the preferred solution for complex psychological tasks like cognitive
distortion identification. Nazarova (2023) developed TeaBot, an AI tool fine-tuned
from GPT-3 that employs CBT techniques to aid users in identifying and challenging
distorted thoughts. Chen et al (2023) introduced the Diagnosis of Thought (DoT)
framework, which strategically prompts LLMs to produce diagnosis rationales per-
tinent to cognitive distortion detection. Although the DoT method demonstrates its
capability in classifying cognitive distortions, it also exhibits a notable flaw:
the model tends to over-diagnose cognitive distortions even when the user’s state-
ments are benign. Lim et al (2024) addressed this by introducing the ERD framework,
involving extraction, reasoning, and debate among multiple LLM agents to classify
cognitive distortions from user utterances. This approach significantly mitigates the
issue of overdiagnosing cognitive distortions in the DoT method.
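The extraction, reasoning, and debate steps can be pictured as a short pipeline around a generic text-in, text-out model call. The sketch below is only loosely inspired by the description above: the prompts, the `LLM` interface, and the stub model are all hypothetical, and it does not reproduce the prompts or agent roles of the cited work.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def classify_with_debate(utterance: str, llm: LLM, rounds: int = 1) -> str:
    # Step 1 (extraction): isolate the span that might carry a distortion.
    span = llm(
        "EXTRACT: quote the part of this utterance that may reflect a "
        f"cognitive distortion, or reply NONE.\n{utterance}"
    ).strip()
    if span.upper() == "NONE":
        return "no distortion"
    # Step 2 (reasoning): propose a distortion type for the extracted span.
    verdict = llm(f"REASON: name the most likely distortion type for: {span}").strip()
    # Step 3 (debate): opposing agents argue, then a judge decides.
    for _ in range(rounds):
        pro = llm(f"ARGUE_FOR: defend the label '{verdict}' for: {span}")
        con = llm(f"ARGUE_AGAINST: argue that this statement is benign: {span}")
        verdict = llm(
            f"JUDGE: given\nFOR: {pro}\nAGAINST: {con}\n"
            "reply with the final label or 'no distortion'."
        ).strip()
    return verdict


def stub_llm(prompt: str) -> str:
    """Canned replies so the sketch runs without a model; not a real LLM."""
    if prompt.startswith("EXTRACT"):
        return "NONE" if "nice walk" in prompt else "I always fail"
    if prompt.startswith(("REASON", "JUDGE")):
        return "overgeneralization"
    return "argument text"
```

The early exit after extraction and the explicit "argue against" agent are the two places where a benign statement can be cleared, which is the over-diagnosis failure mode the debate step is meant to reduce.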
Emotion analysis
Emotion analysis is essential for understanding individuals’ emotional states. This
understanding enables therapists or conversational agents to offer more empathetic
support and feedback in CBT interventions, thereby strengthening the therapeutic
alliance and enhancing the interactive experience Brave et al (2005); Provoost et al
(2019). Furthermore, emotional states also serve as indicators of intrinsic goals, observ-
able behaviors, and treatment efficacy. During the therapeutic process, individuals are
more likely to benefit from treatment when they can effectively manage their emo-
tions Mehta et al (2021). Therefore, during the initial assessment phase, therapists
explore the clients’ emotional responses and help them acquire healthier emotion reg-
ulation strategies. AI-assisted emotion analysis has become increasingly prevalent in
mental health Tanana et al (2021); Assunção et al (2022); Khare et al (2023). Provoost
et al (2019) employed an emotion mining algorithm to assess the overall sentiment
and five specific emotions expressed in texts written by patients during Internet-based
cognitive behavioral therapy (ICBT), finding moderate agreement between the algo-
rithm and human judgment in evaluating the overall sentiment, while the agreement
was low for specific emotions. Patel et al (2019) proposed an intelligent social thera-
peutic chatbot. This chatbot defined several basic emotion labels, and based on these
emotion labels, three deep learning algorithms were employed to extract emotions
from user chat data. Kozlowski et al (2023) introduced Terabot, a conversational sys-
tem that enhanced sentiment and emotion recognition by integrating CBT techniques
and replacing BERT with RoBERTa in a neural language model framework. However,
Striegl et al (2023) pointed out that some emotion recognition methods Fitzpatrick
et al (2017); Provoost et al (2019) categorize emotions into discrete classes, failing to
capture the continuum and diversity of emotions accurately. To address this, they pro-
posed a deep learning-based dimensional text emotion recognition system within the
context of CBT using the ALBERT pre-trained model Lan et al (2019), fine-tuned on
emotion-annotated data for dimensional score prediction.
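The difference between discrete and dimensional emotion representations can be shown with a toy valence-arousal space. In the sketch below, the coordinates assigned to each emotion are illustrative values chosen by us, not figures from any cited study, and the function names are hypothetical.

```python
import math

# Illustrative (valence, arousal) coordinates in [-1, 1]; made-up values.
EMOTION_COORDS = {
    "joy": (0.8, 0.5),
    "anger": (-0.6, 0.8),
    "fear": (-0.7, 0.6),
    "sadness": (-0.7, -0.4),
}


def to_dimensional(probabilities):
    """Collapse class probabilities into one (valence, arousal) point."""
    valence = sum(p * EMOTION_COORDS[e][0] for e, p in probabilities.items())
    arousal = sum(p * EMOTION_COORDS[e][1] for e, p in probabilities.items())
    return valence, arousal


def nearest_label(point):
    """Map a continuous point back to the closest discrete label."""
    return min(EMOTION_COORDS, key=lambda e: math.dist(point, EMOTION_COORDS[e]))


# A mixed state: mostly sadness with a substantial share of fear.
mixed = to_dimensional({"sadness": 0.6, "fear": 0.4})
label = nearest_label(mixed)
```

Here the discrete label collapses the mixed state to "sadness", while the dimensional point keeps the information that arousal is higher than for typical sadness, which is the kind of nuance a dimensional system is intended to retain.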
In recent years, ChatGPT, as a prominent example of conversational AI, has shown
potential in assisting individuals with emotion understanding and management.
Rathje et al (2023) initially assessed GPT’s capability for overall sentiment analysis
(positivity, negativity, or neutrality) in English and Arabic texts. They found that
GPT demonstrates effective multilingual sentiment analysis, achieving performance
levels comparable to top-performing machine learning models from previous years.
Furthermore, they examined GPT’s ability to accurately discern more nuanced dis-
crete emotions such as anger, joy, fear, and sadness. The results demonstrated a high
level of consistency between GPT’s performance and human judgement. Elyoseph
et al (2023) likewise reported that GPT outperformed humans in emotional cognition.
Despite AI’s capabilities in emotion detection and understanding, individuals may
still perceive AI-driven support as lacking genuine emo-
tional resonance compared to human interaction Yin et al (2024). Thus, future research
should focus on making AI responses more conversational and human-like to address
this perception. Additionally, to ensure rigor and utility in real clinical settings, human
cross-validation is required.
Additionally, there is a growing interest in whether combining CBT with medica-
tion or other treatments can enhance outcomes. Several researchers have explored this
possibility Gunlicks-Stoessel et al (2020); Pei et al (2022); Stephenson et al (2023). For
example, Gunlicks-Stoessel et al (2020) found that combining CBT with medication
is as effective as medication alone and has longer-lasting effects. Furthermore, they
noted that the combination of medication and CBT could increase response rates and
prolong the duration of effectiveness, especially when CBT is administered to clients
responsive to pharmacotherapy.
This research underscores the potential of AI in refining the selection of psy-
chological treatments and highlights the importance of personalized approaches in
enhancing therapeutic outcomes.
information becomes portable, and patients can access it easily. Heng (2021) inte-
grated CBT within immersive gaming experiences to assist individuals suffering from
generalized anxiety disorder (GAD). In this approach, CBT elements are seamlessly
interwoven with the diegetic components of the game, providing psychoeducation in
an engaging and entertaining manner.
AI-powered models facilitate personalized psychoeducation by tailoring modules to
each patient’s understanding and preferences. For example, Bhaumik
et al (2023) introduced MindWatch, a system harnessing AI-driven language mod-
els for early symptom detection and personalized psychological education. Within
the psychological education module, they utilized the foundational Llama 2 model
within the Amazon SageMaker Studio environment to deliver tailored education to
individuals experiencing mental health issues. Numerous chatbots incorporate mod-
ules for psychoeducation as well. For instance, Jang et al (2021) developed Todaki, a
chatbot for managing Attention Deficit Hyperactivity Disorder (ADHD). This chat-
bot offers tailored psychoeducation and brief CBT sessions, enabling individuals with
ADHD to acquire self-help skills for managing their condition effectively. Similarly,
Su et al (2022) created XIAO AN, an AI-assisted psychotherapy robot utilizing multi-
modal signal recognition and natural interaction technology. XIAO AN monitors
emotions and offers brief, comprehensive psychological therapy based primarily on
CBT, integrating psychological education modules.
Through these approaches, integrating CBT principles into various technological
platforms has facilitated the dissemination of psychological education, empowering
individuals to better understand and manage their mental health.
(2013). Thereby alleviating negative emotions, improving mental health, and pro-
moting healthier behaviors and coping strategies. To some extent, it is considered
analogous to Cognitive Reappraisal Shurick et al (2012); Zhan et al (2024). Recent
advancements have leveraged AI models to conduct CR effectively. The majority of
these efforts aim to provide evidence that AI language models can effectively engender
reframed thinking in response to negative emotional situations, as well as to compare
the efficacy of different models in CR. For instance, de Toledo Rodriguez et al (2021)
collected a small-scale CBT dataset via the crowd-sourcing platform Prolific. They
validated Google’s T5 transformer and BERT model for their ability to transform
initial negative cognition into more positive or realistic alternatives. Human evalu-
ation revealed that T5-generated reconstructions resembled original human-written
responses more closely, while BERT showed a relatively lower performance but higher
positive sentiment. Maddela et al (2023) introduced PATTERNREFRAME, a novel
dataset that incorporates personas and classical unhelpful thought patterns, extending
the reframing task to include the identification and generation of thoughts corre-
sponding to a given persona and unhelpful pattern. They evaluated various language
models using prompt-based and fine-tuning methods for their efficacy in identifying
and reframing these thoughts. Unlike de Toledo Rodriguez et al (2021) and Mad-
dela et al (2023), who conceptualize CR tasks as a sentence rewriting task, Xiao et al
(2024) emphasize client empowerment over reliance on therapist-driven solutions. They
propose Helping and Empowering through Adaptive Language in Mental Enhance-
ment (HealMe), a model utilizing Large Language Models (LLMs) for CR through
three steps: distinguishing between situations and thoughts for a rational perspective,
generating alternative perspectives through brainstorming to alleviate negative think-
ing, and providing suggestions that recognize the client’s efforts and promote positive
action.
Previous research primarily focused on the perspective of holistic CR, attempting
to address how language models can be utilized to generate CR. However, Sharma
et al (2023c,b) have begun to explore various dimensions of CR
and to examine people’s preferences for particular types of restructuring. To be more
specific, Sharma et al (2023c) investigated how negative thinking can be restruc-
tured, how language models can be utilized to facilitate such restructuring, and which
types of restructuring individuals experiencing negative thoughts prefer. They intro-
duced a framework comprising seven linguistic attributes for reconstructing thoughts
and trained a retrieval-enhanced in-context learning model to generate reconstructed
thoughts. Additionally, they conducted a randomized field study on the Mental Health
America (MHA) website. In another work, Sharma et al (2023b) designed and eval-
uated a CR tool based on human-language model interaction. This system leverages
language models to assist individuals in various steps of CR, including identifying cog-
nitive traps within thoughts and selecting more actionable, empathetic, or personalized
reconstruction suggestions when reconstructing negative thoughts. Furthermore, they
demonstrated that language models can not only be utilized for generating reassess-
ments but also aid individuals in enhancing their own reassessment abilities. Wang
et al (2024b) argue that while Sharma et al (2023c,b) explored the enhancement
of reframed thoughts within a single attribute in one generation, their efforts were
limited in addressing multiple features. Consequently, they developed ReframeGPT,
leveraging GPT-3 as an inference engine to generate and iteratively refine reframed
thoughts across various features, aiming to achieve high-quality reframing. Li et al
(2024) focused on aligning GPT-4’s reconstructive thinking with human performance,
enhancing model performance in CR tasks by understanding the differences between
human and AI-generated reconstructions.
Identifying a patient’s automatic thoughts and emotions is crucial for effective
CR. In traditional face-to-face CBT sessions, therapists refine identified automatic
thoughts through methods such as Socratic questioning. However, when delivering
CR in ICBT, accurately capturing thoughts and emotions poses challenges. Hence,
Furukawa et al (2023) trained the T5 model to predict emotions associated with each
automatic thought. They then compared these predictions with judgments from rele-
vant experts, demonstrating the accuracy of T5 predictions. The application of T5 in
ICBT platforms holds promise for achieving more efficient CR. Shidara et al (2022)
and Jiang et al (2024) have both underscored the significance of automatically
identifying patients’ cognitive distortions and emotions in CR. In response to this,
corresponding efforts have been made. Particularly, Jiang et al (2024), drawing on
the ABCD model in CBT, employed the pre-trained model ERNIE 3.0 to construct a
hierarchical text classification model for extracting automatic thoughts and emotions
from user statements. Furthermore, they also validated the efficacy of LLMs in this
task.
These studies primarily focus on evaluating the efficacy of BA or developing
applications to assist, prompt and motivate patients to actively engage in various
activities, without explicitly linking AI to BA. Against this background, Madhu et al
(2022) and Rathnayaka et al (2022) explore how AI technology can be utilized to
enhance the effectiveness of behavioral activation. Madhu et al (2022) proposed a
novel approach for activity recognition utilizing AI. They employed multi-modal data
(combining speech and text) and utilized a BERT model for emotion recognition, along
with logistic regression for sentiment detection. Subsequently, an activity classification
model was employed wherein emotion and sentiment were integrated with keywords
to accurately identify the appropriate activity for behavioral activation. This method
achieved accuracies exceeding 80% in emotion recognition, sentiment detection, and
activity recognition tasks. Rathnayaka et al (2022) designed and developed a chatbot
named Bunji using AI technology to provide emotional support, personalized behavior
activation, and remote health monitoring. Participatory evaluation in a pilot study
environment also demonstrated the practicality and effectiveness of the chatbot.
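The fusion step described for activity recognition, in which emotion and sentiment are combined with keywords, can be sketched as follows. The word weights stand in for a trained logistic regression, the emotion label is assumed to come from a separate classifier, and the keyword-to-activity table is invented for illustration; none of these values come from the cited studies.

```python
import math

# Hypothetical word weights standing in for a trained logistic regression.
SENTIMENT_WEIGHTS = {"tired": -1.2, "lonely": -1.5, "enjoyed": 1.4, "fun": 1.1}
BIAS = 0.0

# Hypothetical keyword -> activity table.
ACTIVITY_KEYWORDS = {"lonely": "call a friend", "tired": "short walk outdoors"}


def sentiment_probability(text):
    """P(positive) from a bag-of-words logistic model."""
    score = BIAS + sum(SENTIMENT_WEIGHTS.get(w, 0.0) for w in text.lower().split())
    return 1.0 / (1.0 + math.exp(-score))


def suggest_activity(text, emotion):
    """Fuse an emotion label, sentiment, and keywords into one suggestion."""
    positive = sentiment_probability(text) >= 0.5
    if positive and emotion not in ("sadness", "fear", "anger"):
        return "keep current routine"
    # Negative state: pick the first keyword that maps to an activity.
    for word in text.lower().split():
        if word in ACTIVITY_KEYWORDS:
            return ACTIVITY_KEYWORDS[word]
    return "pleasant activity of choice"
```

For example, `suggest_activity("I feel lonely tonight", "sadness")` falls through to the keyword table and proposes a social activity, while a positive message with a positive emotion label leaves the routine unchanged.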
proposed and discussed the idea of utilizing Explainable Artificial Intelligence (XAI)
to enhance CBT treatment for speech anxiety in VR settings. Rahman et al (2022)
explore a range of machine learning models’ performance on the task of arousal pre-
diction using publicly available datasets. They propose a pipeline to address the model
selection issue with various parameter configurations in the context of VRET.
Homework
A CBT session typically lasts 45-60 minutes, which is often insufficient for many
patients. Homework assignments bridge the gap between sessions, allowing patients
to apply learned skills and enabling therapists to assess skill acquisition and main-
tenance Beck (1979); LeBeau et al (2013). In the professional literature, homework
assignments are frequently delineated as precise, structured therapeutic tasks agreed upon
during sessions, intended for completion between sessions. This process entails collab-
orative delineation of therapeutic objectives for the homework, determining pertinent
activities or data collection as components of the homework, strategizing the prac-
tical execution of the homework, and subsequently reviewing the homework during
subsequent sessions Kazantzis et al (2010). Homework assignments may encompass a
range of activities within each session, such as engaging with relevant materials, doc-
umenting thoughts and emotions, practicing specific skills or behaviors, or engaging
in communication with others Tang et al (2017).
Research has found that clients who consistently complete homework derive greater
benefits from interventions compared to those who complete little or no home-
work Burns and Spangler (2000), yet traditional paper-based CBT homework poses
various impediments that can significantly undermine users’ motivation to complete
tasks as instructed. Non-compliance with homework is cited as one of the most com-
mon reasons for the failure of CBT treatments Helbig and Fehm (2004), persisting as
a prevalent issue in clinical practice. The prevalence of digital devices and the internet
has enabled the transformation of traditional homework into digital formats, thereby
enhance CBT homework compliance. However, there is a lack of guidelines for design-
ing mobile phone apps tailored for this purpose. Consequently, Tang et al (2017)
proposes six essential features of an optimal mobile app aimed at maximizing CBT
homework compliance, aiming to provide theoretical guidance for the development of
such applications.
Innovative approaches, such as the integration of traditional diary writing with
mobile technology and LLMs, offer promising solutions to enhance homework compliance.
For example, Peretz et al (2023) developed a machine learning model capable
of recognizing the presence of homework assignments during therapy sessions based
on natural language dialogue between therapists and clients in real-world settings, as
well as determining the type of homework assigned. Such advancements hold significant
promise in bolstering therapists’ ability to assign and monitor homework tasks,
ultimately fostering enhancements in therapeutic outcomes. Nepal et al (2024) integrated
traditional diary writing with mobile technology and an LLM to create a
context-aware diary application named MindScape. Specifically, the application
utilizes real-time analysis of behavioral data collected from smartphones and employs
an LLM to provide personalized, contextually relevant writing prompts. These prompts
are designed to guide users in reflection and contemplation, facilitating the recording
of their thoughts within daily life contexts. This innovative approach not only fosters
a habit of regular self-reflection but also addresses the challenge of homework compliance
in CBT. During CBT for tinnitus, patients are typically assigned
various homework tasks, including diary writing and self-monitoring. These homework
assignments primarily consist of handwritten text data. However, analyzing this data
can be extremely time-consuming for therapists, leading to decreased treatment effi-
ciency. To address this issue, Jeong et al (2024) proposed utilizing LLMs like GPT-2
to analyze the homework data of patients undergoing CBT. Their goal is to predict
the Tinnitus Handicap Inventory (THI) scores from the homework, which can, in turn,
predict the outcomes of CBT treatment, thereby enabling the selection of more per-
sonalized and effective treatment plans. Additionally, they compared the performance
of the latest language models, particularly Google’s T5 and Flan-T5, in predicting
THI scores. Finally, they anticipate applying this approach to monitor and
predict the effectiveness of CBT treatment in patients with depression.
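To make the prediction target concrete, the following minimal sketch shows how a THI total can be computed and how predicted totals might be compared against questionnaire totals. It assumes the standard THI scoring (25 items, each answered yes = 4, sometimes = 2, no = 0, giving a total between 0 and 100). The function names, the example answers, and the predicted values are hypothetical and are not taken from Jeong et al (2024).

```python
from statistics import mean

# Assumed standard THI scoring: 25 items answered yes / sometimes / no.
THI_POINTS = {"yes": 4, "sometimes": 2, "no": 0}
THI_ITEMS = 25


def thi_total(responses):
    """Sum the 25 item responses into a total score between 0 and 100."""
    if len(responses) != THI_ITEMS:
        raise ValueError(f"expected {THI_ITEMS} responses, got {len(responses)}")
    return sum(THI_POINTS[r] for r in responses)


def mean_absolute_error(true_scores, predicted_scores):
    """Average absolute gap between questionnaire and predicted THI totals."""
    if len(true_scores) != len(predicted_scores):
        raise ValueError("score lists must have the same length")
    return mean(abs(t - p) for t, p in zip(true_scores, predicted_scores))


# Hypothetical questionnaire answers for one patient.
answers = ["yes"] * 5 + ["sometimes"] * 10 + ["no"] * 10
total = thi_total(answers)  # 5 * 4 + 10 * 2 = 40

# Hypothetical model outputs for three patients (not results from the study).
mae = mean_absolute_error([40, 62, 18], [44, 58, 20])
```

A language model that maps homework text to a THI estimate would replace the hard-coded predictions; the error metric stays the same.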
In summary, AI augments various CBT strategies by leveraging natural language
processing and machine learning techniques, improving effectiveness and engagement.
Table 1: AI tools used in current CBT intervention
AI tools | Description
Woebot [1] | Woebot is a chatbot that offers CBT-based therapy for depression and anxiety. It engages users in daily conversations, tracks their emotions, and introduces CBT concepts through short videos or interactive word games. Using decision trees and natural language processing, Woebot responds empathetically and provides helpful suggestions. It can also detect concerning language and direct users to external resources if needed.
Wysa [2] | Wysa is a therapy chatbot designed to support mental health issues like depression, anxiety, stress, and loneliness. It utilizes CBT, mindfulness, and positive psychology techniques. Instead of AI-generated responses, Wysa uses pre-crafted therapeutic conversations developed by clinicians for safe and effective interactions. Its adaptive AI understands complex user inputs and offers empathetic feedback and tailored CBT-based tools.
Youper [3] | Youper is a chatbot that delivers CBT through three steps: a personalized mental health assessment, instant support via conversations, and symptom monitoring. It uses a decision tree to select responses, conducts real-time emotion analysis, and provides CBT interventions based on the user’s emotional state.
Tess [4] | Tess is a psychological AI chatbot providing brief conversations for mental health support, psychoeducation, and reminders. It uses clinician-prepared statements to deliver interventions based on user-reported moods. Tess adjusts its responses based on user feedback, favoring CBT-based interventions for positive reactions and offering alternatives for neutral or negative ones. The platform is customizable to align with specific treatments or user demographics.
BetterHelp [5] | BetterHelp is an online therapy platform that uses AI to match patients with licensed therapists and offers various approaches like CBT and psychodynamic therapy.
Rumi [6], Oliveira et al (2021) | Rumi is a chatbot that uses Rumination-focused Cognitive Behavioral Therapy (RFCBT) to explore the relationship between thoughts, feelings, and actions, aiming to improve mental health and reduce depressive and anxious symptoms.
Cloud Bot, Rizea (2022) | Cloud Bot is a chatbot that utilizes NLP technology to function as a psychologist. It focuses on applying a cognitive restructuring CBT technique to address users’ issues.
Saarthi, Rani et al (2023) | Saarthi is a chatbot using NLP and AI for delivering CBT and remote health monitoring to people with mental health issues. It offers real-time, evidence-based treatment that is accessible, affordable, and convenient, aiming to reduce anxiety and depression symptoms and enable long-term mental health monitoring.
SchizoBot, Nwoye et al (2024) | SchizoBot is a chatbot using artificial neural networks to deliver CBT for managing schizophrenia, aiding clinicians and ensuring consistent therapy administration for patients.
XIAO AN, Su et al (2022) | XIAO AN is a Chinese AI psychotherapy robot designed to monitor emotions and provide effective therapy, primarily using CBT principles. It has shown effectiveness in treating anxiety disorders in clinical trials without replacing therapists.
Emohaa, Sabour et al (2023) | Emohaa is a Chinese conversational agent, which consists of two platforms: one is template-based (CBT-Bot) for structured conversations and exercises based on Cognitive Behavioral Therapy principles, while the other (ES-Bot) allows for open-ended discussions on emotional issues and provides emotional support.
[1] Woebot: https://woebothealth.com/
[2] Wysa: https://www.wysa.com/
[3] Youper: https://www.youper.ai/
[4] Tess: https://www.cass.ai/x2ai-home
[5] BetterHelp: https://www.betterhelp.com/
[6] Rumi: https://www.facebook.com/rumibot.bot/
Purkayastha (2017) noted that loneliness can be a risk factor for depression. To address
this issue, the MoodTrainer application was developed, which tracks users’ locations
and isolating behaviors in real-time and provides CBT interventions when it detects
relevant behaviors. Also, Michelle et al (2014) introduced an Android application
named CBT Assistant. This app analyzes input data from individuals with social
anxiety disorder (SAD) to identify stressors or situations triggering their mental health
issues and assesses their severity. Additionally, some mobile applications are designed
for specific CBT treatment scenarios, such as continuously tracking the sleep pat-
terns of individuals with sleep disorders Schabus et al (2023) or providing cessation
monitoring and CBT examples to smokers Alsharif and Philip (2015), thereby enhanc-
ing their success rates. These mobile applications leverage the capabilities of mobile
devices to collect real-time data and provide personalized interventions and support
to individuals undergoing CBT treatment.
In recent years, wearable devices and smartphone applications equipped with AI
technology have emerged as a trend for monitoring patients’ psychological states.
During the CBT process, AI algorithms can monitor stress levels, physical activity,
speech changes, and other indicators to detect variations in a patient’s psychological
state. These algorithms can send timely alerts to both patients and healthcare
providers. Garcia-Ceja et al (2015) embedded accelerometer sensors into smartphones
to detect stress levels using classification algorithms like naive Bayes and decision
trees, achieving 71% accuracy. Additionally, the long-term collection and analysis of
large datasets can help therapists and patients gain a deeper understanding of the
patient’s mental health patterns. With the assistance of AI, therapists can access more
comprehensive information, enabling them to discern trends in mental health, iden-
tify triggers, and evaluate treatment efficacy. These insights are crucial for making
informed therapeutic decisions and designing personalized interventions. Goodwin
et al (2019) collected physiological and movement data from wrist-worn biosensors
in 20 adolescents diagnosed with ASD. They developed prediction models utiliz-
ing ridge-regularized logistic regression. These models demonstrate high accuracy
in forecasting instances of aggressive behavior towards others occurring within the
subsequent minute. Such advancements lay the groundwork for proactive behavioral
interventions and timely adaptive intervention systems in the future.
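As an illustration of the model family mentioned above, the sketch below fits a ridge-regularized (L2-penalized) logistic regression by plain gradient descent. It is a minimal example under stated assumptions and not the pipeline of Goodwin et al (2019): the two features, the toy data, the penalty strength, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_ridge_logistic(X, y, l2=0.1, lr=0.1, epochs=2000):
    """Fit weights and bias by gradient descent on L2-penalized log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The ridge penalty shrinks the weights; the bias is left unpenalized.
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical standardized features per time window: [heart rate, motion].
X = [[-1.2, -0.8], [-0.9, -1.1], [-0.3, -0.5],
     [0.4, 0.6], [1.0, 0.9], [1.3, 1.2]]
y = [0, 0, 0, 1, 1, 1]  # 1 = episode observed in the following minute

w, b = fit_ridge_logistic(X, y)
risk_calm = predict_proba(w, b, [-1.0, -1.0])
risk_aroused = predict_proba(w, b, [1.0, 1.0])
```

The penalty term keeps the weights bounded even when the classes are perfectly separable, which is one reason ridge regularization is attractive for small physiological datasets.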
Despite the growing use of AI-equipped wearable devices and mobile applications
for monitoring psychological states and data collection, there is relatively limited
research specifically focused on their application to CBT, particularly considering the
unique demands of real-time interaction inherent in this therapeutic approach.
to predict individual-level response to CBT using fMRI data in patients diagnosed with
panic disorder and agoraphobia (PD/AG). Similarly, Tolmeijer et al (2018) utilized
machine learning methods to predict how people will respond when offered CBT for
psychosis. Their two-step methodology involved first identifying potentially predictive
regions and then developing a model based on these regions to make individual-level
predictions. In a large-scale study, Kaldo et al (2021) analyzed data from over 6000
patients undergoing Internet-delivered CBT (ICBT). They evaluated the accuracy of
various machine learning algorithms in predicting treatment outcomes and explored
the integration of these algorithms into an Adaptive Treatment Strategy. Isacsson et al
(2023) further examined the clinical utility of machine learning in predicting ICBT
outcomes. They investigated the optimal timing within the treatment process for the
model’s predictive accuracy to support adaptive treatment strategies, proposing an
optimal predictive model and offering specific recommendations based on their com-
prehensive analysis. Despite advancements, traditional predictive models often exhibit
limited accuracy, particularly in assessing the effectiveness of treating adolescent social
anxiety. Against this background, Zheng et al (2022) employed deep
learning techniques to construct a predictive model for the correlation between CBT
and adolescent social anxiety, showcasing significantly improved predictive accuracy
and reduced complexity compared to traditional models. Prasad et al (2023) seek to
develop a state-of-the-art deep-learning framework for predicting clinical outcomes in
ICBT by leveraging large-scale, high-dimensional time-series data of client-reported
mental health symptoms and platform interaction data.
Numerous researchers have also explored the use of specific treatment outcome pre-
dictors and potential biomarkers associated with particular diseases. For instance, Wei
et al (2023) utilized the Hamilton Depression Rating Scale (HDRS) score as the pri-
mary outcome measure to investigate symptom changes in subjects undergoing CBT.
Employing machine learning algorithms, they developed a support vector regression
model, ultimately identifying left dorsolateral prefrontal cortex (DLPFC) Regional
Homogeneity (ReHo) as a neuroimaging biomarker for the therapeutic effects of CBT
in depression. Many previous studies have relied on highly selective samples to predict
the outcomes of CBT. However, few have utilized routinely available socio-demographic
and clinical data to accomplish this task. Therefore, Hilbert et al (2020, 2021) applied
machine learning methods to clinical and socio-demographic data to predict the men-
tal health treatment outcomes of individual patients. Their findings suggest that using
routine data alone can feasibly predict treatment outcomes for mental disorders, with
accuracy significantly surpassing chance levels.
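What "surpassing chance levels" means depends on the baseline used. The following sketch compares plain accuracy, balanced accuracy, and a majority-class baseline on a small set of outcome labels. The labels and predictions are hypothetical and do not reproduce any result from Hilbert et al (2020, 2021); the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the observed outcome."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; equals 0.5 for a constant binary predictor."""
    recalls = []
    for cls in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def majority_baseline(y_true):
    """Accuracy of always predicting the most frequent outcome."""
    return Counter(y_true).most_common(1)[0][1] / len(y_true)


# Hypothetical outcomes: 1 = responder, 0 = non-responder (not study data).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]

acc = accuracy(y_true, y_pred)           # 8 of 10 correct
bal = balanced_accuracy(y_true, y_pred)  # mean of 5/6 and 3/4
base = majority_baseline(y_true)         # 6 of 10
```

When responders and non-responders are unevenly represented, comparing against the majority baseline or reporting balanced accuracy gives a fairer picture than comparing raw accuracy against 50%.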
These studies collectively highlight the promise of integrating advanced AI method-
ologies with clinical practice to enhance the early prediction of CBT treatment
outcomes, offering potential pathways for more tailored and effective therapeutic
interventions.
Therapist treatment quality assessment
Given the prevalence of mental health issues, ensuring the quality of psychotherapy
is crucial to addressing the growing mental health demands and the complexities of
the social environment. Traditionally, quality assessment is performed by human eval-
uators who listen to therapy recordings and review therapy notes to assess specific
therapeutic skills. However, this approach is costly and time-consuming, resulting
in limited feasibility and hindering widespread implementation in practical settings.
To address these challenges, some researchers and technology developers have begun
exploring the use of automated techniques to monitor the quality of psychotherapy.
AI offers automated solutions for assessing therapy sessions and monitoring treatment
fidelity of CBT sessions Chen et al (2022a). For example, Ewbank et al (2020) used
a large-scale dataset containing session transcripts from more than 14000 patients
receiving internet-enabled CBT (IECBT) to train a deep learning model to automat-
ically categorize therapist utterances according to the role that they play in therapy,
generating a quantifiable measure of treatment delivered. In this analysis, the more closely
the therapist’s content aligns with standard CBT protocols, the stronger the positive
association with symptom improvement in patients. Conversely, the quantity
of content unrelated to therapy shows a negative association. This method allows
for the indirect assessment of the efficacy of CBT psychotherapy provided by the
therapist. However, Flemotomos et al (2021) emphasize that, for CBT, the most com-
monly utilized coding scheme is the Cognitive Therapy Rating Scale (CTRS), which
defines a set of 11 session-level codes reflecting skills and techniques specific to the
intervention. Consequently, they introduced a model for quality assessment of psy-
chotherapy sessions based on adapted BERT representations of therapy language use.
Their analysis focused on the binary classification of CBT sessions concerning the over-
all CTRS score. Chen et al (2022b) also utilized CTRS scores to assess the quality of
CBT sessions. However, unlike Flemotomos et al (2021), they proposed a hierarchical
framework for automatically evaluating the quality of transcribed CBT interactions.
Ardulov et al (2022) proposed an approach that explicitly focuses on control-affine
dynamical system models. They attempted to extract local dynamic modes from short
windows of conversation and learn to correlate the observed dynamics with CBT com-
petence. Furthermore, some studies aim to identify areas where the therapist excels
and areas where improvement is needed by analyzing recordings of therapy sessions,
comparing therapists’ language and behavior against standards of specific therapeutic
models Stirman et al (2021); Flemotomos et al (2022); Zhang et al (2023); Wang et al
(2024a). This process aids therapists in enhancing their professional skills, thereby
improving the overall quality of therapy. Particularly, Wang et al (2024a) developed
PATIENT-Ψ, a novel patient simulation framework for CBT training. Specifically,
they constructed diverse patient profiles and corresponding cognitive models based
on CBT principles, and used a large language model to act as a simulated therapy
patient. This role-playing therapeutic scenario helps mental health trainees practice
CBT skills.
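The binary session-level target used in CTRS-based quality assessment can be sketched as follows. The number of codes (11) comes from the text above; the per-item range of 0 to 6 and the cut-off of 40 on the total score are assumptions made for illustration, and the cited studies may define the split differently. All names and the example ratings are hypothetical.

```python
CTRS_ITEMS = 11            # number of session-level codes, as stated in the text
ITEM_MIN, ITEM_MAX = 0, 6  # assumed per-item rating range
THRESHOLD = 40             # assumed cut-off on the total score


def ctrs_total(item_scores):
    """Validate the item ratings and return the overall CTRS score."""
    if len(item_scores) != CTRS_ITEMS:
        raise ValueError(f"expected {CTRS_ITEMS} item scores")
    if any(not ITEM_MIN <= s <= ITEM_MAX for s in item_scores):
        raise ValueError("item score outside the assumed rating range")
    return sum(item_scores)


def label_session(item_scores, threshold=THRESHOLD):
    """Return 1 if the total reaches the threshold, else 0."""
    return int(ctrs_total(item_scores) >= threshold)


# Hypothetical ratings for two sessions.
session_a = [4, 4, 3, 4, 5, 3, 4, 4, 3, 4, 4]  # total 42
session_b = [3] * 11                           # total 33

label_a = label_session(session_a)
label_b = label_session(session_b)
```

A text classifier trained on session transcripts would then be evaluated against these binary labels.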
Predictive analysis of clients’ treatment adherence
Treatment adherence refers to the extent to which clients actively engage in and com-
ply with the advice and instructions provided by healthcare professionals, thereby
adhering to the treatment regimen, and it directly impacts the effectiveness and outcomes
of the treatment DiMatteo et al (2002). Therefore, evaluating client adherence is
an essential component of effective healthcare delivery. Particularly, against the backdrop
of ICBT, there has been a reduction in face-to-face interactions between healthcare
professionals and clients, which may lead to low client engagement and high dropout
rates. AI models can analyze client behavior and responses during therapy sessions
to evaluate their understanding, engagement, and adherence to treatment content.
Côté-Allard et al (2022) present a minimally data-sensitive approach, based on a
self-attention deep neural network, to perform adherence forecasting of clients undergoing
G-ICBT. This study leverages between 7 and 42 days of user-interaction data
(login/logout) from the eMeistring platform. This analysis can identify individuals who may
be at risk of early dropout from intervention. With this information, clinicians can
implement meaningful and targeted interventions such as reminders, scheduling direct
interactions, or modifying the treatment approach to prevent premature termination.
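Login/logout data of this kind must first be turned into a fixed-length sequence before any forecasting model can use it. The sketch below shows one plausible preparation step: aggregating sessions into per-day counts and minutes over an observation window. It is not the self-attention network of Côté-Allard et al (2022); the function names, the feature choice, and the example events are hypothetical.

```python
from datetime import datetime


def daily_usage(events, start, n_days):
    """Return per-day (session count, minutes online) for the first n_days.

    events: list of (login, logout) datetime pairs for one client.
    start:  datetime of treatment start (day 0 is the start date).
    A session is attributed to the day on which the login occurred.
    """
    counts = [0] * n_days
    minutes = [0.0] * n_days
    for login, logout in events:
        day = (login.date() - start.date()).days
        if 0 <= day < n_days:
            counts[day] += 1
            minutes[day] += (logout - login).total_seconds() / 60
    return list(zip(counts, minutes))


def days_since_last_login(sequence):
    """Number of trailing days in the window without any session."""
    gap = 0
    for count, _ in reversed(sequence):
        if count > 0:
            break
        gap += 1
    return gap


# Hypothetical events for one client during the first week of treatment.
start = datetime(2024, 1, 1)
events = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30)),
    (datetime(2024, 1, 1, 20, 0), datetime(2024, 1, 1, 20, 15)),
    (datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 10, 45)),
]
sequence = daily_usage(events, start, n_days=7)
inactive_days = days_since_last_login(sequence)
```

The resulting sequence could serve as input to a sequence model, while the trailing-inactivity count is the sort of simple signal a clinician-facing reminder rule might use.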
4 Datasets
Datasets play a crucial role in driving research at the intersection of CBT and AI, pro-
viding foundational material for training, testing, and validating AI algorithms, and
furnishing practitioners and researchers with the bedrock for constructing and evaluating
CBT models. This section aims to review existing publicly available datasets relevant to the
application of CBT and AI. In the context of disease diagnosis and assessment tasks,
numerous datasets have already been extensively reviewed in various comprehensive
articles. Consequently, we will not elaborate on these datasets further in this paper.
Moreover, it is worth noting that for the task of selecting personalized treatment strate-
gies, there is a noticeable lack of publicly available datasets. This paper will not cover
these datasets in detail. Instead, we focus on datasets for specific CBT-related tasks
such as identifying and classifying cognitive distortions, conducting cognitive restruc-
turing, and analyzing CBT conversation data. For clarity, datasets where descriptions
are unclear or ambiguous in the literature will not be discussed in this paper.
Table 2 provides a comprehensive overview of datasets used for detecting and
classifying cognitive distortions. As can be seen from this table, most of the cognitive
distortion datasets are in English, and fewer are in Chinese. Moreover, these
datasets present several notable challenges. First, reliability is low: the labeling criteria
vary significantly across different datasets, leading to inconsistencies. Moreover,
the inherently subjective nature of labeling cognitive distortions further compromises
reliability. Second, there is a pronounced issue of data imbalance, with certain classes
of cognitive distortions lacking adequate representation. This imbalance hampers the
model’s ability to generalize well across all classes. Table 3 summarizes datasets rel-
evant to cognitive restructuring. Additionally, Table 4 outlines datasets that involve
CBT conversations. High-quality, annotated datasets are particularly
scarce for both cognitive restructuring and CBT conversation analysis.
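One common way to counter the class imbalance described above is to weight each class inversely to its frequency during training. The sketch below applies this to the MH-D subset of Shickel et al (2020), using the counts listed in Table 2 (1605 distorted and 194 non-distorted texts). The weighting formula is a widely used heuristic rather than a method taken from the reviewed papers, and the function name is hypothetical.

```python
def inverse_frequency_weights(label_counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    total = sum(label_counts.values())
    k = len(label_counts)
    return {label: total / (k * count) for label, count in label_counts.items()}


# Counts reported for the MH-D subset of Shickel et al (2020) in Table 2.
mh_d = {"distorted": 1605, "not_distorted": 194}
weights = inverse_frequency_weights(mh_d)

# With these weights, both classes contribute equally to a weighted loss.
weighted_mass = {label: weights[label] * n for label, n in mh_d.items()}
```

The minority class receives a weight roughly eight times larger than the majority class, so errors on rare labels are penalized more heavily.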
Table 2: Cognitive Distortions Dataset. For a clean presentation, we show only
a simplified version of the cognitive distortion datasets; see Appendix A for more
details.
Study | Description
Wang et al (2023b) | Size of Dataset: A total of 1644 data entries across 11 types of cognitive distortions, and 2000 entries for normal cases. Language: Not reported. Data Modalities: Text.
Elsharawi and El Bolock (2024) | Size of Dataset: A total of 34370 samples across 14 types of cognitive distortions. Language: English. Data Modalities: Text.
Shickel et al (2020) | Size of Dataset: Dataset CrowdDist contained 7,666 texts across all 15 distortions, with an average of 511 responses per distortion. Dataset MH contained two subsets: MH-C was annotated with 15 cognitive distortion labels with 1164 distorted texts, and the MH-D dataset was annotated with binary distorted/non-distorted labels, distorted for 1605 texts, not distorted for 194 texts. Language: English. Data Modalities: Text.
Lim et al (2024) | Size of Dataset: A total of 2530 samples across 10 types of cognitive distortions. Language: English. Data Modalities: Text.
Shreevastava and Foltz (2021) | Size of Dataset: A total of 3000 samples; 39.2% were marked as not distorted, while the remainder were assigned to 10 types of distortions. Language: English. Data Modalities: Text.
de Toledo Rodriguez et al (2021) | Size of Dataset: A total of 200 samples across 14 types of cognitive distortions. Language: English. Data Modalities: Text.
Sharma et al (2023c) | Size of Dataset: A total of 1077 samples across 13 types of cognitive distortions. Language: English. Data Modalities: Text.
Maddela et al (2023) | Size of Dataset: About 10k samples across 10 types of cognitive distortions. Language: English. Data Modalities: Text.
Wang et al (2023a) | Size of Dataset: 7,500 cognitive distortion thoughts across 7 types of common cognitive distortions. Language: Chinese. Data Modalities: Text.
Qi et al (2023) | Size of Dataset: A total of 3407 posts across 12 types of cognitive distortions. Language: Chinese. Data Modalities: Text.
Na (2024) | Size of Dataset: A total of 22,327 samples across 10 types of cognitive distortions. Language: Chinese. Data Modalities: Text.
Study | Tasks | Dataset source | Description
Lin et al (2024) | Detection of Cognitive Distortion and cognitive restructuring. | The corpus labeling utilizes one specialized opensource dataset, the Chinese psychological Q&A dataset PsyQA Sun et al (2021). | Size of Dataset: A total of 1900 sentences. Component: original text and reconstruction text. Language: Chinese. Data Modalities: Text. Open Source: Yes [2].
Shidara et al (2022) | Identification and evaluation of automatic thoughts and cognitive restructuring. | Recruit participants. | Size of Dataset: Not reported. Language: Japanese. Data Modalities: Text. Open Source: Yes. It can be obtained by sending an email.
5 Discussion
The integration of AI into CBT has led to significant advances in pre-treatment assess-
ment, the therapeutic process, and post-treatment follow-up. First, AI has improved
[1] Sharma et al (2023c): https://github.com/behavioral-data/Cognitive-Reframing
[2] Lin et al (2024): https://github.com/405200144/Dataset-of-Cognitive-Distortion-detection-and-Positive-Reconstruction
[1] Lee et al (2023): https://github.com/behavioral-data/Empathy-Mental-Health
Table 4: CBT session dataset.
Study | Tasks | Dataset source | Description
Lee et al (2023) | Enhancing Empathetic Response of Large Language Models Based on Psychotherapy Models. | Crowdsourced Reddit posts of mental health from Sharma et al (2020). | Size of Dataset: Three levels of empathy strategies, and the number of pairs for each strategy was as follows: “emotion reaction”=1,047, “exploration”=481, and “interpretation”=1,436. Language: English. Data Modalities: Text. Open Source: Yes [1].
The future of AI in CBT holds great potential. Autonomous learning and adap-
tive therapy systems can emulate human CBT therapists, engaging in multi-round
interactions with patients and adjusting strategies based on real-time feedback. Group
intelligence support and decision-making systems can improve both therapist guidance
and patient outcomes by aggregating the expertise of experienced practitioners and
facilitating intelligent social support. Cognitive augmentation and assistive systems
can develop personalized tools to enhance cognitive function, thereby increasing the
effectiveness of CBT. Customized, personalized CBT models can adapt to users’ spe-
cific data, such as social background, culture, education, and environment, to provide
tailored responses and interventions that increase therapeutic effectiveness and user
satisfaction. Despite the potential of AI, several challenges need to be addressed. Data
security and privacy are paramount, requiring compliance with privacy regulations,
anonymization of sensitive information, and advanced encryption techniques. Ethical
and algorithmic bias must be mitigated by ensuring data diversity, involving multiple
stakeholders in development, and continuously monitoring AI systems. Model explain-
ability and transparency are essential for responsible AI decisions, requiring methods
to improve interpretability and the use of rigorously tested models. Over-reliance on
AI is a risk because the success of CBT depends on the therapist-patient relationship,
which AI cannot replicate. AI should be a complementary tool, not a replacement.
Evaluation of models in clinical practice is necessary to assess real-world effectiveness,
acceptance, trust, and usability, taking into account training costs and impact on med-
ical practice. In summary, while AI offers promising enhancements to CBT, it must
be used responsibly and ethically, complementing the guidance of professional thera-
pists. The ultimate goal is to use technology to support, not replace, human-centered
mental health care.
6 Conclusion
In this paper, we have conducted a comprehensive literature review of the integration
of AI technology into CBT. We explored the application of AI throughout the CBT
process, highlighting its significant transformative impact and existing limitations.
Subsequently, we summarized publicly available datasets relevant to various
CBT-related tasks to provide a foundation for future research. We suggested future
research directions and acknowledged the practical challenges that AI faces in clinical
settings. Overall, our review illuminates the multifaceted integration of AI in CBT,
highlighting its potential while providing a nuanced understanding of its capabilities.
We hope that the findings will guide future research, bring new perspectives to clinical
practice, and contribute to the advancement of mental health care.
Acknowledgements. This work was supported by grants from the National Natural
Science Foundation of China (grant numbers: 72174152, 72304212 and 82071546),
Wuhan University Innovation and Entrepreneurship Projects for College Students
(No: 202410486100), Fundamental Research Funds for the Central Universities (grant
numbers: 2042022kf1218; 2042022kf1037), and the Young Top-notch Talent Cultiva-
tion Program of Hubei Province. Guanghui Fu is supported by a Chinese Government
Scholarship provided by the China Scholarship Council (CSC).
Declarations
• Competing Interests: The authors have no competing interests to declare that are
relevant to the content of this article.
Appendix A Cognitive distortion dataset
In this section, we provide full details of the datasets described in Section 4.
Table A1: Cognitive Distortions Dataset (continued)
Study | Tasks | Dataset source | Description
Lim et al (2024) | Detection and Classification of Cognitive Distortions. | Dataset Therapist Q&A comes from crowd-sourced data science repository, Kaggle. | Size of Dataset: A total of 2530 samples. Label: 10 labels. Language: English. Data Modalities: Text. Open Source: Yes [1].
de Toledo Rodriguez et al (2021) | Convert negative or distorted thoughts into more realistic alternatives. | A variety of sources such as CBT books, forums and public content aggregators. | Size of Dataset: A total of 200 samples. Label: 14 labels. Language: English. Data Modalities: Text. Open Source: Yes [2].
Sharma et al (2023c) | Cognitive Reframing of Negative Thoughts through Human-Language Model Interaction. | Thought Records Dataset and Mental Health America (MHA) website. | Size of Dataset: A total of 1077 samples. Label: 13 labels. Language: English. Data Modalities: Text. Open Source: Yes [3].
Maddela et al (2023) | Training models to generate, recognize, and reframe unhelpful thoughts. | Obtain a diversity of contexts, situations and thoughts from PERSONA-CHAT dataset Zhang et al (2018), and ask crowdworkers to rewrite them. | Size of Dataset: About 10k examples of thoughts containing unhelpful thought. Label: 10 labels. Language: English. Data Modalities: Text. Open Source: Yes [4]. However, this article says it will be shared and gives a github link, but it does not contain data.
Wang et al (2023a) | Cognitive Distortion Detection and investigate the association between cognitive distortions and mental health. | Carefully select and train volunteers to observe scenes and note possible cognitive distortions, then have experts evaluate the results. | Size of Dataset: 7,500 cognitive distortion thoughts. Label: 7 common cognitive distortions. Language: Chinese. Data Modalities: Text. Open Source: Yes [5].
Na (2024) | Enhance the precision and efficacy of psychological support through LLMs. | Get PsyQA Questions Sun et al (2021), which is derived from the Chinese online mental health support forum Yixinli, then utilizes CBT Prompt to generate CBT answers. | Size of Dataset: A total of 22,327 samples. Label: 10 labels. Language: Chinese. Data Modalities: Text. Open Source: Yes. Follow the data copyright protocols and obtain it by sending an email. Note: The questions in the dataset originate from online mental health forum, and the responses are generated by ChatGPT, not professionals.
[1] Lim et al (2024): https://www.kaggle.com/datasets/sagarikashreevastava/cognitive-distortion-detetction-dataset
[2] de Toledo Rodriguez et al (2021): https://github.com/itoledorodriguez/cbt-dataset
[3] Sharma et al (2023c): https://github.com/behavioral-data/Cognitive-Reframing
[4] Maddela et al (2023): https://github.com/facebookresearch/ParlAI/tree/main/projects/reframe_thoughts
[5] Wang et al (2023a): https://github.com/bcwangavailable/C2D2-Cognitive-Distortion
[6] Qi et al (2023): https://github.com/HongzhiQ/SupervisedVsLLM-EfficacyEval
Appendix B Abbreviation
In this section, as shown in Table B2, we summarize the full names and abbreviations
of various specialized terms mentioned throughout the text. By providing this list of
abbreviations, our aim is to assist readers in quickly referencing and understanding
the meanings of these terms, thereby enhancing comprehension of the content and its
context within the paper.
References
Abd-Alrazaq AA, Alajlani M, Alalwan AA, et al (2019) An overview of the features
of chatbots in mental health: A scoping review. International journal of medical
informatics 132:103978
Alsharif AH, Philip N (2015) Cognitive behavioural therapy embedding smoking ces-
sation program using smart phone technologies. In: 2015 5th World Congress on
Information and Communication Technologies (WICT), IEEE, pp 134–139
Ardulov V, Creed TA, Atkins DC, et al (2022) Local dynamic mode of cognitive
behavioral therapy. arXiv preprint arXiv:2205.09752
Ball TM, Stein MB, Ramsawh HJ, et al (2014) Single-subject anxiety treat-
ment outcome prediction using functional neuroimaging. Neuropsychopharmacology
39(5):1254–1261
Beck AT, Epstein N, Brown G, et al (1988) An inventory for measuring clinical anxiety:
psychometric properties. Journal of consulting and clinical psychology 56(6):893
Beck JS, Beck AT (2011) Cognitive behavior therapy: Basics and beyond. New York:
Guilford Publications, pp 19–20
Benard AO, Lukandu IA (2018) A q-learning model for cognitive behavioural therapy
of insomnia patients. Int J Comput Inf Technol 7(3):1–7
Brave S, Nass C, Hutchinson K (2005) Computers that care: investigating the effects
of orientation of emotion exhibited by an embodied computer agent. International
journal of human-computer studies 62(2):161–178
Burns DD, Beck AT (1999) Feeling good: The new mood therapy. Avon New York
Chen Z, Flemotomos N, Imel ZE, et al (2022a) Leveraging open data and task
augmentation to automated behavioral coding of psychotherapy conversations in
low-resource scenarios. arXiv preprint arXiv:2210.14254
China Cognitive Behavioral therapy professional organization (2024) China’s first ccbt
psychological self-service platform. https://www.psy.com.cn/therapy/, accessed:
2024-06-19
adults: systematic review and meta-analysis. Journal of medical Internet research
22(9):e17831
DiMatteo MR, Giordani PJ, Lepper HS, et al (2002) Patient adherence and medical
treatment outcomes: a meta-analysis. Medical care 40(9):794–811
of the 2022 conference of the North American chapter of the association for compu-
tational linguistics: human language technologies: Student Research Workshop, pp
68–75
Foa EB, Riggs DS, Dancu CV, et al (1993) Reliability and validity of a brief instrument
for assessing post-traumatic stress disorder. Journal of traumatic stress 6(4):459–473
Foreman EI, Pollard C (2016) Cognitive Behavioural Therapy (CBT): Your Toolkit to
Modify Mood, Overcome Obstructions and Improve Your Life. Icon Books, Limited
Furukawa TA, Iwata S, Horikoshi M, et al (2023) Harnessing ai to optimize thought
records and facilitate cognitive restructuring in smartphone cbt: An exploratory
study. Cognitive Therapy and Research 47(6):887–893
Graham S, Depp C, Lee EE, et al (2019) Artificial intelligence for mental health and
mental illnesses: an overview. Current psychiatry reports 21:1–18
Harrison V, Proudfoot J, Wee PP, et al (2011) Mobile mental health: review of the
emerging field and proof of concept study. Journal of mental health 20(6):509–524
Heapy AA, Higgins DM, Goulet JL, et al (2017) Interactive voice response–based
self-management for chronic back pain: the copes noninferiority randomized trial.
JAMA Internal Medicine 177(6):765–773
Helbig S, Fehm L (2004) Problems with homework in cbt: Rare exception or rather
frequent? Behavioural and cognitive psychotherapy 32(3):291–301
Hilbert K, Jacobi T, Kunas SL, et al (2021) Identifying cbt non-response among ocd
outpatients: A machine-learning approach. Psychotherapy Research 31(1):52–62
Jacobson NS, Dobson KS, Truax PA, et al (1996) A component analysis of cognitive-
behavioral treatment for depression. Journal of Consulting and Clinical Psychology
64(2):295
Jameel L (2020) Virtual-reality assisted cbt for social difficulties: a feasibility study
in early intervention for psychosis services. PhD thesis, King’s College London
Jang S, Kim JJ, Kim SJ, et al (2021) Mobile app-based chatbot to deliver cognitive
behavioral therapy and psychoeducation for adults with attention deficit: A devel-
opment and feasibility/usability study. International journal of medical informatics
150:104440
Jeong Y, Song JJ, Yang J, et al (2024) Advancing tinnitus therapeutics: Gpt-2 driven
clustering analysis of cognitive behavioral therapy sessions and google t5-based
predictive modeling for thi score assessment. IEEE Access
Khare SK, Blanes-Vidal V, Nadimi ES, et al (2023) Emotion recognition and artifi-
cial intelligence: A systematic review (2014–2023) and research recommendations.
Information Fusion p 102019
Kozlowski M, Gabor-Siatkowska K, Stefaniak I, et al (2023) Enhanced emotion and
sentiment recognition for empathetic dialogue system using big data and deep learn-
ing methods. In: International Conference on Computational Science, Springer, pp
465–480
Lan Z, Chen M, Goodman S, et al (2019) Albert: A lite bert for self-supervised learning
of language representations. arXiv preprint arXiv:1909.11942
LeBeau RT, Davies CD, Culver NC, et al (2013) Homework compliance counts in
cognitive-behavioral therapy. Cognitive behaviour therapy 42(3):171–179
Li JZ, Herderich A, Goldenberg A (2024) Skill but not effort drive gpt overperformance
over humans in cognitive reframing of negative scenarios. Preprint posted online on
April
Lim S, Kim Y, Choi CH, et al (2024) Erd: A framework for improving llm reasoning
for cognitive distortion classification. arXiv preprint arXiv:2403.14255
Madhu SH, Kumar SS, Pal M, et al (2022) Activity recognition for behavioral acti-
vation in depression with artificial intelligence. In: 2022 IEEE 4th PhD Colloquium
on Emerging Domain Innovation and Technology for Society (PhD EDITS), IEEE,
pp 1–2
Malgaroli M, Hull TD, Zech JM, et al (2023) Natural language processing for mental
health interventions: a systematic review and research framework. Translational
Psychiatry 13(1):309
Maples-Keller JL, Bunnell BE, Kim SJ, et al (2017) The use of virtual reality tech-
nology in the treatment of anxiety and other psychiatric disorders. Harvard review
of psychiatry 25(3):103–113
McHugh RK, Hearon BA, Otto MW (2010) Cognitive behavioral therapy for substance
use disorders. Psychiatric Clinics 33(3):511–525
Mehta A, Niles AN, Vargas JH, et al (2021) Acceptability and effectiveness of artificial
intelligence therapy for anxiety and depression (youper): Longitudinal observational
study. Journal of medical Internet research 23(6):e26771
Michelle TQY, Jarzabek S, Wadhwa B (2014) Cbt assistant: Mhealth app for psy-
chotherapy. In: 2014 IEEE Global Humanitarian Technology Conference-South Asia
Satellite (GHTC-SAS), IEEE, pp 135–140
Na H (2024) Cbt-llm: A chinese large language model for cognitive behavioral therapy-
based mental health question answering. arXiv preprint arXiv:2403.16008
Nazarova D (2023) Application of artificial intelligence in mental healthcare: Genera-
tive pre-trained transformer 3 (gpt-3) and cognitive distortions. In: Proceedings of
the Future Technologies Conference, Springer, pp 204–219
Olatunji BO, Cisler JM, Deacon BJ (2010) Efficacy of cognitive behavioral therapy for
anxiety disorders: a review of meta-analytic findings. Psychiatric Clinics 33(3):557–
577
Oliveira ALS, Matos LN, Junior MC, et al (2021) An initial assessment of a chatbot
for rumination-focused cognitive behavioral therapy (rfcbt) in college students. In:
Computational Science and Its Applications–ICCSA 2021: 21st International Con-
ference, Cagliari, Italy, September 13–16, 2021, Proceedings, Part VI 21, Springer,
pp 549–564
Ørskov PT, Lichtenstein MB, Ernst MT, et al (2022) Cognitive behavioral ther-
apy with adaptive virtual reality exposure vs. cognitive behavioral therapy with in
vivo exposure in the treatment of social anxiety disorder: A study protocol for a
randomized controlled trial. Frontiers in psychiatry 13:991755
Pan MR, Huang F, Zhao MJ, et al (2019) A comparison of efficacy between cognitive
behavioral therapy (cbt) and cbt combined with medication in adults with attention-
deficit/hyperactivity disorder (adhd). Psychiatry research 279:23–33
Parry GD, Cooper CL, Moore JM, et al (2012) Cognitive behavioural intervention
for adults with anxiety complications of asthma: prospective randomised trial.
Respiratory Medicine 106(6):802–810
Penninx BW, Benros ME, Klein RS, et al (2022) How covid-19 shaped mental health:
from infection to pandemic effects. Nature medicine 28(10):2027–2037
Peretz G, Taylor CB, Ruzek JI, et al (2023) Machine learning model to predict assign-
ment of therapy homework in behavioral treatments: Algorithm development and
validation. JMIR Formative Research 7:e45156
Petersen TJ, Sprich SE, Wilhelm S, et al (2016) The massachusetts general hospital
handbook of cognitive behavioral therapy. Tech. rep., Springer
Piette JD, Krein SL, Striplin D, et al (2016) Patient-centered pain care using artificial
intelligence and mobile health tools: protocol for a randomized study funded by the
us department of veterans affairs health services research and development program.
JMIR research protocols 5(2):e4995
Rahman MA, Brown DJ, Shopland N, et al (2022) Towards machine learning driven
self-guided virtual reality exposure therapy based on arousal state detection from
multimodal data. In: International Conference on Brain Informatics, Springer, pp
195–209
Reger GM, Hoffman J, Riggs D, et al (2013) The “pe coach” smartphone applica-
tion: An innovative approach to improving implementation, fidelity, and homework
adherence during prolonged exposure. Psychological services 10(3):342
Rizea A (2022) Deep learning-based solution for mental health issues. DATABASE
SYSTEMS p 57
Rose RD, Buckey Jr JC, Zbozinek TD, et al (2013) A randomized controlled trial
of a self-guided, multimedia, stress management and resilience training program.
Behaviour research and therapy 51(2):106–112
Sabour S, Zhang W, Xiao X, et al (2023) A chatbot for mental health support: explor-
ing the impact of emohaa on reducing mental distress in china. Frontiers in digital
health 5:1133987
Sharma A, Lin IW, Miner AS, et al (2023a) Human–ai collaboration enables more
empathic conversations in text-based peer-to-peer mental health support. Nature
Machine Intelligence 5(1):46–57
Sharma A, Rushton K, Lin IW, et al (2023b) Facilitating self-guided mental health
interventions through human-language model interaction: A case study of cognitive
restructuring. arXiv preprint arXiv:2310.15461
Shurick AA, Hamilton JR, Harris LT, et al (2012) Durable effects of cognitive
restructuring on conditioned fear. Emotion 12(6):1393
Stirman SW, Gutner CA, Gamarra J, et al (2021) A novel approach to the assessment
of fidelity to a cognitive behavioral therapy for ptsd using clinical worksheets: A
proof of concept with cognitive processing therapy. Behavior therapy 52(3):656–672
randomized controlled trial protocol. Frontiers in Psychiatry 12:799917
Sun H, Lin Z, Zheng C, et al (2021) Psyqa: A chinese dataset for generating long
counseling text for mental health support. In: Findings of the Association for
Computational Linguistics: ACL-IJCNLP 2021, pp 1489–1503
Tanana MJ, Soma CS, Kuo PB, et al (2021) How do you feel? using natural lan-
guage processing to automatically rate emotion in psychotherapy. Behavior research
methods pp 1–14
Tolmeijer E, Kumari V, Peters E, et al (2018) Using fmri and machine learning to pre-
dict symptom improvement following cognitive behavioural therapy for psychosis.
NeuroImage: Clinical 20:1053–1061
Vieira S, Liang X, Guiomar R, et al (2022) Can we predict who will benefit from
cognitive-behavioural therapy? a systematic review and meta-analysis of machine
learning studies. Clinical Psychology Review 97:102193
Vuyyuru VA, Krishna GV, Mary SSC, et al (2023) A transformer-cnn hybrid model
for cognitive behavioral therapy in psychological assessment and intervention for
enhanced diagnostic accuracy and treatment efficiency. International Journal of
Advanced Computer Science and Applications 14(7)
Wang B, Deng P, Zhao Y, et al (2023a) C2d2 dataset: A resource for the cognitive
distortion analysis and its impact on mental health. In: The 2023 Conference on
Empirical Methods in Natural Language Processing
Wang R, Milani S, Chiu JC, et al (2024a) PATIENT-Ψ: Using large language mod-
els to simulate patients for training mental health professionals. arXiv preprint
arXiv:2405.19660
Wang X, Sharma D, Kumar D (2024b) Cognitive reframing via large language models
for enhanced linguistic attributes. In: The Second Tiny Papers Track at ICLR 2024
Whiteside SP, Ale CM, Vickers Douglas K, et al (2014) Case examples of enhanc-
ing pediatric ocd treatment with a smartphone application. Clinical Case Studies
13(1):80–94
Yang S, Gao B, Jiang L, et al (2018) Iot structured long-term wearable social sensing
for mental wellbeing. IEEE Internet of Things Journal 6(2):3652–3662
Yin Y, Jia N, Wakslak CJ (2024) Ai can help people feel heard, but an ai
label diminishes this impact. Proceedings of the National Academy of Sciences
121(14):e2319112121
Zhan H, Zheng A, Lee YK, et al (2024) Large language models are capable of offering
cognitive reappraisal, if guided. arXiv preprint arXiv:2404.01288
Zhang X, Tanana M, Weitzman L, et al (2023) You never know what you are
going to get: Large-scale assessment of therapists’ supportive counseling skill use.
Psychotherapy 60(2):149