Benke et al., 2022. Understanding the Impact of Emotion-Aware Chatbots
ARTICLE INFO

Keywords: Emotion-aware chatbot; Autonomy; Trust; Cognitive effort; Control strategy; Experiment

ABSTRACT

Emotion-aware chatbots that can sense human emotions are becoming increasingly prevalent. However, the exposition of emotions by emotion-aware chatbots undermines human autonomy and users' trust. One way to ensure autonomy is through the provision of control. Offering too much control, in turn, may increase users' cognitive effort. To investigate the impact of control over emotion-aware chatbots on autonomy, trust, and cognitive effort, as well as user behavior, we carried out an experimental study with 176 participants. The participants interacted with a chatbot that provided emotional feedback and were additionally able to control different chatbot dimensions (e.g., timing, appearance, and behavior). Our findings show, first, that higher control levels increase autonomy and trust in emotion-aware chatbots. Second, higher control levels do not significantly increase cognitive effort. Third, in our post hoc behavioral analysis, we identify four behavioral control strategies based on control feature usage timing, quantity, and cognitive effort. These findings shed light on the individual preferences of user control over emotion-aware chatbots. Overall, our study contributes to the literature by showing the positive effect of control over emotion-aware chatbots and by identifying four behavioral control strategies. With our findings, we also provide practical implications for future design of emotion-aware chatbots.
For example, users could be in control of its overall presence (e.g., in the form of a power button to turn the chatbot on or off), appearance, or behavior (Dale, 2016; Feine et al., 2019). While previous research has predominantly focused on how to develop chatbots and how their design influences user perceptions and behavior (Li & Sung, 2021), it is unclear whether and how control over emotion-aware chatbots can help to ensure autonomy and help users develop trust. Therefore, we articulate the following research question:

RQ1. How does the provision of control over emotion-aware chatbots influence the autonomy and trust of users?

However, the provision of control over emotion-aware chatbots may also backfire (Cummings & Mitchell, 2008). While we assume that an increased level of control increases autonomy, previous research on machine automation has identified negative effects such as increased cognitive effort when overextending the control level (Parasuraman et al., 2008). These findings suggest a trade-off between positive effects on autonomy and negative effects on cognitive effort. This has an impact on the experience with and usage of such chatbots (Shaw et al., 2010). Taken together, the effects of different levels of control over emotion-aware chatbots on autonomy and cognitive effort are still unexplored. Therefore, we propose the following second research question:

RQ2. How do differences between low and high control levels over emotion-aware chatbots influence the autonomy and cognitive effort of users?

Finally, there might be differences in autonomy, trust, and cognitive effort and the subsequent control behavior between different users (Zhou et al., 2019). Since users are distinct in their preferences and experiences, their control behavior may also differ (Skinner, 1996). Similar to users applying privacy strategies to achieve their individual goals of privacy (Lankton et al., 2017), users may apply control strategies to achieve a comfortable level of control (Frazier et al., 2011). To the best of our knowledge, research on control strategies for emotion-aware chatbots is scarce. However, investigating this gap is important since it provides a better understanding of users' decreased autonomy and increased distrust. By observing how users exercise control, it may be possible to determine why such downsides occur. Moreover, understanding users' control strategies may inform the design of future emotion-aware chatbots and their adaptation to distinct user needs. Given these implications, our goal is to identify behavioral control strategies for emotion-aware chatbots by analyzing user behavior. Thus, we formulate a third research question as follows:

RQ3. What are different behavioral control strategies for emotion-aware chatbots that can be identified by user behavior?

To answer these research questions, we conducted an online between-participants experiment with 176 participants. In the experiment, the participants were part of a group chat with an emotion-aware chatbot that provided feedback on the team members' emotions. Depending on the experimental condition, participants had access to a chatbot control center that allowed them to control different dimensions of the chatbot (e.g., timing, appearance, and behavior). Our findings reveal that a higher level of control increases the autonomy of the users and, subsequently, trust in emotion-aware chatbots. Moreover, higher-level control leads to higher autonomy compared to lower-level control, but does not significantly affect cognitive effort. Finally, applying a cluster analysis, we identified four behavioral control strategies that shed light on the individual users' control preferences. With our work, we contribute to research by identifying the positive effect of control levels over emotion-aware chatbots, the beneficial effects of reducing cognitive effort for users, and four user control strategies that allow us to better understand human control behavior of emotion-aware chatbots. From a practical point of view, we provide empirical evidence for the importance of instantiating chatbot control through control features.

2. Theoretical background and hypothesis development

2.1. Autonomy and trust in emotion-aware chatbots

2.1.1. Emotion-aware chatbots

When collaborating via computers, humans face unique challenges regarding the management of their emotions (Pitts et al., 2012). The ability to manage emotions, however, has far-reaching consequences for the individual and the team (Kelly & Barsade, 2001). A promising approach to supporting emotion management is the application of AI-based chatbots. Chatbots are software-based systems designed to interact with humans using natural language (Dale, 2016) and date back to the 1960s (Weizenbaum, 1966). Usually applied as basic service channels in domains such as customer service or health (Araujo, 2018), they have also successfully acted as collaboration facilitators on collaboration platforms such as Microsoft Teams or Slack (e.g., Brandtzaeg & Følstad, 2017).

In recent years, based on the concept of affective computing (Martínez-Miranda & Aldea, 2005), chatbots can be enriched with the ability to sense and understand human emotions. These chatbots are referred to as emotion-aware chatbots (McDuff & Czerwinski, 2018). Emotion-aware chatbots recognize the emotions of their human collaborators via textual, audio, or visual modalities and adapt their behavior based on the collected information (McDuff & Czerwinski, 2018). They have been applied and studied in different contexts, such as social media and customer service (Chattaraman et al., 2019; Xu et al., 2017; L. Zhou et al., 2020), learning (Edwards et al., 2016; Graesser et al., 2017), health (Liu & Sundar, 2018), and enterprise tasks and collaboration support (Kimani et al., 2019; Peng et al., 2019). In general, they have shown positive results in emotion perception, communication efficiency, and performance (Samrose et al., 2018). However, previous research has also identified negative outcomes when users perceived the chatbot's ability to recognize emotions as threatening and displeasing, which led to a decrease in autonomy, trust, and, thereby, aversion to using the chatbot (Benke et al., 2020; McDuff & Czerwinski, 2018).

2.1.2. Control over emotion-aware chatbots and autonomy

If chatbots do not provide the user with the necessary abilities to achieve the intended goals in a pleasant and satisfactory manner, autonomy is reduced (Friedman & Nissenbaum, 1997). Autonomy, also known as human autonomy, is defined as "[…] feeling free to show the behaviors of choice; nonautonomous behaviors […] are reactions to others' agendas and not freely chosen" (Patrick et al., 1993, p. 782).

One way of giving autonomy to users might be through the provision of control. Control is described as a person's estimate that a given behavior will lead to certain outcomes following "response-outcome expectancies" (Bandura, 1977, p. 193, as cited in Skinner, 1996). In the concept of human agency, nothing is more central than people's beliefs in their capability to exercise control over their functional and environmental events (Bandura, 1989, p. 1). Exemplary control features are location-based service settings for navigation applications (e.g., through range sliders) (Ataei et al., 2018) or general privacy settings for Internet-based services (e.g., adjustment of cookies) (Schaub et al., 2017).

Exercising control allows users to act freely from others' agendas and supports autonomy (Deci & Ryan, 1987; Patrick et al., 1993). Such autonomy support operates through multiple pathways. It provides users with choice and minimizes the use of pressure to promote certain behaviors (Deci & Ryan, 1987). It minimizes external controls such as the uncontrollable behavior of the chatbot (Ryan et al., 2011). Similarly, increasing the level of control is associated with reduced distress and positive well-being (Spector, 1986). Based on the effect of autonomy-supportive elements in theory, we argue that the provision of control over emotion-aware chatbots ensures autonomy of the user. Therefore, we formulate the following hypothesis:
H1. Control over emotion-aware chatbots increases users' autonomy.

2.1.3. Autonomy and trust in emotion-aware chatbots

Supporting autonomy has been shown to lead to higher trust between humans (Deci & Ryan, 1987). Such interpersonal trust is defined as the generalized expectancy held by an individual that the promise of the other individual can be relied upon (Rotter, 1980). Additionally, in human-computer interaction, trust is one of the most important affective elements (de Visser et al., 2016). Trust in automated agents is described as "[…] the attitude that an agent will help achieve an individual's goals in a situation characterized by uncertainty and vulnerability" (Lee & See, 2004, p. 54). It is also an important factor in the acceptance and usage of automated agents (Banks, 2019; Lee & Choi, 2017). Early studies showed that humans with higher autonomy develop trust in their superiors in working relationships, while lower autonomy leads to less trust (Deci & Ryan, 1987). The underlying reason is that autonomy creates an environment conducive to trust development through greater interest, less pressure and tension, and higher self-esteem (Wiener et al., 2016). In turn, lower levels of autonomy applied through less user control and higher automation have led to lower trust levels (Deci & Ryan, 1987; Vimalkumar et al., 2021). Simultaneously, trust is essential in the context of automated agents since it directly affects the willingness of people to accept its actions, to decide upon their own actions, and to benefit from it (Gaudiello et al., 2016; Hancock et al., 2011). Emotion-aware chatbots are automated agents and apply human-like features through their ability to sense and understand human emotions. Consequently, we argue that increased autonomy from emotion-aware chatbots triggers trust. Therefore, we propose the following hypothesis:

H2. Autonomy of the user has a positive effect on trust in the emotion-aware chatbot.

2.2. Control levels of emotion-aware chatbots and their impact on autonomy and cognitive effort

A higher level of control in the form of additional behavioral options that can be applied to a system is associated with an increased sense of control (Berberian et al., 2012). This may lead to a higher degree of autonomy when interacting with emotion-aware chatbots.

To understand this ascending relationship, the machine automation literature has framed different control degrees as levels of automation and developed a taxonomy for increasing levels of automation (Endsley & Kaber, 1999). In this taxonomy, a high level of automation represents no human control, while a low level of automation, in turn, stands for high control by users. These different levels of automation have specific effects on human cognition and behavior. Adapting this taxonomy to the control levels of emotion-aware chatbots, we argue that higher levels of control are more likely to ensure autonomy than lower levels. Therefore, we pose the following hypothesis:

H3a. A higher level of control over an emotion-aware chatbot leads to more autonomy compared to a lower level of control.

However, an increased behavioral control level may also come with adverse effects on the users' capacity since it increases the system's complexity (Evans & Fendley, 2017). In detail, previous research on ergonomics in air traffic and human-robot interaction describes complementary negative effects through arising challenges when having additional manual control (Scerbo, 2007). Since controlling specific system aspects requires more dedicated attention to lower-level cognitive activities, fewer resources are available for higher-level cognitive tasks such as communication and strategic coordination between team members in this study's context (Cummings & Mitchell, 2008). In particular, the main effect of a higher attentional demand is an increase in cognitive effort (Hancock & Warm, 1989). Cognitive effort in our context is defined as the attentional, cognitive, or response resources required by the human element of a human-machine system to accomplish the task's requirements (Hart & Wickens, 1990). Increases in cognitive effort further have negative effects on communication effectiveness and task performance (Parasuraman et al., 2008; Scerbo, 2007). For emotion-aware chatbots, findings on adverse effects of different control levels are limited, mainly due to their innovative nature and recent appearance. Based on the derived implications, we translate previous findings to the context of emotion-aware chatbots and propose the following hypothesis:

H3b. A higher level of control over an emotion-aware chatbot leads to higher cognitive effort compared to a lower level of control.

3. Material and methods

3.1. Experimental design and procedure

To test our hypotheses, we conducted an online experiment with three experimental conditions (between-participants design). We assigned participants to be part of a prescripted, fictitious group chat with an emotion-aware chatbot that provided feedback on the team members' emotions. For the emotion-aware chatbot, we instantiated several control features in the form of a chatbot control center as a user interface element. The control features allowed participants to control the dimensions of the chatbot with regard to general activation (i.e., power on/off), appearance, and intervention behavior. On this basis, we introduced three control levels as experimental conditions (see Fig. 1): a baseline condition with no chatbot control (NC condition), a low-level control treatment condition with a general activation feature to turn the chatbot on or off (LCT condition), and a high-level control treatment condition in which all control features were present (HCT condition). Detailed stimuli descriptions are provided in Section 3.3.

In the experiment, participants were randomly assigned to one of the conditions. In each condition, participants were first welcomed, asked to provide informed consent, and introduced to the scenario through an explanatory video. The fictitious scenario consisted of an agile work team with three team members who have an online group chat about upcoming product decisions. Each participant took the role of one of the team members. In the group chat, an emotion-aware chatbot called "Affective Chatbot" sensed team emotions by applying sentiment analysis and intervened several times in the group chat to address potentially arising conflicts between team members (for examples, see middle and bottom panel of Fig. 1). An example of a situation that prompted the emotion-aware chatbot to intervene was when one team member (Kim) expressed displeasure over another team member (Alex) being late to the meeting, resulting in wasted time and lower productivity for everyone. In such situations, the "Affective Chatbot" identified the potential for conflict and intervened by sending specific emotional messages and employing behavioral interventions (e.g., chat breaks). These messages contained information about the team's current emotional state, motivational phrases, and emoticons (see Fig. 1).

In the baseline (NC) condition, participants had no control over the "Affective Chatbot". In the other two treatment conditions, participants were able to control the emotion-aware chatbot through control features available in an "Affective Chatbot Control Center", which was positioned next to the chat window. These groups were given a short introduction to the features of the control stimuli (i.e., the "Affective Chatbot Control Center") of the emotion-aware chatbot in a short tutorial to help the participants become familiar with the stimuli before the experiment started. In all conditions, participants ran through a prescripted, fictitious group chat conversation in which the emotion-aware chatbot intervened. The chat was previously developed in a workshop by six agile professionals from a large national energy company (>40,000 employees) and consisted of 81 individual messages. It represents a real-world conversation from an agile team including abbreviations, grammar mistakes, and other real-world communication turns and was confirmed as a realistic chat experience. Completing the chat took participants on average 7.43 min (SD = 3.01).
Fig. 1. Experimental stimuli: Chat windows of the no control (NC) condition (top), low-level control treatment (LCT) condition (center), and high-level control
treatment (HCT) condition (bottom). (Note: To avoid image rights violations, the images in Fig. 1 (center) have been blurred).
After completing the chat, participants filled out a questionnaire about their perceptions of autonomy, cognitive effort, and trust in the chatbot. Finally, they received a debriefing.

3.2. Participants

We recruited 214 participants for the experiment via the experimental platforms cloudresearch.com and Amazon Mechanical Turk (AMT). The experiment lasted 22 minutes on average, and AMT workers received $4.50 for full participation. We applied a rigorous exclusion strategy based on both survey answers and behavioral measures collected in the form of log events. As an attention check, we included two reverse-coded items and one sample question instructing participants to check the second leftmost answer option. Additionally, we asked about the experienced level of control in the form of control functionalities with three distinct items. Furthermore, participants with unintended and uncontrollable behavior (e.g., repeated browser loading that led to restarting of the chat) were identified. We applied this strong exclusion strategy to ensure valid and realistic behavioral data for the analysis.

Consequently, we excluded 26 participants who failed one of the attention checks and 12 participants whose behavioral data were not analyzable (e.g., through repeated loading of the browser window). The final sample included 176 participants. Of the final sample, 64 participants were female (36.37%) and 112 were male (63.63%), and the average age was 39.56 (SD = 10.02). The data collection took place between November 2020 and January 2021.

3.3. Experimental stimuli

For the experiment, we designed an emotion-aware chatbot that can serve as a moderator in the group chat and intervene when negative emotions arise. It employs design features, such as social cues and behavioral interventions (e.g., chat breaks and feedback), as well as comparing images and inspirational speech (Araujo, 2018). Through this design, it aims to stimulate the emotional capabilities of virtual team members (Benke et al., 2020; McDuff & Czerwinski, 2018; Peng et al., 2019).

As experimental stimuli, we developed a chatbot control panel (i.e., the "Affective Chatbot Control Center") to allow users to control dimensions of the emotion-aware chatbot based on privacy control research (Schaub et al., 2017). Privacy control research has successfully elaborated control dimensions that effectively provide choices over relevant usage aspects for the user, which fulfils a key requirement for autonomy (Schaub et al., 2017). An exemplary control dimension is the accessible range of visibility of individual information on a social network for providing personalized advertisements, which a user can adjust (Lee & Rha, 2016). The resulting chatbot control panel, which was located on the right side of the chat window, contained control features that covered the following control dimensions: (1) an overall "power" feature to turn the chatbot on and off, (2) features to adjust the appearance of the emotion-aware chatbot interactions (i.e., visual or textual presentation of content and presentation of individual or group emotional values), and (3) behavioral features that allow adaptation of blocking or unblocking intervention behaviors and of the intervention timing (proactive or reactive) of the emotion-aware chatbot. In the baseline condition (NC), there was no control panel present and, thus, no control available to the participant. In the first treatment condition, we instantiated a low-level control treatment (LCT) with only the power control feature available. This allowed the participant to turn the chatbot on or off. Additionally, when turned off, all previous chatbot interventions were excluded from the group chat. In the second treatment condition, a high-level control treatment (HCT) with all control features was available to the participants. The chatbot and the control features are displayed in Fig. 1.

3.4. Measures

We collected both perceived and behavioral measures. For perceived measures, we used established items for all constructs (see Appendix B for all measurement items). The items were adapted to the study context and measured on 7-point Likert scales (ranging from 1 = "Strongly disagree" to 7 = "Strongly agree"). Autonomy was assessed using three items based on Pirkkalainen et al. (2017) and Bergeron et al. (1990) by asking the participants if they were able to experience the emotion-aware chatbot according to their will. To measure trust, we used a three-item trust scale conceptualized by McKnight et al. (2002). Finally, we assessed cognitive effort with a four-item scale by Ragu-Nathan et al. (2008) and Wang and Benbasat (2009). For cognitive effort, participants evaluated the degree of effort invested in interaction with the affective chatbot. In addition, we assessed the following control variables in the analysis: age, gender, education level, experience with chatbots, experience with collaboration technologies, experience with agile work practices, and disposition to trust. The three experience variables were measured on a 5-point Likert scale.

To complement the survey responses and identify distinct behavioral control strategies, we logged participants' behavior in the form of interactions with the control features (e.g., power, visibility, or blocking), as well as the start and end of the group chat. The logs contained the participant identification number, a timestamp, and the event name describing the specific control feature usage event (e.g., a click on the power feature). Based on this data, we derived several behavioral measures that meaningfully represent interaction behavior. These include the overall frequency of participant interaction with control features (Freq_overall), the number of changes of the power functionality (Count_power), the final status of the power change (i.e., if the emotion-aware chatbot was on or off, Power_final), and the start time of each participant relative to the control feature usage behavior (Start_relative). This metric was calculated by the following equation: Start_relative = (relative position of the start of the group chat) / (overall number of control events). For example, if a participant started the chat after interacting six times with control features and, in total, performed 14 interactions, the group chat start position was seven, and Start_relative was equal to 0.5. A high value may be interpreted as control-conscious behavior before the chat, while a low value represents simultaneous control behavior of the emotion-aware chatbot during the chat. We used these behavioral measures in our cluster analysis to identify distinct behavioral control strategies.
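For illustration, the following R sketch shows how these behavioral measures could be derived from such an event log. It is a minimal sketch: the data frame `logs`, its column names, and the event labels are hypothetical placeholders, not the study's actual log schema or preprocessing code.

```r
# Sketch: deriving behavioral measures from control-event logs.
# Assumes a data frame `logs` with columns participant_id, timestamp, event,
# where `event` is e.g. "power", "visibility", "blocking", "chat_start", "chat_end".
# Column names and event labels are illustrative assumptions.

derive_measures <- function(logs) {
  do.call(rbind, lapply(split(logs, logs$participant_id), function(p) {
    p <- p[order(p$timestamp), ]
    chat_start_t <- p$timestamp[p$event == "chat_start"][1]
    control <- p[!p$event %in% c("chat_start", "chat_end"), ]   # control-feature clicks only
    freq_overall <- nrow(control)                                 # Freq_overall
    count_power  <- sum(control$event == "power")                 # Count_power
    power_final  <- if (count_power %% 2 == 0) "On" else "Off"    # assumes chatbot starts "On"
    before_start <- sum(control$timestamp < chat_start_t)
    # Paper's worked example: 6 clicks before starting, 14 clicks in total -> (6 + 1) / 14 = 0.5
    start_relative <- if (freq_overall > 0) (before_start + 1) / freq_overall else NA
    data.frame(participant_id = p$participant_id[1],
               freq_overall, count_power, power_final, start_relative)
  }))
}
```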
3.5. Data analysis

We followed a three-step approach in our data analysis to answer the research questions. To analyze whether control (in the form of control features) influences autonomy compared to the absence of control, we applied one-way analysis of variance (ANOVA). Subsequently, we computed a structural equation modeling (SEM) approach to assess the effect of autonomy on trust.

Second, we analyzed the two control conditions (LCT and HCT) in more detail and compared the effects of low-level control (LCT) and high-level control (HCT) experiences on autonomy and the experienced cognitive effort. Since the variables under investigation violate the assumption of multivariate normality, we decided to conduct a nonparametric one-way multivariate analysis of variance (MANOVA) in the form of rank-based ANOVA type statistics (ATS) for multivariate nonparametric data (Dobler et al., 2020).

Third, to analyze user behavior and identify behavioral control strategies, we computed descriptive statistics on the behavioral log data and ran a linear regression model for relationships between behavioral and survey data. Since the effects of cognitive effort in the HCT condition show high variance, we further aimed to categorize control strategies. Therefore, we followed established methods and conducted a two-step cluster analysis approach (Hair et al., 2010). Cluster analysis is an exploratory method for classifying collections of similar objects within a group (Kaufman & Rousseeuw, 1990) and is a well-accepted method of group categorization (Hair et al., 2010). First, we applied an agglomerative hierarchical clustering technique (i.e., the Ward method) to find an initial number of clusters as input for the second step, which was a nonhierarchical k-means clustering analysis. The increase in the within-cluster sum of squares method (elbow method) was used to form the hierarchy using the Euclidean distance measure. The set of possible clusters was examined by observing the fusion level (i.e., the change in the value of the sum of all squares) to choose the cluster solution. The generally accepted procedure, which views large changes in fusion levels as the "best cut" in hierarchical clustering, was used to choose the number of clusters (= k). In the second step, we assigned the data to k clusters with the k-means clustering algorithm using the resulting number of clusters from the first step.

For the first analysis step, we used R software (v. 4.0.3) with the lavaan package (v. 0.6-7); for the second step, we applied the rankManova package (v. 0.0.6); and for the two-step cluster analysis, we used the packages factoextra (v. 1.0.7), NbClust (v. 3.0), and fpc (v. 2.2-9).
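The two-step clustering procedure described above can be sketched in base R as follows (hclust with Ward linkage, then k-means). The data frame `behav` of standardized behavioral measures is a hypothetical placeholder; the study's actual implementation relied on the factoextra, NbClust, and fpc packages, whose exact calls are not reproduced here.

```r
# Two-step clustering sketch: Ward hierarchical clustering to choose k,
# then k-means with that k. `behav` is an illustrative data frame holding
# the behavioral measures per participant.

X <- scale(behav[, c("freq_overall", "count_power", "start_relative")])

# Step 1: agglomerative hierarchical clustering (Ward method, Euclidean distance)
hc <- hclust(dist(X, method = "euclidean"), method = "ward.D2")

# Inspect fusion levels: a large jump between successive merge heights
# suggests the "best cut", i.e., the candidate number of clusters k.
plot(head(rev(hc$height), 10), type = "b",
     xlab = "Number of clusters", ylab = "Fusion level (merge height)")

# Step 2: non-hierarchical k-means with the chosen number of clusters
# (k = 4 is the solution reported in the paper)
k  <- 4
km <- kmeans(X, centers = k, nstart = 25)
behav$cluster <- km$cluster
```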
4. Results

Prior to the main analyses, Table 1 reports descriptive statistics of the experimental variables (autonomy, cognitive effort, and trust) manipulated through the treatments, including the means and standard deviations of key constructs in the groups of experimental treatment conditions. While our main analyses focused on testing our hypotheses regarding the effect of control levels on autonomy (H1), the effect of autonomy on trust (H2), and the differences between low and high control levels (H3a, H3b), we also conducted a supplementary analysis on the direct effect of control levels on trust (see Appendix C for details).

4.1. Measurement characteristics

We first conducted a confirmatory factor analysis to examine the reliability of the measurement scales through Cronbach's α and composite reliability (CR) and validity in the form of the average variance extracted (AVE) and correlations between the items. The measurement characteristics are shown in Table 2.

After removing one item of cognitive effort due to low factor loading, all remaining items loaded strongly on their intended construct, with loadings ranging between 0.69 and 0.95, thus supporting indicator reliability (Hair et al., 2010). Next, we assessed internal reliability by calculating Cronbach's α and composite reliability (CR). For all constructs, Cronbach's α and CR exceed the accepted threshold of 0.70 (Hair et al., 2010). We analyzed discriminant validity by examining the average variance extracted (AVE) values for each construct, which all exceed the suggested threshold of 0.50 (Fornell & Larcker, 1981), and convergent validity by comparing correlations between constructs. Finally, the overall model showed a good fit to the data (χ² = 57.7, CFI = 0.994, TLI = 0.992, RMSEA = 0.034, and SRMR = 0.036). Consequently, we concluded that there was sufficient reliability and good convergent and discriminant validity of the measures.

Table 1
Descriptive statistics of manipulated constructs (mean, standard deviation).

Construct               Experimental conditions
                        NC (n = 59)    LCT (n = 59)   HCT (n = 58)
(1) Autonomy            3.21 (1.65)    4.78 (1.6)     5.38 (1.38)
(2) Cognitive Effort    -*             2.14 (1.29)    3.06 (2.19)
(3) Trust               5.04 (1.01)    4.72 (1.67)    5.43 (1.6)

Note: NC = No control; LCT = Low-level control treatment; HCT = High-level control treatment.
*In the NC condition, cognitive effort was not assessed because cognitive effort focused on the control functionalities only, which were not provided in the NC condition.

4.2. Manipulation check

To check the effects of control levels on the participants and to rule out alternative hypotheses, we conducted two manipulation checks. First, we assessed whether participants used the control features when they were present. Second, we asked the participants to rate the following statement on a 7-point Likert scale (ranging from "Strongly disagree" to "Strongly agree"): "I am allowed to control the affective chatbot." Subsequently, we calculated an ANOVA with this statement as the dependent variable. The results revealed a significant difference in the variable between the experimental conditions (F(2, 173) = 189.6, p < 0.001). The results of a Tukey HSD post hoc comparison showed that there was a significant difference between the NC condition (M = 2.695, SD = 1.754) and the LCT condition (M = 5.475, SD = 1.754), as well as between the NC and the HCT condition (M = 6.103, SD = 1.209) (p < 0.001). These two steps confirm the validity of the manipulation in the treatment conditions.

4.3. Effects of control on autonomy and trust (H1 & H2)

To test the effect of control over emotion-aware chatbots on autonomy (Hypothesis 1), we conducted a one-way ANOVA with autonomy as the dependent variable with the three treatment groups (NC, LCT, and HCT). The ANOVA showed a significant main effect of the treatment group on autonomy (F(2, 175) = 34.84, p < 0.001). Tukey post hoc analyses revealed a significant difference (p < 0.001) between the NC (M = 3.21, SD = 1.57) and LCT conditions (M = 4.78, SD = 1.6) and between the NC and HCT conditions (M = 5.38, SD = 1.38) (p < 0.001). Similar results were obtained when controlling for age, gender, education level, experience with chatbots, experience with collaboration technologies, experience with agile work, and disposition to trust by analysis of covariance (ANCOVA). This result provides support for H1 since control elicits greater levels of autonomy for both treatment conditions.

To test for the effect of autonomy on trust (Hypothesis 2), a structural equation model (SEM) was computed. The model fit was acceptable based on relevant statistics: χ² = 59.28, CFI = 1.0, TLI = 1.005, RMSEA = 0.0, and SRMR = 0.081. The SEM analysis revealed a significant positive effect of autonomy on trust (b = 0.378, p < 0.001; H2 supported). Autonomy explains approximately 14.0% of the variance of trust (R² = 0.143). Furthermore, when including the control variables in the model, this effect did not change. Additionally, we observed significant effects of disposition to trust (b = 0.442, p < 0.001) and experience with chatbots (b = 0.25, p < 0.001) on trust. This also increased R² by ΔR² = 0.22. Taken together, these results suggest that autonomy has a strong effect on users' trust in emotion-aware chatbots.
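Conceptually, the two analyses reported in this subsection correspond to the following R sketch: a one-way ANOVA with Tukey HSD post hoc tests for the effect of condition on autonomy, and a lavaan structural equation model regressing trust on autonomy. The data frame `df`, its columns, and the item names are illustrative assumptions, not the study's actual variable names.

```r
library(lavaan)

# H1: effect of experimental condition (NC / LCT / HCT) on perceived autonomy.
# `df` is an illustrative data frame with one row per participant.
aov_fit <- aov(autonomy ~ condition, data = df)
summary(aov_fit)          # F-test for the main effect of condition
TukeyHSD(aov_fit)         # post hoc pairwise comparisons between conditions

# H2: effect of autonomy on trust, estimated as a structural equation model
# with latent constructs measured by their survey items (item names assumed).
model <- '
  autonomy =~ aut1 + aut2 + aut3
  trust    =~ tru1 + tru2 + tru3
  trust ~ autonomy
'
sem_fit <- sem(model, data = df)
summary(sem_fit, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
fitMeasures(sem_fit, c("chisq", "cfi", "tli", "rmsea", "srmr"))
```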
Table 2
Measurement characteristics (constructs: mean (SD), Cronbach's α, CR, AVE, factor loadings, and inter-construct correlations).

4.4. Differences between low and high control levels (H3a & H3b)

In the second step of the analysis, we specifically focused on the effects of the two treatment conditions with different control levels on autonomy and cognitive effort (Hypotheses 3a, 3b). Therefore, we excluded the NC condition, in which no control was provided, and continued our analyses with a subsample including only the low- and high-level control conditions (LCT and HCT).

The results of the nonparametric rank-based MANOVA based on the wild bootstrap approach reveal a significant effect (p = 0.017) of the treatment condition on the dependent variables, i.e., autonomy and cognitive effort differ significantly between the LCT and HCT conditions. Given the significant main effects, we performed follow-up univariate post hoc comparisons. We found that only the difference in autonomy was significant (p = 0.013), while that in cognitive effort was not (p = 0.271). Subsequently, we performed Tukey's pairwise comparisons and Dunnett's many-to-one comparisons, which showed a significant effect for autonomy (p = 0.014). The results suggest that autonomy is significantly affected by the level of control users have over the emotion-aware chatbot (support of H3a). Interestingly, cognitive effort was not significantly different between the two conditions. Therefore, we reject H3b. It should be noted that the results for cognitive effort in the HCT condition have high variance. This could imply that different users had different perceptions regarding cognitive effort. Consequently, we investigate these individual differences between users in Section 4.5 in more detail. Finally, the results of the hypotheses testing are summarized in Table 3.

4.5. Analysis of behavioral control strategies

To identify potential reasons for the high variance in cognitive effort, we investigated correlations within the data through a correlation matrix (see Table 4). Findings from the correlation matrix indicate that there exists a negative relationship between cognitive effort and the relative start time, Start_relative, in the HCT condition (see Fig. 3).

To analyze this relationship in more detail, we ran a linear regression with Start_relative as the independent variable and cognitive effort as the dependent variable. The results show a significant negative effect on cognitive effort (β = -0.04, p < 0.001) with an R² of 0.13. In contrast, we conducted a linear regression for the LCT condition, which did not show a significant effect (p = 0.076). The results document a significant negative relationship between the relative start time of participants and their cognitive effort in the HCT condition.
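The regression reported here is a simple linear model; an illustrative R sketch, with assumed column names and restricted to the HCT subsample, could look as follows.

```r
# Linear regression of cognitive effort on the relative start time (HCT condition only).
# `df_hct` is an illustrative subsample data frame with columns
# start_relative and cognitive_effort.
lm_fit <- lm(cognitive_effort ~ start_relative, data = df_hct)
summary(lm_fit)   # slope estimate, p-value, and R-squared
```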
Fig. 2. Distribution of the final status of the chatbot power control (left). Histogram of the distribution of participants showing the amount of interaction with the control features (right).

Table 4
Correlations among behavioral variables, autonomy, cognitive effort, and trust (in HCT condition, n = 58).

Fig. 3. Scatter plot showing the linear regression of the relationship between Start_relative and cognitive effort in the HCT condition.

Table 5
Results of two-step cluster analysis of participants in the HCT condition.

Control strategy cluster   Cluster 1       Cluster 2      Cluster 3      Cluster 4
Cluster label              Soloists        Kickstarter    Controller     Undecideds
Cluster size               11 (18.97%)     18 (31.03%)    14 (24.14%)    15 (25.86%)
Freq_overall               10.0 (6.94)*    3.44 (3.67)    5.08 (3.59)    15.73 (5.36)
Start_relative             0.49 (0.14)*    0.19 (0.11)    0.74 (0.12)    0.51 (0.18)
Power_final                Off             On             On             On

Note: *Values represent the mean and standard deviation.

In contrast, in the second cluster, C2, all users decided to keep the chatbot activated. C2 is characterized by an early start of the chat (mean of Start_relative = 0.19) and the lowest level of control feature usage within the clusters. Therefore, we reference C2 as Kickstarter. While autonomy remains high in this cluster, we see the highest mean in cognitive effort. This indicates that this group of individuals did not interact intensively with the features and started the chat right away, without deeper consideration of options and consequences. Similarly, they experienced the control features as mentally exhausting (high cognitive effort), possibly indicating that they were overwhelmed by the different options available. Interestingly, we see a significant negative correlation between Start_relative and cognitive effort (r = -0.538) within C2. This means that if users used the control features before they started the group chat, they experienced lower cognitive effort. Thus, although users within this cluster did feel overwhelmed by the control functionality, cognitive effort decreased quickly with a later relative start of the group chat.

The third cluster, C3, distinguishes itself strongly from C2, as it also shows a low amount of feature usage but exhibits a late chat start (mean of Start_relative = 0.74). This indicates that C3 users thoughtfully adapted some settings and then started the chat. We, therefore, name C3 as Controller. Simultaneously, the perception of cognitive effort is the lowest among all clusters (M = 1.0), while demographics show the highest average age for C3. This is remarkable given the public perception that younger generations tend to cope better with digital technologies.

The fourth cluster, C4, is the second largest cluster, with 15 users (25.86% of the sample), and is characterized by the highest control feature usage. The relative start time, on the other hand, is centered (mean of Start_relative = 0.51), which shows that C4 users used many control features both before starting the chat and during the chat. We refer to this cluster as the Undecideds. Cognitive effort within this cluster is also rather low but shows high variance (M = 2.96, SD = 2.12).
Fig. 4. Boxplots of behavioral and perceived variables according to behavioral control strategy clusters.
Next, we investigated the control strategies within the four clusters over (1) time and (2) the specific control categories. First, to analyze the distribution over time, we aggregated the control event time into buckets of 1 min and assigned all control events to these buckets. The result is plotted in Fig. 5 (on the left). Based on the resulting distribution, we describe four phases of control feature usage. For all clusters, there is a peak in control events within the first minute (phase 1), followed by a considerable drop in the following two minutes (phase 2). After this drop, we observe an increase in the Undecideds cluster in minutes four to eight, which is confirmed by the other clusters (phase 3). Finally, only isolated observations remain in all clusters, representing participants who took more time to conduct the group chat (phase 4). Overall, the fourth cluster includes the highest number of control events over usage time, with strong variance. In contrast, the Kickstarter cluster shows a rather constant distribution over time, while it contains the lowest number of control events.

Second, to investigate the usage of the five distinct control feature types per cluster (i.e., changes of Power, Modality, Visibility, Blocking, and Timing), we calculated the average usage of the five types by dividing the number of control events (i.e., a click on a particular feature) by the total number of participants in a cluster. Subsequently, we visualized the results as a parallel coordinates plot (see Fig. 5, right). Parallel coordinates plots are useful for examining multiple variables and their relationships. Looking at clusters Kickstarter, Controller, and Undecideds, we observe a pattern with peaks in the appearance categories (Modality and Visibility) and a drop in the category Blocking. Confirming our prior observation, the Undecideds cluster shows the highest average usage scores of these three clusters in all the categories, with a peak in the Visibility category. In contrast, the Soloists cluster shows a complementary pattern, with the highest count of Power changes, the lowest count in Visibility, and a contrasting peak in Blocking. This might indicate a distinct preference for controlling the emotion-aware chatbot in the Controller cluster.
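The two descriptive aggregations described above (1-minute time buckets and average feature-type usage per cluster) could be computed along the following lines in R; `logs` and all column names are hypothetical placeholders rather than the study's actual data structures.

```r
# (1) Control events per 1-minute bucket and cluster.
# Assumes `logs` has columns seconds_since_start, event, type, cluster, participant_id.
logs$minute <- floor(logs$seconds_since_start / 60)
events_per_minute <- aggregate(event ~ minute + cluster, data = logs, FUN = length)

# (2) Average number of clicks on each control feature type per participant in a cluster.
clicks <- aggregate(event ~ type + cluster, data = logs, FUN = length)
sizes  <- aggregate(participant_id ~ cluster, data = logs,
                    FUN = function(x) length(unique(x)))
usage  <- merge(clicks, sizes, by = "cluster")
usage$avg_usage <- usage$event / usage$participant_id
```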
Fig. 5. Distribution of absolute control events over time (in minutes) (left). Parallel coordinates plot of the average number of control events per category (right).
5. Discussion

In this paper, we investigate the effect of control levels of emotion-aware chatbots on users' autonomy, cognitive effort, and trust. To do so, we conducted an online between-participants design experiment with three conditions, i.e., no control (NC), low-level control (LCT), and high-level control (HCT). Our results show a positive effect of control on autonomy and trust and an increased but not significant effect on cognitive effort. Through cluster analysis, we identified four behavioral control strategies that describe a majority of users who keep the chatbot on, as well as users who start early versus those who first adapt their control behavior. This behavior seems to also correlate with users' autonomy and cognitive effort. In the following, we highlight the study's key findings and their practical implications. Finally, we outline limitations and future research directions.

Control level effects. As the first key finding, the results confirm the hypothesized effects of control over emotion-aware chatbots on autonomy and trust. The experiment shows that control features enable the user to exercise control when interacting with the chatbot. Moreover, our results provide empirical evidence suggesting that people experience higher autonomy from emotion-aware chatbots with control features than without. The participants experienced choice through the adaption of chatbot features, leading to a sense of agency that increased autonomy (Bandura, 2006).

Furthermore, the results reveal that autonomy from emotion-aware chatbots also increases trust. Therefore, our findings show how to overcome initially documented distrust towards chatbots (i.e., due to breaches of intimate user information) (Shumanov & Johnson, 2021) and threats from affective technologies (i.e., negative influences on relationships between humans as well as between humans and computers) (IEEE, 2019), at least in the context of emotion-aware chatbots. It seems that the ability to have power over technology mitigates such negative feelings. With no control over chatbots, users do not have insights into the interaction and might feel manipulated (Skjuve et al., 2021). Through the act of giving control to users and, thereby, letting them experience autonomy, this feeling of manipulation might be overruled, and trust will be induced. Concerning implications, it is essential in relationships between humans and computers to establish a high level of trust, especially when a change in user behavior is desired and a significant shift in cognitive and emotional attitude is required (Bickmore, 2005). Since emotion-aware chatbots support users with such emotional shifts, an increase in trust is highly valuable.

In summary, we see that wherever autonomy is desired, it makes sense to offer control over emotion-aware chatbots. This increases users' motivation to accept and use chatbots since accessible control features effectively increase trust. In exceptional cases where autonomy might not be desired but emotion-aware chatbots can be employed (e.g., psychiatric health or education), the application of control features needs to be reassessed.

Autonomy vs. cognitive effort. The second key finding is related to the interplay between positive effects on autonomy and potential negative effects on cognitive effort when comparing low- and high-level control conditions. Emphasizing the findings on autonomy, we observe that a higher control level also leads to a higher autonomy. On the other hand, our results show no significant difference between the two treatment conditions for cognitive effort. Findings from machine automation, human-robot interaction, and privacy management on websites describe such negative effects of higher control levels as caused by too many control options, lack of transparency, and short decision times (Calhoun et al., 2018; Dienlin & Metzger, 2016; Hancock & Warm, 1989). In the context of emotion-aware chatbots, our results do not report this downside. By itself, this finding is desirable since it means that there seem to be no significant side effects on cognitive effort due to distinct control levels.

Summarizing this finding, we identify a balanced point of control over emotion-aware chatbots at a high level of available control, based on its positive effect on autonomy and the absence of significant negative effects on cognitive effort in general. However, since general assumptions do not come without exceptions, a share of users experienced increased cognitive effort at the high control level, as reflected in the high variance of cognitive effort in the HCT condition, indicating that this finding may not generalize to every user.

Behavioral control strategies. As a third key finding, we discuss the four behavioral control strategies that we identified in the HCT condition. Abstracting these strategies allows two main categorizations.

First, the strategies can be divided into users who kept the chatbot on (Kickstarter, Controller, Undecideds) and those who turned it off (Soloists). Soloists, representing a minority of the sample, apparently decided that the emotion-aware chatbot does not bring additional value. Simultaneously, they experienced high autonomy and low cognitive effort, which implies that they seemed comfortable without support. However, relating this to findings of increased autonomy, the ability to turn the chatbot off might have contributed to this comfortable outcome since the users still might have appreciated this ability. Simultaneously, Soloists showed the lowest level of trust. The lowest trust level confirms the decision to turn the chatbot off. It is possible that Soloists had negative prior experiences with chatbots since chatbots reportedly have often failed in the past (Brandtzaeg & Følstad, 2018). The distinct Soloists control strategy shows that there exists a minority of users who do not need or want to employ emotion-aware chatbots. However, the ability to turn it off still allows them to reject the chatbot, which might have contributed to the high level of autonomy.

The second categorization of the remaining control strategies contains those who experienced high cognitive effort (Kickstarter) and low cognitive effort (Controller and Undecideds). The Kickstarter group showed the lowest level of control feature usage. This indicates that these users did not value the control features or did not understand their value. While less use of the control ability could represent neglect of the control features, these users did not turn the chatbot off. On the other hand, they experienced the highest cognitive effort, which represents a strong negative effect. An intuitive interpretation could be that users who started early had to divide their cognitive capacity between controlling the chatbot and participating in the group chat. While the underlying data make it difficult to draw general conclusions, one potential way to fix this issue might be by motivating users to first configure the chatbot before starting a chat. The Kickstarter control strategy, thereby, explains the high variance of cognitive effort in the HCT condition. By reducing the cognitive effort within this group, the overall variance in cognitive effort might be reduced in the HCT condition, and this shortcoming could be eliminated.

The Controller group, in turn, showed the opposite control strategy and conducted control feature usage on average before starting the chat. Additionally, Undecideds seemed to follow this strategy, although they still showed a higher level of control feature usage. Through this action, Controller users might have experienced low cognitive effort. We therefore conclude that there is potential for improvement for the Kickstarter group.

In summary, the four control strategies show that users can be categorized based on their control behavior, which is similar to findings concerning privacy preferences based on personality (Lee & Rha, 2016). We see that the majority of users value the control ability. Moreover, this finding explains potential reasons for a high cognitive effort with control behavior in the HCT condition through unreflective interaction with the features and provides recommendations for future improvements of control features in chatbots.

5.1. Practical implications

The findings of our study have several practical implications. First, they support developers in future chatbot design by deriving chatbot design guidelines that facilitate autonomy while minimizing shortcomings such as high cognitive effort. While our results show that it is generally helpful to instantiate chatbot control, in some cases, it may be important to provide additional support in the form of explanations and guidance on how and when to apply control features to prevent high cognitive effort. While some users (Kickstarter) start chatting right away, others first explore these features and the context. Since the Kickstarter group often experienced high cognitive effort, possibly through the necessity to divide their cognitive capacity between the group chat and the control features, guidance through design (e.g., in the form of remarks in pop-up windows) might help decrease cognitive effort. Through such means, designers can adjust chatbots properly for different contexts and users, finetune their effect, and increase users' trust and willingness to use the chatbot.

Second, one key focus area for the application of chatbots is providing team and task support in collaboration systems, such as Microsoft Teams or Slack (Hohenstein & Jung, 2020; McDuff & Czerwinski, 2018).
Companies employ chatbots in such applications to increase the work productivity and well-being of users, resulting in better performance. To optimize the effectiveness of the chatbot and, thereby, the productivity of the user, users need to accept and trust the chatbot. Following our results, implementing control features in practice and, thereby, enabling autonomy increases the willingness and motivation to use the emotion-aware chatbot.

Finally, our findings also contribute to the ethical discussion about AI, chatbots, and affective computing technologies. The application of chatbots in human-computer relationships has experienced critical reflection since its beginnings (Krafft et al., 2017; Skjuve et al., 2021). To mitigate risks, many public and private initiatives have developed ethical guidelines for the trustworthiness of AI-based systems, e.g., the "Ethical guidelines for trustworthy AI" by the European Union (European Commission, 2019) and "AI — Our Approach" by Microsoft (Amershi et al., 2019). In these guidelines, autonomy is described as a key contribution to the provision of ethical conformation and trustworthy technology (Jobin et al., 2019). Our findings provide a concrete example of the instantiation of autonomy and its positive influence on trust. Therefore, they show that ethical design also has effective practical value for emotion-aware chatbots beyond abstract guidelines. Consequently, we conclude with the benefits of implementing control over ethical chatbot design.

5.2. Limitations and future research

In addition to its findings, our study has certain limitations. First, the study was designed as a controlled online experiment to ensure internal validity. However, to generalize our results, future research should evaluate the effect of control features of emotion-aware chatbots in real-world applications such as Microsoft Teams or Slack.

Second, emotion-aware chatbots are applied in multiple contexts and ways. We selected chatbots in the form of team facilitators in a group chat, which is an innovative but realistic and established use case (McDuff & Czerwinski, 2018; Samrose et al., 2018). However, there are also other use cases of emotion-aware chatbots, e.g., in the context of health or education. In future research, our experimental design could be transferred to such settings, and the findings of our study should be compared and combined with the results to generalize the knowledge obtained in our study.

Third, the design of the experiment, which was in accordance with respective initiatives in research and practice (e.g., "Ethics guideline for Trustworthy AI"), had the goal of maximizing trust. However, maximizing trust is not always beneficial. In the case of failures of the emotion-aware chatbot, trust that is too high can lead to overreliance or complacency (Parasuraman & Wickens, 2008). Such complacency potentially causes wrong decisions that also disrupt the group chat. In future studies, additional important individual-level outcomes might be included to minimize this threat.

Finally, our experimental design includes two treatment conditions: one includes the minimum level of control, and the other includes extensive control over multiple dimensions of the chatbot at once. Apart from these extremes, there are multiple different combinations of control functionalities and additional features that are worth investigating. For example, different control combinations, such as visual control only (Visibility and Modality) or control over all features without the functionality of turning the chatbot on or off, could be investigated to derive more detailed insights into the users' control behaviors. While this is beyond the scope of this study, future research should evaluate different levels of control using different control features in more detail.

6. Conclusion

In this study, we empirically investigated the effect of control over emotion-aware chatbots on human autonomy, trust, and cognitive effort. The results showed a significant positive effect on autonomy and trust and an increasing but nonsignificant effect on cognitive effort. In a subsequent cluster analysis of users' control behaviors, we identified four behavioral control strategies that shed light on the individual control preferences of different types of users (Soloists, Kickstarter, Controller, and Undecideds). Our findings advance the understanding of user control behavior in interactions with chatbots and provide implications for the adaptation and design of such systems according to user needs. Therefore, this study also makes practical contributions and helps improve future interactions with emotion-aware chatbots by humans in practical scenarios.

The authors declare that they have no conflict of interest.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Credit author statement

Ivo Benke: Conceptualization, Methodology, Software, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization, Project administration. Ulrich Gnewuch: Conceptualization, Methodology, Formal analysis, Investigation, Writing – original draft, Writing – review & editing. Alexander Maedche: Resources, Investigation, Writing – original draft, Funding acquisition, Project administration, Supervision.
Appendix

[Appendix table of characteristics (columns: Characteristics, Mean, SD; continued with Characteristics, Indicator, Percentage).]
The results of our main analysis show that control over emotion-aware chatbots positively influences autonomy, which in turn enhances users’
trust. To complement the evaluation of the effect of control over emotion-aware chatbots on autonomy, we conducted a supplementary analysis on the
direct effect of control over emotion-aware chatbots on trust. First, looking at the descriptive statistics in Table 1, we can see that the highest levels of
trust can be found in the high-level treatment condition (HCT; M = 5.43). However, somewhat counterintuitively, we find that trust in the emotion-
aware chatbot is lower in the low-level control condition (LCT; M = 4.72) than in the no control condition (NC; M = 5.04). This finding indicates that
users generally tend to have a high level of trust in the chatbot. However, their trust might be negatively affected when they are provided with only a
limited set of control features (there was only a power button to turn the chatbot on or off), even though it elicits greater levels of autonomy. In
contrast, when there is no control center at all, users do not even know that there is the possibility of controlling the chatbot and, therefore, their
evaluation of trust does not take into account their lack of autonomy but rather focuses on other aspects (e.g., the design of the chatbot itself). More
generally, this finding supports the view that trust is a complex multidimensional construct that is influenced by a variety of factors (McKnight et al.,
2002), particularly when it comes to AI-based systems such as emotion-aware chatbots.
To analyze the differences in trust between the experimental conditions in more detail, we conducted a Kruskal-Wallis test with trust as the
dependent variable since the assumption of normality was violated for two of the three experimental conditions. The results showed a significant main
effect of the experimental conditions on trust (χ²(2, 176) = 17.04, p < 0.001) with a moderate effect size (η² = 0.089). Post hoc pairwise comparisons
with Dunn’s test (Dinno, 2015) revealed significant differences between the NC and HCT conditions (p < 0.001) and the LCT and HCT conditions (p <
0.001). The difference between the NC and LCT conditions was not significant (p = 0.943). Consistent with our main analysis, these results provide
further evidence of the positive impact of providing higher levels of control on users’ trust.
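The supplementary analysis described above can be approximated with standard statistical tooling. The following is a minimal sketch in Python, not the authors' analysis code: the data file and column names ("experiment_data.csv", "condition", "trust") are purely illustrative assumptions, the Kruskal-Wallis test comes from scipy, the eta-squared effect size is computed as (H - k + 1)/(n - k), and Dunn's test uses the scikit-posthocs package; the Holm adjustment of the pairwise p-values is also an assumption, since the correction method is not reported here.

import pandas as pd
from scipy.stats import kruskal
import scikit_posthocs as sp

# Hypothetical data file: one row per participant, with the assigned condition
# (NC, LCT, HCT) and the measured trust score.
df = pd.read_csv("experiment_data.csv")

# Descriptive statistics of trust per experimental condition (cf. the means reported above)
print(df.groupby("condition")["trust"].agg(["mean", "std", "count"]))

# Kruskal-Wallis test across the three conditions (normality was violated in two groups)
groups = [g["trust"].to_numpy() for _, g in df.groupby("condition")]
h_stat, p_value = kruskal(*groups)

# Eta-squared effect size for the Kruskal-Wallis H statistic: (H - k + 1) / (n - k)
k, n = len(groups), len(df)
eta_squared = (h_stat - k + 1) / (n - k)
print(f"H({k - 1}) = {h_stat:.2f}, p = {p_value:.3f}, eta squared = {eta_squared:.3f}")

# Dunn's post hoc pairwise comparisons (Dinno, 2015); Holm adjustment is an assumption
print(sp.posthoc_dunn(df, val_col="trust", group_col="condition", p_adjust="holm"))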
References

Amershi, S., Weld, D., Vorvoreanu, M., Fourney, A., Nushi, B., Collisson, P., Suh, J., Iqbal, S., Bennett, P. N., Inkpen, K., Teevan, J., Kikin-gil, R., & Horvitz, E. (2019). Guidelines for human-AI interaction. In Proceedings of the 2019 CHI conference on human factors in computing systems (pp. 1–13). https://doi.org/10.1145/3290605.3300233
Araujo, T. (2018). Living up to the chatbot hype: The influence of anthropomorphic design cues and communicative agency framing on conversational agent and company perceptions. Computers in Human Behavior, 85, 183–189. https://doi.org/10.1016/j.chb.2018.03.051
Ataei, M., Degbelo, A., & Kray, C. (2018). Privacy theory in practice: Designing a user interface for managing location privacy on mobile devices. Journal of Location Based Services, 12(3–4), 141–178. https://doi.org/10.1080/17489725.2018.1511839
Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84(2), 191–215. https://doi.org/10.1007/978-3-319-75361-4
Bandura, A. (1989). Human agency in social cognitive theory. American Psychologist, 44(9), 1175–1184. https://doi.org/10.1037/0003-066X.44.9.1175
Bandura, A. (2006). Toward a psychology of human agency. Perspectives on Psychological Science, 1(2), 164–180. https://doi.org/10.1111/j.1745-6916.2006.00011.x
Banks, J. (2019). A perceived moral agency scale: Development and validation of a metric for humans and social machines. Computers in Human Behavior, 90, 363–371. https://doi.org/10.1016/j.chb.2018.08.028
Benke, I., Knierim, M. T., & Maedche, A. (2020). Chatbot-based emotion management for distributed teams: A participatory design study. Proceedings of the ACM on Human-Computer Interaction, 4(CSCW2), 1–30. https://doi.org/10.1145/3415189
Berberian, B., Sarrazin, J. C., Le Blaye, P., & Haggard, P. (2012). Automation technology and sense of control: A window on human agency. PLoS One, 7(3), 1–6. https://doi.org/10.1371/journal.pone.0034075
Bergeron, F., Rivard, S., & De Serre, L. (1990). Investigating the support role of the information center. MIS Quarterly, 14(3), 247–260. https://doi.org/10.2307/248887
Bickmore, T. W. (2005). Establishing and maintaining long-term human-computer relationships. ACM Transactions on Computer-Human Interaction, 12(2), 293–327. https://doi.org/10.1145/1067860.1067867
Brandtzaeg, P. B., & Følstad, A. (2017). Why people use chatbots. Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 10673 LNCS (pp. 377–392). https://doi.org/10.1007/978-3-319-70284-1_30
Brandtzaeg, P. B., & Følstad, A. (2018). Chatbots: Changing user needs and motivations. Interactions, 25(5), 69–84. https://doi.org/10.1145/3236669
Calhoun, G. L., Ruff, H. A., Behymer, K. J., & Frost, E. M. (2018). Human-autonomy teaming interface design considerations for multi-unmanned vehicle control. Theoretical Issues in Ergonomics Science, 19(3), 321–352. https://doi.org/10.1080/1463922X.2017.1315751
Chattaraman, V., Kwon, W. S., Gilbert, J. E., & Ross, K. (2019). Should AI-based, conversational digital assistants employ social- or task-oriented interaction style? A task-competency and reciprocity perspective for older adults. Computers in Human Behavior, 90, 315–330. https://doi.org/10.1016/j.chb.2018.08.048
Cummings, M. L., & Mitchell, P. J. (2008). Predicting controller capacity in supervisory control of multiple UAVs. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 38(2), 451–460. https://doi.org/10.1109/TSMCA.2007.914757
Dale, R. (2016). The return of the chatbots. Natural Language Engineering, 22(5), 811–817. https://doi.org/10.1017/S1351324916000243
Deci, E. L., & Ryan, R. M. (1987). The support of autonomy and the control of behavior. Journal of Personality and Social Psychology, 53(6), 1024–1037. https://doi.org/10.1037/0022-3514.53.6.1024
Dienlin, T., & Metzger, M. J. (2016). An extended privacy calculus model for SNSs: Analyzing self-disclosure and self-withdrawal in a representative U.S. sample. Journal of Computer-Mediated Communication, 21(5), 368–383. https://doi.org/10.1111/jcc4.12163
Dinno, A. (2015). Nonparametric pairwise multiple comparisons in independent groups using Dunn's test. STATA Journal, 15(1), 292–300. https://doi.org/10.1177/1536867x1501500117
Dobler, D., Friedrich, S., & Pauly, M. (2020). Nonparametric MANOVA in meaningful effects. Annals of the Institute of Statistical Mathematics, 72(4), 997–1022. https://doi.org/10.1007/s10463-019-00717-3
Edwards, C., Beattie, A. J., Edwards, A., & Spence, P. R. (2016). Differences in perceptions of communication quality between a Twitterbot and human agent for information seeking and learning. Computers in Human Behavior, 65, 666–671. https://doi.org/10.1016/j.chb.2016.07.003
Endsley, M. R., & Kaber, D. B. (1999). Level of automation effects on performance, situation awareness and workload in a dynamic control task. Ergonomics, 42(3), 462–492. https://doi.org/10.1080/001401399185595
European Commission. (2019). Ethics Guidelines for trustworthy AI. High-level expert group on artificial intelligence. Retrieved 2021-07-01 from https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai
Evans, D. C., & Fendley, M. (2017). A multi-measure approach for connecting cognitive workload and automation. International Journal of Human-Computer Studies, 97, 182–189. https://doi.org/10.1016/j.ijhcs.2016.05.008
Feine, J., Gnewuch, U., Morana, S., & Maedche, A. (2019). A taxonomy of social cues for conversational agents. International Journal of Human-Computer Studies, 132, 138–161. https://doi.org/10.1016/j.ijhcs.2019.07.009
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39. https://doi.org/10.2307/3151312
Frazier, P., Keenan, N., Anders, S., Perera, S., Shallcross, S., & Hintz, S. (2011). Perceived past, present, and future control and adjustment to stressful life events. Journal of Personality and Social Psychology, 100(4), 749–765. https://doi.org/10.1037/a0022405
Friedman, B., & Nissenbaum, H. (1997). Software agents and user autonomy. Autonomous agents. https://doi.org/10.1145/267658.267772
Gaudiello, I., Zibetti, E., Lefort, S., Chetouani, M., & Ivaldi, S. (2016). Trust as indicator of robot functional and social acceptance. An experimental study on user conformation to iCub answers. Computers in Human Behavior, 61, 633–655. https://doi.org/10.1016/j.chb.2016.03.057
de Gennaro, M., Krumhuber, E. G., & Lucas, G. (2020). Effectiveness of an empathic chatbot in combating adverse effects of social exclusion on mood. Frontiers in Psychology, 10, 1–14. https://doi.org/10.3389/fpsyg.2019.03061
Graesser, A. C., Cai, Z., Morgan, B., & Wang, L. (2017). Assessment with computer agents that engage in conversational dialogues and trialogues with learners. Computers in Human Behavior, 76, 607–616. https://doi.org/10.1016/j.chb.2017.03.041
Hair, J. F., Black, W., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Upper Saddle River: Prentice Hall.
Hancock, P. A., Billings, D. R., Schaefer, K. E., Chen, J. Y. C., De Visser, E. J., & Parasuraman, R. (2011). A meta-analysis of factors affecting trust in human-robot interaction. Human Factors, 53(5), 517–527. https://doi.org/10.1177/0018720811417254
Hancock, P. A., & Warm, J. S. (1989). A dynamic model of stress and sustained attention. Human Factors, 31(4), 519–537. https://doi.org/10.7771/2327-2937.1024
Hart, S. G., & Wickens, C. D. (1990). Workload assessment and prediction. In Manprint (pp. 257–296). Dordrecht: Springer. https://doi.org/10.1007/978-94-009-0437-8_9
Hohenstein, J., & Jung, M. (2020). AI as a moral crumple zone: The effects of AI-mediated communication on attribution and trust. Computers in Human Behavior, 106. https://doi.org/10.1016/j.chb.2019.106190
IEEE. (2019). Ethically aligned design: A vision for prioritizing human well-being with autonomous and intelligent systems. In IEEE global initiative on ethics of autonomous and intelligent systems. https://doi.org/10.1109/IHTC.2017.8058187
Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9), 389–399. https://doi.org/10.1038/s42256-019-0088-2
Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis. John Wiley & Sons, Inc.
Kelly, J. R., & Barsade, S. G. (2001). Mood and emotions in small groups and work teams. Organizational Behavior and Human Decision Processes, 86(1), 99–130. https://doi.org/10.1006/obhd.2001.2974
Kimani, E., Rowan, K., McDuff, D., Czerwinski, M., & Mark, G. (2019). A conversational agent in support of productivity and wellbeing at work. In 2019 8th international conference on affective computing and intelligent interaction, ACII 2019 (pp. 332–338). https://doi.org/10.1109/ACII.2019.8925488
Krafft, P. M., Macy, M., & Pentland, A. (2017). Bots as virtual confederates: Design and ethics. In Proceedings of the 2017 ACM conference on computer supported cooperative work and social computing (pp. 183–190). https://doi.org/10.1145/2998181.2998354
Lankton, N. K., McKnight, D. H., & Tripp, J. F. (2017). Facebook privacy management strategies: A cluster analysis of user privacy behaviors. Computers in Human Behavior, 76, 149–163. https://doi.org/10.1016/j.chb.2017.07.015
Lee, S., & Choi, J. (2017). Enhancing user experience with conversational agent for movie recommendation: Effects of self-disclosure and reciprocity. International Journal of Human-Computer Studies, 103, 95–105. https://doi.org/10.1016/j.ijhcs.2017.02.005
Lee, J.-M., & Rha, J.-Y. (2016). Personalization-privacy paradox and consumer conflict with the use of location-based mobile commerce. Computers in Human Behavior, 63, 453–462. https://doi.org/10.1016/j.chb.2016.05.056
Lee, J. D., & See, K. A. (2004). Trust in automation: Designing for appropriate reliance. Human Factors, 46(1), 50–80. https://doi.org/10.1518/hfes.46.1.50_30392
Li, X., & Sung, Y. (2021). Anthropomorphism brings us closer: The mediating role of psychological distance in User–AI assistant interactions. Computers in Human Behavior, 118, 1–9. https://doi.org/10.1016/j.chb.2021.106680
Liu, B., & Sundar, S. S. (2018). Should machines express sympathy and empathy? Experiments with a health advice chatbot. Cyberpsychology, Behavior, and Social Networking, 21(10), 625–636. https://doi.org/10.1089/cyber.2018.0110
Martínez-Miranda, J., & Aldea, A. (2005). Emotions in human and artificial intelligence. Computers in Human Behavior, 21(2), 323–341. https://doi.org/10.1016/j.chb.2004.02.010
McDuff, D., & Czerwinski, M. (2018). Designing emotionally sentient agents. Communications of the ACM, 61(12), 74–83. https://doi.org/10.1145/3186591
McKnight, D. H., Choudhury, V., & Kacmar, C. (2002). Developing and validating trust measures for e-commerce: An integrative typology. Information Systems Research, 13(3), 334–359. https://doi.org/10.1287/isre.13.3.334.81
Mensio, M., Rizzo, G., & Morisio, M. (2018). The rise of emotion-aware conversational agents. In Companion proceedings of the web conference 2018. International world wide web conferences steering committee (pp. 1541–1544). https://doi.org/10.1145/3184558.3191607
Neff, G., & Nagy, P. (2016). Talking to bots: Symbiotic agency and the case of Tay. International Journal of Communication, 10, 4915–4931. https://doi.org/1932–8036/20160005
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2008). Situation awareness, mental workload, and trust in automation: Viable, empirically supported cognitive engineering constructs. Journal of Cognitive Engineering and Decision Making, 2(2), 140–160. https://doi.org/10.1518/155534308X284417
Parasuraman, R., & Wickens, C. D. (2008). Humans: Still vital after all these years of automation. Human Factors, 50(3), 511–520. https://doi.org/10.1518/001872008X312198
Patrick, B. C., Skinner, E. A., & Connell, J. P. (1993). What motivates children's behavior and emotion? Joint effects of perceived control and autonomy in the academic domain. Journal of Personality and Social Psychology, 65(4), 781–791. https://doi.org/10.1037//0022-3514.65.4.781
Peng, Z., Kim, T., & Ma, X. (2019). GremoBot: Exploring emotion regulation in group chat. Proceedings of the ACM Conference on Computer Supported Cooperative Work and Social Computing, 335–340. https://doi.org/10.1145/3311957.3359472
Pirkkalainen, H., Salo, M., Makkonen, M., & Tarafdar, M. (2017). Coping with technostress: When emotional responses fail. In Proceedings of the 38th international conference on information systems (pp. 1–18). http://aisel.aisnet.org/icis2017/IT-and-Social/Presentations/3/
Pitts, V. E., Wright, N. A., & Harkabus, L. C. (2012). Communication in virtual teams: The role of emotional intelligence. Journal of Organizational Psychology, 28(1), 2046–2054. https://doi.org/10.1016/j.chb.2012.06.001
Ragu-Nathan, T. S., Tarafdar, M., Ragu-Nathan, B. S., & Tu, Q. (2008). The consequences of technostress for end users in organizations: Conceptual development and empirical validation. Information Systems Research, 19(4), 417–433. https://doi.org/10.1287/isre.1070.0165
Rotter, J. B. (1980). Interpersonal trust, trustworthiness, and gullibility. American Psychologist, 35(1), 1–7. https://doi.org/10.1037/0003-066X.35.1.1
Ryan, R. M., Lynch, M. F., Vansteenkiste, M., & Deci, E. L. (2011). Motivation and autonomy in counseling, psychotherapy, and behavior change: A look at theory and practice. The Counseling Psychologist, 39(2), 193–260. https://doi.org/10.1177/0011000009359313
Samrose, S., Anbarasu, K., Joshi, A., & Mishra, T. (2020). Mitigating boredom using an empathetic conversational agent. Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents (IVA '20), 1–8. https://doi.org/10.1145/3383652.3423905
Samrose, S., Zhao, R., White, J., Li, V., Nova, L., Lu, Y., Ali, M. R., & Hoque, M. E. (2018). CoCo: Collaboration coach for understanding team dynamics during video conferencing. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(4), 1–24. https://doi.org/10.1145/3161186
Scerbo, M. W. (2007). Adaptive automation. In R. Parasuraman, & M. Rizzo (Eds.), Neuroergonomics (p. 252). Oxford University Press.
Schaub, F., Balebako, R., & Cranor, L. F. (2017). Designing effective privacy notices and controls. IEEE Internet Computing, 21(3), 70–77. https://doi.org/10.1109/MIC.2017.75
Schuetzler, R. M., Grimes, G. M., & Giboney, J. S. (2019). The effect of conversational agent skill on user behavior during deception. Computers in Human Behavior, 97, 250–259. https://doi.org/10.1016/j.chb.2019.03.033
Shank, D. B., Graves, C., Gott, A., Gamez, P., & Rodriguez, S. (2019). Feeling our way to machine minds: People's emotions when perceiving mind in artificial intelligence. Computers in Human Behavior, 98, 256–266. https://doi.org/10.1016/j.chb.2019.04.001
Shaw, T., Emfield, A., Garcia, A., De Visser, E., Miller, C., Parasuraman, R., & Fern, L. (2010). Evaluating the benefits and potential costs of automation delegation for supervisory control of multiple UAVs. Proceedings of the Human Factors and Ergonomics Society, 2, 1498–1502. https://doi.org/10.1518/107118110X12829370088525
Shumanov, M., & Johnson, L. (2021). Making conversations with chatbots more personalized. Computers in Human Behavior, 117, 1–7. https://doi.org/10.1016/j.chb.2020.106627
Skinner, E. A. (1996). A guide to constructs of control. Journal of Personality and Social Psychology, 71(3), 549–570. https://doi.org/10.1007/BF00014299
Skjuve, M., Følstad, A., Fostervold, K. I., & Brandtzaeg, P. B. (2021). My chatbot companion - a study of human-chatbot relationships. International Journal of Human-Computer Studies, 149, 1–14. https://doi.org/10.1016/j.ijhcs.2021.102601
Spector, P. E. (1986). Perceived control by employees: A meta-analysis of studies concerning autonomy and participation at work. Human Relations, 39(11), 1005–1016. https://doi.org/10.1177/001872678603901104
Vimalkumar, M., Sharma, S. K., Singh, J. B., & Dwivedi, Y. K. (2021). 'Okay Google, what about my privacy?': User's privacy perceptions and acceptance of voice based digital assistants. Computers in Human Behavior, 120, 1–13. https://doi.org/10.1016/j.chb.2021.106763
de Visser, E. J., Monfort, S. S., McKendrick, R., Smith, M. A. B., McKnight, P. E., Krueger, F., & Parasuraman, R. (2016). Almost human: Anthropomorphism increases trust resilience in cognitive agents. Journal of Experimental Psychology: Applied, 22(3), 331–349. https://doi.org/10.1037/xap0000092
Wang, W., & Benbasat, I. (2009). Interactive decision aids for consumer decision making in E-commerce: The influence of perceived strategy restrictiveness. MIS Quarterly, 33(2), 293–320. https://doi.org/10.25300/MISQ/2016/40.3.11
Weizenbaum, J. (1966). ELIZA–A computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45. https://doi.org/10.1145/365153.365168
Wiener, M., Mähring, M., Remus, U., & Saunders, C. (2016). Control configuration and control enactment in information systems projects: Review and expanded theoretical framework. MIS Quarterly, 40(3), 741–774. https://doi.org/10.25300/MISQ/2016/40.3.11
Williams, G. C., Levesque, C., Zeldman, A., Wright, S., & Deci, E. L. (2003). Health care practitioners' motivation for tobacco-dependence counseling. Health Education Research, 18(5), 538–553. https://doi.org/10.1093/her/cyf042
Xolocotzin Eligio, U., Ainsworth, S. E., & Crook, C. K. (2012). Emotion understanding and performance during computer-supported collaboration. Computers in Human Behavior, 28(6), 2046–2054. https://doi.org/10.1016/j.chb.2012.06.001
Xu, A., Liu, Z., Guo, Y., Sinha, V., & Akkiraju, R. (2017). A new chatbot for customer service on social media. In Proceedings of the 2017 CHI conference on human factors in computing systems (pp. 3506–3510). https://doi.org/10.1145/3025453.3025496
Zhou, L., Gao, J., Li, D., & Shum, H.-Y. (2020). The design and implementation of XiaoIce, an empathetic social chatbot. Computational Linguistics, 46(1), 53–93. https://doi.org/10.1162/COLI_a_00368
Zhou, M. X., Mark, G., Li, J., & Yang, H. (2019). Trusting virtual agents: The effect of personality. ACM Transactions on Interactive Intelligent Systems, 9(2–3), 1–13. https://doi.org/10.1145/3232077