08 - Ethics and Privacy of Artificial Intelligence
Knowledge-Based Systems
journal homepage: www.elsevier.com/locate/knosys
Article info
Article history: Received 20 December 2020; Received in revised form 24 March 2021; Accepted 26 March 2021; Available online 2 April 2021
Keywords: Artificial intelligence; Ethics; Privacy; Bibliometrics; Topic analysis

Abstract
Artificial intelligence (AI) and its broad applications are disruptively transforming the daily lives of human beings, and the ethical and privacy issues surrounding AI are a topic of growing interest, not only among academics but also among the general public. This review identifies the key entities (i.e., leading research institutions and their affiliated countries/regions, core research journals, and communities) that contribute to the research on the ethical and privacy issues in relation to AI and their intersections using co-occurrence analysis. Topic analyses profile the topical landscape of AI ethics using a topical hierarchical tree and the changing interest of society in AI ethics over time through scientific evolutionary pathways. We also paired 15 selected AI techniques with 17 major ethical issues and identified emerging ethical issues from a core set of the most recent articles published in Nature, Science, and Proceedings of the National Academy of Sciences of the United States of America. These insights, bridging the knowledge base of AI techniques and ethical issues in the literature, are of interest to the AI community and audiences in science policy, technology management, and public administration.
© 2021 Elsevier B.V. All rights reserved.
1. Introduction

A Pandora's box of artificial intelligence (AI) has been opened and these disruptive technologies are transforming the daily lives of human beings in relation to new ways of thinking and behavioral patterns, with enhanced capabilities and efficiency. There are many examples of AI applications in use today, such as smart homes [1], smart farming [2], precision medicine [3], and healthcare surveillance systems [4]. The ethical and privacy issues surrounding the use of AI have been a topic of growing interest among diverse communities. For example, the general public has expressed concern about the impact of the increased use of robots on unemployment and inequality [5], social scientists have raised deep privacy concerns related to surveillance systems [6], and limited regulation of social media has raised debate with technology giants on the abuse of private data.1 Despite these concerns, the AI community stands behind the efficiency and robustness of its AI models, and there is an urgent need to guide the research community to understand these ethical and privacy challenges.

Bibliometrics, which is a set of approaches for analyzing scientific documents (e.g., research articles, patents, and academic proposals), has been widely used as a tool for science, technology and innovation studies [7], such as identifying technological topics [8], discovering latent relationships [9], and predicting potential future trends [10]. Recently, AI has received recognition in bibliometrics as an emerging topic for empirical investigation [11,12]. These investigations either align with the interest in technology management (e.g., using AI as a representative case in digital transformation) or emphasize its role in examining the reliability of the proposed methods. However, from a practical perspective, a bibliometric guide which summarizes ideas, assumptions, and debate in the literature would bring significant benefits to the AI community, not only by highlighting the ethical and privacy concerns raised by the public but also by identifying the potential conflicts between AI techniques and these issues of concern.

To address these concerns, this paper reports on a bibliometric study to comprehensively profile the key ethical and privacy issues discussed in the research articles and to trace how such issues have changed over the past few decades. We integrated a set of intelligent bibliometric approaches within a framework for diverse analyses. To identify the key entities which report the ethical and privacy issues surrounding AI, i.e., the leading research institutions and their affiliated countries and regions, and the core research journals and the research communities behind them, we used co-occurrence statistics with diverse bibliographical indicators (e.g., authors, affiliations, and sources). With specific

∗ Corresponding author.
E-mail addresses: [email protected] (Y. Zhang), [email protected] (M. Wu), [email protected] (G.Y. Tian), [email protected] (G. Zhang), [email protected] (J. Lu).
1 More information can be found on the website: https://www.bbc.com/news/business-49099364.
https://doi.org/10.1016/j.knosys.2021.106994
0950-7051/© 2021 Elsevier B.V. All rights reserved.
Y. Zhang, M. Wu, G.Y. Tian et al. Knowledge-Based Systems 222 (2021) 106994
foci in topic analysis, we initially retrieved terms from the combined titles and abstracts of collected articles and used a term clumping process [13] to remove noisy terms and consolidate technical synonyms. In parallel, we represented each word in the combined field of titles and abstracts as a vector using the Word2Vec model [14] and combined the word vectors into term vectors by matching the core terms refined in the term clumping process. We answered the questions of what the topical landscape is and how these topics have evolved over time, using an approach of scientific evolutionary pathways [15]. We also targeted a core set of articles published in three world-leading multi-disciplinary journals, namely Science, Nature, and Proceedings of the National Academy of Sciences (PNAS) of the United States of America, and identified cutting-edge issues that might either focus attention on emergent ethical and privacy issues in the current AI age or lead to novel developments in AI models to address any potential negative impacts. We anticipate that the empirical insights identified in this study will motivate the AI community to extensively and comprehensively discuss the ethical and privacy issues surrounding AI and will guide the implementation of AI in line with an ethical framework.

The rest of this paper is organized as follows: Section 2 presents a review of the related work on AI ethics, privacy, and bibliometrics; Section 3 introduces the data and methodologies used in this study; Section 4 presents the results and our key findings; and Section 5 concludes the study and suggests future research directions.

Organization (UNESCO) has issued its first draft of Recommendation on the Ethics of Artificial Intelligence (Recommendations)3 in September 2020, which sets out ten important Principles of the Ethics of AI, including: proportionality and do no harm, safety and security, fairness and non-discrimination, sustainability, privacy, human oversight and determination, transparency and explainability, responsibility and accountability, awareness and literacy, and multi-stakeholder and adaptive governance and collaboration.

2.2. Privacy, data privacy, and AI privacy

Privacy, as one of the ten important Principles of the Ethics of AI developed by UNESCO, deserves particular attention. In the legal and philosophical literature, privacy has been defined in a variety of ways, for example, as "the right to be let alone", as a component of personhood, as control over personal information, and as the right to secrecy [6].

Together with the big data boom and the AI age, data privacy4 and control over personal information have arguably become increasingly important aspects of privacy protection, and AI brings further threats to privacy protection [22]. Kerry (2020) observed that AI expands the ability to use personal information in ways that can infringe on privacy interests by bringing personal data analysis to new levels of power and speed [23].
2.4. Bibliometrics and topic extraction

Modern bibliometrics can be traced back to the observations of Derek Price on the patterns of scientific activities [27]. Early definitions of bibliometrics emphasize "the application of mathematics and statistical methods to books and other media of communication" [28], involving indicators such as citation/co-citation statistics, word co-occurrence, and co-authorships [15]. The increasing diversity of practical data sources rapidly extends the scope of bibliometric data from books to a wide range of information resources in science, technology and innovation, such as research articles, patents, and academic proposals, as well as to social media data (e.g., Twitter) [29]. Information technologies, especially AI techniques, further strengthen the capabilities of bibliometrics in analyzing scalable data with enhanced efficiency, effectiveness, and robustness. In this area, some of our pilot studies are spearheading a cross-disciplinary approach that develops computational models incorporating bibliometric indicators with AI techniques, which we call intelligent bibliometrics [30].

Topic extraction identifies abstract topics from a collection of documents to represent the major content, using either clustering or classification algorithms [31]. Topic extraction is also of significant interest to the bibliometric community, in which citation statistics and textual elements are heavily used [8,32]. These extracted topics, represented by either a sub-collection of documents or a set of terms, hold recognized capabilities in knowledge interpretation and exploration, e.g., profiling research disciplines and technological areas [7,33], identifying latent relationships [10,15,34], and predicting potential future changes in either collaborative patterns or research interests [35–37]. However, regarding the characteristics of bibliometric documents and the urgent need to interpret topics in depth, we anticipate two emergent directions of topic extraction: 1) since research topics are constantly changing (e.g., cross-/inter-/multi-disciplinary interactions) rather than being stable [15], extracting topics and discovering their relationships from a dynamic perspective could be practically significant for not only the bibliometric community but also business and management studies; and 2) hierarchy is an innate structure of knowledge composition, and of topics as well. Thus, profiling topics from a hierarchical dimension would provide an extensive understanding of the related knowledge base [38]. Even though this is not a new task for the computer science discipline, a balance between non-parametric solutions and explainable results is still elusive.

the WoS All Databases. In addition, since the WoS Core Collection database provides a cleaned form of full bibliographical information (e.g., author affiliations, countries/regions, and forward and backward citations), we particularly focused on an analysis of the key entities that contribute to the research on AI ethics and the interactions between these entities. By comparison, the WoS All Databases covers a relatively "full" collection of various types of articles in WoS, with a priority on data coverage, but the WoS Core Collection only contains journal articles collected in selective indexes (e.g., Science Citation Index), highlighting the quality of its data collection. In other words, the WoS Core Collection is a subset of the WoS All Databases, with a filtered data collection.

Referring to the literature discussed in Sections 2.1–2.3 on AI ethics, together with the IEEE's Ethically Aligned Design [39] and the AI Ethics Principles reported by Australia's Department of Industry, Science, Energy and Resources7, we proposed a search string and collected data on October 14, 2020 (see Table 1)8. We set #1 as the full dataset for understanding the topic landscape of AI ethics, and #3 (a subset of #1) as the dataset for identifying key research entities (e.g., affiliations and communities) that contribute to the research on AI ethics. As a specific interest, we collected another subset #2 from #1, containing articles published in the three world-leading multi-disciplinary journals, i.e., Science, Nature, and PNAS, to discover potential emerging issues in AI ethics.

Focusing on #1 and #2, the trends for the annual number of records in the two datasets are given in Fig. 1. Before 2013, the number of records in the full dataset increased at a relatively low rate, and a sudden rise after 2016 illustrates the urgent attention from academia. By comparison, the general trend in the core dataset coincides with that of the full dataset; that is, certain isolated papers are observed before 2013 and the 'real' growth starts in 2014.

3.2. Methodology

The research framework is shown in Fig. 2. We adopted a two-phase approach to discover insights into the ethical issues surrounding AI discussed in the research articles: phase 1 for data pre-processing and phase 2 for a systematic analysis incorporating bibliometrics with a series of analytic approaches.
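As outlined in the Introduction, the pre-processing phase combines Word2Vec word vectors into term vectors by matching them against the core terms. The sketch below illustrates one plausible way to do this. The source does not specify the matching function, so element-wise averaging, the toy vectors, and the name `term_vector` are all assumptions made for illustration and are not the authors' implementation.

```python
from typing import Dict, List, Optional

Vector = List[float]


def term_vector(term: str, word_vectors: Dict[str, Vector]) -> Optional[Vector]:
    """Average the vectors of the words that make up a term (assumed rule)."""
    found = [word_vectors[w] for w in term.lower().split() if w in word_vectors]
    if not found:
        return None  # none of the term's words has a vector
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


# Toy 3-dimensional vectors standing in for Word2Vec output (hypothetical values).
toy_vectors = {
    "data": [1.0, 0.0, 2.0],
    "privacy": [3.0, 2.0, 0.0],
    "machine": [0.0, 4.0, 4.0],
}

# Core terms as they might come out of term clumping (illustrative only).
core_terms = ["data privacy", "machine ethics", "gene editing"]
term_vectors = {t: term_vector(t, toy_vectors) for t in core_terms}
```

In this sketch a term with no known words maps to `None`, and a term with only some known words is averaged over those words alone; a real pipeline might prefer to drop or flag such partially matched terms.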
Table 1
Search strategy and data information.
Dataset  #R    Search strategy
#1       4375  TS = (("artificial intelligence" OR "big data") AND ("disinform*" OR "ethic*" OR "crimin*" OR "moneti*" OR "data control*" OR "implicit trust*" OR "addiction*" OR "contestab*" OR "moral*" OR "digit* transparen*" OR "algorithm* transparen*" OR "accountabilit*" OR "liabilit*" OR "fairness*"))
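To make the logic of the search string in Table 1 concrete, the following sketch approximates it with regular expressions: a text matches when it contains at least one scope phrase and at least one issue phrase, with `*` treated as a trailing wildcard. This is only an approximation for illustration; it does not reproduce the Web of Science search engine, and the helper names `to_regex` and `matches_query` are hypothetical.

```python
import re

SCOPE_TERMS = ["artificial intelligence", "big data"]
ISSUE_TERMS = [
    "disinform*", "ethic*", "crimin*", "moneti*", "data control*",
    "implicit trust*", "addiction*", "contestab*", "moral*",
    "digit* transparen*", "algorithm* transparen*",
    "accountabilit*", "liabilit*", "fairness*",
]


def to_regex(term: str) -> "re.Pattern":
    """Translate a search phrase with * wildcards into a compiled regex."""
    # Escape each word, then turn the escaped '*' into "any word characters".
    words = [re.escape(w).replace(r"\*", r"\w*") for w in term.split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)


def matches_query(text: str) -> bool:
    """True if the text has at least one scope term AND one issue term."""
    has_scope = any(to_regex(t).search(text) for t in SCOPE_TERMS)
    has_issue = any(to_regex(t).search(text) for t in ISSUE_TERMS)
    return has_scope and has_issue


hit = matches_query("Algorithmic transparency in artificial intelligence")
miss_issue = matches_query("Big data for crop yields")
miss_scope = matches_query("The ethics of gardening")
```

Note that broad stems such as "moral*" or "crimin*" will also match unrelated words (e.g., "morale"), which is one reason a cleaning step such as term clumping is useful after retrieval.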
process [13] to identify a set of core terms by removing noise and consolidating variations with a set of thesauri and rules. In parallel, we applied a Word2Vec model [14] to the raw text of titles and abstracts and represented each word as a vector. We then exploited a matching function to bridge word vectors and core terms and create a vector for each core term.

Our aim was to focus on the current emerging concerns in relation to AI ethics raised by multiple research communities. We
and PNAS, and conducted a miniature bibliometric analysis to

#R  Affiliation  Country  |  #R  Country
Berlin) concentrate on their relatively small groups and China (Chinese Acad Sci, Peking Univ, and Baidu) is also located at a marginal area of the co-authorship network. The active role played by the leading universities (e.g., MIT, Univ Penn, Univ Oxford, Natl Univ Singapore, and Univ Edinburgh) in conducting research on AI ethics may indicate the increasing interest in this field from academia and its urgency.

Fig. 4 shows the co-authorship map for the top 30 countries and regions in relation to the research on AI ethics. The USA produces the most research discussing AI ethical issues; its collaborative network covers almost all the countries/regions in this map, and it has particularly strong ties with the UK, Canada, Australia, China, Germany, and the Netherlands. However, while domestic collaboration is obviously the key pattern in the leading countries (e.g., approximately 66% in the USA, 52% in the UK, and 62% in China), we also observe that several European countries, such as Austria, Belgium, and Sweden, have a preference for international collaboration, where the proportion of international collaboration reaches almost 60%.

Table 3 shows the publication sources (e.g., research journals and magazines) which publish research on AI ethics, as well as the interactions between the WoS subject categories of these sources. The 3,259 articles in dataset #3 were published in 1,936 publication sources, including journals, conference proceedings, and magazines, and Table 3 lists the top 24 most productive publication sources on AI ethics. Except for three conference proceedings and one magazine, most of these publications are research journals from diverse disciplines, such as computer science, medical science, biology, and media. As reflected in the names of these publications, one common interest of these journals is to investigate the societal impact (e.g., ethics, law, crimes, and sustainability) of science and technology.

Every journal covered by the Web of Science Core Collection is assigned to at least one of 254 subject categories. We retrieved 199 WoS categories from these 1,936 publications, revealing a multi-disciplinary interest in AI ethics, and we visualized their co-occurrence relationships in Fig. 5. We summarize and discuss the following key observations:

• The three WoS categories ("computer science & artificial intelligence", "computer science, theory & methods", and "computer science & information systems"), which build the core knowledge pillars on the fundamental research and applications of AI, together with "engineering, electrical & electronic", "computer science, hardware & software", and "computer science, software engineering", form the technical backbone of AI (red nodes). Its key application areas in "medical informatics" and "health care sciences & services" further extend this technical scope (light green nodes).
• Ethical issues (purple and gray-blue nodes) are discussed in extensive categories of social sciences, such as "ethics", "history & philosophy of science", "philosophy", "medical ethics", and "social sciences, biomedical". As supplementary sources, "information science & library science", "management", and "economics" provide analytic approaches (brown nodes), while the engagement of "regional & urban planning", "environmental studies", "political science", and "education & educational research" involves new application scenarios (blue nodes).

To track the knowledge flow through the citation behaviors of the 3,259 articles, we collected their references and retrieved 51,431 journals. The co-occurrence relationships of these cited journals are visualized in Fig. 6, providing a new angle to identify the research communities contributing to the research on AI ethical issues. We raise the following points:

• Relatively clear boundaries among five communities indicate an established knowledge system on AI ethics, namely computer science (purple nodes), information systems and management (red nodes), medical sciences and multi-disciplinary studies (green nodes), and law and general magazines (blue nodes).
• Leading journals play an active role in bridging disciplines, and the publication of AI ethics in reputable newspapers and magazines assists in increasing the awareness of the general public of these issues. The following publications construct the backbone of this knowledge system: Nature, Science, PNAS, PloS One, JAMA, New England Journal of Medicine, Lecture Notes in Computer Science, Communications of the ACM, Big Data & Society, Information Communication & Society, Guardian, New York Times, Ethics and Information Technology, and Science and Engineering Ethics.

In conclusion, in this section we identified the key players (e.g., research institutions and communities) which contribute to the research on the ethical issues surrounding AI and the countries/regions where this research is being undertaken. Such insights draw a landscape to support the understanding of "who" has been involved in the study of AI ethics and "how" they have contributed to this topic. In particular, we highlighted the role of cross-disciplinary research publications (e.g., Communications of the ACM and Lecture Notes in Computer Science), multi-disciplinary research journals (e.g., Nature and Science), and newspapers (e.g., New York Times) in gradually transferring technical AI knowledge to inform public concerns on ethics.

4.2. Landscapes and evolution of AI's ethical topics

In this section, we move our focus to topic analyses by analyzing dataset #1 collected from the WoS All Databases. We initially retrieved 93,364 terms from the combined titles and abstracts of the 4,375 articles, and we conducted a term clumping process [13] to remove noise and consolidate the technical synonyms, reducing the total number of terms to 52,054. Then, we used the 2,163 terms appearing in more than 2 articles as the core set of terms to generate the topical hierarchical tree (THT) shown in Fig. 7 and the map of the scientific evolutionary pathways (SEP) shown in Fig. 8.

Fig. 7 enhances the understanding of the details of AI ethical issues, especially the connections between specific AI techniques and ethical concerns. Among its 71 nodes, the THT lists 27 AI techniques (e.g., machine learning) and AI-driven applications, devices, and products (e.g., robots and autonomous vehicles), 28 ethical topics (e.g., fairness and discrimination), and 16 societal topics (most of them in relation to medical and healthcare issues). The four main branches of this THT represent four major issues relating to AI ethics, that is, #1 AI techniques and potential ethical issues, #2 technological and political implications of AI ethics, #3 data privacy, and #4 privacy in healthcare. We discuss these four issues in detail:

#1 AI techniques and potential ethical issues: Fig. 7 reveals the key AI techniques that may raise ethical concerns, such as machine learning (including deep learning, computer vision, neural networks, natural language processing, etc.), ontologies, communication technologies, and neuroscience.10 Machine learning, one of the key areas in AI, shares close connections with almost all AI techniques, and thus attracts the most attention in this THT and is connected with all ethical issues, such as fairness,

10 Neuroscience here mostly refers to techniques of brain computer interface.
Fig. 3. Co-authorship network for key affiliations in relation to research on AI ethics (visualization tool: VOSViewer [43]).
Note that in science maps generated by VOSViewer in this paper (i.e., Figs. 3, 5, and 6), a node represents an entity (e.g., an institution, a WoS category, or a publication source) and an edge indicates a co-occurrence relationship between its connected nodes. The size of a node represents its importance, measured by the total number of records linked to this node in our dataset. The color of a node represents the group of entities to which the node belongs. Since Fig. 3 contains more than 50 research groups, we do not list all of those colors in a legend. High-resolution versions of Figs. 3–9 can be found at https://github.com/IntelligentBibliometrics/KBS-AI-Ethics.
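The node and edge weights described in this note can be illustrated with a small counting sketch: each record contributes one count to every entity it lists (node size) and one count to every pair of entities it lists together (edge strength). The records below are invented for illustration, and the function name `cooccurrence` is hypothetical; this shows only the counting step, not the layout or clustering performed by a visualization tool.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Set, Tuple


def cooccurrence(records: Iterable[Set[str]]) -> Tuple[Counter, Counter]:
    """Count records per entity and records per unordered entity pair."""
    node_weight: Counter = Counter()
    edge_weight: Counter = Counter()
    for entities in records:
        node_weight.update(entities)
        # Sort so that each unordered pair always gets the same key.
        for a, b in combinations(sorted(entities), 2):
            edge_weight[(a, b)] += 1
    return node_weight, edge_weight


# Hypothetical records, each listing the affiliations of one article.
records = [
    {"MIT", "Univ Oxford"},
    {"MIT", "Univ Oxford", "Univ Edinburgh"},
    {"Univ Edinburgh"},
]
nodes, edges = cooccurrence(records)
```

The same counting applies to other entity types mentioned in the text, such as WoS categories, cited journals, or pairs of technique and ethics terms, by changing what each record's set contains.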
Table 3
Publication sources with more than 10 articles on AI ethics.
No.  #R  Publication source
1    56  AI & Society
2    41  Big Data & Society
3    40  Science and Engineering Ethics
4    34  Ethics and Information Technology
5    23  Computer Law & Security Review
6    23  Journal of Medical Internet Research
7    23  Minds and Machines
8    19  IEEE Access
9    18  Proceedings of the 2018 AIES
10   17  Philosophical Transactions of The Royal Society A
11   16  BMC Medical Ethics
12   15  Information Communication & Society
13   13  Proceedings of the 2019 AIES
14   13  BMJ Open
15   12  AI Magazine
16   12  Journal of Information Communication & Ethics in Society
17   12  Russian Journal of Criminology
18   11  Asian Bioethics Review
19   11  OMICS
20   11  Proceedings of the 2019 ECIAIR
21   11  Sustainability
22   10  Journal of Bioethical Inquiry
23   10  New Media & Society
24   10  Social Media + Society
discrimination, liability, frauds, and criminals.11 It is easy to explain these cases. For example, applying AI models to make decisions entails a justiciable "right to a well-calibrated machine decision" [45,46], and AI-driven fraud in social media, political elections, and financial markets (e.g., fake videos and identifications manipulated by AI techniques, such as image processing and face recognition) has become a major concern [47]. How to validate AI recommendations with human knowledge in actual cases, such as clinical practice, is challenging both the AI community and the receptivity of the general public [48]. A brand-new topic, brain computer interface, is attracting increasing attention from the public, and ethical issues (such as privacy) and related regulations are appearing in public reading materials [49].

#2 technological and political implications of AI ethics: As an extension of the ethical issues in #1, #2 further extends AI's influence from ethics to the broader society through specific technological and political implications, such as sustainability, responsibility, and digitalization. From the perspective of a complex ecosystem, these societal reactions could be the resilient progress of an ecosystem responding to disruptions introduced by AI techniques and their resulting ethical issues [50].

#3 data privacy and #4 privacy in healthcare: #3 and #4 are a specific case of AI ethics. The big data boom initially activated the public's concerns on data privacy, where the illegal exposure of personal data, particularly data linked with social media, occurred, e.g., the Facebook case in Footnote 1. Furthermore, while analyzing health data (e.g., electronic health records), including

11 We note that fairness is one constraint in evaluating reinforcement learning approaches and fraud detection is a specific task of machine learning, and thus these variations might introduce noise to our analysis.
Fig. 4. Co-authorship map for key countries and regions in relation to the research on AI ethics (visualization tool: Circos [44]).
Note that in a map generated by Circos in this paper (i.e., Figs. 4 and 9), one slice with a unique color represents one entity (e.g., countries/regions, and topics),
and the ribbon link between two slices indicates the strength of their co-occurrence. Specifically, the self-linked ribbons in Fig. 4 represent domestic collaborations
within a country/region.
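As a rough illustration of how a domestic collaboration share like those quoted for Fig. 4 might be derived, the sketch below treats a record as domestic for a country when that country is the only one among its author affiliations. The source does not state how its percentages were computed, so this definition, the sample records, and the name `domestic_share` are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def domestic_share(records: Iterable[Set[str]]) -> Dict[str, float]:
    """Fraction of each country's records that involve no other country."""
    totals: Dict[str, int] = defaultdict(int)
    domestic: Dict[str, int] = defaultdict(int)
    for countries in records:
        for country in countries:
            totals[country] += 1
        if len(countries) == 1:
            # Only one country on the record: count it as domestic.
            domestic[next(iter(countries))] += 1
    return {c: domestic[c] / totals[c] for c in totals}


# Hypothetical records, each listing the countries of one article's authors.
records = [{"USA"}, {"USA"}, {"USA", "UK"}, {"UK"}, {"Austria", "Germany"}]
shares = domestic_share(records)
```

One limitation of this simplification is that single-author papers also count as domestic; a stricter version would first restrict the records to those with at least two authors or affiliations.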
clinical trials and gene sequencing data, provides evidence for precision medicine, privacy concerns in the medical and healthcare sectors become not only a societal issue but also a threat to national strategies and the sustainability and balance of nature [51].

To further explore the details of these AI ethical issues and their evolutionary relationships over the past few decades, the topical evolutionary pathways on AI ethics are visualized in Fig. 8. We set 'expert systems' as the starting point of the evolutionary pathways, considering it is a representative AI technique/application in the 1990s and before. Seven communities, represented by different colors in Fig. 8, uncover diverse interests and emphases in AI techniques, applications, and related ethical concerns. They are #1 expert systems (dusty yellow), #2 criminal investigation (macaron blue), #3 machine ethics (grass green), #4 anonymity (light purple), #5 decision making (ocean blue), #6 health care (orange), and #7 clinical practice (peach red). We discuss our findings as follows:

Expert systems (#1) represent the interactions between AI techniques (and information technologies in the early years) and human knowledge, and thus, together with practical cases such as energy efficiency, it seems that fairness issues mainly appear in this path.

Criminals (#2) involves criminal justice, crime analysis, cyber criminals, and liability. This relatively new community started in 2014, and its two large branches appeared in 2016 and after. One interesting aspect here is the involvement of face recognition techniques in cybercrime, and the 'deepfake' story12 may well endorse this observation, in which an AI mobile app can insert faces in place of film and TV characters and may result in fraud by defeating the 'Face ID' function in smart phones. The other aspect for computer vision is its use in law enforcement (e.g., surveillance systems) for crime detection in national security activities. However, such techniques violate personal privacy in these practical uses.

The study of machine ethics (#3) results in a timeline showing how public concerns about social media privacy have changed

12 See details of this news on the website: https://www.bbc.com/news/technology-49570418.
Fig. 5. Co-occurrence network for WoS categories on AI ethics (visualization tool: VOSViewer [43]).
Fig. 6. Co-citation network for journals cited by research articles on AI ethics (visualization tool: VOSViewer [43]).
over time, e.g., from illicit activities of social media platforms in 2016 to responsibility one year later, and from a governance framework in 2019 to regulations in relation to ethical behaviors and dimensions in 2020.

From anonymity in 2004, #4 develops into a relatively broad scope of ethical concerns in research and health data (e.g., the potential influence of electronic health records), data protection, and privacy. In the other main branch of this community, from a technical perspective, autonomous vehicles and cognitive capabilities could act as open data sources and benefits, but interestingly, how to protect sensitive information in open data initiatives has become an issue as well, and cybersecurity further strengthens such protection [52].

#5 is a community investigating the traditional base of information systems, in which multi-agent systems and intelligent systems were involved before 2014. After this, the increased use of constraints such as accountability, confidentiality, and sustainability to evaluate the capabilities of information systems indicates the emerging interests of this research community. In particular, rooted in accountability, new concerns on the monetization of data were raised in 2019, inspiring worldwide debates on diverse aspects, from political governance to legal and financial regulations.

#6 is an extension of decision-making in diverse scenarios such as health care and medical data and with diverse theories, concepts, and techniques, such as ontology, neuroscience, and game theory, but in this path, human morality, together
with human factors and misleading information, is specifically highlighted. Another highlight here relates to neuroscience. As we discussed for Fig. 7, brain computer interface may align with this topic, which analyzes brain signals and makes decisions for human beings, and such activities attract comments on human morality: [49] quotes one of its interviewed ethicists as saying that a brain computer interface device "was more than a device. . . the company owned the existence of this new person". Thus, it is critical to discuss how to regulate these new AI devices.

#7 contains the largest number of emerging topics generated in recent years, and ethical issues in clinical practice are a key concern not only to academia but also to the general public. As in our discussion of #4, such concerns mostly revolve around the illegal use of various personal data, such as health records, clinical data, and genomic data, as well as data sharing and security. In 2020, following the topic of bioethics, gene editing, the subject of the 2020 Nobel Prize in Chemistry, attracted the attention of this community.

As a conclusion for the SEP, (1) data privacy is an urgent topic relating to AI ethics, particularly when data contain sensitive personal information, as with clinical trials and genomics; (2) the increasing threats and fears in relation to AI-driven fraud and cybercrime are drawing attention; and (3) the reliability, transparency, and fairness of AI models are still unsolved issues.

As discussed and highlighted in the THT and SEP, of particular interest to the AI community is the discovery of potential conflicts between AI techniques and specific ethical issues. Thus, referring to our search strategy (Table 1) and terms appearing in Figs. 7 and 8, we selected 15 AI technique-related terms and 17 AI ethics-related terms, and visualized their co-occurrence relationships in Fig. 9. We discuss these AI techniques and their closely connected ethical issues in the following:

• Machine learning as a representative technique, including deep learning, reinforcement learning, and neural networks, touches all 17 ethical issues, particularly fairness, accountability, and privacy. Despite different emphases, data mining
and cloud technologies follow similar patterns. In this area, all concerns come to the balance between AI decisions and the mechanism behind that decision (e.g., data collection and algorithmic transparency).
• Computer vision (including face recognition and image processing techniques) is raising concerns from the general public. These are directly linked with crime (regarding manipulated fake images and videos and surveillance systems used for national and domestic security detection) and accountability issues.
• Robots, as an engineering-driven AI application, draw attention in relation to machine ethics, responsibility, accountability, liability, and privacy, as do autonomous vehicles. Since political regulations for those intelligent machines lag behind their technological progress, the general public worries about the reliability of these new technologies (e.g., the safety of an autonomous car), and broad ethical and moral issues (e.g., how shall we charge a machine with a crime, and who is liable for a failure).
• Blockchain techniques, an interdisciplinary area with both AI hardware and software, attract criticism in relation to accountability and sustainability. In fact, from a public point of view, blockchain techniques are heavily involved in the internet of things, and thus, compared to traditional ethical issues, sustainability is a special concern in this area.
• Neuroscience, as a discipline for techniques in brain computer interface, represents current AI activities in collecting personal information, in clinical trials, healthcare records, genomic data, and brain signals. Despite great potential in benefiting human beings in precision medicine, disability and accessibility services, and smart home, critical concerns align with privacy and responsibility.
4.3. Current emerging issues in AI ethics and privacy
The specific interest of digging out emerging issues in AI ethics leads us to timely articles published in the three leading outlets (i.e., Nature, Science, and PNAS), and thus, we analyzed the 53 articles in dataset #2 but we removed articles published before 2015, then manually read the remaining 46 articles and selected 34 articles which directly touch on AI ethical issues. Interestingly, most of these articles are opinions, news, and comments, and Nature is the key publication (25 articles). Compared to research articles, these ‘‘informal’’ types of articles might reflect the increasing interest of the public in AI ethics, and such opinions and comments could be rapid reactions to emerging ethical issues in relation to AI (but may need further extensions and studies to enrich them to full research articles). Given this circumstance, we consider this section as a complementary study of Section 4.2, and the main purpose here is to explore current emerging issues in AI ethics.
With the involvement of manual intervention, we categorized the 34 articles into the following main topics in Fig. 10.
Privacy issues (30%) are one of the key emerging concerns. This is consistent with the position of UNESCO in its recent draft of the Recommendation on the Ethics of Artificial Intelligence, which, as mentioned in Section 2.1, listed privacy as one of the key principles of AI ethics. Specifically, concerns were expressed in 18% of the articles about healthcare data privacy, with a focus on issues such as the balance of governance on public health control and data privacy for patient records, disease monitoring, and genomic data. The other 12% of articles expressed concerns relating to big data privacy. One specific interest comes from this observation, which reveals that healthcare data privacy constitutes a separate topic from the topic of data privacy. In fact, this result is consistent with the current privacy governance and regulatory structure in Australia. Using the privacy laws in the State of New South Wales (NSW) as an example, there are two major statutory laws governing privacy and personal information protection in NSW. One is the Privacy and Personal Information Protection Act 1998 (NSW) (hereinafter ‘PPIPA’). The other is the Health Records and Information Privacy Act 2002 (NSW) (hereinafter ‘HRIP’). The PPIPA offers protection to all personal information
Fig. 9. Co-occurrence map between 15 key AI techniques and 17 ethical topics (visualization tool: Circos [44]).
Note that bold and italic labels represent ethical topics and other labels represent AI technique topics.
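The co-occurrence relationships summarized in Fig. 9 can be illustrated with a minimal sketch. It assumes each record is a plain text string and uses simple case-insensitive substring matching, which is a stand-in chosen for illustration and not the authors' term-extraction pipeline. The function name `cooccurrence`, the toy records, and the shortened term lists are all hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence(records, technique_terms, ethics_terms):
    """Count, for each (technique, ethics) pair, the records mentioning both.

    Matching is a naive case-insensitive substring test (an assumption made
    for this sketch); a real study would match on cleaned, curated terms.
    """
    counts = Counter()
    for text in records:
        lowered = text.lower()
        techniques = [t for t in technique_terms if t in lowered]
        ethics = [e for e in ethics_terms if e in lowered]
        # Every technique found in a record co-occurs with every ethics term found.
        for pair in product(techniques, ethics):
            counts[pair] += 1
    return counts


# Toy records invented for illustration; they are not from the paper's dataset.
records = [
    "Machine learning models raise fairness and privacy concerns.",
    "Face recognition and computer vision enable surveillance; accountability is unclear.",
    "Privacy risks of machine learning on health records.",
]
technique_terms = ["machine learning", "computer vision"]
ethics_terms = ["fairness", "privacy", "accountability"]

matrix = cooccurrence(records, technique_terms, ethics_terms)
for (technique, issue), n in sorted(matrix.items()):
    print(f"{technique} x {issue}: {n}")
```

The resulting pair counts are the kind of weighted links that a chord-style tool such as Circos [44] can draw between the two term groups; pairs that never co-occur simply have a count of zero.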
except health information. Section 4A of the PPIPA explicitly excludes the ‘‘health information’’ under the HRIP from the definition of personal information under the PPIPA.13 In contrast, the HRIP focuses on health information with the purpose of ‘‘promoting fair and responsible handling of health information’’ in particular.14 Such a governance structure (personal information + health information) is arguably consistent with the structure of our observations in the series of topic analyses (data privacy + healthcare data privacy). This may arguably serve as prima facie evidence on the accuracy and reliability of our system.
Other concerns are mainly related to machine ethics (23%) and fairness (20%). Specifically, machine ethics touches on a wide range of topics relating to the morality of intelligent machines (e.g., AI cars), how to uphold human rights with robots, and the consciousness of machines. These discussions reflect the potential fears of the general public relating to these unknown but extremely smart machines and the dilemma between technology and ethics. On the other hand, fairness indicates the unease of the general public as to whether AI models can generate fair decisions in diverse scenarios. Articles related to AI strategy mainly talk about how to regulate this new AI world in a power-shifting theme (e.g., how to seek a balance between human beings and intelligent machines) and how national strategies and military actions should be involved in the development of AI techniques.
In addition to a general discussion on the issues surrounding AI ethics, surveillance seems to be an increasing concern, where the authors of these articles call for the review and regulation of AI surveillance systems, regardless of whether they are used for national security, industrial monitoring, or research/individual use.
5.1. Key findings
Referring to Tables 2 and 3, this paper found that the key contributors to the research on the ethical issues relating to AI were English-speaking countries such as the USA, UK, Australia, and Canada. In comparison, China and the European countries contribute to this research area as well, but their key research institutions are not as appealing as those of English-speaking countries. According to Fig. 4, intriguingly, those countries making the major contribution to the research on AI ethical issues, namely the USA, UK, and China, mostly engage in domestic collaboration; however, certain European countries, such as Austria, Belgium and Sweden, seem to prefer international collaboration.
In Figs. 5 and 6, the ethical issues relating to AI cover a wide range of disciplines (i.e., 199 of the 254 WoS categories), and four research communities play an active role in the research associated with AI ethical issues, namely computer science, business and management, medical science, and law. The involvement of newspapers and magazines in publishing research on AI ethical issues indicates the interest of the general public in these matters.
In terms of topic analysis in Sections 4.2 and 4.3, key AI techniques such as machine learning, data analysis, robots and intelligent systems, and cloud technologies generate concerns about the ethical issues relating to AI. Fairness, as well as discrimination, are among those key concerns because AI models are applied in decision support in diverse scenarios. Data privacy, particularly in the healthcare and medical sectors, is a cause of increasing concern. Cybercrime and fraudulent behavior are particularly concerning in the absence of appropriate support from the law and regulations. Machine ethics are mostly related to robots, autonomous cars, and intelligent machines, highlighting a balance between machine consciousness and human rights.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work is supported by the Australian Research Council under Discovery Early Career Researcher Award DE190100994.
References
[1] R. Harper, Inside the Smart Home, Springer Science & Business Media, 2006.
[2] A. Walter, R. Finger, R. Huber, N. Buchmann, Opinion: Smart farming is key to developing sustainable agriculture, Proc. Natl. Acad. Sci. 114 (24) (2017) 6148–6150.
[3] F.S. Collins, H. Varmus, A new initiative on precision medicine, N. Engl. J. Med. 372 (9) (2015) 793–795.
[4] M.S. Hossain, G. Muhammad, N. Guizani, Explainable AI and mass surveillance system-based healthcare framework to combat COVID-19 like pandemics, IEEE Netw. 34 (4) (2020) 126–132.
[5] J. Bossmann, Top 9 ethical issues in artificial intelligence. World Economic Forum, 2020, (Accessed October 26, 2020).
[6] V.C. Müller, Ethics of Artificial Intelligence and Robotics, 2020.
[7] Y. Zhang, G. Zhang, H. Chen, A.L. Porter, D. Zhu, J. Lu, Topic analysis and forecasting for science, technology and innovation: Methodology and a case study focusing on big data research, Technol. Forecast. Soc. Change 105 (2016) 179–191.
[8] Y. Zhang, et al., Does deep learning help topic extraction? A kernel k-means clustering method with word embedding, J. Informetr. 12 (4) (2018) 1099–1117.
[9] M. Wu, Y. Zhang, G. Zhang, J. Lu, Exploring the genetic basis of diseases through a heterogeneous bibliometric network: A methodology and case study, Technol. Forecast. Soc. Change (2020).
[10] Y. Zhang, M. Wu, Z. Hu, R. Ward, X. Zhang, A. Porter, Profiling and predicting the problem-solving patterns in China’s research systems: A methodology of intelligent bibliometrics and empirical insights, in: Quantitative Science Studies, 2020.
[11] D. Cetindamar, T. Lammers, Y. Zhang, Exploring the knowledge spillovers of a technology in an entrepreneurial ecosystem—The case of artificial intelligence in Sydney, Thunderbird Int. Bus. Rev. 62 (5) (2020) 457–474.
[12] Y. Zhang, X. Zhou, A.L. Porter, J.M.V. Gomila, How to combine term clumping and technology roadmapping for newly emerging science & technology competitive intelligence: Problem & solution pattern based semantic TRIZ tool and case study, Scientometrics 101 (2) (2014) 1375–1389.
[13] Y. Zhang, A.L. Porter, Z. Hu, Y. Guo, N.C. Newman, Term clumping for technical intelligence: A case study on dye-sensitized solar cells, Technol. Forecast. Soc. Change 85 (2014) 26–39.
[14] T. Mikolov, I. Sutskever, K. Chen, G.S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, Adv. Neural Inf. Process. Syst. (2013) 3111–3119.
[15] Y. Zhang, G. Zhang, D. Zhu, J. Lu, Scientific evolutionary pathways: Identifying and visualizing relationships for scientific topics, J. Assoc. Inf. Sci. Technol. 68 (8) (2017) 1925–1939.
[16] R. Attfield, Ethics: An Overview, Bloomsbury Publishing, 2012.
[17] K. Grace, J. Salvatier, A. Dafoe, B. Zhang, O. Evans, When will AI exceed human performance? Evidence from AI experts, J. Artificial Intelligence Res. 62 (2018) 729–754.
[18] J.-F. Bonnefon, A. Shariff, I. Rahwan, The social dilemma of autonomous vehicles, Science 352 (6293) (2016) 1573–1576.
[19] W. Wallach, C. Allen, Moral Machines: Teaching Robots Right from Wrong, Oxford University Press, 2008.
[20] L. Floridi, Artificial intelligence, deepfakes and a future of ectypes, Philos. Technol. 31 (3) (2018) 317–321.
[21] C. Cath, S. Wachter, B. Mittelstadt, M. Taddeo, L. Floridi, Artificial intelligence and the ‘good society’: the US, EU, and UK approach, Sci. Eng. Ethics 24 (2) (2018) 505–528.
[22] G.Y. Tian, Current issues of cross-border personal data protection in the context of cloud computing and trans-Pacific partnership agreement: Join or withdraw, Wis. Int’l LJ 34 (2016) 367.
[23] C. Kerry, Protecting Privacy in an AI-Driven World, Brookings, 2020.
[24] A. Jobin, M. Ienca, E. Vayena, The global landscape of AI ethics guidelines, Nat. Mach. Intell. 1 (9) (2019) 389–399.
[25] B. Mittelstadt, AI Ethics–Too principled to fail, 2019, arXiv preprint arXiv:1906.06668.
[26] T. Hagendorff, The ethics of AI ethics: An evaluation of guidelines, Minds Mach. (2020) 1–22.
[27] D. Price, Little Science, Big Science, NY: Columbia University Press, 1963.
[28] A. Pritchard, Statistical bibliography or bibliometrics, J. Doc. 25 (4) (1969) 348–349.
[29] Y. Zhang, Y. Guo, X. Wang, D. Zhu, A.L. Porter, A hybrid visualisation model for technology roadmapping: Bibliometrics, qualitative methodology and empirical study, Technol. Anal. Strateg. Manag. 25 (6) (2013) 707–724.
[30] Y. Zhang, A.L. Porter, S.W. Cunningham, D. Chiavetta, N. Newman, Parallel or intersecting lines? Intelligent bibliometrics for investigating the involvement of data science in policy analysis, IEEE Trans. Eng. Manage. (2020) http://dx.doi.org/10.1109/TEM.2020.2974761.
[31] D.M. Blei, Probabilistic topic models, Commun. ACM 55 (4) (2012) 77–84.
[32] T. Velden, K.W. Boyack, J. Gläser, R. Koopman, A. Scharnhorst, S. Wang, Comparison of topic extraction approaches and their results, Scientometrics 111 (2) (2017) 1169–1221, http://dx.doi.org/10.1007/s11192-017-2306-1.
[33] Y. Zhang, H. Chen, J. Lu, G. Zhang, Detecting and predicting the topic change of knowledge-based systems: A topic-based bibliometric analysis from 1991 to 2016, Knowl.-Based Syst. 133 (2017) 255–268.
[34] J. Guo, X. Wang, Q. Li, D. Zhu, Subject–action–object-based morphology analysis for determining the direction of technological change, Technol. Forecast. Soc. Change 105 (2016) 27–40.
[35] L. Huang, Y. Zhu, Y. Zhang, X. Zhou, X. Jia, A link prediction-based method for identifying potential cooperation partners: A case study on four journals of informetrics, in: 2018 Portland International Conference on Management of Engineering and Technology (PICMET), IEEE, 2018, pp. 1–6.
[36] E. Yan, R. Guns, Predicting and recommending collaborations: An author-, institution-, and country-level analysis, J. Informetr. 8 (2) (2014) 295–309.
[37] Y. Zhang, X. Wang, L. Huang, G. Zhang, J. Lu, Predicting the dynamics of scientific activities: A diffusion-based network analytic methodology, in: 2018 Annual Meeting of the Association for Information Science and Technology, Vancouver, Canada, 2018.
[38] T.L. Griffiths, M.I. Jordan, J.B. Tenenbaum, D.M. Blei, Hierarchical topic models and the nested Chinese restaurant process, Adv. Neural Inf. Process. Syst. (2004) 17–24.
[39] The IEEE global initiative on ethics of autonomous and intelligent systems, in: Ethically Aligned Design: Prioritizing Human Wellbeing with Autonomous and Intelligent Systems, First ed., IEEE, 2019.
[40] M.Z. Wu, Yi, Hierarchical topic tree: A hybrid model incorporating network analysis and density peaks searching, in: Presented at the 18th International Conference on Scientometrics & Informetrics, Leuven, Belgium, 2021.
[41] M. Bastian, S. Heymann, M. Jacomy, Gephi: An open source software for exploring and manipulating networks, in: Proceedings of International AAAI Conference on Web and Social Media, Vol. 8, 2009, pp. 361–362.
[42] M.E. Newman, Modularity and community structure in networks, Proc. Natl. Acad. Sci. 103 (23) (2006) 8577–8582.
[43] L. Waltman, N.J. van Eck, E.C. Noyons, A unified approach to mapping and clustering of bibliometric networks, J. Informetr. 4 (4) (2010) 629–635.
[44] M. Krzywinski, et al., Circos: An information aesthetic for comparative genomics, Genome Res. 19 (9) (2009) 1639–1645.
[45] P. Kalluri, Don’t ask if artificial intelligence is good or fair, ask how it shifts power, Nature 583 (7815) (2020) 169.
[46] A.Z. Huq, A right to a human decision, Va. L. Rev. 106 (2020) 611.
[47] T.C. King, N. Aggarwal, M. Taddeo, L. Floridi, Artificial intelligence crime: An interdisciplinary analysis of foreseeable threats and solutions, Sci. Eng. Ethics 26 (1) (2020) 89–120.
[48] W.N. Price, S. Gerke, I.G. Cohen, Potential liability for physicians using artificial intelligence, JAMA 322 (18) (2019) 1765–1766.
[49] L. Drew, The ethics of brain-computer interfaces, Nature 571 (7766) (2019) S19.
[50] Y. Zhang, X. Cai, C.V. Fry, M. Wu, C. Wagner, Topic Evolution, Disruption and Resilience in Early COVID-19 Research, SSRN, 2020, http://dx.doi.org/10.2139/ssrn.3675020.
[51] B.L. Webber, S. Raghu, O.R. Edwards, Opinion: Is CRISPR-based gene drive a biocontrol silver bullet or global conservation threat?, Proc. Natl. Acad. Sci. 112 (34) (2015) 10565–10567.
[52] B. Green, G. Cunningham, A. Ekblaw, P. Kominers, A. Linzer, S.P. Crawford, Open Data Privacy, Berkman Klein Center Research Publication, 2017, pp. 07–17, no. 2017-1.
Respecting AI ethics, particularly in the context of privacy and data ownership, has profound societal implications. As AI systems process vast amounts of data, societal concerns about data monetization and privacy violations grow, prompting debates on legal and financial regulations. Respecting these ethical standards could improve public trust, influence political governance, enhance the protection of individual rights, and ensure equitable access to technology. This reflects a broader need to balance technological advancement with ethical responsibility, safeguarding both personal freedoms and promoting transparency.
Current AI ethics concerns, such as those involving data privacy and accountability, align closely with historical challenges posed by AI systems through consistent themes of transparency and fairness. Historically, AI ethics grappled with ensuring systems acted reliably and without bias, rooted in the operation and decision-making of expert systems. Today, this alignment continues as the complexity and scale of AI systems amplify these issues, particularly with the vast amounts of sensitive data they handle and the significant impact of their decisions on personal and societal levels.
AI techniques in clinical practice intersect with ethical considerations primarily through issues of data privacy, transparency, and bias. As AI is increasingly used in processing electronic health records, clinical trials, and genomic data, concerns arise regarding the illegal use and sharing of sensitive personal information. The challenge lies in ensuring that AI systems operate transparently and without bias, aligning with human morality, which is crucial to maintaining trust in AI systems within healthcare.
Brain-computer interfaces have significantly influenced discussions on human morality and ownership within AI ethics. The ability of these interfaces to interpret and act upon brain signals raises ethical questions about who owns and controls the data and outcomes derived from such intimate access to human thoughts. The concern is amplified by the prospect of a company or device essentially "owning" aspects of a person's identity or decision-making capabilities, questioning the boundary between human and machine, and emphasizing the need for clear ethical guidelines and regulations.
AI surveillance systems pose challenges related to privacy intrusion, data security, and potential misuse for governmental or commercial purposes. These systems amplify concerns about constant monitoring and the erosion of personal freedoms. Recommendations for regulation include establishing clear guidelines for data use, transparency in surveillance operations, and accountability measures to prevent abuse. Public and governmental oversight is crucial to ensuring that these systems operate ethically and that the benefits of such technologies do not override fundamental human rights.
Human morality plays a crucial role in integrating AI into decision-making processes, especially within healthcare. AI systems must align with ethical norms to be accepted and trusted by human users, especially when making decisions that impact human health and life. This integration challenges designers to create systems that reflect human values, balance moral judgments like fairness and empathy, and address issues such as misleading information and bias. The alignment of AI with human morality ensures that technological advancements genuinely benefit society, maintaining trust and accountability in sensitive areas.
The evolution of ethical issues in AI has progressed from debates around expert systems in the 1990s to a diverse array of current concerns including machine ethics, decision-making, and privacy in healthcare. Initially, ethical discussions focused on the system's capabilities and decision-making processes. Over time, these concerns expanded to include broader societal implications such as privacy, data ownership, and human morality as AI applications permeated various aspects of daily life. Each stage in this evolution reflects an increasing awareness of AI's potential impacts, necessitating ongoing analysis and regulatory updates.
From a complex ecosystem perspective, societal reactions to AI techniques concerning sustainability and responsibility have been multifaceted. Societies have shown resilience in responding to disruptions introduced by AI, with reactions encompassing calls for sustainable practices and heightened responsibility in AI deployment. The resilience indicated by these reactions reflects a shift towards viewing AI not just as a technological tool but as a societal actor with ethical obligations. This includes implementing regulations that ensure AI contributes positively to societal goals and minimizes harm, emphasizing the need for sustainable and responsible AI innovation.
Reinforcement learning and fraud detection methodologies introduce noise in AI ethics analysis because they interact differently with ethical parameters such as fairness and accountability. Reinforcement learning, a type of machine learning, often involves actions based on trial and error, raising concerns about fairness and unintended outcomes in its applications. Fraud detection, on the other hand, focuses on specific machine learning applications, potentially skewing the analysis by emphasizing detection accuracy over the broader ethical implications like data privacy and bias. This variability necessitates nuanced consideration of each technique's unique ethical footprint.
International contributions to AI ethics research vary significantly, with English-speaking countries like the USA, UK, Australia, and Canada leading the field. These countries' research institutions are highly influential, contrasting with China and some European countries, which also contribute to this research but are less prominent in terms of institutional appeal. This variation highlights differing national priorities and capacities in addressing AI's ethical challenges.