Mariarita Pierotti · Anna Monreale · Federica De Santis

Artificial Intelligence in Accounting and Auditing
Accessing the Corporate Implications

Mariarita Pierotti
Department of Computer Science
University of Pisa
Pisa, Italy

Anna Monreale
Department of Computer Science
University of Pisa
Pisa, Italy

Federica De Santis
Department of Economics and Management
University of Pisa
Pisa, Italy

ISBN 978-3-031-71370-5
ISBN 978-3-031-71371-2 (eBook)
https://doi.org/10.1007/978-3-031-71371-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the
Publisher, whether the whole or part of the material is concerned, specifically the rights
of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on
microfilms or in any other physical way, and transmission or information storage and
retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc.
in this publication does not imply, even in the absence of a specific statement, that such
names are exempt from the relevant protective laws and regulations and therefore free for
general use.
The publisher, the authors and the editors are safe to assume that the advice and informa-
tion in this book are believed to be true and accurate at the date of publication. Neither
the publisher nor the authors or the editors give a warranty, expressed or implied, with
respect to the material contained herein or for any errors or omissions that may have been
made. The publisher remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.

This Palgrave Macmillan imprint is published by the registered company Springer Nature
Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

If disposing of this product, please recycle the paper.


Acknowledgements

Research partly funded by PNRR - M4C2 - Investimento 1.3, Partenariato
Esteso PE00000013 - “FAIR - Future Artificial Intelligence
Research” - Spoke 1 “Human-centered AI”, funded by the European
Commission under the NextGenerationEU program.
This work has been supported by SoBigData.it, which receives funding
from the European Union - NextGenerationEU - National Recovery and
Resilience Plan (Piano Nazionale di Ripresa e Resilienza, PNRR) -
Project: “SoBigData.it - Strengthening the Italian RI for Social Mining
and Big Data Analytics” - Prot. IR0000013 - Avviso n. 3264 del
28/12/2021; and by the NextGenerationEU program under the funding
scheme PNRR-PE-AI (M4C2, investment 1.3, line on AI) FAIR
(Future Artificial Intelligence Research).

Contents

1 The New Corporate Data Ecosystem
Federica De Santis
1.1 Introduction
1.2 The Data Deluge: How the Corporate Data Ecosystem Is Expanding
    The Emergence of Big Data: Definition and Evolution
    The Sources of Big Data
1.3 Big Data and Artificial Intelligence: Opportunities for Decision-Making and Control Processes
1.4 Big Data and Artificial Intelligence: What Are the Challenges for Contemporary Organizations
    The Nature and Complexity of Big Data
    The Black-Boxing Effect and Data Quality Issues
    Privacy and Security Concerns
    Interpretability and Bias in Artificial Intelligence
    Organizational Barriers to Analytics
1.5 Conclusions
References

2 From Data to Corporate Decision-Making Process
Mariarita Pierotti
2.1 From Data to Knowledge
2.2 Knowledge Management
2.3 The Problem-Solving Perspective
    The Data Warehouse
    The Data Mart
    The ETL System
    The OLAP Systems
    Data Mining
2.4 The New Decision-Making Paradigm
References

3 An Introduction to Artificial Intelligence
Anna Monreale
3.1 Introduction
3.2 Data Mining and Machine Learning
    Supervised Machine Learning
    Unsupervised Machine Learning
3.3 Deep Learning
    Supervised Deep Learning Models
    Unsupervised Deep Learning Models and Generative Models
    Generative Large Language Models
3.4 Conclusion
References

4 The Need of Trustworthy Artificial Intelligence
Anna Monreale
4.1 Introduction
4.2 AI Regulation
4.3 Fairness, Explainability, and Privacy in AI Lifecycle
    Biases and Fairness
    Explainability
    Privacy
4.4 Conclusion
References

5 Artificial Intelligence to Support Business Decisions
Federica De Santis
5.1 Introduction
5.2 History and Evolution of Artificial Intelligence in Business Contexts
5.3 Artificial Intelligence Applications in Decision-Making Processes
    Theoretical Foundations on AI’s Adoption in Business Contexts
    Artificial Intelligence for Operational, Tactical, and Strategic Decisions
    The Automation-Augmentation Debate
    Challenges and Future Trends of Using AI for Business Decisions
5.4 Conclusions
References

6 Artificial Intelligence for Risk Management
Federica De Santis
6.1 Introduction
6.2 Fundamentals of Risk Management
6.3 AI Applications in Risk Management: Opportunities and Challenges
6.4 Conclusions
References

7 AI for Financial Accounting
Mariarita Pierotti
7.1 Digitalization in Accounting
7.2 AI in Accounting Environment
7.3 AI Adoption: Challenges for the Accounting Profession
7.4 ChatGPT in the Accounting Field
7.5 Conclusions
References

8 AI for Managerial Accounting
Mariarita Pierotti
8.1 Modern Managerial Accounting
8.2 Business Intelligence and Analytics for Managerial Accounting
8.3 Integration of Data Analytics in ERP Systems for Management Accounting
    The Managerial Accounting Data Analytics (MADA) Framework
8.4 Machine Learning for Management Accounting
    Creating Measures from Textual Data
    Creating Measures from Other Than Textual Sources
    Prediction Methods for MA
8.5 AI Ethics for Managerial Accounting
References

9 Artificial Intelligence in Auditing
Federica De Santis
9.1 Introduction
9.2 The History of Digital Technologies in Auditing: An Overview
9.3 Opportunities and Challenges of Introducing AI in the Audit Process: A Literature Review
    Opportunities of Using AI in Financial Statements Audit
    Challenges Related to the Use of AI in Financial Statements Audit
9.4 Conclusions
References

10 Using AI in the Business Context: Initial Applications in the Italian Business Environment
Mariarita Pierotti
10.1 Introduction
10.2 The Application of AI in Italian Business Environment
10.3 Artificial Intelligence Solution Classes
10.4 AI to Support Business Decisions
10.5 AI for Managerial Accounting Embedded in an ERP Solution
    Target Process Model and Technical Solution
10.6 AI for Financial Accounting: Main Solutions
    Robot Process Automation (RPA) Business Case
    Optical Character Recognition (OCR) Solutions Business Case

Index
List of Figures

Fig. 1.1 An integrated view of Big Data (Source Lee [2017]. Big data: Dimensions, evolution, impacts, and challenges. Business Horizons, 60, p. 295)
Fig. 1.2 The expanding corporate data ecosystem (Source Moffitt and Vasarhelyi [2013]. AIS in the age of Big Data. Journal of Information Systems, 27 (2), p. 10)
Fig. 1.3 Internet of Events and new sources of Data (Source van der Aalst, W. [2014]. Data scientist: The Engineer of the Future)
Fig. 2.1 The DIKW hierarchy (Rowley, 2007)
Fig. 2.2 The approaches: KM as a product and KM as a process (Mentzas et al., 2003)
Fig. 2.3 Reworkings of Nonaka’s original model (Dieng et al., 1999)
Fig. 2.4 The impacts of KM on the corporate structure (Macintosh et al., 1999)
Fig. 2.5 The new paradigm for DSS systems (Courtney, 2001)
Fig. 2.6 The types of latency (Panian, 2009)
Fig. 2.7 The reduction of total latency (Panian, 2009)
Fig. 3.1 Decision tree classifying financial transactions as fraudulent or legitimate
Fig. 3.2 Schema describing the process for learning an ensemble classifier
Fig. 3.3 Dendrogram generated by hierarchical clustering
Fig. 4.1 Ontology of the taxonomy of XAI methods (Guidotti et al., 2021)
Fig. 5.1 Types of decisions in business contexts (Source Author’s own elaboration)
Fig. 8.1 Potential competitive advantage increases with more sophisticated analytics (Davenport & Harris, 2017)
Fig. 8.2 The Managerial Accounting Data Analytics (MADA) framework, motivated from Cokins (2013, p. 27)
Fig. 8.3 Ideal enterprise system structure that supports management accountants in a BI system, motivated from Chaudhuri et al. (2011, p. 90)
Picture 10.1 VAT Declaration and Quarterly settlement’s manual workflow (AS-IS)
Picture 10.2 Automatic Flow Diagram (TO-BE)
Picture 10.3 Automatic Flow Diagram
List of Tables

Table 5.1 Main theories on technology adoption
Table 6.1 Traditional and alternative data sources in risk modeling
Table 10.1 Macro-phases detail’s process model
Table 10.2 Main activities of the technical solution
Table 10.3 Main KPIs of account management activities
Table 10.4 Effort and main pain points of quarterly VAT settlements and VAT declaration
Table 10.5 ERP system core modules

CHAPTER 1

The New Corporate Data Ecosystem

Federica De Santis

1.1 Introduction
Since the shift from an industrial economy to an information economy
during the late 1960s, scholars and practitioners have clearly recognized the
central role information plays in conducting and controlling business
(Bertini, 1990; Galbraith, 1968; McGee et al., 1993). In an information
economy, what you know determines your success far more than
what you own, and competition is mainly based on companies’ “ability to
acquire, manipulate, interpret, and use information effectively” (McGee
et al., 1993, p. 1). Nowadays, the fruits of the “knowledge era” or “information
society” are easy to see, as every device is online, sensors are
ubiquitous and generate continuous streams of data, and the sheer volume
of data produced and consumed on the Internet will increase by orders of
magnitude (Cavanillas et al., 2016; Mayer-Schönberger & Cukier, 2013).
Such a growing “datafication” of the world (Lycett, 2013), where technological
advancements offer new opportunities to extract value from the
so-called data deluge (Cavanillas et al., 2016), led to the emergence of the
concept of Big Data (Gandomi & Haider, 2015). Big Data can be defined
as “high-volume, high-velocity and high-variety information assets that
demand cost-effective, innovative forms of information processing for
enhanced insights and decision-making” (Gartner IT Glossary).
Currently, the annual volume of data and information generated
is approximately 120 zettabytes, and it is expected to increase by over
50% to 181 zettabytes by 2025 (Acciarini et al., 2023). Organizations
from almost every industry have the opportunity to collect and use Big
Data by means of advanced technologies such as data analytics, artificial
intelligence, and machine learning (Acciarini et al., 2023; Akter et al.,
2016; Erevelles et al., 2016) to generate knowledge (de Camargo Fiorini
et al., 2018; Lee, 2017; Zaitsava et al., 2023). However, several commentators
underline that many companies are struggling to manage this
knowledge generation potential (Chen & Xiaotong, 2014; Gandomi &
Haider, 2015; Zaitsava et al., 2023).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
M. Pierotti et al., Artificial Intelligence in Accounting and Auditing,
https://doi.org/10.1007/978-3-031-71371-2_1

This chapter, therefore, introduces the reader to the key concepts
related to the new corporate data ecosystem and lays the groundwork for
understanding how data is collected, stored, managed, and utilized within
an organization. We will also highlight the importance of data ecosystems
in driving business decisions, innovation, and competitive advantage by
focusing on both opportunities and challenges related to the use of Big
Data and related technology to drive and control businesses.

1.2 The Data Deluge: How the Corporate Data Ecosystem Is Expanding
Nowadays, the data explosion combines with unprecedented advancements
in computing power and connectivity, thus allowing for an
increasing amount of Big Data to be analyzed anytime, anywhere (Cappa
et al., 2021; Zillner et al., 2021). Big Data analytics has the potential to
transform the way businesses compete, as it can extract hidden patterns
from raw data to support better decisions, increase productivity,
generate knowledge, and foster innovation (Maroufkhani
et al., 2020).
Big data as a research field finds its roots in practice, rather than in
academia (Manyika et al., 2017) since practitioners first recognized the
relevance of Big Data in driving business innovation, due to its ability
to provide information about customers’ preferences, feedback about
firms’ products and services, and insights about emerging trends (Cappa
et al., 2021; George et al., 2014; Mazzei & Noble, 2017; Vecchio et al.,
2018). Over the last fifteen years, scholars have increasingly paid attention
to Big Data and business analytics by investigating their potential to
revolutionize the way we work, live, and conduct business (Chen et al.,
2012; George et al., 2014; Mikalef et al., 2020). From that moment on,
researchers have attempted to explore the phenomenon of Big Data and
business analytics and to understand how organizations can create and
capture value from their data resources (Cappa et al., 2021; Cavanillas
et al., 2016; Mikalef et al., 2020).

The Emergence of Big Data: Definition and Evolution


The most widespread definition of Big Data refers to the
so-called Vs framework, which stems from the recognition of its distinctive
characteristics. The seminal work of Laney (2001) first proposed the 3Vs
model, focusing on Big Data’s volume, velocity, and variety. Starting from
the 3Vs model, scholars and practitioners described other key features
of Big Data, so that the original interpretive framework has evolved to
also include Big Data’s value, veracity, and variability (Gandomi &
Haider, 2015).
Volume is conceived as a purely quantitative dimension, which refers to
the magnitude of Big Data in terms of the amount of available data, which
either consumes huge storage capacity or consists of a large number of
records. Therefore, “big” only reflects the size of the dataset (Chen &
Xiaotong, 2014; Gandomi & Haider, 2015; George et al., 2014). In fact, a
dataset is considered “Big” only if it pushes the limits of traditional information
systems, thus requiring advanced capturing and storage tools and,
more importantly, advanced data processing capabilities (Vasarhelyi
et al., 2015).
Variety is conceived as the qualitative dimension of Big Data that
“makes big data really big” (Sagiroglu & Sinanc, 2013, p. 43), as it
reflects the structural heterogeneity of a dataset and Big Data’s ability
to provide granular precision by means of a great variety of data sources
(Günther et al., 2017; Yoo, 2015; Zaitsava et al., 2023). In today’s information
society, data is generated by a great variety of sources, comes in many
formats, and contains multidimensional fields (Russom, 2011). Variety,
therefore, relates to the richness of data representation, which enables a
distinction between structured, semi-structured, and unstructured data
(Kaisler et al., 2013). Challenges arise due to the differing nature of
structured and unstructured data. Structured data, which resides in fixed
fields (e.g., in relational databases or spreadsheets), can be managed with
conventional technological solutions. However, structured data constitutes
less than 20% of the existing Big Data. The remainder consists of
semi-structured data, which does not conform to fixed fields but includes
tags and markers to separate data elements (such as XML or HTML-tagged
text), and unstructured data, which lacks fixed fields and includes
freeform text (e.g., books, articles, email bodies) as well as untagged
audio, image, and video data (Manyika et al., 2011).
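The distinction between the three data types can be made concrete with a minimal sketch. The invoice below is a hypothetical example invented for illustration (it does not come from the cited sources), and the sketch uses only the Python standard library. It shows the same business fact stored as structured data (fixed fields), semi-structured data (tagged elements), and unstructured data (free text).

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical invoice, invented for illustration only.
structured_csv = "invoice_id,customer,amount\nINV-001,Acme,120.50\n"
semi_structured_xml = (
    "<invoice id='INV-001'><customer>Acme</customer>"
    "<amount>120.50</amount><note>Paid late</note></invoice>"
)
unstructured_text = "Acme finally paid invoice INV-001 (120.50) after two reminders."

# Structured: fixed fields, so each value is addressed by its column name.
row = next(csv.DictReader(io.StringIO(structured_csv)))
amount_from_csv = float(row["amount"])

# Semi-structured: no fixed fields, but tags mark where each element starts and ends.
root = ET.fromstring(semi_structured_xml)
amount_from_xml = float(root.findtext("amount"))

# Unstructured: no fields or tags. A naive substring search can locate text,
# but interpreting it reliably requires text-analytics techniques.
mentions_invoice = "INV-001" in unstructured_text

print(amount_from_csv, amount_from_xml, mentions_invoice)
```

In this sketch the structured and semi-structured versions yield the amount through a direct lookup, whereas the free-text version only confirms that the invoice is mentioned. This is the practical reason why conventional tools handle structured data well and struggle with the rest.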
The challenging aspect of this attribute lies in the conceptual difference
between data and information¹ and its consequences for information
production processes. Data is the raw material of the information
production process, by means of which data is classified and organized
into a unique “elementary message” aimed at reducing the inherent
uncertainty of decision-making processes, thus increasing their effectiveness.
Information, as a consequence, is the output of the information production
process, which is aimed at adding value to raw data (Marchi,
2003). To transform data, both structured and unstructured, into valuable
information and exploit its potential, appropriate analytics tools and
techniques are required. However, significant challenges exist in finding
analytics and data management technologies that can effectively derive
valuable insights from these vast volumes of unstructured data (De Santis,
2018; Gandomi & Haider, 2015).
The velocity attribute captures the speed at which a company processes
and analyzes data (Akter et al., 2019; Günther et al., 2017; Sivarajah
et al., 2017; Zaitsava et al., 2023). Velocity is challenging for organizations
because it contributes to Big Data’s complexity. Companies
must harness vast amounts of data that is generated quickly and flows
continuously (De Santis, 2018). Such fast data must be analyzed and acted
upon close to real time in order to create value and allow companies
to gain a competitive advantage. Strictly related to the velocity attribute
is Big Data’s variability, which refers to the fluctuations of the data flow
rate, which is subject to periodic peaks and troughs (Gandomi & Haider,
2015).
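As a rough illustration of variability, the following sketch counts how many records arrive in consecutive fixed-width time windows and reports the peak and the trough of the resulting flow rate. The function name `events_per_window`, the two-second window, and the arrival times are assumptions introduced here for illustration; they are not taken from the cited sources.

```python
from collections import Counter


def events_per_window(timestamps, window_seconds):
    """Count events in consecutive fixed-width windows (timestamps in seconds)."""
    counts = Counter(int(t // window_seconds) for t in timestamps)
    first, last = min(counts), max(counts)
    # Include empty windows so that troughs show up as zeros.
    return [counts.get(w, 0) for w in range(first, last + 1)]


# Hypothetical arrival times (in seconds) of incoming records.
arrivals = [0.5, 1.2, 1.9, 2.1, 2.4, 2.8, 3.3, 9.7, 10.1, 10.2, 10.6, 10.9]

rates = events_per_window(arrivals, window_seconds=2)
peak, trough = max(rates), min(rates)
print(rates, peak, trough)  # [3, 4, 0, 0, 1, 4] 4 0
```

The gap between the peak and the trough is what makes capacity planning difficult: a system sized for the average rate is overloaded during peaks, while one sized for peaks sits idle during troughs.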
Big Data’s veracity refers to how predictable, reliable, and accurate
the data and the related analytics are (Ohlhorst, 2012). Therefore, this
attribute identifies the credibility or trustworthiness of the output generated
(de Camargo Fiorini et al., 2018; Zaitsava et al., 2023). Reliability
issues with Big Data mostly emerge because data can be extracted, at a
constantly increasing pace, from many different sources, whether internal
or external to the organization and whether traditional or new. Veracity,
therefore, emphasizes the need to be aware of data quality (Erevelles et al.,
2016).

¹ For an in-depth examination of the new data-information-knowledge paradigm, see Chapter 2.

The last attribute, Big Data’s value, measures the usefulness of data
in making decisions (Kaisler et al., 2013). Big Data’s value is characterized
by a certain degree of ambiguity, as it relates both to the extent
to which Big Data generates economically worthy insights for decision-making
(Fosso Wamba et al., 2015), and to the relatively “low value
density” as compared to large volumes of available data. Useful information
that provides leverage for decision-making within a problem space
is often hidden among huge amounts of non-traditional data received
in the original form (Gandomi & Haider, 2015). The challenge here
takes the form of a “needle-in-a-haystack problem,” which consists of
understanding what data is worth acquiring and what is worthless, and
then transforming and extracting that data for analysis (Kaisler et al.,
2013; McGuire & Ladd, 2014). In this respect, some authors also added
a further attribute, namely the “decay” of Big Data, to underline the
declining value of data over time and the need for timely information
processing and real-time data-driven actions (Lee, 2017).
To summarize, Lee (2017, pp. 294–296) proposed an integrated view
of Big Data as reported in Fig. 1.1. According to the author, the three
edges of the integrated view of Big Data represent the three dimensions
of Big Data that were first listed by Laney (2001), i.e., volume, velocity,
and variety. Inside the triangle, Lee (2017) listed five further dimensions
of Big Data that are affected by the growth of the three triangular dimensions:
veracity, variability, complexity, decay, and value. The view also
explains the relationship between the initial 3Vs and the additional
dimensions: all five further dimensions except veracity
are positively affected by the growth of the three triangular dimensions.
In such a representation of Big Data, traditional data, which mostly consists
of structured data such as ERP and legacy data, represents a subset of
Big Data with the same three dimensions, although the scope of each
dimension is much smaller than that of Big Data.
In this regard, Moffitt and Vasarhelyi (2013) proposed a representation
of the expanding corporate data ecosystem, as depicted in Fig. 1.2.
This view also suggests that, though scholars and professionals tend to
define Big Data by referring to its characteristics, one must consider Big
Data as a moving, rather than static, target (Kaisler et al., 2013). Big
Data’s volume is a dynamic attribute that varies across time and space.
What qualifies as Big Data today may not meet future “high-volume”
thresholds due to continuous advancements in storage capabilities. Similarly,
two datasets of the same size might require different data management
technologies, meaning one might not meet the characteristics of Big
Data (Gandomi & Haider, 2015). Therefore, the term “Big Data” implies
that the data volume is at or beyond the limits of what a company’s information
system can store, manage, and process efficiently (Vasarhelyi et al.,
2015).

Fig. 1.1 An integrated view of Big Data (Source Lee [2017]. Big data:
Dimensions, evolution, impacts, and challenges. Business Horizons, 60, p. 295)

Fig. 1.2 The expanding corporate data ecosystem (Source Moffitt and
Vasarhelyi [2013]. AIS in the age of Big Data. Journal of Information Systems,
27 (2), p. 10)

Although the quantities of data available today are undeniably large,
volume alone cannot define this new data ecosystem (Boyd & Crawford,
2012). The term “big” can be misleading, as it directs researchers’ and
practitioners’ attention to the dataset size rather than to the insights it can
provide. The defining parameter of Big Data thus becomes its granularity,
shifting the focus from the number of records to the detailed information
each record contains, i.e., how smart the data is (George et al., 2014).
Moreover, Big Data is “worthless in a vacuum.” Its potential value is
unlocked only when leveraged for decision-making. To enable evidence-based
decision-making, organizations need efficient processes to turn
high volumes of fast-moving and diverse data into meaningful insights
(Gandomi & Haider, 2015, p. 140). This involves understanding the
relationship between data, information, and knowledge, with the assumption
that “data is a record, information is a message, and knowledge is a
model of how something works” (Bhimani & Willcocks, 2014, p. 471).
Big Data should not only be collected, extracted, cleansed, and integrated;
it must also be modeled using advanced analytic tools and interpreted by
decision-makers to deliver business value.
The Sources of Big Data


Big Data represents the cornerstone of the digitalization of business
environments because, when combined with appropriate
analytics tools, it provides companies with the ability to extract intelligence
from data and translate that into business advantage, so that
managers can decide based on evidence rather than intuition (McAfee
et al., 2012). Further, Big Data is everywhere, since there is a smartphone
in every pocket and a computer in every backpack (McGee et al.,
1993). As many authors underline, not only are we “awash with more
information than ever before, but that information is growing faster”
(Mayer-Schönberger & Cukier, 2013, p. 6).
Big Data is generated by a wide variety of sources and can be classified
as follows, depending on how it is owned and managed (Cappa et al.,
2021; George et al., 2014):

1. Public data
2. Open data
3. Private data
4. Data exhaust
5. Community data
6. Self-quantification data.

Public data is typically generated by individuals and used by public
entities such as governments, governmental organizations, and local
communities. This data category includes data collected by municipal
organizations from free Wi-Fi access in public areas or on public transportation,
and can be used for wide-ranging business and management applications
or for several research purposes by public entities. Public data also
includes data related to energy use and health care, which must be accessed
under certain restrictions in order to protect individuals’ privacy. Open
data, instead, is usually not tradable, as it is freely accessible to anyone
interested. This category includes data collected by large international organizations
about global trends or phenomena, such as the data collected by
the World Bank about global economic and institutional trends. Open
data is usually provided as aggregate data, thus lacking any connection to
specific individuals, so that data users are not required to confront privacy
and ethical concerns.
Private data is generally created and owned by firms, non-profit
organizations, and individuals, and reflects private information. Companies
typically collect private data by means of proprietary platforms such as the
company’s Enterprise Resource Planning (ERP) or Customer Relationship
Management (CRM) system, website, or social network accounts.
This category also includes consumer transactions, radio-frequency identification
(RFID) tags used by supply chains, movements of companies’
goods and resources, and data collected through mobile applications
(mostly interactions with customers).
Data exhaust refers to data passively collected from the general environment
in which the company operates. That data, in its original form,
has little or zero value for the collector, as it is usually collected for other
purposes, but it can be combined with other data sources to create new
sources of value. By way of example, data exhaust comprises data that
individuals passively generate during their daily activities, such as when
they make purchases, access healthcare services, or interact
with others on the Internet. Individuals also generate data exhaust with their
information-seeking behavior via the Internet or by phone. Companies can
use such data to infer people’s preferences or to predict consumers’
behavior.
Community data mostly consists of unstructured data
(i.e., text, images, video, etc.) that is generated by means of social
networks, blogs, and the like. Typical community data includes social
networks’ feeds, consumers’ reviews of products and services, and voting
buttons associated with consumers’ sharing, scoring, and reviewing
systems (e.g., the “Like” button on Facebook or “I found it useful” on
Amazon).
Finally, we consider self-quantification data as data generated by individuals
when they quantify personal actions and behavior, such as
data generated via sensors on common wearables (e.g., a smartwatch or a
wristband) used to monitor exercise and movement, or data collected by
sensors installed on home automation accessories (e.g., energy consumption).
That data becomes of interest when it is collected via mobile
applications, aggregated, and then tracked for psychology, marketing, or
public policy purposes.
Over time, scholars have proposed other potential classifications of Big
Data sources, such as the one proposed by Baesens et al. (2016), who
distinguished the following categories of data sources:

1. Large-scale enterprise systems, which include companies’ most
widespread transactional information systems such as ERP systems,
CRM, and Supply Chain Management (SCM) platforms. These
sources usually generate structured data.
2. Online social graphs, i.e., the social networks within which people
interact with each other and all the digital platforms through which
people access media content.
3. Mobile devices, since they are arguably today’s most widespread
gateway to the Internet, with 5 billion handsets worldwide, and
make it possible to track and potentially geotag every action taken by a user.
4. Internet of things (IoT), which relates to the vast types of sensors
connecting objects to one another and to humans.
5. Open data/public data, which comprises data about topics including
weather, traffic, maps, environment, and housing.

Broadly speaking, within the information society, where everything that
happens in the physical world can be recorded and thus turned into data,
new sources of data usually emerge because of the interactions that occur
through a web-based connection. Based on this consideration, we can
group all the possible sources of Big Data according to the “Internet of
Events” paradigm (van der Aalst, 2014), as reported in Fig. 1.3.

Fig. 1.3 Internet of Events and new sources of Data (Source van der Aalst, W.
[2014]. Data scientist: The Engineer of the Future)
1.3 Big Data and Artificial Intelligence: Opportunities for Decision-Making and Control Processes
In today’s complex and dynamic business environment, the ability to
explore data to understand customer behavior, segment customer bases,
offer customized services, and gain insights from multiple data sources
is crucial for achieving a sustainable competitive advantage (Assunção
et al., 2015). Since digitalization has made the use of data crucial for
companies (Lull et al., 2024), many firms are investing in Big Data to
find innovative ways to differentiate themselves from their competitors
(Ghasemaghaei & Calic, 2020). In the current marketplace, organizations
need more advanced data analyses, scenario planning, and predictive
capabilities to face higher complexity and uncertainty in the external
environment (Visani, 2017). Big Data and advanced analytics tools can
be particularly helpful in dealing with such new challenges. While the
emergence of Big Data has affected how managers collect, process,
and use data to achieve business goals (Warren et al., 2015), advanced
analytics tools such as data analytics, data mining, machine learning, and
artificial intelligence provide companies with the ability to investigate,
discover, and create questions. Specifically, the use of Big Data analysis
and the development of artificial intelligence techniques stimulate
more questions, more complex questions and, most importantly, have the
power to answer these questions (Cokins, 2016, pp. 4–5).
Both academics and professionals recognize that a proficient use of
digital technology can reduce costs and improve efficiency and
innovation (Zhang & Li, 2024). Big Data has the potential to positively
affect the effectiveness of business strategy and organizational
performance in almost every industry and business function
(de Camargo Fiorini et al., 2018; Lull et al., 2024) by facilitating
process standardization, streamlining operational activities, and improving
customer interactions (Betti et al., 2024).
To respond effectively to the changes that have occurred in the marketplace
and gain a competitive edge, companies are increasingly required to carefully
monitor product and service quality, time-to-market, and flexibility
in the production processes, which have become critical corporate value
drivers. This means that the scope of task-level decisions and control
activities has broadened to encompass all the processes that make up the
company’s value chain (D’Onza, 2008).

The rapid advancement of emerging technologies, such as cloud
computing, the IoT, Big Data, and artificial intelligence (AI), is
transforming the paradigm of advanced manufacturing toward so-called
cognitive manufacturing, which is based on robust data collection and
uses this data to extract actionable insights (El Kalach et al., 2024;
Wang et al., 2021). Specifically, cognitive manufacturing “not only
integrates the physical and digital worlds but also allows for flexible,
reconfigurable, and adaptive production” (El Kalach et al., 2024).
Moreover, the increasing deployment of IoT, Big Data, and other AI-based
technologies allows companies to use real-time data throughout
the entire product lifecycle, from “the inception of an intangible
concept, through the recycling of a finished product” (Wang et al., 2021).
For example, by conducting advanced market analysis via data
mining, firms can gain valuable insights into how to develop the
product’s conceptual design; they can then collect data from sensors to
track parts, monitor machinery, and optimize operations. The use of IoT
and AI algorithms enables companies to exploit real-time, highly granular
data to enhance procurement, supply, and production processes. This
kind of data, in fact, allows ubiquitous process control and optimization
to reduce waste and maximize yield or throughput (Manyika et al., 2011;
Wang et al., 2021).
To enhance overall customer satisfaction, Big Data derived from
customer interactions can inform product development decisions. As tech-
nology advances, embedding sensors in products to generate data on
utilization, usage patterns, and performance is increasingly feasible. The
availability of such real-time granular data provides manufacturing compa-
nies with unprecedented opportunities to identify emerging product
defects, enabling immediate adjustments to the production process and
informed product development planning. Additionally, leveraging large
datasets in marketing, sales, and after-sales service activities presents
numerous opportunities. Companies can develop monetization models
based on post-sales services priced according to usage, rather than solely
on product purchases (Manyika et al., 2011; Wang et al., 2021).
Scholars have also underlined that Big Data and AI can significantly
support managerial control activities, since managers are now
able to combine traditional accounting data with new data sources
(e.g., GPRS, RFID, sensors, and the like) to improve planning and
decision-making processes, focusing more on assessing emerging
trends within the competitive environment than on purely historical data
