A Prototype Data

Governance Framework
for Africa

CONSORTIUM POUR LA RECHERCHE ÉCONOMIQUE EN AFRIQUE

AFRICAN ECONOMIC RESEARCH CONSORTIUM
Bitange Ndemo
and
Aaron Thegeya

Working Paper DG-001

Bringing Rigour and Evidence to Economic Policy Making in Africa


A Prototype Data
Governance Framework
for Africa
By

Bitange Ndemo
University of Nairobi

and

Aaron Thegeya
Aliquot Ltd

AERC Working Paper DG-001


African Economic Research Consortium, Nairobi
February 2023
THIS RESEARCH STUDY was supported by a grant from the African Economic Research
Consortium. The findings, opinions and recommendations are those of the author,
however, and do not necessarily reflect the views of the Consortium, its individual
members or the AERC Secretariat.

Published by: The African Economic Research Consortium


P.O. Box 62882 - City Square
Nairobi 00200, Kenya

© 2023, African Economic Research Consortium.


Contents
List of figures
Abstract

1. Introduction

2. Background and Literature Review

3. An Organizational Framework for Data Governance

4. A Prototype Data Governance Framework for Africa

5. Conclusion and Recommendations

Notes

References
List of figures
1. Organizational framework for data governance
Abstract
To create a vibrant market for the use of data, while still protecting the individual rights
of those on the continent, Africa must lead the way in developing its data strategy and
data governance framework. This data governance framework should account for the
continent’s unique characteristics while addressing gaps in digitization, identity, and
access to data across countries. This background paper discusses the key features of
an effective data governance framework within an African context; it identifies the key
dimensions that merit consideration in this regard, and it describes the principles that
should animate such a system for governing how data is deployed on the continent.

Keywords: Data, Data Governance Framework, Africa, Data Infrastructure


1. Introduction
In their simplest form, data are frequently defined as a collection of symbols that
are the properties of observables, or the representation of facts. Data within a given
context translates into information, and information in perspective, integrated
into a viewpoint based on experience, is what we think of as knowledge (Ackoff,
1989). Despite the distinction between data and information, the terms are often
used interchangeably in practice. Data are an important component of total factor
productivity and contribute in important ways to growth, in addition to labour and
capital. There are massive economies of scale to be gained from combining different
datasets to yield insights that would otherwise be unavailable or difficult to capture. In
addition, improvements in data processing, data storage, and data analytics through
machine learning and artificial intelligence can support productivity gains, boost
efficiency, and decrease costs; advances that can drive economic growth, increase
prosperity, and improve the standard of living on the continent.
Data governance involves establishing principles to enable an environment for
the sharing of data, with the goal of improving living standards while at the same
time recognizing and protecting the rights of data originators and users. Given the
central role of data in today’s global economy, a system of effective data governance
is essential. That said, the development of such a framework requires scrutiny of
the economic, legal, and institutional issues attendant to such regulation, and the
establishment of proper standards for the exchange and protection of data.
At the micro or firm level, data governance has historically referred to managing
the availability, usability, integrity, and security of data. From a global perspective,
the World Bank (2021a: 38) has deemed that data governance “entails creating
an environment of implementing norms, infrastructure policies and technical
mechanisms, laws and regulations for data, related economic policies, and institutions
that can effectively enable the safe, trustworthy use of data to achieve development
outcomes”. To leverage the vast opportunities of data utilization, Africa must develop
a data strategy underpinned by a governance framework. Such a strategy would
establish data sovereignty and render the continent more competitive and better
positioned to engage in cross-country collaboration during the digital age. Data,
including personal and other sensitive data, could be reused under practices that
protect privacy, using techniques such as anonymization, pseudonymization,
differential privacy, generalization, suppression, and randomization.
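
Three of the techniques named above (pseudonymization, generalization, and suppression) can be illustrated with a minimal Python sketch. It is illustrative only: the record fields, the secret key, the ten-year age bands, and the minimum group size of 2 are assumptions made for this example and are not drawn from any framework discussed in this paper.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret held by the data controller (assumption for this sketch).
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (pseudonymization)."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def generalize_age(age: int) -> str:
    """Coarsen an exact age into a ten-year band (generalization)."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def suppress_small_groups(records, min_size=2):
    """Drop records whose (age_band, district) group has fewer than min_size members (suppression)."""
    counts = Counter((r["age_band"], r["district"]) for r in records)
    return [r for r in records if counts[(r["age_band"], r["district"])] >= min_size]


def protect(raw_records, min_size=2):
    """Apply the three steps in sequence to a list of raw records."""
    coarse = [
        {
            "pid": pseudonymize(r["national_id"]),
            "age_band": generalize_age(r["age"]),
            "district": r["district"],
        }
        for r in raw_records
    ]
    return suppress_small_groups(coarse, min_size)


# Made-up example records (not real data).
sample = [
    {"national_id": "A001", "age": 34, "district": "North"},
    {"national_id": "A002", "age": 37, "district": "North"},
    {"national_id": "A003", "age": 62, "district": "South"},
]
protected = protect(sample)
```

In this sketch the third record is suppressed because it is the only one in its group, while the first two survive with pseudonymous identifiers and coarsened ages. Pseudonymized data can still be re-identified by whoever holds the key, so a step of this kind reduces risk rather than removing it.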

Africa must define a continental strategy and devise a governance framework
that maximizes the use of data while ensuring productive cross-border data flows
and protecting individual rights. The continent lacks sturdy and expansive national
or regional structures for governing data, and individual African nations have yet to
develop legislation to safeguard data use and digital transactions, an absence likely
to cause market fragmentation due to insufficient harmonization (UNCTAD, 2021).
Meanwhile, the data governance frameworks that exist, albeit in their limited form,
lack coherence in terms of principles, scope and enforceability across jurisdictions.
The rapidly changing landscape of data generation, storage, and mining capacity, and
the dearth of human and financial resources, reliable institutions, and enforcement
capacity to support an efficient data governance environment will cause the continent
to regress at the moment when it is arguably positioned to show its greatest progress
ever, unless key stakeholders take immediate action.
Without a coherent data governance framework, data generated in Africa risk being
improperly utilized within each country and in other parts of the world, leading to an
unbalanced platform of data exchange with countries where data are closely regulated.
African countries, therefore, must take the lead in establishing appropriate frameworks
that will serve their own national interests and those of the continent. Governments,
development institutions, and non-state actors should collaborate to implement
and enforce data governance laws and policies that can make the continent’s digital
economy more competitive, while at the same time enhancing transparency, trust,
and digital inclusiveness for all users.

2. Background and Literature Review
The history of data is closely intertwined with the evolution of mankind. The earliest
examples of data being stored and analyzed by humans date back to about 18,000
BC, when people kept tallies on the Ishango Bone, found in what is now the
Democratic Republic of the Congo, near the border with Uganda. The bone is a dark
length of bone with a sharp piece of quartz affixed to one end, and it was
marked with notches for the purposes of tallying. Subsequently, the abacus,
the first device constructed specifically for performing calculations, was invented
around 2,400 BC. The first data libraries appeared during roughly the same period,
marking mankind’s initial endeavor towards mass data storage. The year 1663 saw
the emergence of statistics as a distinct mode of analysis, when John Graunt recorded
mortality information in London and used his figures and framework to design an early
warning system to alert the population about the spread of the bubonic plague that
had been ravaging Europe (Marr, 2015). The central concept of the modern computer
emerged thereafter, based on the ideas of Alan Turing who, in 1936, presented the
notion of a universal machine (Zimmerman, 2017), paving the way for the first digital
computers in the following decade. Finally, data became ubiquitous with the advent
of the World Wide Web, announced by Tim Berners-Lee in 1991, thus setting the stage
for the modern age of Big data.
Historically, people struggled to collect data because they lacked the necessary
tools and infrastructure. The digital revolution, however, led to dramatic changes
in the scope and types of data collected, and the volume of data collected has
increased compared to only a few decades ago. Moreover, when governments fail to
do the collecting, private firms and individuals can now use new digital platforms to
gather data for private use, for commercial purposes, or to promote accountability
and governance, such as platforms used to report violence or discrimination. The
digitization of African historical data in archived records has enabled researchers to
capture historical statistics on a wider scale than ever before, and enabled insightful
publications on Africa’s past that would not have been possible without today’s data
infrastructure.
Data consumption needs have increased significantly over time, and consumption
of data varies by region. Daily usage statistics are staggering; from the advent of
civilization to 2003, for example, 5 exabytes of data were created¹ but, only seven
years later, that amount of data was being generated every two days (World Bank,
2021b). By 2025, it is estimated that 463 exabytes of data will be created around the
world each day (Desjardins, 2019). Presently, the entire universe of data is estimated
at 44 zettabytes, a total that includes 294 billion emails, 5 billion Internet searches,
and 65 billion messages transmitted each day through messaging services such as
WhatsApp (World Bank, 2021b).
A World Bank (2021b) study looking at minimum data consumption using data
from six developing and emerging countries found that the most frequent online
activities, which included visits to public service websites, learning, shopping, health
information and news, consumed 660 megabytes of data per user, per month. When
looking beyond data requirements for solely welfare-improving activities such as those
just mentioned, individuals in these countries needed an additional 5.2 gigabytes for
recreational activities on social media per month, putting total monthly data demand
in these economies at approximately 6 gigabytes per person (World Bank, 2021b).
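
The total quoted above follows from simple addition of the two components. The short Python sketch below reproduces it; the conversion of 1,000 megabytes per gigabyte is an assumption (decimal units), and the variable names are illustrative.

```python
# Assumption: decimal units, i.e. 1 gigabyte = 1,000 megabytes.
MB_PER_GB = 1000

welfare_mb = 660      # welfare-improving activities, per user per month
recreation_gb = 5.2   # additional recreational use, per user per month

# Total monthly demand per person, in gigabytes.
total_gb = welfare_mb / MB_PER_GB + recreation_gb

# Share of the total accounted for by welfare-improving activities.
welfare_share = (welfare_mb / MB_PER_GB) / total_gb

print(f"Total monthly demand: {total_gb:.2f} GB")
print(f"Welfare-improving share: {welfare_share:.0%}")
```

The sum comes to about 5.86 gigabytes, which rounds to the figure of approximately 6 gigabytes, with welfare-improving activities making up roughly one-ninth of the total.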
The body of research on data governance has been carried out mostly from an
organizational perspective. Given data’s role as a strategic and monetizable asset,
organizations have researched holistic data governance frameworks to facilitate
effective utilization of data with a profit motive, while respecting privacy rights (see,
for example Khatri and Brown, 2010; Otto, 2011; Weber et al., 2009). From a regulatory
perspective, countries are in the process of defining data governance frameworks.
For example, in November 2020, the European Commission proposed rules on data
governance to boost data sharing and support European data spaces, in line with
principles such as personal data protection (General Data Protection Regulation),
consumer protection, and competition. The World Bank has focused its 2021 World
Development Report on data issues pertinent to developing economies.
Micheli et al. (2020) investigated the emerging models of data governance in the age
of datafication and, in addressing the politics of data, considered actors’ competitive
struggles. This conceptualization brought to the forefront the multifaceted economic
and social interactions, and power relations within data governance models—
particularly those at work in corporate environments. Public bodies and civil society
are, within these models, key players for both redistributing any value produced via
data and democratizing its governance. Further, Micheli et al. (2020) found that data
trust and intermediaries were included in nearly every investigated model, leading
the researchers to underscore the importance of data infrastructure as fundamental
to improving trust in data.
Research has also revealed a wide variety of views and minimal agreement across
stakeholders on the issue of data governance frameworks. Within the context of
academia, Kouper et al. (2020) carried out an exploratory study on data governance
in the United States involving individuals who worked in research and academic
institutions, aiming to understand the entities central to decision-making and
governance on data and research-related issues. Their findings showed considerable
complexity and diversity across stakeholders in terms of both identity and ideas on
the governance of data. To account for such diversity, Kouper et al. (2020) proposed
to frame data governance in research around common governance bodies, arguing
that, to ensure effective data governance in research, voices of people from different
literacy and income levels should always be incorporated in shaping policy and
making decisions.
Several approaches have been used to determine data governance activities. For
instance, Alhassan et al. (2016) used keywords to identify papers on data governance
activities using open-coding approaches and identified 31 articles that mentioned
such activities. Their analysis identified 110 data governance activities across five
decision domains of their framework (data principles, metadata, data quality, data
life cycle, and data access), with each domain implicating a different critical aspect
of data governance.
The rapid growth in digital financial services presents concerns over data
protection and privacy for low-income individuals, especially those in developing
countries. Vidal and Medine (2019), for example, analyzed whether data privacy is
desirable in a corporate world. Their analysis included experiments in India and Kenya,
where several products with varying degrees of data protection and a range of privacy
options were offered to low-income individuals, thereby allowing the researchers
to evaluate the demand for individual safeguards within markets with limited or
no frameworks in place to protect individual privacy. They found that low-income
individuals were willing to pay for their data privacy. For instance, in Kenya, 64% of
low-income individuals surveyed chose options with a greater degree of data privacy,
despite the imposition of a non-trivial 10% fee attached to this option. Results
in Bangalore, India, were similar to those in Kenya, with 66% of survey participants
choosing this option.
Freely available public data could generate economies of scale through reuse,
and the benefits of these types of data in terms of the public good present a case
for protecting the availability of some classes of data from public sources relative to
private firms. Beraja et al. (2021) analyzed the state of artificial intelligence within
China by gathering comprehensive data from government and firm-procurement
contracts within the artificial intelligence industry and found that sharing data
improved productivity in both private and public institutions. Their results also
indicated that the ability to access government data outweighed the feasibility of
providing these same data through commercial means; accessible government data
should not, they concluded, be substituted by private markets.

3. An Organizational Framework for Data Governance
A sound data governance framework requires that institutions and stakeholders
have the right incentives to produce, protect, and share data; a comprehensive
understanding of data governance also demands consideration of key dimensions,
including: (a) the relevant stakeholders who use data and those who are impacted by
the use of data; (b) the lifecycle of data from creation to destruction; (c) the typology of
data, reflecting relevant characteristics that impact processing, storage, and accuracy;
and (d) enabling pillars such as economic, legal, and institutional aspects that create
the necessary infrastructure for using data and maximize its productivity. These key
dimensions are illustrated in Figure 1 below.
The key stakeholders who generate and use data include households, the
private sector, governments, and civil society, with households and the private
sector being major data producers and consumers, and governments and civil
society offering essential safeguards concerning its use. The government is
central in formulating policies and regulations, while civil society helps hold other
stakeholders accountable. The needs of each stakeholder bear consideration
within a data governance framework. Data privacy concerns, for example, vary
across stakeholders and are relevant particularly for households and the private
sector; conversely, certain data collected by governments merit classification
under public data, particularly where the utilization of these data improves
productivity and creates economies of scale, and given that the data are collected
using public resources.
The data lifecycle details the key steps that occur between the creation and
destruction or reuse of data, specifically the collection, processing, and storage of data;
transferring or sharing of data among users; analysis and value addition; archiving
and preservation for future use; and destruction of data at the end of the cycle. Stored
or archived data are usually available for reuse, and an enabling infrastructure is
essential during each step, including security of storage and transmission through
encryption—protocols that enable data transfer across systems, allow its destruction
at the end of the cycle, and maintain integrity and accuracy of data by preventing
unauthorized manipulation.
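
The lifecycle described above can be read as a small state machine. The Python sketch below encodes the stages named in the text; the exact set of permitted transitions (for example, that archived data may return to sharing or analysis for reuse) is an assumption made for illustration, not a specification taken from this paper.

```python
from enum import Enum


class Stage(Enum):
    COLLECTION = "collection"
    PROCESSING = "processing"
    STORAGE = "storage"
    SHARING = "sharing"
    ANALYSIS = "analysis"
    ARCHIVING = "archiving"
    DESTRUCTION = "destruction"


# Permitted next stages. The ordering follows the text; the individual
# edges are assumptions made for this sketch.
ALLOWED = {
    Stage.COLLECTION: {Stage.PROCESSING},
    Stage.PROCESSING: {Stage.STORAGE},
    Stage.STORAGE: {Stage.SHARING, Stage.ANALYSIS, Stage.ARCHIVING, Stage.DESTRUCTION},
    Stage.SHARING: {Stage.ANALYSIS, Stage.STORAGE},
    Stage.ANALYSIS: {Stage.STORAGE, Stage.ARCHIVING},
    # Archived data remain available for reuse until they are destroyed.
    Stage.ARCHIVING: {Stage.SHARING, Stage.ANALYSIS, Stage.DESTRUCTION},
    # Destruction ends the cycle: no further transitions.
    Stage.DESTRUCTION: set(),
}


def is_valid_path(path):
    """Return True if every consecutive pair of stages is a permitted transition."""
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(path, path[1:]))


full_cycle = [
    Stage.COLLECTION,
    Stage.PROCESSING,
    Stage.STORAGE,
    Stage.SHARING,
    Stage.ANALYSIS,
    Stage.ARCHIVING,
    Stage.DESTRUCTION,
]
```

A table of this kind makes it straightforward to check that, for instance, no dataset is shared or analyzed after it has been marked as destroyed.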

Figure 1: Organizational framework for data governance

Source: Designed by the author.

Data collection and processing methods help determine accuracy, in turn
promoting greater trust in datasets. Established data collection techniques for
public data include the collection of population statistics by an official authority, or
the collection of sample statistics using rigorous sample-design techniques. These
methods yield accurate and trusted datasets structured in nature, but they also
tend to command significant resources during collection, depending on the level
of disaggregation required. Due to the financial cost, and planning and logistical
requirements associated with collection, these techniques tend to be implemented
infrequently. Therefore, analyses based on these data usually have gaps, either in
their level of disaggregation or across time.
A large number of distinct typologies of data exist, determined by the multi-
dimensional aspects inherent in data, and the lens or perspective through which
data are viewed. Data can be classified according to whether they are for private
or public use, a distinction that, in turn, determines how widely available they are
likely to be and their cost to access. Data collected for commercial use are treated
as a private good, and owners of these data enjoy a competitive advantage and the
ability to collect fees from data sales. Public data, by contrast, are collected using
public resources. These data usually provide social value and are useful inputs for
other economic activities. Open data, for example, are a type of public data shared to
fortify public governance and increase transparency, while also generating commercial
opportunities.
For purposes of classification, structured data are organized according to some
predefined model and stored electronically, typically in a relational database within
a tabular format. Databases allow for efficient searching, editing and error detection,
as well as easy manipulation by programming languages. Conversely, unstructured
data are less organized, are typically text-heavy, and require more flexible data
structures. Additionally, data can be classified according to cross-sectional and
temporal dimensions, with cross-sectional data including many observations on
subjects recorded at a fixed point and time-stamped data accounting for observations
on one or many subjects recorded over time. Spatio-temporal data describe both the
time and location of a particular event.
The utilization of Big data is a subset of new collection and analysis techniques
involving unstructured data, techniques made possible by accessible, less expensive,
and more expansive storage capacity, and by advances in machine learning related to
processing capacity and the increasing production of large amounts of digital data.
These techniques reveal patterns from high frequency data, in real time, while low
statistical errors within the data are supported by the large number of observations.
Machine-learning algorithms depend on the availability of large datasets, with the
predictive power of the algorithms increasing as the data become more available,
even as the effectiveness of the algorithms continues to depend on the accuracy of
the training data being used.
Newer techniques for collecting data rely on the availability of digital data and
depend on both advances in machine learning and estimation theory at the small-area
level. These methods offer advantages relative to traditional methods in terms of cost,
frequency, and coverage; the use of small-area estimation techniques, for example,
allows interpolation of statistics at a disaggregated level based on the combination
of population, sample, and even satellite data.
Underpinning the rights of stakeholders, the flow of data within its lifecycle, and
the various data typologies, are the enabling pillars of an effective data governance
framework. The pillars include the economic, legal, and institutional frameworks that
facilitate policies that enable the appropriate use of data while protecting data privacy;
standards that embed data accuracy and make possible the secure storage and
transfer of data; and the implementation and enforcement of appropriate regulation
for the use of data. Establishing appropriate legally empowered institutions to create
and regulate the data space is a critical dimension of the enabling framework.
A number of data governance frameworks exist, varying in terms of membership
and degree of implementation. In 2014, the African Union adopted the Malabo
Convention, which sought to encourage cybersecurity and personal data protection
among partner countries, although the Convention was not fully implemented and
thus not enforceable. According to the Convention, data need not be stored once the
purpose for which it has been collected is met. This would call for personal data to
be protected by deletion when its purpose is achieved, meaning that data controllers
would need to follow up with other data users to ascertain the destruction of personal
data.
The Economic Community of West African States (ECOWAS) was established to
promote the integration and economic growth of its member states. The member
states adopted a Personal Data Protection Act in 2010, an agreement covering personal
data and consent by the subject, recipient, and third parties, and the role of data
processors and a data protection authority. However, the Act does not cover other
important dimensions, such as profiling, anonymization, personal data breaches,
and pseudonymization—matters of particular relevance with respect to cross-border
data flows within ECOWAS. The Act requires member states to develop independent
data-processing agreements for their citizens, guaranteeing their professional secrecy,
impartiality, and power to punish errant parties. According to the Act, the processing
of personal data is legitimate when carried out with the owner’s consent and approval.
The Asia-Pacific Economic Cooperation (APEC) has developed a privacy framework
for its member economies across the Asia-Pacific region. APEC aims to promote flexible
and effective information flows within the APEC community, while ensuring well-
managed data protection. In 2020, to protect their government institutions, firms, and
individuals against harm or the risk of private data being exposed, and to promote
trade and ensure trust among member states, New Zealand, Chile, and Singapore
signed a Digital Economy Partnership Agreement (DEPA) governing and protecting
the sharing and processing of electronic data.
Other well-established data governance frameworks can be found in the European
Union and the United States. In 2018, the European Union put in place its General
Data Protection Regulation (GDPR), enshrining it as the legally recognized framework
for data privacy and protection among member states. This framework governs the
European Union’s member states and their trading partners in all matters of data
governance. In a rather different and definitively disaggregated manner, the United
States offers both state and federal laws to protect personal online data and privacy.

4. A Prototype Data Governance Framework for Africa
Objectives

Establishing an effective data governance framework for Africa requires a clear
delineation of its objectives and careful attention to the unique characteristics of the
continent. Africa has a large informal sector, an agricultural sector that dominates
production, and a commercial landscape made up mostly of small businesses. Much of
the population connects through mobile phones, even though data access levels
are much lower, averaging about 20% of the population, and access to high-speed
data connections is even lower. Additionally, data access across households is highly
uneven and depends on geographic location and economic status. For most Africans,
the costs of enjoying Internet access are prohibitive, meaning that uneven access to
data at a national level is mirrored by a large disparity in access at the continental level.
Digital-format data are highly limited in Africa; many public datasets are not
digitized, and access to those that are digitized is limited, fragmented, and
inconsistent. Low levels of Internet connectivity also deter households and the
private sector from generating new digital data, which in turn presents a major
barrier to producing the high levels of data concentration that can spur innovative
activity, enable the utilization of Big data and machine learning techniques for data
mining, and increase productivity. Data strategies and governance frameworks do
not exist in many African countries, and where frameworks do exist, they are typically
incomplete, disjointed, or not fully aligned with other existing and relevant legislation
already in place, such as laws protecting individual rights. What is more, due to weak
institutions, governance issues, or limited capacity, levels of enforceability within
existing frameworks are low across the continent. The countries that already have
data governance frameworks in place require close levels of coordination to avoid
suffering fragmentation and, consequently, diminished effectiveness.
A pan-African data governance infrastructure can help the continent realize a
single market for data, thereby enabling the creation, use, and reuse of data by
individuals across Africa and spurring economic growth and development while
protecting the rights of data subjects. A prerequisite for a single data market is the
generation of sufficient data to allow economies of scale through utilization. This
means that an appropriate framework to rapidly increase data digitization and
widespread access must be developed in parallel with a data governance framework,
in addition to the establishment of other key legal and regulatory frameworks that
comprehensively govern the data lifecycle. Realizing an effective data governance
framework is contingent on the establishment of country-level guidelines that
provide a template instructing nations on the precise components necessary for a
comprehensive framework, while also establishing principles to ensure coherence
across the components within a country. Further, a complementary overall framework
linked to and interoperable with national level frameworks should be established at
the continental level.
An effective framework requires a clear set of definitions and categories for different
types of data, and rules pertaining to the use and reuse of data within each category.
In this regard, a framework should clearly define private versus public data and
should offer clear guidelines on the use of each type. But effective implementation
should also go a step further, designating key public datasets to be shared both
nationally and across borders—datasets that should be identified according to the
strategic interests of countries, thereby calling for a concurrent effort to determine
and prioritize interests that will maximally promote the sharing of data. These may
include, for example, expanding regional trade, boosting agricultural productivity and
promoting food security, or dealing with climate-related threats. Cross-border data
sharing can leverage the principles employed in existing systems that effectively use
information transcending borders, such as monitoring systems for infectious diseases.
A comprehensive data governance framework must rest on the widespread
engagement of all stakeholders in a social contract that defines the protection of
individual data, thereby building trust, creating an enabling environment that adds
value to data, and promoting an equitable system (World Bank, 2021a). Such a social
contract could overcome negative externalities resulting in the under-utilization of
data for productive activity and, if properly implemented, it could define the role
and cultivate trust in data intermediaries—those figures or institutions central to the
eventual success of a data governance framework.
The key elements of data property rights include guidance on the establishment of
data ownership, and the appropriate level of control on data sharing. Property rights
management is a critical part of any data management process; data owners have an
interest in understanding how other users will utilize their data, and they also seek to
ensure that ethical, legal, and professional obligations are observed. Data property
rights are also important from the perspective of equity, as poor legal and governance
structures can encourage misuse of information and render individuals vulnerable.
A data governance framework can take a number of perspectives on data
ownership, either creating a centralized authority responsible for monitoring and
enforcing data sharing regulations or following a more decentralized framework
where data sharing resides at the individual level. Within this context, individual
preferences can be brought to bear in terms of the utility one obtains from maintaining
data privacy, relative to the advantages that may accrue from data sharing, such as
better matching and personalization of services. The appropriate framework can be
implemented by passing on the control of data access protocols to individual users,
for example by enabling individuals to choose their level of access to different types
of information generated by their devices. Indeed, privacy can be fragile and fleeting
when third parties have access to sensitive data. When users share their data on
different online platforms, they reveal signals about other users’ preferences, based
on shared visible characteristics. For example, the preferences of a teenager of a given
age in a given school may signal the preferences of that person’s circle of friends,
thereby limiting the scope of control over personal information within that group of
friends, and perhaps creating negative externalities (Acemoglu et al., 2019).
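
As a minimal sketch of what passing control of data access to individual users could look like, the Python below models a user's sharing preferences per data category and checks a third-party request against them. The category names, the access levels, and the `ConsentProfile` class are hypothetical illustrations and are not drawn from any existing standard or from the framework proposed here.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical sharing levels, ordered from most to least private."""
    NONE = 0        # data never leaves the device
    AGGREGATED = 1  # only anonymized aggregates may be shared
    FULL = 2        # individual-level records may be shared


@dataclass
class ConsentProfile:
    """Per-user preferences, keyed by an illustrative data category."""
    user_id: str
    preferences: dict = field(default_factory=dict)

    def allows(self, category: str, requested: AccessLevel) -> bool:
        # Categories the user has not configured default to no sharing.
        granted = self.preferences.get(category, AccessLevel.NONE)
        return requested <= granted


profile = ConsentProfile(
    user_id="user-001",
    preferences={
        "location": AccessLevel.AGGREGATED,
        "purchases": AccessLevel.FULL,
    },
)

for category, level in [
    ("location", AccessLevel.FULL),
    ("location", AccessLevel.AGGREGATED),
    ("health", AccessLevel.AGGREGATED),
]:
    print(category, level.name, profile.allows(category, level))
```

Defaulting unconfigured categories to no sharing is one possible design choice. A per-user setting of this kind does not address the externality described above, in which one person's sharing reveals information about others.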
Market failure may arise due to lack of data rights. Data are non-rival and
excludable², creating incentives to hoard data and allowing the collection of rents
and the maintenance of a dominant market position. In such situations, significant
positive externalities to data sharing that could have a major impact on economic
growth may fail to occur. In addition, organizations that collect data lack sufficient
incentives to protect the privacy of users who have shared data, given that they do
not internalize users’ utility from privacy. In such cases, oversharing of data may occur
relative to user utility from privacy.
An effective data governance structure must promote access that offers benefits to
small businesses, and the costs of adhering to the framework must not be prohibitive.
Additionally, the realization of a single market for African data must be balanced with
incentives for data localization, which is defined as a mandatory administrative or
legal requirement stipulating that data be stored and processed, non-exclusively or
exclusively, within a specified jurisdiction. There are aspects of data localization that
both advance and detract from effective data governance; although data localization
can enhance data privacy and security, it can also inhibit trans-border data flows and
lead to various negative consequences attendant to such a slowdown.

Principles
To fulfill its objectives, the data governance framework should adhere to certain
central principles, including: (a) promoting an agile framework to allow for innovation
and experimentation; (b) ensuring accountability of all stakeholders within the data
lifecycle; (c) establishing standards for data accuracy and quality; (d) developing
protocols for the standardization of data, thereby underpinning data quality and
enabling interoperability; (e) preserving transparency in the utilization of data; (f)
enabling equitable access to public data for all data users; (g) securing non-prohibitive
costs of compliance with regulations relating to data; (h) promoting competition in the
use and reuse of data; and (i) seeing to it that data sharing at an international level,
outside Africa, occurs in full compliance with the rules of Africa’s data governance
framework.
The data governance framework should be governed by the principle of light-
touch regulation, allowing innovation and experimentation, while still possessing the
agility to respond quickly to information needs and to implement lessons learned.
Many use cases for data are still unknown, and a restrictive regulatory
stance discourages the realization of their full potential. A conducive environment
for innovation can be established through regulatory sandboxes, and by leveraging
the global experiences of other countries as they implement their own frameworks.
A conducive framework should also be promoted by regulating data applications
appropriately as they are introduced, while still maintaining a principle of widespread
availability of public data.
Accountability is a critical component of data governance that should be
emphasized when developing both national and regional data governance
frameworks. A comprehensive data governance framework covers dimensions
of accountability both within and across organizations. Thus, at the macro level,
an appropriate data governance framework should contemplate organizational
dimensions to guide appropriate design within organizations, while also recognizing
aspects of accountability from both domestic and cross-border perspectives. Within
an organization, the framework should promote a holistic view of data governance,
and integration of data governance practices across departments, by directly and
indirectly involving individuals. For example, the establishment of a data council with
responsibility for data governance and with representation across departments and
at all levels of seniority could formalize the creation of data policies and procedures
for implementation and enable effective observation and monitoring. Organizational
data governance frameworks currently in place tend to relegate data governance
functions to an information technology department, often resulting in ineffective,
fragmented, and partial implementation. Building integrated data governance at the
organizational level will help build trust among stakeholders.
Data quality standards ensure the accuracy of data, build trust in its use, and
allow for consistency in its dissemination. The sheer volume of data produced and
analyzed daily, and its exponential growth, underline the importance of maintaining
data quality standards. Uncertainty about the quality of data discourages its use,
and poor-quality data, if relied upon, may result in erroneous decisions. In the worst
case, data can be misused with malicious intent. Quality should therefore be verified to
ensure that data are timely, accurate, complete and consistent, and quality standards should
be compatible with other existing rules and regulations, including but not limited to
those pertaining to privacy and competition.
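
The four quality dimensions named above can be illustrated with a short Python sketch that flags records failing any of them. The records, field names, one-year freshness threshold, and upper-case country-code convention are all assumptions made for illustration; an actual standard would define its own rules.

```python
from datetime import date

# Hypothetical records; field names and values are illustrative only.
records = [
    {"id": 1, "country": "KE", "value": 120, "updated": date(2023, 1, 15)},
    {"id": 2, "country": "TZ", "value": None, "updated": date(2023, 1, 20)},
    {"id": 3, "country": "ke", "value": -5, "updated": date(2019, 6, 1)},
]

REQUIRED_FIELDS = ("id", "country", "value", "updated")
MAX_AGE_DAYS = 365  # assumed freshness threshold


def quality_issues(record, as_of):
    """Return the list of quality dimensions that a record fails."""
    issues = []
    # Completeness: every required field is present and not None.
    if any(record.get(name) is None for name in REQUIRED_FIELDS):
        issues.append("incomplete")
    # Accuracy (plausibility check): the value is assumed to be a count,
    # so it cannot be negative.
    value = record.get("value")
    if value is not None and value < 0:
        issues.append("inaccurate")
    # Consistency: country codes follow one convention (upper case).
    country = record.get("country")
    if country is not None and country != country.upper():
        issues.append("inconsistent")
    # Timeliness: the record was updated within the allowed window.
    updated = record.get("updated")
    if updated is not None and (as_of - updated).days > MAX_AGE_DAYS:
        issues.append("stale")
    return issues


as_of = date(2023, 2, 1)
for record in records:
    print(record["id"], quality_issues(record, as_of))
```

Returning every failed dimension, rather than stopping at the first, lets a data steward see the full picture for each record before deciding whether to repair or reject it.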
Data standardization and the establishment of data protocols are also key enabling
factors for supporting an effective cross-border data governance infrastructure
and a single market for data. Data standardization contributes to ensuring data
accuracy and has important implications for productivity by improving the efficiency
of data processes and encouraging usage. Moreover, standardization improves
the interoperability and portability of data.³ Interoperability standards should be
established within countries, with comprehensive coverage over sectors, geographies
and interests, and these standards ought to be underpinned by a set of appropriate
technical frameworks that individuals can leverage in promoting interoperability.
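
To make the link between standardization and interoperability concrete, the following Python sketch maps records from two hypothetical sources onto one shared schema so that they can be matched on a common key. The source layouts, field names, and country-code lookup are assumptions for illustration only.

```python
# Two hypothetical sources describing the same person with different
# field names, identifier casing, and date formats.
source_a = [{"nat_id": "A-100", "dob": "1990-04-12", "country": "Kenya"}]
source_b = [{"ID": "a-100", "birth_date": "12/04/1990", "ctry": "KE"}]

COUNTRY_CODES = {"kenya": "KE", "ke": "KE"}  # assumed lookup table


def standardize_a(rec):
    """Map a source A record onto the shared schema."""
    return {
        "person_id": rec["nat_id"].upper(),
        "birth_date": rec["dob"],  # already in YYYY-MM-DD form
        "country": COUNTRY_CODES[rec["country"].lower()],
    }


def standardize_b(rec):
    """Map a source B record (DD/MM/YYYY dates) onto the shared schema."""
    day, month, year = rec["birth_date"].split("/")
    return {
        "person_id": rec["ID"].upper(),
        "birth_date": f"{year}-{month}-{day}",
        "country": COUNTRY_CODES[rec["ctry"].lower()],
    }


standardized = [standardize_a(r) for r in source_a]
standardized += [standardize_b(r) for r in source_b]

# With a shared schema, records from both sources can be matched on a
# common key and duplicates collapsed.
unique = {rec["person_id"]: rec for rec in standardized}
print(len(standardized), "standardized records,", len(unique), "unique")
```

Before standardization the two records share no field name or format and cannot be compared directly; afterwards they are identical, which is what allows datasets from different sources to be integrated.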
Transparency should be exercised in all stages of data governance. Data-related
decisions and data processes should be communicated across all data users to ensure
a clear understanding of data-handling processes and to allow users to know how
their information is obtained and deployed. This will, in turn, build trust within the
data governance process, encouraging users to participate within the framework and
providing users with the necessary information to exercise their rights regarding the
availability and use of their data. Moreover, access to data should be provided on an
equitable basis across all categories, including both public and private data, and this
access should be universal and independent of data producers’ and users’ economic
status or market power. Access costs should be low enough to allow widespread
participation across all data users, and gaps in the enabling infrastructure within
countries and across the continent should be closed to promote equitable access.
Further, the costs of compliance for participation in the data economy should not be
prohibitive, an especially relevant concern in the African context where most of the
private sector consists of small businesses with limited capacity—businesses whose
active participation will require a conducive framework that promotes competition
in the use and reuse of data, and drives innovative activity. Finally, to ensure that the
rights of African citizens are adequately protected by third parties, the data governance
framework should ensure that data sharing at an international level is done with those
countries and regions that comply fully with the rules established within the African
data governance framework.

Infrastructure

Without an appropriate enabling environment, including the physical and human capital
infrastructure underpinning its implementation, a data governance framework cannot
reach its potential. The framework also calls for institutions that secure the right cultural,
legal, policy, regulatory, organizational, institutional, and technical environment, so
that all data users can effectively and efficiently extract value from data. It further calls
for enforcement mechanisms within national and regional data governance frameworks,
under which regulatory authorities enjoy administrative, agency and financial autonomy,
to ensure the security and privacy of data.
Building the appropriate enabling environment requires investment in infrastructure that
lowers the cost of data access while raising data quality, with particular attention to strengthening
each country’s infrastructure while also addressing access gaps both within and across
countries. Investment should target a rapid increase in the digitization of data, promote
the sharing of currently existing high-value public datasets, improve access to data at
the household level, and raise the quality of public Internet connections. Additionally,
the value of the enabling infrastructure should be established by quantifying the
impact of increased digitization, data access, and use both within countries and across
the continent. The enabling environment depends on establishing technical protocols
and interfaces to promote the standardization and transfer of data both nationally
and regionally. Finally, these frameworks, and their productive deployment, hinge
on investment in a well-trained labour force.

5. Conclusion and recommendations


Data sharing can vastly improve living standards through improvements in
productivity. Data are also non-rival, meaning they can be reused indefinitely without
degradation, and partially excludable. With knowledge building upon knowledge,
returns to the utilization of data are increasing in scale, and the efficient utilization
of data is thus critical to productivity growth.
Free data exchange has spawned unprecedented opportunities for people across the
globe, creating jobs and industries, facilitating increased mobility, and ultimately raising
living standards. Policy makers aspire to responsible and safe data use to improve the
lives of the people they serve, while minimizing the misuse or exploitation of data. To
this end, a clearly defined data governance framework integrated with a data strategy is
necessary to establish data sovereignty and buttress Africa’s competitiveness and cross-
country collaboration during the digital age. Implementing an effective data governance
framework will preserve the availability, usability, integrity, and security of data across
the continent, and such a framework will be both served and safeguarded by a developed
data infrastructure, technical protocols, laws and regulations, and institutions suited to
promoting the safe and trustworthy use of data while respecting privacy.
The central aim of a pan-African data governance infrastructure is to help bring about
a single market for data, thereby allowing for the creation, use, and reuse of data by
individuals across the continent and spurring economic growth and development while
still protecting the rights of data subjects. To fulfill its objectives, this framework should
adhere to certain central underlying principles and operate within a light-touch and agile
regulatory framework that encourages innovation. These principles include ensuring
accountability, maintaining data accuracy and quality, and facilitating interoperability
and standardization of data; but just as much, this framework must afford equitable
access to data and keep the costs of compliance low, to promote competition.
Some key areas require additional research if the maximum utility of a data governance
framework is to be enjoyed, such as: (a) identifying and prioritizing key strategic interests
across the continent that will benefit the most from implementation of a data governance
framework, and quantifying the value of the framework; (b) determining strategies to
increase the pace of digitization of offline public data sources, while also increasing access
to already digitized public data; and (c) mapping infrastructure gaps and cost-of-access
disparities at the sub-national and cross-country level and then resolving these inequities.
Finally, further research must explore how to create a regulatory framework that best
implements the principles of effective data governance.


Notes
1. An exabyte corresponds to 10¹⁸ bytes, and a zettabyte corresponds to 10²¹ bytes.

2. Non-rivalry of data means that data can be consumed or processed by multiple users
without depleting its quality or supply. Excludability of data means that some groups
or individuals can be prevented from accessing or using the data.

3. Interoperability refers to the ability to integrate datasets from different sources, while
portability means the ability to transfer or share data without affecting their quality
and content.


References
Acemoglu, D., Makhdoumi, A., Malekian, A. and Ozdaglar, A. 2019. Too much data: Prices
and inefficiencies in data markets. NBER Working Paper Series. National Bureau of
Economic Research. https://www.nber.org/system/files/working_papers/w26296/w26296.pdf.
Ackoff, R. L. 1989. “From data to wisdom”. Journal of Applied Systems Analysis, 126(1): 3–9.
https://softwarezen.me/wp-content/uploads/2018/01/datawisdom.pdf.
Alhassan, I., Sammon, D. and Daly, M. 2016. “Data governance activities: An analysis of the
literature”. Journal of Decision Systems, 25(sup1): 64–75.
https://doi.org/10.1080/12460125.2016.1187397.
Beraja, M., Yang, D. and Yuchtman, N. 2021. Data-intensive innovation and the state: Evidence
from AI firms in China. NBER Working Paper Series. National Bureau of Economic Research.
https://www.nber.org/system/files/working_papers/w27723/w27723.pdf.
Chander, A. 2020. “Is data localization a solution for Schrems II?” Journal of International
Economic Law, 23(3): 771–784. https://doi.org/10.1093/jiel/jgaa024.
Desjardins, J. 2019. How much data is generated each day? World Economic Forum.
https://www.weforum.org/agenda/2019/04/how-much-data-is-generated-each-daycf4bddf29f/.
Fourie, J. 2016. The long walk to economic freedom after apartheid, and the road ahead. IDEAS.
https://ideas.repec.org/p/sza/wpaper/wpapers267.html.
Gakwaya, O., Meier-Hahn, U., Mbouna, R. and Wannemacher, L. (eds). 2020. Blockchain in
Africa: Opportunities and challenges for the next decade. Smart Africa.
https://www.giz.de/expertise/downloads/Blockchain%20in%20Africa.pdf.
Grannis, S., Xu, H., Vest, J., Kasthurirathne, S., Bo, N., Moscovitch, B., Torkzadeh, R. and
Rising, J. 2019. “The effect of data validation and standardization on patient matching
accuracy”. Journal of the American Medical Informatics Association, 26(5): 447–456.
https://doi.org/10.1093/jamia/ocy191.
Khatri, V. and Brown, C.V. 2010. “Designing data governance”. Communications of the ACM,
53(1): 148–152. https://www.researchgate.net/publication/220426163_Designing_data_governance.
Kouper, I., Raymond, A. and Giroux, S. 2020. “An exploratory study of research data governance
in the U.S.”. Open Information Science, 2020(4): 122–142.
https://doi.org/10.1515/opis-2020-0010.
Marr, B. 2015. A brief history of big data everyone should read. World Economic Forum.
https://www.weforum.org/agenda/2015/02/a-brief-history-of-big-data-everyoneshould-read/.


Micheli, M., Ponti, M., Craglia, M. and Suman, A.B. 2020. “Emerging models of data
governance in the age of datafication”. Big Data and Society, 7(2): 1–15.
http://dx.doi.org/10.1177/2053951720948087.
Otto, B. 2011. A morphology of the organization of data governance. European Conference
on Information Systems (ECIS 2011 Proceedings).
https://www.researchgate.net/publication/221407900_A_morphology_of_the_organisation_of_data_governance.
United Nations Conference on Trade and Development – UNCTAD. 2021. Trade and Development
Report. United Nations Conference on Trade and Development.
https://unctad.org/webflyer/trade-and-development-report-2021.
Vidal, M. F. and Medine, D. 2019. Is data privacy good for business? Consultative Group to Assist
the Poor.
https://www.cgap.org/sites/default/files/publications/2019_12_Focus_Note_Is_Data_Privacy_Good_for_Business.pdf.
Weber, K., Otto, B. and Österle, H. 2009. “One size does not fit all—A contingency approach to
data governance”. ACM Journal of Data and Information Quality, 1(1): 1–27.
https://www.alexandria.unisg.ch/67793/1/a4-weber_external.pdf.
World Bank. 2021a. World Development Report 2021: Data for better lives. World Bank Group.
https://doi.org/10.1596/978-1-4648-1600-0.
World Bank. 2021b. Minimum data consumption: How much is needed to support online activities,
and is it affordable? World Bank Group. http://hdl.handle.net/10986/35149.
Zimmermann, K. A. 2017. History of computers: A brief timeline. Live Science.
https://www.livescience.com/20718-computer-history.html.

Mission
To strengthen local capacity for conducting independent,
rigorous inquiry into the problems facing the management of economies in sub-
Saharan Africa.

The mission rests on two basic premises: that development is more likely to
occur where there is sustained sound management of the economy, and that such
management is more likely to happen where there is an active, well-informed group of
locally based professional economists to conduct policy-relevant research.

www.aercafrica.org


Contact Us
African Economic Research Consortium
Consortium pour la Recherche Economique en Afrique
Middle East Bank Towers,
3rd Floor, Jakaya Kikwete Road
Nairobi 00200, Kenya
Tel: +254 (0) 20 273 4150