Describing The Organisation Data Landscape
Author: Alan McSweeney, Eirgrid
All content following this page was uploaded by Alan McSweeney on 14 October 2019.
Alan McSweeney
http://ie.linkedin.com/in/alanmcsweeney
The Data Landscape is a representation of the organisation’s data entities and their relationships,
interfaces and data flows. Data entities are data asset components that perform data-related functions,
from data storage to data transfer and data processing within the Data Landscape.
Supporting the Data Landscape is a database structure to allow information to be stored, managed,
extracted and reported on. This is described on page 49 onwards. This means that the Data Landscape is
not a static representation of a current or desired future state. It is a dynamic model that can be updated
and maintained. It can be used to assess change scenarios.
The objective of developing a Data Landscape model is to define an approach for formally and exactly
defining the operation and use of data at a high level within the organisation and to plan for future
changes.
The outputs from the data landscape model creation process are aimed at both technical and non-
technical audiences.
The model needs to be sufficiently flexible to include different levels of detail, from a high-level view of
data entities and their relationships to detailed descriptive attributes on the contents and processing
performed by data entities and their underlying technology components.
This approach does not use a formal modelling language other than a relational model (in the form of a
database) for the constructs underlying the data landscape. The material contained here represents a set
of conceptual and dynamic models designed to allow insights to be obtained on the design, construction,
operation and use of the data landscape. As the data landscape model is itself data driven, it can be
changed easily and quickly.
What is the Organisation Data Landscape? – this outlines the concept and objectives
of the data landscape, lists the data-related drivers and the linkages to other information technology
architecture practices.
Data Landscape Definition Principles – this lists some principles to apply to the
creation of the data landscape model.
Benefits and Uses of Data Landscape Approach – this lists some of the benefits of
using the data landscape approach.
Data Landscape Concepts – this introduces the concepts that underpin the data
landscape approach.
Data Landscape Data Model – this describes the core and extended elements of the
data landscape model.
Data Processes and Capabilities – this describes data processes, capabilities and data
life stages and their possible use to assess the health and status of the data landscape.
Business Functions and Business Processes – this discusses an extension of the data
landscape model to incorporate details on business processes associated with data processing.
Data Landscape Model and Enterprise Data Model – this outlines an extension to
the data landscape model to include elements of an Enterprise Data Model such as the Subject
Area Model.
Using the Data Landscape Model Approach – Some Sample Scenarios – this
contains some examples of using the data landscape model for planning data-related changes.
The organisation data landscape is a representation of the static data entities and the dynamic
relationships and data flows across a wide view of the organisation, including external interacting data
components and parties, both current and future.
Creating a data landscape view is important as data underpins the operation of information technology
solutions and business processes. Data breathes life into solutions as it flows through the organisation.
The optimum and most cost-effective design of the data landscape is therefore important. Similarly,
solutions that are developed or acquired and deployed on the data landscape
The nature of the organisation data landscape is changing as organisations are undergoing a data
transformation:
The data landscape has been broadened and there are more data entities that form part of the
extended organisation data landscape as more applications are moved to cloud service providers and
as cloud platforms are used for providing additional facilities not currently present in organisations
such as data analytics and machine learning
There is a wider range of data entities as the data landscape increases in complexity
There are more data entity types and data-related capabilities, especially in the areas of advanced
data analytics
There are more data demands within the organisation especially in the areas of analytics
These developments co-exist with other more general data-related trends that include:
Greater volumes of operational data from increasing numbers of different sources and providers
Greater volumes of derived data
More data sources both internal and external to the organisation
Data in larger numbers of different formats
Data with a wider range of contents
Data being generated at different rates
Data being generated at different times
Data being generated with varying degrees of accuracy and reliability and greater fuzziness
Data that changes constantly
Data that is of different utility and value
The data landscape approach aims to understand and handle these complexities in order to enable the
organisation to move from its current state to a target future state. It allows options to be explored and
understood. It facilitates the planning and organisation required to achieve this change.
The creation of an organisation data landscape is not an end in itself. It is constructed to add value to
data architecture-related activities, provide insight, and assist with resolving issues and planning
data-related changes.
The organisation’s data landscape and the work of the data architect in developing it evolve in line with
other information technology architectural practices within the organisation that can involve some or all
of the following logical roles:
Enterprise Architecture that defines, develops, extends and manages the implementation and
operation of the overall IT delivery and operation framework including standards and solution
development and acquisition.
Solution Architecture that designs and oversees the implementation of a portfolio of IT solutions
that translate business needs into operable and usable systems that comply with standards.
Business Architecture that defines and manages the implementation of IT solutions and related
organisation changes needed to implement business strategy and objectives.
Service Architecture that designs and oversees the implementation of service processes and
supporting technologies and systems to ensure the successful operation of IT solutions including
outsourced supplier management framework.
Security Architecture that designs data and system security processes and systems to ensure the
security of information and systems across the entire IT landscape.
Technical Architecture that translates solution designs into technical delivery, acting as a bridge
between solution architecture and the delivery function and designing new delivery approaches.
The data architect does not work in isolation from these other architectural disciplines. While the data
architect needs to focus on the core work of data architecture, that work should be part of the
organisation’s wider IT architecture. The data architect must balance a narrow (and selfish) focus
on pure data work against the broader needs of other IT architecture disciplines. The
results of the data landscape model should be shared with other members of the wider IT architecture
team.
Figure 2 – Information and Data Architecture in a Wider Information Technology Architecture Context
The data landscape is an integrated view of all data entities within and outside the organisation. It
captures a larger and deeper view of data and the data technologies, processes and capabilities within
and outside the organisation. This approach is designed as one tool to allow the data architect to perform
the role of ensuring the success of current data operations while planning for adopting changes and new
technologies.
This data operations view captures key data entities, their relationships, data flows and the associated
data capabilities and their supporting processes.
The objective is not to represent all information technology components or applications, but just those
that are explicitly related to the processing of data in its widest sense. Server infrastructure used to host
data processing applications is not explicitly represented. Similarly, security infrastructure such as web
proxies, firewalls, security appliances and user directories need not be shown unless doing so adds value
to describing, understanding and planning the data landscape.
The data landscape model creation process must be governed by a number of principles:
Less is More – create a model that is just detailed and complex enough to allow results to be
generated. The more detail that is added to the model, the greater its complexity becomes. Usability
and the ability to interpret the model to generate insight and value are reduced. While the amount of
detail that constitutes too much is undefined and subjective, the model should nonetheless be kept
simple. Too much detail, especially at the early stage, will kill the data landscape creation process.
Self-Descriptive – the data landscape model should be as self-descriptive as possible. The model
should be easy to understand and require as little additional knowledge as possible.
Consistency – the information representation approach should be consistent across all presentation
instances and types.
Utility – one measure of the usability of the model is that it is useful and is used. One objective of
modelling is to aid understanding, insight, planning and problem determination and resolution.
Results-Focussed – the model is not an end in itself but a means to an end. Too much analysis is
implicitly inward- and backward-looking. The model should be forward-looking, assisting
with the resolution of problems and with planning and defining the future data landscape. Spending too
much time and effort on gathering detail that is not useful or relevant means that less will be
available to devote to value-adding activities. Documenting the existing data entity landscape can
be useful to determine the gaps that must be filled.
Relevance – the model should only contain what is relevant. Irrelevant detail should not be added.
These principles are inexact but they should nonetheless be considered when creating any data landscape
model.
The landscape is only as useful as the information it contains and the accuracy and currency of that
information and the ability to present and use the information. The level of detail that is gathered about
the data landscape governs the type of detailed processing and analysis that can be performed. More
information means more maintenance of that information. The usefulness and usage will be reduced if
the information is not current.
Information should be collected at a high level initially. More detail can be added later. Only sufficient
information that is needed to add value should be collected and input into the model. The objective is to
describe the present in order to map, plan and understand the future, identify gaps, consider options and
optimise future configurations.
Creating the data landscape view requires an audit of the existing data entities and their static and dynamic
data relationships and flows. It can be a once-off or a continuous engagement: once-off to assist with specific
planning activities or continuous to allow the state of the data landscape to be constantly reviewed.
The data landscape is a representation of the way in which the organisation currently generates, receives,
processes, uses and disseminates data, and of how it would like to do so in the future.
The approach will allow changes to be planned and their requirements and impacts to be understood. It
will allow data selection, design and deployment options to be explored, opportunities to be assessed,
their scope understood and their impacts identified, and data architecture and technology alternatives to be
explored.
It can be used to assist with understanding, mapping and planning an organisation’s data
transformation and assist in moving to a more data operations-oriented organisation. It can be used to
identify opportunities for improvement, simplification and automation. Gaps and missing data
capabilities and facilities can be identified.
Data Zones – these are groupings of physically closely located data entities. The data zones
represent major clusters of or containers for data entities that are physically and/or logically close to
one another. Data zones do not represent objects that perform processing. The data entities located
within data zones perform the data related activities.
Data Entity Types – these are types of data source, storage, transformation, processing or transfer
components. Data entities perform data-related work across the spectrum of actions and events.
Essentially, a data entity is a hardware or software technology component involved in any form of
data processing. Data entity types are associated with data zones.
Data Entities – these are data assets that are involved in the storage, processing and transfer of
organisation data, in the widest sense. Data entities have a type and are located in Data Zones.
Data Entity Relationships – these are connections or associations between data entities. These
relationships can be loosely or exactly defined.
Data Interfaces and Data Flows – a data interface is a specific way a data entity can provide or
accept data. A data flow is a link between two (or more) data entity interfaces where there are at
least two endpoints: a source and a target.
Levels of Descriptive Detail – these define types and amounts of information to be provided
ranging from initial foundational information to details on individual data elements within a data
entity.
Data Entity Type Attributes – these are entity-specific attributes that contain descriptive
information.
Data Capabilities and Processes – these are sets of activities commonly and repeatedly performed
to generate specific results within the context of the data landscape.
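The concepts listed above can be sketched as a handful of simple record types. The following is a minimal illustration only, assuming Python dataclasses. The class and field names, the Enterprise Warehouse entity and the interface names extract and load are hypothetical and are not part of the data landscape model itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataZone:
    """A container for data entities; it performs no processing itself."""
    name: str


@dataclass(frozen=True)
class DataEntityType:
    """A generic kind of data component, such as a data warehouse."""
    name: str


@dataclass
class DataEntity:
    """A data asset that has a type and is located in a data zone."""
    name: str
    entity_type: DataEntityType
    zone: DataZone
    # Named ways in which the entity can provide or accept data.
    interfaces: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class DataFlow:
    """A link from a source entity interface to a target entity interface."""
    source: str
    source_interface: str
    target: str
    target_interface: str


central = DataZone("Central Data Infrastructure")
store_type = DataEntityType("Data Processing Application Operational Data Stores")
warehouse_type = DataEntityType("Data Warehouse")

# Hypothetical entities and interface names, for illustration only.
finance_db = DataEntity("Financial System Database", store_type, central, ["extract"])
warehouse = DataEntity("Enterprise Warehouse", warehouse_type, central, ["load"])

flow = DataFlow(finance_db.name, "extract", warehouse.name, "load")
entities_in_central = [e.name for e in (finance_db, warehouse) if e.zone == central]
```

Keeping zones and types as separate records reflects the point that an entity has exactly one type and one location, while a flow only refers to interfaces that entities expose.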
Data Zones
The data landscape model contains a number of data zones, such as:
Insecure External Organisation Presentation And Access – this represents a location where
publicly accessible data entities reside. These entities are regarded as insecure and/or untrusted.
Secure External Organisation Participation and Collaboration – this is a location outside the
physical organisation boundary where data entities that are provided by or to trusted external
parties reside.
Secure External Organisation Access – this zone contains data entities that enable secure access
from outside the organisation.
Organisation – this data zone represents the entire organisation and it contains all the locations
and business units or functions within the organisation.
Central Data Infrastructure – this contains the central data applications and their associated
data.
In this diagram, higher-level data zones are shown as encompassing and surrounding lower-level ones.
This is just one possible representation of the logical layering of these data zones, from central data
infrastructure to wider zones that ring the organisation.
The data zones could also be represented as islands in the following view:
The data landscape model can be extended to include further data zones if necessary. The following
diagram shows additional data zones explicitly represented.
These additional data zones are overt representations of locations for organisation data entities located
outside the core organisation but effectively part of a stretched data landscape. These further data zones
are:
Co-Located Data Infrastructure – this represents organisation data entities that are logically part
of the organisation but physically located within a co-located facility.
Outsourced Service Provider Data Infrastructure – this represents data entities that are used
by the organisation but are provided by, or use technology infrastructure provided by, an outsourcing
service provider.
Cloud Service Provider Data Infrastructure – this represents organisation data entities of
various types (applications and their associated data stores, data technology infrastructure and data
entities implemented on it or platforms used to create data applications and store their data)
provided by cloud service providers.
There could be several of each of these zones, one for each service provider the organisation uses.
In the diagram above, the data entities that are represented in the Central Data Infrastructure zone
level can exist at the Organisation zone. Central organisation data entities can also be located in an
Organisation Location/Unit zone defined as a container for that purpose.
These represent types of data entities that can reside in data zones. Data entities are assigned a type.
The following diagram shows one view of a possible list of data entity types located in the concentric
data zone view shown in Figure 3 on page 12.
This diagram shows the data entity types using the concentric data zone view shown in Figure 3 on
page 12. This list is neither complete nor definitive. There are many ways of representing data entity
types, of which this is one. The objective here is to have a consistent way of representing key data entities.
Once this is done, the relationships, interactions and data flows between data entities can be specified.
The data entity types that are in the Central Data Infrastructure data zone could also be located in other
data zones such as:
These are data entity types and not actual data entities. The scenario analysis section uses a simple data
landscape with data entities of some of these types shown on page 72.
The following diagram shows some of the placeholder data types such as Hosted Data Infrastructure,
Externally Co-Located Data Infrastructure and Externally Co-Located Outsourced Data Infrastructure
being replaced by explicit references to their constituent data entity types.
Figure 7 – Data Zone and Data Entity Type Example with Cloud and Outsourced Service Providers
These data entity types are generic and independent of any specific technology or set of facilities they
provide other than that which is implied by their type.
For example, a finance and accounting or HR solution will have a type of Data Processing Business
Applications. It can be located in a data zone such as Central Data Infrastructure or Secure External
Organisation Access.
Entities can have (many) relationships with other entities. These can be one-way – from a source to a
target entity – or two-way – between two entities. Relationships can be expressed in active or passive
terms: A acts on B or B is acted upon by A. Relationships are not necessarily definitive. They can be used
to denote informal associations.
There can be many different types of entity relationships. These relationships can be characterised in
different ways.
Entity relationships are intended to represent connections between entities. Changes in those entities –
movement to a different zone as a result of movement to a cloud service provider or an outsourcing
arrangement, new entities added, entities aggregated or split, new functionality added – impact the
relationships. Understanding the entity relationships means the impact of entity changes can be
understood.
The following diagram shows some of the possible relationships between entity types.
Entity relationships can be defined at varying levels of detail and complexity. The amount of definition
needs to be directly related to the benefit that will be derived. Entity relationships allow the likely
consequences of data landscape changes to be identified.
This diagram violates the design principles listed on page 9 because its level of detail confuses
rather than adds insight. Such a diagram obscures rather than enlightens. However, once the data entity
relationship information has been entered into the data landscape, it can be filtered to show relationship
types or just a subset of relationships.
The absence of defined relationships between entities can be used to identify potential problems such as
underused or redundant entities, or information that still needs to be collected.
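Once relationships are held as data, the three uses described above (assessing the impact of an entity change, filtering by relationship type and spotting entities with no relationships) become simple queries. The following sketch assumes relationships are stored as (source, relationship type, target) tuples. The entity names, relationship types and function names are all hypothetical.

```python
# Hypothetical relationships: (source entity, relationship type, target entity).
relationships = [
    ("Finance Application", "writes to", "Financial System Database"),
    ("ETL", "reads from", "Financial System Database"),
    ("ETL", "loads", "Data Warehouse"),
    ("Reporting Tool", "reads from", "Data Warehouse"),
]

# All known entities, including one that has no relationships recorded.
entities = {
    "Finance Application",
    "Financial System Database",
    "ETL",
    "Data Warehouse",
    "Reporting Tool",
    "Legacy Archive",
}


def related_entities(entity, rels):
    """Return the entities directly connected to `entity`, in either direction."""
    found = set()
    for source, _rel_type, target in rels:
        if source == entity:
            found.add(target)
        elif target == entity:
            found.add(source)
    return found


def filter_by_type(rels, rel_type):
    """Return only the relationships of one type, to reduce diagram clutter."""
    return [rel for rel in rels if rel[1] == rel_type]


def unrelated_entities(all_entities, rels):
    """Return entities that take part in no relationship at all."""
    connected = {name for source, _rel_type, target in rels for name in (source, target)}
    return set(all_entities) - connected


impacted = related_entities("Financial System Database", relationships)
reads = filter_by_type(relationships, "reads from")
orphans = unrelated_entities(entities, relationships)
```

An entity reported by unrelated_entities is only a candidate for review: it may be redundant, or its relationships may simply not have been collected yet.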
A data interface is a means by which a data entity can accept or provide data. Interfaces can be
PUSH – where the source data provider entity pushes the data to the target – or PULL – where the target
data entity pulls the data from the source.
Parameters supported that affect the nature of the data being sent or received
A data flow is a path from a data source and its associated data interface to a data target and its
associated interface. So a data flow involves two (or more) interfaces.
Data flows can be direct – from the source data entity interface to the target data entity interface – or
indirect – by way of an interim data entity (such as an (S)FTP server, service bus or data storage
location acting as a mailbox). The data flows involved in an ETL tool moving data from one data entity
to another could be viewed as two data flows or one.
Data flows can also involve a transformation, where the source data is modified before it reaches the
target.
At a very high level, based on the combinations of these options, there can be ten major types of data
flow:
These types of data flows are concerned with the transfer of data. They exclude details on the
handshaking required to initiate the data flow such as authentication and generation and use of
temporary session keys.
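The count of ten can be reproduced by enumerating the combinations. The sketch below assumes that a direct flow has a single leg that is either pushed by the source or pulled by the target, while indirect and transformation flows have a source leg and a target leg that can each be a push or a pull. The labels Direct Push and Direct Pull are assumed names. The indirect and transformation labels follow the diagram captions used in this document.

```python
from itertools import product

MODES = ("Push", "Pull")


def data_flow_types():
    """Enumerate the major data flow types from the push/pull options."""
    # Direct flows have one leg: the source pushes or the target pulls.
    types = [f"Direct {mode}" for mode in MODES]
    # Indirect and transformation flows have two legs, each a push or a pull.
    for kind in ("Indirect", "Transformation"):
        for source_mode, target_mode in product(MODES, repeat=2):
            types.append(f"{kind} Source {source_mode} Target {target_mode}")
    return types


flow_types = data_flow_types()
```

Under these assumptions the list contains two direct, four indirect and four transformation types.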
The following diagram represents an Indirect Source Pull Target Pull data flow.
The following diagram represents an Indirect Source Push Target Pull data flow.
The following diagram represents an Indirect Source Pull Target Push data flow.
The following diagram represents an Indirect Source Push Target Push data flow.
The following diagram represents a Transformation Source Pull Target Pull data flow.
The following diagram represents a Transformation Source Push Target Pull data flow.
The following diagram represents a Transformation Source Pull Target Push data flow.
The following diagram represents a Transformation Source Push Target Push data flow.
These types of data flows have a single start and single end point in a single data entity. Data flows can
be more complex with, for example, data being sent to multiple targets.
Transformations can consist of multiple data processing steps. For the purposes of documenting the data
landscape, this additional information increases complexity while not necessarily adding value in terms
of understanding the existing landscape and planning for data transformation changes.
The following diagram shows a number of possible data flows across a number of interfaces for a subset
of the data entity types shown on page 15.
Figure 20 – Sample Data Interfaces and Data Flows for Data Entity Types
The next diagram shows an example of a single extended data flow extracted from the previous diagram.
The example follows data from its generation by external devices to its analysis,
across a number of interfaces and spanning a number of data zones.
In this example, there are 12 data entity types and their interfaces involved in the extended data flow:
1. Public External Data Devices – these collect or provide measurement or telemetry data. The
collected data is pushed to a data concentrator.
2. Edge Device – this acts as a data concentrator, receiving data from multiple external data sources
such as meters or telemetry units. The data is then pushed to a data access data entity type.
3. External Data Receipt And Access Control – this is a generic data entity type that represents the
entry portal for incoming data. The edge device pushes aggregated edge device data to this.
4. Integration/Service Bus – this represents a data entity type that implements or provides service
oriented data integration facilities.
5. Data Processing Business Applications – this denotes data entity types that represent the
business applications that receive the data from the Integration/Service Bus data entity and process
it.
6. ETL – the ETL data entity type may be involved in the extended data flow in a number of ways:
It can receive data from the Integration/Service Bus and pass it to the Data Processing Application
Operational Data Stores data entity type that represents the data stores of the Data Processing
Business Applications data entity type.
It can extract data from the Data Processing Application Operational Data Stores and move it
to the Data Warehouse and Data Marts data entity types, after transformation to convert
operational data into the subject-oriented data format with a time dimension that these entity
types typically require.
7. Data Processing Application Operational Data Stores – these data entities are the functional
(rather than infrastructural) data storage component of the corresponding Data Processing Business
Applications entity types. The ETL data entity type pulls data from the stored data, transforms it
and loads it into the Data Warehouse and Data Mart entity types.
8. Data Warehouse – this represents the data entity type that holds long-term data from operational
systems.
9. Data Marts – this signifies data entity types that contain specific subsets of transformed
operational data used for specific reporting and analysis purposes.
11. External Data Analytics Co-ordination and Management – this represents a data entity that
manages the allocation of data analytics requests to external (cloud-based) data analytics services
and the retrieval of results.
12. External Data Analytics Services – these data entities provide external (cloud-based) data
analytics facilities.
Figure 21 – Example of Single Extended Data Flow across a Number of Data Entities
This single extended data flow can (and really should) be broken down into a number of specific data
flows using data entity interfaces that exist and are used for each distinct purpose.
The following diagram shows this sample extended data flow divided into three separate data flows:
Figure 22 – Splitting Sample Extended Data Flow into Two Separate Data Flows
Data Flow 2 – the population of data stores of different types with sensor data
The level of detail and the amount of process decomposition applied to a data flow depends on factors
such as:
The amount of detail to be represented and the value to be derived from that detail
The need to include details on the data handoffs between each interface and to describe any data
transformation that occurs
The value and utility that can be obtained from the level of detail
The following diagram shows a simplified representation of these previous sample data flows. In this case,
just the main data entities and their interfaces involved in the data flows are shown.
This version contains a reduced amount of detail when compared with the previous more detailed
illustration.
The data landscape model could be used to hold information at different levels of detail:
Level 1 – Data zone and entities, their types, relationships and interfaces. This is the
foundational definition of the data landscape. It identifies the major data entities in each data zone.
Level 2 – Additional details about the data entities, their constituent components, attributes
and characteristics. This includes platform details, technologies used including versions and
products used including versions.
Level 4 – Description of data contents and data processing. This level can contain additional
details on the data that is within the scope of the data entity such as datasets, files, tables or other
data constructs or data processing steps and activities.
Level 5 – Individual details of data contents. This can contain further levels of detail down to the
individual data field level.
The purpose of the data landscape view is not to become or replace any existing data dictionary or
semantic data function within the organisation by adding a parallel set of information. At its core it is a
data architecture planning approach.
The data model required to describe the core data landscape is quite simple. The following shows it
expressed as a simple Entity-Relationship Diagram.
The core data model is sufficient to provide a helicopter view of the data landscape.
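As an illustration of how small such a relational structure can be, the following sketch creates a cut-down version in an in-memory SQLite database using Python's standard sqlite3 module. The table and column names are assumptions made for this example and are not taken from the Entity-Relationship Diagram. Interfaces and data flows are omitted for brevity, and the Enterprise Warehouse entity is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed, simplified core tables: zones, entity types, entities and relationships.
conn.executescript("""
CREATE TABLE data_zone (
    zone_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE data_entity_type (
    type_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE data_entity (
    entity_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,
    type_id   INTEGER NOT NULL REFERENCES data_entity_type (type_id),
    zone_id   INTEGER NOT NULL REFERENCES data_zone (zone_id)
);
CREATE TABLE data_entity_relationship (
    source_id   INTEGER NOT NULL REFERENCES data_entity (entity_id),
    target_id   INTEGER NOT NULL REFERENCES data_entity (entity_id),
    description TEXT NOT NULL
);
""")

conn.execute("INSERT INTO data_zone (zone_id, name) VALUES (1, 'Central Data Infrastructure')")
conn.execute("INSERT INTO data_entity_type (type_id, name) VALUES (1, 'Data Warehouse')")
conn.execute(
    "INSERT INTO data_entity (name, type_id, zone_id) VALUES ('Enterprise Warehouse', 1, 1)"
)

# List every entity with its zone and type: the high-level view of the landscape.
rows = conn.execute("""
    SELECT z.name, t.name, e.name
    FROM data_entity AS e
    JOIN data_entity_type AS t ON t.type_id = e.type_id
    JOIN data_zone AS z ON z.zone_id = e.zone_id
    ORDER BY z.name, e.name
""").fetchall()
```

Because the landscape is held as rows rather than drawings, a change scenario such as moving an entity to a different zone is a single update to data_entity.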
The core model is sufficient to describe the primary components of the organisation data landscape and
to perform the analysis and planning described above.
The core data landscape model can be extended to allow for the inclusion of other information such as:
Components of data entities that could be used to provide more granular information on their
constituents – this is described on page 38.
Attributes of data entities and data zones to describe their characteristics – this is described on page
39.
Data contents that describe details on the data associated with a data entity – this is described on
page 45.
Application group that links several data entities into a wider application or service – this is
described on page 49.
Data management and operations processes as they apply to data entities and their status – this is
described in more detail on page 53.
Subject area model data concepts (part of the Enterprise Data Model) and which data entities are
involved in their processing – this is described in more detail on page 67
These are just examples of the types of extensions that can be performed. Such extensions must add
value, utility and insight to justify their use and the amount of work required to populate the data
structures.
These extensions are not sequential. They can be applied in any order to the core data model.
At a high-level, the core and extended data landscape model can be represented as follows:
The next four sections show possible extensions to the core data model to describe:
Data components are intended to hold an additional level of detail on the contents and composition of
data entities and the functions those components perform.
Not all these data elements are required for this data model extension.
Data entity attributes can be used to store extended details on data zones, entities, interfaces and flows.
For example, an attribute called Database Platform could be defined with a list of possible values such as:
IBM DB2
Informix
Microsoft Azure SQL Database
Microsoft SQL Server
MySQL
Oracle
PostgreSQL
The Database Platform attribute could then be associated with Data Entity Type of Data
Processing Application Operational Data Stores.
The Data Entity of Financial System Database can be assigned a Data Entity Type of Data
Processing Application Operational Data Stores.
The Database Platform attribute of the Data Entity of Financial System Database can then be
assigned a value from the list of possible values.
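The Database Platform example can be sketched in code as follows. This is a minimal illustration only: the class names AttributeDefinition and DataEntity and the method set_attribute are hypothetical, and the value assigned at the end is chosen arbitrarily from the list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AttributeDefinition:
    """An attribute, its permitted values and the entity types it applies to."""
    name: str
    allowed_values: frozenset
    applies_to_types: frozenset


@dataclass
class DataEntity:
    name: str
    entity_type: str
    attributes: dict = field(default_factory=dict)

    def set_attribute(self, definition: AttributeDefinition, value: str) -> None:
        # Reject attributes that are not associated with this entity type.
        if self.entity_type not in definition.applies_to_types:
            raise ValueError(
                f"{definition.name} does not apply to type {self.entity_type}"
            )
        # Reject values that are not in the list of possible values.
        if value not in definition.allowed_values:
            raise ValueError(f"{value!r} is not a valid {definition.name}")
        self.attributes[definition.name] = value


ODS_TYPE = "Data Processing Application Operational Data Stores"

database_platform = AttributeDefinition(
    name="Database Platform",
    allowed_values=frozenset({
        "IBM DB2", "Informix", "Microsoft Azure SQL Database",
        "Microsoft SQL Server", "MySQL", "Oracle", "PostgreSQL",
    }),
    applies_to_types=frozenset({ODS_TYPE}),
)

financial_db = DataEntity("Financial System Database", ODS_TYPE)
# Arbitrary illustrative choice from the list of possible values.
financial_db.set_attribute(database_platform, "PostgreSQL")
```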
The objective of allowing data entity attributes is not to store and manage detailed configuration
information. The DMTF (Distributed Management Task Force) maintains and publishes a Common
Diagnostic Model (https://www.dmtf.org/standards/cdm) that contains details on a possible set of IT
infrastructure-specific attributes.
There are other examples of detailed data entity attributes from developers of CMDB (Configuration
Management Database) software, whose data models contain such attributes. Examples of CMDB data
models include:
These details are included for information only. The data landscape data model does not need to include
this level of detail.
Not all these data elements are required for this data model extension.
Data entity attributes can be used to hold status and planning information about data entities. These
could include:
For example, the future plans for data entities could include some or all of the following values that
indicate the corresponding actions:
1. Reassemble – combine functionality of solution with other solutions to create new combined
solution
2. Redevelop – redevelop the custom application and retain its functionality
3. Reduce – stop using functional elements of the current application while retaining it
4. Refactor – change the internal application structure, design and implementation without changing
the external appearance and functionality
5. Rehost – move application to new platform without change
6. Relocate – move the data contained in the data entity to another platform
7. Repair – resolve problems and issues with the current application while retaining it on the same
platform
8. Replace – replace the application with a functionally similar one
9. Replatform – move application to new platform with some limited changes to enable the
application to run on the new target platform
10. Research – this represents data technologies that are emerging and are being researched and piloted
for possible production application
11. Reserve – retain the application but encapsulate access to its functionality via some form of
interface
12. Retain – retain the application entirely in its current form
13. Retire – stop using an obsolete application without explicitly replacing it
The following diagram shows the rough general location of these options arranged along two axes of
future location of the data entity – from existing location to an external cloud or hosted one – and the
level of change involved to the data entity – from none to significant.
When this additional status information is available, data entities could then be filtered based on factors
such as their future plans.
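Filtering on future plans could look like the following sketch. The enumeration mirrors the thirteen options listed above. The entity names and the plans assigned to them are invented for the example.

```python
from enum import Enum

# The thirteen future plan options from the list above.
FuturePlan = Enum(
    "FuturePlan",
    "REASSEMBLE REDEVELOP REDUCE REFACTOR REHOST RELOCATE REPAIR "
    "REPLACE REPLATFORM RESEARCH RESERVE RETAIN RETIRE",
)

# Hypothetical status information held as a data entity attribute.
planned = {
    "CRM Application": FuturePlan.REHOST,
    "Financial System Database": FuturePlan.RETAIN,
    "Legacy Reporting Tool": FuturePlan.RETIRE,
}


def entities_with_plan(plans, *wanted):
    """Return the names of entities whose future plan is one of `wanted`."""
    return sorted(name for name, plan in plans.items() if plan in wanted)


changing = entities_with_plan(planned, FuturePlan.REHOST, FuturePlan.RETIRE)
```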
There may be a temptation to create lots of data entity type-specific attributes that can be used to
record information about data entities. However, unless these attributes add value to the data landscape
model, they should not be added.
One area that could add value is using data attributes to track the cost or financial impact of data
entities. This information can then be used to assess the financial impact of various data transformation
options. The following diagram shows a possible view of the financial impact of data entities superimposed on
the sample data landscape on page 72.
Data will have security characteristics and requirements in terms of its sensitivity and confidentiality
and the impact on the organisation of its loss, from regulatory to financial and reputational.
The data attribute extension to the data model can be used to hold security profile information
regarding data entities or data subjects processed by those data entities (see details on the subject area
model on page 67).
In planning data transformations such as the examples listed on page 72, the security
implications can be identified and assessed if the security attributes have been defined.
Data entity contents are intended to hold an additional level of detail on the data contents of data
entities. This is separate from data entity components that are intended to represent functional elements
of a data entity.
Not all these data elements are required for this data model extension.
A set of data entities can belong to an application or service. The purpose of this extension to the core
data landscape model is to allow data entities to be assigned to applications.
For example, an application that allows external users to interact with it may consist of the following data
entities:
Common data entities such as those providing infrastructural-related data services can be shared
between applications or services.
Being able to group data entities to reflect their involvement and the role they perform in an application
or service means that the impact on that application or service can be determined if any of its
component data entities change.
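A minimal sketch of this kind of impact analysis is shown below. The application names and their member data entities are assumptions made for the example. A shared infrastructural entity appears in more than one group.

```python
# Hypothetical grouping of data entities into applications or services.
application_entities = {
    "Customer Portal": {"Web Server", "CRM Application", "Authentication Service"},
    "Finance": {"Financial System Database", "Authentication Service"},
}


def impacted_applications(groups, changed_entity):
    """Return the applications that include the changed data entity."""
    return sorted(
        app for app, entities in groups.items() if changed_entity in entities
    )


shared_impact = impacted_applications(application_entities, "Authentication Service")
single_impact = impacted_applications(application_entities, "CRM Application")
```

In this example a change to the shared entity affects both applications, while a change to the CRM Application affects only one.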
The environment type data element can be used to identify separate environments for an application.
Environment values can be defined such as:
Production
Pre-Production
Operations Acceptance Test
User Acceptance Test
System/Integration Test
Development/Unit Test
Not all these data elements are required for this data model extension.
The data landscape is neither passive nor static. It must be designed, implemented, managed,
administered and operated through the development and application of a range of processes. The extent
of their implementation and application should be part of any description and assessment of the state of
the data landscape. The state of these processes and the state of their application to specific entities is one
measure of the state of the data landscape.
If a means is required to assess the health of the data landscape with respect to its operational state, then
a structured operational process framework is required against which that assessment can be performed.
This section contains notes on defining such an operational process definition and thus assessment
framework.
The objective here is not to define a complete, exact and rigorous process definition assessment
framework. The modelling principles listed on page 9 should be applied here. Complexity is the enemy of
quick and useful results.
Data Service Management Processes – these are data landscape-specific elements of what should
be more general information technology service management processes. They are sets of activities
performed to create a result. These are concerned with looking after the pure operational aspects of
the data landscape (as part of a wider information technology landscape). This is just one view of the
key service management processes that apply to the data landscape.
Data Capability Process Areas – these are data landscape-specific capabilities and the associated
processes that actualise their use. These represent skills that will be of varying degrees of importance
to each organisation.
Data Life Stages – this is a view of the stages through which data moves as it is being processed by data
entities. Not all of these stages apply to all entities. The entire set of stages may span a number of
data entities. The stage view applies to an individual data instance, a set of data processed by a
specific solution that may use the facilities of multiple data entities.
Each of these views describes a different aspect of the processes associated with the data landscape. The
service management process view describes how well these general service management processes have
been implemented and are operated for entities within the data landscape. The data capability process areas are sets
of skills and abilities that must exist and be applied to the design and implementation of data entities.
The data life stage view takes a cross-functional perspective on data processing and movement through
its life stages.
The service management processes and their applicability to the data landscape are:
Incident Management – handle and manage unplanned interruptions to or reduction in the quality
of a data service and to restore normal operation as quickly as possible, minimising the negative
impact on business operations.
Problem Management – analyse and determine the root causes of incidents to stop incidents from
happening, to eliminate recurring incidents and to minimise the impact of incidents that cannot be
stopped.
Event and Alert Management – detect events and alerts that represent significant occurrences to
entities, identify them and determine the appropriate actions to take and to collect data for analysis.
Service Level Management – agree and define service targets and then ensure these targets are
met for data entities.
Asset Management – track data entities through their lifecycle to identify ownership, cost of
operation and use and manage upgrade and replacement cycles.
Resilience, Availability and Continuity Management – ensure that data entities are available,
can resist failure and recover quickly from failure and ensure continuity of operations in the event of
the loss of data entities.
Risk Management – identify, assess and control occurrences that could cause loss of or damage to
data entities.
People Management – manage people resources required to administer, manage and operate data
entities from a service operations view.
Supplier Management – manage suppliers of services across the duration of the product or service
supply contract.
Knowledge Management – manage information and knowledge systems so that personnel have
access to the knowledge needed to effectively perform their work and identify the knowledge needed
for service delivery.
Ideally, there will already be a service management framework in operation within the organisation that
will have implemented these service processes more generally. These can then be applied specifically to
the data landscape.
Data Governance – planning, supervision and control over data management and use, developing
data management and use standards and ensuring compliance with data management processes and
standards.
Data Architecture Management – defining data technology standards, defining the approach to
managing data assets, use and reuse of and compliance with existing data technology standards, use
and reuse of data infrastructural technology solutions.
Data Model Management – creating standard data models of the data that will be collected,
created, stored and processed that formally describe the data contents and structures, including
metadata and semantic data, integrating, controlling and providing metadata – descriptive data
about the underlying operational data, creation of data description standards and the collection,
categorisation, maintenance, integration, application, use and management of data descriptions.
Data Security Management – ensuring data privacy, confidentiality and appropriate access to
data, managing and implementing data classification and preventing data loss.
Data Solution Design and Implementation Management – ensuring that all the data aspects of
the design of information technology solutions are performed to a suitable standard and incorporated
into subsequent solution implementation.
Data Operations Management – providing data storage, data operations and service management
support from data acquisition to purging. The service management processes listed above could be
subsumed into this capability.
Data Master, Reference and Quality Management – managing master versions and replicas,
management of master versions of shared data resources to reduce redundancy and maintain data
quality through standardised data definitions and use of common data values, defining, monitoring
and improving data quality.
Data Audit, Control and Lifecycle Management – managing the definition, collection and
analysis of data audit information, using audit information to develop data controls.
Data Location, Synchronisation and Access Management – managing data across storage
locations and platforms, both internal and external, synchronising data across platforms and
managing and controlling access to data across platforms.
Data Usability Management – ensure usability across all elements of the data landscape, ensuring
utility, accuracy, consistency, ease of interpretation.
Data Project Management – supporting and managing the data aspects of projects and solution
delivery and handover to production and support.
Data Insight and Presentation Management – creating data warehouses and data marts,
implementing reporting, data visualisation and analytics, defining data metrics and performance and
results indicators, implementing and operating processes to ensure action is taken based on data
insights.
These data capabilities are not isolated silos. They are interlinked. This does not mean that they cannot
or should not be assessed and evaluated separately.
The interconnectedness of the data capabilities and their underpinning implementation and operating
processes illustrates the difficulty of assessing one capability independently of others. It is almost certain
that if an organisation is good at any one of these capabilities, it will be good at all of them.
2. How well the defined processes are applied, implemented and operated
These aspects apply to both the process in general and to its application for a specific data entity.
A process measurement and assessment framework that includes all of these elements would be very
complex to use.
Gaps in process definition and operation in the data landscape may indicate potential problem areas that
may require or would benefit from remediation.
Using a Data Process Framework to Assess the Health of the Data Landscape
The following approach could be used to assess data capabilities across the data landscape. For each data
capability, rate the overall importance, implementation status and operation and use status. Then for
each data entity, rate each data capability in the same way. This would result in a measurement
structure along the following lines:
This is a very complex measurement structure as well as being time-consuming to create, maintain and use.
This approach breaches the design principles listed on page 9. While some form of data capability
process assessment would be useful in being able to detect potential problems and areas requiring
remediation, this approach, without simplification, would be too complex to be usable.
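To give a sense of scale, the number of individual ratings implied by the full approach can be counted. The sketch below assumes the twelve data capabilities listed earlier and the three ratings named above (importance, implementation status, operation and use status). The figure of 50 data entities is purely illustrative.

```python
# The twelve data capability process areas listed earlier in this section.
CAPABILITIES = [
    "Data Governance",
    "Data Architecture Management",
    "Data Model Management",
    "Data Security Management",
    "Data Solution Design and Implementation Management",
    "Data Operations Management",
    "Data Master, Reference and Quality Management",
    "Data Audit, Control and Lifecycle Management",
    "Data Location, Synchronisation and Access Management",
    "Data Usability Management",
    "Data Project Management",
    "Data Insight and Presentation Management",
]

# The three ratings applied to each capability.
RATING_DIMENSIONS = ["importance", "implementation status", "operation and use status"]


def rating_count(entity_count: int) -> int:
    """Count ratings: one overall set plus one set per data entity."""
    per_set = len(CAPABILITIES) * len(RATING_DIMENSIONS)
    return per_set + entity_count * per_set


# Illustrative landscape size; not a figure from the document.
total_for_50_entities = rating_count(50)  # 36 + 50 * 36 = 1836
```

Even a modest landscape therefore requires well over a thousand ratings to be created and kept current.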
In terms of the extended data model, the data entity attribute approach described on page 39 could be
used to hold the process status/health information. For the purposes of identifying issues at a high-level,
this should be sufficient.
It is not possible to discuss the topic of (data) processes and their assessment without the subjects of
their maturity and the use of maturity models being raised. The purpose of this document is not to
discuss data maturity models in detail. This section covers the topic briefly. There has been a growth in
the number of informal and ad hoc maturity models across different aspects of data processes. These
models lack the rigour and validation and the detailed assessment framework to support their use.
5 - Optimising process
4 - Predictable process
3 - Established process
2 - Managed process
1 - Initial process
Specific goals describe the unique features that must be present to satisfy the process area.
Generic practices are applicable to multiple processes and represent the activities needed to manage a
process and improve its capability to perform.
Specific practices are activities that contribute to the achievement of the specific goals of a
process area.
Maturity levels are intended to be a way of defining a means of evolving improvements in processes
associated with what is being measured.
These data maturity models have different areas of focus, as shown in the diagram below.
Data Capability Maturity Model – these define a set of general data capabilities that should
encompass all the required data competencies and that can be used to measure the organisation’s
overall data process maturity.
Data Governance Capability Model – these apply maturity to the subset of data capabilities
relating to data governance.
Data Stewardship Capability Model – these apply to the further subset of data governance
processes relating to data stewardship – the fitness, quality and usability of data and its metadata.
Data Analytics Maturity Model – these apply to the subset of processes that apply to data
analytics activities.
Big Data Maturity Model – these apply to the subset of processes that apply to big data and by
association data analytics activities.
The following table lists some of the maturity models currently available in these areas. Maturity models
come and go with great regularity in the data domain. There are a large number of obsolete and
unmaintained data maturity models. Many of the models are developed by vendors who use them to sell
their products and services rather than the models being independent assessments of actual and relevant
organisation data maturity.
Such maturity models may be useful for specific assessment engagements. But in terms of the overall
data landscape a considerably simpler approach is needed. The maturity models listed above are all quite
different, have different areas of focus and are quite detailed, yet they do not cover the full scope of
the data landscape and the processes required to support and operate it.
The set of data capabilities listed on page 56 could form the basis of a maturity model. This could then
be used to create a data landscape view of data capability process health using a simple traffic light
display as shown in the following diagram.
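One possible way to derive such a traffic light status is sketched below. It assumes each data capability has been given a single rating from 1 to 5. The thresholds and the sample ratings are assumptions made for illustration, not values defined in this document.

```python
def traffic_light(score: int) -> str:
    """Map a 1-5 capability rating to a display colour (thresholds assumed)."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    if score <= 2:
        return "red"
    if score == 3:
        return "amber"
    return "green"


# Hypothetical ratings for three of the data capabilities.
ratings = {
    "Data Governance": 2,
    "Data Security Management": 4,
    "Data Operations Management": 3,
}

# Colour per capability, ready to be displayed on the landscape view.
health = {capability: traffic_light(score) for capability, score in ratings.items()}
```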
The data entities within the organisation data landscape, grouped into applications, are used to operate
business processes. The data that flows into and out of the data entities is used by these business
processes.
Simplistically, business processes and their interactions with data entities can be represented as follows:
2. Business applications are used to assist with the performance of these tasks. Data is entered into
those applications, it is modified, new data is generated, data is output and some or all of this data is
stored.
3. The business applications consist of sets of data entities that combine to comprise those applications.
In the same way as individual entities can be grouped to comprise applications or services as described
on page 49, the business processes associated with data entities could be defined. This would then allow a
business process view of data entities to be created.
Within the data landscape model, such business process information could be useful but it strays from
the core purpose of understanding the operation and use of data at a high-level within the organisation
and to plan for future changes.
These are some notes on the relationship between the landscape data model and the EDM. The
landscape data model is concerned with tangible data entity assets rather than describing the contents
and detail of the data that they process. It exists at a higher level than and is not an Enterprise Data
Model (EDM). The EDM is concerned with the content of the data being processed by the organisation,
organised by data subject and business function. The EDM is both an abstraction from the more
tangible nature of the data landscape model as well as containing a far greater level of data content and
processing detail.
The following shows a sample Subject Area Model with some of the relationships between data subjects
shown. For example, an organisation may store information on customers. This logical and aggregated
set of customer information can be stored in many different data entities and processed in different ways.
Together, all of this information represents the organisation’s complete (or incomplete and inconsistent
or with multiple redundant data) view of the customer. Customer data is a subject within the
organisation’s subject area model.
There are different types of data with different processing and other requirements that move around the
data landscape. In some cases, it would be useful to be able to superimpose a subject area model view on
the data landscape in order to identify the data entities involved in processing different data subjects.
This would be worthwhile when using the data landscape model to assess compliance with the many
data and data-related regulations from data protection (such as GDPR) to money laundering to data
aggregation and risk reporting.
The data landscape model could be further extended to associate key data subjects with data entities.
This could then be used to identify those data entities that need to be examined in greater detail in
relation to a specific data subject.
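The association between data subjects and data entities could be sketched as follows. The entity names and the data subjects assigned to them are hypothetical and serve only to show the lookup.

```python
# Hypothetical association of data entities with the data subjects they process.
entity_subjects = {
    "CRM Application": {"Customer", "Order"},
    "Financial System Database": {"Customer", "Invoice"},
    "Data Warehouse": {"Order", "Invoice"},
}


def entities_processing(mapping, subject):
    """Return the data entities that process the given data subject."""
    return sorted(
        entity for entity, subjects in mapping.items() if subject in subjects
    )


# Entities to examine in greater detail for, say, a data protection review.
customer_entities = entities_processing(entity_subjects, "Customer")
```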
The following diagram shows the sample data landscape described on page 72 overlaid with some of the
subject area model data concepts.
Figure 48 – Sample Data Landscape with Overlaid Subject Area Model Data
This section contains some examples on the use of the data landscape model to examine data changes.
The sample data landscape that is being used for these use cases is shown in the diagram below.
This simplified environment contains the following data entities. Their data entity types are based on
the list contained on page 16.
For ease of use, this sample data landscape excludes any infrastructural data entity types.
The data entities in the sample landscape diagrams are represented as follows:
This sample data landscape will be used to illustrate how the approach can be applied to a number of use
cases:
Move Data Entities to the Cloud below – move one or more data entities such as a line of business
application to a cloud and implement the associated application and data access infrastructure.
Implement Cloud-based Data Analytics Capability on page 77 – use the analytics services of a
cloud service provider and implement the associated data access and application scheduling
capabilities.
Move Test Environments to the Cloud on page 79 – move test versions of business
applications to the cloud.
The following diagram shows the changed data landscape. In this example, the CRM application is
moved to a cloud service. This can involve either migrating the existing application or re-implementing
the function of the existing solution on a cloud-based platform.
This change involves implementing a cloud-based data analytics service where requests for specific
analytics processing are sent to cloud-based targets and the results transmitted back to the organisation.
In this scenario, test environments are moved to the cloud in the form of Infrastructure as a Service.
The following diagram shows a simplistic view of this. The precise set of application test environments to
be moved to a cloud service provider would need to be defined: integration and system test, user
acceptance and operations acceptance test. The test environments would need to mirror their production
targets to make testing realistic. The topic of moving test data to the cloud-based test environments
would need to be considered and an approach defined.
Outsource IT Infrastructure
This scenario examines the changes to the data landscape associated with moving applications to an
outsourcing arrangement where the infrastructure is located in a facility provided by the outsourcer.
The same inter-application interfaces must be maintained. Additionally, interfaces between the
outsourced data entities and the residual data entities located within the organisation must be
implemented.
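The interfaces that would need to span the two locations can be identified from the model. The sketch below is illustrative only: the zone names, entity names and interfaces are invented, and the helper functions move and cross_zone_interfaces are hypothetical.

```python
# Hypothetical current state: every data entity is in the organisation's own zone.
zones = {
    "CRM Application": "On-Premises",
    "Financial System Database": "On-Premises",
    "Data Warehouse": "On-Premises",
}

# Interfaces as (source entity, target entity) pairs.
interfaces = [
    ("CRM Application", "Financial System Database"),
    ("Financial System Database", "Data Warehouse"),
]


def move(current_zones, entity, new_zone):
    """Return a copy of the zone assignments with one entity relocated."""
    updated = dict(current_zones)
    updated[entity] = new_zone
    return updated


def cross_zone_interfaces(zone_map, links):
    """Return the interfaces whose two endpoints sit in different zones."""
    return [(s, t) for s, t in links if zone_map[s] != zone_map[t]]


before = cross_zone_interfaces(zones, interfaces)
outsourced = move(zones, "CRM Application", "Outsourcer Facility")
after = cross_zone_interfaces(outsourced, interfaces)
```

Comparing the two results shows which existing interfaces become links between the outsourced and residual data entities. The original zone assignments are left unchanged so that several scenarios can be compared.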
A set of security, access and authentication facilities must be implemented to allow secure connectivity
between the outsourcing facility and the organisation’s offices.
In this example, the organisation implements an external backup and recovery as a service arrangement.
Backup data is collected from on-premises applications and sent to the off-site service from where it can be
recovered.
In this sample scenario, the organisation implements a disaster recovery as a service facility. The disaster
recovery service provider maintains a set of target data entities in a state of readiness so they can be
brought into production use within an agreed time. Data is replicated from the production data entities
to these disaster recovery targets.
In addition, there is an extra data zone Work Area Recovery Facility. This is intended to represent the
location where organisation employees would be located in the event that the main location(s) were not
available.
For more information, please contact:
http://ie.linkedin.com/in/alanmcsweeney