KVH2.0 Application Understanding Document VMaster v2.0 Edit
Document Control
Revision History
Distribution
Approval Signatures
Table of Contents
1. BUSINESS
1.1. BUSINESS OVERVIEW
1.2. BUSINESS PROCESSES
2. FUNCTIONAL OVERVIEW
2.1. ARCHITECTURAL AND FUNCTIONAL OVERVIEW
3. TECHNOLOGY OVERVIEW & COMPONENTS
3.1. TECHNICAL OVERVIEW AND COMPONENTS OF KVH 2.0
3.2. ENVIRONMENTS
3.3. DIHUB WALKTHROUGH
3.4. DIHUB INTERFACES (PRIME)
3.5. DIHUB INTERFACES (HR)
4. LOGICAL AND PHYSICAL MODEL
OBJECTIVES
The purpose of this document is to record all the knowledge and information captured during the knowledge transfer sessions and job shadow/reverse job shadow phases of the knowledge transition, held between 18th March 2025 and 18th April 2025. This includes all the details and topics listed in the “knowledge objects” column of the associated Knowledge Acquisition Plan (KAP).
This document is to be used as a reference point for the Capgemini knowledge recipient once the sessions have been completed; as such, it does not replace application/system documentation or any existing application handbooks/manuals. For completeness, this document has been reviewed by the respective SME for accuracy and correctness.
For preparation, please examine all existing information from the application data list, transfer it into this document and review it together with the Cluster Lead of DNB (KVH 2.0).
The KT approach is defined and documented in the <share point folder name>.
The templates will be maintained and optimized during the KT for AMS, so please always use the newest template stored in SharePoint.
1. BUSINESS
1.1.BUSINESS OVERVIEW
KVH 2.0 is an Enterprise Data Warehouse (EDW). Its business services and business users are grouped under one or more business capabilities, and the services are orderable by users and internal or external customers via the Service Catalog.
Description: Provisioning the corporate data warehouse as a
trustworthy source of information for the DNB Group.
Mission: Run and maintain the enterprise data warehouses and
support key processes in DNB.
Stakeholders:
Business Owner: PDI (Product Data and Innovation) Data
division (Isabel Barroso-Gomez)
IT Owner: T&S EDW EDM tech. family (Kathrin Volkmer)
Users: DNB group wide
1.2.BUSINESS PROCESSES
The business offerings are built on platforms such as IBM Information Server (ETL platform), the Teradata platform, core infrastructure (Windows, Linux, DB2, MQ) and cloud (Azure, AWS).
Applications include KVH2 batch, DI Hub, Basel Calc batch, RDM and DM6.
The service offerings are the KVH2.0 data warehouse, selected Teradata apartments, and applications such as profitability reporting, international tax reporting, audit reports, etc.
2. Functional Overview
The data sources (left part of the figure above) consist of different tables such as Transaction, Organization, Scoring, Event, Case, etc.
The types of data are mostly retail, wealth and corporate banking data.
The other components which are necessary for KVH2.0 are ETL, DI Hub,
RDM, Erwin, and Teradata.
DI Hub and RDM together process and load data into 1300 KVH 2.0 Teradata tables.
Data Integration (DI) Hub is the unit that contains all the jobs and the principles for how the jobs are designed and executed.
Party data flows from the IAP (Inbound Access Point) through the inflow (where the DataStage jobs run) to the OAP (Outbound Access Point).
Party data (customer data) is the same in KVH2.0 and KVH1.0, whereas agreement data comes from different sources in KVH2.0 and KVH1.0.
The data is delivered from DI Hub into the load table for the load procedure.
Control is then handed over from the DI Hub event model to the KVH2.0 Smart Admin event model.
Then the Smart Admin runs the Flush Procedure which updates the target
table.
Then we have multiple views in KVH2.0.
Authorization tools authorize all the tables that run on Teradata.
Every database object which exposes data must be classified with a data
authorization category.
Every database object must be categorized with subject area and row level
access control setting.
Every column must be classified with a column category for masking.
All Data has relationship to time (has the timestamp column, which is
automatically added by the metadata tool i.e., Automation Studio).
All data shall have a defined lifecycle policy based on the timeline.
Referential integrity rules ensure the consistency and validation of the data (-1 = unknown or empty, -2 = invalid, -3 = unknown non-empty, by approval).
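A minimal sketch of how these referential-integrity default codes could be applied during a lookup; the function and the example country codes are illustrative assumptions, not the actual load procedure:

# Hypothetical sketch of the KVH2.0 referential-integrity default codes:
# -1 = unknown/empty, -2 = invalid, -3 = unknown non-empty (by approval).
UNKNOWN_OR_EMPTY = -1
INVALID = -2
UNKNOWN_NON_EMPTY_APPROVED = -3

def resolve_reference(code, valid_codes, approved_unknowns=frozenset()):
    """Map an incoming code to itself or to one of the RI default codes."""
    if code is None or code == "":
        return UNKNOWN_OR_EMPTY
    if code in valid_codes:
        return code
    if code in approved_unknowns:
        return UNKNOWN_NON_EMPTY_APPROVED
    return INVALID

print(resolve_reference("NO", {"NO", "SE", "DK"}))   # -> NO (known code)
print(resolve_reference("XX", {"NO", "SE", "DK"}))   # -> -2 (invalid)
print(resolve_reference(None, {"NO", "SE", "DK"}))   # -> -1 (empty)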
Physical data model shall be optimized.
Everything in KVH2.0 is always created in the SDM (Erwin model) first; we update the model, then update the metadata in Automation Studio.
Loading of the EDW is an automated process.
Manipulation of data in the production database is not allowed.
EDW is built on a three-schema concept: the physical layer (tables), the conceptual level (simple views) and the external level (application views).
KVH2.0 uses natural keys.
We have no direct access to the tables; instead, a trivial one-to-one view (P_Wt) is created for each table. The top layer consists of views that add a layer of security on top of the middle layer (P_WtSec). The tables are organized by environment, for example Production (P) or EDW.
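A small sketch of how the layered object names could be derived for one table under this three-schema concept; the P_Wt/P_WtSec prefixes come from the text above, while the helper, the qualifier style and the example table name are assumptions:

# Hypothetical helper that derives the layered objects for one KVH2.0 table,
# following the three-schema concept: physical table -> one-to-one view (P_Wt)
# -> secured application view (P_WtSec).
def layered_objects(table_name: str) -> dict:
    return {
        "physical_table": table_name,              # physical layer (tables)
        "simple_view": f"P_Wt.{table_name}",       # conceptual level, one-to-one view
        "secured_view": f"P_WtSec.{table_name}",   # external level, adds security
    }

print(layered_objects("FINANCIAL_EVENT"))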
KVH2.0 Overview:
Authorization: used for checking all the necessary access rights for the data.
Smart Admin: an event-driven component for orchestrating procedures and for monitoring.
Automation Studio: Based on the information model (SDM), produces DDLs
to be deployed in Teradata.
Teradata platform: for data sourcing and writing.
c) RDM:
Reference Data Management Hub (RDM) is used for reference data maintenance; the data can later be accessed for lookups.
It mainly consists of all the required master data for lookups. (To be decommissioned soon.)
Design features of RDM:
It is known for its governance and stewardship of hundreds of master sets,
and thousands of source sets and mappings.
Providing metadata maintenance, versioning, policy management and
workflow support.
The current architecture consists of IBM RDM, where the reference data is sourced from a DB2 database.
DI Hub processes the ETL jobs in which the master/transaction data and the lookup data are compared, populating Redis in the background using a custom (C++) operator.
d) I-Know Tool:
It is a scheduling and monitoring tool used to monitor ETL runs and process management, to add dependencies on upstreams and downstreams, and to view logs.
e) Automation Studio:
We have two types of Automation Studio access: one via the onsite VDI, and another as a Windows 11 package under the VDI.
The application on the left side of the figure talks to a web service, with the Windows server in the middle. We have two servers, which handle different environments.
The load table, the load procedure, the flush procedure that moves data from the load table into the target, and all the views that expose the data to end users and applications are all created by Automation Studio using templates.
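A rough illustration of the template idea, where the same pattern is applied to every table in the warehouse; the template strings and the example database/table names below are invented for illustration and are not the actual Automation Studio templates:

# Illustrative sketch of template-based object generation: the same templates
# are applied to every table, in the spirit of Automation Studio.
LOAD_TABLE_TEMPLATE = "CREATE TABLE {db}_LOAD.{table} AS {db}.{table} WITH NO DATA;"
SIMPLE_VIEW_TEMPLATE = "CREATE VIEW P_Wt.{table} AS SELECT * FROM {db}.{table};"
SECURED_VIEW_TEMPLATE = "CREATE VIEW P_WtSec.{table} AS SELECT * FROM P_Wt.{table};"

def generate_ddl(db: str, table: str) -> list:
    """Generate the standard set of DDL statements for one warehouse table."""
    return [
        LOAD_TABLE_TEMPLATE.format(db=db, table=table),
        SIMPLE_VIEW_TEMPLATE.format(db=db, table=table),
        SECURED_VIEW_TEMPLATE.format(db=db, table=table),
    ]

for stmt in generate_ddl("P_EDW", "FINANCIAL_EVENT"):
    print(stmt)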
3.2. ENVIRONMENTS
PROD
The Confluence page with the server details for Dev/Test/Prod is given below:
Server Mappings - P-9159: Datastage ETL Upgrade - DNB
Confluence
SIT (Test): the same server, but with two projects, one for SIT and the other for UAT.
Note:
vip 1: used for kvh1
vip2: used for kvh2
Prod:
Prod1: kvh1
Prod2: kvh2
(a) Datastage:
Datastage Folder Structure according to DI Hub Job Flow:
IAP
The initial jobs are staging jobs, where the initial data is loaded and datasets are created. The DataStage jobs read the source, which can be input files, third-party database tables or Teradata itself, and load it into the staging area (datasets), which acts as the source for the inflow DataStage jobs.
INFLOWS:
Most of the major transformations happen here in DataStage, as shown below. The inflow jobs also generate datasets, which are the source for the OAP DataStage jobs.
Note: "initialize current" files are used for the delta logic.
S-Test
A-Test
f) RDM: dev/test/prod
Source Sets: contain key and mapping sets (together with the master sets).
MAP_Currency_0238: the name starts with MAP, followed by the RDM set name (_Currency here) and the source (_0238 here).
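A tiny sketch of this naming convention; the helper functions are hypothetical and only illustrate how the name MAP_<MasterSetName>_<SourceCode> is composed and decomposed:

# Hypothetical helpers around the mapping-set naming convention
# MAP_<MasterSetName>_<SourceCode>, e.g. MAP_Currency_0238.
def mapping_set_name(master_set: str, source_code: str) -> str:
    return f"MAP_{master_set}_{source_code}"

def parse_mapping_set_name(name: str):
    assert name.startswith("MAP_"), "mapping set names must start with MAP_"
    master_set, source_code = name[len("MAP_"):].rsplit("_", 1)
    return master_set, source_code

print(mapping_set_name("Currency", "0238"))         # MAP_Currency_0238
print(parse_mapping_set_name("MAP_Currency_0238"))  # ('Currency', '0238')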
g) Linux Server
Common Path : /dnb/prosess/InformationServer/
ATest: UAT Env
STest: SIT Env
Dev: Dev Env
Prod: Prod Env
NR1219: This is the path where all incoming and outgoing files will be available
The figure above shows the path where all the incoming files land.
This is the path where the file dispatcher runs to process the files; when the event handler runs, it triggers the DataStage jobs.
The MFT team, a third-party system within DNB, fetches the files from the above path and delivers them to the target team. After successful delivery they put a copy of the files in the data archive path (mentioned below).
Every execution of a DataStage job gets its own dedicated parcel directory, and all the HIR and source files are created under that dedicated parcel directory.
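A small sketch of how a per-execution parcel directory might be built and used; the base path fragment is the common path quoted above, while the directory layout, job name and helper function are assumptions made for illustration:

from pathlib import Path
from datetime import datetime

# Hypothetical sketch: build a dedicated parcel directory for one DataStage job
# execution; the HIR and source files for that run live underneath it.
BASE = Path("/dnb/prosess/InformationServer")  # common path from the section above

def parcel_dir(env: str, job_name: str, parcel_id: str) -> Path:
    # The directory naming below is illustrative; the real layout may differ.
    return BASE / env / "parcels" / job_name / parcel_id

run = parcel_dir("Prod", "IAP_PRIME_TRANSACTION", datetime.now().strftime("%Y%m%d%H%M%S"))
print(run / "HIR")     # HIR files for this execution
print(run / "source")  # source files for this execution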
Configuration file: this is the most important file in an event-based system.
Configuration file for the KVH job: it is not fired through a file-arrival event.
g. Automation Studio:
It is an automatic deployment tool for deploying DDLs, views and stored procedures in the respective environments. It works on the prod server or in the dev/test lab and has been decommissioned for Dev.
For code generation, 95% of KVH2 is based on templates, with the same structure for every table in the data warehouse.
Note: if you face any error while connecting to it, contact the DAB team.
Open EAT: used for deploying any new table. Provide the Erwin model as input and it performs the deployment automatically, generating unique names for tables, columns and indexes.
Configure tables: provides different authorization settings at the table level, column level, etc., as shown below:
DI Hub sends the data to the queue table, which is in the MDS extension database. Smart Admin then instantiates the jobs: it collects statistics on the load table, runs the flush and does the cleanup, after which the flush-done event is completed.
This is where the specific actions are carried out and triggered, jobs wait for other jobs to complete, etc.
One special feature of Smart Admin is that the jobs within a job group run serially, while two or more different job groups run in parallel.
Note: the main job group queue can run 5 jobs serially.
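A minimal sketch of that scheduling rule (serial within a job group, parallel across job groups); this is purely illustrative, the group and job names are made up, and it is not Smart Admin's implementation:

import threading
import time

# Illustrative only: run jobs serially within each job group, while the job
# groups themselves run in parallel, as described for Smart Admin above.
def run_group(group_name: str, jobs: list) -> None:
    for job in jobs:             # serial within the group
        print(f"{group_name}: running {job}")
        time.sleep(0.1)          # stand-in for the real job work

job_groups = {
    "main":  ["collect_stats", "flush", "cleanup"],  # hypothetical job names
    "other": ["export_cmm", "export_hr"],
}

threads = [threading.Thread(target=run_group, args=(name, jobs))
           for name, jobs in job_groups.items()]     # parallel across groups
for t in threads:
    t.start()
for t in threads:
    t.join()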
k). VedApp:
Mostly used by KVH1, but sometimes used in KVH2 for copying small amounts of data between environments.
Architecture
All folders are created under the area that we will be working with.
In the figure above, we have created a folder under DataArea; inside InboundAccessPoints we have folders based on the different sources.
IAP SEQUENCER:
All IAP sequencer jobs and OAP jobs must be multi-sequencer jobs so that they can be handled by the event handler.
IAP Job:
Details regarding the mapping can be found in the inbound mapping specification sheet.
Example:
While running the IAP sequencer, we run the PARTY HIR-to-HCR job first and then the rest of the HIR-to-HCR jobs, to avoid FELIX being called with multiple requests at the same time for the same customer. In other words, we first process the customer, so that all the details for the new customer are available in the cache, and then everything else can be executed subsequently.
In the snippet above, we have set the event dependency in such a way that the Agreement job is executed once the Party area has been processed.
INFLOW JOBS:
Developed using I-Know utilities with custom-built stages written in C++.
There are multiple custom-built stages developed in DNB:
1. RDM
2. GKM Lookup
3. Felix Stage
Every OAP job in KVH2 is a bulk load. In a bulk load we drop all the indexes on the WORK table and delete any data for the PARCEL_ID that is already present. This ensures that our OAP jobs are restartable in case of a partial load in a previous run (shown below).
In the After SQL, we call the FLUSH procedure to load the target table from the load table.
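A rough sketch of that restartable bulk-load sequence; the SQL strings, the index and procedure names and the helper function are assumptions for illustration, not the actual job definitions:

# Illustrative sketch of the restartable OAP bulk-load pattern: drop the work
# table indexes, remove rows already loaded for this PARCEL_ID, bulk load, then
# call the flush procedure in the "After SQL" step.
def oap_bulk_load(run_sql, work_table: str, parcel_id: str, rows) -> None:
    # Before SQL: make the run restartable.
    run_sql(f"DROP INDEX idx_{work_table} ON {work_table};")          # hypothetical index name
    run_sql(f"DELETE FROM {work_table} WHERE PARCEL_ID = '{parcel_id}';")

    # Bulk load the parcel into the load/work table.
    for row in rows:
        run_sql(f"INSERT INTO {work_table} VALUES {row};")

    # After SQL: move the data from the load table into the target table.
    run_sql(f"CALL FLUSH_{work_table}();")                            # hypothetical procedure name

# Usage: pass in whatever executes SQL in your environment (print used here).
oap_bulk_load(print, "LOAD_FINANCIAL_EVENT", "20250418_001", [("2025-04-18", 100.0)])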
3.4. DIHUB INTERFACES (PRIME)
Some of the observations and exports are delivered based on Prime.
In DI Hub there are two kinds of export. In the first kind we act as a mediator and work as a data provider to the downstream application: we deliver pure source data downstream and do not intend to change any data.
One export that we give to the Transaction team is purely based on the data we receive from Prime.
These exports are derived directly from the IAP stage; we then trigger an event, which triggers the export job and delivers the data to the downstream application.
An example is eFraud, where we directly give the flex data to the downstream application.
This means that we do not use KVH data to deliver the majority of our exports.
In other cases we deliver data from KVH, and sometimes we share views built on top of the KVH2 tables, through which the data is eventually delivered to the downstream application.
Note that Prime does not have any system of its own to deliver data to the IPA transaction lake.
A transaction is a financial event; that is why we call it a financial event, and you will not see the word "transaction" in the KVH2.0 tables. Whenever we see a financial event, we can understand that it is a transaction.
Money Transaction is an IAP sequence for the transaction job event. We trigger this event for the export, so it is a simple process: there is no inflow or any other DI Hub-specific transformation, we simply read the data from the source, create an event, and that event delivers the data. We still use DI Hub principles for extracting the data from the source.
In the screenshot below, the dataset is created directly by reading data from the source, and the data is then extracted by triggering an event once the event has been created.
In the HIR implementation, the columns are what we receive from Prime and send downstream to the target; we are not responsible for any data-related issues.
In the design flow in the screenshot above, we create a dataset in the IAP job and then create the event; that event triggers the DIHub post-dataset job, which is eventually executed. This is a file-arrival event, meaning that files are to be transferred.
In the screenshot above we can see the job name (DIHub post dataset), the invocation ID (the Prime transaction event) and, under parameters, the schema name (Hir_Transaction_event), the data provider code, the business date, etc., which are passed to the downstream event by the OAP job.
Incoming files do not arrive directly from the source, and in the same way outgoing files are not sent directly to the target; there are intermediaries in between, such as the MFTS systems (downstream to RITM and RITM to MFTS). AutoSys is also involved: if the delivery is within DNB it is taken care of by AutoSys, and if it is external then MFTS gets involved. MFTS puts the files in a specific standardized path and delivers them to the downstream application.
Refer to the Confluence page below for more information about the CMM file formatter.
Export job:
For CMM exports there are two kinds: if the source has specific requirements we can create a custom report, and the other kind is generated from KVH2. Sometimes the export is scheduled time-based: at that particular time a Smart Admin event is fired, our EDW dispatcher reads that event and triggers the export job. Otherwise it is scheduled through Smart Admin on a flush-event basis.
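A minimal sketch of the two trigger paths just described (time-scheduled vs. flush-event based); the event structure, type names and export names below are purely illustrative:

# Illustrative sketch of the two export triggers: a time-based Smart Admin event
# or a flush-done event, both read by a dispatcher that starts the export job.
def start_export_job(export_name: str) -> None:
    print(f"starting export job: {export_name}")

def edw_dispatcher(event: dict) -> None:
    if event["type"] == "SCHEDULED_TIME":    # time-based Smart Admin event
        start_export_job(event["export"])
    elif event["type"] == "FLUSH_DONE":      # flush-event based trigger
        start_export_job(event["export"])

edw_dispatcher({"type": "SCHEDULED_TIME", "export": "CMM_CUSTOM_REPORT"})
edw_dispatcher({"type": "FLUSH_DONE", "export": "CMM_KVH2_EXPORT"})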
DIHUB HR contains sensitive data related to associates and employees.
We have migrated all the jobs related to associates and employees to this DIHub HR, where access is restricted.
Only KVH2 and the platform team have access; it is restricted using a general group set up with the help of the EU support team, under which all our AB IDs are placed.
Whoever joins specifically for KVH2.0 gets access to DI HUB and our environment. This restricted access covers three applications: DataStage, Linux and AutoSys.
DIHUB_HR has paths similar to DIHUB; the only difference is "prod HR" instead of "prod", as in the screenshot below.
So the different paths are:
Config file: where all the DIHUB HR related configuration files reside.
Source landing: where all the HR source files land.
There are also the target archive, target, object restore and shell script paths.
If we open AutoSys, we can see DIHUB and DIHUB HR together under views, with the related jobs and their status.
Under Views > DIHUB-1219 we can see the DIHUB-related dispatchers below; the first half is DIHUB and the second half is DIHUB HR.
In DIHUB HR there are mainly four categories, as observed in the screenshot below, such as data areas.
Data areas:
The data sources related to associate data, for example contact center data, HR data and team jobs.
Payroll:
Associate/employee data, for which we have inbound access point (IAP) jobs.
Contact center:
The contact center has around 650+ jobs that we have migrated from DI Hub to DIHUB HR, while the other areas have around 10 jobs. The majority of the files are CMM-related files and the export files that we deliver to the downstream.
DB2 is important not only for processing the sources, but also for processing all the metadata and the utility-related data.
Several custom procedures and views are created at the DB2 level (for example, for people-related data).
MDM DB1 will no longer be required, as we are about to move out of RDM by 2025.
The data needed for custom development, apart from the regular RDM data residing in MDM DB1, will be migrated to IM DB1. (Currently both are in use.)
Tables in DB2:
Finance ODS (1602) data warehouse: this is the finance data warehouse, maintained by the DXC team.
Mercur (4010) and Origo (4027) are the data stores from which DNB gets the data.
We simply provide the data from these sources to the RCM team for reporting.
Note: we use an ODBC connection for this database, in SQL Assistant.
The IBM RDM application is a web-based application that we use; it is maintained by the platform team (the 110 routine of DNB).
As part of the development activity, the EDM and KVH2 teams manage this code and all the environments.
The process below applies whenever we start a project that has a requirement to add RDM lookups.
Example:
If we are working on source data or a source file that contains country codes or other codes, our data modeler or architect creates the IAP mapping specification. It has an RDM tab, where they fill in the information and order the reference codes.
Taking the Kodeverk process as an example: we get the business request from Jan-Egil for the RDM and EDM teams, and also for other tech families such as AML, FEC and RCM. The data modeler acts on the request and completes it according to the standards; it should go through the Kodeverk mail chain, which they keep updated.
They order or update the master set request on SharePoint and then send us a mail on the Kodeverk event, and from there we take it forward.
Maintaining these master sets is the Kodeverk responsibility in all environments, both prod and test.
In the test environment the source sets and mappings are usually taken care of by the project team (in the lower environments), but in prod the platform team takes care of them.
So when you raise the CRQ, you add an implementation task for the platform team and we should guide them on what they need to do; they follow the same steps for prod as well.
Master sets go through Kodeverk only.
The SharePoint link below is used as a reference for how to order the master set code and reference set name:
Test RDM Link - IBM Reference Data Management Hub
As a developer you do not need to do anything, but as the Kodeverk team we should guide them through the steps.
Taking the latest master data set order as an example, it contains the duration type, the content type, the order type (e.g. new master set), the reference set name, the project ID, the name of the person who ordered it, and the different process statuses, under which we can choose prod, A-Test, etc. They keep these updated.
Each and every master set that is modelled in RDM is also modelled in KVH 2, using the same process we follow for our other tables. For RDM we go through the same steps: SIM, SDM, PDM, EAT and Automation Studio.
They also update whether the SIM modelling is done or not, and they can add an e-mail address; these are not mandatory fields, but it is good to have as much information as possible. This is provided in an Excel sheet like the one below.
They have their own format that they attach. Basically, for RDM we need the code, i.e. the master code that will be mastered in the EDW.
Then we need the name of that code; we save an English name, an English description, a Norwegian name and a Norwegian description (for example for a "one month" value in a frequency set).
Another important column is the sort order: if we add any value, the order will change.
We also have the default values -1, -2 and -3, but -3 is not used nowadays; it does not apply when the RDM lookup actually happens through the inflow. -1 and -2 are important: -1 is used whenever null values come from the source, and -2 is for unknown values, meaning some other value arrives instead of the actual one, so RDM cannot resolve those values (whenever sources send values that are not defined).
Master tab: the master tab holds all the information about the code.
Meta tab:
This is also updated by whoever orders the set; they give the description of the code, the Norwegian description, the effective date and the owner of that particular set.
We do not get requests for source sets; we only get requests for master sets here. Source sets and mappings are maintained in our mapping specification, and from there the developer or project team (whoever is working on the project) takes them and updates them in RDM.
The UI is shown in the screenshot below. It has several different tabs; the main one we want to look into right now is "01 master sets", which contains all the master sets that are mastered in the EDW.
For example, if you look at AML_agreement_country you will see different versions of it. This AML agreement country might have been first mastered in period 201906, for example. There is a naming standard that we follow: the period followed by an underscore and a running number. It starts with _01, and if you get multiple requests in the same period it goes _02, _03, and you always have to look at the approved version, as in the screenshot below:
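A tiny sketch of that version-naming rule (period, underscore, running number); the helper below is illustrative only:

# Illustrative helper for the master-set version naming standard described above:
# <period>_<running number>, starting at _01 and incremented for further
# requests within the same period, e.g. 202504_01, 202504_02, ...
def next_version(period: str, existing_versions: list) -> str:
    in_period = [v for v in existing_versions if v.startswith(period + "_")]
    return f"{period}_{len(in_period) + 1:02d}"

print(next_version("202504", []))                          # 202504_01
print(next_version("202504", ["202504_01"]))               # 202504_02
print(next_version("202505", ["202504_01", "202504_02"]))  # 202505_01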
If you click on this you will see the different sets; there are 368 values mastered for this AML agreement country. If you click on one of them, you will see the code, name and description, some other columns, and the translations: the Norwegian name and Norwegian description, which we take from the Excel sheet.
We have one more dataset, the General set; it contains only the elimination of intercompany, as in the screenshot below:
This is the email we receive every month, shown in the screenshot below, and we update the set accordingly. As you can see, people from FVCC are copied on it. This particular case involves the elimination of intercompany. Every month we receive a request from this person; he updates some information, and then we have to retire the old value. This is an exception case where he is the only one involved.
KVH2.0 sets:
We have some KVH and Housekeeping columns for which we perform lookups. For those, we also have two
sets of KVH data.
Source set:
specific sources related to our source data. As we know, we receive data from many sources, and each source
has its own folder.
Whenever you create a source set, always follow the naming convention shown below: the master set name, underscore, the source set code.
We have two kinds of mappings: copy mappings and independent mappings.
1. Copy Mappings: the codes are copied directly from the source and inserted into the target without any change. If the data is already mastered in the system (like KVH 2), you are basically just copying the same values into the target, and the mapping is left blank because no transformation or changes are needed.
2. Independent Mappings: these involve some form of transformation, where the codes or values from the source are mapped to the target independently of the original source codes. This can involve changing formats, applying business rules or other transformations, so the mapping is not just a direct copy.
In the context of source 84 ODS, for example: if the codes from the source are already mastered in KVH 2 and follow the same structure, the result is blanks, because no transformation is needed and there is nothing to fill out in the target mapping; everything aligns directly (see the sketch below).
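A minimal sketch contrasting the two mapping kinds; the example codes, the lookup table and the -2 fallback are hypothetical and only illustrate the distinction:

# Illustrative contrast between the two RDM mapping kinds described above.
# Copy mapping: the source code is already the master code, so it passes through.
# Independent mapping: the source code is translated via an explicit lookup.
def copy_mapping(source_code: str) -> str:
    return source_code                      # direct copy, nothing to fill out

INDEPENDENT_MAP = {                         # hypothetical source -> master values
    "1M": "ONE_MONTH",
    "3M": "THREE_MONTHS",
}

def independent_mapping(source_code: str) -> str:
    # -2 = invalid/unknown, consistent with the RDM defaults described earlier.
    return INDEPENDENT_MAP.get(source_code, "-2")

print(copy_mapping("NOK"))          # NOK (already mastered)
print(independent_mapping("1M"))    # ONE_MONTH
print(independent_mapping("XX"))    # -2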
The IAP mapping specification sheet should be updated by your data modeler or data architect, depending on who is working on it. For example, once the sheet below has been updated, you will find an RDM tab in the mapping specification; this tab contains information on all the RDMs used for that particular mapping.
In the I-Know utility, custom templates are created for these RDM lookups. These templates are pre-configured to handle the mappings and transformations.
Template Setup:
The key point is that you don't need to worry about how the RDM lookups are set up, as they have already
been configured in the system using the custom templates.
When you're working with these templates, it’s important to focus on creating the necessary datasets.
Dataset Name:
When creating a dataset, you need to ensure that the dataset name is correct.
The dataset name refers to the master set or target set, whichever name you are using to define the dataset in your system.
The inflow job runs based on the configuration from the HIR/IAP, performing lookups on the master set and
validating against the mappings based on the data provider code.
The system will check the lookup for the business location RDM during the inflow job.
Based on this lookup, it will determine whether any values are missing or invalid in the data.
->Handling Null Values:
If the system finds null values in the lookup (for example, missing or invalid business location values), it maps those nulls to default values.
Default values such as -1 or -2 are used to handle these nulls, ensuring that the job can continue processing without errors.
->Error Logging:
If the job encounters an issue, such as being unable to find a specific RDM code, it will log the error.
The log will show:
The reason for the failure (e.g., missing data or incorrect mapping).
Which RDM code could not be found during the lookup.
The specific rows where the issue occurred, including the null values that were mapped to default codes.
->Debugging Failures:
If your job fails, you can refer to the log to see exactly what went wrong and which RDM code caused the issue.
This helps you identify any mapping problems or missing data that need to be addressed.
The lookup checks the business location RDM, handles null values by defaulting them to -1 or -2, and logs
any issues with specific RDM codes that couldn't be found, providing clear error messages for debugging.
Example below:
The system attempted to perform a lookup for agreement status_45364 with value 0 but couldn't find a
matching mapping in the RDM.
As a fallback, it returned -2 as the default value for this unmapped status code.
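A short sketch of that lookup-with-fallback behaviour, including the kind of log message described above; the mapping contents, the set name, the log format and the helper function are all assumptions:

import logging

logging.basicConfig(level=logging.WARNING, format="%(levelname)s %(message)s")

# Illustrative sketch of the inflow RDM lookup with -1/-2 fallbacks and error
# logging, following the behaviour described above. The mapping data is made up.
AGREEMENT_STATUS_45364 = {"1": "ACTIVE", "2": "CLOSED"}   # hypothetical mapping set

def lookup_cd(mapping: dict, set_name: str, value, row_id) -> str:
    if value is None or value == "":
        return "-1"                                       # null/empty from source
    if value not in mapping:
        logging.warning("RDM code not found in %s: value=%r row=%s -> -2",
                        set_name, value, row_id)
        return "-2"                                       # unknown/invalid value
    return mapping[value]

print(lookup_cd(AGREEMENT_STATUS_45364, "agreement_status_45364", "0", row_id=42))
# logs a warning and prints -2, as in the example above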
CD columns :
Always have nullable = NO.
The reason is RDM (Reference Data Management) maps any incoming NULL in CD columns to -1
automatically.
This ensures CD columns are always populated (never truly null), hence nullable = NO.
CD columns → nullable = NO
Code Source columns → nullable = YES
This is a standard RDM check to maintain consistency and avoid nulls in CD fields.
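As a quick illustration of that convention, here is a hypothetical metadata check (not an actual KVH2 utility) that flags CD columns declared as nullable; the column names are invented examples:

# Hypothetical metadata check for the nullability convention described above:
# *_CD columns must be NOT NULL (RDM maps incoming NULLs to -1), while the
# code-source columns may stay nullable.
columns = [
    {"name": "AGREEMENT_STATUS_CD", "nullable": False},
    {"name": "AGREEMENT_STATUS_CODE_SOURCE", "nullable": True},
]

for col in columns:
    if col["name"].endswith("_CD") and col["nullable"]:
        print(f"violation: {col['name']} must have nullable = NO")
    else:
        print(f"ok: {col['name']}")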
DB2:
o All operations (approve, update, etc.) are done through the UI.
If issues occur:
o In DMI (Data Management Interface), no manual updates are typically done on RDM tables.
o There is a useful reference page with all relevant RDM queries.
For more information about the DB2 SQL and tables, refer to the Confluence page below:
RDM - New reference data orders
RDM:
o This allows you to add, update, or modify entries while keeping the base intact.
o After importing and reviewing changes, the set goes into "Pending Approval" status.
For multi-key master sets, go to the "Administration" tab to define the additional keys.
3. Set Dates
Effective Date: Current or required date.
Review Date: Set to a date 2 days ahead (or per project standards).
5. Import Data
Prepare an Excel sheet with new or updated values.
Use Import option in RDM UI to load the data into the Draft version of the set.
Ensures that only one current version of the master set remains active.
o Type: M Default (for single key). For multiple keys, configure them under the Administration
tab.
Code → code
Name → name
Description → description
Translation Values:
Review Date → Set to 2 days ahead of the current date (used for testing/demo sets).
📝 These fields don’t appear by default, so you must enter them manually and then save.
Even if the data is different, the import process remains the same.
Click Finish.
o System message:
“Import is successful. 28 imports done. No updates. No errors.”
Confirm that all expected values are present and correctly mapped.
🔸 3. Request Approval
Once data is verified:
Choose the parent folder (e.g., E021) where the source set should be stored.
Description:
o Why? Because the source set only has one key (just like the master set).
o This allows for ongoing updates as new data comes in from the source system.
o Example: 202504_01
Description:
Once the source set is updated with data from your source file, database, or HIR, go to the Mapping
section, click “New”, and select the source set for which you're creating the mapping to link it with the master
set.
To create a mapping in RDM, go to the Mapping section, click "New", select the source set (e.g., TEST_E021) and the target set (e.g., TEST_KITTY), then name the mapping using the convention MAP_<SourceName>, set the version (e.g., 202504_01), choose the mapping type (Copy Mapping for direct values or Source to Independent for manual mapping), and finally enter the effective/expiration dates and a description, and save it. The mapping is now created and ready for manual value linking.
To manually map source values to master values, go to the created mapping, click “New”, then use the
option “Show only unmapped source values” to find values like One Month that aren’t yet mapped; select
the source value, choose the correct target (master) value, set the effective and expiration dates, and click
OK to complete the mapping.
o Letting too many people make changes could lead to errors or inconsistencies.
o These updates are often based on external requests, so careful handling is needed.
o Once you have access, you can try things in the test environment.
Main display
Privacy
Common and resolution
PAM
If we take the PAM, it is a small ER (entity-relationship) diagram of the CMM (Common Information Model). We should follow certain standards for the column names in the tables: names should be capitalized and contain no underscores. The entities have relationships such as one-to-one and one-to-many, hierarchies and data types.
All the data types are pre-defined and categorized as per DNB, so we should use only those standards.
We can navigate as shown below:
SDM:
SDM is the semantic data model; it is not the physical layer, but it is close to physical and acts as the bridge between the common information model and the physical model.
PDM:
Initially the DDL was created using the PDM and run in Teradata; future DDL deployments are done using Automation Studio.
Mapping specification:
There are different types of mapping specifications, such as:
inflow specification,
inbound specification,
outbound specification,
reference data specification,
source mapping specification, etc.
CIMX MODEL:
The CIMX model is a common model that is generated, and it can be used in two scenarios. One is when a new system arrives and wants to send data to the data warehouse; multiple systems can use the same model, so by using the same model they can all send data to us. This is where the CMM comes in.
Baselines: a baseline is a new setup, but we are not using these folders.
CIMX transformations:
Whenever we generate, all the steps are created locally, and these folders are used for the internal local runs.
Submodels:
Submodel configuration exists, but we are not using environment branches.
Once CIMX is generated, it contains the CMM schema, the CMM CSV file, the HCR CSV file and different folder structures.
Trainings:
Created for a couple of mock runs, like a full-file run, and for cases where we want to remove a particular attribute or modify the
5.1. REFERENCES
The monthly run is called the Kort run, and the link is attached below (will be covered by Simran and Kirthika):
KVH2.0 Monthly Kort Run - EDW Service Delivery - DNB Confluence
The 1219 page covers all the necessary topics, such as DI Hub and the target tables, and is attached below:
1219 DI Hub - DNB Confluence
CMM Guide:
CMM-Guide - IT Business Intelligence - DNB Confluence
KT Videos sessions:
Enterprise Data Management - KVH2 - All Documents
GitLab link:
Working with Gitlab - EDW Service Delivery - DNB Confluence
Shared drives:
VedApp is mostly used by KVH1, but is sometimes needed for KVH2.0; the shared drives are used for the batch run:
5.2.DOCUMENTS