
INTRODUCTION TO DATA HUB + DATA PUBLICATION

By: Nitin Vohra

What is Data Hub?

The Data Hub is a single-tenant system that enables end users to easily feed data from external systems into any hybris system.
It acts as a staging area where external data can be analyzed for errors and corrected before being fed into hybris.

Why Required?
It provides a service-oriented data integration solution that can lower implementation time and costs, because implementation partners no longer have to build manual data integrations.

Contd..
The complexity of the hybris data structure is hidden from the users who design the integration.
Data can be input in a raw format, with multiple rows per item in any order.
No knowledge of dependencies between input items is required to input data.
It is based on RESTful web services.

Overview

Workflow
1. Data Import
2. Composition
3. Publication
4. Data Inspection

1. Data Import
Spring Integration is used to create a new data feed, which is then used to load the data.
Localizable items have separate rows for each locale, with the ISO code specified.
These separate rows are combined, split, and deduplicated as necessary during internal processing in the Data Hub.
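As a hedged illustration, the following sketch posts two CSV rows for the same item in different locales to a Data Hub feed over its REST interface. The host, feed name (DEFAULT_FEED), raw type (RawProduct), and CSV columns are placeholder assumptions; the exact feed endpoint depends on your installation.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FeedUpload {
    public static void main(String[] args) throws Exception {
        // Two rows for the same item (code P-100), one per locale; the
        // isocode column carries the ISO code mentioned above.
        String csv = "code,name,isocode\n"
                   + "P-100,Blue Shirt,en\n"
                   + "P-100,Blaues Hemd,de\n";
        // Hypothetical feed name and raw type; adjust for your setup.
        String url = "http://localhost:8080/datahub-webapp/v1/data-feeds/DEFAULT_FEED/items/RawProduct";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Content-Type", "text/csv")
                .POST(HttpRequest.BodyPublishers.ofString(csv))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}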

2. Composition
Imported data is then composed into canonical items.
Canonical items represent a master data type view and are independent of the structure of target systems.
Composition is triggered manually by a RESTful POST to:

http://{host:port}/datahub-webapp/v1/pools/{poolName}/compositions

It transforms data from the Raw Model to a Canonical Model.
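For example, a minimal Java client that triggers composition for a pool named GLOBAL might look like this (host and pool name are placeholders; the endpoint is the one shown above):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CompositionTrigger {
    public static void main(String[] args) throws Exception {
        // Composition requires no request body; the POST alone starts it.
        String url = "http://localhost:8080/datahub-webapp/v1/pools/GLOBAL/compositions";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + ": " + response.body());
    }
}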
Raw Model
A simple, flattened view matching the inbound data; schema-less definitions; fragmented data.

Canonical Model
An ideal representation of a domain object, such as "customer"; independent of source and target structures.

3. Publication
The canonical items can then be published manually to a specific, defined target system.

POST - http://{host:port}/datahub-webapp/v1/pools/{poolName}/publications

The above is published with a content type of application/json, and a JSON structure such as the following contained in the request body:
{
  "targetSystemPublications": [
    {
      "targetSystemName": "TestCoreInstallation"
    }
  ]
}

4. Data Inspection
Data can be inspected at any stage of processing in the Data Hub.
A RESTful GET request to the following URL returns the history of all actions in the named pool (the response is available as JSON or XML):

GET - http://{host:port}/datahub-webapp/v1/pools/{poolName}/pool-history
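A minimal sketch of such an inspection call, assuming a local instance and a pool named GLOBAL; appending .json or .xml to the URL selects the response format:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PoolHistoryClient {
    public static void main(String[] args) throws Exception {
        // The .json suffix requests JSON; use .xml for an XML response.
        String url = "http://localhost:8080/datahub-webapp/v1/pools/GLOBAL/pool-history.json";
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder().uri(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}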

[Diagram: canonical items in the Data Hub are published to the target system]

Publication Phase
It is started by initiating a POST request to the following URL, with one or more system names specified in the targetSystemPublications metadata and the {poolName}:
POST - http://{host:port}/datahub-webapp/v1/pools/{poolName}/publications
{
"poolName": "GLOBAL",
"targetSystemPublications": [
{
"targetSystemName": "ITTestSystem"
},
{
"targetSystemName": "HybrisPlatform"
}
]
}

The publication phase runs asynchronously, and the POST request returns immediately, indicating that the publication action is IN_PROGRESS.
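As a sketch, the request above could be issued from Java as follows (host and target system names are placeholders):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PublicationTrigger {
    public static void main(String[] args) throws Exception {
        // Target system names must match TargetSystem definitions in the pool's metadata.
        String body = "{\"targetSystemPublications\":["
                + "{\"targetSystemName\":\"ITTestSystem\"},"
                + "{\"targetSystemName\":\"HybrisPlatform\"}]}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/datahub-webapp/v1/pools/GLOBAL/publications"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Returns immediately; the body reports the action as IN_PROGRESS.
        System.out.println(response.body());
    }
}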

Publication Status
It is up to the adapter to update the status of the publication after it is completed.
The status of the publication can be verified with a GET request to the following URL:
GET - http://{host:port}/datahub-webapp/v1/pools/{poolName}/publications/{actionId}
"actionId": 4,
"type": "PUBLICATION",
"startTime": 1399479311692,
"endTime": null,
"status": "IN_PROGRESS",
"poolName": "GLOBAL",
"targetSystemPublications": [
{
"publicationId": 5,
"startTime": 1399479311689,
"endTime": null,
"status": "IN_PROGRESS",
"targetSystemName":
"ITTestSystem",
"errorList": [

],
"actionId": 4

{
"publicationId": 6,
"startTime": 1399479311689,
"endTime": 1399479319685,
"status": "SUCCESS",
"targetSystemName": "HybrisPlatform",
"errorList": [

],
"actionId": 4
}
]
}
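Because the adapter updates the status asynchronously, a client typically polls this endpoint until the status leaves IN_PROGRESS. A minimal sketch, assuming a local instance and action ID 4; the substring check is a stand-in for real JSON parsing:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PublicationStatusPoller {
    public static void main(String[] args) throws Exception {
        // The actionId (4 here) comes from the response to the publication POST.
        String url = "http://localhost:8080/datahub-webapp/v1/pools/GLOBAL/publications/4";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder().uri(URI.create(url)).GET().build();
        String body;
        do {
            Thread.sleep(5_000); // poll every five seconds
            body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } while (body.contains("IN_PROGRESS")); // crude; parse the JSON in real code
        System.out.println(body);
    }
}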

Contd..
Currently, the Data Hub returns the following statuses:
SUCCESS - indicates that the publication completed successfully.
COMPLETE_W_ERRORS - indicates that the publication completed with errors.
IN_PROGRESS - indicates that the publication is in progress.
FAILURE - indicates that the publication failed or crashed.
Any failed publication to a target system marks the whole publication as a FAILURE.
However, data is still published to the target systems whose publication status is set to SUCCESS or COMPLETE_W_ERRORS.

Target Metadata
A business user has to describe how data should look in the target system.
This is done by defining target system metadata in the Data Hub extension XML document.

Contd..
The target metadata is represented by the following item types in the Data Hub:
TargetSystem - defines a hybris system where data can be published. This entity is mainly used to distinguish one publication destination from another.
TargetItemMetadata - defines an item type in the target system. It specifies the item type name and links to a set of attributes that are specific to that item type in the target system.
TargetAttributeDefinition - defines a single attribute in an item type. It specifies the name, data type, validation rules, and how the canonical value should be transformed for the attribute values in the target system item type.

<extension xmlns="http://www.hybris.com/schema/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.hybris.com/schema/ datahub-metadata-schema-1.0.0.xsd" name="customer-b2b">
  <canonicalItems>
    <item>
      <type>CanonicalCustomer</type>
      <description>Canonical representation of the customer</description>
      <status /> <!-- defaults to ACTIVE -->
      <attributes>
        <attribute>
          <name>customerId</name>
          <model>
            <localizable>false</localizable>
            <collection>false</collection>
            <type>String</type>
            <primaryKey>true</primaryKey>
          </model>
          <transformations>
            <transformation>
              <rawSource>RawCustomer</rawSource>
              <expression>integrationKey</expression>
            </transformation>
          </transformations>
        </attribute>
      </attributes>
    </item>
  </canonicalItems>
</extension>

Dereferencing During Publication

The SpEL resolve function is used to dereference another canonical item.
The result of the dereferencing is recorded on the referencing canonical item's publication status for the target system publication being performed.
If dereferencing fails, that publication status is set to FAILURE and the item is not published; a more detailed explanation of the cause of the failure is added to the publication status.
If the dereferencing is successful, the referencing canonical item continues on in the publication process.

Contd..
The possible values for the cause of a failure during dereferencing are as follows:
MISSING_REFERENCE - indicates that the referenced canonical item did not exist at the time of publication, preventing this canonical item from being published. Canonical items with this status are eligible to be published in any subsequent publication, as long as the missing referenced canonical item exists at the time of the next publication.
INCOMPLETE_REFERENCE_KEY - indicates that this canonical item did not contain the attribute data needed to look up the referenced canonical item, which prevented this canonical item from being published. Canonical items with this status can be updated by importing the missing data. Then, in the next publication, they are eligible to be published with the required attribute data to look up the referenced item.
INVALID_REFERENCE_KEY - indicates that this canonical item is of a type that does not define the required attributes needed to look up the referenced item, which prevents this canonical item from being published. Canonical items with this status are not eligible for any further publications. This referencing canonical item type will need to be recreated and reimported to include the required attributes.

Adapter Services
All target system adapters must implement the com.hybris.datahub.adapter.AdapterService interface.
Any new target system adapter can be added to the Spring context registry, similar to the way the core platform adapter is configured now in /datahub/datahub-export-service/datahub-core-adapter-service/src/main/resources/META-INF/core-adapter-service-spring.xml:
...
<alias name="defaultCoreAdapterService" alias="coreAdapterService" />
<bean name="defaultCoreAdapterService"
      class="com.hybris.datahub.core.adapter.impl.CoreAdapterService">
  <property name="targetSystemType" value="HybrisCore" />
  <property name="impexService" ref="impexService" />
  <property name="mediaStorage" ref="impexMediaStorage" />
  <property name="exportClient" ref="exportClient" />
</bean>
...

Core Adapter Services

The AdapterService implementation for a hybris core installation is located in the core-adapter-service module.
It uses the ImpEx format to publish the data.
For each TargetItem, an ImpEx header is first created for all non-localized attributes, followed by additional blocks for every subsequent localization of that item.

The Data Hub Adapter provides a web-based service API to upload data into a Core system from the Data Hub.

Contd..
The data should be provided in an ImpEx file format, which typically looks like this:

$baseProduct=baseProduct(code,catalogVersion(catalog(id[default='apparelProductCatalog']),version[default='Staged']))
$catalogVersion=catalogversion(catalog(id[default=apparelProductCatalog]),version[default='Staged'])[unique=true,default=apparelProductCatalog:Staged]
INSERT_UPDATE Category;supercategories(code,$catalogVersion);code[unique=true];$catalogVersion
;;3912;
;;1;
;;2625;
;;tshirts;
INSERT_UPDATE Product;code[unique=true];name[lang=en];Unit(code);$catalogVersion[unique=true,allowNull=true];description[lang=en];approvalStatus(code);ean;manufacturerName
;123A;A12334567890BC;pieces; ;Very good product;approved;testEAN;A Manufactor

Contd..
However, large volumes of data can make the transmission very time-consuming.
To avoid this problem and load the data more efficiently, the Data Hub Adapter uses a special format for the ImpEx file.
Only the headers are present, along with information about how to retrieve the data from the Data Hub.
Paging is also used during the creation and loading of the ImpEx files.
The default page size of each block is 1000 items, i.e. for each block of 1000 items a new ImpEx header is generated with the appropriate metadata information (including the page number and page size to be retrieved).

Modified Impex File

$baseProduct=baseProduct(code,catalogVersion(catalog(id[default='apparelProductCatalog']),version[default='Staged']))
$catalogVersion=catalogversion(catalog(id[default=apparelProductCatalog]),version[default='Staged'])[unique=true,default=apparelProductCatalog:Staged]
INSERT_UPDATE Category;;supercategories(code,$catalogVersion);code[unique=true];$catalogVersion
#$URL: https://data.integration.host/datahub-webapp/target-system-publications/123/ApparelProductCatalog.txt?targetName=TestHybrisCore&fields=parent,catalog,version&pageNumber=0&pageSize=1000
#$HEADER: x-TenantId=master
INSERT_UPDATE Product;;code[unique=true];name[lang=en];Unit(code);$catalogVersion[unique=true,allowNull=true];description[lang=en];approvalStatus(code);ean;manufacturerName
#$URL: https://data.integration.host/datahub-webapp/target-system-publications/123/ApparelProduct.txt?targetName=TestHybrisCore&locale=en&fields=code,name,unit,catalogVersion,description,approvalStatus
#$HEADER: x-TenantId=master

Contd..
1. The Data Hub Adapter uses special comments to retrieve information about the callback to the Data Hub, and then replaces the comments in the file with the data received from the callback (see the sketch after this list).
   - $URL: a URL for the callback. This should be used "as is" in order to request data from the Data Hub.
   - $HEADER: an HTTP header (name=value) to add to the callback HTTP request. The headers can be used to pass required information, for example a tenant ID or security context, to the Data Hub.
2. The very first field in the ImpEx command declaration is empty. This means the values are ignored by the ImportService but can be used by the Data Hub Adapter. The Data Hub Adapter requires the canonical item ID to be placed into the empty header column for each data record being imported. This ID is reported back to the Data Hub in case of an error and lets the Data Hub tie that error to a specific item.
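To make point 1 concrete, here is a hedged sketch of the comment-expansion step. It is not the Data Hub Adapter's actual implementation: it simply scans a header-only ImpEx stream and, for each #$URL / #$HEADER pair, fetches the referenced page of data and splices it into the output.

import java.io.BufferedReader;
import java.io.StringReader;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CallbackResolver {
    // Replaces #$URL / #$HEADER comment pairs with the data fetched from the
    // callback URL. Simplified: a real adapter would handle paging, multiple
    // headers per URL, and error reporting back to the Data Hub.
    public static String resolve(String impex) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        StringBuilder out = new StringBuilder();
        String pendingUrl = null;
        BufferedReader reader = new BufferedReader(new StringReader(impex));
        for (String line; (line = reader.readLine()) != null; ) {
            if (line.startsWith("#$URL:")) {
                pendingUrl = line.substring("#$URL:".length()).trim();
            } else if (line.startsWith("#$HEADER:") && pendingUrl != null) {
                String[] header = line.substring("#$HEADER:".length()).trim().split("=", 2);
                HttpRequest request = HttpRequest.newBuilder()
                        .uri(URI.create(pendingUrl))
                        .header(header[0], header[1]) // e.g. x-TenantId=master
                        .GET().build();
                out.append(client.send(request, HttpResponse.BodyHandlers.ofString()).body());
                pendingUrl = null;
            } else {
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }
}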

THANK YOU :)
