10 Graph Engine Service


Graph Engine Service (GES)

Foreword
⚫ GES uses the Huawei-developed graph engine Eywa to facilitate
querying and analysis of multi-relational graph data. It is
specifically suited for scenarios requiring analysis of rich
relationship data, including social relationship analysis, marketing
recommendations, public opinions and social listening,
information communication, and fraud detection.
⚫ This course describes the concept of graph computing, GES
functions and features, and GES application cases.
2
Objectives
⚫ Upon completion of this course, you will be able to:
• Understand the basic concepts of graphs.
• Understand the functions and features of Huawei Cloud GES.
• Use GES for graph analysis.

3
Contents
1. Graph Computing

2. What is GES?

3. Features

4. Success Stories

4
Explosive data growth
⚫ Graph computation outperforms conventional data processing methods in relational big data analysis.

Global data increased to 163 zettabytes. (Source: IDC's Data Age 2025 study)
Interactions increased to 4785 times per person per day. (Source: IDC's Data Age 2025 study)

5
Ubiquitous Graph

Communications networks. Vertices: devices and routers; edges: network flow
Social networks. Vertices: users and posts; edges: relations and likes
User-product graph. Vertices: users and items; edges: ratings
Wiki article on climate change. Vertices: Wiki articles; edges: links

6
What Is Graph Computing?

Definition: A graph is a representation of real-world relational data.

Description: G = (V, E, D), where V is the set of vertices, E is the set of edges, and D is the data (properties and weights) attached to them.

Expertise: query and analysis of large amounts of changeable data with various connections
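
The description G = (V, E, D) can be made concrete with a small in-memory structure. This is only an illustrative sketch: the class name `PropertyGraph`, its methods, and the sample vertices are assumptions made for this example, not part of GES.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    """G = (V, E, D): vertices, directed edges, and the data attached to both."""
    vertices: dict = field(default_factory=dict)  # V with D: vertex id -> properties
    edges: list = field(default_factory=list)     # E with D: (source, target, properties)

    def add_vertex(self, vid, **props):
        self.vertices[vid] = props

    def add_edge(self, src, dst, **props):
        # Both endpoints must already exist in V.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("unknown endpoint")
        self.edges.append((src, dst, props))

    def out_neighbors(self, vid):
        return [dst for src, dst, _ in self.edges if src == vid]


# Hypothetical user-product graph: a user rates an item.
g = PropertyGraph()
g.add_vertex("user_1", kind="user")
g.add_vertex("item_9", kind="item")
g.add_edge("user_1", "item_9", label="rates", weight=4.5)
```

Here the weight on the edge plays the role of D for that edge, and `g.out_neighbors("user_1")` walks E directly instead of joining tables.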

Features
• Diverse data independent of structures
• Multi-source data association, auto propagation
• Dynamic data change and real-time interactive analysis
• Interpretability

Graph relationships
• Social relationships
• Information propagation network
• Communications network
• Organization structure
• ...

Application scenarios
• Exploration of opinion leaders
• Friend recommendations
• User grouping
• Organization structure analysis
• ...

7
Advantages

Diverse data expression and scalability: Highly expressive, suitable for representing complex relationships, compatible with a range of semantics.

Fast multi-hop relationship query: In the case of a large amount of data, potential relationships can be quickly detected.

Relational database: 1:1 or 1:N. Graph database: N:N.

High-performance parallel computing: graph vs conventional data parallel
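
To illustrate why multi-hop relationship queries suit a graph layout, here is a minimal breadth-first sketch over an adjacency list. In a relational store each extra hop is typically another self-join, whereas here it is one more pass of the same loop. The function name and the sample data are hypothetical.

```python
from collections import deque


def k_hop_neighbors(adjacency, start, k):
    """Return the vertices reachable from start in at most k hops (start excluded)."""
    seen = {start}
    frontier = deque([(start, 0)])
    reached = set()
    while frontier:
        vertex, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                reached.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return reached


# Hypothetical "follows" relationships.
follows = {"ann": ["bob"], "bob": ["cho", "dee"], "cho": ["eve"]}
two_hops = k_hop_neighbors(follows, "ann", 2)
```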

8
Graph: Next Key Basic Technology for the Future of AI
AI deep learning solves problems in low-dimensional dense data processing. Graph computing performance is expected to improve by 100 times in high-dimensional scenarios.

[Figure: workloads placed between matrix (tensor) computing on low-dimensional/dense data and relationship (graph) computing on high-dimensional/sparse data. Labels: speech recognition, image and video processing, feature engineering, recommendation engine, PageRank (search engine), social analysis (criminals mining), text analysis, cyber attack analysis, knowledge graph, semantic network, logical reasoning.]

Summary of viewpoints on databases, graph, and AI in NIPS/SIGMOD/NSDI:

• Mathematically, graph computing and deep learning are two sides of the same coin in AI computing. They are equivalent and interchangeable.
• Graph computing is suitable for processing high-dimensional and sparse data. (The computing efficiency is improved by more than 100 times with graphs, according to an MIT Lincoln Lab paper.)
9
Increasingly Improved Graph Technologies and
Stable Ecosystem
[Figure: improved technologies at each layer of the graph stack (display, analysis, compute, storage, and language), with GES marked on the stack.]


10
Contents
1. Graph Computing

2. What is GES?

3. Features

4. Success Stories

11
GES: Hyper-Scale, Integrated Graph Analysis and
Querying
⚫ GES facilitates querying and analysis of graph-structure data based on various relationships. It uses Huawei's
in-house, high-performance graph engine Eywa, which is powered by multiple patented technologies. GES is
specifically suited for scenarios requiring analysis of rich relationship data, including social apps, company
relationship analysis, logistics, domain knowledge graph, and risk control.

Massive, complex, associated data is naturally graph data:
• Social relationships
• Transactions
• Call history
• Information propagation
• Browser history
• Transportation networks
• Communications network
• ...

GES:
• Diverse data support, not merely structured data
• Multi-source data association, auto propagation
• Dynamic data changes, and real-time interactive analysis without training
• Visual and interpretable results

Analysis:
• Individual analysis
• Group analysis
• Link analysis

12
Huawei-Developed Eywa Kernel: Integration of Graph
Database, Analysis, and Graph Visualization
[Figure: GES architecture. Callouts: algorithm development, support for extended attribute diagrams, compatibility with open source APIs, innovative graph engine kernel design, integrated design of analysis and query, and Web Portal. Flow labels: user, submit, service modeling, release, Graph Engine Service, visualizer, result display, service app embedding, mobile client.]

Eywa high-performance graph engine
⚫ Abundant graph analysis algorithm libraries
⚫ High-performance graph computing kernel
⚫ Distributed high-performance graph storage engine

⚫ Graph analysis algorithms: more than 30 deeply optimized basic algorithms covering many industry scenarios, which
give Huawei an advantage over competitors, plus more than 10 graph neural network and graph embedding algorithms.
⚫ Graph database: native APIs for adding, deleting, modifying, and querying data, a wide range of filtering queries,
multiple data ingestion modes, standard Gremlin and Cypher (the basis of the international GQL standard), and full-text
indexes.
⚫ Graph visualization: UI-based graph analysis through graph editing and entity drill-down. Wizard-based algorithm
operations and graphical results make data analysis easier.

13
Advantages
Eywa: high-performance cloud graph engine
• Abundant graph analysis algorithm libraries
• High-performance graph computing kernel
• Distributed high-performance graph storage engine

Large scale: Efficient data organization for analysis and querying of graphs with 10+ billion vertices and 100+ billion edges.

Integrated query and analysis: Integrated query and analysis and graph analytics algorithms facilitating analysis for scenarios such as relationship analysis, route planning, and marketing recommendation.

High performance: In-depth optimized distributed graph computing engine, and high-concurrency and multi-hop real-time query capabilities for property graphs with 1 billion vertices and 100+ billion edges.

Easy to use: Wizard-based, easy-to-use analysis interface, what you see is what you get; standard property graphs and extensions; Gremlin and Cypher queries.

14
Application Scenarios
Internet
• Friend recommendation
• Commodity/Information recommendation
• Abnormal behavior analysis
• Public opinion analysis

Knowledge graph
• Knowledge storage
• Intelligent Q&A
• Knowledge disambiguation
• Learning path recommendation

Financial risk control
• Real-time fraud detection
• Missing person tracking
• Credit control
• Funding tracking

Smart city
• Route planning
• Pipeline pressure adjustment
• Urban road network

15
Contents
1. Graph Computing

2. What is GES?

3. Features

4. Success Stories

16
Process of Using GES

1. Preparations: Register with Huawei Cloud; grant GES permissions.
2. Metadata: Import metadata from a local path; import metadata from OBS.
3. Graphs creation: Create a custom graph; create a graph using an industry-specific template.
4. Management: Manage the graph.
5. Analysis: Analyze graph with the graph editor; call APIs to execute the queries and algorithms.
6. Tasks: Check graph overview; enter the task center.

17
GES Functions
⚫ Extensive Algorithms
• PageRank, K-core, Shortest Path, Label Propagation, Triangle Count, and Link Prediction
⚫ Visualized Graph Analysis
• Wizard-based UI for graph exploration, and visualized query results
⚫ Query/Analysis APIs
• APIs for graph query, metrics statistics, Gremlin query, algorithms, and graph and backup management
⚫ Good Compatibility
• Compatibility with Apache TinkerPop Gremlin 3.4 and OpenCypher 9
⚫ Graph Management
• Dashboard, graph management, graph backup, and metadata management functions
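
As an illustration of one of the listed algorithms, the following is a textbook power-iteration PageRank in plain Python. It is a sketch for intuition only and says nothing about how GES implements the algorithm; the damping factor, the iteration count, and the tiny sample graph are assumptions.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict: vertex -> list of out-neighbors."""
    vertices = set(out_links) | {v for targets in out_links.values() for v in targets}
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # Rank held by vertices without out-links is spread evenly over all vertices.
        dangling = sum(rank[v] for v in vertices if not out_links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in vertices}
        for source, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this sample, vertex c collects links from both a and b, so it ends up with the highest score.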
18
GES Graph Data
[Figure: an example service graph with the vertices Customer, Account, Order, Product, Supplier, Location 1, and Location 2, connected by the edges Pay, Resides, Buy, Deliver, Receive, Notify, and Ship. It contrasts a relational database, a key-value database, and a graph database.]

Graph database
• Stores real-world service relationships and does not require table association.
• Flexible schema
• Excellent for complex transactions
• Excellent for in-depth analysis

Relational Database Service: homogeneous data is each row in the table; heterogeneous data is different tables; relationships are shown by joining multiple tables.
Graph databases: homogeneous data is vertices of the same type; heterogeneous data is vertices of different types; relationships are shown by edges.

⚫ GES imports graph data based on the property graph model. A property graph is a directed graph consisting of vertices, edges, labels, and properties.
⚫ GES supports only raw graph data in standard CSV format.
⚫ GES graph data consists of the vertex, edge, and metadata files.
• Vertex files store vertex data.
• Edge files store edge data.
• Metadata is used to describe the formats of data in vertex and edge files.
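
The split into vertex, edge, and metadata files can be pictured with a short loader. This is a sketch under stated assumptions: the column order (vertex rows as id, label, properties; edge rows as source, target, label, properties) and the `METADATA` dictionary standing in for the metadata file are illustrative choices, not the confirmed GES file format.

```python
import csv
import io

# Hypothetical stand-in for the metadata file: property names per label.
METADATA = {"user": ["name", "age"], "follows": ["since"]}

# Assumed layouts: vertex rows are id,label,props...; edge rows are source,target,label,props...
VERTEX_CSV = "u1,user,Ann,31\nu2,user,Bob,27\n"
EDGE_CSV = "u1,u2,follows,2021\n"


def load_vertices(text):
    vertices = {}
    for vid, label, *values in csv.reader(io.StringIO(text)):
        # The metadata tells us which property each remaining column holds.
        vertices[vid] = {"label": label, **dict(zip(METADATA[label], values))}
    return vertices


def load_edges(text):
    edges = []
    for src, dst, label, *values in csv.reader(io.StringIO(text)):
        edges.append((src, dst, {"label": label, **dict(zip(METADATA[label], values))}))
    return edges


vertices = load_vertices(VERTEX_CSV)
edges = load_edges(EDGE_CSV)
```

The point of the sketch is the division of labor: the CSV rows carry raw values only, and the metadata supplies the names (and, in a fuller version, the types) needed to interpret them.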
19
Importing Data to GES

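[Figure: data flows into and out of Graph Engine Service on Huawei Cloud.
• Historical data passes through data ingestion/cleaning (DIS/CloudStream/MRS) into a data storage service (OBS/CloudTable), from which data is imported in batches; GES also exports and backs up data to the storage service.
• Real-time data (JSON/CSV) passes through a real-time data integration service, and incremental data is imported through the SDK or Gremlin.
• User data is consumed through the Web Console, third-party BI tools, and ISV applications, which use the SDK or the REST API.]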

20
Mainstream Graph Query Languages: Gremlin
and Cypher
⚫ Graph query: Native APIs, Gremlin, and Cypher
⚫ A range of methods for multi-hop queries
⚫ Calling a native API:
POST /ges/v1.0/{project_id}/graphs/{graph_name}/action?action_id=filtered-query

BODY: { "executionMode": "sync", "visulized": "false", "filters": [ { "operator": "out" }, { "operator": "out" } ], "full_path": false, "vertices": [ "tr_10" ] }

⚫ Using the Gremlin language:


g.V().has('vid', 'attr', 'tr_10').bothE().otherV().dedup().bothE().otherV()

⚫ Using the Cypher language:

match ({vid: 'tr_10'}) --> () --> (u) return u;
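
The native API call above can be assembled programmatically. The sketch below only builds the request and does not send it. The endpoint host, the project and graph names, and the use of an `X-Auth-Token` header are assumptions for illustration; the path and the body fields it sets are taken from the example on this slide, and fields not set here are simply left out.

```python
import json
import urllib.request


def build_filtered_query(endpoint, project_id, graph_name, token, start_vertex, hops=2):
    """Build (but do not send) a filtered-query request that follows `hops` out-edges."""
    url = (f"{endpoint}/ges/v1.0/{project_id}/graphs/{graph_name}"
           "/action?action_id=filtered-query")
    body = {
        "executionMode": "sync",
        "filters": [{"operator": "out"} for _ in range(hops)],  # one filter per hop
        "full_path": False,
        "vertices": [start_vertex],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        # Authentication header name is an assumption for this sketch.
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


# All argument values here are placeholders.
request = build_filtered_query("https://ges.example.com", "my-project", "my-graph",
                               "TOKEN", "tr_10")
```

Passing `request` to `urllib.request.urlopen` would send it; that step is omitted because it needs a real endpoint and credentials.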

21
Graph Algorithms
⚫ GES provides extensive basic graph algorithms, analytics algorithms, and metrics algorithms, meeting
requirements of a variety of scenarios.

Multi-hop filtering and query: Analyzes the spread of viruses (healthcare), supports equity penetration and risk warning (finance), and tracks criminal relationship networks (public security).

Public security: Mines closely connected subgraphs of entities in complex social networks to identify organized crime.

Graph-based connected component detection: Discovers internally connected substructures in graph structures to enable e-commerce merchant group stability detection (Internet) and credit cardholder relationship network detection (finance).
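
Connected component detection can be sketched with a union-find structure. This is a generic textbook approach shown for intuition, not a description of the GES implementation; the merchant identifiers are made up.

```python
def connected_components(vertices, edges):
    """Group vertices into connected components with union-find (edges treated as undirected)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
            v = parent[v]
        return v

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


# Hypothetical merchant relationship graph with two separate groups.
components = connected_components(
    ["m1", "m2", "m3", "m4", "m5"],
    [("m1", "m2"), ("m2", "m3"), ("m4", "m5")],
)
```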

22
GES UI

[Figure: GES UI, with callouts for vertex editing and drill-down, the algorithm library, visual schema editing, custom operations, and Gremlin query.]

23
Contents
1. Graph Computing

2. What is GES?

3. Features

4. Success Stories

24
Application Scenarios
Government
• Potential sensitive event detection
• Event profile and multi-dimensional analysis
• Analysis of responsibility-shirking behaviors in service ticket handling

Finance
• Credit card anti-fraud
• Pre-loan/Medium-loan risk analysis
• Data tracing for bank guarantees
• Data mining for anti-money laundering

Device management
• Device management network
• Digital power station

Internet
• Personalized recommendations for e-commerce
• E-commerce risk control

Material management
• Manufacturing BOM management

25
Graph Compute Engine Supports Government Public
Opinion Survey in Shanghai
⚫ Pain points
• The public opinion survey involved a large volume of heterogeneous data across locations and time. A typical engine would struggle to represent, organize, and store the data.
• Real-time data update is required to detect and identify public demands, handle problems, and collect public comments and identify impacts. The system needs to support real-time analysis and dynamic visual analytics.
• The system should be able to process complex tasks, such as detection, association mining, root cause analysis, emotion analysis, and solution recommendation. Intelligent perception and computing engine capabilities are urgently required.

Highlight: GES provides integrated graph-structure data storage, real-time query, and an association analysis and inference engine to organize multi-dimensional heterogeneous data. The graph algorithm engine enables importance analysis, association mining, group discovery, real-time recommendations, knowledge-based inference, classification prediction, and many other algorithms. Supporting standard query interfaces, GES can provide powerful knowledge-based compute capabilities and an interactive GUI.

Multi-dimensional analytics of public opinions: Opinions are analyzed and presented based on topics, complainants, owners, and solutions from the perspective of multiple dimensions such as time, space, and association.

Fact-based solution design: Well-designed solutions are provided, solutions that take the association among topics, complainant emotions, and measures into account.

Cause analysis and warning: Comprehensive effective factors are captured from upstream and downstream activities for cause analysis and prediction of subsequent impacts.

Upgraded analysis efficiency: The graph knowledge association establishes closer associations between subjects. Six degrees of separation are reduced to one degree when possible, greatly improving knowledge mining efficiency.
26
Data Mining for Topic Associations
Topics of public concern raised in a certain period are often related to one another. For example, Sam reports a problem
about topic A and complains about topic B, and at the same time, another person submits a service ticket complaining about
topics A and B. In this scenario, the graph engine can mine hidden/internal associations between the two topics.

Process:
1. Create a heterogeneous data graph.
2. Build a compute graph model.
3. Calculate the similarity of the association patterns.
4. Output the association patterns.
5. Present and explain the results.

Example service tickets:
• Sam submitted a service ticket about problem A through the city hall hotline.
• Sam submitted a service ticket about problem B through the city hall hotline.
• Wong submitted a service ticket about problem A through the city hall hotline.
• Wong submitted a service ticket about problem B through the city hall hotline.

Relationships in the graph model (figure vertices: Topic A, Topic B, Topic C):
• Subject-topic (proposal)
• Topic-topic (repeat occurrence)

Presentation and explanation:
1) Historical service tickets from associated people
2) Is pigeon raising related to the road maintenance? Pigeon raising -> (public hygiene, transportation?) ->
road maintenance
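
One simple way to score how strongly two topics are associated is to compare the sets of people who raised them. The sketch below uses Jaccard similarity for that purpose. This is a stand-in chosen for illustration; the slides do not say which similarity measure is used, and the subject "Lee" and topic "C" are added here only to provide a contrasting case.

```python
from itertools import combinations

# (subject, topic) pairs taken from service tickets, following the Sam/Wong example.
tickets = [("Sam", "A"), ("Sam", "B"), ("Wong", "A"), ("Wong", "B"), ("Lee", "C")]


def topic_similarity(tickets):
    """Jaccard similarity between topics, based on the subjects who raised them."""
    subjects_by_topic = {}
    for subject, topic in tickets:
        subjects_by_topic.setdefault(topic, set()).add(subject)
    scores = {}
    for t1, t2 in combinations(sorted(subjects_by_topic), 2):
        s1, s2 = subjects_by_topic[t1], subjects_by_topic[t2]
        scores[(t1, t2)] = len(s1 & s2) / len(s1 | s2)
    return scores


scores = topic_similarity(tickets)
```

Topics A and B are raised by exactly the same people, so they score 1.0, while topic C shares no subjects with either and scores 0.0.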

28
Data Mining for Detecting Topic-Sensitive Groups
Pain points
To find out who is concerned with which issues, multi-dimensional heterogeneous data needs to be analyzed and processed with
graph engine capabilities.
Solution
Build a graph structure for the data, integrate heterogeneous information of multiple dimensions, analyze and mine data based on
groups, predict information associations, and detect and classify groups based on their concerns.

Organize data and build a graph model: Detect associations among topics, subjects, owners, service tickets, and measures to build a large heterogeneous graph for query, computing, and multi-dimensional analytics.

Group features: They submitted a lot of service tickets. They are concerned about various topics.

Result: Detect topic-sensitive groups (subjects linked to topics).

Similar group: People who are concerned about similar content and have a similar number of concerned topics
• Subjects: '150****8044', '54... .53', '139****2823', '135***5348', '181***298', '189****2412', ..., '180****8953', '137****5757'
• Topics: "greenery restoration", "corridor piles", "greenery destruction", "arrears of wages", "vehicle parking", "group rent management", ..., "parking violation"

29
Public Opinion and Sentiment Tracking
Pain points: Negative emotions may spread among the crowd and there is no mechanism for tracking public sentiment.
Solution: Track public emotions by extracting information from service tickets (STs), including hot topics, residents' emotions, expectations, and issue
status, and allocate STs based on graph computing results. This can help to track public opinions and control adverse impacts.

Building a GES graph model
[Figure: a graph linking residents, public emotions, and policies (household registration policies, transportation policies, tax policies). Examples: Resident A is satisfied with the measures for the ST. Resident B is angry about the processing of the ST and it needs to be handled as soon as possible.]

Benefits
• Resident satisfaction is improved.
• STs are properly dispatched to optimize the handling process.
• Hot topics are identified and public opinions are tracked.
• Service agents are evaluated to help improve the quality.
• Policy adjustment and optimization are facilitated.

30
Finance: Credit Card Anti-fraud
Pain points: The rule-based detection is not accurate enough. It cannot explain cash flows or identify criminal organizations.
Solution: Construct a graph model of spending patterns and transaction relationships based on the associations between credit cards and merchants. Track
the cash flow to the merchants, analyze spending patterns and merchant importance, and identify credit card fraud based on merchant attributes and on
the preceding analysis.

Building a GES graph model
[Figure: credit cards linked to merchants by consumption edges, and merchants linked to other merchants by cash-out edges. One card makes 10 transactions/month with a merchant, another makes 1 transaction/month.]
• Frequent transactions with the same merchant are recorded.
• Credit cards with frequent transactions are more likely to be attacked than those that are not frequently used.

Features of credit card fraud
• Highly similar spending behavior occurs in a certain period of time.
• Cash outflows from a given merchant within a short period of time after credit card use.
• A card owner receives a transaction notification of almost the same amount within a short period after the consumption.
• It is possible that the fraudster fakes the merchant or an upstream/downstream card user to get the cash out.

Results
[Figure: merchants and cards labeled as fraud card, normal, or suspicious.]
• Number of fraud cases among transactions with merchants of a certain type
• Total number and size of the transactions of each case
• Transactions description
• Credit rating of merchants
• ...

Advantage 1: Graph database provides excellent storage and compute services for a large volume of graph data.
Advantage 2: An unsupervised learning model can detect cash-out fraudsters, which is something a rule-based model cannot accomplish.
Advantage 3: The results are fact-based and can represent the association between fraud features and the directions where cash flows.
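
One of the fraud features described above, a near-equal cash outflow from the merchant shortly after a card payment, can be expressed as a small check. This is a deliberately simplified, rule-like illustration of a single feature; a real model would combine it with the other features. The time window, the amount tolerance, and all sample records are assumptions.

```python
def suspicious_payments(payments, outflows, window=3600, tolerance=0.02):
    """Flag card payments followed by a near-equal outflow from the same merchant.

    payments: (card, merchant, amount, timestamp); outflows: (merchant, amount, timestamp).
    Timestamps are in seconds; window and tolerance are illustrative thresholds.
    """
    flagged = []
    for card, merchant, amount, paid_at in payments:
        for out_merchant, out_amount, sent_at in outflows:
            same_merchant = out_merchant == merchant
            soon_after = 0 <= sent_at - paid_at <= window
            similar = abs(out_amount - amount) <= tolerance * amount
            if same_merchant and soon_after and similar:
                flagged.append((card, merchant))
                break  # one matching outflow is enough to flag this payment
    return flagged


# Hypothetical records: shop_x forwards almost the full amount within ten minutes.
payments = [("card_1", "shop_x", 1000.0, 0), ("card_2", "shop_y", 80.0, 100)]
outflows = [("shop_x", 990.0, 600), ("shop_y", 80.0, 90000)]
flagged = suspicious_payments(payments, outflows)
```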

31
Finance: Loan Risk Assessment
Pain points: Credit risk management is static. It is hard to track risk in real time based on a network of dynamic customer relationships and make loan
adjustments in a timely manner. Risk can accumulate and the credit asset quality is reduced.
Solution: Build a platform with richer customer relationship information and associations among a variety of information, so there are more comprehensive
insights available, from static perception to dynamic identification. Improve customer information, identify potential risk factors, and predict changing trends.

Building a GES graph model
[Figure: a customer relationship graph with weighted edges (for example, 0.8), grouped into explicit relationships and invisible relationships. Labels: stockholder, guarantee, investment, natural person, upstream and downstream, relatives, registration address, chain of trade, impacts on production and operations.]
• Build a customer feature library, the better to integrate available data inside and outside the bank.
• Identify associations between customers throughout the entire network, and establish a comprehensive view of customer relationships.

Results
According to the risk spread prediction, the customer group takes high risks.

Advantage 1: The information asymmetry between banks and customers is broken down, data and information are extracted, and an informational foundation is created
for insight into patterns of risk based on customer relationships.
Advantage 2: When a risky event occurs, the most likely path of propagation is identified in a timely manner. The account manager is better positioned to prioritize
properly, take action faster, and take appropriate preventive measures. Risks are controlled before they have a chance to propagate and get out of hand.
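
Finding the most likely propagation path can be sketched as a highest-probability path search over weighted relationships. The version below is a Dijkstra-style search that multiplies edge weights; it assumes each weight is a transmission probability in (0, 1], which is an interpretation made for this example rather than something the slides state. Firm names and weights are invented.

```python
import heapq


def risk_spread(edges, source):
    """Highest-probability propagation path from source to every reachable customer.

    edges: dict mapping customer -> list of (neighbor, transmission probability).
    Returns customer -> (probability, path).
    """
    best = {source: (1.0, [source])}
    heap = [(-1.0, source)]  # max-heap via negated probabilities
    while heap:
        neg_prob, customer = heapq.heappop(heap)
        prob = -neg_prob
        if prob < best[customer][0]:
            continue  # stale heap entry
        for neighbor, weight in edges.get(customer, ()):
            candidate = prob * weight
            if candidate > best.get(neighbor, (0.0, None))[0]:
                best[neighbor] = (candidate, best[customer][1] + [neighbor])
                heapq.heappush(heap, (-candidate, neighbor))
    return best


# Hypothetical relationships: risk reaches firm_c more easily through firm_b.
relations = {
    "firm_a": [("firm_b", 0.8), ("firm_c", 0.3)],
    "firm_b": [("firm_c", 0.5)],
}
spread = risk_spread(relations, "firm_a")
```

In the sample, the indirect route through firm_b (0.8 × 0.5) outweighs the direct edge (0.3), so it is reported as the likeliest path to firm_c.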

32
Finance: Operations Data Tracing for Bank
Guarantees
Pain points: It is difficult for banks to trace and monitor operations status of the target company in real time, and the risk assessment capability needs to be improved.
Solution: Associate the bank's risk assessment system with company operations status through the graph model. Mine information about the actual controller of the
company to support the risk assessment.

Building a GES graph model
[Figure: holding chains traced from companies A, B, and C through stakes such as 60%, 75%, 40%, 100%, and 0.1%, including a new holding company. Several entities are marked "unsatisfactory operations". One set of shareholder stakes reads 30%, 26%, 19%, 11%, 9%, and 4.9%.]
Regularly monitors the operations status of each natural person and investment institution and evaluates the impacts on the target company.

Result: risk assessment report (credit business assessment result)
• The overall risk level is 3 (medium).
• Company A has recently been suffering from poor operations. However, considering the fact that its investment in enterprise X accounts for less than 0.1% of the total investment, the risk for company A is low.
• About a 40% stake in company C was held by a holding company, which is a major change and has certain risks.
• ...

Advantage 1: Dynamic prediction of operations and investment status, timely perception and warnings
Advantage 2: E2E evaluation reports ready to be exported, accurate risk assessments, and appropriate response measures
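
Tracing holdings to find the actual controller amounts to multiplying stakes along each ownership chain and summing the chains that end at the same holder. The sketch below does this recursively. It assumes the holding structure has no cycles, and all entity names and percentages are invented for illustration.

```python
def effective_stakes(holdings, company):
    """Indirect ownership of `company` per ultimate holder, multiplying stakes along each chain.

    holdings: dict mapping owned entity -> list of (owner, fraction held).
    Assumes the holding structure has no cycles.
    """
    totals = {}

    def walk(entity, fraction):
        owners = holdings.get(entity, [])
        if not owners:  # nobody holds this entity, so it is an ultimate holder
            totals[entity] = totals.get(entity, 0.0) + fraction
            return
        for owner, share in owners:
            walk(owner, fraction * share)

    walk(company, 1.0)
    return totals


# Hypothetical structure: person_b holds 40% directly and 25% of a 60% holder.
holdings = {
    "target": [("holding_1", 0.6), ("person_b", 0.4)],
    "holding_1": [("person_a", 0.75), ("person_b", 0.25)],
}
stakes = effective_stakes(holdings, "target")
controller = max(stakes, key=stakes.get)
```

Here person_a appears dominant inside holding_1, yet person_b holds the larger effective stake in the target (0.4 + 0.6 × 0.25 = 0.55), which is the kind of result that only shows up when the whole chain is traced.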
33
Finance: Data Mining for Anti-Money Laundering
• According to PricewaterhouseCoopers, the amount involved in money laundering crimes worldwide accounts for 2% to 5% of
global GDP each year.
• One of the important methods of money laundering is round-tripping, where funds are returned after being transferred to multiple
parties, giving the impression that the funds have derived from a clean source and thus completing a round trip.

Pain points
• Query efficiency of a large volume of data is low. Banks are faced with a large number of daily transactions and complex transaction trips.
• Round-tripping is complex and cannot be identified by a single rule. Criminals create complex and changeable transfer trips for money laundering.

Solution
• GES uses a native graph structure to store transaction data. It can trace round trips efficiently.
• The round-tripping detection algorithm is used to mine data of the round trips.

A simplified example
[Figure: Sam possesses account A, and transfers run between accounts A, B, and C. X possesses an account that might be a fake account. Mike, a friend, also possesses an account. The people behind the round trip are marked as money laundering offenders.]
Use the GES round-tripping detection algorithm to mine all trips shown in the graph.
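
A round trip is a cycle in the transfer graph, so the idea can be sketched with a bounded depth-first search. This plain DFS is an illustration only and is not the GES round-tripping detection algorithm; the hop limit and the account names are assumptions.

```python
def find_round_trips(transfers, max_hops=6):
    """Enumerate transfer cycles that return funds to the account they started from.

    transfers: dict mapping account -> list of accounts it sent money to.
    Each cycle is reported once, starting from its smallest account name.
    """
    cycles = []

    def extend(path):
        for nxt in transfers.get(path[-1], ()):
            if nxt == path[0]:
                cycles.append(path + [nxt])  # funds are back at the origin
            elif nxt > path[0] and nxt not in path and len(path) < max_hops:
                extend(path + [nxt])

    for start in sorted(transfers):
        extend([start])
    return cycles


# Hypothetical transfers: A -> B -> C -> A forms a round trip; D is a dead end.
transfers = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
trips = find_round_trips(transfers)
```

Requiring every intermediate account to sort after the starting account is what keeps the same cycle from being reported once per member.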

34
Section Summary
⚫ GES uses the Huawei-developed graph engine Eywa to facilitate
querying and analysis of multi-relational graph data.
⚫ This course introduces basic concepts and advantages of graph
computing, then gives you an overview of GES functions, and
shows some successful cases.

35
Q&A
1. (True or false) Currently, you can only upload graph metadata
from OBS to GES. ( )
A. True
B. False

36
Q&A
2. (Multiple-choice) Which of the following algorithms are
supported by GES? ( )
A. PageRank
B. k-core
C. Shortest path
D. Label propagation

37
Q&A
3. (Multiple-choice) Which of the following query languages are
supported by GES? ( )

A. SQL

B. Gremlin

C. Cypher

38
Acronyms and Abbreviations
⚫ AI: Artificial Intelligence
⚫ GES: Graph Engine Service

39
Recommendations
⚫ GES portal: https://www.huaweicloud.com/intl/en-us/product/ges.html
⚫ User Guide: https://support.huaweicloud.com/intl/en-us/usermanual-ges/ges_01_0002.html

40
Thank You.
Copyright© 2023 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including,
without limitation, statements regarding the future financial and operating results,
future product portfolio, new technology, etc. There are a number of factors that
could cause actual results and developments to differ materially from those
expressed or implied in the predictive statements. Therefore, such information is
provided for reference purpose only and constitutes neither an offer nor an
acceptance. Huawei may change the information at any time without notice.

41
