Lecture 9 - Knowledge Graph

The lecture by Dr. Reem Essameldin Ebrahim covers the concept of Knowledge Graphs, their components, and applications, emphasizing their role in connecting data, information, and knowledge. It discusses the advantages of using Knowledge Graphs over traditional data representations, particularly in terms of relational data and query efficiency, exemplified by comparisons between graph databases like Neo4j and relational databases like MySQL. The lecture also explores the structure of Knowledge Graphs, including nodes, edges, and labels, as well as the importance of context and meta-paths in extracting relationships between entities.

09. Knowledge Graph
Lecturer: Dr. Reem Essameldin Ebrahim

Social Network Computing


Based on CS224W Analysis of Networks Mining and Learning with Graphs: Stanford University
Copyright © Dr. Reem Essameldin 2023-2024
In this Lecture
Topics to be covered are:

What is Knowledge Graph?
Why Knowledge Graph?
Knowledge Graph Components
Knowledge Graph Applications
Knowledge Graph Representation
Knowledge Graph Meta-Paths


Note that: Data, Information, Knowledge
One of the core concepts in data management is DIKW: Data, Information,
Knowledge, Wisdom. Sometimes this hierarchy is depicted as a knowledge
pyramid (i.e. a large amount of data leading to a smaller amount of
information, etc.), and sometimes as a linear chain (i.e. relating
understanding and context to past and future).



Knowledge Graph
The term knowledge graph (sometimes also called a semantic network) was
popularized by Google in 2012, when the Google Knowledge Graph was announced
as a way to use semantic knowledge in web search.

A knowledge graph is a representation of general knowledge in graph format.
Knowledge graphs also play an important role in the Semantic Web, where they
are also called semantic networks. A knowledge graph is thus a systematic way
to connect information and data to knowledge, which makes it a crucial concept
for generating knowledge and wisdom and for searching within data, information
and knowledge. Context is the most important ingredient for generating
knowledge or even wisdom, so connecting knowledge graphs with context is a
crucial feature.
Why Knowledge Graph
The main issue with classical data representations like XML or JSON is that
they cannot directly describe relations between entities.

This led to the development of a W3C standard called RDF (Resource Description
Framework). It is widely used within web resources, but also within the context
of scientific data representation.
RDF can link data by describing relations. It was developed in the 1990s for
the Semantic Web and inspired a lot of approaches like Linked Open Data.

The vision of the Semantic Web has not been fulfilled so far. RDF leads to a
graph structure which forms a Knowledge Graph. Structuring complex datasets
into a knowledge graph grants users access to deeper insights and facilitates
more accurate predictive models.
Note that: Graph databases
A lot of research has also been done on the analysis and optimization of
graph queries, especially with a focus on Cypher and Neo4j.

Illustrative Example
In video games, identifying the friends of a friend of a given player using a
SQL database involves a cumbersome set of table joins and conditional filters.
The same task in the Cypher query language is rather straightforward, and
such queries can run much faster on a Neo4j graph database than their
SQL equivalent.
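To make the "friends of a friend" traversal concrete, here is a minimal Python sketch over an in-memory adjacency structure. It is neither Cypher nor SQL and says nothing about the speed of either system; the player names and the `friends_of_friends` helper are hypothetical and exist only for illustration.

```python
from typing import Dict, Set

# Hypothetical toy friendship data (undirected, so stored in both directions).
FRIENDS: Dict[str, Set[str]] = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "dee"},
    "cy": {"ana", "dee", "eli"},
    "dee": {"ben", "cy"},
    "eli": {"cy"},
}


def friends_of_friends(graph: Dict[str, Set[str]], player: str) -> Set[str]:
    """Return players reachable in two hops, excluding the player and direct friends."""
    direct = graph.get(player, set())
    two_hops: Set[str] = set()
    for friend in direct:
        # One hop further: follow each friend's own friendship edges.
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {player}


print(sorted(friends_of_friends(FRIENDS, "ana")))
```

Each extra hop here is just one more neighbour lookup, whereas in a relational model it would typically be one more self-join on the friendship table.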



Note that: Graph databases
The results of an experiment comparing the speed of a graph database (Neo4j)
with a relational database (MySQL) showed:

For the friends of friends query, Neo4j was 60% faster than MySQL.
For the friends of friends of friends query, Neo4j was 180 times faster than MySQL.
For the depth-four query, Neo4j was 1,135 times faster than MySQL.



Knowledge Graph – Components
A knowledge graph represents a network of real-world entities (i.e. objects,
events, situations, or concepts) and illustrates the relationships between them.
This information is usually stored in a graph database and visualized as a graph
structure, prompting the term knowledge “graph.” Knowledge in graph form captures
entities, types, and relationships.

A knowledge graph is made up of three main components: (1) Nodes, (2) Edges, and (3) Labels.

Nodes are entities labeled with their types. Any object, place, or person can be a node.

Edges between two nodes capture relationships between entities.

Note that: A represents the subject, B represents the predicate, C represents the
object.
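The subject, predicate, object pattern can be sketched as a list of triples. This is a minimal illustration, not a real graph database; the `Triple` type and the two lookup helpers are hypothetical names. The Mona Lisa fact is the lecture's own example; the second painting is added only so that the reverse lookup returns more than one result.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str    # node the edge starts from
    predicate: str  # label of the edge (relationship type)
    obj: str        # node the edge points to


TRIPLES: List[Triple] = [
    Triple("Mona Lisa", "paintedBy", "Da Vinci"),
    Triple("The Last Supper", "paintedBy", "Da Vinci"),
]


def objects_of(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Follow an edge forwards: which objects does `subject` reach via `predicate`?"""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


def subjects_of(triples: List[Triple], predicate: str, obj: str) -> List[str]:
    """Follow an edge backwards: which subjects reach `obj` via `predicate`?"""
    return [t.subject for t in triples if t.predicate == predicate and t.obj == obj]


print(objects_of(TRIPLES, "Mona Lisa", "paintedBy"))
print(subjects_of(TRIPLES, "paintedBy", "Da Vinci"))
```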



Knowledge Graph
A knowledge graph is a systematic way to connect information and data to
knowledge on a more abstract level than language graphs. The “context” of data
is essential for generating the knowledge necessary for further analysis.

Mathematical Definition
We define a knowledge graph as a graph $G = (E, R)$ with entities
$e \in E = E_1 \cup E_2 \cup \dots \cup E_n$ coming from formal structures $E_i$,
and relations $R$.

Here $E_1 = \{A_1, \dots, A_m\}$ is a set of $m$ actors, which forms an extended
social network. All other structures $E_2, \dots, E_n$ describe data from other
sources or other domains. Both $E$ and $R$ are finite discrete spaces.



Bibliographic Networks
Example
Node types: paper, title, author, conference, year.
Relation types: pubWhere, pubYear, hasTitle, hasAuthor, cite.

As we can see, a knowledge graph has multiple different types of nodes that are
connected by different types of relationships; each relationship has a
certain meaning.

Example:
Paper #061 (a node of “paper” type) is published in (relation type “pubYear”)
2004 (a node of “year” type), and it is published at (relation type “pubWhere”)
the VLDB conference (a node of “conference” type).
Social Networks
Example
Node types: account, song, post, food, channel.
Relation types: friend, like, cook, watch, listen.



Google Knowledge Graph
Example

An example of a very large knowledge graph built from web data is the huge
knowledge graph of Google. It extracts nodes and relations using text data
from the web.

Example:
Mona Lisa (a node) and Da Vinci (another node) are connected with the
relationship type (paintedBy). Moreover, nodes have attached attributes
such as dates of birth and death.



Knowledge Graph Applications
Serving information
One very popular application is how information is served. Knowledge graphs are
essentially what power Google search. For example, for the query “latest films
by the director of titanic”, Google has to figure out what titanic is, what a
director is, what the relation between director and titanic is, and then what
the relation between this director and the latest films is. Given this query,
it is then able to return all the different answers.

Another example is “homes for sale in Bellevue”. Here we have nodes representing
homes and nodes representing locations. One of these locations is Bellevue, and
there are edges connecting the homes located in Bellevue to that location.
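The “latest films by the director of titanic” query can be read as a two-hop lookup followed by a sort. The sketch below is only an illustration of that reading on a handful of hand-entered facts; it is not a description of how Google's system works, and the predicate names `directedBy` and `releaseYear` as well as the helper function are assumptions made for this example.

```python
from typing import List, Tuple, Union

Fact = Tuple[str, str, Union[str, int]]

# Small hand-entered fact list; predicate names are hypothetical.
FACTS: List[Fact] = [
    ("Titanic", "directedBy", "James Cameron"),
    ("Avatar", "directedBy", "James Cameron"),
    ("Avatar: The Way of Water", "directedBy", "James Cameron"),
    ("Titanic", "releaseYear", 1997),
    ("Avatar", "releaseYear", 2009),
    ("Avatar: The Way of Water", "releaseYear", 2022),
]


def latest_films_by_director_of(facts: List[Fact], film: str, limit: int = 2) -> List[str]:
    # Hop 1: film -> director(s).
    directors = {o for s, p, o in facts if s == film and p == "directedBy"}
    # Hop 2: director(s) -> all films they directed (inverse of directedBy).
    films = {s for s, p, o in facts if p == "directedBy" and o in directors}
    # Attribute lookup: release year of each film.
    year = {s: o for s, p, o in facts if p == "releaseYear"}
    ranked = sorted((f for f in films if f in year), key=lambda f: year[f], reverse=True)
    return ranked[:limit]


print(latest_films_by_director_of(FACTS, "Titanic"))
```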
Knowledge Graph Applications
Question answering agents
Siri and other modern conversational agents are essentially powered by
knowledge graphs. For example, if you say something like the first query in the
figure, many different entities are extracted (e.g. travel, Thanksgiving, NY, etc.).
The agent then has to figure out the relations between these entities and
provide answers accordingly.
Knowledge Graph – Representation
How we represent a knowledge graph is very important, because the
representation determines how we can query and reason about the graph. One
popular way to do that is heterogeneous networks.

Knowledge graphs are typically made up of datasets from various sources, which
frequently differ in structure. Schemas, identities and context work together
to provide structure to diverse data.

Schemas provide the framework for the knowledge graph,
Identities classify the underlying nodes appropriately, and
The context determines the setting in which that knowledge exists.

Note that: These components help distinguish words with multiple meanings. This
allows products, like Google’s search engine algorithm, to determine the
difference between Apple, the brand, and apple, the fruit.
Heterogeneous Networks
Every entity $e \in E$ may have some additional metainformation, which needs to
be defined with respect to the application of the knowledge graph. For instance,
there may be several node sets (some ontologies, some actors (employees,
stakeholders, …), locations, …). The same holds for $R$ when several context
relations come together, such as “is relative of”, “has business affiliation”,
“has visited”, etc.

Knowledge graphs are heterogeneous networks: they have different node types and relationships.

How can we represent heterogeneous networks? The answer is a network schema.



Heterogeneous Networks

This particular instance of a knowledge graph can be represented using a
network schema. A schema is essentially a definition of what a particular
knowledge graph could look like.
Representing Knowledge Graph
A network schema is a meta-level description of a (heterogeneous) network.

Example
Bibliographic networks have the schema:

Papers are written by authors.
Papers have terms.
Papers are published at venues.
Papers cite other papers.
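One way to read the four schema statements above is as a set of allowed (source type, relation, target type) combinations that every concrete edge must match. The sketch below assumes that reading; the relation names (`writtenBy`, `hasTerm`, `publishedAt`, `cites`), the node identifiers, and the helper functions are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, relation, target node)

# Network schema: which node types a relation may connect (meta-level description).
SCHEMA: Set[Tuple[str, str, str]] = {
    ("paper", "writtenBy", "author"),
    ("paper", "hasTerm", "term"),
    ("paper", "publishedAt", "venue"),
    ("paper", "cites", "paper"),
}

# A small hypothetical instance of the network.
NODE_TYPE: Dict[str, str] = {
    "P1": "paper", "P2": "paper", "Jim": "author", "VLDB": "venue", "graph": "term",
}
EDGES: List[Edge] = [
    ("P1", "writtenBy", "Jim"),
    ("P1", "publishedAt", "VLDB"),
    ("P1", "hasTerm", "graph"),
    ("P2", "cites", "P1"),
]


def conforms(edge: Edge, node_type: Dict[str, str], schema: Set[Tuple[str, str, str]]) -> bool:
    """An edge conforms if its (source type, relation, target type) is in the schema."""
    src, rel, dst = edge
    return (node_type.get(src), rel, node_type.get(dst)) in schema


def violations(edges: List[Edge], node_type: Dict[str, str],
               schema: Set[Tuple[str, str, str]]) -> List[Edge]:
    return [e for e in edges if not conforms(e, node_type, schema)]


print(violations(EDGES, NODE_TYPE, SCHEMA))
print(conforms(("Jim", "cites", "P1"), NODE_TYPE, SCHEMA))
```

The second call checks an edge the schema does not allow (an author citing a paper), which is the kind of mismatch a schema is meant to rule out.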



Types of Network Schema
Common network schema of heterogeneous networks

Multi-relational network with single-typed object
Examples: Facebook
We have just one type of node, but there are a lot of different types of
edges. In this example, different users are connected to each other via
different types of relationships.

Bipartite network schema
Examples: user-item, document-word
This schema type has two types of nodes. In this example, we have nodes that
represent documents and others that represent words. The schema says documents
contain words. Each document that contains certain words will be connected to them.
Types of Network Schema
Common network schema of heterogeneous networks

Star schema
Examples: bibliographic data, movie data, US patent data
We have one central type of node, and the other node types can be treated as
attributes of this type of node. E.g. a paper can have a venue as an attribute.

Multi-hub network schema
Examples: bioinformatics data
In this schema type we have two or more central types of nodes (hubs; a hub is
a node type with multiple connections), each with its own attributes
(connections to other types of nodes). E.g. genes are connected with other
genes with chemical reactions.
Multi-hub network schema
Example
We have nodes representing users, posts, words, and communities. This schema
says that users can subscribe to communities and posts belong to communities.
Users create posts and can upvote or downvote posts. Posts contain words.

Note that:
There is no self-loop on the “user” node type; this means users are not
directly connected to each other. They can only subscribe to communities,
create posts, etc. Instead, two users are considered connected if they
subscribe to the same community. The two nodes are connected via the common
interaction (meta-path).



Meta-Paths: How to extract relationships between different nodes
Meta-paths are high-level descriptions of a path between two objects. They denote
an existing or concatenated relation between two object types.
They are defined as paths on the network schema (i.e. not on the graph itself).

Figure: two authors are connected by their common paper.

We can consider one path instance where Jim writes paper 1 (P1) and Ann is a
co-author of it. Ann has two different co-authors on two different papers. In
the figure, the meta-path is a sequence of relations: an author writes a paper,
and the paper is written by an author.



Meta-Paths: How to extract relationships between different nodes
A meta-path $P = (R_1, \dots, R_n)$ is a sequence of relations (it denotes a
composition of relations).

Examples:
co-author relation: $A \xrightarrow{Write} P \xrightarrow{Write^{-1}} A$ (short for A-P-A)
citation relation between authors: $A \xrightarrow{Write} P \xrightarrow{Cite} P \xrightarrow{Write^{-1}} A$ (short for A-P-P-A)

Note that:
Relations can be inverted to query in the reverse direction.
E.g., $Write$(Author) gives Paper while $Write^{-1}$(Paper) gives Author.
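The idea that a meta-path is a composition of relations, and that a relation can be inverted, can be sketched with plain dictionaries. This is a minimal illustration under assumed data: Jim, Ann, and P1 come from the lecture's path instance, while Bob, P2, the single citation edge, and the function names `invert` and `follow` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

Relation = Dict[str, Set[str]]  # maps a source node to the set of nodes it relates to


def invert(relation: Relation) -> Relation:
    """Build R^{-1}: if R maps a to b, the inverse maps b to a."""
    inverse: Dict[str, Set[str]] = defaultdict(set)
    for src, targets in relation.items():
        for dst in targets:
            inverse[dst].add(src)
    return dict(inverse)


def follow(meta_path: List[Relation], start: str) -> Set[str]:
    """Apply the relations of a meta-path one after another, starting from one node."""
    frontier = {start}
    for relation in meta_path:
        frontier = {dst for node in frontier for dst in relation.get(node, set())}
    return frontier


WRITE: Relation = {"Jim": {"P1"}, "Ann": {"P1", "P2"}, "Bob": {"P2"}}
CITE: Relation = {"P2": {"P1"}}  # hypothetical: P2 cites P1
WRITE_INV = invert(WRITE)

# A-P-A: Write followed by Write^{-1}; drop the start node to get co-authors.
print(sorted(follow([WRITE, WRITE_INV], "Jim") - {"Jim"}))
# A-P-P-A: authors cited by Bob's papers.
print(sorted(follow([WRITE, CITE, WRITE_INV], "Bob")))
```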



Meta-Paths: Examples
Different meta-paths can be derived from the network. Each meta-path has a
different meaning. The beauty of meta-paths is that, by creating different
meta-paths, you can find relations between nodes that are very far apart in
the network but have a very useful meaning.

Note that:
What we are doing is trying to find relationships between end nodes. In the
figure, the question is: what is the relationship between two authors? (They
wrote the same paper.)
Meta-Paths: Composing Meta-paths
Given a network schema and an instantiation of the network, meta-paths can be
composed. The composed meta-path can have a different meaning than the
individual meta-paths.
Meta-path 1: P-A-P (Papers by the same author)
Meta-path 2: P-V-P-A (Authors of papers published at the same venue)

Note that:
The composed path finds papers published by similar experts in the same
area. In the figure, the two authors work in the same area because they
published at the same venue. This meta-path returns papers that are written by
authors who work in the same area.
Meta-Paths: Composing Meta-paths - Example

Note that:
Different meta-paths give different results. They can be generated using domain
knowledge or expertise. In general, given a network schema, you can answer
interesting questions like the one in the example even without doing any
computer vision (just from the knowledge graph).
Important KG Tasks
How can we reason on knowledge graphs using meta-paths?

Relationship prediction
How to find missing or hidden relations between entities.
Knowledge graphs are rarely complete. There are:
• missing entities
• missing labels
• missing relationships

There are two meta-path-based random walk techniques for knowledge graph
completion:
• Path Ranking Algorithm
• PathPredict: an extension of Path Ranking Algorithm

Similarity search
How to find similar objects in the network.
• Given: A directed heterogeneous graph G and a seed node s
• Output: A ranked list of nodes that are most similar to s

Solution: Calculate pairwise similarity metrics between s and all relevant
target nodes T. Train a classifier based on these metrics.



Important KG Tasks: Relationship Prediction
Given:
A directed heterogeneous graph $G$
A starting node $s$
Query edge type $R$

Output:
Nodes $t$ which should have edge $R$ from $s$



Relationship Prediction: Random walk-based Inference
Idea:
A random walk following a particular meta-path sequence can be indicative of
relations.

Intuition:
How many times does a random walk starting from “start” reach the “target”
following the given meta-path?
$Prob(Lennon \to Guitar \mid coworker, playsInstrument)$



Relationship Prediction: Path Ranking Algorithms – Calculating Probability
Given:
Start = Lennon
Target = Guitar
Relation $R = playsInstrument$

All meta-paths from start to target:
coworker, playsInstrument
albumArtist, hasInstrument

Calculate probabilities of all meta-paths:
$Prob(Lennon \to Guitar \mid coworker, playsInstrument)$
$Prob(Lennon \to Guitar \mid albumArtist, hasInstrument)$

A PRA model scores a source-target pair by a linear function of their path
probabilities:
$score(s, t) = \sum_{P \in \mathcal{P}} \theta_P \, Prob(s \to t \mid P)$

where $\mathcal{P}$ is the set of all relation paths with length $\le L$ ($L$ is given),
$Prob(s \to t \mid P)$ is the probability of reaching $t$ from $s$ by following
meta-path $P$, and $\theta_P$ is the importance or weight of meta-path $P$
(learned during training).
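The linear scoring step can be sketched in a few lines of Python. The path probabilities 2/3 and 2/4 are the values quoted in the lecture's Lennon/Guitar worked example; the weights are made-up placeholders (in PRA they would be learned during training), and `pra_score` is a hypothetical helper name.

```python
from fractions import Fraction
from typing import Dict, Tuple

MetaPath = Tuple[str, ...]

# Path probabilities Prob(s -> t | P) for the pair (Lennon, Guitar).
PATH_PROBS: Dict[MetaPath, Fraction] = {
    ("coworker", "playsInstrument"): Fraction(2, 3),
    ("albumArtist", "hasInstrument"): Fraction(2, 4),
}

# Placeholder weights theta_P; assumed values, not learned from any data.
WEIGHTS: Dict[MetaPath, Fraction] = {
    ("coworker", "playsInstrument"): Fraction(3, 5),
    ("albumArtist", "hasInstrument"): Fraction(2, 5),
}


def pra_score(path_probs: Dict[MetaPath, Fraction],
              weights: Dict[MetaPath, Fraction]) -> Fraction:
    """Linear combination: sum over meta-paths of weight times path probability."""
    return sum((weights[p] * prob for p, prob in path_probs.items()), Fraction(0))


print(pra_score(PATH_PROBS, WEIGHTS))
```

Candidate targets for the same start node can then be ranked by this score.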
For meta-path: coworker, playsInstrument
• 2/3 instances of this meta-path lead to Guitar
For meta-path: albumArtist, hasInstrument
• 2/4 instances of this meta-path lead to Guitar
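The fractions above can be reproduced with a small random-walk computation that spreads probability mass uniformly over the outgoing edges of each relation in the meta-path. The edges below are a hypothetical toy graph chosen only so that the results match the 2/3 and 2/4 quoted above; they are not read from the lecture's figure, and the album identifiers and the `path_probability` function are made up for this sketch.

```python
from collections import defaultdict
from fractions import Fraction
from typing import Dict, List

# edges[relation][node] -> neighbours reachable via that relation (toy data).
EDGES: Dict[str, Dict[str, List[str]]] = {
    "coworker": {"Lennon": ["McCartney", "Harrison", "Starr"]},
    "playsInstrument": {
        "McCartney": ["Guitar"], "Harrison": ["Guitar"], "Starr": ["Drums"],
    },
    "albumArtist": {"Lennon": ["album_1", "album_2", "album_3", "album_4"]},
    "hasInstrument": {
        "album_1": ["Guitar"], "album_2": ["Guitar"],
        "album_3": ["Piano"], "album_4": ["Drums"],
    },
}


def path_probability(edges: Dict[str, Dict[str, List[str]]], start: str,
                     target: str, meta_path: List[str]) -> Fraction:
    """Probability that a walk from `start` following `meta_path` ends at `target`."""
    dist: Dict[str, Fraction] = {start: Fraction(1)}
    for relation in meta_path:
        nxt: Dict[str, Fraction] = defaultdict(Fraction)
        for node, mass in dist.items():
            neighbours = edges.get(relation, {}).get(node, [])
            for nb in neighbours:
                # Choose uniformly among the neighbours under this relation.
                nxt[nb] += mass / len(neighbours)
        dist = nxt
    return dist.get(target, Fraction(0))


print(path_probability(EDGES, "Lennon", "Guitar", ["coworker", "playsInstrument"]))
print(path_probability(EDGES, "Lennon", "Guitar", ["albumArtist", "hasInstrument"]))
```

In this toy graph every intermediate node has exactly one outgoing edge in the second step, so the random-walk probability coincides with the fraction of path instances that reach Guitar; with uneven branching the two quantities can differ.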
