
Department of Mathematics

FinDKG: Dynamic Knowledge Graph with Large Language Models for Global Finance

Xiaohui Victor Li
CID: 01234567

Supervised by Dr Francesco Sanna Passino

October 20, 2023

Submitted in partial fulfilment of the requirements for the
MSc in Machine Learning and Data Science of Imperial College London

Imperial College London, [email protected]

Electronic copy available at: https://ssrn.com/abstract=4608445



Abstract
Navigating the intersection of graph-based machine learning, large language models,
and finance, this study pioneers the application of dynamic Knowledge Graphs (KGs) in
modelling global financial systems. We introduce three primary innovations: KGTrans-
former, a deep learning architecture for dynamic KG learning; Integrated Contextual
Knowledge Graph Generator (ICKG), a 7-billion parameter large language model fine-
tuned to specialize in KG construction; and FinDKG, an open-source system that lever-
ages dynamic KGs for financial analytics. These advances harness the dynamic nature
of KGs for time-aware analytics, employ cutting-edge language models for streamlined
graph construction, and provide actionable insights into broad financial applications, in
areas such as risk management, thematic investing, and economic forecasting.
We empirically demonstrate the performance of KGTransformer in temporal graph
analytics and the utility of ICKG in creating knowledge graphs. The study answers
key questions about the practical value of the FinDKG system in the fields of global
macroeconomics and investment. Bridging theory and practice, we have made the ICKG
and FinDKG platforms publicly accessible, aiming to facilitate further interdisciplinary
research that integrates graph-based artificial intelligence into economics and finance.
The FinDKG platform can be accessed at https://xiaohui-victor-li.github.io/FinDKG/.


Contents
1. Introduction 1

2. Basics of Dynamic Knowledge Graphs and Large Language Models 5


2.1. Foundations of Knowledge Graphs . . . . . . . . . . . . . . . . . . . . . . 5
2.2. Dynamic Knowledge Graph Learning . . . . . . . . . . . . . . . . . . . . . 6
2.3. Knowledge Graphs and Generative Large Language Models . . . . . . . . 7

3. Real World Dynamic Knowledge Graph Data 9


3.1. Benchmark Knowledge Graph Datasets . . . . . . . . . . . . . . . . . . . 9
3.2. Financial Dynamic Knowledge Graph (FinDKG) Dataset . . . . . . . . . 10
3.2.1. Global Deep-History Financial News Curation . . . . . . . . . . . 10
3.2.2. An Open-Source, Temporally-Resolved Financial Dynamic Knowledge Graph . . . 11

4. Generative Knowledge Graph Construction with Fine-tuned LLM 14


4.1. Fine-tuning Large Language Models . . . . . . . . . . . . . . . . . . . . . 15
4.1.1. Large Language Models as Knowledge Graph Generators . . . . . 15
4.1.2. Supervised Fine-tuning for Knowledge Graph Construction . . . . 16
4.1.3. Open-source Knowledge Graph Generator LLM . . . . . . . . . . . 17
4.2. Generative Knowledge Graph Construction Pipeline . . . . . . . . . . . . 19

5. Dynamic Knowledge Graph Learning with KGTransformer 21


5.1. Knowledge Graph Transformer . . . . . . . . . . . . . . . . . . . . . . . . 21
5.1.1. Learning Knowledge Graph Embeddings . . . . . . . . . . . . . . . 22
5.1.2. From Multi-Relation to Meta-Relation . . . . . . . . . . . . . . . . 22
5.1.3. KGTransformer Architecture . . . . . . . . . . . . . . . . . . . . . 23
5.1.4. Time-Evolving Update with Recurrent Neural Networks . . . . . . 28
5.2. Probabilistic Framework for Temporal Knowledge Graph Learning . . . . 29
5.2.1. Modeling Event Time Using Temporal Point Processes . . . . . . . 30
5.2.2. Modeling Network Structure Through Conditional Event Density . 31
5.2.3. Parameter Learning and Inference . . . . . . . . . . . . . . . . . . 32

6. Experiments 36
6.1. Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.2. Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.3. Performance Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.3.1. Results on Benchmark Temporal Knowledge Graphs . . . . . . . . 38
6.3.2. Results on Financial Dynamic Knowledge Graph . . . . . . . . . . 38


6.4. Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

7. Applications of Financial Dynamic Knowledge Graph 41


7.1. Dynamic Financial Risk Tracking with FinDKG . . . . . . . . . . . . . . . 42
7.2. FinDKG for Thematic Investing . . . . . . . . . . . . . . . . . . . . . . . 45
7.3. An Open-source FinDKG Portal for Global Financial Systems . . . . . . . 47

8. Conclusion 49

A. Financial Dynamic Knowledge Graph (FinDKG) Dataset A1

B. ICKG Instruction-following Large Language Model for Knowledge Graph Construction A3

C. Dynamic Knowledge Graph Learning A4


C.1. Recurrent Neural Network (RNN) . . . . . . . . . . . . . . . . . . . . . . A4
C.2. Multilayer Perceptron (MLP) . . . . . . . . . . . . . . . . . . . . . . . . . A4


1. Introduction
“It can also be maintained that it is best to provide the machine with the
best sense organs that money can buy, and then teach it to understand and
speak English. This process could follow the normal teaching of a child.
Things would be pointed out and named, etc.” (Turing, 1950)

Amid the transformative shift of modern Artificial Intelligence (AI), the significance of
learning algorithms continues to be substantiated in both academia and industry, echoing
Turing’s prescient vision of constructing intelligent machines capable of learning (Turing,
1950). A notable advancement in this direction is Knowledge Graphs (KGs), graph-based
data structures that enable machines not merely to store, but also to comprehend and
contextualize structured knowledge (Nickel et al., 2015).
The rise of KGs has catalyzed the establishment of Knowledge Graph Learning, which
aims to augment machine learning models with the ability to interpret and reason over
these intricate networks (Ji et al., 2021). By encoding information as a network of
interconnected entities and relations, KGs function as powerful “sense organs” for AI
systems, enabling them to assimilate a wide spectrum of human knowledge. This
capability unlocks far-reaching application potential, situating KGs at the core of this
study.

Shift into Dynamic Knowledge Graphs


Opening with the dictum “things, not strings”, the advent of modern KGs in 2012 by
Google marked a turning point from keyword-based search to semantic, context-aware
understanding (Singhal et al., 2012). This milestone represented more than just a tech-
nological upgrade; it signalled a conceptual leap that highlighted the interconnections
between entities (Dong et al., 2014). Subsequently, the development of KGs greatly
advanced and expanded into various sectors, including recommendation systems (Wang
et al., 2018, 2019a), question-answering systems (Bordes et al., 2015), and beyond.
Real-world KGs, however, are not static — they evolve. New information arises, while
outdated facts become obsolete. To make static KGs adaptive, the concept of Temporal
Knowledge Graphs (TKGs) emerged, integrating time dimensions to capture evolving
relationships (Leblay and Chekol, 2018). Figure 1.1 illustrates the time-dependent
dynamics intrinsic to TKGs. In TKGs, each edge is associated with a time stamp to

The terms “Temporal Knowledge Graph (TKG)” and “Dynamic Knowledge Graph (DKG)” are used
interchangeably in practice. Throughout this study, the term “TKG” is mainly used in the modelling
context for consistency with the existing academic literature.


represent the period for which a relationship between two entities is valid. This tem-
poral tagging enables fine-grained temporal inference and facilitates the recognition of
dynamic patterns and trends (Cai et al., 2022).
While static KGs received substantial research attention and industry adoption (Ji
et al., 2021), the study of dynamic KGs remains an emerging frontier (Wang et al., 2023).
This research emphasizes the significance of dynamic KGs and introduces an extended
TKG learning model architecture — KGTransformer — that outperforms static KG
models in graph analytics when applied to real-world KG datasets.

Knowledge Graph Construction with Large Language Models


Knowledge Graph Construction (KGC) is an active research domain aimed at build-
ing KGs from expansive text resources, which has long been a labour-intensive process.
Traditional KGC relies on a step-by-step pipeline of specialized Natural Language Pro-
cessing (NLP) tasks such as Named Entity Recognition (NER), Entity Linking (EL), and
Relation Extraction (RE) (Ye et al., 2022). Though effective, these pipelines are inflexible
and prone to error accumulation, often demanding dedicated teams and extensive
manual annotation for implementation and maintenance (Pan et al., 2023).
The recent advent of Large Language Models (LLMs), trained on extensive data and
featuring vast parameter sets, has significantly pushed forward NLP
performance (Zhao et al., 2023). Language models ranging from the embedding-based BERT
(Devlin et al., 2018) to the latest generative GPT series (Brown et al., 2020; OpenAI,
2023), have fueled the development of generative KGC techniques, able to extract knowl-
edge triplets in an automated sequence-to-sequence manner (Xu et al., 2023).
Nevertheless, the adoption of advanced LLMs like GPT-4 in KGC tasks remains limited,
largely due to their restricted availability (OpenAI, 2023). Current KGC methods
frequently utilize more accessible but dated BERT-based models (Zhang et al.,
2022). This study makes a novel contribution by introducing a 7-billion parameter LLM,
termed Integrated Contextual Knowledge Graph Generator (ICKG), that is specifically
fine-tuned for generative KGC. The ICKG model is publicly available on the HuggingFace
platform for non-commercial research. This pioneering KGC LLM aims to contribute
to the burgeoning field of generative KGC techniques utilizing LLMs.

Knowledge Graph Applications in Finance


Financial texts, including news articles, serve as a rich information source in economics
and finance, acting as indirect indicators of macroeconomic trends and market dynam-
ics (Gentzkow et al., 2019). Although the importance of extracting insights from these
texts has been acknowledged (Chen et al., 1986), traditional NLP techniques like senti-
ment analysis and topic modelling often fall short in fully capturing the interconnections
between economic variables (Tetlock, 2007).
Access the ICKG LLM at: https://huggingface.co/victorlxh/ICKG-v2.0

Financial Knowledge Graphs (FinKGs) offer a promising solution for extracting the
quantitative insights embedded in financial texts. By organizing real-world financial
entities and their relationships in an interconnected framework, FinKGs can infer both
explicit and implicit knowledge, thus enabling systematic reasoning over global financial
systems. Compared to the well-established field of financial NLP, however, the FinKG
area remains relatively underdeveloped. Early FinKG applications from industry are
closed-source and built on static KG models (Cheng et al., 2020; Fu et al., 2018), which
fail to capture the rapidly evolving dynamics of financial markets.
Recognizing these gaps and limitations, this study introduces the Financial Dynamic
Knowledge Graph (FinDKG), an open-source dynamic knowledge graph system. Built
from global financial news and leveraging the fine-tuned ICKG model, FinDKG powers
an analytics platform available through an online portal. This platform delivers
real-time, structured insights into financial systems, powered by our FinDKG and
KGTransformer models. Case studies in Chapter 7 demonstrate how
FinDKG can enhance risk management, thematic investing, and economic forecasting,
marking the first open-source, real-world KG in the financial domain and heralding
new interdisciplinary opportunities in graph AI for economics and finance research.

Research Questions and Contributions


This research investigates several fundamental questions that address the effectiveness
and applicability of Temporal Knowledge Graphs (TKGs), particularly within the finan-
cial sector:

1. How effective are open-source Large Language Models (LLMs), when fine-tuned,
in generating Knowledge Graphs?
2. How does the newly-developed KGTransformer model perform in temporal knowl-
edge graph learning tasks compared to existing graph learning models?
3. What additional value does the new Financial Dynamic Knowledge Graph
(FinDKG) offer to the existing economics and finance domains when built from
financial news data?

The primary contributions of this study are multi-faceted, extending from theoretical
advancements in TKG learning to practical applications in Knowledge Graph Construc-
tion (KGC) with LLMs and the final FinDKG portal for global financial systems:

Generative Knowledge Graph Construction: The study presents an innovative,
efficient, and scalable generative pipeline for constructing KGs. It specifically
introduces a 7-billion parameter fine-tuned LLM, termed Integrated Contextual
Knowledge Graph Generator (ICKG), specializing in the KGC task. The ICKG
model is made publicly accessible.

Dynamic Knowledge Graph Learning: A specialized machine learning model
named KGTransformer is developed for TKGs, explicitly designed to address the
complexities and dynamics of evolving TKGs.
Visit the FinDKG portal at https://xiaohui-victor-li.github.io/FinDKG/


Financial Applications: The FinDKG system is introduced as a core application
contribution, demonstrating the utility of dynamic KGs in real-world financial
scenarios, such as risk management and thematic investing. These features are
showcased via a self-developed online analytics portal.

Open-source Contributions: The FinDKG dataset and model implementation code
are publicly released to catalyze further academic and industrial research in graph
AI within economics and finance.

Structure
The rest of this report is structured as follows: Chapter 2 provides an in-depth review of
Knowledge Graphs, including their construction, model architecture and uses. Chapter
3 presents existing Temporal Knowledge Graph datasets and introduces our Financial
Dynamic Knowledge Graph (FinDKG). In Chapter 4, we discuss the role of Large Lan-
guage Models in automating Knowledge Graph Construction, specifically detailing our
Integrated Contextual Knowledge Graph Generator (ICKG) model. Chapter 5 focuses
on Temporal Knowledge Graph learning and the architecture and framework of our
KGTransformer model, while Chapter 6 presents comprehensive experimental results
evaluated on TKG datasets. Chapter 7 explores the practical financial applications of
our developed FinDKG system. Chapter 8 concludes the paper and discusses limitations
and future research directions.

Figure 1.1. Illustration of temporal dynamics in a Temporal Knowledge Graph (TKG).
The temporal edges of the TKG evolve with time.


2. Basics of Dynamic Knowledge Graphs and Large Language Models
This chapter lays the groundwork for knowledge graphs (KGs) and the intersection
potential between KGs and large language models (LLMs). We touch upon the essentials
of KGs, introduce the concept of dynamic KGs, and explore the synergy between KGs
and LLMs.

2.1. Foundations of Knowledge Graphs


KGs have emerged as an effective graph data structure for representing structured
knowledge, serving as a foundation for Artificial Intelligence (AI) systems and their
knowledge representation. A knowledge graph is essentially a data structure that
enables the storage and encoding of information as entities, along with the relationships
that connect these entities (Ji et al., 2021).

Definition 2.1.1. Knowledge Graph: A knowledge graph is represented as G =
{E, R, F}, where E, R, and F signify the sets of entities, relations, and facts, respectively
(Nickel et al., 2015). The elements in the set E are discrete units of knowledge, or
entities. The set R comprises various relations, heterogeneous and of varying type,
representing the connections between the entities. Lastly, F contains the factual
information that bridges entities and relations, typically represented as triples.

The triplet (s, r, o) ∈ F is the core building block of the KG G, where s stands for the
source entity, r for the relation, and o for the object entity (Nickel et al., 2015). For
example, the triplet (OpenAI, Invent, ChatGPT) illustrates how entities and relations
are synthesized to form a fact: OpenAI and ChatGPT are entities, while Invent is the
relation. Aggregated into a comprehensive KG, such fact triplets yield a collective,
structured representation of knowledge that can be interpreted globally.
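Definition 2.1.1 can be made concrete with a minimal sketch. The (OpenAI, Invent, ChatGPT) triple comes from the example above; the second triple is a hypothetical illustration added here.

```python
from dataclasses import dataclass

# A minimal encoding of Definition 2.1.1: a KG as G = {E, R, F},
# where facts are (source, relation, object) triples.
@dataclass(frozen=True)  # frozen makes instances hashable, so they can live in a set
class Triple:
    s: str  # source entity
    r: str  # relation
    o: str  # object entity

facts = {
    Triple("OpenAI", "Invent", "ChatGPT"),   # example from the text
    Triple("OpenAI", "Develop", "GPT-4"),    # hypothetical second fact
}

# The entity set E and relation set R are recovered from the fact set F.
entities = {t.s for t in facts} | {t.o for t in facts}
relations = {t.r for t in facts}
```

This mirrors the definition directly: E and R are induced by the facts in F, rather than stored separately.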
Knowledge Graph Learning aims to represent such knowledge by converting KGs into
numerical vectors, hence embedding structural information — particularly the interre-
lationships between entities — within a continuous vector space (Bordes et al., 2013).
This vector representation enhances the capacity for richer interpretation of graph
structures by accommodating heterogeneous types of entity relationships, marking a more
expressive approach than traditional graph embeddings with a single relation type.


2.2. Dynamic Knowledge Graph Learning


Temporal Knowledge Graphs (TKGs) extend the conventional, static KGs by incorpo-
rating a temporal dimension. This additional time layer enables TKGs to model not
only the multifaceted relationships between entities but also the temporal dynamics
of these relations. Understanding these temporal properties is vital for predictive and
interpretive applications. For example, a TKG could record a time-stamped event as
(OpenAI, Invent, ChatGPT, 2022-11-30), thus capturing not only the fact that OpenAI
created ChatGPT, but also precisely when. Such temporally enriched data contribute
to deeper insights and more robust predictive models. This study specifically addresses
the TKG learning problem.
Notations and Definitions. Following the prior works (Jin et al., 2019; Park et al.,
2022), we establish the notation and mathematical framework for TKG modelling. Let a
TKG G0:t be a directed, multi-relational graph, where each edge, representing an event,
is tagged with a discrete time stamp. Each temporal edge is described as a 4-tuple
(s, r, o, t), where s signifies the subject entity, r the relation type of the edge, o the object
entity, and t the discrete time stamp. The graph G0:t can be viewed as the union of its
time-sliced subgraphs Gτ, formally represented as G0:t = ⋃_{τ=0}^{t} Gτ.

Definition 2.2.1. Temporal Knowledge Graph (TKG): A TKG, denoted as Gt, is
an extension of a standard knowledge graph in which each event or fact is associated
with a time stamp t (Jin et al., 2019). The validity of a fact in Gt may be time-specific,
and new facts may be introduced as time progresses.

To capture the temporal evolution of the TKG, let N (t) denote the cumulative number
of edges up to time t. Each edge is assigned an ordering index i. Hence, G0:t can be
mathematically described as:
G0:t = {(si, ri, oi, ti)}_{i=1}^{N(t)}, subject to 0 ≤ t1 ≤ t2 ≤ … ≤ t_{N(t)} = t

The symbols used in this research for TKG modelling are listed in Table 2.1.
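The 4-tuple representation and its decomposition into time-sliced subgraphs can be sketched as follows; entity and relation names beyond the running ChatGPT example are hypothetical.

```python
from collections import defaultdict

# A TKG G_{0:t} as a chronologically ordered list of time-stamped edges (s, r, o, t).
events = [
    ("OpenAI", "Invent", "ChatGPT", 0),
    ("Microsoft", "Invest_In", "OpenAI", 1),   # hypothetical edge
    ("OpenAI", "Release", "GPT-4", 2),         # hypothetical edge
]

def time_slices(events):
    """Group edges by time stamp, so that G_{0:t} is the union of G_tau for tau = 0..t."""
    slices = defaultdict(list)
    for s, r, o, t in events:
        slices[t].append((s, r, o))
    return dict(slices)

G = time_slices(events)
# The union of all slices recovers the full graph G_{0:t}.
assert sum(len(edges) for edges in G.values()) == len(events)
```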
Problem Formulation. In the context of graph machine learning, the objective of
TKG learning is to approximate the underlying distribution of TKGs. This typically
involves data-driven training of neural networks, designed to model both the structure
and the temporal dynamics of the TKG over time.
Temporal Link Prediction. Recent advancements in TKG learning methodologies
concentrate predominantly on solving two core problem setups: TKG interpolation and
extrapolation. Interpolation-based methods fill in missing facts for given timeframes
t; (0  t  T ) (Leblay and Chekol (2018)), whereas extrapolation-based methods predict
future facts beyond the known time horizon (Jin et al. (2019)). The extrapolation
problem setup, despite its challenges, is intriguing due to its potential to forecast future
events, thereby holding immense practical value for real-world dynamic KG applications.
The focus of this study is the extrapolation task, particularly on the future link predic-
tion. Given a subject entity s, a relation r, and an unknown object entity at some future
time t, the objective is to predict this unknown object entity, expressed as (s, r, ?, t).


This prediction task naturally aligns with a ranking framework. Formally, given the
historical graph G<t , we aim to rank potential future links (s, r, ?, t) based on their
likelihood. The ranking function f can be mathematically framed as:

f : (s, r, O, t, G<t) → ℝ

where O represents the set of all possible object entities, and f assigns a real-valued
likelihood score to each candidate link. The higher the score, the more likely the link.
Machine learning TKG models, especially deep learning architectures, are trained to
approximate this ranking distribution function f . These models take in the historical
TKG G<t as input to learn both the structural and temporal patterns prevalent in
the data. After training, the model can forecast future links at times t′ > T by
appropriately ranking the set of potential object entities for any query (s, r, ?, t′).
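The ranking formulation can be illustrated with a small sketch. The scoring function below is a toy co-occurrence heuristic standing in for a trained model such as KGTransformer, and the entity names beyond the ChatGPT example are hypothetical; it only demonstrates the interface f : (s, r, O, t, G<t) → ℝ, not the actual method.

```python
def score(s, r, o, history):
    """Toy likelihood: how often the triple (s, r, o) appeared in the historical graph."""
    return sum(1 for (hs, hr, ho, _) in history if (hs, hr, ho) == (s, r, o))

def rank_candidates(s, r, candidates, history):
    """Rank all candidate objects for the query (s, r, ?, t) by descending score."""
    return sorted(candidates, key=lambda o: score(s, r, o, history), reverse=True)

# Historical TKG G_{<t} as time-stamped quadruples.
history = [
    ("OpenAI", "Invent", "ChatGPT", 1),
    ("OpenAI", "Invent", "ChatGPT", 2),
    ("OpenAI", "Invent", "DALL-E", 3),
]
ranked = rank_candidates("OpenAI", "Invent", ["DALL-E", "ChatGPT", "Gemini"], history)
# ChatGPT recurs most often in the history, so this toy scorer ranks it first.
```

A real model would replace `score` with a learned function of entity, relation, and temporal embeddings, while the ranking and evaluation machinery stays the same.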

2.3. Knowledge Graphs and Generative Large Language


Models
The evolution of Large Language Models (LLMs) has extended beyond conventional NLP
tasks to complex problem-solving capabilities (Zhao et al., 2023). Instruction-following
LLMs, which have undergone additional fine-tuning using specialized human feedback
datasets (Ouyang et al., 2022), are adept at aligning with user intents and executing
complex tasks through conversational user prompts.
The integration of LLMs and KGs has recently emerged as a spotlight of interdisciplinary
research. This fast-moving synthesis not only amplifies the capabilities of each individual
system but also creates a unified framework that substantially improves knowledge
representation and reasoning (Pan et al., 2023). Specifically, KGs can enrich LLMs during
both pre-training and inference phases by injecting structured external knowledge,
thereby enhancing the model’s interpretability. Conversely, LLMs can significantly
advance KG-related operations, ranging from KG embedding and KG construction to
KG question answering. This synergized framework hence leverages the complementary
strengths of both LLMs and KGs to achieve superior performance in a variety of
applications (Ye et al., 2022).
Generative Knowledge Graph Construction (KGC) represents a promising area where
LLMs can contribute significantly to KG-relevant studies. Unlike the traditional KGC
process, which relies on a multi-step NLP pipeline, LLM-facilitated generative KGC
employs an end-to-end approach, translating unstructured text directly into structured
knowledge triples (Ye et al., 2022). This eliminates the need for intermediate processes
like Named Entity Recognition (NER), thereby minimizing error accumulation and in-
creasing system adaptability (Xu et al., 2023).
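The consumer side of such an end-to-end pipeline can be sketched as below. The output format is a hypothetical assumption (one "(subject, relation, object)" per line); the actual prompt and response format used by ICKG is described in Chapter 4.

```python
import re

# Hypothetical model response: a generative KGC model emits one triple per line.
model_output = """\
(OpenAI, Invent, ChatGPT)
(Microsoft, Invest_In, OpenAI)
not a triple line
"""

# Matches a parenthesized, comma-separated triple on a single line.
TRIPLE_RE = re.compile(r"^\(\s*([^,]+?)\s*,\s*([^,]+?)\s*,\s*([^,)]+?)\s*\)$")

def parse_triples(text):
    """Extract (s, r, o) triples from model output, skipping malformed lines."""
    triples = []
    for line in text.splitlines():
        m = TRIPLE_RE.match(line.strip())
        if m:
            triples.append(m.groups())
    return triples
```

Skipping malformed lines (rather than failing on them) is one simple way to keep the pipeline robust to occasional free-form text in generative output.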
Though the capabilities of the most advanced recent LLMs are promising, their pro-
prietary black-box nature hampers widespread research adoption. Community-driven,
open-source alternatives like LLaMA (Touvron et al., 2023), Stanford’s Alpaca, and
LMSYS’s Vicuna offer an alternative pathway (Zheng et al., 2023). These open research


platforms serve as the foundation for this study, which focuses on advancing genera-
tive KGC using open-source LLMs. Further insights into the fine-tuning of open-source
LLMs for generative KGC are discussed in Chapter 4.

Symbol        Description
G             Knowledge Graph modelled as a multi-relational, directed graph
G[0:t]        Temporal Knowledge Graph composed of events up to time t
Gt            Snapshot of the Temporal Knowledge Graph at specific time t
G<t           Subset of the Temporal Knowledge Graph containing events prior to time t
p(G[0:t])     Probability distribution of the Temporal Knowledge Graph up to time t
e             Event triplet (s, r, o), comprising subject entity s, relation r, and object entity o
τ             Time elapsed since event e last occurred
p(τ|e, G<t)   Conditional probability density function for the time τ between events
*             Denotes dependency on historical events up to time t
ti, ui        Temporal and structural embeddings for entity i, respectively
t*i, u*i      Updated temporal and structural embeddings for entity i after events up to time t
tr, ur        Temporal and structural embeddings for relation r, respectively
t*r, u*r      Updated temporal and structural embeddings for relation r after events up to time t
g*            Global embedding of the Temporal Knowledge Graph up to time t, calculated as
              the maximum of all updated entity structural embeddings

Table 2.1. Summary of notations for Temporal Knowledge Graph variables.


3. Real World Dynamic Knowledge Graph Data
This chapter provides a comprehensive overview of Temporal Knowledge Graph (TKG)
datasets, encompassing both widely-recognized benchmark datasets and our innovative
Financial Dynamic Knowledge Graph (FinDKG). We highlight the novel FinDKG
dataset, an endeavour aimed at democratizing access to high-quality financial dynamic
knowledge graphs. This online-hosted dataset (Li, 2023) not only enriches the existing
benchmark datasets of TKGs but also sets the stage for advancing graph AI applications
in the financial domain, a subject explored in the chapters that follow.

3.1. Benchmark Knowledge Graph Datasets


To assess the performance of knowledge graph learning models, we employ five
real-world TKGs that have garnered attention in recent research, as shown in
Table 3.1. These dynamic multi-relational graph datasets can be conceptually grouped
into two categories: (i) event-driven TKGs, and (ii) TKGs that feature temporally-
bounded facts.
Event-Driven TKGs. This category comprises the Integrated Crisis Early Warning
System datasets from the years 2014 (ICEWS14; Trivedi et al., 2017) and 2018
(ICEWS18; Boschee et al., 2015), as well as the Global Database of Events, Language,
and Tone (GDELT; Leetaru and Schrodt, 2013). Specifically, ICEWS18 encompasses
data collected between January 1, 2018, and October 31, 2018, while ICEWS14 spans
January 1, 2014, to December 31, 2014. GDELT is confined to the temporal scope
from January 1, 2018, to January 31, 2018.
Temporally Bounded Fact TKGs. This class consists of the WIKI dataset (Leblay
and Chekol, 2018) and the YAGO dataset (Mahdisoltani et al., 2013). These datasets
contain temporally annotated facts in the form (s, r, o, [ts, te]), where ts and te denote
the initiation and termination time points, respectively. A preprocessing step is applied
to these datasets, converting each temporally annotated fact into a sequence of events
{(s, r, o, ts), (s, r, o, ts + Δt), …, (s, r, o, te)}, where Δt signifies a unit temporal increment.
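This preprocessing step can be sketched directly, with `dt` as the unit temporal increment:

```python
def expand_fact(s, r, o, ts, te, dt=1):
    """Convert a temporally bounded fact (s, r, o, [ts, te]) into the event
    sequence {(s, r, o, ts), (s, r, o, ts + dt), ..., (s, r, o, te)}."""
    events = []
    t = ts
    while t <= te:
        events.append((s, r, o, t))
        t += dt
    return events
```

For instance, a fact valid over three consecutive yearly time stamps expands into three separate events, one per time step.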
A key feature distinguishing these two categories is the temporal behaviour of the facts.
In event-driven TKGs, facts recur multiple times and may even manifest in a periodic
fashion. In contrast, in temporally bounded fact TKGs, facts tend to endure more
consistently over extended periods but seldom recur once terminated.


Dataset    Ntrain    Nval     Ntest    #Entities  #Relations  Temporal Interval
YAGO       161,540   19,523   20,026   10,623     10          1 year
WIKI       539,286   67,538   63,110   12,554     24          1 year
GDELT      17,691    45,995   12,554   305,241    240         15 minutes
ICEWS14    275,367   48,528   341,409  12,498     260         24 hours
ICEWS18    373,018   45,995   49,545   23,033     256         24 hours

Table 3.1. Descriptive statistics of the real-world TKGs employed for evaluation. Tem-
poral interval refers to the minimum temporal spacing between two adjacent
events.

3.2. Financial Dynamic Knowledge Graph (FinDKG) Dataset
While existing benchmark knowledge graphs often focus on general Internet knowledge,
industry-specific knowledge graphs are relatively scarce. This is particularly true for
the financial sector, where the cost of curating accurate, reliable, and timely data is
prohibitive. To bridge this gap, our study takes on the ambitious task of constructing a
global financial knowledge graph (FinDKG) dataset from scratch.

3.2.1. Global Deep-History Financial News Curation


Financial news provides a treasure trove of real-time indicators, reflecting the ever-
evolving landscape of global economics and markets. These indicators extend beyond
mere quantitative data; they encapsulate a rich tapestry of human sentiments, policy
changes, geopolitical events, and institutional perspectives. This complexity necessitates
a data-driven approach for modeling macroeconomic phenomena and market behaviors,
and herein lies the value of our curated news corpus. This corpus serves as a com-
prehensive, multifaceted, and real-time set of qualitative and quantitative markers that
underpin the FinDKG.
The Wayback Machine, an initiative by the nonprofit organization Internet Archive,
offers a robust mechanism to inject a temporal dimension into our dataset. This digital
archive has cataloged over two decades of the web, providing an unprecedented resource
for historical analysis. Its capability to furnish point-in-time snapshots of web data,
particularly financial news, allows us to reconstruct the evolving landscape of market
and public sentiment over time.
Data Collection and Preprocessing. Utilizing the Wayback Machine, we amassed
approximately 400,000 financial news articles from the Wall Street Journal, spanning
from 1999 to 2023. This comprehensive dataset includes both metadata—such as re-
lease time, headlines, and categories—and the full textual content of the articles. To
maintain focus on the economics and financial sector, we applied human-supervised
filters to exclude articles related to irrelevant subjects like entertainment and book rec-


ommendations. We also pruned opinion columns to ensure that our dataset is rooted in
factual reporting. A modular Python engine was developed to facilitate both the initial
data collection and ongoing updates, designed with the foresight to accommodate future
news articles seamlessly.
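A minimal sketch of this collection step, using the Internet Archive's public CDX search API (the query parameters shown are standard CDX options; the URL prefix and filtering logic are illustrative and simplified relative to the full engine):

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

def build_snapshot_query(url_prefix, year_from, year_to, limit=1000):
    """Build a CDX API query listing archived snapshots of a site,
    restricted to a date range and returned as JSON rows."""
    params = {
        "url": url_prefix,
        "matchType": "prefix",       # all pages under the prefix
        "from": str(year_from),
        "to": str(year_to),
        "output": "json",
        "filter": "statuscode:200",  # successfully archived captures only
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"

# Query for archived article snapshots in the study's date range.
query = build_snapshot_query("wsj.com/articles/", 1999, 2023)
```

Fetching each returned snapshot and parsing headline, category, and body text then yields the article records described above.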

[Figure 3.1: dual-axis time series, Jan 1999 to Jan 2023; left axis: monthly number of financial news articles (0 to 3,500); right axis: monthly number of FinDKG event triplets (0 to 8,000)]

Figure 3.1. Time series of two key coverage metrics: (1) the dark blue line, on the left-hand axis, shows the monthly number of historical financial news articles collected, and (2) the light blue line, on the right-hand axis, shows the number of event triplets captured to construct the Financial Dynamic Knowledge Graph (FinDKG). Construction of FinDKG specifically starts in 2018.

Figure 3.1 illustrates the monthly frequency of our collected financial news, averaging
around 1,400 articles per month. These articles form the raw material for the event triplets
that populate the FinDKG, which we elaborate on in the next section. By synergistically
integrating these historic and real-time data sources, we aspire to construct a
high-touch, temporally aware representation of global economic and market dynamics.

3.2.2. An Open-Source, Temporally-Resolved Financial Dynamic Knowledge Graph

Building upon state-of-the-art Large Language Models (LLMs) for knowledge graph
generation, this study introduces the Financial Dynamic Knowledge Graph (FinDKG),
a daily-resolved, temporal knowledge graph generated from curated global financial news.
To the best of the author’s knowledge, this represents the first endeavor to establish an
open-source financial knowledge graph that can be easily accessed and utilized by the
research community.


As elaborated in Section 2.2.1, a Temporal Knowledge Graph (TKG) comprises event
triplets tagged with specific timestamps. In the case of FinDKG, key triplets from
each news article are extracted corresponding to their release dates. The extraction
is performed using an end-to-end LLM and further processed by systematic entity dis-
ambiguation techniques. Comprehensive methodology details will be provided in the
subsequent Chapter 4. Considering computational constraints, FinDKG is initially built
with data spanning from 2018 to the present, a reasonable timeframe of approximately
five years. Notably, the real-world FinDKG is by design intended to reflect the full
information set from news, thereby including both temporally bounded facts and temporal events.
Figure 3.1 illustrates the monthly volume of event triplets identified, aligning with the
coverage trend of collected financial news articles.
FinDKG Relationship and Entity Schema. One of the features distinguish-
ing KGs from standard graphs is their multifaceted relationships between entity nodes.
In FinDKG, we employ a set of 15 predefined relationships, encapsulating the major
interactions within the global financial news ecosystem. As depicted in Table A.2 in
the Appendix, fundamental relationships like relate to and impact serve as the building
blocks for modeling relevance and causal e↵ects. Furthermore, FinDKG exploits the
Named Entity Recognition (NER) capabilities of the LLM to assign specific entity types
from a predefined set of 12 categories, ranging from influential PERSON to impactful
EVENT. These are detailed in Table A.1 in the Appendix.

Figure 3.2. Snapshot of FinDKG's most influential entities subgraph as of January 1, 2023.

Figure 3.2 presents a snapshot subgraph of FinDKG as of January 2023, highlighting
key entities at the time, ranked by graph centrality metrics. This visualization


demonstrates the graph’s potential utility for investigating contemporary issues, such as
geopolitical tensions between the United States and China, the rising global economic
pressure of high inflation, and the ongoing influence of the COVID-19 pandemic and its
variants.
For TKG model training and evaluation, a customized FinDKG study dataset is cre-
ated, adopting the conventional temporal knowledge graph dataset structure, including
train/validation/test splits organized chronologically. The event triplets are aggregated
at weekly intervals to serve as the basic unit of time. Table 3.2 provides detailed statis-
tics of the dataset, which will be leveraged in Section 6.3 for a comprehensive comparison
of our proposed KGTransformer against existing models.
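The weekly aggregation used for the study dataset can be sketched as follows (a simplified stand-in for the actual preprocessing; the ISO-week keying is an illustrative choice):

```python
from collections import defaultdict
from datetime import date

def aggregate_weekly(events):
    """Group timestamped triplets (s, r, o, date) into weekly buckets,
    keyed by ISO (year, week), so each bucket is one model time step."""
    buckets = defaultdict(list)
    for s, r, o, d in events:
        iso = d.isocalendar()          # (year, week, weekday)
        buckets[(iso[0], iso[1])].append((s, r, o))
    return dict(buckets)

weeks = aggregate_weekly([
    ("Fed", "raise", "Interest Rate", date(2023, 1, 3)),
    ("Fed", "impact", "USD", date(2023, 1, 5)),
    ("ECB", "raise", "Interest Rate", date(2023, 1, 10)),
])
```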

Dataset    Ntrain     Nval     Ntest     #Entities   #Relations   Temporal Interval
FinDKG     119,549    11,444   13,069    13,645      15           1 week

Table 3.2. Descriptive statistics of the FinDKG study dataset for TKG learning model comparison.

The FinDKG dataset serves as the foundation for subsequent analyses and model
development, aimed at harnessing graph-based AI methods for generating actionable
insights in global financial markets. To facilitate further research and ensure repro-
ducibility, the dataset is available for open access via our designated FinDKG website
(Li, 2023).


4. Generative Knowledge Graph Construction with Fine-tuned LLM

In this chapter, we introduce a novel pipeline for generative knowledge graph construction (KGC), as defined below, leveraging the capabilities of Large Language Models (LLMs).

Definition 4.0.1. Generative Knowledge Graph Construction. Generative Knowledge Graph Construction (KGC) refers to employing LLMs to systematically extract entities and relationships from textual data via given prompts, subsequently assembling them into event triplets e = (s, r, o), where s represents the subject, r signifies the relationship, and o denotes the object.

Figure 4.1 offers a real-world example of this pipeline, demonstrating its applicability
in transforming textual data into structured knowledge graph (KG) triplets. At the
core of our approach lies ICKG (Integrated Contextual Knowledge Graph Generator),
an instruction-following LLM that we have developed and made publicly accessible
for academic research. This model is fine-tuned on a high-quality supervised dataset
generated by the state-of-the-art but closed-source GPT-4 model. This inheritance
endows ICKG with GPT-4's aptitude for automated knowledge graph construction,
thereby facilitating the efficient creation of large-scale knowledge graphs at a reasonable
cost.

Figure 4.1. Illustrative example of the ICKG-enabled knowledge graph generation pipeline: conversion of textual news articles into structured knowledge graph triplets through prompt engineering with LLMs.


4.1. Fine-tuning Large Language Models

The recent advancement of LLMs has revolutionized the frontiers of Natural Language
Processing (NLP), offering unprecedented proficiency in both understanding and generating
text. Trained on enormous volumes of text data, these models excel at tasks
such as text generation, summarization, and question answering. This section
provides a comprehensive examination of LLMs, with a focus on their generative and
embedding-based variants. Special attention is given to their transformative
potential in automating the construction of knowledge graphs.

4.1.1. Large Language Models as Knowledge Graph Generators


Generative Large Language Models. Generative LLMs are designed to produce
text sequences that are statistically similar to the text data they have been trained
on. Models in the GPT series (Generative Pre-trained Transformer; Brown et al., 2020;
OpenAI, 2023) fall under this category. They are often employed for tasks such as text
generation, translation, and summarization.

Definition 4.1.1. Generative LLM. Given a sequence of text tokens X = {x_1, x_2, ..., x_N}, the objective of a generative LLM is to maximize the likelihood of generating this sequence. The objective function can be expressed as:

    max_θ log P(X; θ) = Σ_{t=1}^{N} log P(x_t | x_1, x_2, ..., x_{t-1}; θ).

Here, a token is a unit of text extracted from a larger corpus during the tokenization process; tokens are the building blocks used by language models for both training and inference. The generative function f_genLLM, parameterized by θ_genLLM, is defined as:

    x_t = f_genLLM(x_1, x_2, ..., x_{t-1}; θ_genLLM),

where x_t is the token generated by f_genLLM, and θ_genLLM are the parameters of the neural network model optimized during training.
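The autoregressive objective above can be made concrete with a toy next-token model (the probability table is invented purely for illustration; real LLMs condition on the full prefix, not just the previous token):

```python
import math

# Toy first-order conditional model: P(next token | previous token).
cond_prob = {
    ("the", "fed"): 0.2,
    ("fed", "raised"): 0.5,
    ("raised", "rates"): 0.4,
}

def sequence_log_likelihood(tokens):
    """Sum of log P(x_t | x_{t-1}) over the sequence, i.e. the quantity
    a generative LLM maximizes (here with a Markov toy model)."""
    return sum(math.log(cond_prob[(prev, cur)])
               for prev, cur in zip(tokens, tokens[1:]))

ll = sequence_log_likelihood(["the", "fed", "raised", "rates"])
```

Training adjusts the parameters (here, the probability table) to make observed sequences like this one more likely, i.e. to push `ll` toward zero from below.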

One important variant of generative LLMs, commonly referred to as instruction-following LLMs, is designed to interpret and act upon specific instructions embedded within the input text or prompt. These models blend the generative capabilities of LLMs with a more task-focused approach, generating outputs that are not just statistically coherent but also responsive to a given instruction.
ChatGPT serves as a quintessential example of instruction-following LLMs, as it not
only leverages its generative capacity but also interprets and executes tasks as per the
user’s instructions. Instruction-following models like ChatGPT hold promise in the realm
of knowledge graph construction. When given specific instructions, such as “Extract
triplets of relationships and entities”, these models are capable of generating structured


outputs such as a list of knowledge triplets (s, r, o) that form the foundational elements of a
knowledge graph.
Embedding-based Large Language Models. Language models like BERT (Bidi-
rectional Encoder Representations from Transformers) are examples of embedding-based
LLMs. These models are not designed to generate text but rather to understand and
represent it in a high-dimensional space. They are highly e↵ective for Nature Language
Understanding (NLU) tasks like text classification, sentiment analysis, and entity recog-
nition.

Definition 4.1.2. Embedding-based LLM. Given a text sequence X = {x_1, x_2, ..., x_N}, the objective of an embedding-based LLM like BERT is to produce embeddings Z = {z_1, z_2, ..., z_N} encapsulating the sequence's contextual intricacies. This is mathematically represented as:

    Z = f_embLLM(X; θ_embLLM),

where f_embLLM denotes the BERT-style model and θ_embLLM represents the model parameters.

Embedding-based LLMs offer great utility in the task of entity disambiguation. By leveraging the high-dimensional semantic similarity between embeddings, one can resolve the ambiguity of entities appearing in different textual contexts.
Prompt Engineering to Harness LLM. Prompt engineering is a technique that
manipulates the input query or “prompt” to better guide the model’s response. This
technique has been particularly useful in fine-tuning LLMs for specific tasks without
having to retrain them. By designing effective prompts, one can indirectly instruct the
model to perform complex tasks, including extracting relationships and entities from
text, which is vital for knowledge graph construction.
The capabilities of LLMs, particularly when enhanced through prompt engineering,
can be leveraged for constructing knowledge graphs. By generating tailored prompts,
one can effectively instruct an LLM to extract relationships, entities, and other relevant
facts from text data. Figure 4.1 provides a concrete example of prompting the LLM to
generate KG triplets for given news articles. This serves as a promising data processing
engine for our automated and scalable construction of the long-history financial dynamic
knowledge graph (FinDKG).

4.1.2. Supervised Fine-tuning for Knowledge Graph Construction


Generative LLMs, while inherently proficient in a wide array of language-based tasks,
often require customization to excel in more specialized applications, such as knowledge
graph construction. To this end, supervised fine-tuning serves as an effective methodology for adapting LLMs to such specialized tasks.
Supervised fine-tuning involves the further training of a pre-trained LLM like GPT-3,
on a curated dataset that is specifically tailored to the task at hand. Given the extensive


capabilities of LLMs in natural language understanding and generation, supervised fine-tuning becomes an instrumental methodology for enhancing the model's performance on specific tasks such as extracting relationships and entities from unstructured text.

Definition 4.1.3. Supervised Fine-tuning. In the context of generative LLMs, supervised fine-tuning can be formulated as the following optimization problem:

    min_θ Σ_{i=1}^{n} L_NLL(y_i, f_θ(x_i)),

where:

- θ are the parameters of the model;
- L_NLL is the Negative Log-Likelihood loss function;
- x_i are the input text segments of a fine-tuning dataset;
- y_i are the corresponding human- or advanced-LLM-generated responses;
- f_θ(x_i) is the text sequence generated by the model when given x_i as input.

Prompt-based Fine-tuning. Fine-tuning often employs a form of instruction or
"prompt" that guides the language model to generate output in the desired format. For
example, a prompt such as "Extract entities and relationships" can lead to the generation
of a list of triplets for knowledge graph construction.
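A prompt of the kind described might look as follows (the wording and relation names are illustrative, not the exact prompt used for ICKG):

```python
PROMPT_TEMPLATE = """You are a knowledge graph extraction assistant.
Extract entities and relationships from the article below.
Use ONLY these relation types: {relations}.
Return at least five triplets, one per line, formatted as (subject, relation, object).

Article:
{article}
"""

def build_prompt(article, relations=("relate_to", "impact", "invest_in")):
    """Fill the instruction template with a relation whitelist and article text."""
    return PROMPT_TEMPLATE.format(relations=", ".join(relations), article=article)

prompt = build_prompt("The Federal Reserve raised interest rates on Wednesday.")
```

Constraining the relation vocabulary in the prompt is what lets the downstream parser map the model's free-form output onto a fixed graph schema.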
The capability of LLMs to adapt to knowledge graph construction via supervised fine-tuning is a transformative advancement. It leverages the model's capacity for comprehending unstructured texts and transforms it into structured knowledge. This approach synergizes the generative powers of LLMs with the structured knowledge representation of graphs, thereby enabling large-scale automated knowledge graph construction without intensive human effort.

4.1.3. Open-source Knowledge Graph Generator LLM


We introduce ICKG (Integrated Contextual Knowledge Graph Generator), an LLM
fine-tuned from the open-source instruction-following Vicuna-7B model (Zheng et al., 2023),
which builds on top of Meta's LLaMA model (Touvron et al., 2023). ICKG is trained
on instruction-following demonstrations for knowledge graph construction, utilizing the
advanced GPT-4 API (OpenAI, 2023) for data generation. Preliminary evaluations
demonstrate that, for the knowledge graph construction task, ICKG outperforms both
GPT-3.5 (gpt-3.5-turbo API) and the original Vicuna-7B model, and shows comparable
performance to GPT-4, while also being efficient enough for deployment on consumer-level
GPUs. The ICKG model is publicly available.
Figure 4.2 illustrates how we obtained the ICKG model. Training
an effective instruction-following model under an academic budget presents two main
challenges: securing a robust pretrained language model and obtaining high-quality


instruction-following data. The Vicuna-7B open-source LLM addresses the first challenge. To
tackle the second, we employ supervised fine-tuning, using GPT-4 to automatically
generate instruction-response pairs. Specifically, ICKG is fine-tuned on 2,500 demonstrations
generated by OpenAI's GPT-4.

[Flowchart: financial news dataset combined with a knowledge graph extraction prompt feeds GPT-4 response generation; a data quality filter yields 3K KG instruction-response training data, used for supervised fine-tuning of Vicuna-7B into the ICKG knowledge graph generator LLM]

Figure 4.2. Flowchart of the ICKG Large Language Model (LLM) for knowledge graph construction. The chart outlines the training methodology.

Limitations of Existing Closed-source LLM APIs. While extant commercial
language models such as GPT-4, PaLM, and Claude offer substantial capabilities,
their closed-source architecture imposes constraints on academic exploration and usage.
In our task of generating knowledge graphs from news articles, GPT-3.5, despite its
prevalence in practice, exhibits limitations in accurately yielding predefined relational
triplets.
Conversely, GPT-4 demonstrates higher-quality responses in adhering to instruction
prompts for KG triplet extraction. Nevertheless, the financial overhead and computational
latency associated with GPT-4's API constitute realistic obstacles. Specifically,
with a token-based pricing model of $0.03 per 1,000 prompt tokens, the estimated
expenditure for processing our 400,000 financial news articles exceeds $18,000.
Additionally, the API's inference time, approximately 10 seconds per article, translates
to 1,111 hours of single-API running time. These costs far surpass this research's budget
constraints.
Open-source LLMs. In the realm of open-source language models, Meta's LLaMA
(Large Language Model Meta AI) serves as a democratizing force by providing smaller,
more computationally efficient models that are accessible to a broader swath of the
research community. LLaMA is designed as a foundational large language model to
advance various subfields within AI. Its compact size (ranging from 7B to 65B parameters)
and lower computational requirements make it ideal for researchers without access to
extensive computational infrastructure.

https://ai.google/discover/palm2/
https://www.anthropic.com/index/claude-2
https://openai.com/pricing
Building upon LLaMA's foundation, Vicuna emerges as a family of open-source chatbot
models fine-tuned on user-shared conversations sourced from ShareGPT. This fine-tuning
has imbued Vicuna with the ability to generate highly detailed and well-structured
instruction-following responses, rivaling the quality of more renowned models like ChatGPT.
Preliminary evaluations indicate that Vicuna-13B outperforms other models like
LLaMA and Stanford Alpaca in over 90% of cases (Zheng et al., 2023). The public
availability of Vicuna's code and weights offers a promising avenue for further
application-specific customizations leveraging instruction-following LLMs.
Knowledge Graph Construction Instruction Dataset. The dataset for instruction-
following demonstrations on knowledge graph construction was generated by leveraging
the GPT-4 API. We initiated the data collection with a sample of 3,000 news articles.
Subsequent to generating triplet responses via the GPT-4 API, we instituted a second
layer of data quality filtering: only responses that strictly adhered to the instruction
prompt and yielded a sufficiently rich set of triplets were retained. Specifically, data
points generating fewer than five triplets per article were excluded. This filtering
strategy further refined the quality of our supervised fine-tuning dataset, beyond
the native capabilities of GPT-4, thereby promoting enhanced triplet extraction in the
resulting LLM.
Armed with this curated dataset, we proceeded to fine-tune the Vicuna-7B models,
utilizing Hugging Face's training framework. This process capitalized on efficient GPU
training techniques, such as Int8 quantization. Our preliminary fine-tuning efforts,
specifically on a 7B LLaMA model, were conducted over approximately 10 hours, utilizing
eight A100 GPUs with 40 GB of memory each. The total computational cost was
estimated to be less than $200 on cloud-based GPU compute resources.
Preliminary Evaluation. We conducted human evaluations on randomly sampled
examples to assess ICKG's utility in generating FinDKG. The detailed results are
displayed in Table B.1 in the Appendix. Examination reveals that while
GPT-3.5 and Vicuna-7B can generate lists of triplets, they fall short of adhering to the
instruction rule of utilizing predefined entities and relations. In contrast, both GPT-4
and ICKG successfully fulfill this criterion, with identified entities and relationship
types perfectly aligning with our predefined sets. This makes ICKG well suited for our
use case of large-scale knowledge graph triplet extraction from financial news. More
comprehensive evaluations are left for future work dedicated to the LLM direction.

4.2. Generative Knowledge Graph Construction Pipeline


Finally, we present an end-to-end pipeline for constructing knowledge graphs, specifically
focusing on creating the Financial Dynamic Knowledge Graph (FinDKG), by utilizing
the fine-tuned ICKG LLM.


Knowledge Graph Triplet Extraction. The initial stage of the pipeline employs
the ICKG LLM for the extraction of KG triplets from financial news articles. A while-
loop control structure is implemented to ensure that the generated output not only
adheres to predefined relation sets but also contains a minimum of five distinct triplets.
If these conditions are not met, the LLM is run again on the same input text to generate
a new set of triplets.
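The while-loop control structure can be sketched as follows (`_stub_llm`, the relation set, and the parser stand in for the actual ICKG call and output handling):

```python
import re

ALLOWED_RELATIONS = {"relate_to", "impact", "invest_in", "raise", "operate_in"}
MIN_TRIPLETS = 5
MAX_RETRIES = 3

def parse_triplets(text):
    """Parse lines of the form (subject, relation, object) from LLM output."""
    return [tuple(p.strip() for p in m)
            for m in re.findall(r"\(([^,()]+),([^,()]+),([^,()]+)\)", text)]

def extract_with_retries(article, query_llm):
    """Re-query the LLM until the output uses only allowed relations and
    contains enough distinct triplets, up to a retry budget."""
    for _ in range(MAX_RETRIES):
        triplets = set(parse_triplets(query_llm(article)))
        if len(triplets) >= MIN_TRIPLETS and all(
                rel in ALLOWED_RELATIONS for _s, rel, _o in triplets):
            return sorted(triplets)
    return []  # give up once the retry budget is exhausted

# Demo with a stub in place of the real ICKG call.
def _stub_llm(article):
    return ("(Fed, raise, Rates)\n(Fed, impact, USD)\n(Apple, invest_in, AI)\n"
            "(Apple, operate_in, US)\n(USD, relate_to, EUR)")

triplets_out = extract_with_retries("some article text", _stub_llm)
```

Because LLM sampling is stochastic, a fresh query on the same input can satisfy the constraints even when the first attempt did not.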
Embedding-based Entity Disambiguation. Following the extraction of KG triplets,
the next phase involves entity disambiguation using Sentence-BERT embeddings
(Reimers and Gurevych, 2019). Semantic embeddings for the raw entities within the triplets
are generated using Sentence-BERT. Next, entities with a pairwise cosine similarity
above 95% are clustered together, and the most frequently occurring entity within each
cluster is chosen as the representative for all entities in that cluster. As a final step, a
human reviewer examines the top 200 most frequently occurring entities within FinDKG
to manually disambiguate them, further refining the quality of the constructed FinDKG.
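The clustering step can be sketched with toy vectors standing in for Sentence-BERT embeddings (the greedy cluster assignment and entity counts are illustrative; the 0.95 threshold mirrors the description):

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def disambiguate(entities, embeddings, counts, threshold=0.95):
    """Greedily cluster entities whose embedding cosine similarity exceeds
    the threshold, then map every member to its cluster's most frequent entity."""
    clusters = []                      # each cluster: list of entity indices
    for i in range(len(entities)):
        for cl in clusters:
            if cosine(embeddings[i], embeddings[cl[0]]) > threshold:
                cl.append(i)
                break
        else:
            clusters.append([i])       # no match: start a new cluster
    mapping = {}
    for cl in clusters:
        rep = max(cl, key=lambda j: counts[entities[j]])
        for j in cl:
            mapping[entities[j]] = entities[rep]
    return mapping

ents = ["Federal Reserve", "The Fed", "Apple"]
embs = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]])
cnt = {"Federal Reserve": 120, "The Fed": 80, "Apple": 60}
mapping = disambiguate(ents, embs, cnt)
```

In the actual pipeline the embedding rows would come from a Sentence-BERT encoder rather than being hand-written.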


5. Dynamic Knowledge Graph Learning with KGTransformer

In this chapter, we delve into the learning problems of Temporal Knowledge Graphs
(TKGs), a fundamental task to capture the inherent structural and temporal characteristics
of TKGs. As a cornerstone to navigate these challenges, we propose the Knowledge
Graph Transformer (KGTransformer), an advanced Graph Neural Network (GNN)
designed for enriching and contextualizing knowledge graph embeddings. Extending
KGTransformer's architecture, we adapt it to encapsulate the time-evolving nature of
TKGs, thereby boosting its temporal predictive capability. Furthermore, we unveil a
probabilistic framework that allows the model to capture the dynamics of TKGs effectively.
This framework guides the learning and inference process for the KGTransformer
model, and we provide a customized training recipe for efficient parameter optimization
on real-world TKG datasets.
5.1. Knowledge Graph Transformer


The contemporary challenges in the modelling of TKGs require a specialized model
that can capture both the structural intricacies and the temporal dynamics inherent
within them. Addressing this, we introduce the Knowledge Graph Transformer (KG-
Transformer), a novel GNN designed to contextualize and enrich knowledge graph em-
beddings by incorporating meta-relations with an extended graph attention mechanism.
Figure 5.1 provides an overview of the KGTransformer framework, the details of which
we will delve into in the subsequent sections.

[Diagram: meta-relation graph attention over source node (entity, with type), relation edge, and target node (entity, with type); attention-weighted message aggregation produces the output graph embeddings]
Figure 5.1. Diagram of KGTransformer model.


5.1.1. Learning Knowledge Graph Embeddings


Let G = (E, R, T) denote a TKG, where E is the set of entities, R is the set of
relationships, and T is the set of temporal instances. Each entity i ∈ E and relationship
r ∈ R is associated with an initial trainable embedding vector h_i and h_r, respectively.
These embeddings serve as the initial hidden representations h^(0) for each node in
the graph.

Multi-Relational Graph Convolutional Network


The Multi-Relational Graph Convolutional Network (R-GCN) is an effective KG embedding
method introduced by Schlichtkrull et al. (2018) to handle multi-relational and multi-hop
graph structures.
R-GCN Graph Aggregator. Formally, the aggregator over the node neighbors N(s)
of a subject node s in the l-th layer is expressed as:

    g(N(s)) = h^(l+1) = σ( W^(l) h^(l) + Σ_{r∈R} Σ_{o∈N(s,r)} W_r^(l) h_o^(l) ),

where σ is an activation function, W^(l) is the shared weight matrix for layer l, W_r^(l)
are relation-specific weight matrices, and h^(l) is the hidden representation of the node
at layer l.

Each relation r creates a local graph structure among the entities, which is utilized
to produce a relation-specific message. The overall message for each entity is then
constructed by aggregating these messages across all relations R:

    Overall Message = Σ_{r∈R} Σ_{o∈N(s,r)} W_r^(l) h_o^(l).

The overall message is combined with historical information to produce the new hidden
representation h^(l+1):

    h^(l+1) = σ( W^(l) h^(l) + Overall Message ).
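A single R-GCN aggregation step, matching the unnormalized form given above, can be sketched in numpy (toy dimensions and random weights; real implementations typically add per-relation normalization constants):

```python
import numpy as np

def rgcn_layer(h_s, neighbors, W, W_rel):
    """One R-GCN update for subject node s:
    h' = sigma(W h_s + sum_r sum_{o in N(s,r)} W_r h_o), with sigma = ReLU.
    `neighbors` maps relation name -> list of neighbor feature vectors."""
    msg = W @ h_s                      # self-transformation term
    for r, neigh in neighbors.items():
        for h_o in neigh:
            msg = msg + W_rel[r] @ h_o # relation-specific messages
    return np.maximum(msg, 0.0)        # ReLU activation

d = 4
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))
W_rel = {"impact": rng.normal(size=(d, d)), "relate_to": rng.normal(size=(d, d))}
h_s = rng.normal(size=d)
neighbors = {"impact": [rng.normal(size=d)],
             "relate_to": [rng.normal(size=d), rng.normal(size=d)]}
h_next = rgcn_layer(h_s, neighbors, W, W_rel)
```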

R-GCN effectively captures multi-relational context, offering a rich, contextual
representation essential for modeling complex structures in KGs. However, R-GCN has a
notable limitation: it does not differentiate the importance of nodes connected through
the same type of relationship, thus assigning equal weight to all of them. This can lead
to less accurate embeddings when the importance of relationships varies.

5.1.2. From Multi-Relation to Meta-Relation


In the context of a knowledge graph, relationships between entities have so far been
conceived as simple triplets (s, r, o), where only the type of the relationship r is modelled.
However, this simplistic model is often inadequate for capturing the nuanced interconnections
between nodes in heterogeneous graphs, where multiple types of nodes and
relations may co-exist. In the FinDKG graph, each entity is additionally assigned to an
entity category, such as company or person. As a more expressive representation,
we introduce the concept of the Meta-Relation, proposed by Hu et al. (2020).

Definition 5.1.1. Meta-Relation. A meta-relation generalizes multi-relation attribution to consider both edges and nodes. Given an event edge e = (s, r, o) that links a source node s to a target node o with relation r, its meta-relation is formally denoted ⟨τ(s), φ(r), τ(o)⟩, where:

- τ(s) and τ(o) represent the entity types or categories of the source node s and the target node o, respectively;
- φ(r) signifies the type of the relationship r between s and o.

The meta-relation encapsulates the semantic context surrounding the event edge e, including the types of both the node entities and the relation involved.

As an illustrative example, consider the relationship between the entity OpenAI, which
is of type Company, and the entity ChatGPT, which is of type Product. The relation
between them is Invent. In terms of meta-relations, this could be represented as:

    ⟨Company, Invent, Product⟩

This encapsulates not just the Invent relationship between OpenAI and ChatGPT,
but also categorizes OpenAI as a Company and ChatGPT as a Product.
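In code, deriving a meta-relation from an event edge is a simple type lookup (the type table is illustrative):

```python
# Hypothetical entity-type table, as produced by the LLM's NER step.
ENTITY_TYPE = {"OpenAI": "Company", "ChatGPT": "Product"}

def meta_relation(edge):
    """Map an event edge (s, r, o) to its meta-relation <tau(s), phi(r), tau(o)>."""
    s, r, o = edge
    return (ENTITY_TYPE[s], r, ENTITY_TYPE[o])

mr = meta_relation(("OpenAI", "Invent", "ChatGPT"))
```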

5.1.3. KGTransformer Architecture


The Knowledge Graph Transformer (KGTransformer) aims to enrich Knowledge Graph
embeddings by leveraging meta-relations within knowledge graphs. As an extension to
Graph Attention Network (GAT, Veličković et al. (2017)), KGTransformer integrates
meta-relation-based attention, message passing, and target-specific aggregation to con-
textualize node representations.
Let H^(l) ∈ R^(N×D) denote the output of the l-th KGTransformer layer, where N represents
the number of nodes and D signifies the dimension of the node features at layer l.
This serves as the input for the succeeding (l + 1)-th layer. After sequentially applying
L such layers, we arrive at H^(L), a contextualized representation of the graph. This
representation is amenable to end-to-end training schemes and is subsequently employable
for various downstream graph-oriented learning tasks.
In general GNNs, the mutual attention between a source node s and an object node o
is articulated as follows:

    H^(l)[o] ← Aggregate( Attention(s, o) × Message(s) )    (5.1)


In this equation, Attention gauges the saliency of nodes, Message computes the infor-
mational content emanating from the source node s, and Aggregate consolidates these
messages in a weighted manner based on the attention scores.

Graph Attention Networks


GAT pioneered the incorporation of attention mechanisms into the graph neural
network landscape, enabling individual nodes to preferentially focus on salient neighbors.
The GAT architecture consists fundamentally of three phases: Attention, Message, and
Aggregate. Following the formulation in Equation 5.1, the formal operations are detailed
below.
Attention. The attention mechanism in GAT is formulated as:

    Attention_GAT(s, o) = Softmax( a_ReLU( W H^(l)[s] ∥ W H^(l)[o] ) )

Here, ∥ signifies the concatenation operation, W is a weight matrix of dimensions D′ × D,
where D and D′ are the feature dimensions before and after the transformation, respectively,
H^(l) is the node feature matrix at layer l, and a_ReLU is an attention mechanism
employing a ReLU activation function.
Message. The message transmitted from a source node $s$ to an object node $o$ is given
by:

$$\mathrm{Message}_{\mathrm{GAT}}(s) = W' H^{(l)}[s]$$

Here, $W'$ is a weight matrix of dimensions $D' \times D$, where $D'$ is the dimension of the
node features after the transformation. In typical GAT architectures, it is common to
use the same weight matrix for both the attention mechanism and the message-passing
phase to reduce the number of learnable parameters for computational efficiency, that is,

$$W' = W$$
Aggregate. The aggregation operation is specified as:

$$\mathrm{Aggregate}(\cdot) = \sigma(\mathrm{Mean}(\cdot))$$

Here, $\sigma$ refers to a nonlinear activation function, usually the sigmoid or ReLU, and
$\mathrm{Mean}(\cdot)$ computes the arithmetic mean of its inputs.
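The three GAT phases above can be sketched in a few lines of plain Python. This is a minimal single-head illustration with toy dimensions; the weight matrix `W`, the attention vector `a`, and the feature values are hypothetical placeholders, not trained parameters:

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def gat_update(h_o, neighbors, W, a):
    """One GAT-style update for a target node o (single head)."""
    Wh_o = matvec(W, h_o)
    # Attention: ReLU-activated score over the concatenated projections [Wh_s || Wh_o].
    scores = []
    for h_s in neighbors:
        z = matvec(W, h_s) + Wh_o
        scores.append(max(0.0, sum(ai * zi for ai, zi in zip(a, z))))
    # Softmax normalization of the attention scores.
    mx = max(scores)
    exp = [math.exp(s - mx) for s in scores]
    alpha = [e / sum(exp) for e in exp]
    # Message uses W' = W (shared weights); Aggregate is a sigmoid of the weighted mean.
    D = len(Wh_o)
    agg = [sum(alpha[k] * matvec(W, neighbors[k])[j] for k in range(len(neighbors)))
           for j in range(D)]
    return [1.0 / (1.0 + math.exp(-v)) for v in agg]

# Toy example: 2-D features, two neighbors, identity weights.
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, 0.5, 0.5, 0.5]
h_new = gat_update([0.1, 0.2], [[1.0, 0.0], [0.0, 1.0]], W, a)
print(h_new)
```

With both neighbors receiving equal attention here, the update reduces to a sigmoid of their mean projection, illustrating how the attention weights interpolate between neighbor messages.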
GAT, despite its empirically proven proficiency in discerning relevant nodes, operates
under the restrictive premise that the source node $s$ and the object node $o$ share an analogous
feature distribution, dictated by a single weight matrix $W$. This assumption can be
suboptimal for complex knowledge graphs, where features can exhibit temporal variations
or inter-node discrepancies. KGTransformer ameliorates this issue by introducing a more
nuanced, meta-relation-based attention mechanism, thereby offering a more generalized
and expressive model.


Meta-relation Attention
The KGTransformer introduces an extended attention mechanism that incorporates
meta-relations, defined as triplets $\langle \tau(s), \phi(r), \tau(o) \rangle$, where $s$ and $o$ are the source and
target nodes, $r$ is the relation between them, $\tau(\cdot)$ denotes the node type, and $\phi(\cdot)$ the
relation type. The meta-relation captures not only the relationship type but also the
types of the nodes involved. In this section, we delve into the meta-relation attention.
Type-Specific Linear Projections. For the source node $s$, we utilize a type-specific
linear projection $K_{\mathrm{Linear}}^{\tau(s)}$ to transform the node's features into the Key vector,
denoted as $K(s)$. This allows the model to account for the unique characteristics and
distributions that may be specific to a certain node type $\tau(s)$. Formally, the
transformation can be expressed as:

$$K(s) = K_{\mathrm{Linear}}^{\tau(s)} \, H^{(l)}[s]. \tag{5.2}$$

Similarly, the target node $o$ is projected into a Query vector $Q(o)$ using a type-specific
linear projection $Q_{\mathrm{Linear}}^{\tau(o)}$:

$$Q(o) = Q_{\mathrm{Linear}}^{\tau(o)} \, H^{(l)}[o]. \tag{5.3}$$
Edge-Based Relation-Specific Weight Matrices. In knowledge graphs, it is often
the case that different edge types $\phi(r)$ can exist between the same node-type pairs $\tau(s)$
and $\tau(o)$. Therefore, a distinct edge-based weight matrix $W^{(r)}$ is maintained to capture
the semantic nuances of these relations.
Adaptive Meta-Relation Scaling. Furthermore, we introduce an adaptive scaling
tensor $\mu$ that contains the relative importance of different meta-relations in the form
of triplets $\langle \tau(s), \phi(r), \tau(o) \rangle$. This scaling tensor serves as an adaptive mechanism to
fine-tune the attention scores, allowing the model to emphasize or de-emphasize certain
relationships as needed.
Using the above components, the attention head score is finally formulated as:

$$\mathrm{Attn}(s, r, o) = \frac{\left(K(s)\, W^{(r)}\, Q(o)\right) \cdot \mu_{\langle \tau(s), \phi(r), \tau(o) \rangle}}{\sqrt{d_k}}, \tag{5.4}$$

where $d_k$ is the dimensionality of the Key vector, and the $\sqrt{d_k}$ term is a scaling factor
commonly used in attention mechanisms to stabilize gradients during backpropagation.
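The scoring in Equation 5.4 can be sketched directly; in this minimal Python illustration the type-specific projections, the relation matrix `W_r`, and the scaling value `mu` are all hypothetical placeholders with toy dimensions:

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def attn_head_score(h_s, h_o, K_lin, Q_lin, W_r, mu):
    """One meta-relation attention head score (Equation 5.4):
    (K(s) W^(r) Q(o)) * mu / sqrt(d_k)."""
    K = matvec(K_lin, h_s)   # type-specific Key projection of the source
    Q = matvec(Q_lin, h_o)   # type-specific Query projection of the target
    WQ = matvec(W_r, Q)      # relation-specific transform of the Query
    d_k = len(K)
    return mu * sum(k * v for k, v in zip(K, WQ)) / math.sqrt(d_k)

# Toy 2-D example with identity projections and unit meta-relation scaling.
I2 = [[1.0, 0.0], [0.0, 1.0]]
score = attn_head_score([1.0, 0.0], [1.0, 1.0], I2, I2, I2, mu=1.0)
print(score)
```

In practice the projections would be selected by the node types $\tau(s)$, $\tau(o)$ and the relation $r$ of the edge being scored, so that each meta-relation sees its own parameters.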
Multi-Head Attention in KGTransformer. In conventional Transformer architectures,
multi-head attention is an essential feature that permits the model to focus on
disparate sections of the input for varying tasks or functionalities. Within the
framework of KGTransformer, this paradigm is considerably extended to allow multi-head
attention across heterogeneous types of nodes and relations.
Formally, let $\mathrm{Attn}_i(s, r, o)$ represent the attention vector obtained from the $i$-th head.
For $h$ different heads, $h$ distinct attention vectors are calculated. These vectors are then
concatenated to produce a composite attention vector:

$$\mathrm{Concatenation}\left(\mathrm{Attn}(s, r, o)\right) = \big\Vert_{i=1}^{h}\, \mathrm{Attn}_i(s, r, o), \tag{5.5}$$


Here, the symbol $\Vert$ designates the concatenation operation, and $h$ serves as a hyperparameter
specifying the number of heads in the multi-head attention mechanism.
To ensure a distribution of attention that sums to one for each target node $o$, a Softmax
function is applied over the concatenated attention scores across all edges from
source nodes $s$ into $o$:

$$\mathrm{Attention}_{\mathrm{KGTransformer}}(s, r, o) = \mathrm{Softmax}\left(\big\Vert_{i=1}^{h}\, \mathrm{Attn}_i(s, r, o)\right). \tag{5.6}$$

GAT models, despite their ability to assign different attention scores to different neighbors,
do not natively accommodate the nuances of different types of relations or nodes.
KGTransformer, on the other hand, explicitly integrates node types, relation types, and
meta-relation scaling, offering a richer representation of the complex relationships within
heterogeneous knowledge graphs.
Recall the previously introduced illustrative example of the Invent relationship
between OpenAI ($s$) and ChatGPT ($o$), with their meta-relation represented as:

$$\langle \text{Company}, \text{Invent}, \text{Product} \rangle$$

In the context of KGTransformer, each head can focus on a different aspect of this
relationship, capturing the multi-faceted nature of a Company producing a Product. For
example, one head might concentrate on the financial transactions between the entities,
while another could scrutinize the technological synergy, and yet another could assess
market competitiveness. These attention vectors are then concatenated, as per Equation
5.5, to form a comprehensive representation of the relationship. This intuitively
enriches the GAT attention mechanism, which might only represent one generalized aspect
of the relationship and thus lacks the nuanced approach that KGTransformer offers.

Message Passing in KGTransformer

The message-passing component of KGTransformer is formulated to include specific
aspects of meta-relations, enhancing its ability to adapt to the varying distribution of
node and edge types found in a heterogeneous knowledge graph. As with the attention
mechanism, message passing within this architecture is enriched by incorporating
meta-relation-specific properties.
Mathematically, the core operation in message passing is defined as follows:

$$\mathrm{Message}_{\mathrm{KGTransformer}}(s, r, o) = \mathrm{Concatenation}\left(\mathrm{MSG\text{-}head}_i(s, r, o)\right), \tag{5.7}$$

$$\mathrm{MSG\text{-}head}_i(s, r, o) = M_{\mathrm{Linear}}^{i,\tau(s)}\, H^{(l)}[s]\, W_{\mathrm{MSG}}^{\phi(r)}. \tag{5.8}$$

Here, $M_{\mathrm{Linear}}^{i,\tau(s)}$ denotes a type-specific linear transformation for projecting the source
node $s$ into the $i$-th message space, and $W_{\mathrm{MSG}}^{\phi(r)}$ signifies a weight matrix that encodes
edge- or relation-specific properties into the message. $H^{(l)}[s]$ represents the feature vector
of node $s$ at layer $l$.


For each pair of nodes $(s, o)$, the Concatenation operation fuses the message vectors across
all $h$ heads as follows:

$$\mathrm{Message}_{\mathrm{KGTransformer}}(s, r, o) = \big\Vert_{i=1}^{h}\, \mathrm{MSG\text{-}head}_i(s, r, o), \tag{5.9}$$

Here, $\Vert$ represents the concatenation operation that fuses all $h$ message vectors into
a single message vector. This is analogous to how the attention vectors are combined in
Equation 5.5, reinforcing the importance of capturing diverse aspects of the relationships
in the graph.
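A minimal sketch of the multi-head message computation in Equations 5.8 and 5.9, with toy dimensions; the per-head projections and the relation-specific matrix are hypothetical placeholders, and the row/column conventions of the matrix products are simplified:

```python
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def message(h_s, M_heads, W_msg):
    """Concatenate per-head messages for one edge (Equations 5.8-5.9).
    M_heads: one type-specific projection per head; W_msg: relation-specific matrix."""
    out = []
    for M in M_heads:                          # project s into the i-th message space,
        out.extend(matvec(W_msg, matvec(M, h_s)))  # then apply the relation transform
    return out

# Toy setup: 2 heads, 2-D features; the second head swaps the feature dimensions.
I2 = [[1.0, 0.0], [0.0, 1.0]]
msg = message([0.3, 0.7], [I2, [[0.0, 1.0], [1.0, 0.0]]], I2)
print(msg)  # [0.3, 0.7, 0.7, 0.3]
```

The concatenated output has dimensionality $h$ times that of a single head, matching the concatenated attention vector it will be weighted by.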

Aggregation in KGTransformer
After the multi-head attention and message computations for each edge $(s, r, o)$, from the
source node $s$ to the target node $o$, we proceed to aggregate these quantities to generate
an updated feature representation for $o$. Mathematically, the aggregation operation is
computed as:

$$H_r^{(l+1)}[o] = \bigoplus_{s \in \mathcal{N}(o)} \mathrm{Attention}_{\mathrm{KGTransformer}}(s, r, o) \cdot \mathrm{Message}_{\mathrm{KGTransformer}}(s, r, o). \tag{5.10}$$

Here the symbol $\bigoplus$ represents the aggregation operation, whereby the aggregated feature
vector $H_r^{(l+1)}[o]$ for a node $o$ is computed by weighting and summing the messages
received from its neighboring nodes $s$. This equation enables the target node $o$ to
assimilate information from its neighboring source nodes $s$, which may have diverse
types of feature distributions and relationships. The attention scores, normalized to
sum to one for each target node $o$, act as the weights for this aggregation.
The aggregated feature vector $H_r^{(l+1)}[o]$ is then transformed into a type-specific
feature space. This transformation is realized by applying a linear projection $A_{\mathrm{Linear}}^{\tau(o)}$ and
incorporating a residual connection, as shown below:

$$H^{(l+1)}[o] = A_{\mathrm{Linear}}^{\tau(o)}\left(H_r^{(l+1)}[o]\right) + H^{(l)}[o]. \tag{5.11}$$

The residual connection ensures that the updated node features not only encapsulate
the aggregated contextual information but also retain their type-specific characteristics.
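Equations 5.10 and 5.11 together can be sketched as follows, for a single relation with toy 2-D features; the attention logits, messages, and projection matrix are hypothetical placeholders:

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def aggregate_update(h_o, msgs, scores, A_lin):
    """Equations 5.10-5.11: attention-weighted sum of neighbor messages,
    a type-specific projection, then a residual connection."""
    mx = max(scores)
    exp = [math.exp(s - mx) for s in scores]
    alpha = [e / sum(exp) for e in exp]   # normalized attention weights (sum to 1)
    D = len(h_o)
    agg = [sum(a * m[j] for a, m in zip(alpha, msgs)) for j in range(D)]
    proj = matvec(A_lin, agg)             # A_Linear^{tau(o)} applied to the aggregate
    return [p + h for p, h in zip(proj, h_o)]  # residual: + H^(l)[o]

I2 = [[1.0, 0.0], [0.0, 1.0]]
h_new = aggregate_update([0.1, 0.1], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], I2)
print(h_new)  # [0.6, 0.6]
```

With equal attention logits the two messages are averaged, projected, and added back onto the node's previous features, so the old representation is never fully overwritten.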

KGTransformer as Knowledge Graph Encoder

The KGTransformer model extensively employs meta-relations $\langle \tau(s), \phi(r), \tau(o) \rangle$ to
parameterize weight matrices for different node types and relations. Unlike attention-based
models such as GAT, KGTransformer is better suited to handling the complexity and
distribution differences in multi-relational knowledge graphs.
Moreover, KGTransformer's approach to parameter sharing allows for better generalization
and adaptability, especially for relations with fewer occurrences. This enables
the model to handle a diverse set of relations while using a relatively small parameter
set.


The stacked KGTransformer layers can serve as a KG encoder, replacing traditional
knowledge graph GNNs such as R-GCN to propagate initial node embeddings into more
contextually rich embeddings. In cases where the knowledge graph lacks entity types,
the meta-relation can be simplified to focus solely on multi-relational attention. In this
configuration, KGTransformer acts as a multi-relational variant of GAT, much as R-GCN
extends Graph Convolutional Networks (GCNs), thereby facilitating various downstream KG tasks
such as node classification and link prediction. In our case, we adopt KGTransformer to
encode the raw entity embeddings into temporal and structural embeddings.

5.1.4. Time-Evolving Update with Recurrent Neural Networks


TKGs are not static but evolve over time. The introduction of temporal dynamics adds
an extra layer of complexity and provides rich, time-sensitive contexts for various tasks.
We extend the KGTransformer framework to address these challenges, capturing both
the temporal and structural evolution of entities and relations in the knowledge graph.

Capturing Temporal Dynamics


To capture the evolving temporal aspects of entities and relations, we introduce temporal
embeddings denoted by $t_i$ for entities and $t_r$ for relations. These temporal
embeddings at layer $l+1$ and time $t$ are represented as $t_i^{(l+1,t)}$ and $t_r^{(l+1,t)}$:

$$t_i^{(l+1,t)} = \mathrm{KGTransformer}(t_i, \mathcal{G}_t), \tag{5.12}$$
$$t_r^{(l+1,t)} = \mathrm{KGTransformer}(t_r, \mathcal{G}_t), \tag{5.13}$$

Here, $\mathcal{G}_t$ represents the state of the knowledge graph at time $t$.
We incorporate Recurrent Neural Networks (RNNs) to capture the sequential evolution
of these temporal embeddings over different time slices:

$$t^{(l,t)} = \mathrm{RNN}_{\mathrm{temporal}}\left(t^{(l,t)}, t^{(l,t-1)}\right), \tag{5.14}$$

This RNN-based update mechanism allows the model to adaptively learn from the
sequential dependencies between events and provides an effective way to capture dynamic
interactions among entities. See details of the RNN layer in Appendix C.1.
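As a deliberately simplified stand-in for the recurrent update in Equation 5.14, the sketch below rolls an Elman-style cell $h_t = \tanh(W x_t + U h_{t-1} + b)$ over three time slices; the scalar weights are hypothetical, and the actual cell described in Appendix C.1 may differ (e.g. a gated cell such as a GRU):

```python
import math

def rnn_step(x_t, h_prev, W, U, b):
    """One Elman-style recurrence applied elementwise to a toy embedding:
    h_t = tanh(W*x_t + U*h_prev + b), standing in for RNN_temporal."""
    return [math.tanh(W * x + U * h + b) for x, h in zip(x_t, h_prev)]

# Roll the update across three time slices of a 2-D temporal embedding.
h = [0.0, 0.0]
for x in ([0.5, -0.5], [0.2, 0.1], [1.0, 0.0]):
    h = rnn_step(x, h, W=0.8, U=0.5, b=0.0)
print(h)
```

The recurrence carries information from earlier time slices forward through `h_prev`, which is exactly what lets the embeddings reflect the sequential history of graph snapshots.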

Capturing Structural Evolution


Analogous to the temporal case, we introduce structural embeddings $u_i$ and $u_r$ to
capture the time-evolving structural properties of entities and relations. The update
equations for these embeddings mirror the temporal embeddings but employ separate RNNs,
$\mathrm{RNN}_{\mathrm{struct\text{-}entity}}$ for entities and $\mathrm{RNN}_{\mathrm{struct\text{-}relation}}$ for relations:

$$u_i^{(l,t)} = \mathrm{RNN}_{\mathrm{struct\text{-}entity}}\left(u^{(l,t)}, u^{(l,t-1)}\right),$$
$$u_r^{(l,t)} = \mathrm{RNN}_{\mathrm{struct\text{-}relation}}\left(u^{(l,t)}, u^{(l,t-1)}\right).$$


These structural embeddings are designed to model the global structural relationships
between different entities and relations, evolving over time.
KGTransformer combines these evolving temporal and structural embeddings to produce
enriched, context-sensitive embeddings. Specifically, the embeddings $t_i$ and $u_i$ for
a given entity $i$ can be combined to generate a comprehensive representation,
encompassing both temporal and structural profiles. At the edge level, KGTransformer further
enriches the representation by concatenating the evolving entity-specific and
relation-specific embeddings.
By integrating time-evolving embeddings, KGTransformer achieves a high level of
adaptability to temporal and structural changes in the TKG. These rich embeddings
serve as core inputs for the downstream modelling of KG event time and structure, which
will be elaborated upon in the subsequent section.

5.2. Probabilistic Framework for Temporal Knowledge Graph Learning

This section introduces a probabilistic approach to model the distribution of TKGs,
building upon the entity and relation embeddings obtained from KGTransformer. We
leverage a decomposition of structural and temporal probabilities, initially proposed by
Park et al. (2022). This framework is especially useful for temporal link prediction and
for optimizing the model parameters through loss functions.
To facilitate the forthcoming discussion, we follow the notation shown in Table 2.1
to formulate the framework.
Temporal Graph Representation. Let $\mathcal{G}_{[0:t]}$ represent a TKG composed of a series
of occurred events up to time $t$, and let $e = (s, r, o)$ represent an event triplet within the
graph. The goal is to model the probability distribution of the TKG up to time $t$,
denoted as $p(\mathcal{G}_{[0:t]})$.
To build our TKG model, we propose the following three key assumptions:

1. Temporal Dependency: We posit that the events at a specific time $t$ are influenced
by all previous events.

$$p(\mathcal{G}_t \mid \mathcal{G}_{<t}) \neq p(\mathcal{G}_t) \tag{5.15}$$

This equation signifies that the probability distribution over the graph at time $t$
is not independent but depends on the preceding events captured in $\mathcal{G}_{<t}$.

2. Conditional Independence at the Same Time $t$: Events occurring at
time $t$ are assumed to be conditionally independent given all prior events up to
that time.

$$p(\mathcal{G}_t \mid \mathcal{G}_{<t}) = \prod_{(e,t) \in \mathcal{G}_t} p(e \mid \mathcal{G}_{<t}) \tag{5.16}$$

Here, the equation states that, given the history $\mathcal{G}_{<t}$, each temporal event $(e, t)$ at
time $t$ is conditionally independent of the other events.


3. Initial State Assumption: We assume $p(\mathcal{G}_0)$ follows a specific prior distribution
given as external input, which could be uniform, empirical, or based on a parametric
model. This initial distribution serves as the starting point for the temporal
evolution of the TKG.

Utilizing these assumptions, the joint probability distribution $p(\mathcal{G}_{[0:t]})$ can be factorized
as:

$$p(\mathcal{G}_{[0:t]}) = p(\mathcal{G}_0) \prod_{\tau=1}^{t} p(\mathcal{G}_\tau \mid \mathcal{G}_{<\tau}) = p(\mathcal{G}_0) \prod_{\tau=1}^{t} \prod_{(e,\tau) \in \mathcal{G}_\tau} p(e, \tau \mid \mathcal{G}_{<\tau}) \tag{5.17}$$

The conditional probability term $p(e, t \mid \mathcal{G}_{<t})$ at time $t$ can be further decomposed using
the chain rule of conditional probabilities into:

$$p(e, t \mid \mathcal{G}_{<t}) = p(t \mid e, \mathcal{G}_{<t}) \times p(e \mid \mathcal{G}_{<t}) \tag{5.18}$$

Equation (5.18) partitions the conditional probability into two components: $p(e \mid \mathcal{G}_{<t})$
captures the evolving network structure, while $p(t \mid e, \mathcal{G}_{<t})$ models the temporal dynamics.
Our primary objective is to accurately estimate these components to capture both the
temporal and structural characteristics of TKGs.

5.2.1. Modeling Event Time Using Temporal Point Processes

To model the intricacies of event timings in TKGs, we adopt Temporal Point Processes
(TPPs) as proposed by Park et al. (2022). TPPs offer a statistical framework for
representing events occurring irregularly over time.

Definition 5.2.1. Inter-Event Time. Let $\tau_n$ denote the time difference between
consecutive occurrences of the same event triplet $e$ in a TKG. Formally, given a sorted
sequence of times $\{t_{n-1}, t_n\}$ corresponding to $e$, we define:

$$\tau_n = t_n - t_{n-1}, \quad \text{for } n > 1, \tag{5.19}$$

and for the first occurrence,

$$\tau_1 = 0. \tag{5.20}$$

The inter-event time parameter $\tau$ effectively captures the time elapsed between
successive events and serves as a crucial input feature for predictive models dealing with
time. The utility of $\tau$ lies in its ability to encapsulate local temporal dynamics, thereby
enriching the feature space for subsequent predictive tasks.
Temporal Point Processes. TPPs provide a probabilistic mechanism for representing
event timings via the conditional intensity function $\lambda(t \mid \mathcal{G}_{<t})$, defined as:

$$\lambda(t \mid \mathcal{G}_{<t}) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, P\left(\text{event in } [t, t + \Delta t) \mid \mathcal{G}_{<t}\right).$$

Here, $\mathcal{G}_{<t}$ signifies the event history up to time $t$.


To adapt TPPs for TKGs, we introduce the conditional probability density function
$p(t \mid e, \mathcal{G}_{<t})$ of Equation (5.18) as follows:

$$p(t \mid e, \mathcal{G}_{<t}) = \lambda(t \mid \mathcal{G}_{<t})\, \Delta t,$$

where $\Delta t$ is an infinitesimal time window around $t$.
Density Estimation via Mixture Model. For a robust and flexible approach
to estimating the inter-event time $\tau$, we utilize a mixture of log-normal distributions for
$p(\tau \mid e, \mathcal{G}_{<t})$:

$$p(\tau \mid e, \mathcal{G}_{<t}) = \sum_{m=1}^{M} w_m\, \phi(\tau; \mu_m, \sigma_m),$$

where $\phi(\tau; \mu_m, \sigma_m)$ is the log-normal density function, and $w_m$, $\mu_m$, $\sigma_m$ are the mixture
weight, mean, and standard deviation of the $m$-th component, respectively. The mixture
weights are constrained to sum to 1.
The mixture model parameters are learned through a Multilayer Perceptron (MLP)
neural network that takes a context vector $c$ as input for the specific event $e$, where
$c$ is the concatenation of entity- and relation-specific temporal embeddings derived from
KGTransformer. Please refer to the details of the MLP layer in Appendix C.2.
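The mixture density above can be evaluated directly; a minimal sketch in plain Python, where the two components' weights, means, and standard deviations are illustrative placeholders rather than parameters produced by the MLP:

```python
import math

def lognormal_pdf(tau, mu, sigma):
    """Density of a log-normal distribution evaluated at tau > 0."""
    return math.exp(-((math.log(tau) - mu) ** 2) / (2 * sigma ** 2)) / (
        tau * sigma * math.sqrt(2 * math.pi))

def mixture_density(tau, weights, mus, sigmas):
    """p(tau | e, G_<t) as a log-normal mixture; weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * lognormal_pdf(tau, m, s)
               for w, m, s in zip(weights, mus, sigmas))

# Toy 2-component mixture evaluated at an inter-event time of 1.0.
d = mixture_density(1.0, [0.4, 0.6], [0.0, 1.0], [1.0, 0.5])
print(round(d, 4))  # 0.2244
```

A mixture with enough components can represent multi-modal inter-event behavior (e.g. events that recur either daily or quarterly), which a single log-normal cannot.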

5.2.2. Modeling Network Structure Through Conditional Event Density

TKGs dynamically evolve as new events take place. For example, if a war erupts between
two countries, this impacts not only their immediate relationship but also influences the
wider network within the graph. This section discusses how to model the conditional
probability of observing a given triplet $e$ at time $t$, denoted as $p(e \mid \mathcal{G}_{<t})$, or specifically
$p(s, r, o \mid \mathcal{G}_{<t})$.
To approximate $p(e \mid \mathcal{G}_{<t})$, we utilize embeddings that represent the temporal-structural
components of both entities and relationships. Let $u_i^*$ and $u_r^*$ be the structural
embeddings for entity $i$ and relation $r$ updated up to time $t$, which are derived from the
KGTransformer in the previous section. Note that the asterisk ($*$) denotes dependence
on the history up to time $t$.
Global Graph Representation. To enable accurate prediction of events in the
TKG, it is crucial to capture not only the local attributes of individual entities and
relations but also the global state of the entire network up to a certain time $t$. A global
graph-level representation $g^*$ serves this purpose by offering a high-level summary that
represents the structural characteristics of all entities in the graph up to time $t$.

Definition 5.2.2. Graph-level Representation. The global embedding $g^*$ is defined as
a high-dimensional vector that aggregates the structural embeddings $u_i^*$ of all entities in
the TKG up to time $t$. Formally, each dimension $j$ of $g^*$ is computed as follows:

$$g_j^* = \max_{i \in \mathrm{Entities}(\mathcal{G}_{<t})} \{u_{i,j}^*\}, \tag{5.21}$$


where the maximum is taken over the $j$-th dimension of all $u_i^*$ vectors, for $i$ ranging over
all entities in $\mathcal{G}_{<t}$ and $j$ being the dimension index. This aggregated
vector $g^*$ serves as a global conditioning variable for computing event-related conditional
probabilities $p(e \mid \mathcal{G}_{<t})$.

The need for a global representation arises from its ability to capture complex
dependencies between entities and relations that might not be directly connected. It acts
as a bridging feature that unifies disparate local structures into a coherent whole, thus
providing a richer context when estimating conditional probabilities related to events.
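The dimension-wise max-pooling of Equation 5.21 is straightforward to sketch (the embedding values below are illustrative):

```python
def graph_embedding(entity_embs):
    """Equation 5.21: elementwise max over all entity structural embeddings."""
    return [max(col) for col in zip(*entity_embs)]

# Toy example: three 4-D entity embeddings pooled into one graph vector.
g = graph_embedding([
    [0.1, 0.9, -0.3, 0.0],
    [0.5, 0.2,  0.4, -1.0],
    [-0.2, 0.8, 0.1, 0.3],
])
print(g)  # [0.5, 0.9, 0.4, 0.3]
```

Max-pooling keeps, per dimension, the strongest activation across all entities, so $g^*$ stays a fixed-size summary regardless of how many entities the graph accumulates over time.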
Decomposing the Conditional Probability in Symmetry. Lastly, we decompose
$p(e \mid \mathcal{G}_{<t}) = p(s, r, o \mid \mathcal{G}_{<t})$ into entity- and relation-level components. To
account for the directional nature of the edges in the TKG, we propose two separate conditional
probability models, one for the source and one for the object entity:

Model for Predicting the Source Entity Link ($s$): In this model, we decompose
the joint conditional probability $p(s, r, o \mid \mathcal{G}_{<t})$ as:

$$p(s, r, o \mid \mathcal{G}_{<t}) = p(o \mid \mathcal{G}_{<t}) \times p(r \mid o, \mathcal{G}_{<t}) \times p(s \mid r, o, \mathcal{G}_{<t})$$

We parameterize each term using an MLP as follows:

$$p(s \mid r, o, \mathcal{G}_{<t}) = \mathrm{MLP}([u_r^* \,\|\, u_o^* \,\|\, g^*]),$$
$$p(r \mid o, \mathcal{G}_{<t}) = \mathrm{MLP}([u_o^* \,\|\, g^*]),$$
$$p(o \mid \mathcal{G}_{<t}) = \mathrm{MLP}(g^*).$$

Model for Predicting the Object Entity Link ($o$): For symmetry, we also
define a separate model with the decomposition:

$$p(s, r, o \mid \mathcal{G}_{<t}) = p(s \mid \mathcal{G}_{<t}) \times p(r \mid s, \mathcal{G}_{<t}) \times p(o \mid s, r, \mathcal{G}_{<t})$$

Similarly, each term in this model is parameterized using an MLP, with the
source entity embeddings $u_s^*$ replacing the object entity embeddings $u_o^*$ of the
previous model.

These two models aim to capture the evolving structure by considering both the source
and object entity links. Each model is tailored to predict either s or o conditioned on
the other variables, thereby providing a more symmetrical approach to modeling the
dynamics of the graph.

5.2.3. Parameter Learning and Inference


Training the KGTransformer model to adequately capture the dynamics of TKGs
requires a specialized approach. In this section, we introduce the proposed loss function
and the learning algorithm, both of which are designed to allow the model to learn both
the temporal and structural elements effectively.


Composite Loss Function

The composite loss function $\mathcal{L}$ aims to holistically capture three major aspects of a
dynamic knowledge graph: the timing between events $\mathcal{L}_{\mathrm{iet}}$, the source entity's relation
to other entities $\mathcal{L}_{\mathrm{triple\text{-}src}}$, and the object entity's relation to other entities $\mathcal{L}_{\mathrm{triple\text{-}obj}}$.
Formally, given an event triplet $e = (s, r, o)$ at time point $t$ in a TKG, we define the
individual loss components as the negative log-likelihoods:

$$\mathcal{L}_{\mathrm{iet}}(e, t) = -\log p(t \mid e, \mathcal{G}_{<t}) = -\log p(\tau \mid w_e^*, \mu_e^*, \sigma_e^*)$$

$$\mathcal{L}_{\mathrm{triple\text{-}src}}(e, t) = -\log p(s, r, o \mid \mathcal{G}_{<t}) = -\log p(o \mid \mathcal{G}_{<t}) - \log p(r \mid o, \mathcal{G}_{<t}) - \log p(s \mid r, o, \mathcal{G}_{<t})$$

$$\mathcal{L}_{\mathrm{triple\text{-}obj}}(e, t) = -\log p(s, r, o \mid \mathcal{G}_{<t}) = -\log p(s \mid \mathcal{G}_{<t}) - \log p(r \mid s, \mathcal{G}_{<t}) - \log p(o \mid s, r, \mathcal{G}_{<t})$$
Here, $\mathcal{L}_{\mathrm{iet}}$ captures the question of "when": it models the time that should
elapse between events in the KG. $\mathcal{L}_{\mathrm{triple\text{-}src}}$ and $\mathcal{L}_{\mathrm{triple\text{-}obj}}$ capture the "who does what
to whom": they model the interactions between entities and relations. While both
model the same distribution, they serve distinct purposes: $\mathcal{L}_{\mathrm{triple\text{-}src}}$ focuses on predicting
the source entity given the object entity and the relation, using the object entity
as the starting point to understand the interactions arriving at it; $\mathcal{L}_{\mathrm{triple\text{-}obj}}$ does
the opposite by looking at the interactions emanating from the source entity. These two
components together allow our model to grasp the symmetric nature of relationships in
KGs.
Finally, we combine these individual loss components into a single composite loss
function $\mathcal{L}$ as follows:

$$\mathcal{L} = \sum_{t} \sum_{e \in \mathcal{G}_t} \lambda_1 \mathcal{L}_{\mathrm{iet}}(e, t) + \lambda_2 \mathcal{L}_{\mathrm{triple\text{-}src}}(e, t) + \lambda_2 \mathcal{L}_{\mathrm{triple\text{-}obj}}(e, t) \tag{5.22}$$

In this equation, $\lambda_1$ and $\lambda_2$ are hyperparameters that balance the importance of
inter-event timing against event link prediction.
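The combination in Equation 5.22 can be sketched as a weighted sum of negative log-likelihoods; the probabilities and the weights `lam1`, `lam2` below are illustrative placeholders, not values produced by the trained models:

```python
import math

def composite_loss(events, lam1=1.0, lam2=0.5):
    """Equation 5.22: weighted sum of negative log-likelihood components.
    Each event supplies probabilities from the time and structure models."""
    total = 0.0
    for p_time, p_src_chain, p_obj_chain in events:
        l_iet = -math.log(p_time)                        # "when"
        l_src = -sum(math.log(p) for p in p_src_chain)   # p(o), p(r|o), p(s|r,o)
        l_obj = -sum(math.log(p) for p in p_obj_chain)   # p(s), p(r|s), p(o|s,r)
        total += lam1 * l_iet + lam2 * (l_src + l_obj)
    return total

# One toy event with illustrative model probabilities.
loss = composite_loss([(0.8, [0.5, 0.6, 0.7], [0.4, 0.6, 0.9])])
print(round(loss, 4))  # 1.7697
```

Raising `lam1` relative to `lam2` shifts the training signal toward event-timing accuracy and away from link prediction, and vice versa.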

Learning Algorithm
Given the complex temporal and relational dependencies in TKGs, simplifying events
into independent sequences for training is suboptimal, as it would lead to a loss of relational
information. Moreover, tracking the entire history for each entity and relation in a
large-scale, long-term TKG could be computationally expensive given the RNN model
architecture.
To tackle these challenges, we design our parameter learning algorithm to process
events in parallel along timestamps, as shown in Algorithm 1. We also employ truncated
backpropagation through time (TBPTT) to limit the computational and memory requirements.
TBPTT is a variant of the standard backpropagation algorithm adapted for sequence-based
data, in which backpropagation is applied only to a truncated sequence of
temporal events to approximate the gradient over the entire sequence. Specifically, in


this training of KGTransformer, backpropagation is truncated every $b$ time steps to
make the training process more manageable and efficient.
This approach allows KGTransformer to learn highly contextualized representations
while efficiently handling the complexities introduced by dynamic, large-scale knowledge
graphs.

Algorithm 1: Optimizing Parameters for KGTransformer

Input: TKG training graph $\mathcal{G}_{\mathrm{train}}$, validation graph $\mathcal{G}_{\mathrm{validation}}$, early-stopping
patience $P$, maximum number of training epochs $E$, maximum time steps $B$ for TBPTT
1 Initialize epoch $\leftarrow$ 1
2 repeat
3   Initialize zero embeddings for the entities and relations of $\mathcal{G}_0$
4   foreach $t$ in the timestamps of $\mathcal{G}_{\mathrm{train}}$ do
5     Get the temporal graph $\mathcal{G}_t$
6     Compute the loss $\mathcal{L}_t$ for the concurrent events in $\mathcal{G}_t$ as in (5.22)
7     Backpropagate the loss to update the model parameters, limiting the error
      propagation to the last $B$ time steps to approximate the full gradient
8     foreach entity $i$ in $\mathcal{G}_t$ do
9       Compute and update the embeddings for entity $i$
10    foreach relation $r$ in $\mathcal{G}_t$ do
11      Compute and update the embeddings for relation $r$
12  Evaluate model performance on $\mathcal{G}_{\mathrm{validation}}$
13  epoch $\leftarrow$ epoch + 1
14 until no improvement on $\mathcal{G}_{\mathrm{validation}}$ for $P$ evaluations or epoch $= E$
Output: Optimized KGTransformer parameters

Link Prediction Inference

This subsection delves into the two primary forms of the temporal link prediction inference
task conducted by the KGTransformer model: single-step and multi-step inference over
time. These methods serve as the practical applications of the KGTransformer's learnt
temporal and relational embeddings.
Single-step Inference Over Time. The goal of single-step inference is to predict
either the most likely object $o$ or source $s$ for a given query quadruple $q = (s, r, o, t)$,
leveraging the accumulated knowledge in the form of the temporal graph $\mathcal{G}_{<t}$ up to time
$t$.
To predict either $o$ or $s$, we generate perturbed quadruples in two ways: $q' = (s, r, o', t)$
for object prediction, and $q' = (s', r, o, t)$ for source prediction. These perturbed quadruples
involve replacing either $o$ with $o'$ or $s$ with $s'$, iterating over all possible entities in
the graph. The scoring for these perturbed quadruples employs the following generalized
conditional probability functions:


$$p_{\mathrm{obj}}(q') = \log p(o' \mid s, r, \mathcal{G}_{<t}) = \mathrm{MLP}([u_s^* \,\|\, u_r^* \,\|\, g^*]) \tag{5.23}$$

$$p_{\mathrm{src}}(q') = \log p(s' \mid o, r, \mathcal{G}_{<t}) = \mathrm{MLP}([u_o^* \,\|\, u_r^* \,\|\, g^*]) \tag{5.24}$$

Here, $u_s^*$, $u_r^*$, and $u_o^*$ are the embeddings for the source, relation, and object
respectively, and $g^*$ is the global graph-level embedding, as introduced in the previous
modelling section. The embeddings are all rendered by KGTransformer at time $t$.
The scores for the perturbed quadruples $q'$ are sorted in descending order. This scoring
mechanism leverages the embeddings and temporal relationships captured during the training
phase, integrating these contextual factors to make temporally sensitive and relationally
informed predictions.
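Ranking the perturbed quadruples reduces to sorting candidate scores; a minimal sketch (the entity names and score values are hypothetical) of how a rank, and hence metrics such as MRR, would be derived:

```python
def rank_of_true(scores, true_entity):
    """Sort candidate entities by score (descending) and return the
    1-based rank of the ground-truth entity, as used for MRR / Hits@k."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(true_entity) + 1

# Toy scores for perturbed quadruples (s, r, o', t), one score per candidate o'.
scores = {"ChatGPT": 2.3, "GPT-4": 1.9, "DALL-E": 0.4}
r = rank_of_true(scores, "ChatGPT")
print(r, 1.0 / r)  # rank 1, reciprocal rank 1.0
```

Averaging the reciprocal ranks over all test quadruples yields the Mean Reciprocal Rank, the standard headline metric for this task.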
Multi-step Inference Over Time. Beyond single-step link prediction, we also consider
the more challenging problem of multi-step inference over time using the sampling
method proposed by Jin et al. (2019), aiming to predict a sequence of future events at
time $t + \Delta t$ where $\Delta t > 0$. The framework can be formalized as:

$$p(\mathcal{G}_{t+\Delta t} \mid \mathcal{G}_{<t}) = \sum_{\mathcal{G}_{t:t+\Delta t-1}} p(\mathcal{G}_{t+\Delta t}, \mathcal{G}_{t:t+\Delta t-1} \mid \mathcal{G}_{<t}) \tag{5.25}$$

$$= \sum_{\mathcal{G}_{t:t+\Delta t-1}} p(\mathcal{G}_{t+\Delta t} \mid \mathcal{G}_{t:t+\Delta t-1}, \mathcal{G}_{<t})\, p(\mathcal{G}_{t:t+\Delta t-1} \mid \mathcal{G}_{<t}) \tag{5.26}$$

$$= \mathbb{E}_{\mathcal{G}_{t:t+\Delta t-1} \mid \mathcal{G}_{<t}}\left[p(\mathcal{G}_{t+\Delta t} \mid \mathcal{G}_{t:t+\Delta t-1}, \mathcal{G}_{<t})\right] \tag{5.27}$$

$$\approx p(\mathcal{G}_{t+\Delta t} \mid \hat{\mathcal{G}}_{t:t+\Delta t-1}, \mathcal{G}_{<t}) \tag{5.28}$$

Here, $\hat{\mathcal{G}}_{t:t+\Delta t-1}$ refers to a series of events sampled from the model's
conditional distribution as an approximation of the graph states between the current
and future times. Starting from the immediate next state $p(\mathcal{G}_t \mid \mathcal{G}_{<t})$, we iteratively
update our approximations to project what the graph might look like at $t + \Delta t$. This
iterative methodology allows KGTransformer to predict intricate event sequences that
may happen in the future.
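The iterative approximation in Equation 5.28 can be mimicked with a toy rollout, in which the model's top-scoring prediction stands in for the sampled intermediate graph state $\hat{\mathcal{G}}$; the `toy_predict` model below is purely illustrative:

```python
def multi_step_forecast(history, predict_next, delta_t):
    """Approximate multi-step inference (Equation 5.28): take the model's
    prediction as the sampled next graph state and roll forward delta_t steps."""
    hist = list(history)
    for _ in range(delta_t):
        hist.append(predict_next(hist))   # \hat{G} for the next time step
    return hist[-1]

# Toy "model": predicts whichever event occurs most often in the history.
def toy_predict(hist):
    return max(set(hist), key=hist.count)

final = multi_step_forecast(["a", "b", "a"], toy_predict, delta_t=3)
print(final)  # 'a'
```

Because each step conditions on previously predicted states, errors can compound with the horizon, which is why multi-step inference is markedly harder than the single-step case.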


6. Experiments
We conduct comprehensive experiments to evaluate the efficacy of our proposed
KGTransformer model, particularly focusing on the extrapolative temporal link prediction
task on Temporal Knowledge Graphs (TKGs).

6.1. Experimental Setup


We benchmark KGTransformer against an array of both traditional static knowledge
graph models and contemporary temporal reasoning models. We employ a suite of five
publicly accessible real-world TKGs alongside the new Financial Dynamic Knowledge
Graph (FinDKG), uniquely created for this research.
Baseline KG Models. Our baseline models include a variety of approaches suited
to both static and temporal graphs.

Static Graph Models. By disregarding edge timestamps, we assemble a static,
cumulative representation of the graph that accounts for all training events. This
model employs multi-relational graph representation learning techniques, where we
specifically opt for R-GCN given its foundational role in temporal methods.

Extrapolative Temporal Graph Models. Our temporal baselines feature
state-of-the-art models such as RE-Net and EvoKG. These models are explicitly
designed for the task of extrapolative prediction of future edges. We intentionally
exclude interpolation-focused models such as Know-Evolve from this category.

Variants of KGTransformer. To examine the value added by KGTransformer's
meta-relation-specific framework, which includes entity types as well as relations,
we compare against a stripped-down version, KGTransformer w/o node type.
This variant treats all entities as a default type, thus relying only on
multi-relational edge information, and serves as a control for testing the
efficacy of incorporating entity type data when available.

Predictive evaluations are conducted on future timestamps not encountered during
the training phase, thus utilizing hold-out test sets. Additionally, a validation set is
employed to implement an early-stopping mechanism to mitigate potential overfitting.
The time sequence thus follows a strict order: the training set, followed by the validation
set, and finally the test set.
Entities newly introduced at test time are subject to random initialization of their
embeddings. To account for the inherent variability in deep learning graph model training,


we employ three distinct random seeds. The final results are reported as averages over
these different training runs, solidifying the robustness of our findings. It is worth noting
that the results across different seeds exhibit minimal variance on these large-scale,
real-world knowledge graph data.

6.2. Implementation Details


We developed a specialized Python-based library named DKG (Dynamic Knowledge
Graph) to implement the KGTransformer model and to support efficient model infer-
ence. This library is constructed atop the PyTorch framework and leverages the Deep
Graph Library (DGL, Wang et al. (2019b)) for graph deep learning operations, adher-
ing to a modular design paradigm. Within this code architecture, the KGTransformer
model is instantiated with an embedding dimensionality of 200 for both structural and
temporal embeddings. The model comprises two layers of KGTransformer blocks, each
layer furnished with 8 meta-relation-styled attention heads. The source code of our
KGTransformer model is made publicly accessible via GitHub.
As for the baseline models, we adhere to the original implementations provided in the
respective papers for RE-Net and EvoKG, executed with their default hyperparameter
configurations. For R-GCN, we utilize the reference implementation available in the
Deep Graph Library (https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn).
All evaluated models, including the baselines, employ the AdamW optimization algorithm
(Loshchilov and Hutter, 2017) with a learning rate of 5 × 10⁻⁴. An early-stopping
criterion, triggered after 10 epochs without improvement in validation set performance,
is also employed to optimize model performance. Both model training and evaluation
are conducted in an identical computational environment, namely a single NVIDIA
A100 GPU cloud server equipped with 40GB of memory.

6.3. Performance Comparison


The primary evaluation task we focus on is temporal link prediction, which involves
predicting the likelihood of relationships between entities in a knowledge graph at future
time stamps. The quality of these predictions is evaluated based on predicting the true
test quadruples q = (s, r, o, t), where s is the source entity, r is the relationship type, o
is the target entity, and t is the timestamp.
The performance of the model in this task is quantitatively measured using two pri-
mary metrics: Mean Reciprocal Rank (MRR) and Hits@n. Higher values of MRR and
Hits@n (specifically, Hits@{3,10} in our experiments) imply better model performance.
In the inference phase, the models are dynamically updated at each time point based on
the most recent snapshot of the graph, thereby facilitating a more accurate prediction
in subsequent time steps.



https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn


Definition 6.3.1. Mean Reciprocal Rank (MRR). Given a set Q of test quadruples,
the Reciprocal Rank (RR) for a test quadruple q is computed as

    RR(q) = 1 / rank_q,

where rank_q is the rank of the true link in the sorted list of scores for q and its
corresponding perturbed quadruples. The Mean Reciprocal Rank (MRR) is then defined
as the average RR over all test quadruples:

    MRR = (1 / |Q|) Σ_{q ∈ Q} RR(q).

Definition 6.3.2. Hits@n. Hits@n measures the fraction of true links that appear
within the top-n positions in the ranked list of scores for each test quadruple and its
perturbed variants. Mathematically, it is defined as

    Hits@n = (Number of true links in top-n) / (Total number of test quadruples).
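Given the rank of each true link among its perturbed alternatives, both metrics reduce to a few lines (a minimal sketch; score filtering and tie handling are omitted):

```python
def mean_reciprocal_rank(ranks):
    """MRR: average of 1/rank over all test quadruples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_n(ranks, n):
    """Fraction of true links ranked within the top-n positions."""
    return sum(1 for r in ranks if r <= n) / len(ranks)
```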

6.3.1. Results on Benchmark Temporal Knowledge Graphs


Table 6.1 displays the temporal link prediction scores across the five benchmark TKGs.
The static method R-GCN lags behind its temporal counterparts, underscoring the
importance of modeling the temporal dynamics of these TKGs. All temporal reasoning
models deliver competitive performance. KGTransformer excels in particular on the
YAGO and WIKI datasets, which feature a high density of temporally consistent edge
events. It is worth noting that these benchmark datasets contain no entity type
information, so KGTransformer relies only on multi-relational attention.

YAGO WIKI GDELT ICEWS14 ICEWS18

Model MRR H@3 H@10 MRR H@3 H@10 MRR H@3 H@10 MRR H@3 H@10 MRR H@3 H@10

R-GCN 27.43 31.24 44.75 13.96 15.75 22.05 12.17 12.37 20.63 15.03 16.12 31.47 15.06 16.49 28.99

RE-Net 46.35 51.93 61.47 31.45 34.23 41.15 19.06 20.31 33.21 23.81 26.57 42.62 26.81 30.86 45.52

EvoKG 49.86 57.69 65.42 42.56 47.18 52.34 18.27 19.51 31.92 24.24 27.25 41.97 26.80 30.79 45.27

KGTransformer 51.33 59.22 67.15 44.32 49.27 53.81 18.21 19.29 31.91 23.98 26.89 41.22 26.50 30.10 44.17

Table 6.1. Performance comparison across different models on five real-world TKG
benchmark datasets. Best results are in bold.

6.3.2. Results on Financial Dynamic Knowledge Graph


The results on the novel FinDKG dataset are depicted in Figure 6.1. In this dataset, en-
tities are enriched with metadata attributes for their types. Consistent with previous ob-
servations, the static R-GCN performs suboptimally compared to its temporal counter-


parts. RE-Net and EvoKG exhibit closely matched performance. The KGTransformer
variant without node type modeling (“KGTransformer w/o node types”) aligns closely
with these temporal baselines. However, when entity type information is integrated into
KGTransformer as auxiliary features, the full model (“KGTransformer”) achieves
meaningful improvements, approximately a 10% uplift in both MRR and Hits@{3,10}.
By leveraging both entity and relation metadata, KGTransformer surpasses existing
state-of-the-art methods in temporal link prediction tasks.

Figure 6.1. Performance comparison of models on the FinDKG dataset, evaluated on


MRR and Hits@3,10.

6.4. Ablation Study


We perform an ablation study on the FinDKG dataset, examining the influence of
various hyperparameters on KGTransformer performance. As a baseline for comparison,
our control configuration consists of an embedding size of 200, two KGTransformer
layers, and four attention heads per layer. We report the change in MRR and Hits@3
link prediction performance of KGTransformer on the FinDKG dataset. In general,
our findings, shown in Table 6.2, suggest that KGTransformer is relatively stable
across different hyperparameter settings, exhibiting only marginal performance


variations.
Embedding Size. We experiment with different embedding sizes, namely 100, 200,
and 400, for both temporal and structural embeddings. The results indicate that an
embedding size of 400 yields superior performance. We postulate that the increased
dimensionality allows for a richer feature space, effectively capturing complex relations
and temporal dynamics.
Number of Transformer Layers. The architecture of the KGTransformer involves
stacking multiple transformer layers. We explore configurations with 1, 2, and 3 layers
and observe that a single KGTransformer layer offers the most robust performance. We
conjecture that additional layers lead to overfitting.
Number of Attention Heads. To analyze the impact of the number of attention
heads, we experiment with 1, 4, and 8 attention heads in the KGTransformer layers.
The results suggest that 8 attention heads yield slightly better performance than the
4-head configuration.

Figure 6.2. Parameter sensitivity of KGTransformer. The subfigures exhibit the effects
of varying (a) the embedding size for both temporal and structural embeddings,
(b) the number of stacked KGTransformer layers, and (c) the number
of meta-relation attention heads per KGTransformer layer.


7. Applications of Financial Dynamic Knowledge Graph
In this chapter, we detail the application of the KGTransformer model on the FinDKG
dataset, denoted as FinDKG model for simplicity, to address real-world financial chal-
lenges. Harnessing the power of this trained model, the chapter delves into generating
insightful signals for various financial applications. Specifically, we introduce two pivotal
use cases aimed at addressing foundational challenges in finance: risk management and
thematic investing. While previous chapters have established the theoretical foundation
and evaluated the KGTransformer’s capabilities, this chapter extends that evaluation by
deploying the model in real-world scenarios as out-of-sample case studies.
The KGTransformer-powered FinDKG model offers several key advantages that set it
apart from traditional financial models and conventional static knowledge graphs:

Intelligent Reasoning: Beyond the practice of conventional financial NLP, which
often revolves around simple mention counts or frequencies, the dynamic knowledge
graph model is adept at drawing inferences, uncovering hidden relationships and
connections in a more logical way.

Dynamic Expressiveness: Unlike conventional static knowledge graphs, which
are limited to modeling stable knowledge, the FinDKG stands out with its inherent
dynamism. It captures the time-varying interdependencies of financial entities,
mirroring the intrinsically dynamic interactions of variables within the financial
sphere.

Forward-looking Prediction: Distinctly designed with an extrapolative approach,
the FinDKG is poised to anticipate and capture forthcoming events, granting it
a predictive edge.

In the realm of investment and risk management, the capacity for intelligent reasoning
and inference is paramount. This is especially true in the complex landscape of finan-
cial interdependencies — spanning from individual companies and markets to global
economies. The FinDKG model leverages graph-based techniques to provide a structured
approach for inferring these complex relationships, making it uniquely suited for
tasks such as quantifying risk within the framework of complex contagion, a phenomenon
in which multiple sources of exposure are necessary to induce a behavioral change.


7.1. Dynamic Financial Risk Tracking with FinDKG


At the core of FinDKG’s advantage over static Knowledge Graphs (KGs) lies its ability
to model temporal dependencies: it generates temporal graph embeddings that adapt
to the changing economic and financial market environment, thus better modeling the
uncertainty of events. Moreover, such dynamic signals make it possible to track the
trend of potential risk events.
To implement this, we form a series of FinDKG snapshots: rolling 1-month knowledge
graphs assembled every week on Sundays. Each graph stores the knowledge triplets of
the preceding month, offering a fresh perspective on contemporary financial information.
A sequential assessment is then conducted on specific risk entities within these snapshots
over time.
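The snapshot assembly can be sketched as below, assuming quadruples carry `datetime.date` timestamps; the exact window bookkeeping in FinDKG may differ.

```python
from datetime import date, timedelta

def weekly_snapshots(quads, start: date, end: date, window_days: int = 28):
    """Yield (sunday, snapshot) pairs, where each snapshot holds the
    quadruples observed in the `window_days` preceding that Sunday."""
    # advance to the first Sunday on or after `start` (Mon=0, ..., Sun=6)
    day = start + timedelta(days=(6 - start.weekday()) % 7)
    while day <= end:
        lo = day - timedelta(days=window_days)
        yield day, [(s, r, o, t) for (s, r, o, t) in quads if lo < t <= day]
        day += timedelta(days=7)
```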
Four graph metrics were used to quantify the significance of an entity at each temporal
knowledge graph:

Degree Centrality: This measures the number of edges a node has. In our financial
knowledge graph context, a high degree centrality indicates a financial entity,
such as a market or company, with numerous direct connections or interactions.

Betweenness Centrality: This measures the extent to which a node lies on
paths between other nodes. Nodes with high betweenness centrality can act as
bridges or bottlenecks in the network. In the FinDKG context, an entity with high
betweenness centrality might be an intermediary that facilitates many impacts as
a contagion trigger among other entities.

Eigenvector Centrality: This assigns scores to all nodes based on their connections,
with connections to higher-scored nodes being more valuable than those to
lower-scored nodes. In FinDKG, a high eigenvector centrality score could indicate
a financial entity that is not only well connected (as with high degree centrality),
but is also linked to by other significant entities.

PageRank: Originally designed to rank web pages, PageRank operates on the
premise that important nodes are likely to be connected to by other important
nodes. It can be interpreted similarly to eigenvector centrality.
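As a sketch, these four measures can be computed on a snapshot's edge list with `networkx`; whether FinDKG evaluates them on the directed or undirected view is an implementation choice, and the undirected view is assumed here for betweenness and eigenvector centrality.

```python
import networkx as nx

def centrality_panel(edges):
    """Compute the four entity-significance measures for one snapshot.
    `edges` is an iterable of (source, target) pairs."""
    G = nx.DiGraph(edges)
    U = G.to_undirected()
    return {
        "degree": nx.degree_centrality(U),
        "betweenness": nx.betweenness_centrality(U),
        "eigenvector": nx.eigenvector_centrality(U, max_iter=1000),
        "pagerank": nx.pagerank(G),
    }
```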

To ensure comparability across different centrality measures and across time, we
introduce a time series standardization technique. At each point in time, an index is
computed as the rolling 1-year z-score of the original measure. Under the assumption
of a normal distribution, this z-score is economically interpretable, with high values
corresponding to rare, “large sigma” events. The time series z-score of an entity’s graph
centrality score x_t at time t is computed as:

    z_t = (x_t − µ_t) / σ_t                                        (7.1)

where:
where:


µ_t represents the rolling mean calculated over the past year (52 weeks if weekly
sampled) up to time t;

σ_t denotes the rolling standard deviation over the same window.
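Equation (7.1) is straightforward to apply with pandas; requiring a full 52-week history before emitting a score is an assumption of this sketch.

```python
import pandas as pd

def rolling_zscore(x: pd.Series, window: int = 52) -> pd.Series:
    """z_t = (x_t - mu_t) / sigma_t over a rolling `window` ending at t."""
    mu = x.rolling(window, min_periods=window).mean()
    sigma = x.rolling(window, min_periods=window).std()
    return (x - mu) / sigma
```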

Covid-19 Case Study. The global Covid-19 pandemic, a paradigmatic exogenous
shock, was selected as a case study. Figure 7.1 chronicles the development of the Covid-19
entity as inferred by FinDKG. FinDKG’s centrality measures effectively capture pivotal
moments in the pandemic timeline. These measures highlight the January 2020 outbreak
in China and its subsequent global impact in March. Interestingly, PageRank stands
as the lone centrality metric that did not promptly register the initial lockdown in
China. However, both eigenvector centrality and PageRank were quick to emphasize the
significance of the initial vaccine releases by Pfizer and Moderna in December 2020.

Figure 7.1. Evolution of Covid-19 entity centrality measures over time. All indices are
sampled at weekly frequency and transformed using rolling 1-year z-score
standardization.

To provide a more holistic perspective, we introduce an ensemble approach by combining
various centrality measures. This KG-based composite centrality is defined as the
equal-weighted average of the time series z-scored degree, betweenness, and eigenvector
centralities, with PageRank excluded due to its observed lag.
Moreover, a standard mention-based measure, commonly utilized in financial NLP,
serves as a comparison metric. This news coverage metric treats each entity as a topic
and evaluates topic significance based on the frequency of news headlines mentioning
the topic within a given timeframe. For Covid-19, headlines containing specific keywords
such as “covid”, “covid-19”, “global pandemic”, and “coronavirus” were tracked. As is


shown in Figure 7.2, both indices display converging trends; however, the KG centrality
offers distinct structural insights, specifically the global recovery from Covid in early
2022, identifying events that have a more profound systematic impact.
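The mention-based baseline can be sketched as a simple keyword count per ISO week; the keyword list follows the text above, while the data layout is an assumption.

```python
import re
from datetime import date

COVID_KEYWORDS = ["covid", "covid-19", "global pandemic", "coronavirus"]

def weekly_mention_counts(headlines, keywords=COVID_KEYWORDS):
    """Count keyword-matching headlines per (ISO year, ISO week).
    `headlines` is an iterable of (date, text) pairs."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords),
                         re.IGNORECASE)
    counts = {}
    for day, text in headlines:
        if pattern.search(text):
            key = tuple(day.isocalendar())[:2]  # (ISO year, ISO week)
            counts[key] = counts.get(key, 0) + 1
    return counts
```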

Figure 7.2. Trend of Covid-19 composite centrality in comparison with the news mention-
based measure. The composite centrality is an equal-weighted average of the
time series z-scored degree, betweenness, and eigenvector centralities.
Both indices are generated at weekly frequency.


7.2. FinDKG for Thematic Investing


Thematic investing is an investment strategy that targets specific themes or trends that
are anticipated to shape the future landscape of industries and economies (Somefun
et al., 2023). Rather than adhering to traditional sector-based approaches or geographic
classifications, thematic investing focuses on overarching trends, such as artificial in-
telligence, clean energy, or aging populations, aiming to capitalize on their long-term
potential. This approach seeks to identify and invest in companies or assets poised to
benefit from these structural shifts, regardless of their sector or location.
While structured financial datasets provide extensive quantitative metrics, such as
balance sheet data, they fall short in capturing intangible thematic influences. Textual
data, although rich in information, often offer noisy measures of thematic exposure.
For instance, companies engaging in “greenwashing” frequently disclose Environmental,
Social, and Governance (ESG) commitments without substantive action.
Knowledge graphs constructed from financial texts can disambiguate semantic meanings
to quantify a company’s exposure to targeted themes by analyzing various data sources,
including official filings, news articles, and patents. However, commonly used static
knowledge graphs fail to account for the evolving nature of thematic trends, exacerbating
investment risks associated with backward-looking historical data.
Dynamic knowledge graphs, such as the FinDKG model, offer a compelling solution
to this challenge. FinDKG models temporal dependencies and generates temporal graph
embeddings, thereby facilitating more accurate and timely thematic measures. Furthermore,
FinDKG is designed to make forward-looking estimates of events, enhancing its
utility for identifying genuinely theme-exposed companies.
AI Theme Investing Case Study. We demonstrate the utility of FinDKG in
gauging corporate exposure to AI, a theme that has seen heightened interest since the
launch of OpenAI’s ChatGPT. The objective is to quantitatively measure how closely
aligned financial entities are to AI and to generate forward-looking exposure scores. This
is framed as a temporal link prediction problem, a task for which the KGTransformer
model is particularly well-suited. Specifically, we employ the model to predict the most
probable event triplets relating to the AI entity.
To facilitate cross-sectional comparison, we limit our entity pool to companies (equities).
We further expand the AI measure by considering both supply and demand sides, i.e.,
companies that produce AI technologies and those that are disrupted by them. This
duality is captured through two factors: AI creators and those impacted by AI.
Figure 7.3 illustrates the top 20 companies identified by FinDKG as AI creators, plus
the top names impacted by AI, as of January 1st, 2023. The model was trained on
graph data from January 2018 to September 2022, while using the trailing three months
as a validation set for early stopping. The model forecast represents an out-of-sample
prediction of AI linkage likelihood for the upcoming week.
The list demonstrates some company overlap but generally features a diverse pro-
file. For instance, semiconductor companies like AMD and NVIDIA, along with Big
Tech firms like Google, are prevalent on the AI creator side, while companies like UPS
feature on the AI-impacted list. NVIDIA emerges as the most AI-exposed company,


Figure 7.3. Top 20 companies identified as AI creators and those most impacted by AI,
as of January 1, 2023, according to FinDKG.

demonstrating significant influence both as an AI producer and an entity impacted by
AI.
Interestingly, Microsoft, a major investor in OpenAI, does not rank highly in AI exposure
at that date according to the metrics. A detailed review of the source news corpus
reveals that, as of late December 2022, only three articles related to ChatGPT were
captured as finance-related news. This paucity suggests room for enhancing FinDKG
by incorporating additional text data sources, such as patent application datasets (Suzgun
et al., 2022), for a richer information set.
This AI case study underlines the efficacy of the FinDKG model in refining thematic
investment strategies. The model’s ability to adapt to temporal changes and predict
future states offers a significant advancement over traditional approaches, setting a new
precedent in the thematic investing domain.


7.3. An Open-source FinDKG Portal for Global Financial Systems
Finally, we introduce a cutting-edge, real-time FinDKG AI network portal. This
FinDKG portal offers a stream of timely analytical insights into global macroeconomics
and finance, all driven by the underlying engine of the FinDKG coupled with the
KGTransformer graph embedding model. Taking a further step to democratize the
potential of knowledge graphs in finance, the system is open-source, presented through
an online portal, and designed to continuously update and process future incoming data
streams.
The primary focus of the FinDKG portal is to identify the key entities influencing
global macroeconomics and financial markets. It focuses specifically on entities classified
as events, concepts, or economic indicators, which act as exogenous shocks in
macroeconomic contexts (Bybee et al., 2021). For instance, the Covid-19 pandemic
serves as a case study exemplifying an external shock whose impacts are actively
monitored by FinDKG.
In the second stage of this framework, FinDKG’s dynamic property enables it to predict
the impact of such shocks. For each influential entity s, the KGTransformer model is
employed to predict the financial entities o most likely to be impacted in the subsequent
week t + 1, formulated as the link prediction query (s, impact, o, t + 1). The system is
intended to update with streams of new data, thereby offering timely forward-looking
insights.
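The weekly query can be sketched as a scoring pass over candidate objects; the additive scoring rule below is a generic placeholder, not KGTransformer's actual decoder.

```python
import numpy as np

def rank_impacted(subject_emb, relation_emb, object_embs, names, top_k=5):
    """Rank candidate objects o for the query (s, impact, ?, t + 1).
    Scores each candidate by its affinity to the translated query s + r."""
    scores = object_embs @ (subject_emb + relation_emb)  # higher = more likely
    order = np.argsort(-scores)
    return [(names[i], float(scores[i])) for i in order[:top_k]]
```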
FinDKG Core Analytical Table. Figure 7.4 presents a snapshot of the core analytical
table, dated January 1, 2023, which pinpoints the most influential entities driving the
financial knowledge graph universe, namely the entities categorized as events, concepts,
or economic indicators. The table provides a ranked percentile to indicate the relative
importance of each entity, with a “Novelty” score indicating its recent emergence in
the dataset. By examining the “Recent 3-month Trend”, one can gauge the trajectory
of an entity’s influence over time. Most critically, the table forecasts which financial
variables, denoted “Predicted Most Impacted Financial Entities”, are anticipated to be
most affected by these top entities in the subsequent week.
As a specific example, during the week of January 1, 2023, the UK strike event appears
as one of the top emerging entities in FinDKG. The trend plot within the table reveals
that this entity gained significant influence during the last two weeks of December 2022.
The model forecasts a high likelihood that this event will substantially impact the UK
stock market, its economy, and the healthcare sector in the near term.
This advanced analytical portal serves as a novel KG lens, offering an actionable and
holistic view of emerging financial influencers and their projected impacts within the
fast-changing global macro-financial system.


Figure 7.4. Snapshot of FinDKG’s Core Analytical Table as of January 1, 2023, show-
casing the most influential entities in the global financial knowledge graph
and the financial variables projected to be most impacted by them in the
subsequent week.


8. Conclusion
In this study, we set out to address three fundamental research questions about Tempo-
ral Knowledge Graphs (TKGs), Large Language Models (LLMs), and their applications
in finance. We investigated the efficacy of fine-tuned open-source LLMs in generat-
ing Knowledge Graphs (KGs), the performance of our newly-developed KGTransformer
model in TKG learning tasks, and the practical value of the Financial Dynamic Knowl-
edge Graph (FinDKG) system in modelling global financial systems. Our findings offer
substantial contributions to both the theoretical and practical dimensions of these
domains.
Our research validates the efficacy of open-source LLMs in automating KGC tasks.
Specifically, our fine-tuned Integrated Contextual Knowledge Graph Generator (ICKG)
LLM has demonstrated feasibility, scalability, and precision in these generative tasks.
Notably, the ICKG outperformed advanced closed-source LLMs like GPT-3.5 in the
specific KGC task, o↵ering a promising use case of open-source LLMs. The ICKG was
instrumental in constructing our Financial Dynamic Knowledge Graph (FinDKG) from
global news data, thereby enhancing the empirical scope of the study and adding a novel
dataset to the field.
On the subject of TKGs, our proposed KGTransformer model has proven effective,
outperforming traditional static KG models across various TKG benchmark datasets.
It is particularly effective on the FinDKG dataset, where the KGTransformer outperformed
existing TKG approaches, affirming its relevance to our second research question.
The FinDKG dataset itself serves as a unique domain-specific benchmark KG dataset
for further study in TKG learning.
For practical financial applications, our KGTransformer-powered FinDKG system was
applied in real-world financial scenarios such as risk management and thematic investing.
By providing actionable insights through our publicly accessible analytics platform, this
system demonstrates the feasibility and utility of implementing dynamic KGs in global
financial systems, thereby addressing our third research question.
Our open-source contributions, including the ICKG model, KGTransformer, and FinDKG
dataset and analytics platform, are aimed at catalyzing further interdisciplinary research.
We see these resources as foundational for upcoming studies in Graph AI, particularly
within the domains of economics and finance.

Limitations and Future Directions


As an early work in a nascent field, this research serves as a stepping stone for more
dedicated future works in the domains of LLMs, graph machine learning, and the sectors
of economics and finance.


In the emerging realm of LLMs for KGC, there is a need for more comprehensive
evaluation methodologies. While our study preliminarily validates the capabilities of
fine-tuned open-source LLMs, more advanced metrics are needed to rigorously assess
the quality of the generated KGs. Future work should also aim to develop evaluation
frameworks and benchmark datasets for generated KGs. A more sophisticated KGC
system should, for instance, excel in KG completion tasks by inferring missing
relationships from the input text, inviting further research focused on LLMs.
Regarding the TKG learning space, the current KGTransformer architecture effectively
captures the multi-relational aspects of TKGs. However, its use of Recurrent Neural
Networks (RNNs) for temporal modelling is a known limitation, owing to issues with
memory retention. Our initial analysis indicates that the more advanced Long Short-Term
Memory (LSTM) architecture also does not boost performance, signalling the need
for novel architectural innovations. Additionally, incorporating LLM embeddings could
enhance TKGs, possibly leading to hybrid “KGxLLM” models.
Lastly, the study has demonstrated the potential of TKGs in finance. Moving forward,
there is a need for rigorous empirical research, from the perspective of economics and
finance, to examine the value of FinDKG. Specifically, how can TKGs augment exist-
ing empirical macroeconomic models, and how do they compare to existing news-based
Financial NLP studies? Rigorous quantitative studies, rooted in macroeconomics and
econometrics disciplines, are crucial for evaluating the true potential of KGs and in-
troducing graph-based AI methodologies to broader academic audiences in finance and
other social sciences.
In conclusion, this work advances the field of KGs by applying dynamic KGs to model
real-world global financial systems. Through the integration of graph machine learning
and financial analytics, we underscore the transformative potential of AI in finance and
aim to inspire further innovation in this domain.


A. Financial Dynamic Knowledge Graph (FinDKG) Dataset

Entity Category Definition Example


ORG Organizations that are not governmental Imperial College London
or regulatory bodies.
ORG/GOV Governmental bodies. UK Government
ORG/REG Regulatory bodies involved in financial Bank of England
oversight.
GPE Geopolitical entities like countries or United Kingdom
cities.
PERSON Individual people often in influential or Jerome Powell
decision-making roles.
COMP Companies across sectors. Apple Inc.
PRODUCT Tangible or intangible products or ser- iPhone
vices.
EVENT Specific, material events that have finan- Brexit
cial or economic implications.
SECTOR Sectors or industries in which companies Technology Sector
operate.
ECON INDICATOR Non-numerical indicators of economic Inflation Rate
trends or states.
FIN INSTRUMENT Financial and market instruments. S&P 500 Index
CONCEPT Abstract ideas, themes, or financial theo- Artificial Intelligence
ries.

Table A.1. Definitions and Examples of Entity Categories in the Financial Dynamic
Knowledge Graph (FinDKG).


Relationship Definition Example


Has Indicates ownership or possession, often of Google Has Android
assets or subsidiaries in a financial con-
text.
Announce Refers to the formal public declaration of a Apple Announces iPhone 13
financial event, product launch, or strate-
gic move.
Operate In Describes the geographical market in Tesla Operates In China
which a business entity conducts its op-
erations.
Introduce Denotes the first-time introduction of a fi- Samsung Introduces Foldable
nancial instrument, product, or policy to Screen
the market.
Produce Specifies the entity responsible for creat- Pfizer Produces Covid-19 Vac-
ing a particular product, often in a man- cine
ufacturing or financial product context.
Control Implies authority or regulatory power over Federal Reserve Controls Interest
monetary policy, financial instruments, or Rates
market conditions.
Participates In Indicates active involvement in an event United States Participates In
that has financial or economic implica- G20 Summit
tions.
Impact Signifies a notable effect, either positive or Brexit Impacts European Union
negative, on market trends, financial con-
ditions, or economic indicators.
Positive Impact On Highlights a beneficial effect on financial Solar Energy Posi-
markets, economic indicators, or business tive Impact On ESG Ratings
performance.
Negative Impact On Underlines a detrimental effect on finan- Covid-19 Negative Impact On
cial markets, economic indicators, or busi- Tourism Sector
ness performance.
Relate To Points out a connection or correlation AI Relates To FinTech Sector
with a financial concept, sector, or mar-
ket trend.
Is Member Of Denotes membership in a trade group, Germany Is Member Of EU
economic union, or financial consortium.
Invests In Specifies an allocation of capital into a fi- Warren Buffett Invests In Apple
nancial instrument, sector, or business en-
tity.
Raise Indicates an increase, often referring to OPEC Raises Oil Production
capital, interest rates, or production levels
in a financial context.
Decrease Indicates a reduction, often referring to Federal Reserve Decreases Inter-
capital, interest rates, or production levels est Rates
in a financial context.

Table A.2. Definitions and Examples of Relationships in the Financial Dynamic Knowl-
edge Graph (FinDKG).
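The fixed relation vocabulary above can be used as a filter on extracted triplets. The sketch below is illustrative rather than the thesis's implementation; the function name and validation rule are assumptions for exposition.

```python
# Illustrative sketch: validating extracted triplets against the fixed
# FinDKG relation vocabulary of Table A.2 (not the thesis's actual code).

FINDKG_RELATIONS = {
    "Has", "Announce", "Operate In", "Introduce", "Produce", "Control",
    "Participates In", "Impact", "Positive Impact On", "Negative Impact On",
    "Relate To", "Is Member Of", "Invests In", "Raise", "Decrease",
}

def is_valid_triplet(subject: str, relation: str, obj: str) -> bool:
    """Keep a triplet only if both entities are non-empty and the
    relation belongs to the fixed FinDKG vocabulary."""
    return bool(subject) and bool(obj) and relation in FINDKG_RELATIONS

print(is_valid_triplet("Google", "Has", "Android"))          # True
print(is_valid_triplet("Brexit", "Disrupts", "EU Markets"))  # False: relation not in schema
```

A constrained vocabulary like this keeps LLM-extracted edges comparable across documents and over time, at the cost of discarding relations outside the schema.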

Electronic copy available at: https://ssrn.com/abstract=4608445



B. ICKG Instruction-following Large Language Model for Knowledge Graph Construction
The prompt template for knowledge graph construction with ICKG is available on the project's online portal.
Table B.1 compares the responses generated by different LLMs: Vicuna-7B-v1.3, GPT-3.5, GPT-4, and the proposed ICKG model. Each model's output is represented as a set of tuples encapsulating the subject, its category, the action or relationship, the object, and the object's category. The table serves as an empirical gauge of the relative competencies of these LLMs in generating semantically and structurally coherent knowledge graph structures.
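Because each model emits its triplets as a Python-literal list of tuples, the raw text can be parsed directly. The following sketch is an assumption about one convenient way to do this (using the standard library's `ast.literal_eval`); the sample string is an abbreviated illustration, and real model output may need additional cleaning first.

```python
# Hedged sketch: parsing an LLM's bracketed tuple output (the format shown
# in Table B.1) into (subject, subject_type, relation, object, object_type)
# records. The raw string here is illustrative, not actual model output.
import ast

raw = ("[('Boris Johnson', 'PERSON', 'Relate To', 'Brexit', 'EVENT'), "
       "('Brexit', 'EVENT', 'Impact', 'U.K.', 'GPE')]")

def parse_triplets(text: str):
    records = ast.literal_eval(text)  # safe evaluation of a Python-literal list
    # keep only well-formed 5-tuples, dropping malformed fragments
    return [r for r in records if isinstance(r, tuple) and len(r) == 5]

for subj, subj_type, rel, obj, obj_type in parse_triplets(raw):
    print(f"{subj} ({subj_type}) --{rel}--> {obj} ({obj_type})")
```

`ast.literal_eval` only evaluates literals, so unlike `eval` it cannot execute arbitrary code embedded in model output, though it will still raise on syntactically invalid responses.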

LLM: Generated Response

Vicuna-7B-v1.3: [('Boris Johnson', 'PERSON', 'Champion', 'Workers', 'LABOUR'), ('Theresa May', 'PERSON', 'Chief', 'Government', 'GOV'), ('Conservative Party', 'ORG', 'Achieve', 'Victory', 'ELECT'), ('Brexit', 'PROCESS', 'Break With', 'European Union', 'EU'), ('British Election Study', 'ORG', 'Vote', 'Leave', 'BREXIT'), ...]

GPT-3.5: [('Boris Johnson', 'PERSON', 'Has', 'Strength', 'CONCEPT'), ('British Elections', 'EVENT', 'Impact', 'Voter Allegiances', 'CONCEPT'), ('Labour', 'ORG', 'Capture', 'Blue-Collar Districts', 'SECTOR'), ('Labour', 'ORG', 'Position', 'Workers', 'SECTOR'), ('Brexit', 'EVENT', 'Drive', 'Change in U.K.', 'CONCEPT'), ...]

GPT-4: [('Boris Johnson', 'PERSON', 'Control', 'Conservative Party', 'ORG'), ('Boris Johnson', 'PERSON', 'Invests In', 'Government funds', 'CONCEPT'), ('Boris Johnson', 'PERSON', 'Impact', 'Blyth Valley district', 'GPE'), ('Boris Johnson', 'PERSON', 'Relate To', 'Brexit', 'EVENT'), ('Brexit', 'EVENT', 'Impact', 'U.K.', 'GPE'), ...]

ICKG: [('Boris Johnson', 'PERSON', 'Relate To', 'Brexit', 'EVENT'), ('Vernon Bogdanor', 'PERSON', 'Relate To', "King's College London", 'ORG'), ('Brexit', 'EVENT', 'Impact On', 'British Society', 'ORG'), ('Conservative Party', 'ORG', 'Operate In', 'British elections', 'EVENT'), ('Brexit', 'EVENT', 'Control', 'Voter allegiances', 'CONCEPT'), ...]

Table B.1. Comparison of Generated Responses from Various Large Language Models. Only the top five extracted triplets are shown for brevity.


C. Dynamic Knowledge Graph Learning


C.1. Recurrent Neural Network (RNN)
The Recurrent Neural Network (RNN) serves as the temporal and structural sequential
layer in our framework. For each entity i and relation r, the RNNs take the current and
previous embeddings to update the temporal and structural embeddings.
Mathematically, for entities, the RNN layer is formulated as:

    h_i^{(l,t)} = RNN_entity(u^{(l,t)}, h_i^{(l,t-1)}).    (C.1)

Similarly, for relations, the RNN layer can be expressed as:

    h_r^{(l,t)} = RNN_relation(u^{(l,t)}, h_r^{(l,t-1)}).    (C.2)

Here, h_i^{(l,t)} and h_r^{(l,t)} denote the hidden states for entity i and relation r at layer l and time t, respectively. The hidden states serve as memory, encapsulating long-term dependencies and capturing the evolving dynamics. These hidden states are initialized as h_i^{(l,0)} = 0 and h_r^{(l,0)} = 0 for entities and relations, respectively.
The RNN layer thus provides a mechanism to not only incorporate past information
but also adapt to new events, creating a balanced and context-sensitive representation
over time.
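The recurrence in (C.1)-(C.2) can be sketched as follows. This is a toy illustration, not the thesis's implementation: an Elman-style tanh cell with scalar weights stands in for RNN_entity / RNN_relation, and the structural input u is a placeholder constant.

```python
# Minimal sketch of the recurrent update h^{(l,t)} = RNN(u^{(l,t)}, h^{(l,t-1)}).
# The tanh cell, scalar weights, and constant input are illustrative
# assumptions; the actual model learns these parameters per layer.
import math

def elman_step(u, h_prev, W_u, W_h):
    """One Elman-style update, elementwise for a toy 1-D case:
    h_t = tanh(W_u * u + W_h * h_{t-1})."""
    return [math.tanh(W_u * u_j + W_h * h_j) for u_j, h_j in zip(u, h_prev)]

dim = 4
h = [0.0] * dim                 # h^{(l,0)} = 0, as stated in the text
for t in range(3):              # three time steps of structural input u^{(l,t)}
    u = [0.5] * dim             # placeholder structural embedding
    h = elman_step(u, h, W_u=0.8, W_h=0.5)

print(h)  # hidden state after three updates; retains memory of all past steps
```

Because the hidden state feeds back into each update, the final embedding reflects the whole observed history rather than only the most recent snapshot, which is exactly the memory role described above.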

C.2. Multilayer Perceptron (MLP)


A Multilayer Perceptron (MLP) is a standard class of artificial neural networks (ANNs)
that consists of multiple layers of interconnected nodes.
Mathematically, an MLP transforms an input vector x through a series of hidden
layers, each followed by a nonlinear activation function, to produce an output vector y.

Definition C.2.1. Let x ∈ R^d be an input vector and f_i denote the nonlinear activation function for the i-th layer. An MLP with L hidden layers is formally defined as:

    h_0 = x,
    h_i = f_i(W_i h_{i-1} + b_i),    i = 1, ..., L,
    y = h_L,

where W_i and b_i are the weight matrix and bias vector for the i-th layer, respectively.


Note that the softmax function can be included as the activation function in the top layer, allowing the MLP to produce probability distributions directly. Therefore, in this study, when we refer to using an MLP to approximate a distribution, we imply that the softmax is integrated into the final layer, so the transformation need not be decomposed into an MLP followed by a separate softmax function.
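Definition C.2.1 with a softmax top layer can be sketched as below. The weights and layer sizes are arbitrary placeholders, not values from the thesis; the point is only that the final hidden state is a valid probability vector.

```python
# Illustrative sketch of Definition C.2.1 with softmax as the top-layer
# activation, so the MLP outputs a probability distribution directly.
# All weights here are arbitrary example values.
import math

def softmax(z):
    m = max(z)                            # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def mlp_forward(x, layers):
    """layers: list of (W, b, activation); implements h_i = f_i(W_i h_{i-1} + b_i)."""
    h = x
    for W, b, f in layers:
        z = [sum(w_jk * h_k for w_jk, h_k in zip(row, h)) + b_j
             for row, b_j in zip(W, b)]
        h = f(z)
    return h

relu = lambda z: [max(0.0, v) for v in z]
layers = [
    ([[0.2, -0.1], [0.4, 0.3]], [0.0, 0.1], relu),      # hidden layer, f_1 = ReLU
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], softmax),  # top layer with softmax
]
y = mlp_forward([1.0, 2.0], layers)
print(y, sum(y))  # non-negative entries summing to 1
```

Folding the softmax into the final layer, as described above, means the network's output can be used directly as the approximated distribution without a separate normalization step.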


Acknowledgements
I would like to express my heartfelt gratitude to my supervisor, Francesco Sanna Passino,
for his insightful guidance, inspiring feedback, and continuous support throughout the
journey of this thesis. I am also immensely thankful to my parents for their love, encour-
agement, and unwavering support, which have been the foundation of my academic pur-
suits. Additionally, I want to extend my appreciation to my girlfriend, Yuhong Cheng,
for her patience, understanding, and constant encouragement during the challenging
phases of this endeavour. Her presence has been a constant wellspring of motivation,
and her faith in me a ray of sunshine that has carried me through the challenges of
this journey; for that, I am profoundly thankful. I also extend my sincere appreciation
to my dear friend, Mr. Yuncong Xiao. Our travels together amidst the serene landscapes
of Arashiyama in Kyoto were the catalyst that ignited the initial spark of my research
idea, which subsequently culminated in this thesis. Lastly, I am grateful for
the friendship and motivational support of my old pal Mr. Yixiao Tan, who has been
instrumental in my perseverance through the end of this journey.


Bibliography
Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana
Yakhnenko. Translating embeddings for modeling multi-relational data. Advances in
neural information processing systems, 26, 2013.

Antoine Bordes, Nicolas Usunier, Sumit Chopra, and Jason Weston. Large-scale simple
question answering with memory networks. arXiv preprint arXiv:1506.02075, 2015.

Elizabeth Boschee, Jennifer Lautenschlager, Sean O’Brien, Steve Shellman, James Starz,
and Michael Ward. ICEWS Coded Event Data, 2015. URL https://doi.org/10.
7910/DVN/28075.

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla
Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al.
Language models are few-shot learners. Advances in neural information processing
systems, 33:1877–1901, 2020.

Leland Bybee, Bryan T Kelly, Asaf Manela, and Dacheng Xiu. Business news and
business cycles. Technical report, National Bureau of Economic Research, 2021.

Borui Cai, Yong Xiang, Longxiang Gao, He Zhang, Yunfeng Li, and Jianxin Li. Temporal
knowledge graph completion: A survey. arXiv preprint arXiv:2201.08236, 2022.

Nai-Fu Chen, Richard Roll, and Stephen A Ross. Economic forces and the stock market.
Journal of business, pages 383–403, 1986.

Dawei Cheng, Fangzhou Yang, Xiaoyang Wang, Ying Zhang, and Liqing Zhang. Knowledge graph-based event embedding framework for financial quantitative investments. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2221–2230, 2020.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

Xin Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Ni Lao, Kevin Murphy,
Thomas Strohmann, Shaohua Sun, and Wei Zhang. Knowledge vault: A web-scale
approach to probabilistic knowledge fusion. In Proceedings of the 20th ACM SIGKDD
international conference on Knowledge discovery and data mining, pages 601–610,
2014.


Xiaoyi Fu, Xinqi Ren, Ole J Mengshoel, and Xindong Wu. Stochastic optimization for
market return prediction using financial knowledge graph. In 2018 IEEE International
Conference on Big Knowledge (ICBK), pages 25–32. IEEE, 2018.

Matthew Gentzkow, Bryan Kelly, and Matt Taddy. Text as data. Journal of Economic
Literature, 57(3):535–574, 2019.

Ziniu Hu, Yuxiao Dong, Kuansan Wang, and Yizhou Sun. Heterogeneous graph transformer. In Proceedings of The Web Conference 2020, pages 2704–2710, 2020.

Shaoxiong Ji, Shirui Pan, Erik Cambria, Pekka Marttinen, and S Yu Philip. A survey on
knowledge graphs: Representation, acquisition, and applications. IEEE transactions
on neural networks and learning systems, 33(2):494–514, 2021.

Woojeong Jin, Meng Qu, Xisen Jin, and Xiang Ren. Recurrent event network: Autoregressive structure inference over temporal knowledge graphs. arXiv preprint arXiv:1904.05530, 2019.

Julien Leblay and Melisachew Wudage Chekol. Deriving validity time in knowledge
graph. In Companion proceedings of the the web conference 2018, pages 1771–1776,
2018.

Kalev Leetaru and Philip A Schrodt. Gdelt: Global data on events, location, and tone,
1979–2012. In ISA annual convention, volume 2, pages 1–49. Citeseer, 2013.

Xiaohui Li. Financial dynamic knowledge graph online portal, 2023. URL https:
//xiaohui-victor-li.github.io/FinDKG/. Accessed: August 31, 2023.

Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint
arXiv:1711.05101, 2017.

Farzaneh Mahdisoltani, Joanna Biega, and Fabian M Suchanek. Yago3: A knowledge base from multilingual wikipedias. In CIDR, 2013.

Maximilian Nickel, Kevin Murphy, Volker Tresp, and Evgeniy Gabrilovich. A review of
relational machine learning for knowledge graphs. Proceedings of the IEEE, 104(1):
11–33, 2015.

OpenAI. Gpt-4 technical report, 2023.

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35:27730–27744, 2022.

Shirui Pan, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, and Xindong Wu.
Unifying large language models and knowledge graphs: A roadmap. arXiv preprint
arXiv:2306.08302, 2023.


Namyong Park, Fuchen Liu, Purvanshi Mehta, Dana Cristofor, Christos Faloutsos, and
Yuxiao Dong. Evokg: Jointly modeling event time and network structure for reasoning
over temporal knowledge graphs. In Proceedings of the Fifteenth ACM International
Conference on Web Search and Data Mining, pages 794–803, 2022.

Nils Reimers and Iryna Gurevych. Sentence-bert: Sentence embeddings using siamese
bert-networks. arXiv preprint arXiv:1908.10084, 2019.

Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov,
and Max Welling. Modeling relational data with graph convolutional networks. In
The Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete,
Greece, June 3–7, 2018, Proceedings 15, pages 593–607. Springer, 2018.

Amit Singhal et al. Introducing the knowledge graph: things, not strings. Official google
blog, 5(16):3, 2012.

Koye Somefun, Romain Perchet, Chenyang Yin, and Raul Leote de Carvalho. Allocating
to thematic investments. Financial Analysts Journal, 79(1):18–36, 2023.

Mirac Suzgun, Luke Melas-Kyriazi, Suproteem K Sarkar, Scott Duke Kominers, and
Stuart M Shieber. The harvard uspto patent dataset: A large-scale, well-structured,
and multi-purpose corpus of patent applications. arXiv preprint arXiv:2207.04043,
2022.

Paul C Tetlock. Giving content to investor sentiment: The role of media in the stock
market. The Journal of finance, 62(3):1139–1168, 2007.

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux,
Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar,
et al. Llama: Open and efficient foundation language models. arXiv preprint
arXiv:2302.13971, 2023.

Rakshit Trivedi, Hanjun Dai, Yichen Wang, and Le Song. Know-evolve: Deep temporal
reasoning for dynamic knowledge graphs. In international conference on machine
learning, pages 3462–3471. PMLR, 2017.

Alan Turing. Computing machinery and intelligence. Mind, 59(236):433, 1950.

Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and
Yoshua Bengio. Graph attention networks. arXiv preprint arXiv:1710.10903, 2017.

Hongwei Wang, Fuzheng Zhang, Jialin Wang, Miao Zhao, Wenjie Li, Xing Xie, and
Minyi Guo. Ripplenet: Propagating user preferences on the knowledge graph for
recommender systems. In Proceedings of the 27th ACM international conference on
information and knowledge management, pages 417–426, 2018.

Hongwei Wang, Fuzheng Zhang, Mengdi Zhang, Jure Leskovec, Miao Zhao, Wenjie Li, and Zhongyuan Wang. Knowledge-aware graph neural networks with label smoothness regularization for recommender systems. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pages 968–977, 2019a.

Jiapu Wang, Boyue Wang, Meikang Qiu, Shirui Pan, Bo Xiong, Heng Liu, Linhao Luo,
Tengfei Liu, Yongli Hu, Baocai Yin, et al. A survey on temporal knowledge graph
completion: Taxonomy, progress, and prospects. arXiv preprint arXiv:2308.02457,
2023.

Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, Xiang Song, Jinjing Zhou,
Chao Ma, Lingfan Yu, Yu Gai, Tianjun Xiao, Tong He, George Karypis, Jinyang Li,
and Zheng Zhang. Deep graph library: A graph-centric, highly-performant package
for graph neural networks. arXiv preprint arXiv:1909.01315, 2019b.

Xin Xu, Yuqi Zhu, Xiaohan Wang, and Ningyu Zhang. How to unleash the power of large
language models for few-shot relation extraction? arXiv preprint arXiv:2305.01555,
2023.

Hongbin Ye, Ningyu Zhang, Hui Chen, and Huajun Chen. Generative knowledge graph
construction: A review. arXiv preprint arXiv:2210.12714, 2022.

Ningyu Zhang, Xin Xu, Liankuan Tao, Haiyang Yu, Hongbin Ye, Shuofei Qiao, Xin
Xie, Xiang Chen, Zhoubo Li, Lei Li, et al. Deepke: A deep learning based knowledge
extraction toolkit for knowledge base population. arXiv preprint arXiv:2201.03335,
2022.

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou,
Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, et al. A survey of large
language models. arXiv preprint arXiv:2303.18223, 2023.

Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao
Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, et al. Judging llm-as-a-judge
with mt-bench and chatbot arena. arXiv preprint arXiv:2306.05685, 2023.
