GNN Event Detection Paper(1): Reinforced, Incremental and Cross-lingual Event Detection From Social

天狼啸月1990

Last modified: 2023-04-09 16:50:02


License: CC 4.0 BY-SA

First published: 2022-10-20 19:53:08

Article link: https://blog.csdn.net/qq_33419476/article/details/127413860


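FinEvent is an event detection method for social media streams that combines reinforcement learning, incremental learning, and cross-lingual techniques. It builds a weighted multi-relational graph to preserve rich structural and statistical features, and uses multi-agent reinforcement learning to select and aggregate neighbor nodes, addressing ambiguous event features and imbalanced data. In addition, FinEvent uses contrastive learning and a deep-reinforcement-learning-guided DBSCAN clustering to detect events without manually set parameters, and it extends across languages. Experiments show that FinEvent performs well in accuracy, generalization, and handling multilingual data.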


Reinforced, Incremental and Cross-lingual Event Detection From Social Messages(2022)

github address: GitHub - RingBDStack/FinEvent: Code for "Reinforced, Incremental and Cross-lingual Event Detection From Social Messages"

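FinEvent is a GNN event detection method for social streaming messages. It uses reinforcement learning (RL) to select neighbors from a weighted heterogeneous graph, then uses GAT as the aggregator to merge the neighbor node embeddings into a single final node embedding, which is fed into DBSCAN for event detection.
The cross-lingual part is based on transfer learning; the target domain has only a small amount of labeled data.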

2. problem formulation and notations

3. Model Architecture

3.1 life-cycle mechanism

3.2 Incremental Learning Framework

3.3 cross-lingual transferring mechanism

4. FinEvent Method

4.1 heterogeneous: HINs(heterogeneous information networks)

4.2 multi-agent reinforced weighted multi-relational Graph Neural Network framework(MarGNN)

4.2.1 Reinforced Neighbor Selection

4.2.2 Weighted Relation-aware Neighbor Aggregation

4.3 Balanced Sampling Strategy based Contrastive Learning Mechanism(BasCL)

4.4 DRL-DBSCAN: Deep Reinforcement Learning(DRL) guided DBSCAN model

4.5 transferring: cross-lingual social message embedding method(Crlme)

4.6 Maintenance Strategies

1) All message strategy

2) Relevant message strategy

3) Latest message strategy

5.3 Experiment Setting

5.4 Evaluation Metrics

6. Evaluation

6.1 offline evaluation

6.2 online(incremental evaluation)

6.2.1 Ablation Study

6.3 study on RL process

6.3.1 preserving thresholds

6.3.2 DRL-DBSCAN

6.4 Cross-lingual Transferring Evaluation

6.5 Time Analysis

7. Related work

8. Conclusion

Abstract

background: detecting hot social events

streaming nature of social messages -> incremental models

problem: ambiguous event features, dispersed text content, and multiple languages => low accuracy and generalization ability

solution: FinEvent

1) model social messages into heterogeneous graphs: rich meta-semantics & diverse meta-relations.

-> convert them to weighted multi-relational message graphs

2) solution: new reinforced weighted multi-relational GNN framework

multi-agent reinforcement learning to select optimal aggregation thresholds

-> extension: reinforcement learning here essentially learns the best parameters and thresholds.

-> obtain social message embeddings

problem: long-tail problem in social event detection.

solution: balanced sampling strategy + contrastive learning mechanism

-> incremental social message representation learning

3) Deep Reinforcement Learning + DBSCAN clustering

the optimal minimum number of samples -> to form a cluster
the optimal minimum distance between two clusters -> in social event detection tasks

4) incremental social message representation learning

knowledge preservation
GNN

-> achieves cross-lingual social event detection.

1. Introduction

1) meaning

2) social event: the combination of social messages

1.1 Problems with existing methods

problem-1: event-related heterogeneous elements

solution-1: HIN(heterogeneous information networks), proposed in 2017

--> extension: so using HAN, proposed in 2019, should not be unreasonable

problem-2: how to learn more discriminative embeddings of social messages

because the contents of messages are overlapping, redundant, and discrete, and the message stream is noisy <- semantically rich event detection task

=> challenge-1: how to model social messages and design a more discriminative and explanatory social message embedding framework

problem-3: the number of messages(samples) contained in each event is relatively imbalanced

=> challenge-2: long-tail problem

this lowers the performance of detection methods and leads to poor generalization

solution: streaming clustering technology

problem-4: incremental detection on streaming messages and cross-lingual detection

because of time attributes, the number of social events in social message streams also increases

solution: semantic incremental event detection framework

problem-5: cross-lingual messages lead to inconsistencies in the semantic embedding space of underlying words or entities

=> challenge-3: how to implement cross-lingual social event detection, and even generalize to low-resource language message data.

Fin-Event Method

1.2 Contributions

1. weighted multi-relational graphs -> preserve richer structural and statistical features

2. MarGNN framework

RL -> learn optimal preserving thresholds to select top-p neighbors
reasonably retains and integrates the top-p most valuable semantic and structural information from each relation

3. DRL-DBSCAN

-> to realize social event clustering detection tasks without manual parameters

4. Crlme

-> cross-lingual social event detection

2. problem formulation and notations

1) social stream

2) social event

a set of correlated social messages that discuss the same real-world happening event

3) HIN

4) weighted multi-relational message graph

5) social event detection algorithm

6) incremental social event detection algorithm

7) cross-lingual social event detection algorithm

3. Incremental Model life-cycle

incremental life-cycle mechanism -> streaming nature

3.1 life-cycle mechanism

message graph construction <-> model training & detecting

pre-training stage: a small set of social messages collected in advance -> initial weighted multi-relational message graph
detection stage: update the weighted multi-relational message graph with each new message block
maintenance stage: remove obsolete messages -> re-train the model with the updated graph

3.2 Incremental Learning Framework

problem: generalization challenges in incremental social event detection

solution: incremental learning + life-cycle mechanism

our architecture -> processes various elements in social streams

message embedding

HIN -> to extract different relations according to meta-path instances
update its embedding space
GNNs-> tune parameter + preserve helpful knowledge
MarGNN

3.3 cross-lingual transferring mechanism

problem: cross-lingual problem

solution:

preserve the parameters of MarGNN in the detection stage
extend MarGNN by resuming the training process using incoming data in the maintenance stage

4. FinEvent Method

proposal: FinEvent(reinForced, incremental, and cross-lingual social Event detection architecture from streaming social messages)

The FinEvent method can be summarized in the following 5 parts:

preprocessing
message embedding
training
detection
transferring

4.1 heterogeneous: HINs(heterogeneous information networks)

organize event-related elements and relations (each an element of the heterogeneous graph) of various types into one unified graph structure.

problem: previous methods convert the heterogeneous graph to a homogeneous graph by using meta-path instances, which easily loses semantic and structural information.

solution: a weighted multi-relational graph

model the association between social messages, preserving the number of meta-path instances as the weight of each edge/relation.

input: original social streaming messages -> HIN model ->

mid: heterogeneous social message graph(HIN) to prevent the loss of heterogeneous information -> mapping ->

output: weighted multi-relational graph to save richer connection information

nodes: a series of message collections M with d-dimensional features.

edges: belonging to different relations will be established respectively
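
The mapping from messages to a weighted multi-relational graph can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the message dictionaries, the field names `user`, `location`, `entity`, and the helper `build_weighted_multi_relational_graph` are all hypothetical. For each relation, the weight of the edge between two messages is assumed to be the number of elements they share, i.e. the number of message-element-message meta-path instances connecting them.

```python
from collections import defaultdict
from itertools import combinations


def build_weighted_multi_relational_graph(messages, relations):
    """Return {relation: {(i, j): weight}} with i < j.

    weight = number of shared elements of that type, i.e. the number of
    meta-path instances (message-element-message) linking the two messages.
    """
    graph = {}
    for rel in relations:
        # Invert the message -> elements map: element -> messages containing it.
        holders = defaultdict(set)
        for idx, msg in enumerate(messages):
            for element in msg.get(rel, ()):
                holders[element].add(idx)
        edges = defaultdict(int)
        for msg_ids in holders.values():
            # Every shared element adds one meta-path instance per message pair.
            for i, j in combinations(sorted(msg_ids), 2):
                edges[(i, j)] += 1
        graph[rel] = dict(edges)
    return graph


# Hypothetical toy messages (field names are assumptions for illustration).
messages = [
    {"user": {"u1", "u2"}, "entity": {"earthquake", "Tokyo"}, "location": {"JP"}},
    {"user": {"u2"}, "entity": {"earthquake", "Tokyo"}, "location": {"JP"}},
    {"user": {"u3"}, "entity": {"Tokyo"}, "location": set()},
]
graph = build_weighted_multi_relational_graph(messages, ["user", "location", "entity"])
```

Messages 0 and 1 share two entities, so their M-E-M edge gets weight 2, while a plain homogeneous conversion would only record that the two messages are connected.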

4.2 multi-agent reinforced weighted multi-relational Graph Neural Network framework(MarGNN)

essence: the aggregator merges neighbor embeddings into a node embedding

essence: multi-agent reinforcement learning learns optimal weights -> selects neighbor nodes; the weighted multi-relational graph neural network then generates the social message embedding vector (not an event embedding vector)

GNN: learn representations from the semantic and structural information of social messages
multi-agent Actor-critic algorithm(AC) -> to learn optimal numbers/thresholds for each relation

-> guide intra-relation and inter-relation message aggregations.

input: weighted multi-relational graph

mid: multi-agent reinforcement learning selects neighbors for each relation and learns the weights used to aggregate all messages

output: the GNN generates social message embeddings <- containing semantic and structural information

4.2.1 Reinforced Neighbor Selection

problem: some links between social messages are meaningless

solution: sample each relation before aggregation (i.e. select neighbors under each relation) to retain neighbors with strong semantic and structural connections

problem: different relations in the multi-relational graph have different degrees of impurity and collectively affect the embedding results

solution: a collaborative learning method to find the balance between different relations

problem: previous neighbor selection methods (Bernoulli Multi-armed Bandit process or attention mechanism) are no longer applicable in incremental detection

solution: multi-agent reinforcement learning performs top-p neighbor sampling before aggregation

sort the neighbors of each node under the relation r
establish an agent for each relation as the selector

How does RL select neighbors? The agent of each relation learns, through the game, how to find the balance between relations in the task of streaming social event detection

four elements(Nagg, Aagg, Sagg, Ragg):

State: preserving thresholds of different relations jointly affect the final aggregation effect
- the preserving thresholds of all relations serve as weights -> aggregate neighbor node representations -> calculate the average weighted distance under one relation
Action: the preserving threshold p under relation r in epoch k
Reward: find the best aggregation scheme to obtain the best clustering performance of the messages <- measured by NMI (normalized mutual information), i.e. find the best aggregation weights for all relations
Optimization: Actor-critic algorithm -> selects actions according to the state through the actor network and finally obtains the same reward, which is used to update the loss function.
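
The top-p sampling step can be sketched as below, assuming the preserving threshold p of each relation is already given (the actor-critic agents that learn p are not shown). The function name `select_top_p_neighbors`, the plain Euclidean distance, and the toy embeddings are assumptions for illustration, not the paper's exact formulation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_top_p_neighbors(center, neighbors, embeddings, p):
    """Keep the fraction p of neighbors that are closest to the center node."""
    if not 0.0 < p <= 1.0:
        raise ValueError("preserving threshold p must be in (0, 1]")
    # Sort the neighbors under this relation by distance to the center.
    ranked = sorted(neighbors, key=lambda n: euclidean(embeddings[center], embeddings[n]))
    keep = math.ceil(p * len(ranked))
    return ranked[:keep]


# Hypothetical embeddings and neighbor lists for center node 0.
embeddings = {0: [0.0, 0.0], 1: [0.1, 0.0], 2: [1.0, 1.0], 3: [0.2, 0.1], 4: [3.0, 0.0]}
neighbors_by_relation = {"user": [1, 2, 3, 4], "entity": [2, 4]}
thresholds = {"user": 0.5, "entity": 1.0}  # one preserving threshold per relation
selected = {
    rel: select_top_p_neighbors(0, nbrs, embeddings, thresholds[rel])
    for rel, nbrs in neighbors_by_relation.items()
}
```

With these values the `user` relation keeps only its two closest neighbors, while the `entity` relation keeps both of its neighbors.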

4.2.2 Weighted Relation-aware Neighbor Aggregation

To better guide the weighted multi-relational Graph Neural Network(MarGNN) in learning the message embedding:

intra-relation aggregation

participating neighbor messages are controlled by the preserving threshold
the process is expressed as the aggregation of the message mi belonging to relation r at the l-th layer

input: embedding vector hj of each neighbor message mj of message mi

mid: summation aggregation operator over all neighbor message embeddings

Notice: when passing from layer l-1 to layer l, mj is a neighbor of mi, but it can also be the center node of other nodes!

model: GAT(multi-head attention mechanism of Graph Attention Network)

multi-head: multiple queries each apply attention to the same text, with each query attending to a different part of the input; this is equivalent to repeating single-layer attention several times.

output: after the multi-head attention outputs are concatenated and averaged, a comprehensive neighbor message embedding vector under each relation r; that is, a center node has exactly one comprehensive neighbor embedding vector under each relation

inter-relation aggregation

concatenate the relation embeddings of the center node
the preserving threshold of each relation is used as the weight of the relation embedding when aggregating across relations (the different attributes of the heterogeneous graph)

input: the comprehensive neighbor embedding vector of center node mi under relation r; mi has one such vector for each of its relations

mid1: splicing aggregation operator, e.g. concatenation, sum, MLP

mid2: then concatenate with the inter-relation embedding from the previous layer

model: GAT(multi-head attention mechanism of Graph Attention Network)

output: the final embedding representation of center node mi at the l-th layer.
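
A rough sketch of the two aggregation steps for a single center node follows. It deliberately swaps the GAT attention for a plain summation aggregator, and uses concatenation as the inter-relation operator, so it only shows where the preserving thresholds enter as weights. The function names and toy values are hypothetical.

```python
def intra_relation_sum(neighbor_ids, embeddings):
    """Summation aggregator over the selected neighbors of one relation."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for n in neighbor_ids:
        total = [t + x for t, x in zip(total, embeddings[n])]
    return total


def inter_relation_concat(center_embedding, relation_embeddings, thresholds):
    """Concatenate the center embedding with threshold-weighted relation embeddings."""
    out = list(center_embedding)
    for rel in sorted(relation_embeddings):  # fixed order keeps the layout stable
        out.extend(thresholds[rel] * x for x in relation_embeddings[rel])
    return out


# Hypothetical previous-layer embeddings; node 0 is the center node.
embeddings = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [4.0, 4.0], 3: [2.0, 0.0]}
selected = {"entity": [2], "user": [1, 3]}  # neighbors kept by top-p sampling
thresholds = {"entity": 0.5, "user": 1.0}   # preserving thresholds used as weights
per_relation = {rel: intra_relation_sum(ids, embeddings) for rel, ids in selected.items()}
final = inter_relation_concat(embeddings[0], per_relation, thresholds)
```

Here `final` has three segments: the center's own embedding, the `entity` relation embedding scaled by 0.5, and the `user` relation embedding scaled by 1.0.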

4.3 Balanced Sampling Strategy based Contrastive Learning Mechanism(BasCL)

problem: the number of event classes in incremental event detection is constantly changing

solution: contrastive learning

-> it focuses on learning the common features between similar instances and distinguishing differences between non-similar instances.

problem: long-tail problem in social event detection

solution: contrastive learning

besides, contrastive learning contains more cluster-like structure information, which can benefit the downstream event clustering tasks.

solution: triplet losses -> to balance a large number of negative samples and a small number of positive samples of the same event class.

periodically updated embedding space -> we first sample a positive sample mi+ and a negative sample mi- to construct the triplet loss, then update the embedding of the message in the direction of the positive sample. <- Euclidean distance
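
As a small illustration, the standard triplet loss with Euclidean distance can be written as below. The margin value of 1.0 and the toy vectors are assumptions; the balanced sampling of positives and negatives is not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


anchor = [0.0, 0.0]
# Positive is far (distance 5) and negative is near (distance 2): the loss is positive.
loss_hard = triplet_loss(anchor, [3.0, 4.0], [0.0, 2.0])
# Positive is near (distance 1) and negative is far (distance 10): the loss is zero.
loss_easy = triplet_loss(anchor, [0.0, 1.0], [6.0, 8.0])
```

Minimizing this loss pulls a message toward a sample of the same event and pushes it away from a sample of a different event until the gap exceeds the margin.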

solution: global-local pair loss -> to preserve the graph structure information in the process of detecting long-tail events incrementally -> to make better use of the influence of similar structural information by minimizing the cross-entropy of global summary and local message representation.

solution: a Balanced sampling strategy based Contrastive learning mechanism, BasCL

-> used to train the GNN

4.4 DRL-DBSCAN: Deep Reinforcement Learning(DRL) guided DBSCAN model

DBSCAN: automatically adjust the number of classes

problem: DBSCAN still has two parameters (the distance parameter $\varepsilon$ and the minimum sample number parameter minPts) that need to be manually adjusted, so it cannot adapt to message blocks with different data distributions in the constantly changing message input.

solution: DRL-DBSCAN, a DBSCAN guided by deep reinforcement learning, to obtain a stable clustering effect of social events through multi-round parameter interaction with DBSCAN <- based on learned social message embeddings

-> achieve social event clustering detection tasks automatically

multi-agent DRL: Twin Delayed Deep Deterministic policy gradient algorithm(TD3)

agent: parameter adjustment system
environment: incremental social data
process: Markov decision process(Sclu, Aclu, Rclu)
- Sclu: State--clustering situation
- Aclu: Action--t-Distributed Stochastic Neighbor Embedding, like a learning rate (step size), to prevent the curse of dimensionality and speed up DBSCAN processing
- Rclu: Reward--Variance Ratio Criterion to stimulate the agent
- Optimization: Twin Delayed Deep Deterministic policy gradient algorithm

to learn optimal parameters: minPts (minimum number of samples) and $\varepsilon$ (minimum distance between two clusters)
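
The reward signal can be illustrated with a plain-Python version of the Variance Ratio Criterion (also known as the Calinski-Harabasz index). This is only a sketch of the scoring step: the TD3 agent and the DBSCAN run that produces the labels are not shown, and treating label -1 as noise to be ignored, as well as returning 0.0 for fewer than two clusters, are assumptions made here.

```python
def variance_ratio_criterion(points, labels):
    """Between-cluster dispersion over within-cluster dispersion (higher is better)."""
    kept = [(p, lab) for p, lab in zip(points, labels) if lab != -1]  # skip noise
    clusters = {}
    for p, lab in kept:
        clusters.setdefault(lab, []).append(p)
    n, k = len(kept), len(clusters)
    if k < 2 or n <= k:
        return 0.0  # undefined in this case; treated as "no reward" here
    dim = len(kept[0][0])
    overall = [sum(p[d] for p, _ in kept) / n for d in range(dim)]
    between = within = 0.0
    for members in clusters.values():
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum((c - o) ** 2 for c, o in zip(center, overall))
        within += sum(sum((x - c) ** 2 for x, c in zip(p, center)) for p in members)
    if within == 0.0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Two tight, well-separated groups of hypothetical 2-D embeddings.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
reward_good = variance_ratio_criterion(points, [0, 0, 1, 1])  # matches the groups
reward_bad = variance_ratio_criterion(points, [0, 1, 0, 1])   # cuts across them
```

A parameter setting whose clustering matches the two groups scores far higher than one that cuts across them, which is the kind of signal an agent adjusting $\varepsilon$ and minPts could be rewarded with.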

4.5 transferring: cross-lingual social message embedding method(Crlme)

transferring the parameters of MarGNN to improve the performance of embedding on target-language messages(non-English)

problem: non-English languages with insufficient original information that cannot reuse the training process of the English model.

solution: we directly inherit the parameters θ preserved in English model fE training as parameter θ of the non-English model fNoE when detecting non-English events.

LNMAP model -> map Non-English message m to English semantic space

4.6 Maintenance Strategies

1) All message strategy

keeping all the messages.

detection stage, insert newly arrived message block into G
maintenance stage, continue the training process using all the messages in G.

impractical -> eventually exceed the embedding space capacity of the message encoder

2) Relevant message strategy

keeping messages that are related to the newly arrived ones.

detection stage, insert the newly arrived message block into G
maintenance stage, first remove messages that are not connected to any messages that arrived during the last time window and then continue training using all the messages in G

3) Latest message strategy

keeping the latest message block

detection stage, use only the newly arrived message block to reconstruct G
maintenance stage, continue training with all the messages in G, which only involves the latest message block.
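
The three strategies differ only in which messages survive the maintenance stage, which can be sketched as filters over a toy graph. The graph representation (a dict mapping message id to block id, plus a set of edges) and all function names are hypothetical. The latest message strategy is modeled here as a filter, although the text above describes it as rebuilding G from the new block.

```python
def _subgraph(nodes, edges, keep):
    """Restrict the graph to the message ids in keep."""
    kept_nodes = {m: b for m, b in nodes.items() if m in keep}
    kept_edges = {(i, j) for i, j in edges if i in keep and j in keep}
    return kept_nodes, kept_edges


def all_message(nodes, edges, latest_block):
    """Keep every message."""
    return dict(nodes), set(edges)


def relevant_message(nodes, edges, latest_block):
    """Keep the latest block plus every message connected to it."""
    latest = {m for m, b in nodes.items() if b == latest_block}
    keep = set(latest)
    for i, j in edges:
        if i in latest or j in latest:
            keep.update((i, j))
    return _subgraph(nodes, edges, keep)


def latest_message(nodes, edges, latest_block):
    """Keep only the latest message block."""
    keep = {m for m, b in nodes.items() if b == latest_block}
    return _subgraph(nodes, edges, keep)


# Hypothetical graph: messages 0 and 1 arrived in block 0, messages 2 and 3 in block 1.
nodes = {0: 0, 1: 0, 2: 1, 3: 1}
edges = {(0, 1), (0, 2), (2, 3)}
kept_relevant = relevant_message(nodes, edges, latest_block=1)
kept_latest = latest_message(nodes, edges, latest_block=1)
```

In this toy graph the relevant message strategy drops message 1, which has no link to the latest block, while the latest message strategy also drops message 0.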

proposed FinEvent

initialize a weighted multi-relational graph G
when new messages arrive, update the graph
- inserting the new messages node
- establishing a connection
- regularly delete expired nodes and edges
neighbor selector of MarGNN
performing model single-language detection
aggregation module
BasCL
DRL-DBSCAN

5. Experiment

5.1 Data

Twitter dataset(Building a large-scale corpus for evaluating event detection on twitter)

68841 manually labeled tweets
503 event classes
4 weeks(29 days)
3 relations:
- M-U-M(message-user-message)
- M-L-M(message-location-message)
- M-E-M(message-entity-message)

French Twitter dataset:

64513 labeled tweets
257 event classes
3 weeks(23 days)

5.2 Baselines

word2vec
LDA
WMD
BERT
BiLSTM
PP-GCN
EventX
KPGNN

5.3 Experiment Setting

python 3.7.3
pytorch 1.8.1
64 core Intel Xeon CPU E5-2680 v4@2.40GHz with 512GB RAM and 1xNVIDIA Tesla P100-PCIE GPU

5.4 Evaluation Metrics

NMI: normalized mutual information
AMI: adjusted mutual information
ARI: adjusted rand index
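
For reference, NMI can be computed from label counts alone, as in the sketch below. It normalizes by the arithmetic mean of the two entropies, which is one of several conventions in use, so scores may differ slightly from other implementations. AMI and ARI additionally correct for chance and are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of the same items."""
    n = len(a)
    ca, cb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (nij / n) * math.log(nij * n / (ca[i] * cb[j]))
        for (i, j), nij in joint.items()
    )


def nmi(a, b):
    """NMI with arithmetic-mean normalization."""
    ha, hb = entropy(a), entropy(b)
    if ha == 0.0 and hb == 0.0:
        return 1.0  # both assignments put everything in one cluster
    return mutual_information(a, b) / ((ha + hb) / 2)


truth = [0, 0, 1, 1]
score_perfect = nmi(truth, [1, 1, 0, 0])    # same partition, cluster ids renamed
score_unrelated = nmi(truth, [0, 1, 0, 1])  # partition independent of the truth
```

The first prediction is the same partition with renamed cluster ids, so it scores 1; the second is statistically independent of the truth, so it scores 0.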

6. Evaluation

6.1 offline evaluation

6.2 online(incremental evaluation)

6.2.1 Ablation Study

6.3 study on RL process

6.3.1 preserving thresholds

6.3.2 DRL-DBSCAN

6.4 Cross-lingual Transferring Evaluation

6.5 Time Analysis

7. Related work

social event detection method
- document-pivot(DP) methods
- feature-pivot(FP) methods
their application scenarios
- offline
- online
different techniques and mechanisms
- incremental clustering
- community detection
- topic models

problem:

These methods are limited by the latest knowledge as they ignore the rich semantics and structural information contained in the social streams to some extent
have too few parameters to preserve the learned knowledge

8. Conclusion

FinEvent: a reinforced, incremental, and cross-lingual social event detection architecture from streaming social messages.

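9. Fin-Event Code Analysis

Fin-Event first uses intra_agg to compute the meta-path based word_embedding, user_id_embedding, and entity_embedding separately, then uses inter_agg to merge these three meta-path embeddings into one comprehensive final embedding.

1) problem 1: word order is not considered

When building the word adjacency matrix, the sampled words are connected directly in the graph, without considering word order.

e.g. "I love Jessica" and "Jessica loves me" are two different things, but without word order they become the same thing!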
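
The loss of word order is easy to reproduce with a toy example. The sketch below assumes words are linked by undirected co-occurrence edges within a message; this is a simplification for illustration and not the repository's actual graph construction, and `cooccurrence_edges` is a hypothetical helper.

```python
from itertools import combinations


def cooccurrence_edges(tokens):
    """Undirected edges between every pair of distinct words in one message."""
    return {frozenset(pair) for pair in combinations(set(tokens), 2)}


edges_a = cooccurrence_edges("dog bites man".split())
edges_b = cooccurrence_edges("man bites dog".split())
same_graph = edges_a == edges_b  # True: the two sentences give identical edges
```

Both sentences produce the same three undirected edges, so a model that sees only this graph cannot tell who did what to whom.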

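2) The adjacency matrix is the most basic way to represent a graph; it is converted from a two-dimensional coordinate matrix.

3) In Fin-Event, a mask is a list of indices (train_mask=train_idx), not a bool-type mask as in HAN.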
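
The two mask conventions carry the same information and convert into each other directly. The sketch below uses plain Python lists and hypothetical helper names; real code would typically hold these as tensors.

```python
def index_to_bool_mask(indices, num_nodes):
    """Index-style mask (list of node ids) -> boolean mask of length num_nodes."""
    mask = [False] * num_nodes
    for i in indices:
        mask[i] = True
    return mask


def bool_mask_to_index(mask):
    """Boolean mask -> sorted list of node ids."""
    return [i for i, flag in enumerate(mask) if flag]


train_idx = [0, 2, 5]                                     # index-style mask
train_bool = index_to_bool_mask(train_idx, num_nodes=6)   # boolean-style mask
```

Mixing the two up usually fails quietly: indexing with a list of integers selects those rows, while indexing with a list of booleans of the wrong length raises an error or selects something else entirely, depending on the library.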

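4) problem-2: in cal_similarity_node_edge, r_data[1] looks like it was written the wrong way round -> it should be r_data[0]

5) filtered_multi-r_data reduces entity_neighbor from more than 480,000 to just over 100,000

7) validation: extract_features is the recomputed pre embeddings.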