Estimating Node Importance in Knowledge Graphs Using Graph Neural Networks


Published: 2019-05-26


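This post covers a paper, accepted at KDD 2019, that applies GNNs to predicting the importance of nodes in a knowledge graph (KG). The paper proposes the GENI model, whose GNN aggregation step aggregates a single scalar (a score) per node rather than the node's embedding.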

Introduction

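A knowledge graph can be viewed as a directed multi-relational graph, in which a pair of nodes may be connected by more than one edge.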

Given a KG, estimating the importance of each node is a crucial task that enables a number of applications such as recommendation, query disambiguation, and resource allocation optimization.
An importance score is a value that represents the significance or popularity of a node in the KG.

Method

Table of Symbols

[Image: table of symbols]

Score Aggregation

At layer $\ell$, the central node $i$ updates its own score estimate by taking a weighted sum of the score estimates $s^{\ell-1}(j)$ of its neighbors and of itself:

$s^{\ell}(i)=\sum_{j \in N(i) \cup\{i\}} \alpha_{i j}^{\ell} s^{\ell-1}(j)$

To obtain the initial score $s^{0}(i)$, the model maps each node's embedding to a scalar through a fully connected layer:

$s^{0}(i)=\text{ScoringNetwork}\left(\vec{z}_{i}\right)$

Because the aggregation operates on scalars rather than vectors, this GNN differs from most other GNN models.
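The two steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `scoring_network` is a hypothetical single linear layer, the weights `w`, `b` and the toy graph are made up, and the attention weights are set to uniform values as a placeholder, since how $\alpha_{i j}^{\ell}$ is actually computed is a separate part of the model.

```python
def scoring_network(z, w, b):
    """Hypothetical one-layer scoring network: embedding -> scalar score."""
    return sum(wk * zk for wk, zk in zip(w, z)) + b


def aggregate_scores(prev_scores, neighbors, alpha):
    """One layer of scalar aggregation:
    s^l(i) = sum over j in N(i) ∪ {i} of alpha[i][j] * s^{l-1}(j).
    """
    new_scores = {}
    for i in prev_scores:
        support = list(neighbors[i]) + [i]  # neighbors plus the node itself
        new_scores[i] = sum(alpha[i][j] * prev_scores[j] for j in support)
    return new_scores


# Toy data (assumed for illustration only).
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
w, b = [0.5, 0.25], 0.0
neighbors = {"a": ["b", "c"], "b": ["a"], "c": []}

# Initial scores s^0(i) from the embeddings.
s0 = {i: scoring_network(z, w, b) for i, z in embeddings.items()}

# Placeholder attention: uniform over N(i) ∪ {i}.
alpha = {
    i: {j: 1.0 / (len(neighbors[i]) + 1) for j in neighbors[i] + [i]}
    for i in neighbors
}

s1 = aggregate_scores(s0, neighbors, alpha)
```

Note that every quantity passed between layers is a single float per node, which is the point of contrast with embedding-based message passing.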

Predicate-Aware Attention Mechanism

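A knowledge graph is usually written as a set of triples (subject, predicate, object), each of which can be read as an edge in the graph: (source node, edge type, target node). A natural way to obtain better values for the aggregation weights $\alpha_{i j}^{\ell}$ is to make them depend on the types of the edges between $i$ and $j$. Let $p^m_{ij}$ denote the type of the $m$-th edge between $i$ and $j$, and let $\phi(p^m_{ij})$ be the vector representation of that edge type. $\alpha_{i j}^{\ell}$ is then computed with an attention mechanism over the concatenation (written $\|$) of the two nodes' scores and the predicate embedding: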

$\alpha_{i j}^{\ell}=\frac{\exp \left(\sigma_{a}\left(\sum_{m} \vec{a}_{\ell}^{\top}\left[s^{\ell}(i)\,\|\,\phi\left(p_{i j}^{m}\right)\,\|\,s^{\ell}(j)\right]\right)\right)}{\sum_{k \in N(i) \cup\{i\}} \exp \left(\sigma_{a}\left(\sum_{m} \vec{a}_{\ell}^{\top}\left[s^{\ell}(i)\,\|\,\phi\left(p_{i k}^{m}\right)\,\|\,s^{\ell}(k)\right]\right)\right)}$
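As a rough illustration of the attention computation, the sketch below builds the concatenated feature [score of $i$, predicate embedding, score of $j$] for every parallel edge, sums the projections over edges, applies a nonlinearity, and normalizes with a softmax. Several things here are assumptions rather than details taken from the text: $\sigma_a$ is taken to be a LeakyReLU, the self-loop is modeled as an explicit edge with a hypothetical `"self"` predicate, and the predicate names, embeddings, and attention vector `a` are toy values.

```python
import math


def leaky_relu(x, slope=0.2):
    """Assumed choice for sigma_a."""
    return x if x > 0 else slope * x


def attention_weights(i, scores, edges, pred_emb, a):
    """Softmax attention of node i over every j that has an (i, j) entry in `edges`.

    edges:    dict mapping (i, j) -> list of predicate names, one per parallel edge
    pred_emb: dict mapping predicate name -> embedding (list of floats)
    a:        attention vector with length 1 + len(embedding) + 1
    """
    logits = {}
    for (src, j), preds in edges.items():
        if src != i:
            continue
        total = 0.0
        for p in preds:  # sum over the m parallel edges between i and j
            feat = [scores[i]] + pred_emb[p] + [scores[j]]  # concatenation
            total += sum(ak * fk for ak, fk in zip(a, feat))
        logits[j] = leaky_relu(total)
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {j: math.exp(v - top) for j, v in logits.items()}
    norm = sum(exps.values())
    return {j: e / norm for j, e in exps.items()}


# Toy data (assumed for illustration only).
scores = {"a": 0.5, "b": 0.25, "c": 0.75}
pred_emb = {"directed": [1.0, 0.0], "acted_in": [0.0, 1.0], "self": [0.0, 0.0]}
edges = {
    ("a", "a"): ["self"],                  # self-loop, hypothetical predicate
    ("a", "b"): ["directed", "acted_in"],  # two parallel edges
    ("a", "c"): ["acted_in"],
}
a_vec = [0.1, 0.2, -0.3, 0.4]

alpha_a = attention_weights("a", scores, edges, pred_emb, a_vec)
uniform = attention_weights("a", scores, edges, pred_emb, [0.0, 0.0, 0.0, 0.0])
```

With an all-zero attention vector every logit is zero, so the weights fall back to a uniform distribution over $N(i) \cup \{i\}$, which is a quick sanity check on the normalization.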

Centrality Adjustment

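In general, nodes with a higher in-degree tend to be more important, so an initial centrality score can be computed from the in-degree $d(i)$ as $c(i)=\log (d(i)+\epsilon)$. Used directly, however, this value does not accurately capture the relationship between in-degree and centrality, so two learnable parameters $\gamma$ and $\beta$ are added: $c^{*}(i)=\gamma \cdot c(i)+\beta$. The final score of node $i$ combines $c^{*}(i)$ with the output of the last layer, $s^{L}(i)$: $s^{*}(i)=\sigma_{s}\left(c^{*}(i) \cdot s^{L}(i)\right)$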
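The centrality adjustment is simple enough to write out directly. The sketch below is a minimal version under assumptions: $\sigma_s$ is taken to be a ReLU (the text only says it is some nonlinearity), `gamma` and `beta` are passed in as fixed numbers rather than learned, and the function name `final_score` and the default `eps` are made up for illustration.

```python
import math


def relu(x):
    """Assumed choice for sigma_s."""
    return max(0.0, x)


def final_score(in_degree, s_last, gamma, beta, eps=1e-6, sigma_s=relu):
    """Centrality-adjusted score:
    c(i)  = log(d(i) + eps)
    c*(i) = gamma * c(i) + beta
    s*(i) = sigma_s(c*(i) * s^L(i))
    """
    c = math.log(in_degree + eps)
    c_star = gamma * c + beta
    return sigma_s(c_star * s_last)


# Same last-layer score, different in-degrees (toy values).
low = final_score(in_degree=2, s_last=0.8, gamma=1.0, beta=0.5)
high = final_score(in_degree=50, s_last=0.8, gamma=1.0, beta=0.5)
```

With a positive `gamma` and a positive last-layer score, the node with the larger in-degree ends up with the larger final score, while `gamma` and `beta` control how strongly in-degree is allowed to matter.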

Architecture

[Image: GENI architecture]

To strengthen the attention mechanism, the model uses multi-head attention.

We define $s_{h}^{\prime \ell-1}(j)$ to be node $j$'s score that is estimated by the $(\ell-1)$-th layer and fed into the $h$-th SA head in the $\ell$-th (i.e., the next) layer, which in turn produces an aggregation $s_{h}^{\ell}(i)$ of these scores: