Unverified commit 096ab790, authored by Huang Zhengjie, committed by GitHub

Merge pull request #47 from ZHUI/fix_typo

Fix some typo of pgl-ke, add multi neg times support, so on.
# PGL - Knowledge Graph Embedding
## Introduction
This package is mainly for computing node and relation embedding of knowledge graphs efficiently.
This package reproduces the following knowledge embedding models:
- TransE
- TransR
- RotatE
## Dataset
The WN18 and FB15k datasets were originally published with the TransE paper and can be downloaded [here](https://everest.hds.utc.fr/doku.php?id=en:transe).
## Dependencies
If you want to use PGL-KGE with Paddle, please install the following packages.
- paddlepaddle>=1.7
- pgl
## Experiment results
FB15k dataset
| Models |Mean Rank| Mrr | Hits@1 | Hits@3 | Hits@10 | MR@filter| Hits10@filter|
|----------|-------|-------|--------|--------|---------|---------|---------|
| TransE| 214 | -- | -- | -- | 0.491 | 118 | 0.668|
| TransR| 202 | -- | -- | -- | 0.502 | 115 | 0.683|
| RotatE| 156| -- | -- | -- | 0.498 | 52 | 0.710|
WN18 dataset
| Models |Mean Rank| Mrr | Hits@1 | Hits@3 | Hits@10 | MR@filter| Hits10@filter|
|----------|-------|-------|--------|--------|---------|---------|---------|
| TransE| 257 | -- | -- | -- | 0.800 | 245 | 0.915|
| TransR| 255 | -- | -- | -- | 0.8012| 243 | 0.9371|
| RotatE| 188 | -- | -- | -- | 0.8325| 176 | 0.9601|
## References
[1]. TransE https://ieeexplore.ieee.org/abstract/document/8047276
[2]. TransR http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9571/9523
[3]. RotatE https://arxiv.org/abs/1902.10197
#CUDA_VISIBLE_DEVICES=2 \
#FLAGS_fraction_of_gpu_memory_to_use=0.01 \
#python main.py \
# --use_cuda \
# --model TransE \
# --optimizer adam \
# --batch_size=512 \
# --learning_rate=0.001 \
# --epoch 100 \
# --evaluate_per_iteration 20 \
# --sample_workers 4 \
# --margin 4 \
## #--only_evaluate
#CUDA_VISIBLE_DEVICES=2 \
#FLAGS_fraction_of_gpu_memory_to_use=0.01 \
#python main.py \
# --use_cuda \
# --model RotatE \
# --data_dir ./data/WN18 \
# --optimizer adam \
# --batch_size=512 \
# --learning_rate=0.001 \
# --epoch 100 \
# --evaluate_per_iteration 100 \
# --sample_workers 10 \
# --margin 6 \
# --neg_times 10
CUDA_VISIBLE_DEVICES=2 \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model RotatE \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 10 \
--margin 8 \
--neg_times 10 \
--neg_mode True
# PGL - Knowledge Graph Embedding
This package efficiently computes node and relation embeddings of knowledge graphs.
It reproduces the following knowledge embedding models:
- TransE
- TransR
- RotatE
### Dataset
The WN18 and FB15k datasets were originally published with the TransE paper and can be downloaded [here](https://everest.hds.utc.fr/doku.php?id=en:transe).
FB15k: [https://drive.google.com/open?id=19I3LqaKjgq-3vOs0us7OgEL06TIs37W8](https://drive.google.com/open?id=19I3LqaKjgq-3vOs0us7OgEL06TIs37W8)
WN18: [https://drive.google.com/open?id=1MXy257ZsjeXQHZScHLeQeVnUTPjltlwD](https://drive.google.com/open?id=1MXy257ZsjeXQHZScHLeQeVnUTPjltlwD)
### Dependencies
If you want to use PGL-KG with Paddle, please install the following packages.
- paddlepaddle>=1.7
- pgl
### Hyperparameters
- use\_cuda: train on GPU with CUDA.
- model: name of the model to train. Available models are `TransE`, `TransR` and `RotatE`.
- data\_dir: path to the dataset directory.
- optimizer: optimizer used to train the model.
- batch\_size: batch size.
- learning\_rate: learning rate.
- epoch: number of epochs to run.
- evaluate\_per\_iteration: number of training epochs between evaluations.
- sample\_workers: number of worker processes used to prepare data.
- margin: margin of the ranking loss, used by some models.
For the usage of other hyperparameters, please refer to `main.py`. We also provide the `run.sh` script to reproduce the reported results (please download the datasets to `./data` and set the data\_dir parameter accordingly).
### How to run
For example, to train the TransR model on the WN18 dataset with a GPU
(please download the WN18 dataset to the `./data` folder first):
```
python main.py --use_cuda --model TransR --data_dir ./data/WN18
```
We also provide the `run.sh` script to reproduce the following results.
### Experiment results
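Here we report the experiment results on the FB15k and WN18 datasets. The evaluation metrics are MR (mean rank), MRR (mean reciprocal rank), and Hits@N (the fraction of test triples whose correct entity is ranked in the top N). The suffix `@f` denotes the filtered setting, in which candidate triples that already exist in the knowledge graph are removed before ranking.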
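As a rough illustration of how these metrics relate to each other, the sketch below computes them from a list of ranks with NumPy. The helper names `ranking_metrics` and `filtered_rank` are hypothetical and not part of this package, and the sketch assumes that a lower score means a better candidate, as in distance-based models.

```python
import numpy as np


def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute MR, MRR and Hits@k from 1-based ranks of the correct entities."""
    ranks = np.asarray(ranks, dtype=np.float64)
    metrics = {
        "MR": float(ranks.mean()),
        "MRR": float((1.0 / ranks).mean()),
    }
    for k in ks:
        # Fraction of test triples whose correct entity is ranked in the top k.
        metrics["Hits@%d" % k] = float((ranks <= k).mean())
    return metrics


def filtered_rank(scores, true_id, known_ids=()):
    """Rank of true_id among all candidates (lower score is better).

    Candidates listed in known_ids are other correct answers and are
    skipped, which corresponds to the `@f` (filtered) columns.
    """
    scores = np.asarray(scores, dtype=np.float64)
    keep = np.ones(len(scores), dtype=bool)
    keep[np.array(list(known_ids), dtype=np.int64)] = False
    keep[true_id] = True
    return int((scores[keep] < scores[true_id]).sum()) + 1


example = ranking_metrics([1, 2, 4, 20])
raw = filtered_rank([0.1, 0.2, 0.3, 0.4], true_id=2)
filtered = filtered_rank([0.1, 0.2, 0.3, 0.4], true_id=2, known_ids={0})
```

Filtering can only remove competitors, so the filtered rank is never worse than the raw rank. This matches the tables, where every `@f` column is at least as good as its raw counterpart.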
FB15k dataset
| Models | MR | MRR | Hits@1 | Hits@3 | Hits@10 | MR@f | MRR@f | Hits@1@f | Hits@3@f | Hits@10@f |
|--------|-----|-------|--------|--------|--------|-------|-----|------|------|--------|
| TransE | 215 | 0.205 | 0.093 | 0.234 | 0.446 | 74 |0.379| 0.235| 0.453| 0.647 |
| TransR | 304 | 0.193 | 0.092 | 0.211 | 0.418 | 156 |0.366| 0.232| 0.435| 0.623 |
| RotatE | 157 | 0.270 | 0.162 | 0.303 | 0.501 | 53 |0.478| 0.354| 0.547| 0.710 |
WN18 dataset
| Models | MR | MRR | Hits@1 | Hits@3 | Hits@10 | MR@f | MRR@f | Hits@1@f | Hits@3@f | Hits@10@f |
|--------|-----|-------|--------|--------|--------|-------|-----|------|------|--------|
| TransE | 219 | 0.338 | 0.082 | 0.523 | 0.800 | 208 |0.463| 0.135| 0.771| 0.932 |
| TransR | 321 | 0.370 | 0.096 | 0.591 | 0.810 | 309 |0.513| 0.158| 0.860| 0.941 |
| RotatE | 167 | 0.603 | 0.476 | 0.688 | 0.830 | 155 |0.915| 0.884| 0.941| 0.957 |
## References
[1]. [TransE: Translating embeddings for modeling multi-relational data.](https://ieeexplore.ieee.org/abstract/document/8047276)
[2]. [TransR: Learning entity and relation embeddings for knowledge graph completion.](http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9571/9523)
[3]. [RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space.](https://arxiv.org/abs/1902.10197)
......@@ -19,10 +19,11 @@ import os
import numpy as np
from collections import defaultdict
from pgl.utils.logger import log
from pybloom import BloomFilter
#from pybloom import BloomFilter
class KBloader:
class KGLoader:
"""
load the FB15K
"""
......@@ -65,8 +66,9 @@ class KBloader:
def training_data_no_filter(self, train_triple_positive):
"""faster, no filter for exists triples"""
size = len(train_triple_positive)
train_triple_negative = train_triple_positive + 0
size = len(train_triple_positive) * self._neg_times
train_triple_negative = train_triple_positive.repeat(
self._neg_times, axis=0)
replace_head_probability = 0.5 * np.ones(size)
replace_entity_id = np.random.randint(self.entity_total, size=size)
random_num = np.random.random(size=size)
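The hunk above draws `neg_times` corrupted triples per positive triple instead of one. A minimal standalone sketch of that idea follows. The function name `corrupt_triples`, the `(head, relation, tail)` column order and the use of `numpy.random.default_rng` are assumptions made for illustration, not the loader's actual code.

```python
import numpy as np


def corrupt_triples(positive, entity_total, neg_times, rng=None):
    """Return neg_times corrupted copies of each positive triple.

    positive: integer array of shape [B, 3], assumed to hold
    (head, relation, tail) ids. Existing triples are not filtered out,
    mirroring the "no filter" variant.
    """
    rng = np.random.default_rng() if rng is None else rng
    positive = np.asarray(positive)
    size = len(positive) * neg_times
    # repeat() returns a new array in which the neg_times copies of each
    # positive triple sit next to each other.
    negative = positive.repeat(neg_times, axis=0)
    replace_entity = rng.integers(entity_total, size=size)
    # Replace the head with probability 0.5, otherwise replace the tail.
    replace_head = rng.random(size) < 0.5
    negative[replace_head, 0] = replace_entity[replace_head]
    negative[~replace_head, 2] = replace_entity[~replace_head]
    return negative


pos = np.array([[0, 0, 1], [2, 1, 3]])
neg = corrupt_triples(pos, entity_total=10, neg_times=3,
                      rng=np.random.default_rng(0))
```

Because the copies of one positive triple are contiguous, the negatives can later be reshaped to `[-1, neg_times]` so that each row lines up with its positive triple.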
......@@ -122,7 +124,6 @@ class KBloader:
"""
n = len(self._triple_train)
rand_idx = np.random.permutation(n)
rand_idx = rand_idx % n
n_triple = len(rand_idx)
start = 0
while start < n_triple:
......
......@@ -99,8 +99,10 @@ class Evaluate:
feed=batch_feed_dict)
yield batch_feed_dict["test_triple"], head_score, tail_score
n_used_eval_triple += 1
print('[{:.3f}s] #evaluation triple: {}/{}'.format(
timeit.default_timer() - start, n_used_eval_triple, 5000))
if n_used_eval_triple % 500 == 0:
print('[{:.3f}s] #evaluation triple: {}/{}'.format(
timeit.default_timer(
) - start, n_used_eval_triple, self.reader.test_num))
res_reader = mp_reader_mapper(
reader=iterator,
......
......@@ -16,10 +16,13 @@ The script to run these models.
"""
import argparse
import timeit
import os
import numpy as np
import paddle.fluid as fluid
from data_loader import KBloader
from data_loader import KGLoader
from evalutate import Evaluate
from model import model_dict
from model.utils import load_var
from mp_mapper import mp_reader_mapper
from pgl.utils.logger import log
......@@ -49,6 +52,7 @@ def run_round(batch_iter,
run_time = 0
data_time = 0
t2 = timeit.default_timer()
start_epoch_time = timeit.default_timer()
for batch_feed_dict in batch_iter():
batch += 1
t1 = timeit.default_timer()
......@@ -62,8 +66,11 @@ def run_round(batch_iter,
if batch % log_per_step == 0:
tmp_epoch += 1
if prefix == "train":
log.info("Epoch %s Ava Loss %s" %
(epoch + tmp_epoch, tmp_loss / batch))
log.info("Epoch %s (%.7f sec) Train Loss: %.7f" %
(epoch + tmp_epoch,
timeit.default_timer() - start_epoch_time,
tmp_loss[0] / batch))
start_epoch_time = timeit.default_timer()
else:
log.info("Batch %s" % batch)
batch = 0
......@@ -84,7 +91,7 @@ def train(args):
:param args: all args.
:return: None
"""
kgreader = KBloader(
kgreader = KGLoader(
batch_size=args.batch_size,
data_dir=args.data_dir,
neg_mode=args.neg_mode,
......@@ -117,8 +124,8 @@ def train(args):
reader = mp_reader_mapper(
data_repeat,
func=kgreader.training_data_map,
#func=kgreader.training_data_no_filter,
func=kgreader.training_data_no_filter
if args.nofilter else kgreader.training_data_map,
num_works=args.sample_workers)
return reader
......@@ -148,6 +155,20 @@ def train(args):
exe = fluid.Executor(places[0])
exe.run(model.startup_program)
exe.run(fluid.default_startup_program())
if args.pretrain and model.model_name in ["TransR", "transr"]:
pretrain_ent = os.path.join(args.checkpoint,
model.ent_name.replace("TransR", "TransE"))
pretrain_rel = os.path.join(args.checkpoint,
model.rel_name.replace("TransR", "TransE"))
if os.path.exists(pretrain_ent):
print("loading pretrain!")
#var = fluid.global_scope().find_var(model.ent_name)
load_var(exe, model.train_program, model.ent_name, pretrain_ent)
#var = fluid.global_scope().find_var(model.rel_name)
load_var(exe, model.train_program, model.rel_name, pretrain_rel)
else:
raise ValueError("pretrain file {} not exists!".format(
pretrain_ent))
prog = fluid.CompiledProgram(model.train_program).with_data_parallel(
loss_name=model.train_fetch_vars[0].name)
......@@ -182,9 +203,9 @@ def train(args):
log_per_step=kgreader.train_num // args.batch_size,
epoch=epoch * args.evaluate_per_iteration)
log.info("epoch\t%s" % ((1 + epoch) * args.evaluate_per_iteration))
if True:
fluid.io.save_params(
exe, dirname=args.checkpoint, main_program=model.train_program)
fluid.io.save_params(
exe, dirname=args.checkpoint, main_program=model.train_program)
if not args.noeval:
eva = Evaluate(kgreader)
eva.launch_evaluation(
exe=exe,
......@@ -273,6 +294,22 @@ def main():
parser.add_argument(
'--neg_mode', type=bool, help='return neg mode flag', default=False)
parser.add_argument(
'--nofilter',
type=bool,
help='don\'t filter invalid examples',
default=False)
parser.add_argument(
'--pretrain',
type=bool,
help='pretrain for TransR model',
default=False)
parser.add_argument(
'--noeval',
type=bool,
help='whether to evaluate the result',
default=False)
args = parser.parse_args()
log.info(args)
train(args)
......
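One thing to keep in mind with the `type=bool` flags added above: `argparse` applies `bool()` to the raw string, and any non-empty string, including `"False"`, is truthy. The sketch below demonstrates this with the standard library only. The `str2bool` helper is a hypothetical alternative, not something this commit defines.

```python
import argparse


def str2bool(value):
    """Parse common textual booleans; a hypothetical helper for illustration."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


# Same declaration style as in the diff above.
naive = argparse.ArgumentParser()
naive.add_argument("--nofilter", type=bool, default=False)

# Same flag, parsed with the helper instead.
strict = argparse.ArgumentParser()
strict.add_argument("--nofilter", type=str2bool, default=False)

naive_args = naive.parse_args(["--nofilter", "False"])
strict_args = strict.parse_args(["--nofilter", "False"])

print(naive_args.nofilter)   # True, because bool("False") is True
print(strict_args.nofilter)  # False
```

With the declarations as committed, passing `--nofilter True` behaves as expected, but passing `--nofilter False` still enables the flag. Omitting the flag is the reliable way to keep the default.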
......@@ -13,9 +13,9 @@
# limitations under the License.
"""
RotatE:
"Learning entity and relation embeddings for knowledge graph completion."
Lin, Yankai, et al.
https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9571/9523
"RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space."
Sun, Zhiqing, et al.
https://arxiv.org/abs/1902.10197
"""
import paddle.fluid as fluid
from .Model import Model
......
......@@ -34,6 +34,7 @@ class TransE(Model):
learning_rate,
args,
optimizer="adam"):
self._neg_times = args.neg_times
super(TransE, self).__init__(
model_name="TransE",
data_reader=data_reader,
......@@ -84,6 +85,9 @@ class TransE(Model):
fluid.layers.abs(pos_score), 1, keep_dim=False)
neg = fluid.layers.reduce_sum(
fluid.layers.abs(neg_score), 1, keep_dim=False)
neg = fluid.layers.reshape(
neg, shape=[-1, self._neg_times], inplace=True)
loss = fluid.layers.reduce_mean(
fluid.layers.relu(pos - neg + self._margin))
return [loss]
......
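The reshape added to the loss pairs each positive score with its `neg_times` negative scores. The NumPy sketch below shows the same margin ranking loss outside of Paddle. The function name `margin_loss` is hypothetical, and the explicit `[B, 1]` reshape of `pos` is an assumption about how the broadcast is meant to work, not a statement about Paddle's broadcasting rules.

```python
import numpy as np


def margin_loss(pos, neg, neg_times, margin):
    """Mean of relu(pos - neg + margin) over all positive/negative pairs.

    pos: distances of B positive triples, shape [B].
    neg: distances of B * neg_times negative triples, grouped so that the
         negatives of one positive triple are contiguous.
    """
    pos = np.asarray(pos, dtype=np.float64).reshape(-1, 1)          # [B, 1]
    neg = np.asarray(neg, dtype=np.float64).reshape(-1, neg_times)  # [B, neg_times]
    # Broadcasting compares every positive with each of its negatives.
    return float(np.maximum(pos - neg + margin, 0.0).mean())


loss = margin_loss(pos=[1.0, 2.0], neg=[3.0, 0.5, 2.0, 4.0],
                   neg_times=2, margin=1.0)
```

In this example only two of the four pairs violate the margin (hinge values 1.5 and 1.0), so the loss is 2.5 / 4 = 0.625.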
......@@ -36,6 +36,7 @@ class TransR(Model):
args,
optimizer="adam"):
"""init"""
self._neg_times = args.neg_times
super(TransR, self).__init__(
model_name="TransR",
data_reader=data_reader,
......@@ -60,19 +61,19 @@ class TransR(Model):
dtype="float32",
name=self.rel_name,
default_initializer=fluid.initializer.Xavier())
init_values = np.tile(
np.identity(
self._hidden_size, dtype="float32").reshape(-1),
(self._relation_total, 1))
transfer_matrix = fluid.layers.create_parameter(
shape=[
self._relation_total, self._hidden_size * self._hidden_size
],
dtype="float32",
name=self._prefix + "transfer_matrix", )
# Here is a trick, must init with identity matrix to get good hit@10 performance.
fluid.layers.assign(
np.tile(
np.identity(
self._hidden_size, dtype="float32").reshape(-1),
(self._relation_total, 1)),
transfer_matrix)
name=self._prefix + "transfer_matrix",
default_initializer=fluid.initializer.NumpyArrayInitializer(
init_values))
return entity_embedding, relation_embedding, transfer_matrix
def score_with_l2_normalize(self, head, rel, tail):
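The initial value built above is one flattened identity matrix per relation. The sketch below reproduces only the NumPy part to show the resulting shape. The function name `identity_transfer_init` is hypothetical.

```python
import numpy as np


def identity_transfer_init(relation_total, hidden_size):
    """One flattened identity matrix per relation.

    Returns an array of shape [relation_total, hidden_size * hidden_size].
    """
    eye = np.identity(hidden_size, dtype="float32").reshape(-1)
    return np.tile(eye, (relation_total, 1))


init = identity_transfer_init(relation_total=3, hidden_size=4)
entity = np.arange(4, dtype="float32")
# Projecting with an identity transfer matrix leaves the embedding unchanged.
projected = entity @ init[0].reshape(4, 4)
```

With this initialization every relation-specific projection starts as a no-op, so at the first step the model scores triples directly in the entity space.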
......@@ -111,7 +112,7 @@ class TransR(Model):
pos_head_trans = self.matmul_with_expend_dims(pos_head, rel_matrix)
pos_tail_trans = self.matmul_with_expend_dims(pos_tail, rel_matrix)
trans_neg = False
trans_neg = True
if trans_neg:
rel_matrix_neg = fluid.layers.reshape(
lookup_table(self.train_neg_input[:, 1], transfer_matrix),
......@@ -133,6 +134,9 @@ class TransR(Model):
fluid.layers.abs(pos_score), -1, keep_dim=False)
neg = fluid.layers.reduce_sum(
fluid.layers.abs(neg_score), -1, keep_dim=False)
neg = fluid.layers.reshape(
neg, shape=[-1, self._neg_times], inplace=True)
loss = fluid.layers.reduce_mean(
fluid.layers.relu(pos - neg + self._margin))
return [loss]
......
......@@ -56,3 +56,64 @@ def lookup_table_gather(index, input):
:return:
"""
return fluid.layers.gather(index=index, input=input, overwrite=False)
def _clone_var_in_block_(block, var):
assert isinstance(var, fluid.Variable)
if var.desc.type() == fluid.core.VarDesc.VarType.LOD_TENSOR:
return block.create_var(
name=var.name,
shape=var.shape,
dtype=var.dtype,
type=var.type,
lod_level=var.lod_level,
persistable=True)
else:
return block.create_var(
name=var.name,
shape=var.shape,
dtype=var.dtype,
type=var.type,
persistable=True)
def load_var(executor, main_program=None, var=None, filename=None):
"""
load_var to certain program
:param executor: executor
:param main_program: the program to load
:param var: the variable name in main_program.
:param filename: the name of the file to load.
:return: None
"""
load_prog = fluid.Program()
load_block = load_prog.global_block()
if main_program is None:
main_program = fluid.default_main_program()
if not isinstance(main_program, fluid.Program):
raise TypeError("program should be as Program type or None")
vars = list(filter(None, main_program.list_vars()))
# save origin param shape
orig_para_shape = {}
load_var_map = {}
for each_var in vars:
if each_var.name != var:
continue
assert isinstance(each_var, fluid.Variable)
if each_var.type == fluid.core.VarDesc.VarType.RAW:
continue
if isinstance(each_var, fluid.framework.Parameter):
orig_para_shape[each_var.name] = tuple(each_var.desc.get_shape())
new_var = _clone_var_in_block_(load_block, each_var)
if filename is not None:
load_block.append_op(
type='load',
inputs={},
outputs={'Out': [new_var]},
attrs={'file_path': filename})
executor.run(load_prog)
......@@ -65,12 +65,16 @@ def mp_reader_mapper(reader, func, num_works=4):
all_process.append(p)
data_iter = reader()
if not hasattr(data_iter, "__next__"):
__next__ = data_iter.next
else:
__next__ = data_iter.__next__
def next_data():
"""next_data"""
_next = None
try:
_next = data_iter.next()
_next = __next__()
except StopIteration:
# log.debug(traceback.format_exc())
pass
......
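The change above picks `next` or `__next__` depending on what the iterator provides, which keeps the reader working on both Python 2 and Python 3. The sketch below shows that pattern next to the built-in `next(iterator, default)`, which exists on both versions. The helper names `make_next` and `next_or_none` are hypothetical.

```python
def make_next(data_iter):
    """Return the bound 'next' method of an iterator on Python 2 or 3."""
    if hasattr(data_iter, "__next__"):
        return data_iter.__next__  # Python 3 iterators
    return data_iter.next          # Python 2 iterators


def next_or_none(data_iter):
    """Fetch the next item, or None once the iterator is exhausted."""
    # The built-in next() takes a default that replaces StopIteration.
    return next(data_iter, None)


items = iter([1, 2])
fetch = make_next(items)
first = fetch()
second = next_or_none(items)
done = next_or_none(items)
```

Using `None` as the end marker only works when the data itself never contains `None`. This sketch assumes so, since the original function also starts from `_next = None`.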
device=3
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransE \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=1024 \
--learning_rate=0.001 \
--epoch 200 \
--evaluate_per_iteration 200 \
--sample_workers 1 \
--margin 1.0 \
--nofilter True \
--neg_times 10 \
--neg_mode True
#--only_evaluate
# TransE FB15k
# -----Raw-Average-Results
# MeanRank: 214.94, MRR: 0.2051, Hits@1: 0.0929, Hits@3: 0.2343, Hits@10: 0.4458
# -----Filter-Average-Results
# MeanRank: 74.41, MRR: 0.3793, Hits@1: 0.2351, Hits@3: 0.4538, Hits@10: 0.6570
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransE \
--data_dir ./data/WN18 \
--optimizer adam \
--batch_size=1024 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 1 \
--margin 4 \
--nofilter True \
--neg_times 10 \
--neg_mode True
# TransE WN18
# -----Raw-Average-Results
# MeanRank: 219.08, MRR: 0.3383, Hits@1: 0.0821, Hits@3: 0.5233, Hits@10: 0.7997
# -----Filter-Average-Results
# MeanRank: 207.72, MRR: 0.4631, Hits@1: 0.1349, Hits@3: 0.7708, Hits@10: 0.9315
# for pretrain
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransE \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 30 \
--evaluate_per_iteration 30 \
--sample_workers 1 \
--margin 2.0 \
--nofilter True \
--noeval True \
--neg_times 10 \
--neg_mode True && \
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransR \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 200 \
--evaluate_per_iteration 200 \
--sample_workers 1 \
--margin 2.0 \
--pretrain True \
--nofilter True \
--neg_times 10 \
--neg_mode True
# FB15k TransR 200, pretrain 20
# -----Raw-Average-Results
# MeanRank: 303.81, MRR: 0.1931, Hits@1: 0.0920, Hits@3: 0.2109, Hits@10: 0.4181
# -----Filter-Average-Results
# MeanRank: 156.30, MRR: 0.3663, Hits@1: 0.2318, Hits@3: 0.4352, Hits@10: 0.6231
# for pretrain
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransE \
--data_dir ./data/WN18 \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 30 \
--evaluate_per_iteration 30 \
--sample_workers 1 \
--margin 4.0 \
--nofilter True \
--noeval True \
--neg_times 10 \
--neg_mode True && \
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransR \
--data_dir ./data/WN18 \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 1 \
--margin 4.0 \
--pretrain True \
--nofilter True \
--neg_times 10 \
--neg_mode True
# TransR WN18 100, pretrain 30
# -----Raw-Average-Results
# MeanRank: 321.41, MRR: 0.3706, Hits@1: 0.0955, Hits@3: 0.5906, Hits@10: 0.8099
# -----Filter-Average-Results
# MeanRank: 309.15, MRR: 0.5126, Hits@1: 0.1584, Hits@3: 0.8601, Hits@10: 0.9409
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model RotatE \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 10 \
--margin 8 \
--neg_times 10 \
--neg_mode True
# RotatE FB15k
# -----Raw-Average-Results
# MeanRank: 156.85, MRR: 0.2699, Hits@1: 0.1615, Hits@3: 0.3031, Hits@10: 0.5006
# -----Filter-Average-Results
# MeanRank: 53.35, MRR: 0.4776, Hits@1: 0.3537, Hits@3: 0.5473, Hits@10: 0.7062
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model RotatE \
--data_dir ./data/WN18 \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 10 \
--margin 6 \
--neg_times 10 \
--neg_mode True
# RotatE WN18
# -----Raw-Average-Results
# MeanRank: 167.27, MRR: 0.6025, Hits@1: 0.4764, Hits@3: 0.6880, Hits@10: 0.8298
# -----Filter-Average-Results
# MeanRank: 155.23, MRR: 0.9145, Hits@1: 0.8843, Hits@3: 0.9412, Hits@10: 0.9570