Unverified commit 2e3c52a4, authored by kirayummy, committed by GitHub

Merge pull request #49 from PaddlePaddle/develop

Develop
# Distributed metapath2vec, metapath2vec++, multi-metapath2vec++ in PGL
[metapath2vec](https://ericdongyx.github.io/papers/KDD17-dong-chawla-swami-metapath2vec.pdf) is an algorithmic framework for representation learning in heterogeneous networks, which contain multiple types of nodes and links. Given a heterogeneous graph, metapath2vec first generates meta-path-based random walks and then uses the skip-gram model to learn node representations. Based on PGL, we reproduce the metapath2vec algorithm in distributed mode.
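To make the walk generation concrete, here is a minimal, self-contained sketch of a meta-path-guided random walk (a toy illustration, not the PGL implementation; the graph, node names, and helper function are invented for this example):
```python
import random

# Toy heterogeneous DBLP-style graph: adjacency lists keyed by edge type
# (conference c, paper p, author a). All node names are made up.
neighbors = {
    ("c", "p"): {"KDD": ["p1", "p2"]},
    ("p", "a"): {"p1": ["alice"], "p2": ["bob"]},
    ("a", "p"): {"alice": ["p2"], "bob": ["p1"]},
    ("p", "c"): {"p1": ["KDD"], "p2": ["KDD"]},
}

def metapath_walk(start, meta_path, walk_len):
    """Follow the edge types of meta_path in order, e.g. c2p-p2a-a2p-p2c."""
    walk, step = [start], 0
    while len(walk) < walk_len:
        src_t, dst_t = meta_path[step % len(meta_path)].split("2")
        cands = neighbors.get((src_t, dst_t), {}).get(walk[-1], [])
        if not cands:  # dead end: no neighbor of the required type
            break
        walk.append(random.choice(cands))
        step += 1
    return walk

print(metapath_walk("KDD", "c2p-p2a-a2p-p2c".split("-"), walk_len=9))
```
The walks produced this way are then fed to the skip-gram model in the same way sentences are in word2vec.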
### Datasets
DBLP: The dataset contains 14376 papers (P), 20 conferences (C), 14475 authors (A), and 8920 terms (T). There are 33791 nodes in this dataset.
You can download the datasets from [here](https://github.com/librahu/HIN-Datasets-for-Recommendation-and-Network-Embedding).
We use the ```DBLP``` dataset as an example. After downloading, place the files in, say, ```./data/DBLP/```.
### Dependencies
- paddlepaddle>=1.6
- pgl>=1.0.0
### How to run
Before training, run the command below to preprocess the data.
```sh
python data_process.py --data_path ./data/DBLP --output_path ./data/data_processed
@@ -30,11 +30,21 @@ python multi_class.py --dataset ./data/data_processed/author_label.txt --ckpt_pa
```
### Model Selection
There are three models in this example: ```metapath2vec```, ```metapath2vec++``` and ```multi_metapath2vec++```. You can select different models by modifying ```config.yaml```.
To run the ```metapath2vec++``` model, simply set the **neg_sample_type** hyperparameter to **m2v_plus**; the ```metapath2vec++``` model will then be selected.
```multi_metapath2vec++``` means that rather than using a single metapath, you can use several metapaths at the same time to train the model. For example, you might want to use ```c2p-p2a-a2p-p2c``` and ```p2a-a2p``` simultaneously. To do so, set the hyperparameters below in ```config.yaml```.
- **neg_sample_type**: "m2v_plus"
- **walk_mode**: "multi_m2v"
- **meta_path**: "c2p-p2a-a2p-p2c;p2a-a2p"
- **first_node_type**: "c;p"
### Hyperparameters
All the hyperparameters are saved in the ```config.yaml``` file, so before training you can open ```config.yaml``` to modify them as you like.
Some important hyperparameters in ```config.yaml```:
- **edge_path**: the directory of the graph data that you want to load
- **lr**: learning rate
- **neg_num**: number of negative samples
...
@@ -31,7 +31,7 @@ is_distributed: False
# training config
epochs: 10
optimizer: "sgd"
lr: 0.1
warm_start_from_dir: null
walkpath_files: "None"
train_files: "None"
...
@@ -87,9 +87,12 @@ class NodeGenerator(object):
            idx = cc % num_n_type
            n_type = n_type_list[idx]
            try:
                nodes = next(node_generators[n_type])
            except StopIteration as e:
                log.info("node type of %s iteration finished in one epoch" %
                         (n_type))
                node_generators[n_type] = \
                    self.graph.node_batch_iter(self.batch_size, n_type=n_type)
                break
            yield (nodes, idx)
            cc += 1
...
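The fix above replaces the exhausted generator with a fresh `node_batch_iter`, so later epochs can keep sampling that node type instead of aborting on the exception. A minimal standalone sketch of this round-robin-with-reset pattern, independent of PGL (the toy `sources` dict and helper names are invented for illustration):
```python
def batches(items, batch_size):
    """Yield fixed-size batches from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def round_robin(sources, batch_size):
    """Cycle over node types; rebuild a type's generator once it is exhausted."""
    n_type_list = list(sources)
    gens = {t: batches(sources[t], batch_size) for t in n_type_list}
    cc = 0
    while True:
        n_type = n_type_list[cc % len(n_type_list)]
        try:
            nodes = next(gens[n_type])
        except StopIteration:
            # One pass over this node type finished: reset its generator
            # and, as in the code above, end the current epoch.
            gens[n_type] = batches(sources[n_type], batch_size)
            break
        yield nodes, n_type
        cc += 1

for nodes, n_type in round_robin({"a": [1, 2, 3], "p": [4, 5]}, batch_size=2):
    print(n_type, nodes)
```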
# PGL - Knowledge Graph Embedding
## Introduction
This package is mainly for computing node and relation embeddings of knowledge graphs efficiently.
It reproduces the following knowledge embedding models:
- TransE
- TransR
- RotatE
## Dataset
The WN18 and FB15k datasets were originally published with the TransE paper and can be downloaded [here](https://everest.hds.utc.fr/doku.php?id=en:transe).
## Dependencies
To use PGL-KGE with Paddle, please install the following packages.
- paddlepaddle>=1.7
- pgl
## Experiment results
FB15k dataset
| Models | Mean Rank | MRR | Hits@1 | Hits@3 | Hits@10 | MR@filter | Hits@10@filter |
|----------|-------|-------|--------|--------|---------|---------|---------|
| TransE| 214 | -- | -- | -- | 0.491 | 118 | 0.668|
| TransR| 202 | -- | -- | -- | 0.502 | 115 | 0.683|
| RotatE| 156| -- | -- | -- | 0.498 | 52 | 0.710|
WN18 dataset
| Models | Mean Rank | MRR | Hits@1 | Hits@3 | Hits@10 | MR@filter | Hits@10@filter |
|----------|-------|-------|--------|--------|---------|---------|---------|
| TransE| 257 | -- | -- | -- | 0.800 | 245 | 0.915|
| TransR| 255 | -- | -- | -- | 0.8012| 243 | 0.9371|
| RotatE| 188 | -- | -- | -- | 0.8325| 176 | 0.9601|
## References
[1]. TransE https://ieeexplore.ieee.org/abstract/document/8047276
[2]. TransR http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9571/9523
[3]. RotatE https://arxiv.org/abs/1902.10197
#CUDA_VISIBLE_DEVICES=2 \
#FLAGS_fraction_of_gpu_memory_to_use=0.01 \
#python main.py \
# --use_cuda \
# --model TransE \
# --optimizer adam \
# --batch_size=512 \
# --learning_rate=0.001 \
# --epoch 100 \
# --evaluate_per_iteration 20 \
# --sample_workers 4 \
# --margin 4 \
## #--only_evaluate
#CUDA_VISIBLE_DEVICES=2 \
#FLAGS_fraction_of_gpu_memory_to_use=0.01 \
#python main.py \
# --use_cuda \
# --model RotatE \
# --data_dir ./data/WN18 \
# --optimizer adam \
# --batch_size=512 \
# --learning_rate=0.001 \
# --epoch 100 \
# --evaluate_per_iteration 100 \
# --sample_workers 10 \
# --margin 6 \
# --neg_times 10
CUDA_VISIBLE_DEVICES=2 \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model RotatE \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 10 \
--margin 8 \
--neg_times 10 \
--neg_mode True
# PGL - Knowledge Graph Embedding
This package is mainly for computing node and relation embeddings of knowledge graphs efficiently.
It reproduces the following knowledge embedding models:
- TransE
- TransR
- RotatE
### Dataset
The WN18 and FB15k datasets were originally published with the TransE paper and can be downloaded [here](https://everest.hds.utc.fr/doku.php?id=en:transe).
FB15k: [https://drive.google.com/open?id=19I3LqaKjgq-3vOs0us7OgEL06TIs37W8](https://drive.google.com/open?id=19I3LqaKjgq-3vOs0us7OgEL06TIs37W8)
WN18: [https://drive.google.com/open?id=1MXy257ZsjeXQHZScHLeQeVnUTPjltlwD](https://drive.google.com/open?id=1MXy257ZsjeXQHZScHLeQeVnUTPjltlwD)
### Dependencies
To use PGL-KG with Paddle, please install the following packages.
- paddlepaddle>=1.7
- pgl
### Hyperparameters
- use\_cuda: use CUDA for training.
- model: PGL-KG model name. Currently `TransE`, `TransR` and `RotatE` are available.
- data\_dir: path of the dataset.
- optimizer: optimizer used to train the model.
- batch\_size: batch size.
- learning\_rate: learning rate.
- epoch: number of epochs to run.
- evaluate\_per\_iteration: run evaluation every given number of epochs.
- sample\_workers: number of sampling workers used to prepare data.
- margin: margin hyperparameter used by some models.
For more hyperparameter usage, please refer to `main.py`. We also provide a `run.sh` script to reproduce the performance results (please download the datasets into `./data` and specify the data\_dir parameter).
### How to run
For example, to train the TransR model on the WN18 dataset with GPU (please download the WN18 dataset into the `./data` folder first):
```sh
python main.py --use_cuda --model TransR --data_dir ./data/WN18
```
The `run.sh` script reproduces the performance results below.
### Experiment results
Here we report the experiment results on the FB15k and WN18 datasets. The evaluation metrics are MR (mean rank), MRR (mean reciprocal rank), and Hits@N (the proportion of test triples whose correct entity is ranked within the top N). The suffix `@f` means that known true triples are filtered out when ranking.
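As a reference for how these numbers are derived, here is a small sketch computing the metrics from the 1-based rank of the correct entity for each test triple (illustrative only; in the real pipeline the ranks come from scoring all candidate entities, and the `@f` variants recompute the ranks after removing candidates that form known true triples):
```python
import numpy as np

def rank_metrics(ranks):
    """ranks[i] is the 1-based rank of the true entity for test triple i."""
    ranks = np.asarray(ranks, dtype=np.float64)
    return {
        "MR": ranks.mean(),            # mean rank
        "MRR": (1.0 / ranks).mean(),   # mean reciprocal rank
        "Hits@1": (ranks <= 1).mean(),
        "Hits@3": (ranks <= 3).mean(),
        "Hits@10": (ranks <= 10).mean(),
    }

print(rank_metrics([1, 4, 2, 120, 7]))
```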
FB15k dataset
| Models | MR | MRR | Hits@1 | Hits@3 | Hits@10| MR@f |MRR@f|Hits@1@f|Hits@3@f|Hits@10@f|
|--------|-----|-------|--------|--------|--------|-------|-----|------|------|--------|
| TransE | 215 | 0.205 | 0.093 | 0.234 | 0.446 | 74 |0.379| 0.235| 0.453| 0.647 |
| TransR | 304 | 0.193 | 0.092 | 0.211 | 0.418 | 156 |0.366| 0.232| 0.435| 0.623 |
| RotatE | 157 | 0.270 | 0.162 | 0.303 | 0.501 | 53 |0.478| 0.354| 0.547| 0.710 |
WN18 dataset
| Models | MR | MRR | Hits@1 | Hits@3 | Hits@10| MR@f |MRR@f|Hits@1@f|Hits@3@f|Hits@10@f|
|--------|-----|-------|--------|--------|--------|-------|-----|------|------|--------|
| TransE | 219 | 0.338 | 0.082 | 0.523 | 0.800 | 208 |0.463| 0.135| 0.771| 0.932 |
| TransR | 321 | 0.370 | 0.096 | 0.591 | 0.810 | 309 |0.513| 0.158| 0.941| 0.941 |
| RotatE | 167 | 0.623 | 0.476 | 0.688 | 0.830 | 155 |0.915| 0.884| 0.941| 0.957 |
## References
[1]. [TransE: Translating embeddings for modeling multi-relational data.](https://ieeexplore.ieee.org/abstract/document/8047276)
[2]. [TransR: Learning entity and relation embeddings for knowledge graph completion.](http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9571/9523)
[3]. [RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space.](https://arxiv.org/abs/1902.10197)
@@ -19,10 +19,11 @@ import os
import numpy as np
from collections import defaultdict
from pgl.utils.logger import log
#from pybloom import BloomFilter


class KGLoader:
    """
    load the FB15K
    """
@@ -65,8 +66,9 @@ class KBloader:
    def training_data_no_filter(self, train_triple_positive):
        """faster, no filter for exists triples"""
        size = len(train_triple_positive) * self._neg_times
        train_triple_negative = train_triple_positive.repeat(
            self._neg_times, axis=0)
        replace_head_probability = 0.5 * np.ones(size)
        replace_entity_id = np.random.randint(self.entity_total, size=size)
        random_num = np.random.random(size=size)
@@ -122,7 +124,6 @@ class KBloader:
        """
        n = len(self._triple_train)
        rand_idx = np.random.permutation(n)
        n_triple = len(rand_idx)
        start = 0
        while start < n_triple:
...
@@ -99,8 +99,10 @@ class Evaluate:
                feed=batch_feed_dict)
            yield batch_feed_dict["test_triple"], head_score, tail_score
            n_used_eval_triple += 1
            if n_used_eval_triple % 500 == 0:
                print('[{:.3f}s] #evaluation triple: {}/{}'.format(
                    timeit.default_timer(
                    ) - start, n_used_eval_triple, self.reader.test_num))
        res_reader = mp_reader_mapper(
            reader=iterator,
...
@@ -16,10 +16,13 @@ The script to run these models.
"""
import argparse
import timeit
import os

import numpy as np
import paddle.fluid as fluid
from data_loader import KGLoader
from evalutate import Evaluate
from model import model_dict
from model.utils import load_var
from mp_mapper import mp_reader_mapper
from pgl.utils.logger import log
@@ -49,6 +52,7 @@ def run_round(batch_iter,
    run_time = 0
    data_time = 0
    t2 = timeit.default_timer()
    start_epoch_time = timeit.default_timer()
    for batch_feed_dict in batch_iter():
        batch += 1
        t1 = timeit.default_timer()
@@ -62,8 +66,11 @@ def run_round(batch_iter,
        if batch % log_per_step == 0:
            tmp_epoch += 1
            if prefix == "train":
                log.info("Epoch %s (%.7f sec) Train Loss: %.7f" %
                         (epoch + tmp_epoch,
                          timeit.default_timer() - start_epoch_time,
                          tmp_loss[0] / batch))
                start_epoch_time = timeit.default_timer()
            else:
                log.info("Batch %s" % batch)
            batch = 0
@@ -84,7 +91,7 @@ def train(args):
    :param args: all args.
    :return: None
    """
    kgreader = KGLoader(
        batch_size=args.batch_size,
        data_dir=args.data_dir,
        neg_mode=args.neg_mode,
@@ -117,8 +124,8 @@ def train(args):
        reader = mp_reader_mapper(
            data_repeat,
            func=kgreader.training_data_no_filter
            if args.nofilter else kgreader.training_data_map,
            num_works=args.sample_workers)
        return reader
@@ -148,6 +155,20 @@ def train(args):
    exe = fluid.Executor(places[0])
    exe.run(model.startup_program)
    exe.run(fluid.default_startup_program())
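    # Warm start: when training TransR, optionally initialize its entity and
    # relation embeddings from TransE parameters saved under the same checkpoint
    # directory (the parameter names differ only in the model-name prefix).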
    if args.pretrain and model.model_name in ["TransR", "transr"]:
        pretrain_ent = os.path.join(args.checkpoint,
                                    model.ent_name.replace("TransR", "TransE"))
        pretrain_rel = os.path.join(args.checkpoint,
                                    model.rel_name.replace("TransR", "TransE"))
        if os.path.exists(pretrain_ent):
            print("loading pretrain!")
            #var = fluid.global_scope().find_var(model.ent_name)
            load_var(exe, model.train_program, model.ent_name, pretrain_ent)
            #var = fluid.global_scope().find_var(model.rel_name)
            load_var(exe, model.train_program, model.rel_name, pretrain_rel)
        else:
            raise ValueError("pretrain file {} not exists!".format(
                pretrain_ent))
    prog = fluid.CompiledProgram(model.train_program).with_data_parallel(
        loss_name=model.train_fetch_vars[0].name)
@@ -182,9 +203,9 @@ def train(args):
            log_per_step=kgreader.train_num // args.batch_size,
            epoch=epoch * args.evaluate_per_iteration)
        log.info("epoch\t%s" % ((1 + epoch) * args.evaluate_per_iteration))
        fluid.io.save_params(
            exe, dirname=args.checkpoint, main_program=model.train_program)
        if not args.noeval:
            eva = Evaluate(kgreader)
            eva.launch_evaluation(
                exe=exe,
@@ -273,6 +294,22 @@ def main():
    parser.add_argument(
        '--neg_mode', type=bool, help='return neg mode flag', default=False)
    parser.add_argument(
        '--nofilter',
        type=bool,
        help='don\'t filter invalid examples',
        default=False)
    parser.add_argument(
        '--pretrain',
        type=bool,
        help='pretrain for TransR model',
        default=False)
    parser.add_argument(
        '--noeval',
        type=bool,
        help='whether to evaluate the result',
        default=False)
    args = parser.parse_args()
    log.info(args)
    train(args)
...
@@ -13,9 +13,9 @@
# limitations under the License.
"""
RotatE:
"RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space."
Sun, Zhiqing, et al.
https://arxiv.org/abs/1902.10197
"""
import paddle.fluid as fluid
from .Model import Model
...
@@ -34,6 +34,7 @@ class TransE(Model):
                 learning_rate,
                 args,
                 optimizer="adam"):
        self._neg_times = args.neg_times
        super(TransE, self).__init__(
            model_name="TransE",
            data_reader=data_reader,
@@ -84,6 +85,9 @@ class TransE(Model):
            fluid.layers.abs(pos_score), 1, keep_dim=False)
        neg = fluid.layers.reduce_sum(
            fluid.layers.abs(neg_score), 1, keep_dim=False)
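        # Group the neg_times negative scores that belong to each positive
        # triple before taking the margin ranking loss.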
        neg = fluid.layers.reshape(
            neg, shape=[-1, self._neg_times], inplace=True)
        loss = fluid.layers.reduce_mean(
            fluid.layers.relu(pos - neg + self._margin))
        return [loss]
...
@@ -36,6 +36,7 @@ class TransR(Model):
                 args,
                 optimizer="adam"):
        """init"""
        self._neg_times = args.neg_times
        super(TransR, self).__init__(
            model_name="TransR",
            data_reader=data_reader,
@@ -60,19 +61,19 @@ class TransR(Model):
            dtype="float32",
            name=self.rel_name,
            default_initializer=fluid.initializer.Xavier())
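        # Initialize every relation's projection matrix to a flattened identity
        # matrix; this trick is needed for good hit@10 performance (previously
        # done with a post-hoc assign, now via the parameter initializer).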
        init_values = np.tile(
            np.identity(
                self._hidden_size, dtype="float32").reshape(-1),
            (self._relation_total, 1))
        transfer_matrix = fluid.layers.create_parameter(
            shape=[
                self._relation_total, self._hidden_size * self._hidden_size
            ],
            dtype="float32",
            name=self._prefix + "transfer_matrix",
            default_initializer=fluid.initializer.NumpyArrayInitializer(
                init_values))
        return entity_embedding, relation_embedding, transfer_matrix
    def score_with_l2_normalize(self, head, rel, tail):
@@ -111,7 +112,7 @@ class TransR(Model):
        pos_head_trans = self.matmul_with_expend_dims(pos_head, rel_matrix)
        pos_tail_trans = self.matmul_with_expend_dims(pos_tail, rel_matrix)
        trans_neg = True
        if trans_neg:
            rel_matrix_neg = fluid.layers.reshape(
                lookup_table(self.train_neg_input[:, 1], transfer_matrix),
@@ -133,6 +134,9 @@ class TransR(Model):
            fluid.layers.abs(pos_score), -1, keep_dim=False)
        neg = fluid.layers.reduce_sum(
            fluid.layers.abs(neg_score), -1, keep_dim=False)
        neg = fluid.layers.reshape(
            neg, shape=[-1, self._neg_times], inplace=True)
        loss = fluid.layers.reduce_mean(
            fluid.layers.relu(pos - neg + self._margin))
        return [loss]
...
@@ -56,3 +56,64 @@ def lookup_table_gather(index, input):
    :return:
    """
    return fluid.layers.gather(index=index, input=input, overwrite=False)
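
# The helpers below clone one persistable variable into a scratch program and
# run a `load` op on it, so a single saved parameter file can be loaded into a
# given program (used by main.py to warm-start TransR from TransE parameters).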
def _clone_var_in_block_(block, var):
    assert isinstance(var, fluid.Variable)
    if var.desc.type() == fluid.core.VarDesc.VarType.LOD_TENSOR:
        return block.create_var(
            name=var.name,
            shape=var.shape,
            dtype=var.dtype,
            type=var.type,
            lod_level=var.lod_level,
            persistable=True)
    else:
        return block.create_var(
            name=var.name,
            shape=var.shape,
            dtype=var.dtype,
            type=var.type,
            persistable=True)


def load_var(executor, main_program=None, var=None, filename=None):
    """
    load_var to certain program
    :param executor: executor
    :param main_program: the program to load into
    :param var: the variable name in main_program
    :param filename: the name of the file to load
    :return: None
    """
    load_prog = fluid.Program()
    load_block = load_prog.global_block()
    if main_program is None:
        main_program = fluid.default_main_program()
    if not isinstance(main_program, fluid.Program):
        raise TypeError("program should be as Program type or None")
    vars = list(filter(None, main_program.list_vars()))
    # save origin param shape
    orig_para_shape = {}
    load_var_map = {}
    for each_var in vars:
        if each_var.name != var:
            continue
        assert isinstance(each_var, fluid.Variable)
        if each_var.type == fluid.core.VarDesc.VarType.RAW:
            continue
        if isinstance(each_var, fluid.framework.Parameter):
            orig_para_shape[each_var.name] = tuple(each_var.desc.get_shape())
        new_var = _clone_var_in_block_(load_block, each_var)
        if filename is not None:
            load_block.append_op(
                type='load',
                inputs={},
                outputs={'Out': [new_var]},
                attrs={'file_path': filename})
    executor.run(load_prog)
@@ -65,12 +65,16 @@ def mp_reader_mapper(reader, func, num_works=4):
        all_process.append(p)
    data_iter = reader()
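    # Support both Python 2 (.next) and Python 3 (.__next__) iterators.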
    if not hasattr(data_iter, "__next__"):
        __next__ = data_iter.next
    else:
        __next__ = data_iter.__next__

    def next_data():
        """next_data"""
        _next = None
        try:
            _next = __next__()
        except StopIteration:
            # log.debug(traceback.format_exc())
            pass
...
device=3
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransE \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=1024 \
--learning_rate=0.001 \
--epoch 200 \
--evaluate_per_iteration 200 \
--sample_workers 1 \
--margin 1.0 \
--nofilter True \
--neg_times 10 \
--neg_mode True
#--only_evaluate
# TransE FB15k
# -----Raw-Average-Results
# MeanRank: 214.94, MRR: 0.2051, Hits@1: 0.0929, Hits@3: 0.2343, Hits@10: 0.4458
# -----Filter-Average-Results
# MeanRank: 74.41, MRR: 0.3793, Hits@1: 0.2351, Hits@3: 0.4538, Hits@10: 0.6570
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransE \
--data_dir ./data/WN18 \
--optimizer adam \
--batch_size=1024 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 1 \
--margin 4 \
--nofilter True \
--neg_times 10 \
--neg_mode True
# TransE WN18
# -----Raw-Average-Results
# MeanRank: 219.08, MRR: 0.3383, Hits@1: 0.0821, Hits@3: 0.5233, Hits@10: 0.7997
# -----Filter-Average-Results
# MeanRank: 207.72, MRR: 0.4631, Hits@1: 0.1349, Hits@3: 0.7708, Hits@10: 0.9315
# for pretrain
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransE \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 30 \
--evaluate_per_iteration 30 \
--sample_workers 1 \
--margin 2.0 \
--nofilter True \
--noeval True \
--neg_times 10 \
--neg_mode True && \
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransR \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 200 \
--evaluate_per_iteration 200 \
--sample_workers 1 \
--margin 2.0 \
--pretrain True \
--nofilter True \
--neg_times 10 \
--neg_mode True
# FB15k TransR 200, pretrain 20
# -----Raw-Average-Results
# MeanRank: 303.81, MRR: 0.1931, Hits@1: 0.0920, Hits@3: 0.2109, Hits@10: 0.4181
# -----Filter-Average-Results
# MeanRank: 156.30, MRR: 0.3663, Hits@1: 0.2318, Hits@3: 0.4352, Hits@10: 0.6231
# for pretrain
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransE \
--data_dir ./data/WN18 \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 30 \
--evaluate_per_iteration 30 \
--sample_workers 1 \
--margin 4.0 \
--nofilter True \
--noeval True \
--neg_times 10 \
--neg_mode True && \
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model TransR \
--data_dir ./data/WN18 \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 1 \
--margin 4.0 \
--pretrain True \
--nofilter True \
--neg_times 10 \
--neg_mode True
# TransR WN18 100, pretrain 30
# -----Raw-Average-Results
# MeanRank: 321.41, MRR: 0.3706, Hits@1: 0.0955, Hits@3: 0.5906, Hits@10: 0.8099
# -----Filter-Average-Results
# MeanRank: 309.15, MRR: 0.5126, Hits@1: 0.1584, Hits@3: 0.8601, Hits@10: 0.9409
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model RotatE \
--data_dir ./data/FB15k \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 10 \
--margin 8 \
--neg_times 10 \
--neg_mode True
# RotatE FB15k
# -----Raw-Average-Results
# MeanRank: 156.85, MRR: 0.2699, Hits@1: 0.1615, Hits@3: 0.3031, Hits@10: 0.5006
# -----Filter-Average-Results
# MeanRank: 53.35, MRR: 0.4776, Hits@1: 0.3537, Hits@3: 0.5473, Hits@10: 0.7062
CUDA_VISIBLE_DEVICES=$device \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python main.py \
--use_cuda \
--model RotatE \
--data_dir ./data/WN18 \
--optimizer adam \
--batch_size=512 \
--learning_rate=0.001 \
--epoch 100 \
--evaluate_per_iteration 100 \
--sample_workers 10 \
--margin 6 \
--neg_times 10 \
--neg_mode True
# RotatE WN18
# -----Raw-Average-Results
# MeanRank: 167.27, MRR: 0.6025, Hits@1: 0.4764, Hits@3: 0.6880, Hits@10: 0.8298
# -----Filter-Average-Results
# MeanRank: 155.23, MRR: 0.9145, Hits@1: 0.8843, Hits@3: 0.9412, Hits@10: 0.9570