Commit 33ea7c74, author: frankwhzhang

update api & check version

Parent: cc1167d7
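The check this commit adds to each entry point follows a simple pattern: validate the installed framework version once, before any argument parsing or training, and exit with a clear log message if it is too old. The following is a minimal, framework-free sketch of that pattern, not the code from the commit. `parse_version` and `require_version` are hypothetical stand-ins for `fluid.require_version`, the installed version is passed in as a string purely for illustration, and develop builds are not modeled.

```python
import logging
import sys

logger = logging.getLogger("version_check")


def parse_version(version, width=4):
    """Turn a dotted version string such as '1.6.0' into a zero-padded tuple of ints."""
    parts = [int(part) for part in version.split(".")]
    # Pad so that '1.6' and '1.6.0' compare as equal.
    return tuple(parts + [0] * (width - len(parts)))


def require_version(installed, min_version):
    """Hypothetical stand-in for fluid.require_version: raise if `installed` is too old."""
    if parse_version(installed) < parse_version(min_version):
        raise RuntimeError(
            "installed version %s is older than required %s"
            % (installed, min_version))


def check_version(installed, min_version="1.6.0"):
    """Log an error and exit with status 1 when the version requirement is not met."""
    try:
        require_version(installed, min_version)
    except RuntimeError:
        logger.error("PaddlePaddle version %s or higher is required.", min_version)
        sys.exit(1)
```

Calling `check_version` as the first statement under `if __name__ == "__main__":`, as the hunks below do, means an unsupported installation fails immediately rather than partway through training.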
@@ -44,6 +44,10 @@ Session-based recommendation has a wide range of application scenarios, such as users' product browsing, ...
 To run the sample program, you can skip the 'RSC15 Data Download and Preprocessing' section.
+**PaddlePaddle 1.6 or later, or a suitable develop version, is required.**
 ## RSC15 Data Download and Preprocessing
 Run the command to download the dataset from the RSC15 official site
......
@@ -71,6 +71,7 @@ def infer(test_reader, use_cuda, model_path):
 if __name__ == "__main__":
+    utils.check_version()
     args = parse_args()
     start_index = args.start_index
     last_index = args.last_index
......
@@ -84,6 +84,7 @@ def infer(args, vocab_size, test_reader, use_cuda):
 if __name__ == "__main__":
+    utils.check_version()
     args = parse_args()
     start_index = args.start_index
     last_index = args.last_index
......
@@ -169,4 +169,5 @@ def get_device(args):
 if __name__ == "__main__":
+    utils.check_version()
     train()
@@ -128,4 +128,5 @@ def train():
 if __name__ == "__main__":
+    utils.check_version()
     train()
@@ -115,6 +115,20 @@ def prepare_data(file_dir,
         file_dir, buffer_size, data_type=DataType.SEQ), batch_size)
     return vocab_size, reader
+def check_version():
+    """
+    Log error and exit when the installed version of paddlepaddle is
+    not satisfied.
+    """
+    err = "PaddlePaddle version 1.6 or higher is required, " \
+          "or a suitable develop version is satisfied as well. \n" \
+          "Please make sure the version is good with your code."
+    try:
+        fluid.require_version('1.6.0')
+    except Exception as e:
+        logger.error(err)
+        sys.exit(1)
 def sort_batch(reader, batch_size, sort_group_size, drop_last=False):
     """
......
@@ -3,6 +3,9 @@
 ## Introduction
 In a personalized recommendation scenario, a user is often provided with several items by a personalized interest matching model. In real-world applications, a user may have multiple views of features, such as user ID, age, click history of items, and search queries. An item, e.g. a news article, may also have multiple views of features, such as news title, news category, images in the news, and so on. Multi-view Simnet is a matching model that combines users' and items' multiple views of features into one unified model. The model can be used in many industrial products, such as Baidu's feed news. The model is adapted from the paper A Multi-View Deep Learning (MV-DNN) Approach for Cross Domain User Modeling in Recommendation Systems, WWW 2015. The difference between our model and MV-DNN is that we also consider multiple feature views of users.
+**All models in PaddleRec now require PaddlePaddle version 1.6 or higher, or a suitable develop version.**
 ## Dataset
 Currently, a synthetic dataset is provided as a proof of concept, and we aim to add more real-world datasets to this project in the future. Results are inaccurate because the dataset is synthetic.
......
@@ -31,6 +31,23 @@ logger = logging.getLogger("fluid")
 logger.setLevel(logging.INFO)
+def check_version():
+    """
+    Log error and exit when the installed version of paddlepaddle is
+    not satisfied.
+    """
+    err = "PaddlePaddle version 1.6 or higher is required, " \
+          "or a suitable develop version is satisfied as well. \n" \
+          "Please make sure the version is good with your code."
+    try:
+        fluid.require_version('1.6.0')
+    except Exception as e:
+        logger.error(err)
+        sys.exit(1)
 def parse_args():
     parser = argparse.ArgumentParser("multi-view simnet")
     parser.add_argument("--train_file", type=str, help="Training file")
@@ -116,4 +133,5 @@ def main():
 if __name__ == "__main__":
+    check_version()
     main()
@@ -88,6 +88,21 @@ def parse_args():
     return parser.parse_args()
+def check_version():
+    """
+    Log error and exit when the installed version of paddlepaddle is
+    not satisfied.
+    """
+    err = "PaddlePaddle version 1.6 or higher is required, " \
+          "or a suitable develop version is satisfied as well. \n" \
+          "Please make sure the version is good with your code."
+    try:
+        fluid.require_version('1.6.0')
+    except Exception as e:
+        logger.error(err)
+        sys.exit(1)
 def start_train(args):
     if args.enable_ce:
         SEED = 102
@@ -170,4 +185,5 @@ def main():
 if __name__ == "__main__":
+    check_version()
     main()
@@ -12,6 +12,10 @@ Sequence Semantic Retrieval(SSR) Model shares the similar idea with Multi-Rate D
 - The idea of SSR is to model a user's personalized interest in an item through a matching model structure; the representation of a news item can be computed online even if the news item does not exist in the training dataset.
 - With the representations of news items, we can build an online vector indexing service for news prediction; this is the retrieval part of SSR.
+## Version
+**All models in PaddleRec now require PaddlePaddle version 1.6 or higher, or a suitable develop version.**
 ## Dataset
 Dataset preprocessing follows the method of the [GRU4Rec Project](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/gru4rec). Note that you should reuse the scripts from the GRU4Rec project for data preprocessing.
......
@@ -120,6 +120,7 @@ def infer(args, vocab_size, test_reader):
 if __name__ == "__main__":
+    utils.check_version()
     args = parse_args()
     start_index = args.start_index
     last_index = args.last_index
......
@@ -165,4 +165,5 @@ def main():
 if __name__ == "__main__":
+    utils.check_version()
     main()
@@ -30,6 +30,20 @@ def construct_test_data(file_dir, vocab_path, batch_size):
     test_reader = fluid.io.batch(y_data.test(files), batch_size=batch_size)
     return test_reader, vocab_size
+def check_version():
+    """
+    Log error and exit when the installed version of paddlepaddle is
+    not satisfied.
+    """
+    err = "PaddlePaddle version 1.6 or higher is required, " \
+          "or a suitable develop version is satisfied as well. \n" \
+          "Please make sure the version is good with your code."
+    try:
+        fluid.require_version('1.6.0')
+    except Exception as e:
+        logger.error(err)
+        sys.exit(1)
 def infer_data(raw_data, place):
     data = [dat[0] for dat in raw_data]
......
@@ -26,6 +26,7 @@ For an introduction to the TagSpace model, see the paper [#TagSpace: Semantic Embeddings from Ha
 The TagSpace model learns embedding representations of text and tags and is applied to industrial-scale tag recommendation; one concrete application is tag recommendation for feed news.
+**All models in PaddleRec now require PaddlePaddle version 1.6 or higher, or a suitable develop version.**
 ## Data Download and Preprocessing
......
@@ -71,6 +71,7 @@ def infer(test_reader, vocab_tag, use_cuda, model_path, epoch):
 if __name__ == "__main__":
+    utils.check_version()
     args = parse_args()
     start_index = args.start_index
     last_index = args.last_index
......
@@ -168,4 +168,5 @@ def get_device(args):
 if __name__ == "__main__":
+    utils.check_version()
     train()
@@ -29,6 +29,21 @@ def get_vocab_size(vocab_path):
         line = rf.readline()
         return int(line.strip())
+def check_version():
+    """
+    Log error and exit when the installed version of paddlepaddle is
+    not satisfied.
+    """
+    err = "PaddlePaddle version 1.6 or higher is required, " \
+          "or a suitable develop version is satisfied as well. \n" \
+          "Please make sure the version is good with your code."
+    try:
+        fluid.require_version('1.6.0')
+    except Exception as e:
+        logger.error(err)
+        sys.exit(1)
 def prepare_data(file_dir,
                  vocab_text_path,
......
@@ -20,6 +20,7 @@
 ## Introduction
 This example implements the word2vector model in skip-gram mode.
+**All models in the model library currently require PaddlePaddle 1.6 or later, or a suitable develop version.**
 ## Data Download
 The full dataset comes from the 1 Billion Word Language Model Benchmark (http://www.statmt.org/lm-benchmark).
......
@@ -185,6 +185,7 @@ def infer_step(args, vocab_size, test_reader, use_cuda, i2w):
 if __name__ == "__main__":
+    utils.check_version()
     args = parse_args()
     start_index = args.start_index
     last_index = args.last_index
......
@@ -224,5 +224,6 @@ def train(args):
 if __name__ == '__main__':
+    utils.check_version()
     args = parse_args()
     train(args)
@@ -25,6 +25,20 @@ def prepare_data(file_dir, dict_path, batch_size):
     reader = fluid.io.batch(test(file_dir, w2i), batch_size)
     return vocab_size, reader, i2w
+def check_version():
+    """
+    Log error and exit when the installed version of paddlepaddle is
+    not satisfied.
+    """
+    err = "PaddlePaddle version 1.6 or higher is required, " \
+          "or a suitable develop version is satisfied as well. \n" \
+          "Please make sure the version is good with your code."
+    try:
+        fluid.require_version('1.6.0')
+    except Exception as e:
+        logger.error(err)
+        sys.exit(1)
 def native_to_unicode(s):
     if _is_unicode(s):
......