# Sequence Semantic Retrieval Model

## Introduction
News recommendation differs from traditional systems that recommend entertainment items such as movies or music, and it raises several new problems.
- User profile features are very sparse: a user may log in to a news recommendation app anonymously, and is likely to read a news item that has only just been published.
- News items appear and disappear very quickly compared with movies or music. Usually, thousands of news items are generated in a news recommendation app. News is also consumed quickly, since users care about recent events.
- User interests may change frequently in the news recommendation setting. The content of a news item strongly affects users' reading behavior, even if its category does not belong to their long-term interests. In news recommendation, reading behavior is determined by both the short-term and the long-term interests of users.

[GRU4Rec](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/gru4rec) models a user's short-term and long-term interests by applying a gated recurrent unit (GRU) to the user's reading history. The generalization ability of recurrent neural networks captures similarities between users' reading sequences, which alleviates the user profile sparsity problem. However, GRU4Rec as described in its paper operates on a closed domain of items: the model predicts which item a user will be interested in by classification over a fixed item set. In news recommendation, the set of news items changes over time, so a GRU4Rec model cannot predict items that do not exist in the training dataset.
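
To make "applying a GRU to the reading history" concrete, here is a minimal NumPy sketch of a GRU that folds a sequence of item embeddings into one user vector. It is an illustration only and not the code in `train.py`: the function name `gru_encode`, the parameter layout, the dimensions and the random weights are all assumptions, and bias terms are omitted for brevity.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_encode(history, params):
    """Run a bias-free GRU over item embeddings and return the last hidden state.

    history: array of shape (seq_len, emb_dim), one row per item read.
    params:  (Wz, Uz, Wr, Ur, Wh, Uh); W* are (hidden, emb_dim), U* are (hidden, hidden).
    """
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h = np.zeros(Uz.shape[0])
    for x in history:
        z = sigmoid(Wz @ x + Uz @ h)              # update gate
        r = sigmoid(Wr @ x + Ur @ h)              # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
        h = (1.0 - z) * h + z * h_tilde           # blend old state and candidate
    return h


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
emb_dim, hidden_dim = 4, 3
params = [rng.normal(scale=0.1, size=(hidden_dim, d))
          for d in (emb_dim, hidden_dim) * 3]
history = rng.normal(size=(5, emb_dim))  # a user who read five items
user_vec = gru_encode(history, params)
```

The final hidden state summarizes the whole sequence, so users with similar reading sequences end up with similar vectors even when no profile features are available.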

The Sequence Semantic Retrieval (SSR) model shares a similar idea with Multi-Rate Deep Learning for Temporal Recommendation (SIGIR 2016). It has two components: a matching model and a retrieval part.
- The idea of SSR is to model a user's personalized interest in an item through a matching model structure, so that the representation of a news item can be computed online even if the item does not exist in the training dataset.
- With the representations of news items, we can build a vector indexing service online for news prediction; this is the retrieval part of SSR.
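
The two parts above can be sketched in a few lines of NumPy. This is a hedged illustration, not the project's implementation: it assumes the matching score is cosine similarity between a user vector and item vectors, and it uses a brute-force scan as a stand-in for a real vector indexing service. The helper names `l2_normalize` and `retrieve_top_k` and the toy vectors are hypothetical.

```python
import numpy as np


def l2_normalize(v, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)


def retrieve_top_k(user_vec, item_vecs, k):
    """Return the indices and cosine scores of the k best-matching items."""
    scores = l2_normalize(item_vecs) @ l2_normalize(user_vec)
    top = np.argsort(-scores)[:k]  # highest score first
    return top, scores[top]


# Toy item representations; a new item only needs its own vector to be
# retrievable, it does not have to appear in the training data.
item_vecs = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [1.0, 1.0],
                      [-1.0, 0.0]])
user_vec = np.array([2.0, 0.1])

top, scores = retrieve_top_k(user_vec, item_vecs, k=2)
print(top.tolist())  # [0, 2]
```

A production setup would typically replace the linear scan with an approximate nearest-neighbor index, since the number of candidate news items can be large.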

## Dataset
Dataset preprocessing follows the method of the [GRU4Rec project](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleRec/gru4rec); reuse the data preprocessing scripts from that project.

## Training

The command-line options for training can be listed with `python train.py -h`.

Single-machine, single-GPU training
``` bash
CUDA_VISIBLE_DEVICES=0 python train.py --train_dir train_data --use_cuda 1 --batch_size 50 --model_dir model_output
```

Single-machine CPU training
``` bash
python train.py --train_dir train_data --use_cuda 0 --batch_size 50 --model_dir model_output
```

Single-machine, multi-GPU training
``` bash
CUDA_VISIBLE_DEVICES=0,1 python train.py --train_dir train_data --use_cuda 1 --parallel 1 --batch_size 50 --model_dir model_output --num_devices 2
```

Single-machine CPU training with multiple CPU devices
``` bash
CPU_NUM=10 python train.py --train_dir train_data --use_cuda 0 --parallel 1 --batch_size 50 --model_dir model_output --num_devices 10
```

Simulating multi-machine training locally
``` bash
sh cluster_train.sh
```

## Inference

GPU inference
``` bash
CUDA_VISIBLE_DEVICES=0 python infer.py --test_dir test_data --use_cuda 1 --batch_size 50 --model_dir model_output
```