<html>
<head>
  <script type="text/x-mathjax-config">
  MathJax.Hub.Config({
    extensions: ["tex2jax.js", "TeX/AMSsymbols.js", "TeX/AMSmath.js"],
    jax: ["input/TeX", "output/HTML-CSS"],
    tex2jax: {
      inlineMath: [ ['$','$'] ],
      displayMath: [ ['$$','$$'] ],
      processEscapes: true
    },
    "HTML-CSS": { availableFonts: ["TeX"] }
  });
  </script>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js" async></script>
  <script type="text/javascript" src="../.tools/theme/marked.js">
  </script>
  <link href="http://cdn.bootcss.com/highlight.js/9.9.0/styles/darcula.min.css" rel="stylesheet">
  <script src="http://cdn.bootcss.com/highlight.js/9.9.0/highlight.min.js"></script>
  <link href="http://cdn.bootcss.com/bootstrap/4.0.0-alpha.6/css/bootstrap.min.css" rel="stylesheet">
  <link href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" rel="stylesheet">
  <link href="../.tools/theme/github-markdown.css" rel='stylesheet'>
</head>
<style type="text/css" >
.markdown-body {
    box-sizing: border-box;
    min-width: 200px;
    max-width: 980px;
    margin: 0 auto;
    padding: 45px;
}
</style>


<body>

<div id="context" class="container-fluid markdown-body">
</div>

<!-- This block will be replaced by each markdown file content. Please do not change lines below.-->
<div id="markdown" style='display:none'>
# Personalized Recommendation

The source code of this tutorial is in [book/recommender_system](https://github.com/PaddlePaddle/book/tree/develop/05.recommender_system).

For instructions on getting started with PaddlePaddle, see [PaddlePaddle installation guide](https://github.com/PaddlePaddle/book/blob/develop/README.md#running-the-book).


## Background

With the fast growth of e-commerce, online video, and online reading businesses, users have to rely on recommender systems to avoid manually browsing a tremendous volume of choices.  Recommender systems understand users' interests by mining user behavior and other properties of users and products.

Some well-known approaches include:

- User behavior-based approach.  A well-known method is collaborative filtering. The underlying assumption is that if a person A has the same opinion as a person B on an issue, A is more likely to have B's opinion on a different issue than that of a randomly chosen person.

- Content-based recommendation[[1](#reference)]. This approach infers feature vectors that represent products from their descriptions.  It also infers feature vectors that represent users' interests.  Then it measures the relevance of users and products by a distance between these feature vectors.

- Hybrid approach[[2](#reference)]. This approach uses content-based information to help address the cold start problem[[6](#reference)] in the behavior-based approach.

Among these options, collaborative filtering might be the most studied one.  Some of its variants include user-based[[3](#reference)], item-based [[4](#reference)], social network based[[5](#reference)], and model-based.
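
As a concrete illustration of the collaborative filtering assumption, the following minimal sketch predicts a rating as a similarity-weighted average of other users' ratings. The toy `ratings` table and the helper names `cosine` and `predict` are made up for this example. It shows one simple user-based variant only, and it is not the model trained later in this tutorial.

```python
import math

# Hypothetical toy ratings: user -> {movie: rating}.
ratings = {
    'A': {'m1': 5.0, 'm2': 1.0, 'm3': 4.0},
    'B': {'m1': 5.0, 'm2': 1.0, 'm3': 5.0, 'm4': 4.0},
    'C': {'m1': 1.0, 'm2': 5.0, 'm4': 1.0},
}


def cosine(u, v):
    """Cosine similarity of two users over the movies both have rated."""
    common = set(u).intersection(v)
    if not common:
        return 0.0
    dot = sum(u[m] * v[m] for m in common)
    norm_u = math.sqrt(sum(u[m] ** 2 for m in common))
    norm_v = math.sqrt(sum(v[m] ** 2 for m in common))
    return dot / (norm_u * norm_v)


def predict(user, movie):
    """Similarity-weighted average of other users' ratings for `movie`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[movie]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None


# A agrees with B on m1, m2 and m3, so B's opinion of m4 dominates.
prediction = predict('A', 'm4')
print(prediction)
```

Because user A's past ratings resemble B's much more than C's, the predicted rating for `m4` lands closer to B's 4.0 than to C's 1.0.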

This tutorial explains a deep learning based approach and how to implement it using PaddlePaddle.  We will train a model using a dataset that includes user information, movie information, and ratings.  Once we train the model, we will be able to get a predicted rating given a pair of user and movie IDs.


## Model Overview

To learn more about deep learning based recommendation, let us start by going over the YouTube recommender system[[7](#reference)] before introducing our hybrid model.


### YouTube's Deep Learning Recommendation Model

YouTube is a video-sharing website with one of the largest user bases in the world.  Its recommender system serves more than a billion users.  This system is composed of two major parts: candidate generation and ranking.  The former selects a few hundred candidates from millions of videos, and the latter ranks them and outputs the top 10.

<p align="center">
<img src="image/YouTube_Overview.en.png" width="70%" ><br/>
Figure 1. YouTube recommender system overview.
</p>

#### Candidate Generation Network

YouTube models candidate generation as a multiclass classification problem with a huge number of classes, equal to the number of videos.  The architecture of the model is as follows:

<p align="center">
<img src="image/Deep_candidate_generation_model_architecture.en.png" width="70%" ><br/>
Figure 2. Deep candidate generation model.
</p>

The first stage of this model maps watching history and search queries into fixed-length representative features.  Then, an MLP (multi-layer perceptron, as described in the [Recognize Digits](https://github.com/PaddlePaddle/book/blob/develop/recognize_digits/README.md) tutorial) takes the concatenation of all representative vectors.  The output of the MLP represents the user's *intrinsic interests*.  At training time, it is used together with a softmax output layer to minimize the classification error.  At serving time, it is used to compute the relevance of the user to all videos.

For a user $U$, the predicted watching probability of video $i$ is

$$P(\omega=i|u)=\frac{e^{v_{i}u}}{\sum_{j \in V}e^{v_{j}u}}$$

where $u$ is the representative vector of user $U$, $V$ is the corpus of all videos, $v_i$ is the representative vector of the $i$-th video. $u$ and $v_i$ are vectors of the same length, so we can compute their dot product using a fully connected layer.

This model could have a performance issue because the softmax output covers millions of classification labels.  To optimize performance, at training time the authors down-sample negative samples, so the actual number of classes is reduced to thousands.  At serving time, the authors skip the normalization of the softmax outputs, because the results are only used for ranking.
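
To see why skipping the normalization is safe at serving time, the following minimal sketch computes $P(\omega=i|u)$ for a tiny corpus and checks that sorting by the raw dot products $v_i u$ gives the same order as sorting by the softmax probabilities. The vectors are made-up toy values and the helpers `dot` and `softmax` are written for this example. This is plain Python for illustration, not YouTube's implementation.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data: one user vector u and three video vectors v_i.
u = [0.5, -1.0, 2.0]
videos = [[1.0, 0.0, 0.5], [0.2, 0.3, -0.1], [-0.5, -1.0, 1.0]]

scores = [dot(v, u) for v in videos]  # unnormalized relevance v_i . u
probs = softmax(scores)               # P(w = i | u)

# Rank video indices from most to least relevant, both ways.
rank_by_score = sorted(range(len(videos)), key=lambda i: -scores[i])
rank_by_prob = sorted(range(len(videos)), key=lambda i: -probs[i])
print(rank_by_score, rank_by_prob)
```

The softmax is a monotonic transformation of the scores, so the two rankings always agree and the denominator only matters when actual probabilities are needed.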

#### Ranking Network

The architecture of the ranking network is similar to that of the candidate generation network.  Similar to ranking models widely used in online advertising, it uses rich features like video ID, last watching time, etc.  The output layer of the ranking network is a weighted logistic regression, which rates all candidate videos.

### Hybrid Model

In this section, we introduce our movie recommendation system. In particular, we feed movie titles into a text convolution network to get a fixed-length representative feature vector. Accordingly, we first introduce the convolutional neural network for texts and then the hybrid recommendation model.

#### Convolutional Neural Networks for Texts (CNN)

**Convolutional Neural Networks** are frequently applied to data with grid-like topology such as two-dimensional images and one-dimensional texts. A CNN can extract multiple local features, combine them, and produce high-level abstractions, which correspond to semantic understanding. Empirically, CNN is shown to be efficient for image and text modeling.

A CNN mainly consists of convolution and pooling operations, which can be combined in various ways for different applications. Here, we briefly describe the CNN shown in Figure 3.


<p align="center">
<img src="image/text_cnn_en.png" width = "80%" align="center"/><br/>
Figure 3. CNN for text modeling.
</p>

Let $n$ be the length of the sentence to process, and let the $i$-th word have embedding $x_i\in\mathbb{R}^k$, where $k$ is the embedding dimensionality.

First, we form windows of $h$ consecutive words by concatenating their embeddings. Each window is denoted as $x_{i:i+h-1}$, consisting of $x_{i},x_{i+1},\ldots,x_{i+h-1}$, where $x_i$ is the first word in the window and $i$ ranges from $1$ to $n-h+1$, so $x_{i:i+h-1}\in\mathbb{R}^{hk}$.

Next, we apply the convolution operation: we apply the kernel $w\in\mathbb{R}^{hk}$ to each window, extracting features $c_i=f(w\cdot x_{i:i+h-1}+b)$, where $b\in\mathbb{R}$ is the bias and $f$ is a non-linear activation function such as the sigmoid. Convolving with the kernel at every window $x_{1:h},x_{2:h+1},\ldots,x_{n-h+1:n}$ produces a feature map in the following form:

$$c=[c_1,c_2,\ldots,c_{n-h+1}], c \in \mathbb{R}^{n-h+1}$$

Next, we apply *max pooling* over time to represent the whole sentence by a single value $\hat c$, which is the maximum element of the feature map:

$$\hat c=\max(c)$$
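
The convolution and max pooling steps can be sketched in a few lines of plain Python. This is a minimal illustration with a single kernel and made-up toy embeddings. The function name `text_conv_max_pool` is hypothetical, the sigmoid is chosen as the activation $f$, and the sentence is assumed to have at least $h$ words.

```python
import math


def sigmoid(z):
    """Logistic activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def text_conv_max_pool(embeddings, w, b, h):
    """Slide one kernel over windows of h words, then max-pool over time.

    embeddings: list of n word vectors, each of length k
    w: flat kernel of length h * k
    Returns the feature map c (length n - h + 1) and its maximum.
    """
    n = len(embeddings)
    feature_map = []
    for i in range(n - h + 1):
        # x_{i:i+h-1}: concatenation of h consecutive word vectors.
        window = [x for vec in embeddings[i:i + h] for x in vec]
        z = sum(wi * xi for wi, xi in zip(w, window)) + b
        feature_map.append(sigmoid(z))
    return feature_map, max(feature_map)


# Toy sentence with n = 4 words, embedding size k = 2, window size h = 2.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [0.5, -0.5, 1.0, 1.0]
feature_map, pooled = text_conv_max_pool(sentence, kernel, b=0.0, h=2)
print(feature_map, pooled)
```

Whatever the sentence length $n$, the pooled output of one kernel is a single number, which is why stacking several kernels yields a fixed-length vector.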

#### Model Structure of the Hybrid Model

In our network, the input includes features of users and movies.  The user feature includes four properties: user ID, gender, occupation, and age.  Movie features include their IDs, genres, and titles.

We use fully-connected layers to map user features into representative feature vectors and concatenate them.  Movie features are processed in a similar way, except for movie titles: we feed titles into a text convolution network, as described in the section above, to get a fixed-length representative feature vector.

Given the feature vectors of users and movies, we compute the relevance using cosine similarity.  We minimize the squared error at training time.

<p align="center">
<img src="image/rec_regression_network_en.png" width="90%" ><br/>
Figure 4. A hybrid recommendation model.
</p>

## Dataset

We use the [MovieLens ml-1m](http://files.grouplens.org/datasets/movielens/ml-1m.zip) dataset to train our model.  This dataset includes about 1,000,000 ratings of roughly 4,000 movies from roughly 6,000 users.  Each rating is an integer from 1 to 5.  Thanks to GroupLens Research for collecting, processing and publishing the dataset.

The `paddle.v2.dataset` package wraps several public datasets, including `cifar`, `imdb`, `mnist`, `movielens`, and `wmt14`, so there is no need to download and preprocess the MovieLens dataset manually.

The raw MovieLens data contains movie ratings together with features of both movies and users.
For instance, one movie's features could be:

```python
import paddle.v2 as paddle

movie_info = paddle.dataset.movielens.movie_info()
# Print the features of one movie.
print(list(movie_info.values())[0])
```

```text
<MovieInfo id(1), title(Toy Story), categories(['Animation', "Children's", 'Comedy'])>
```

One user's feature could be:

```python
user_info = paddle.dataset.movielens.user_info()
# Print the features of one user.
print(list(user_info.values())[0])
```

```text
<UserInfo id(1), gender(F), age(1), job(10)>
```

In this dataset, each user's age is recorded as one of the following group codes:

```text
1: "Under 18"
18: "18-24"
25: "25-34"
35: "35-44"
45: "45-49"
50: "50-55"
56: "56+"
```

Each user's occupation is one of the following options:

```text
0: "other" or not specified
1: "academic/educator"
2: "artist"
3: "clerical/admin"
4: "college/grad student"
5: "customer service"
6: "doctor/health care"
7: "executive/managerial"
8: "farmer"
9: "homemaker"
10: "K-12 student"
11: "lawyer"
12: "programmer"
13: "retired"
14: "sales/marketing"
15: "scientist"
16: "self-employed"
17: "technician/engineer"
18: "tradesman/craftsman"
19: "unemployed"
20: "writer"
```

Each record consists of three main components: user features, movie features and movie ratings.
Likewise, as a simple example, consider the following:

```python
train_set_creator = paddle.dataset.movielens.train()
train_sample = next(train_set_creator())
uid = train_sample[0]
# The movie ID comes right after the user features in each record.
mov_id = train_sample[len(user_info[uid].value())]
print("User %s rates Movie %s with Score %s" % (
    user_info[uid], movie_info[mov_id], train_sample[-1]))
```

```text
User <UserInfo id(1), gender(F), age(1), job(10)> rates Movie <MovieInfo id(1193), title(One Flew Over the Cuckoo's Nest), categories(['Drama'])> with Score [5.0]
```

The output shows that user 1 gave movie `1193` a rating of 5.

Running `python train.py` starts training immediately. The following sections walk through how it works.

## Model Architecture

### Initialize PaddlePaddle

First, we must import and initialize PaddlePaddle (enable/disable GPU, set the number of trainers, etc).

```python
import paddle.v2 as paddle
paddle.init(use_gpu=False)
```

### Model Configuration

```python
uid = paddle.layer.data(
    name='user_id',
    type=paddle.data_type.integer_value(
        paddle.dataset.movielens.max_user_id() + 1))
usr_emb = paddle.layer.embedding(input=uid, size=32)
usr_fc = paddle.layer.fc(input=usr_emb, size=32)

usr_gender_id = paddle.layer.data(
    name='gender_id', type=paddle.data_type.integer_value(2))
usr_gender_emb = paddle.layer.embedding(input=usr_gender_id, size=16)
usr_gender_fc = paddle.layer.fc(input=usr_gender_emb, size=16)

usr_age_id = paddle.layer.data(
    name='age_id',
    type=paddle.data_type.integer_value(
        len(paddle.dataset.movielens.age_table)))
usr_age_emb = paddle.layer.embedding(input=usr_age_id, size=16)
usr_age_fc = paddle.layer.fc(input=usr_age_emb, size=16)

usr_job_id = paddle.layer.data(
    name='job_id',
    type=paddle.data_type.integer_value(
        paddle.dataset.movielens.max_job_id() + 1))
usr_job_emb = paddle.layer.embedding(input=usr_job_id, size=16)
usr_job_fc = paddle.layer.fc(input=usr_job_emb, size=16)
```

As shown in the code above, each user is described by four integer features: `user_id`, `gender_id`, `age_id` and `job_id`. To handle these discrete features conveniently, we borrow the embedding technique used by language models in NLP and transform them into embedding vectors `usr_emb`, `usr_gender_emb`, `usr_age_emb` and `usr_job_emb`.
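
The embedding step can be pictured as a table lookup. The following plain-Python sketch is only an illustration under stated assumptions: `make_embedding_table` and `embed` are hypothetical helpers rather than PaddlePaddle APIs, the table is filled with random numbers instead of learned weights, and raw age codes are assumed to be mapped to contiguous positions first, which is what the input size `len(paddle.dataset.movielens.age_table)` in the configuration suggests.

```python
import random


def make_embedding_table(vocab_size, dim, seed=0):
    """Build a vocab_size x dim table of small random values.

    In a real model these values are parameters learned during training.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(vocab_size)]


def embed(table, index):
    """Look up the embedding vector for one discrete id."""
    return table[index]


# Age group codes as listed in the Dataset section.
age_codes = [1, 18, 25, 35, 45, 50, 56]
age_embedding = make_embedding_table(len(age_codes), dim=16)

# Assumption: the raw code 25 is first mapped to its position in the list.
age_id = age_codes.index(25)
age_vec = embed(age_embedding, age_id)
print(age_id, len(age_vec))
```

Each discrete value thus selects one row of the table, and the rows are adjusted by training so that similar values end up with similar vectors.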

```python
usr_combined_features = paddle.layer.fc(
        input=[usr_fc, usr_gender_fc, usr_age_fc, usr_job_fc],
        size=200,
        act=paddle.activation.Tanh())
```

The per-feature vectors are then fed together into a fully-connected layer, which combines them into a single 200-dimensional user feature vector.

Furthermore, we do a similar transformation for each movie feature. The model configuration is:

```python
mov_id = paddle.layer.data(
    name='movie_id',
    type=paddle.data_type.integer_value(
        paddle.dataset.movielens.max_movie_id() + 1))
mov_emb = paddle.layer.embedding(input=mov_id, size=32)
mov_fc = paddle.layer.fc(input=mov_emb, size=32)

mov_categories = paddle.layer.data(
    name='category_id',
    type=paddle.data_type.sparse_binary_vector(
        len(paddle.dataset.movielens.movie_categories())))
mov_categories_hidden = paddle.layer.fc(input=mov_categories, size=32)

movie_title_dict = paddle.dataset.movielens.get_movie_title_dict()
mov_title_id = paddle.layer.data(
    name='movie_title',
    type=paddle.data_type.integer_value_sequence(len(movie_title_dict)))
mov_title_emb = paddle.layer.embedding(input=mov_title_id, size=32)
mov_title_conv = paddle.networks.sequence_conv_pool(
    input=mov_title_emb, hidden_size=32, context_len=3)

mov_combined_features = paddle.layer.fc(
    input=[mov_fc, mov_categories_hidden, mov_title_conv],
    size=200,
    act=paddle.activation.Tanh())
```

The movie title, a sequence of words represented by a sequence of integer word indices, is fed into a `sequence_conv_pool` layer, which applies convolution and pooling along the time dimension. Because pooling is done along the time dimension, the output is a fixed-length vector regardless of the length of the input sequence.

Finally, we use cosine similarity to measure the similarity between the user features and the movie features.

```python
inference = paddle.layer.cos_sim(a=usr_combined_features, b=mov_combined_features, size=1, scale=5)
cost = paddle.layer.square_error_cost(
        input=inference,
        label=paddle.layer.data(
            name='score', type=paddle.data_type.dense_vector(1)))
```
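
To make the arithmetic concrete, here is a minimal plain-Python sketch of a scaled cosine similarity followed by a squared error. The helper names `scaled_cos_sim` and `squared_error` and the toy vectors are made up for illustration, and the sketch does not claim to reproduce how `paddle.layer.cos_sim` is implemented. Multiplying the cosine by 5 stretches its range from $[-1, 1]$ to $[-5, 5]$, which covers the rating scale of 1 to 5.

```python
import math


def scaled_cos_sim(a, b, scale=5.0):
    """Cosine similarity of two non-zero vectors, multiplied by `scale`."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return scale * dot / (norm_a * norm_b)


def squared_error(prediction, label):
    """Squared difference between the predicted and the true rating."""
    return (prediction - label) ** 2


# Hypothetical 2-dimensional user and movie feature vectors.
user_vec = [1.0, 0.0]
movie_vec = [1.0, 1.0]

predicted_rating = scaled_cos_sim(user_vec, movie_vec)  # 5 * cos(45 degrees)
loss = squared_error(predicted_rating, 4.0)
print(predicted_rating, loss)
```

Training adjusts the layers that produce the two vectors so that this error shrinks across all rated pairs of users and movies.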

## Model Training

### Define Parameters

First, we create the model parameters from the `cost` layer configured above.

```python
# Create parameters
parameters = paddle.parameters.create(cost)
```

### Create Trainer

Before creating the trainer, we need to choose an optimization algorithm. Here we use the Adam algorithm, specified via `paddle.optimizer`.

```python
trainer = paddle.trainer.SGD(cost=cost, parameters=parameters,
                             update_equation=paddle.optimizer.Adam(learning_rate=1e-4))
```

```text
[INFO 2017-03-06 17:12:13,378 networks.py:1472] The input order is [user_id, gender_id, age_id, job_id, movie_id, category_id, movie_title, score]
[INFO 2017-03-06 17:12:13,379 networks.py:1478] The output order is [__square_error_cost_0__]
```
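
For readers unfamiliar with Adam, the sketch below applies the standard Adam update rule to a single scalar parameter. It is an illustration only: `adam_step` is a hypothetical helper, and the values of `beta1`, `beta2` and `epsilon` are the commonly cited defaults, assumed here because this tutorial only sets `learning_rate=1e-4`.

```python
import math


def adam_step(param, grad, state, learning_rate=1e-4,
              beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Apply one Adam update to a scalar parameter and return the new value.

    `state` holds the step count `t` and the moment estimates `m` and `v`.
    """
    state['t'] += 1
    # Exponential moving averages of the gradient and the squared gradient.
    state['m'] = beta1 * state['m'] + (1 - beta1) * grad
    state['v'] = beta2 * state['v'] + (1 - beta2) * grad * grad
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = state['m'] / (1 - beta1 ** state['t'])
    v_hat = state['v'] / (1 - beta2 ** state['t'])
    return param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)


state = {'t': 0, 'm': 0.0, 'v': 0.0}
weight = 1.0
weight = adam_step(weight, grad=2.0, state=state)
print(weight)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one `learning_rate` regardless of the gradient's magnitude.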

### Training

`paddle.dataset.movielens.train` yields records one at a time during each pass. We shuffle them through a buffer and group them into batches for training.

```python
reader = paddle.batch(
    paddle.reader.shuffle(
        paddle.dataset.movielens.train(), buf_size=8192),
    batch_size=256)
```
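
The reader composition above follows a decorator pattern: each function takes a reader creator and returns a new one. The plain-Python sketch below shows what buffered shuffling and batching might look like. The `shuffle` and `batch` functions here are simplified stand-ins written for illustration (including the `seed` argument and the choice to keep a final partial batch), not PaddlePaddle's implementation.

```python
import random


def shuffle(reader, buf_size, seed=None):
    """Return a reader that shuffles records within a buffer of buf_size."""
    def shuffled_reader():
        rng = random.Random(seed)
        buf = []
        for record in reader():
            buf.append(record)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                for item in buf:
                    yield item
                buf = []
        # Flush whatever is left in the buffer at the end of the pass.
        rng.shuffle(buf)
        for item in buf:
            yield item
    return shuffled_reader


def batch(reader, batch_size):
    """Return a reader that groups records into lists of batch_size."""
    def batch_reader():
        current = []
        for record in reader():
            current.append(record)
            if len(current) == batch_size:
                yield current
                current = []
        if current:  # keep the final, possibly smaller, batch
            yield current
    return batch_reader


def toy_reader():
    """Hypothetical data source yielding the integers 0..9."""
    for i in range(10):
        yield i


batches = list(batch(shuffle(toy_reader, buf_size=4, seed=0), batch_size=3)())
print(batches)
```

A larger buffer gives a more thorough shuffle at the cost of memory, since records are only mixed with others that share the buffer.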

`feeding` specifies the correspondence between the columns of each yielded record and the `paddle.layer.data` layers. For instance, the first column of the data generated by `movielens.train` corresponds to the `user_id` feature.

```python
feeding = {
    'user_id': 0,
    'gender_id': 1,
    'age_id': 2,
    'job_id': 3,
    'movie_id': 4,
    'category_id': 5,
    'movie_title': 6,
    'score': 7
}
```
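
Conceptually, such a mapping turns a positional record into named inputs. The sketch below illustrates the idea in plain Python. `feed_record` is a hypothetical helper and the record values are invented for the example, so it shows the concept only and is not how the trainer consumes `feeding` internally.

```python
feeding = {
    'user_id': 0,
    'gender_id': 1,
    'age_id': 2,
    'job_id': 3,
    'movie_id': 4,
    'category_id': 5,
    'movie_title': 6,
    'score': 7
}


def feed_record(record, feeding):
    """Map a positional record to a dict keyed by data layer name."""
    return {name: record[index] for name, index in feeding.items()}


# Hypothetical record laid out in the column order given by `feeding`.
record = (1, 1, 0, 10, 1193, [7], [12, 45, 3], [5.0])
named = feed_record(record, feeding)
print(named['user_id'], named['movie_id'], named['score'])
```

Keeping the mapping explicit means the order of data layers in the network does not have to match the column order of the dataset.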

The callback functions `event_handler` and `event_handler_plot` are called during training whenever a pre-defined event occurs.

```python
def event_handler(event):
    if isinstance(event, paddle.event.EndIteration):
        if event.batch_id % 100 == 0:
            print("Pass %d Batch %d Cost %.2f" % (
                event.pass_id, event.batch_id, event.cost))
```

```python
from paddle.v2.plot import Ploter

train_title = "Train cost"
test_title = "Test cost"
cost_ploter = Ploter(train_title, test_title)

step = 0

def event_handler_plot(event):
    global step
    if isinstance(event, paddle.event.EndIteration):
        if step % 10 == 0:  # every 10 batches, record a train cost
            cost_ploter.append(train_title, step, event.cost)

        if step % 1000 == 0: # every 1000 batches, record a test cost
            result = trainer.test(
                reader=paddle.batch(
                    paddle.dataset.movielens.test(), batch_size=256),
                feeding=feeding)
            cost_ploter.append(test_title, step, result.cost)

        if step % 100 == 0: # every 100 batches, update cost plot
            cost_ploter.plot()

        step += 1
```

Finally, we can invoke `trainer.train` to start training:

```python
trainer.train(
    reader=reader,
    event_handler=event_handler_plot,
    feeding=feeding,
    num_passes=2)
```

## Conclusion

This tutorial goes over traditional approaches to recommender systems and a deep learning based approach.  We also show how to train and use the model with PaddlePaddle.  Deep learning has been widely used in computer vision and NLP, and we look forward to its new successes in recommender systems.

## Reference

1. [Peter Brusilovsky](https://en.wikipedia.org/wiki/Peter_Brusilovsky) (2007). *The Adaptive Web*. p. 325.
2. Robin Burke, [Hybrid Web Recommender Systems](http://www.dcs.warwick.ac.uk/~acristea/courses/CS411/2010/Book%20-%20The%20Adaptive%20Web/HybridWebRecommenderSystems.pdf), pp. 377-408, The Adaptive Web, Peter Brusilovsky, Alfred Kobsa, Wolfgang Nejdl (Ed.), Lecture Notes in Computer Science, Vol. 4321, Springer-Verlag, Berlin, Germany, May 2007, 978-3-540-72078-2.
3. P. Resnick, N. Iacovou, et al. "[GroupLens: An Open Architecture for Collaborative Filtering of Netnews](http://ccs.mit.edu/papers/CCSWP165.html)", Proceedings of ACM Conference on Computer Supported Cooperative Work, CSCW 1994. pp. 175-186.
4. Sarwar, Badrul, et al. "[Item-based collaborative filtering recommendation algorithms.](http://files.grouplens.org/papers/www10_sarwar.pdf)" *Proceedings of the 10th International Conference on World Wide Web*. ACM, 2001.
5. Kautz, Henry, Bart Selman, and Mehul Shah. "[Referral Web: Combining Social networks and collaborative filtering.](http://www.cs.cornell.edu/selman/papers/pdf/97.cacm.refweb.pdf)" Communications of the ACM 40.3 (1997): 63-65.
6. Yuan, Jianbo, et al. ["Solving Cold-Start Problem in Large-scale Recommendation Engines: A Deep Learning Approach."](https://arxiv.org/pdf/1611.05480v1.pdf) *arXiv preprint arXiv:1611.05480* (2016).
7. Covington P, Adams J, Sargin E. [Deep neural networks for YouTube recommendations](https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/45530.pdf). Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 2016: 191-198.

<br/>
This tutorial is contributed by <a xmlns:cc="http://creativecommons.org/ns#" href="http://book.paddlepaddle.org" property="cc:attributionName" rel="cc:attributionURL">PaddlePaddle</a>, and licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

</div>
<!-- You can change the lines below now. -->

<script type="text/javascript">
marked.setOptions({
  renderer: new marked.Renderer(),
  gfm: true,
  breaks: false,
  smartypants: true,
  highlight: function(code, lang) {
    code = code.replace(/&amp;/g, "&")
    code = code.replace(/&gt;/g, ">")
    code = code.replace(/&lt;/g, "<")
    code = code.replace(/&nbsp;/g, " ")
    return hljs.highlightAuto(code, [lang]).value;
  }
});
document.getElementById("context").innerHTML = marked(
        document.getElementById("markdown").innerHTML)
</script>
</body>