@@ -80,7 +80,7 @@ Figure 3. A hybrid recommendation model.
We use the [MovieLens ml-1m](http://files.grouplens.org/datasets/movielens/ml-1m.zip) dataset to train our model. This dataset includes about 1,000,000 ratings of roughly 4,000 movies from roughly 6,000 users. Each rating is in the range of 1 to 5. Thanks to GroupLens Research for collecting, processing and publishing the dataset.
`paddle.v2.datasets` package encapsulates multiple public datasets, including `cifar`, `imdb`, `mnist`, `moivelens` and `wmt14`, etc. There's no need for us to manually donwload and preprocess `MovieLens` dataset.
The `paddle.v2.datasets` package encapsulates multiple public datasets, including `cifar`, `imdb`, `mnist`, `movielens` and `wmt14`, so there is no need for us to manually download and preprocess the `MovieLens` dataset.
```python
# Run this block to show dataset's documentation
...
...
@@ -165,7 +165,7 @@ User <UserInfo id(1), gender(F), age(1), job(10)> rates Movie <MovieInfo id(1193
The output shows that user 1 gave movie `1193` a rating of 5.
After issuing a command `python train.py`, trainning is starting immediately! The details will be unpacked by the following sessions to see how it works.
After issuing the command `python train.py`, training will start immediately. The following sections walk through the details of how it works.
The movie ID and the movie type are mapped to their corresponding hidden layers. For movie's title, a sequence of words represented by an ID sequence, the sequence feature of time window will be obtained after the convolution layer, and then sampling to obtain specific dimension features. The entire process is implemented in `text_conv_pool`.
The movie title, a sequence of words represented by an integer word index sequence, will be fed into a `sequence_conv_pool` layer, which applies convolution and pooling along the time dimension. Because pooling is done along the time dimension, the output is a fixed-length vector regardless of the length of the input sequence.
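To make the fixed-length property concrete, here is a minimal pure-Python sketch of convolution followed by max pooling over time. It is an illustration only, not the implementation of `sequence_conv_pool`: the function name `conv_pool`, the zero padding for short inputs, and the toy filters and word vectors are all assumptions made for this example.

```python
def conv_pool(sequence, filters, window):
    """Slide each filter over the sequence and keep its maximum response.

    sequence: list of word vectors, one per time step (all of equal length).
    filters:  list of weight vectors, each of length window * vector_length.
    Returns one number per filter, so the output length never depends
    on how many words the sequence contains.
    """
    dim = len(sequence[0])
    # Assumption: zero-pad short inputs so at least one full window exists.
    padded = sequence + [[0.0] * dim] * max(0, window - len(sequence))
    pooled = []
    for weights in filters:
        responses = []
        for start in range(len(padded) - window + 1):
            # Flatten the words inside the current window into one vector.
            flat = [x for vec in padded[start:start + window] for x in vec]
            responses.append(sum(w * x for w, x in zip(weights, flat)))
        # Max pooling over the time dimension.
        pooled.append(max(responses))
    return pooled


# Toy data: two filters over windows of two 2-dimensional word vectors.
filters = [[1.0, 0.0, 0.0, 1.0], [0.5, 0.5, 0.5, 0.5]]
short_title = [[1.0, 2.0], [3.0, 4.0]]
long_title = [[1.0, 2.0], [3.0, 4.0], [0.0, 1.0], [2.0, 2.0]]

short_vec = conv_pool(short_title, filters, window=2)
long_vec = conv_pool(long_title, filters, window=2)
print(len(short_vec), len(long_vec))
```

Both titles produce a vector with one entry per filter, which is what allows titles of different lengths to be combined with the other fixed-size movie features.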
Finally, we can use cosine similarity to calculate the similarity between user characteristics and movie features.
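As a rough illustration of the similarity computation, the sketch below computes cosine similarity between two plain Python lists. The function name `cosine_similarity`, the zero-vector convention, and the sample vectors are assumptions for this example; in the model itself this step is performed by a layer inside the network rather than by hand-written code.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat similarity with a zero vector as 0.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user and movie feature vectors.
user_features = [1.0, 2.0, 0.5]
movie_features = [2.0, 4.0, 1.0]
print(cosine_similarity(user_features, movie_features))
```

Vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and opposite vectors score close to -1, independent of their magnitudes.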
...
...
@@ -301,7 +301,7 @@ reader=paddle.reader.batch(
batch_size=256)
```
`feeding` is devoted to specify the correspondence between each yield record and `paddle.layer.data`. For instance, the first column of data generated by `movielens.train` corresponds to `user_id` feature.
`feeding` specifies the correspondence between each field of a yielded record and a `paddle.layer.data` layer. For instance, the first column of data generated by `movielens.train` corresponds to the `user_id` feature.
```python
feeding={
...
...
@@ -316,7 +316,7 @@ feeding = {
}
```
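The idea behind such a name-to-column mapping can be sketched in plain Python. This is not how the trainer consumes `feeding` internally; the helper `apply_feeding`, the three-column record layout, and every name other than `user_id` are hypothetical and chosen only to keep the example small.

```python
def apply_feeding(record, feeding):
    """Pick fields out of a positional record using a name -> index map."""
    return {name: record[index] for name, index in feeding.items()}


# Hypothetical toy layout, not the real column order of the dataset reader.
example_feeding = {"user_id": 0, "movie_id": 1, "score": 2}
example_record = (1, 1193, 5)

named = apply_feeding(example_record, example_feeding)
print(named)
```

Keeping the mapping in one dictionary means the order of fields produced by the reader can change without touching the code that consumes them by name.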
Callback function `event_handler`is used to track training and testing process that might be triggered once the action to which it is attached is executed.
The callback function `event_handler` will be called during training when a pre-defined event happens.
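The callback pattern can be illustrated with a self-contained toy loop. Everything here is a stand-in defined for this example: the event classes `EndIteration` and `EndPass`, the `toy_train` function, and the logging interval are assumptions, not the framework's actual trainer.

```python
class EndIteration:
    """Hypothetical event fired after each mini-batch."""

    def __init__(self, pass_id, batch_id, cost):
        self.pass_id = pass_id
        self.batch_id = batch_id
        self.cost = cost


class EndPass:
    """Hypothetical event fired after a full pass over the data."""

    def __init__(self, pass_id):
        self.pass_id = pass_id


def toy_train(batch_costs, num_passes, event_handler):
    """Stand-in training loop that only fires events; it trains nothing."""
    for pass_id in range(num_passes):
        for batch_id, cost in enumerate(batch_costs):
            event_handler(EndIteration(pass_id, batch_id, cost))
        event_handler(EndPass(pass_id))


messages = []


def event_handler(event):
    # Dispatch on the event type, as a training callback typically does.
    if isinstance(event, EndIteration):
        if event.batch_id % 2 == 0:  # log every second batch
            messages.append("pass %d, batch %d, cost %.2f"
                            % (event.pass_id, event.batch_id, event.cost))
    elif isinstance(event, EndPass):
        messages.append("pass %d finished" % event.pass_id)


toy_train([0.9, 0.7, 0.5], num_passes=1, event_handler=event_handler)
print("\n".join(messages))
```

The training loop decides when events fire, while the handler decides what to do with them, so logging or evaluation can be changed without modifying the loop.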