Commit 7cada188, authored by Qiao Longfei

update comment and add requirements.txt

Parent 2b37298d
@@ -14,6 +14,13 @@
}
```
## Environment
Install PaddlePaddle Fluid first, then run:
```shell
pip install -r requirements.txt
```
## Dataset
This example uses the Criteo dataset from the [Display Advertising Challenge](https://www.kaggle.com/c/criteo-display-ad-challenge/) hosted by Kaggle.
@@ -27,7 +34,6 @@ cd data && ./download.sh && cd ..
## Model
This example implements only the DNN part of the model described in the DeepFM paper; DeepFM itself will be provided in a separate example.
```
## Data preparation
Process the raw dataset: integer features are scaled to [0, 1] with min-max normalization, and categorical features are one-hot encoded. The raw dataset is split into two parts: 90% for training and the remaining 10% for validation during training.
......
@@ -19,6 +19,13 @@ both low order and high order feature interactions. For details of the
factorization machines, please refer to the paper [factorization
machines](https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf)
## Environment
You should install PaddlePaddle Fluid first, and run:
```shell
pip install -r requirements.txt
```
## Dataset
This example uses Criteo dataset which was used for the [Display Advertising
Challenge](https://www.kaggle.com/c/criteo-display-ad-challenge/)
......
@@ -89,8 +89,8 @@ class ContinuousFeatureGenerator:
 @click.option("--outdir", type=str, help="Path to save the processed data")
 def preprocess(datadir, outdir):
 """
-All the 13 integer features are normalzied to continous values and these
-continous features are combined into one vecotr with dimension 13.
+All 13 integer features are normalized to continuous values and these continuous
+features are combined into one vector with dimension of 13.
 Each of the 26 categorical features are one-hot encoded and all the one-hot
 vectors are combined into one sparse binary vector.
......
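The preprocessing described in the docstring (min-max normalization of the integer features, one-hot encoding of the categorical features into one sparse binary vector) can be illustrated with a minimal sketch. This is not the repository's `preprocess` implementation: the names `min_max_scale`, `fit`, and `encode` are hypothetical, the sparse vector is represented by the indices of its non-zero entries (an assumption), and missing values are not handled.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def min_max_scale(value: float, lo: float, hi: float) -> float:
    """Map value into [0, 1] using a column's observed min (lo) and max (hi)."""
    if hi == lo:
        return 0.0  # constant column: assumed convention, nothing to normalize
    clipped = min(max(value, lo), hi)  # keep out-of-range values inside [0, 1]
    return (clipped - lo) / (hi - lo)


def fit(
    int_rows: Sequence[Sequence[float]],
    cat_rows: Sequence[Sequence[Hashable]],
) -> Tuple[List[float], List[float], List[Dict[Hashable, int]]]:
    """Collect per-column min/max and one vocabulary per categorical field."""
    mins = [min(col) for col in zip(*int_rows)]
    maxs = [max(col) for col in zip(*int_rows)]
    vocabs: List[Dict[Hashable, int]] = [{} for _ in cat_rows[0]]
    for row in cat_rows:
        for vocab, value in zip(vocabs, row):
            # Assign the next free index the first time a category is seen.
            vocab.setdefault(value, len(vocab))
    return mins, maxs, vocabs


def encode(
    int_row: Sequence[float],
    cat_row: Sequence[Hashable],
    mins: Sequence[float],
    maxs: Sequence[float],
    vocabs: Sequence[Dict[Hashable, int]],
) -> Tuple[List[float], List[int]]:
    """Return (dense vector, indices of the 1-entries in the sparse vector)."""
    dense = [min_max_scale(v, lo, hi) for v, lo, hi in zip(int_row, mins, maxs)]
    active: List[int] = []
    offset = 0
    for vocab, value in zip(vocabs, cat_row):
        # Each field owns a contiguous slice of the combined one-hot vector.
        active.append(offset + vocab[value])  # KeyError for unseen categories
        offset += len(vocab)
    return dense, active


if __name__ == "__main__":
    # Toy data with 2 integer and 2 categorical features (Criteo has 13 and 26).
    ints = [[1, 10], [3, 20], [5, 40]]
    cats = [["a", "x"], ["b", "x"], ["a", "y"]]
    mins, maxs, vocabs = fit(ints, cats)
    print(encode(ints[1], cats[1], mins, maxs, vocabs))
```

Storing only the active indices keeps the one-hot part compact, since each example sets exactly one position per categorical field.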