diff --git a/doc/imgs/cnn-ckim2014.png b/doc/imgs/cnn-ckim2014.png
new file mode 100644
index 0000000000000000000000000000000000000000..691fd457b7c628a899632b4bbe91c9fe57655c71
Binary files /dev/null and b/doc/imgs/cnn-ckim2014.png differ
diff --git a/doc/imgs/dcn.png b/doc/imgs/dcn.png
new file mode 100644
index 0000000000000000000000000000000000000000..82a77e1743ac4425fbcd5f636b8360ede91258dd
Binary files /dev/null and b/doc/imgs/dcn.png differ
diff --git a/doc/imgs/deepfm.png b/doc/imgs/deepfm.png
new file mode 100644
index 0000000000000000000000000000000000000000..3288a71fdda524c972144b7eeaeeb2fcdd93728f
Binary files /dev/null and b/doc/imgs/deepfm.png differ
diff --git a/doc/imgs/din.png b/doc/imgs/din.png
new file mode 100644
index 0000000000000000000000000000000000000000..3a3e550a4802c6159d06f83759251d36fa7984ca
Binary files /dev/null and b/doc/imgs/din.png differ
diff --git a/doc/imgs/tagspace.png b/doc/imgs/tagspace.png
new file mode 100644
index 0000000000000000000000000000000000000000..7f64fd6d4029ee1ad56b7778817735cee321f1c7
Binary files /dev/null and b/doc/imgs/tagspace.png differ
diff --git a/doc/imgs/wide&deep.png b/doc/imgs/wide&deep.png
new file mode 100644
index 0000000000000000000000000000000000000000..d46cef37d771acbedb766f98dabead50ff038b3e
Binary files /dev/null and b/doc/imgs/wide&deep.png differ
diff --git a/doc/imgs/xdeepfm.png b/doc/imgs/xdeepfm.png
new file mode 100644
index 0000000000000000000000000000000000000000..2c2577afbd1c4eb47d583f8aec317d1736aea5f1
Binary files /dev/null and b/doc/imgs/xdeepfm.png differ
diff --git a/models/contentunderstanding/text_classification/config.yaml b/models/contentunderstanding/classification/config.yaml
similarity index 100%
rename from models/contentunderstanding/text_classification/config.yaml
rename to models/contentunderstanding/classification/config.yaml
diff --git a/models/contentunderstanding/text_classification/model.py b/models/contentunderstanding/classification/model.py
similarity index 100%
rename from models/contentunderstanding/text_classification/model.py
rename to models/contentunderstanding/classification/model.py
diff --git a/models/contentunderstanding/text_classification/reader.py b/models/contentunderstanding/classification/reader.py
similarity index 100%
rename from models/contentunderstanding/text_classification/reader.py
rename to models/contentunderstanding/classification/reader.py
diff --git a/models/contentunderstanding/text_classification/train_data/part-0 b/models/contentunderstanding/classification/train_data/part-0
similarity index 100%
rename from models/contentunderstanding/text_classification/train_data/part-0
rename to models/contentunderstanding/classification/train_data/part-0
diff --git a/models/contentunderstanding/readme.md b/models/contentunderstanding/readme.md
index 1063982b7a98dbe56a06ed7c5915ecd21fd5bebf..ef3fd88362bbffeb3c0abb233f0023e0eda7d18b 100644
--- a/models/contentunderstanding/readme.md
+++ b/models/contentunderstanding/readme.md
@@ -1,13 +1,13 @@
 # 内容理解模型库
 ## 简介
-我们提供了常见的内容理解任务中使用的模型算法的PaddleRec实现, 单机训练&预测效果指标以及分布式训练&预测性能指标等。实现的内容理解模型包括 [Tagspace](http://gitlab.baidu.com/xujiaqi01/paddlerec/tree/develop/models/contentunderstanding/tagspace)、[文本分类](http://gitlab.baidu.com/xujiaqi01/paddlerec/tree/develop/models/contentunderstanding/text_classification)。
+我们提供了常见的内容理解任务中使用的模型算法的PaddleRec实现, 单机训练&预测效果指标以及分布式训练&预测性能指标等。实现的内容理解模型包括 [Tagspace](tagspace)、[文本分类](classification)等。
 模型算法库在持续添加中,欢迎关注。
 ## 目录
 * [整体介绍](#整体介绍)
-    * [内容理解模型列表](#内容理解模型列表)
+    * [模型列表](#模型列表)
 * [使用教程](#使用教程)
     * [数据处理](#数据处理)
     * [训练](#训练)
@@ -18,13 +18,24 @@
     * [模型性能列表](#模型性能列表)
 ## 整体介绍
-### 排序模型列表
+### 模型列表
 | 模型 | 简介 | 论文 |
 | :------------------: | :--------------------: | :---------: |
-| TagSpace | 标签推荐 | [TagSpace: Semantic Embeddings from Hashtags](https://research.fb.com/publications/tagspace-semantic-embeddings-from-hashtags/) |
-| TextClassification | 文本分类 | -- |
+| TagSpace | 标签推荐 | [TagSpace: Semantic Embeddings from Hashtags (2014)](https://research.fb.com/publications/tagspace-semantic-embeddings-from-hashtags/) |
+| Classification | 文本分类 | [Convolutional neural networks for sentence classification (2014)](https://www.aclweb.org/anthology/D14-1181.pdf) |
+下面是每个模型的简介(注:图片引用自链接中的论文)
+
+[TagSpace模型](https://research.fb.com/publications/tagspace-semantic-embeddings-from-hashtags)
+

+ +

+ +[文本分类CNN模型](https://www.aclweb.org/anthology/D14-1181.pdf) +

+ +

 ## 使用教程
 ### 数据处理
@@ -53,7 +64,7 @@ mv test.csv raw_big_test_data
 python text2paddle.py raw_big_train_data/ raw_big_test_data/ train_big_data test_big_data big_vocab_text.txt big_vocab_tag.txt
 ```
-**(2)TextClassification**
+**(2)Classification**
 无
@@ -66,7 +77,7 @@ python text2paddle.py raw_big_train_data/ raw_big_test_data/ train_big_data test
 | 数据集 | 模型 | loss | auc | acc | mae |
 | :------------------: | :--------------------: | :---------: |:---------: | :---------: |:---------: |
 | -- | TagSpace | -- | -- | -- | -- |
-| -- | TextClassification | -- | -- | -- | -- |
+| -- | Classification | -- | -- | -- | -- |
 ## 分布式
@@ -74,7 +85,7 @@ python text2paddle.py raw_big_train_data/ raw_big_test_data/ train_big_data test
 | 数据集 | 模型 | 单机 | 同步 (4节点) | 同步 (8节点) | 同步 (16节点) | 同步 (32节点) |
 | :------------------: | :--------------------: | :---------: |:---------: |:---------: |:---------: |:---------: |
 | -- | TagSpace | -- | -- | -- | -- | -- |
-| -- | TextClassification | -- | -- | -- | -- | -- |
+| -- | Classification | -- | -- | -- | -- | -- |
 ----
@@ -82,4 +93,4 @@ python text2paddle.py raw_big_train_data/ raw_big_test_data/ train_big_data test
 | 数据集 | 模型 | 单机 | 异步 (4节点) | 异步 (8节点) | 异步 (16节点) | 异步 (32节点) |
 | :------------------: | :--------------------: | :---------: |:---------: |:---------: |:---------: |:---------: |
 | -- | TagSpace | -- | -- | -- | -- | -- |
-| -- | TextClassification | -- | -- | -- | -- | -- |
\ No newline at end of file
+| -- | Classification | -- | -- | -- | -- | -- |
diff --git a/models/rank/readme.md b/models/rank/readme.md
index 0f890e995f6cbfc9520f0f6719fbf08252194cf4..326fb481356982dfb2acccaba670c072363bdb76 100755
--- a/models/rank/readme.md
+++ b/models/rank/readme.md
@@ -1,13 +1,13 @@
 # 排序模型库
 ## 简介
-我们提供了常见的排序任务中使用的模型算法的PaddleRec实现, 单机训练&预测效果指标以及分布式训练&预测性能指标等。实现的排序模型包括 [多层神经网络](http://gitlab.baidu.com/tangwei12/paddlerec/tree/develop/models/rank/dnn)、[Deep Cross Network](http://gitlab.baidu.com/tangwei12/paddlerec/tree/develop/models/rank/dcn)、[DeepFM](http://gitlab.baidu.com/tangwei12/paddlerec/tree/develop/models/rank/deepfm)、 [xDeepFM](http://gitlab.baidu.com/tangwei12/paddlerec/tree/develop/models/rank/xdeepfm)、[Deep Interest Network](http://gitlab.baidu.com/tangwei12/paddlerec/tree/develop/models/rank/din)、[Wide&Deep](http://gitlab.baidu.com/tangwei12/paddlerec/tree/develop/models/rank/wide_deep)。
+我们提供了常见的排序任务中使用的模型算法的PaddleRec实现, 单机训练&预测效果指标以及分布式训练&预测性能指标等。实现的排序模型包括 [多层神经网络](dnn)、[Deep Cross Network](dcn)、[DeepFM](deepfm)、 [xDeepFM](xdeepfm)、[Deep Interest Network](din)、[Wide&Deep](wide_deep)。
 模型算法库在持续添加中,欢迎关注。
 ## 目录
 * [整体介绍](#整体介绍)
-    * [排序模型列表](#排序模型列表)
+    * [模型列表](#模型列表)
 * [使用教程](#使用教程)
     * [数据处理](#数据处理)
     * [训练](#训练)
@@ -18,16 +18,43 @@
     * [模型性能列表](#模型性能列表)
 ## 整体介绍
-### 排序模型列表
+### 模型列表
 | 模型 | 简介 | 论文 |
 | :------------------: | :--------------------: | :---------: |
 | DNN | 多层神经网络 | -- |
-| wide&deep | Deep + wide(LR) | [Wide & Deep Learning for Recommender Systems](https://dl.acm.org/doi/abs/10.1145/2988450.2988454)(2016) |
-| DeepFM | DeepFM | [DeepFM: A Factorization-Machine based Neural Network for CTR Prediction](https://arxiv.org/abs/1703.04247)(2017) |
-| xDeepFM | xDeepFM | [xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems](https://dl.acm.org/doi/abs/10.1145/3219819.3220023)(2018) |
-| DCN | Deep Cross Network | [Deep & Cross Network for Ad Click Predictions](https://dl.acm.org/doi/abs/10.1145/3124749.3124754)(2017) |
-| DIN | Deep Interest Network | [Deep Interest Network for Click-Through Rate Prediction](https://dl.acm.org/doi/abs/10.1145/3219819.3219823)(2018) |
+| wide&deep | Deep + wide(LR) | [Wide & Deep Learning for Recommender Systems](https://dl.acm.org/doi/pdf/10.1145/2988450.2988454)(2016) |
+| DeepFM | DeepFM | [DeepFM: A Factorization-Machine based Neural Network for CTR Prediction](https://arxiv.org/pdf/1703.04247.pdf)(2017) |
+| DCN | Deep Cross Network | [Deep & Cross Network for Ad Click Predictions](https://dl.acm.org/doi/pdf/10.1145/3124749.3124754)(2017) |
+| xDeepFM | xDeepFM | [xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems](https://dl.acm.org/doi/pdf/10.1145/3219819.3220023)(2018) |
+| DIN | Deep Interest Network | [Deep Interest Network for Click-Through Rate Prediction](https://dl.acm.org/doi/pdf/10.1145/3219819.3219823)(2018) |
+
+下面是每个模型的简介(注:图片引用自链接中的论文)
+
+[wide&deep](https://dl.acm.org/doi/pdf/10.1145/2988450.2988454):
+

+ +

+ +[DeepFM](https://arxiv.org/pdf/1703.04247.pdf): +

+ +

+ +[xDeepFM](https://dl.acm.org/doi/pdf/10.1145/3219819.3220023): +

+ +

+ +[DCN](https://dl.acm.org/doi/pdf/10.1145/3124749.3124754): +

+ +

+ +[DIN](https://dl.acm.org/doi/pdf/10.1145/3219819.3219823): +

+ +

 ## 使用教程
 ### 数据处理
@@ -66,4 +93,4 @@
 | Criteo | DCN | -- | -- | -- | -- | -- |
 | Criteo | xDeepFM | -- | -- | -- | -- | -- |
 | Census-income Data | Wide&Deep | -- | -- | -- | -- | -- |
-| Amazon Product | DIN | -- | -- | -- | -- | -- |
\ No newline at end of file
+| Amazon Product | DIN | -- | -- | -- | -- | -- |
diff --git a/readme.md b/readme.md
index 4873ab053d13cfa16e53121f0cd5dcd02978b282..ff2b64b8d7eea316b4d4a73249a84ff97751b21e 100644
--- a/readme.md
+++ b/readme.md
@@ -108,7 +108,7 @@ python -m paddlerec.run -m ./models/rank/dnn/config.yaml -e single
 | 方向 | 模型 | 单机CPU训练 | 单机GPU训练 | 分布式CPU训练 |
 | :------: | :----------------------------------------------------------------------------: | :---------: | :---------: | :-----------: |
-| 内容理解 | [Text-Classifcation](models/contentunderstanding/text_classification/model.py) | ✓ | x | ✓ |
+| 内容理解 | [Text-Classification](models/contentunderstanding/classification/model.py) | ✓ | x | ✓ |
 | 内容理解 | [TagSpace](models/contentunderstanding/tagspace/model.py) | ✓ | x | ✓ |
 | 召回 | [TDM](models/treebased/tdm/model.py) | ✓ | x | ✓ |
 | 召回 | [Word2Vec](models/recall/word2vec/model.py) | ✓ | x | ✓ |
@@ -162,4 +162,4 @@ python -m paddlerec.run -m ./models/rank/dnn/config.yaml -e single
 ### 许可证书
 本项目的发布受[Apache 2.0 license](LICENSE)许可认证。
-
\ No newline at end of file
+