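From 3b0a9cf5ec683f0e1b6a4f515869f4c42be0fdf1 Mon Sep 17 00:00:00 2001
From: yinhaofeng <1841837261@qq.com>
Date: Mon, 21 Sep 2020 12:37:49 +0000
Subject: [PATCH] fm add readme

---
 models/rank/fm/readme.md | 261 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 261 insertions(+)
 create mode 100644 models/rank/fm/readme.md

diff --git a/models/rank/fm/readme.md b/models/rank/fm/readme.md
new file mode 100644
index 00000000..e2284faf
--- /dev/null
+++ b/models/rank/fm/readme.md
@@ -0,0 +1,261 @@
+# Click-through rate prediction with the FM model
+
+## Introduction
+`CTR (Click Through Rate)` is a key metric in fields such as recommender systems and computational advertising, and predicting it underlies decisions such as product recommendation and ad placement. Put simply, CTR prediction estimates, for each ad impression, whether the user will click or not. A CTR prediction model weighs many factors and features, is trained on large amounts of historical data, and ultimately supports business decisions. This model implements the FM model from the following paper:
+
+```text
+@inproceedings{guo2017deepfm,
+  title={DeepFM: A Factorization-Machine based Neural Network for CTR Prediction},
+  author={Guo, Huifeng and Tang, Ruiming and Ye, Yunming and Li, Zhenguo and He, Xiuqiang},
+  booktitle={the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI)},
+  pages={1725--1731},
+  year={2017}
+}
+```
+
+## Data preparation
+### Data source
+The training and test sets are taken from the Criteo dataset used in the [Display Advertising Challenge](https://www.kaggle.com/c/criteo-display-ad-challenge/). The dataset has two parts: a training set and a test set. The training set contains a portion of Criteo's traffic over a period of time, and the test set corresponds to the ad click traffic on the day after the training data.
+Each line of data has the following format:
+```bash
+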