diff --git a/models/multitask/mmoe/data/run.sh b/models/multitask/mmoe/data/run.sh
new file mode 100644
index 0000000000000000000000000000000000000000..b60d42b37057593b1c16aa5fd91b8217a5a71bbf
--- /dev/null
+++ b/models/multitask/mmoe/data/run.sh
@@ -0,0 +1,16 @@
+mkdir train_data
+mkdir test_data
+mkdir data
+train_path="data/census-income.data"
+test_path="data/census-income.test"
+train_data_path="train_data/"
+test_data_path="test_data/"
+pip install -r requirements.txt
+
+wget -P data/ https://archive.ics.uci.edu/ml/machine-learning-databases/census-income-mld/census.tar.gz
+tar -zxvf data/census.tar.gz -C data/
+
+python data_preparation.py --train_path ${train_path} \
+                           --test_path ${test_path} \
+                           --train_data_path ${train_data_path}\
+                           --test_data_path ${test_data_path}
diff --git a/models/multitask/mmoe/readme.md b/models/multitask/mmoe/readme.md
index 4efa036a6998ed1826e07dd39f8cf866b0e72e04..694323db5eceaccc11b548f2154ab2355f1c7881 100644
--- a/models/multitask/mmoe/readme.md
+++ b/models/multitask/mmoe/readme.md
@@ -14,6 +14,19 @@ python -m paddlerec.run -m paddlerec.models.multitask.mmoe
 
 Following the original paper, we validate the model on the open-source Census-income Data dataset.
 
+### Data download and preprocessing
+
+Data source: [Census-income Data](https://archive.ics.uci.edu/ml/machine-learning-databases/census-income-mld/census.tar.gz)
+
+After extracting the data, set the file paths in the data/run.sh script and run it.
+
+```shell
+cd data
+sh run.sh
+```
+
+After the script finishes, update the data path dataset.data_path in config.yaml.
+
 ### Parameters
 
 In the hyper_parameters section of config.yaml: batch_size: 32, epochs: 400