Update the optimization of PaddingRNN model in benchmark repo to models (!2413) · Merge request · PaddlePaddle / models

Created by: Xreki

We changed a lot of the PaddingRNN code, including both the model configuration and the training options, so we need to bring those changes over here.

I tested the develop branch with:

python train.py \
      --data_path data/simple-examples/data/ \
      --model_type ${model_type} \
      --rnn_model static \
      --enable_ce \
      --use_gpu True

small model:

+ python train.py --data_path data/simple-examples/data/ --model_type small --rnn_model static --enable_ce --use_gpu True
2019-06-17 06:58:47,642 - lm - INFO - Running with args : Namespace(data_path='data/simple-examples/data/', enable_ce=True, log_path=None, model_type='small', para_init=False, rnn_model='static', use_gpu=True)
W0617 06:58:49.498152  4694 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0
W0617 06:58:49.506192  4694 device_context.cc:267] device: 0, cuDNN Version: 7.5.
.
.
.

epoch id 12
ppl  232 44.82514 0.001953125
ppl  464 49.39149 0.001953125
ppl  696 47.661358 0.001953125
ppl  928 46.72023 0.001953125
ppl  1160 46.184597 0.001953125
ppl  1392 44.540607 0.001953125
ppl  1624 43.820793 0.001953125
ppl  1856 43.23518 0.001953125
ppl  2088 41.79701 0.001953125
ppl  2320 40.787346 0.001953125
train ppl 40.786976
ptblm lstm_language_model_static_duration_card1 70.1848174242
ptblm lstm_language_model_static_loss_card1 40.786976
valid ppl 119.65324
test ppl 114.91723

large model:

+ python train.py --data_path data/simple-examples/data/ --model_type large --rnn_model static --enable_ce --use_gpu True
2019-06-17 07:16:50,873 - lm - INFO - Running with args : Namespace(data_path='data/simple-examples/data/', enable_ce=True, log_path=None, model_type='large', para_init=False, rnn_model='static', use_gpu=True)
W0617 07:16:53.597585  4710 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0
W0617 07:16:53.602998  4710 device_context.cc:267] device: 0, cuDNN Version: 7.5.
.
.
.
epoch id 54
ppl  132 37.065334 0.0032462992
ppl  264 40.845306 0.0032462992
ppl  396 39.749107 0.0032462992
ppl  528 39.512344 0.0032462992
ppl  660 39.659756 0.0032462992
ppl  792 38.83683 0.0032462992
ppl  924 38.86937 0.0032462992
ppl  1056 39.0496 0.0032462992
ppl  1188 38.37251 0.0032462992
ppl  1320 38.06603 0.0032462992
train ppl 38.081997
ptblm lstm_language_model_static_duration_card1 125.04247805
ptblm lstm_language_model_static_loss_card1 38.081997
valid ppl 82.77821
test ppl 78.73715
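
For anyone who wants to post-process the develop-branch output, here is a minimal sketch of a parser for the per-batch lines. It assumes each such line has the form `ppl <batch> <perplexity> <learning rate>`, which is how the log above reads; the names `BatchLog` and `parse_batch_line` are made up for this example and are not part of `train.py`.

```python
import re
from typing import NamedTuple, Optional


class BatchLog(NamedTuple):
    """One per-batch record (hypothetical helper type, not from train.py)."""
    batch: int
    ppl: float
    lr: float


# Matches lines such as "ppl  232 44.82514 0.001953125".
# Anchored at the start so "train ppl ..." / "valid ppl ..." are not matched.
_BATCH_LINE = re.compile(r"^ppl\s+(\d+)\s+([0-9.]+)\s+([0-9.eE+-]+)$")


def parse_batch_line(line: str) -> Optional[BatchLog]:
    """Return a BatchLog for a per-batch line, or None for any other line."""
    match = _BATCH_LINE.match(line.strip())
    if match is None:
        return None
    return BatchLog(int(match.group(1)), float(match.group(2)), float(match.group(3)))


if __name__ == "__main__":
    sample = [
        "epoch id 12",
        "ppl  232 44.82514 0.001953125",
        "ppl  464 49.39149 0.001953125",
        "train ppl 40.786976",
    ]
    records = [r for r in map(parse_batch_line, sample) if r is not None]
    for record in records:
        print(record)
```

Summary lines (`train ppl`, `valid ppl`, `test ppl`) are skipped on purpose, since they carry a single value rather than a batch index and learning rate.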

Then I tested this PR with:

python train.py \
      --data_path data/simple-examples/data/ \
      --model_type ${model_type} \
      --rnn_model static \
      --use_py_reader True \
      --enable_ce \
      --use_gpu True

small model:

+ python train.py --data_path data/simple-examples/data/ --model_type small --rnn_model static --use_py_reader True --enable_ce --use_gpu True
2019-06-18 03:25:14,878 - lm - INFO - Running with args : Namespace(batch_size=0, data_path='data/simple-examples/data/', enable_ce=True, log_path=None, max_epoch=0, model_type='small', para_init=False, parallel=True, rnn_model='static', save_model_dir='models', use_gpu=True, use_py_reader=True)
W0618 03:25:16.665827  5481 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0 
W0618 03:25:16.671789  5481 device_context.cc:267] device: 0, cuDNN Version: 7.5.
.
.
.
-- Epoch:[12]; Batch:[232]; Time: 0.02059 s; ppl: 45.17641, lr: 0.00195
-- Epoch:[12]; Batch:[464]; Time: 0.02319 s; ppl: 49.69373, lr: 0.00195
-- Epoch:[12]; Batch:[696]; Time: 0.02119 s; ppl: 47.89302, lr: 0.00195
-- Epoch:[12]; Batch:[928]; Time: 0.02276 s; ppl: 46.88626, lr: 0.00195
-- Epoch:[12]; Batch:[1160]; Time: 0.02284 s; ppl: 46.28157, lr: 0.00195
-- Epoch:[12]; Batch:[1392]; Time: 0.02228 s; ppl: 44.61953, lr: 0.00195
-- Epoch:[12]; Batch:[1624]; Time: 0.02480 s; ppl: 43.86269, lr: 0.00195
-- Epoch:[12]; Batch:[1856]; Time: 0.02261 s; ppl: 43.27738, lr: 0.00195
-- Epoch:[12]; Batch:[2088]; Time: 0.02492 s; ppl: 41.82330, lr: 0.00195
-- Epoch:[12]; Batch:[2320]; Time: 0.02289 s; ppl: 40.82926, lr: 0.00195

Train epoch:[12]; epoch Time: 53.97107; ppl: 40.82766; avg_time: 43.06617 steps/s 

ptblm lstm_language_model_static_duration_card1 49.5954330701
ptblm lstm_language_model_static_loss_card1 40.827663
Valid ppl: 119.90054
Saved model to: models/12.

Test ppl: 115.99717

large model:

+ python train.py --data_path data/simple-examples/data/ --model_type large --rnn_model static --use_py_reader True --enable_ce --use_gpu True
2019-06-18 08:22:52,034 - lm - INFO - Running with args : Namespace(batch_size=0, data_path='data/simple-examples/data/', enable_ce=True, log_path=None, max_epoch=0, model_type='large', para_init=False, parallel=True, rnn_model='static', save_model_dir='models', use_gpu=True, use_py_reader=True)
W0618 08:22:54.477843  5797 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0 
W0618 08:22:54.483436  5797 device_context.cc:267] device: 0, cuDNN Version: 7.5.
.
.
.
-- Epoch:[54]; Batch:[132]; Time: 0.05313 s; ppl: 36.70142, lr: 0.00325
-- Epoch:[54]; Batch:[264]; Time: 0.05284 s; ppl: 40.47433, lr: 0.00325
-- Epoch:[54]; Batch:[396]; Time: 0.05262 s; ppl: 39.36209, lr: 0.00325
-- Epoch:[54]; Batch:[528]; Time: 0.05793 s; ppl: 39.15080, lr: 0.00325
-- Epoch:[54]; Batch:[660]; Time: 0.05314 s; ppl: 39.24402, lr: 0.00325
-- Epoch:[54]; Batch:[792]; Time: 0.05264 s; ppl: 38.42751, lr: 0.00325
-- Epoch:[54]; Batch:[924]; Time: 0.05725 s; ppl: 38.41288, lr: 0.00325
-- Epoch:[54]; Batch:[1056]; Time: 0.05902 s; ppl: 38.57289, lr: 0.00325
-- Epoch:[54]; Batch:[1188]; Time: 0.05940 s; ppl: 37.86667, lr: 0.00325
-- Epoch:[54]; Batch:[1320]; Time: 0.05668 s; ppl: 37.54736, lr: 0.00325

Train epoch:[54]; epoch Time: 75.54808; ppl: 37.56197; avg_time: 17.59101 steps/s

ptblm lstm_language_model_static_duration_card1 74.7825333422
ptblm lstm_language_model_static_loss_card1 37.56197
Valid ppl: 82.56120
Saved model to: models/54.

Test ppl: 78.93229
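
To put the two sets of results side by side, the sketch below computes the ratio of the reported `lstm_language_model_static_duration_card1` values and the change in test perplexity. All numbers are copied from the logs above; the interpretation of the duration field as a time where lower is better is an assumption, and the `runs` dictionary and `compare` function are names invented for this example.

```python
# Figures copied from the final-epoch logs above.
# "duration" is the lstm_language_model_static_duration_card1 value
# (assumed to be a time measurement where lower is better).
runs = {
    "small": {
        "develop": {"duration": 70.1848174242, "test_ppl": 114.91723},
        "pr": {"duration": 49.5954330701, "test_ppl": 115.99717},
    },
    "large": {
        "develop": {"duration": 125.04247805, "test_ppl": 78.73715},
        "pr": {"duration": 74.7825333422, "test_ppl": 78.93229},
    },
}


def compare(model_type):
    """Return (duration ratio develop/PR, test ppl change PR - develop)."""
    develop = runs[model_type]["develop"]
    pr = runs[model_type]["pr"]
    speedup = develop["duration"] / pr["duration"]
    ppl_delta = pr["test_ppl"] - develop["test_ppl"]
    return speedup, ppl_delta


for model_type in runs:
    speedup, ppl_delta = compare(model_type)
    print(f"{model_type}: duration ratio {speedup:.2f}x, test ppl change {ppl_delta:+.2f}")
```

A ratio above 1 means the PR run reported a shorter duration than develop; a small positive perplexity change means the PR run ended slightly worse on the test set.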
