Incremental Learning Support for Fluid with Distribution
Created by: seiriosPlus
Incremental Learning Supported:
Currently, the Trainer runs `save_model` at the end of training.
However, when PaddlePaddle runs in distributed mode, two problems need to be solved:
- There are multiple trainers, and each of them saves a model to `param_path`, which is unnecessary.
- The parameter server does not run `save_model`, but it needs to load the model at startup.
The solution is the same as in #10376 (closed).
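The details of #10376 are not repeated here. As a minimal sketch of one possible arrangement, assume that trainers carry a numeric ID and that trainer 0 is the designated writer. Both are assumptions, as are the function names and the JSON file layout; none of this is the PaddlePaddle API. Under those assumptions, only one trainer writes to `param_path`, and a parameter server loads from it at startup:

```python
import json
import os
import tempfile


def save_model_if_chief(trainer_id, params, param_path):
    """Write params to param_path only on trainer 0 (hypothetical convention)."""
    if trainer_id != 0:
        # Every other trainer skips the write, so there is a single copy.
        return False
    os.makedirs(param_path, exist_ok=True)
    with open(os.path.join(param_path, "model.json"), "w") as f:
        json.dump(params, f)
    return True


def load_model_at_startup(param_path):
    """What a parameter server could do at startup: load a saved model if present."""
    model_file = os.path.join(param_path, "model.json")
    if not os.path.exists(model_file):
        return None  # nothing saved yet, start from fresh initialization
    with open(model_file) as f:
        return json.load(f)


# Usage: two trainers finish training, then a parameter server restarts.
param_path = os.path.join(tempfile.mkdtemp(), "params")
wrote_by_trainer_1 = save_model_if_chief(1, {"w": [1.0, 2.0]}, param_path)
wrote_by_trainer_0 = save_model_if_chief(0, {"w": [1.0, 2.0]}, param_path)
restored = load_model_at_startup(param_path)
```

Gating on a single trainer ID keeps the write path free of coordination, at the cost of relying on that one trainer reaching the end of training.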
The differences from checkpoint are:
- `save_model` must be called manually.
- `save_model` saves only the model's variables; nothing else needs to be saved.
- `save_model` does not delete previously saved files.
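The three properties above can be illustrated with a small stand-alone sketch. It does not use the PaddlePaddle API: the `is_model` flag, the `tag` argument, the variable names, and the JSON file layout are all hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


def save_model(variables, dirname, tag):
    """Save only model variables under dirname/tag; earlier saves are left alone."""
    # Keep model variables only; other state is skipped.
    model_vars = {
        name: var["value"] for name, var in variables.items() if var["is_model"]
    }
    target = os.path.join(dirname, tag)
    os.makedirs(target, exist_ok=True)
    with open(os.path.join(target, "model.json"), "w") as f:
        json.dump(model_vars, f)
    # No cleanup step: previously saved directories are never removed.
    return target


# Usage: the caller decides when to save (manual call), here after two passes.
variables = {
    "fc_0.w": {"value": [0.1, 0.2], "is_model": True},
    "global_step": {"value": 100, "is_model": False},  # hypothetical non-model state
}
root = tempfile.mkdtemp()
first = save_model(variables, root, "pass_0")
second = save_model(variables, root, "pass_1")
```

After both calls, `root` contains `pass_0` and `pass_1`, and each holds only the `fc_0.w` entry.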