diff --git a/doc/fluid/user_guides/howto/training/save_load_variables.rst b/doc/fluid/user_guides/howto/training/save_load_variables.rst
index c985b96deecb079f4de4286c26beeab346fb3761..6ce016d84eb8fded69675d4689fb684527ed608e 100644
--- a/doc/fluid/user_guides/howto/training/save_load_variables.rst
+++ b/doc/fluid/user_guides/howto/training/save_load_variables.rst
@@ -270,12 +270,12 @@ save_vars、save_params、save_persistables 以及 save_inference_model的区别
     pserver_endpoints = "127.0.0.1:1001,127.0.0.1:1002"
     trainers = 4
     training_role == "PSERVER"
+    current_endpoint = "127.0.0.1:1002"
     config = fluid.DistributeTranspilerConfig()
     t = fluid.DistributeTranspiler(config=config)
     t.transpile(trainer_id, pservers=pserver_endpoints, trainers=trainers, sync_mode=True, current_endpoint=current_endpoint)
     if training_role == "PSERVER":
-        current_endpoint = "127.0.0.1:1001"
         pserver_prog = t.get_pserver_program(current_endpoint)
         pserver_startup = t.get_startup_program(current_endpoint, pserver_prog)
@@ -284,7 +284,7 @@ save_vars、save_params、save_persistables 以及 save_inference_model的区别
         exe.run(pserver_prog)
     if training_role == "TRAINER":
         main_program = t.get_trainer_program()
-        exe.run(main_program)
+        exe.run(main_program)

 上面的例子中,每个PServer通过调用HDFS的命令获取到0号trainer保存的参数,通过配置获取到PServer的 :code:`fluid.Program` ,PaddlePaddle Fluid会从此 :code:`fluid.Program` 也就是 :code:`pserver_startup` 的所有模型变量中找出长期变量,并通过指定的 :code:`path` 目录下一一加载。
diff --git a/doc/fluid/user_guides/howto/training/save_load_variables_en.rst b/doc/fluid/user_guides/howto/training/save_load_variables_en.rst
index 57e704de8f0bb6eba4acaa6ca0656ccf0382680f..7130cb03b579dd97ad64f02aa401ebd188047393 100644
--- a/doc/fluid/user_guides/howto/training/save_load_variables_en.rst
+++ b/doc/fluid/user_guides/howto/training/save_load_variables_en.rst
@@ -179,23 +179,23 @@ For the PServer to be loaded with parameters during training, for example:
     exe = fluid.Executor(fluid.CPUPlace())
     path = "./models"
-    pserver_endpoints = "127.0.0.1:1001,127.0.0.1:1002"
-    trainers = 4
-    Training_role == "PSERVER"
-    config = fluid.DistributeTranspilerConfig()
-    t = fluid.DistributeTranspiler(config=config)
-    t.transpile(trainer_id, pservers=pserver_endpoints, trainers=trainers, sync_mode=True)
-
-    if training_role == "PSERVER":
-        current_endpoint = "127.0.0.1:1001"
-        pserver_prog = t.get_pserver_program(current_endpoint)
-        pserver_startup = t.get_startup_program(current_endpoint, pserver_prog)
-
-        exe.run(pserver_startup)
-        fluid.io.load_persistables(exe, path, pserver_startup)
-        exe.run(pserver_prog)
-    if training_role == "TRAINER":
-        main_program = t.get_trainer_program()
-        exe.run(main_program)
+    pserver_endpoints = "127.0.0.1:1001,127.0.0.1:1002"
+    trainers = 4
+    training_role = "PSERVER"
+    current_endpoint = "127.0.0.1:1002"
+    config = fluid.DistributeTranspilerConfig()
+    t = fluid.DistributeTranspiler(config=config)
+    t.transpile(trainer_id, pservers=pserver_endpoints, trainers=trainers, sync_mode=True, current_endpoint=current_endpoint)
+
+    if training_role == "PSERVER":
+        pserver_prog = t.get_pserver_program(current_endpoint)
+        pserver_startup = t.get_startup_program(current_endpoint, pserver_prog)
+
+        exe.run(pserver_startup)
+        fluid.io.load_persistables(exe, path, pserver_startup)
+        exe.run(pserver_prog)
+    if training_role == "TRAINER":
+        main_program = t.get_trainer_program()
+        exe.run(main_program)

 In the above example, each PServer obtains the parameters saved by trainer 0 by calling the HDFS command, and obtains the PServer's :code:`fluid.Program` by configuration. PaddlePaddle Fluid will find the persistable variables among all model variables of this :code:`fluid.Program` , i.e. :code:`pserver_startup` , and load them from the specified :code:`path` directory.