diff --git a/_sources/guide/code_comment.rst.txt b/_sources/guide/code_comment_en.rst.txt
similarity index 100%
rename from _sources/guide/code_comment.rst.txt
rename to _sources/guide/code_comment_en.rst.txt
diff --git a/_sources/guide/doc_contribution.rst.txt b/_sources/guide/doc_contribution_en.rst.txt
similarity index 100%
rename from _sources/guide/doc_contribution.rst.txt
rename to _sources/guide/doc_contribution_en.rst.txt
diff --git a/_sources/guide/index.rst.txt b/_sources/guide/index.rst.txt
index 00568c224ca5801b7841a8841ca757e83e78496c..fe38ce16856320f543b5726b9d4ea5a5fbe39c80 100644
--- a/_sources/guide/index.rst.txt
+++ b/_sources/guide/index.rst.txt
@@ -5,6 +5,6 @@ Developer Guide
    :maxdepth: 2

    git_guide_en
-   doc_contribution
-   code_comment
+   doc_contribution_en
+   code_comment_en

diff --git a/_sources/quick_start/index.rst.txt b/_sources/quick_start/index.rst.txt
index 476c2024ba7da1a3fcfa9f229c3ee966c5616b48..7be5be8f75c9818181a92ae66e2a6b96bfd0e68d 100644
--- a/_sources/quick_start/index.rst.txt
+++ b/_sources/quick_start/index.rst.txt
@@ -201,8 +201,7 @@ DI-engine supports various useful tools in common RL training, as shown in follo
 Epsilon Greedy
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-An easy way of deploying epsilon greedy exploration when sampling data has already been shown above. It is
-called by the ``epsilon_greedy`` function each step. And you can select your own decay strategy, such as envstep and train_iter.
+An easy way of deploying epsilon greedy exploration when sampling data is shown as follows:

 .. code-block:: python
@@ -214,12 +213,14 @@ called by the ``epsilon_greedy`` function each step. And you can select your own
         eps = epsilon_greedy(learner.train_iter)
         ...

+First, call ``get_epsilon_greedy_fn`` to acquire an epsilon-greedy schedule function; then call the returned ``epsilon_greedy`` function at each step. The decay strategy is configurable: you can set the start value, the end value and the decay type (linear or exponential), and you can choose whether epsilon decays by env step or by train iteration.
+
 Visualization & Logging
 ~~~~~~~~~~~~~~~~~~~~~~~~~

-Some environments have a rendering surface or visualization. DI-engine doesn't use render interface but add a switch to save these replays.
-After training, users can add the next lines to enable this function. If everything is working fine, you can find some videos with ``.mp4`` suffix in the ``replay_path`` (some GUI interfaces are normal).
+Some environments provide a rendering or visualization interface. DI-engine does not use the render interface, but supports saving replay videos instead.
+After training, users can add the code shown below to enable this feature. If everything works well, you will find videos with the ``.mp4`` suffix in the ``replay_path`` directory (it is normal for some GUI windows to appear).

 .. code-block:: python
@@ -235,7 +236,7 @@ After training, users can add the next lines to enable this function. If everyth

 .. note::

-    If users want to visualize with a trained policy, please refer to ``dizoo/classic_control/cartpole/entry/cartpole_dqn_eval.py`` to construct a user-defined evaluation function, and indicate two fields ``env.replay_path`` and ``policy.learn.learner.hook.load_ckpt_before_run`` in config, an example is shown as follows:
+    If users want to visualize with a trained policy, please refer to ``dizoo/classic_control/cartpole/entry/cartpole_dqn_eval.py`` to construct a user-defined evaluation function, and indicate two fields ``env.replay_path`` and ``policy.learn.learner.hook.load_ckpt_before_run`` in config. An example is shown as follows:

     .. code-block:: python
@@ -252,7 +253,7 @@ After training, users can add the next lines to enable this function. If everyth

 .. tip::

-    Each new RL environments can define their own ``enable_save_replay`` method to specify how to generate replay files. DI-engine utilizes ``gym wrapper (coupled with ffmpeg)`` to generate replay for some traditional environments. If users encounter some errors in recording videos by ``gym wrapper`` , you should install ``ffmpeg`` first.
+    All new RL environments can define their own ``enable_save_replay`` method to specify how to generate replay files. DI-engine utilizes ``gym wrapper (coupled with ffmpeg)`` to generate replays for some traditional environments. If you encounter errors when recording videos with the ``gym wrapper``, install ``ffmpeg`` first.

 Similar with other Deep Learning platforms, DI-engine uses tensorboard to record key parameters and results during
@@ -269,8 +270,8 @@ DQN experiment.
 Loading & Saving checkpoints
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-It is usually needed to save and resume an experiments with model checkpoints. DI-engine saves and loads checkpoints
-in the same way as PyTorch.
+It is often necessary to save and resume an experiment with model checkpoints.
+DI-engine saves and loads checkpoints in the same way as PyTorch.

 .. code-block:: python
@@ -282,7 +283,7 @@ in the same way as PyTorch.
         ...
         dirname = './ckpt_{}'.format(learner.name)
-        os.mkdir(dirname, exsit_ok=True)
+        os.makedirs(dirname, exist_ok=True)
         ckpt_name = 'iteration_{}.pth.tar'.format(learner.last_iter.val)
         path = os.path.join(dirname, ckpt_name)
         state_dict = learner.policy.state_dict()
@@ -290,9 +291,9 @@ in the same way as PyTorch.
         learner.info('{} save ckpt in {}'.format(learner.name, path))

 To deploy this in a more elegant way, DI-engine is configured to use
-:class:`Learner Hooks ` to handle these cases. The saving hook is
-automatically frequently called after training iterations. And to load & save checkpoints at the beginning and
-in the end, users can simply add one line code before & after training as follow.
+:class:`Learner Hook ` to handle these cases. The saving hook is
+called automatically at regular intervals during training. To load and save checkpoints at the beginning and
+the end of training, users can simply add one line of code before and after training, as follows.

 .. code-block:: python
diff --git a/_sources/quick_start/tb_demo.rst.txt b/_sources/quick_start/tb_demo.rst.txt
index 40bd979872813ed6bfc3a0db4e683d2dfd000066..95d46d8b2bc6f4a26bb4635e95339f7605256c6a 100644
--- a/_sources/quick_start/tb_demo.rst.txt
+++ b/_sources/quick_start/tb_demo.rst.txt
@@ -4,8 +4,8 @@ Tensorboard and Logging demo
 .. toctree::
     :maxdepth: 3

-In this page, the default tensorboard and logging information is detailly described. A ``CartPole DQN`` experiment
-is used as example.
+In this page, the default tensorboard and logging information is described in detail.
+A ``CartPole DQN`` experiment is used as an example.

 Tensorboard info
 -----------------
diff --git a/guide/code_comment.html b/guide/code_comment_en.html
similarity index 98%
rename from guide/code_comment.html
rename to guide/code_comment_en.html
index efadd217c1f24f6878e241e621ea9e53d516fab7..790573172ffe708cb8b0d458a9a4776adad06196 100644
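The epsilon decay described in the quick-start changes above can be written down in a few lines. Below is a minimal, self-contained sketch of such a schedule (start value, end value, linear or exponential decay, stepped by env step or by train iteration). It is only an illustration: the helper name ``make_epsilon_fn`` is hypothetical and is not DI-engine's ``get_epsilon_greedy_fn``, whose exact signature may differ.

.. code-block:: python

    import math

    def make_epsilon_fn(start: float, end: float, decay: int, kind: str = 'exp'):
        """Return a function mapping a step count (env step or train iteration) to epsilon."""
        assert start >= end, "epsilon is expected to decay from start down to end"

        def epsilon_greedy(step: int) -> float:
            if kind == 'linear':
                # Decrease linearly from start to end over `decay` steps, then stay at end.
                return max(end, start - (start - end) * step / decay)
            # Exponential decay with time constant `decay`.
            return end + (start - end) * math.exp(-step / decay)

        return epsilon_greedy

    # Decay by train iteration; pass the collector's env step instead to decay by env step.
    epsilon_greedy = make_epsilon_fn(start=0.95, end=0.1, decay=10000, kind='exp')
    eps = epsilon_greedy(0)        # ~0.95 at the beginning of training
    eps = epsilon_greedy(50000)    # close to 0.1 late in training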
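The note about evaluating a trained policy comes down to the two config fields it names, ``env.replay_path`` and ``policy.learn.learner.hook.load_ckpt_before_run``. A sketch of how they might look in a config dict follows; the nesting shown here and the concrete paths are assumptions, and ``cartpole_dqn_eval.py`` remains the reference entry.

.. code-block:: python

    # Both field names are quoted from the documentation above; the surrounding
    # structure and the checkpoint/video paths are illustrative placeholders only.
    eval_config = {
        'env': {
            # Directory where .mp4 replay videos are written during evaluation.
            'replay_path': './video',
        },
        'policy': {
            'learn': {
                'learner': {
                    'hook': {
                        # Checkpoint to restore before the evaluation run starts.
                        'load_ckpt_before_run': './ckpt_learner/iteration_10000.pth.tar',
                    },
                },
            },
        },
    }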
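Because the checkpoint section states that DI-engine saves and loads checkpoints in the same way as PyTorch, a plain PyTorch round trip is enough to illustrate the mechanism. The ``nn.Linear`` model below is only a stand-in for ``learner.policy``, and the directory and file names are arbitrary.

.. code-block:: python

    import os

    import torch
    import torch.nn as nn

    # Stand-in for the policy network that the learner would hold.
    model = nn.Linear(4, 2)

    # Save: create the checkpoint directory and dump the state dict to disk.
    dirname = './ckpt_demo'
    os.makedirs(dirname, exist_ok=True)
    path = os.path.join(dirname, 'iteration_0.pth.tar')
    torch.save(model.state_dict(), path)

    # Load: read the state dict back and restore it into the network.
    state_dict = torch.load(path, map_location='cpu')
    model.load_state_dict(state_dict)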