1. 17 Apr, 2019 1 commit
    • GA3C example (#63) · 3c511e8f
      Committed by Hongsheng Zeng
      * add IMPALA algorithm and some common utils
      
      * update README.md
      
      * refactor file structure of impala algorithm; separate numpy utils from utils
      
      * add hyperparameter scheduler module; add entropy and lr scheduler in impala
      
      * clip reward in atari wrapper instead of learner side; fix codestyle
      
      * add benchmark result of impala; refine code of impala example; add obs_format in atari_wrappers
      
      * Update README.md
      
      * add a3c algorithm, A2C example and rl_utils
      
      * require training in single gpu/cpu
      
      * only check cpu/gpu num in learner
      
      * refine Readme
      
      * update impala benchmark picture; update Readme
      
      * add benchmark result of A2C
      
      * move get_params/set_params in agent_base
      
      * add GA3C example
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * refine Readme
      
      * add benchmark
      
      * add default safe eps in numpy logp calculation
      
      * refine document; make unittest stable
  2. 15 Apr, 2019 1 commit
    • A2C example (#62) · 39846831
      Committed by Hongsheng Zeng
      * add IMPALA algorithm and some common utils
      
      * update README.md
      
      * refactor file structure of impala algorithm; separate numpy utils from utils
      
      * add hyperparameter scheduler module; add entropy and lr scheduler in impala
      
      * clip reward in atari wrapper instead of learner side; fix codestyle
      
      * add benchmark result of impala; refine code of impala example; add obs_format in atari_wrappers
      
      * Update README.md
      
      * add a3c algorithm, A2C example and rl_utils
      
      * require training in single gpu/cpu
      
      * only check cpu/gpu num in learner
      
      * refine Readme
      
      * update impala benchmark picture; update Readme
      
      * add benchmark result of A2C
      
      * move get_params/set_params in agent_base
      
      * fix shell script that cannot run on Ubuntu
      
      * refine comment and document
      
      * Update README.md
      
      * Update README.md
  3. 08 Apr, 2019 1 commit
    • implementation of IMPALA with the newest parallel design (#60) · b28289ac
      Committed by Hongsheng Zeng
      * add IMPALA algorithm and some common utils
      
      * update README.md
      
      * refactor file structure of impala algorithm; separate numpy utils from utils
      
      * add hyperparameter scheduler module; add entropy and lr scheduler in impala
      
      * clip reward in atari wrapper instead of learner side; fix codestyle
      
      * add benchmark result of impala; refine code of impala example; add obs_format in atari_wrappers
      
      * Update README.md