    feature(davide): Implementation of D4PG (#76) · 16a89c35
    Davide Liu committed
    * added experience replay and n-step
    
    * implementing distributional q value
    
    * added distributional q-value
    
    * added overview in qac_dist and d4pg
    
    * derived D4PG from DDPG
    
    * fixed a bug when action shape >1
    
    * benchmark D4PG mujoco + minor fixes
    
    -entry for DDPG mujoco
    -entry for D4PG mujoco
    -config for D4PG mujoco
    -fixed style D4PG code
    -unit tests for QAC distributional
    
    * formatted code
    
    * minor updates (read description)
    
    -added d4pg serial_entry test
    -updated comments in QACDIST
    -added d4pg in commander register
    -added q_value in d4pg return dict
    -added priority update in d4pg entry
    -added assertion in QACDIST