envs.env_manager
base_env_manager
BaseEnvManager
- class ding.envs.env_manager.base_env_manager.BaseEnvManager(env_fn: List[Callable], cfg: easydict.EasyDict = {})
- Overview:
Create a BaseEnvManager to manage multiple environments.
- Interfaces:
reset, step, seed, close, enable_save_replay, launch, env_info, default_config
- Properties:
env_num, ready_obs, done, method_name_list, active_env
- enable_save_replay(replay_path: Union[List[str], str]) -> None
- Overview:
Set each env’s replay save path.
- Arguments:
replay_path (Union[List[str], str]): A list of paths, one per environment, or a single path shared by all environments.
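The two accepted forms of `replay_path` can be illustrated with a small sketch. The helper `normalize_replay_path` and its `env_num` parameter are hypothetical and only show one plausible way to expand a single path to all environments; this is not the library's implementation.

```python
from typing import List, Union


def normalize_replay_path(replay_path: Union[List[str], str], env_num: int) -> List[str]:
    """Hypothetical helper: return exactly one replay path per environment."""
    if isinstance(replay_path, str):
        # A single path is shared by all environments.
        return [replay_path] * env_num
    paths = list(replay_path)
    if len(paths) != env_num:
        raise ValueError(f"expected {env_num} replay paths, got {len(paths)}")
    return paths


shared = normalize_replay_path("./replay", env_num=3)
per_env = normalize_replay_path(["./r0", "./r1", "./r2"], env_num=3)
```

Rejecting a list of the wrong length is a design choice of this sketch, so that a mismatch is reported before any environment starts recording.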
- env_info() -> collections.namedtuple
- Overview:
Get one env’s info, for example, action space, observation space, reward space, etc.
- Returns:
info (namedtuple): Usually a BaseEnvInfo namedtuple, each element of which is an EnvElementInfo.
- launch(reset_param: Optional[Dict] = None) -> None
- Overview:
Set up the environments and their parameters.
- Arguments:
reset_param (Optional[Dict]): Dict of reset parameters for each environment; each key is an env_id and each value is the corresponding reset parameters.
- property ready_obs: Dict[int, Any]
- Overview:
Get the next observations (as np.ndarray) and their corresponding env ids.
- Return:
A dictionary with observations and their environment IDs.
- Example:
>>> obs_dict = env_manager.ready_obs
>>> actions_dict = {env_id: model.forward(obs) for env_id, obs in obs_dict.items()}
- reset(reset_param: Optional[Dict] = None) -> None
- Overview:
Reset the environments and their parameters.
- Arguments:
reset_param (Optional[Dict]): Dict of reset parameters for each environment; each key is an env_id and each value is the corresponding reset parameters.
- seed(seed: Union[Dict[int, int], List[int], int], dynamic_seed: Optional[bool] = None) -> None
- Overview:
Set the seed for each environment.
- Arguments:
seed (Union[Dict[int, int], List[int], int]): A list of seeds, one per environment, or a single seed for the first environment, from which the other seeds are generated automatically.
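A rough sketch of how the three accepted `seed` forms could be brought to a common shape follows. The helper `expand_seed` is hypothetical, and so is the rule of deriving the remaining seeds by consecutive offsets: the text above only says they are generated automatically.

```python
from typing import Dict, List, Union


def expand_seed(seed: Union[Dict[int, int], List[int], int], env_num: int) -> Dict[int, int]:
    """Hypothetical helper: map every env_id to a seed."""
    if isinstance(seed, int):
        # Assumption: the first env gets `seed`, the others get consecutive offsets.
        return {env_id: seed + env_id for env_id in range(env_num)}
    if isinstance(seed, dict):
        # Already keyed by env_id; copy to avoid mutating the caller's dict.
        return dict(seed)
    seeds = list(seed)
    if len(seeds) != env_num:
        raise ValueError(f"expected {env_num} seeds, got {len(seeds)}")
    return dict(enumerate(seeds))


from_int = expand_seed(7, env_num=3)
from_list = expand_seed([11, 22], env_num=2)
```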
- step(actions: Dict[int, Any]) -> Dict[int, collections.namedtuple]
- Overview:
Step all environments. Reset an env if done.
- Arguments:
actions (Dict[int, Any]): {env_id: action}
- Returns:
timesteps (Dict[int, namedtuple]): {env_id: timestep}. Each timestep is a BaseEnvTimestep tuple with observation, reward, done, env_info.
- Example:
>>> actions_dict = {env_id: model.forward(obs) for env_id, obs in obs_dict.items()}
>>> timesteps = env_manager.step(actions_dict)
>>> for env_id, timestep in timesteps.items():
>>>     pass
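The `ready_obs` and `step` interfaces are meant to be used together in a loop. The sketch below shows that loop against a toy, in-process stand-in so that it runs without the library. `ToyEnvManager`, `ToyTimestep`, the counting environment, and the constant action are all hypothetical; only the dict-in, dict-out shape and the reset-when-done behaviour follow the description above.

```python
from collections import namedtuple
from typing import Any, Dict

# Hypothetical stand-in for BaseEnvTimestep (observation, reward, done, env info).
ToyTimestep = namedtuple("ToyTimestep", ["obs", "reward", "done", "info"])


class ToyEnvManager:
    """Hypothetical manager: each env counts up to `horizon`, then resets."""

    def __init__(self, env_num: int, horizon: int = 3) -> None:
        self._horizon = horizon
        self._state = {env_id: 0 for env_id in range(env_num)}

    @property
    def ready_obs(self) -> Dict[int, Any]:
        # {env_id: observation} for every env that can accept an action.
        return dict(self._state)

    def step(self, actions: Dict[int, Any]) -> Dict[int, ToyTimestep]:
        timesteps = {}
        for env_id, action in actions.items():
            next_state = self._state[env_id] + action
            done = next_state >= self._horizon
            timesteps[env_id] = ToyTimestep(next_state, float(action), done, {})
            # Reset an env as soon as it is done.
            self._state[env_id] = 0 if done else next_state
        return timesteps


def run(manager: ToyEnvManager, num_steps: int) -> int:
    """Drive the manager and count finished episodes."""
    finished = 0
    for _ in range(num_steps):
        obs_dict = manager.ready_obs
        # A constant action stands in for model.forward(obs).
        actions = {env_id: 1 for env_id in obs_dict}
        timesteps = manager.step(actions)
        finished += sum(t.done for t in timesteps.values())
    return finished


episodes = run(ToyEnvManager(env_num=2), num_steps=6)
```

Keying both actions and timesteps by env_id lets the caller act on any subset of environments, which matters once some environments are slower than others.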
create_env_manager
- ding.envs.env_manager.base_env_manager.create_env_manager(manager_cfg: dict, env_fn: List[Callable]) -> ding.envs.env_manager.base_env_manager.BaseEnvManager
- Overview:
Create an env manager according to manager cfg and env function.
- Arguments:
manager_cfg (EasyDict): Env manager config.
env_fn (List[Callable]): A list of the envs’ creation functions.
- ArgumentsKeys:
Necessary keys in manager_cfg: type
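One common way to build an object from a config whose only required key is `type` is a name-to-class registry. The sketch below illustrates that pattern under stated assumptions: `MANAGER_REGISTRY`, `register`, `ToyManager`, `create_toy_manager`, and the type name `"toy_base"` are all hypothetical and are not taken from the library.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a `type` string to a manager class.
MANAGER_REGISTRY: Dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    """Class decorator that records a manager class under `name`."""
    def decorator(cls: type) -> type:
        MANAGER_REGISTRY[name] = cls
        return cls
    return decorator


@register("toy_base")
class ToyManager:
    """Hypothetical manager that only stores its inputs."""

    def __init__(self, env_fn: List[Callable], cfg: dict) -> None:
        self.env_fn = env_fn
        self.cfg = cfg
        self.env_num = len(env_fn)


def create_toy_manager(manager_cfg: dict, env_fn: List[Callable]) -> ToyManager:
    # `type` is the one key the config must provide.
    if "type" not in manager_cfg:
        raise KeyError("manager_cfg must contain the key 'type'")
    manager_cls = MANAGER_REGISTRY[manager_cfg["type"]]
    return manager_cls(env_fn, manager_cfg)


manager = create_toy_manager({"type": "toy_base"}, [lambda: "env0", lambda: "env1"])
```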
subprocess_env_manager
ShmBuffer
- class ding.envs.env_manager.subprocess_env_manager.ShmBuffer(dtype: numpy.generic, shape: Tuple[int])
- Overview:
Shared memory buffer to store numpy array.
ShmBufferContainer
- class ding.envs.env_manager.subprocess_env_manager.ShmBufferContainer(dtype: numpy.generic, shape: Union[Dict[Any, tuple], tuple])
- Overview:
Support multiple shared memory buffers. Each key-value is name-buffer.
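The idea behind both classes can be sketched with `multiprocessing.Array` and `numpy.frombuffer`. `ToyShmBuffer` and `ToyShmBufferContainer`, together with their `fill` and `get` methods, are hypothetical names chosen for illustration, and locking is left out to keep the sketch short; treat this as one possible approach, not the library's code.

```python
from multiprocessing import Array
from typing import Any, Dict, Tuple, Union

import numpy as np


class ToyShmBuffer:
    """Hypothetical fixed-shape numpy array backed by shared memory."""

    def __init__(self, dtype: Any, shape: Tuple[int, ...]) -> None:
        self.dtype = np.dtype(dtype)
        self.shape = shape
        ctype = np.ctypeslib.as_ctypes_type(self.dtype)
        # lock=False returns a raw shared array; no synchronization in this sketch.
        self._raw = Array(ctype, int(np.prod(shape)), lock=False)

    def _view(self) -> np.ndarray:
        # A numpy view over the shared memory, without copying.
        return np.frombuffer(self._raw, dtype=self.dtype).reshape(self.shape)

    def fill(self, src: np.ndarray) -> None:
        np.copyto(self._view(), src)

    def get(self) -> np.ndarray:
        # Return a copy so later writes do not change the caller's array.
        return self._view().copy()


class ToyShmBufferContainer:
    """Hypothetical container: one buffer, or a dict of name -> buffer."""

    def __init__(self, dtype: Any, shape: Union[Dict[Any, tuple], tuple]) -> None:
        if isinstance(shape, dict):
            self._data = {k: ToyShmBufferContainer(dtype, v) for k, v in shape.items()}
        else:
            self._data = ToyShmBuffer(dtype, shape)

    def fill(self, src: Any) -> None:
        if isinstance(self._data, dict):
            for key, buf in self._data.items():
                buf.fill(src[key])
        else:
            self._data.fill(src)

    def get(self) -> Any:
        if isinstance(self._data, dict):
            return {key: buf.get() for key, buf in self._data.items()}
        return self._data.get()


container = ToyShmBufferContainer(np.float32, {"obs": (2, 3), "mask": (2,)})
container.fill({
    "obs": np.arange(6, dtype=np.float32).reshape(2, 3),
    "mask": np.ones(2, dtype=np.float32),
})
snapshot = container.get()
```

Writing observations into preallocated shared memory avoids pickling large arrays every time a subprocess reports a step.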
SyncSubprocessEnvManager
- class ding.envs.env_manager.subprocess_env_manager.SyncSubprocessEnvManager(env_fn: List[Callable], cfg: easydict.EasyDict = {})
- close() -> None
- Overview:
Close the env manager and release all related resources.
- enable_save_replay(replay_path: Union[List[str], str]) -> None
- Overview:
Set each env’s replay save path.
- Arguments:
replay_path (Union[List[str], str]): A list of paths, one per environment, or a single path shared by all environments.
- env_info() -> collections.namedtuple
- Overview:
Get one env’s info, for example, action space, observation space, reward space, etc.
- Returns:
info (namedtuple): Usually a BaseEnvInfo namedtuple, each element of which is an EnvElementInfo.
- launch(reset_param: Optional[Dict] = None) -> None
- Overview:
Set up the environments and their parameters.
- Arguments:
reset_param (Optional[Dict]): Dict of reset parameters for each environment; each key is an env_id and each value is the corresponding reset parameters.
- property ready_obs: Dict[int, Any]
- Overview:
Get the next observations.
- Return:
A dictionary with observations and their environment IDs.
- Note:
The observations are returned in np.ndarray.
- Example:
>>> obs_dict = env_manager.ready_obs
>>> actions_dict = {env_id: model.forward(obs) for env_id, obs in obs_dict.items()}
- reset(reset_param: Optional[Dict] = None) -> None
- Overview:
Reset the environments and their parameters.
- Arguments:
reset_param (Optional[Dict]): Dict of reset parameters for each environment; each key is an env_id and each value is the corresponding reset parameters.
- seed(seed: Union[Dict[int, int], List[int], int], dynamic_seed: Optional[bool] = None) -> None
- Overview:
Set the seed for each environment.
- Arguments:
seed (Union[Dict[int, int], List[int], int]): A list of seeds, one per environment, or a single seed for the first environment, from which the other seeds are generated automatically.
- step(actions: Dict[int, Any]) -> Dict[int, collections.namedtuple]
- Overview:
Step all environments. Reset an env if done.
- Arguments:
actions (Dict[int, Any]): {env_id: action}
- Returns:
timesteps (Dict[int, namedtuple]): {env_id: timestep}. Each timestep is a BaseEnvTimestep tuple with observation, reward, done, env_info.
- Example:
>>> actions_dict = {env_id: model.forward(obs) for env_id, obs in obs_dict.items()}
>>> timesteps = env_manager.step(actions_dict)
>>> for env_id, timestep in timesteps.items():
>>>     pass
Note
The env_id that appears in actions will also be returned in timesteps.
Each environment is run by a separate subprocess. Once an environment is done, it is reset immediately.
AsyncSubprocessEnvManager
- class ding.envs.env_manager.subprocess_env_manager.AsyncSubprocessEnvManager(env_fn: List[Callable], cfg: easydict.EasyDict = {})
- Overview:
Create an AsyncSubprocessEnvManager to manage multiple environments. Each environment is run by its own subprocess.
- Interfaces:
seed, launch, ready_obs, step, reset, env_info, active_env
- enable_save_replay(replay_path: Union[List[str], str]) -> None
- Overview:
Set each env’s replay save path.
- Arguments:
replay_path (Union[List[str], str]): A list of paths, one per environment, or a single path shared by all environments.
- env_info() -> collections.namedtuple
- Overview:
Get one env’s info, for example, action space, observation space, reward space, etc.
- Returns:
info (namedtuple): Usually a BaseEnvInfo namedtuple, each element of which is an EnvElementInfo.
- launch(reset_param: Optional[Dict] = None) -> None
- Overview:
Set up the environments and their parameters.
- Arguments:
reset_param (Optional[Dict]): Dict of reset parameters for each environment; each key is an env_id and each value is the corresponding reset parameters.
- property ready_obs: Dict[int, Any]
- Overview:
Get the next observations.
- Return:
A dictionary with observations and their environment IDs.
- Note:
The observations are returned in np.ndarray.
- Example:
>>> obs_dict = env_manager.ready_obs
>>> actions_dict = {env_id: model.forward(obs) for env_id, obs in obs_dict.items()}
- reset(reset_param: Optional[Dict] = None) -> None
- Overview:
Reset the environments and their parameters.
- Arguments:
reset_param (Optional[Dict]): Dict of reset parameters for each environment; each key is an env_id and each value is the corresponding reset parameters.
- seed(seed: Union[Dict[int, int], List[int], int], dynamic_seed: Optional[bool] = None) -> None
- Overview:
Set the seed for each environment.
- Arguments:
seed (Union[Dict[int, int], List[int], int]): A list of seeds, one per environment, or a single seed for the first environment, from which the other seeds are generated automatically.
- step(actions: Dict[int, Any]) -> Dict[int, collections.namedtuple]
- Overview:
Step all environments. Reset an env if done.
- Arguments:
actions (Dict[int, Any]): {env_id: action}
- Returns:
timesteps (Dict[int, namedtuple]): {env_id: timestep}. Each timestep is a BaseEnvTimestep tuple with observation, reward, done, env_info.
- Example:
>>> actions_dict = {env_id: model.forward(obs) for env_id, obs in obs_dict.items()}
>>> timesteps = env_manager.step(actions_dict)
>>> for env_id, timestep in timesteps.items():
>>>     pass