common.head

Please refer to ding/ding/docs/source/api_doc/model/common/head.py for usage.

head_cls_map = {
    # discrete
    'discrete': DiscreteHead,
    'dueling': DuelingHead,
    'distribution': DistributionHead,
    'rainbow': RainbowHead,
    'qrdqn': QRDQNHead,
    'quantile': QuantileHead,
    # continuous
    'regression': RegressionHead,
    'reparameterization': ReparameterizationHead,
    # multi
    'multi': MultiHead,
}
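A registry like this is typically used to pick a head class from a configuration string. The sketch below shows that lookup pattern with two hypothetical stand-in classes (`StubDiscreteHead`, `StubDuelingHead`) instead of the real heads, so it runs without `ding` installed; `build_head` and `stub_head_cls_map` are illustrative names, not part of the library.

```python
from typing import Dict, Type


class StubDiscreteHead:
    """Hypothetical stand-in for a head class; it only stores its sizes."""

    def __init__(self, hidden_size: int, output_size: int) -> None:
        self.hidden_size = hidden_size
        self.output_size = output_size


class StubDuelingHead(StubDiscreteHead):
    """Second hypothetical stand-in, used to show that the key selects the class."""


# Same shape as head_cls_map: a string key mapped to a class object.
stub_head_cls_map: Dict[str, Type[StubDiscreteHead]] = {
    'discrete': StubDiscreteHead,
    'dueling': StubDuelingHead,
}


def build_head(head_type: str, hidden_size: int, output_size: int) -> StubDiscreteHead:
    """Illustrative helper: look up the class by name and instantiate it."""
    if head_type not in stub_head_cls_map:
        raise KeyError(f"unknown head type: {head_type!r}")
    return stub_head_cls_map[head_type](hidden_size, output_size)


head = build_head('dueling', 64, 6)
print(type(head).__name__, head.hidden_size, head.output_size)
```

Keeping the mapping in one dict means a new head type only needs one new entry, and an unknown key fails early with a clear error.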

DiscreteHead

class ding.model.common.head.DiscreteHead(hidden_size: int, output_size: int, layer_num: int = 1, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)

__init__(hidden_size: int, output_size: int, layer_num: int = 1, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) → None

Overview:

Init the Head according to arguments.

Arguments:

hidden_size (int): The hidden size of the embedding passed to DiscreteHead
output_size (int): The number of outputs
layer_num (int): The number of layers used in the network to compute the Q value output
activation (nn.Module):
The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()
norm_type (str):
The type of normalization to use, see ding.torch_utils.fc_block for more details
noise (bool): Whether to use NoiseLinearLayer as layer_fn in the Q network’s MLP

forward(x: torch.Tensor) → Dict

Overview:

Use the encoded embedding tensor to predict discrete output by running it through DiscreteHead’s MLP.

Arguments:

x (torch.Tensor):
The encoded embedding tensor of size (B, N), where B is the batch size and N = hidden_size.

Returns:

outputs (Dict):
The prediction dictionary produced by DiscreteHead’s MLP.
Necessary Keys:
logit (torch.Tensor): Logit tensor of size (B, M), where M = output_size; it has the same size as input x only when output_size equals hidden_size.

Examples:

>>> head = DiscreteHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict) and outputs['logit'].shape == torch.Size([4, 64])
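The example above uses equal hidden and output sizes, which hides how the two differ. The following sketch uses a hypothetical pure-Python `linear` helper in place of the head's final layer (an assumption for illustration; the real head runs an MLP on tensors) to show a batch of B embeddings of width hidden_size mapped to B rows of width output_size.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def linear(x: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    """Hypothetical stand-in for a fully connected layer: y = x @ weight^T + bias."""
    return [
        [sum(xi * wi for xi, wi in zip(row, w_row)) + b for w_row, b in zip(weight, bias)]
        for row in x
    ]


def shape(m: Matrix) -> Tuple[int, int]:
    """Return (rows, columns) of a non-empty nested list."""
    return (len(m), len(m[0]))


random.seed(0)
batch_size, hidden_size, output_size = 4, 64, 6
x = [[random.gauss(0.0, 1.0) for _ in range(hidden_size)] for _ in range(batch_size)]
weight = [[random.gauss(0.0, 0.1) for _ in range(hidden_size)] for _ in range(output_size)]
bias = [0.0] * output_size

logit = linear(x, weight, bias)
print(shape(x), '->', shape(logit))  # (4, 64) -> (4, 6)
```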

DistributionHead

class ding.model.common.head.DistributionHead(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = -10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False, eps: Optional[float] = 1e-06)

__init__(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = -10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False, eps: Optional[float] = 1e-06) → None

Overview:

Init the Head according to arguments.

Arguments:

hidden_size (int): The hidden size of the embedding passed to DistributionHead
output_size (int): The number of outputs
layer_num (int): The number of layers used in the network to compute the Q value output
activation (nn.Module):
The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()
norm_type (str):
The type of normalization to use, see ding.torch_utils.fc_block for more details
noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor) → Dict

Overview:

Use the encoded embedding tensor to predict Distribution output by running it through DistributionHead’s MLP.

Arguments:

x (torch.Tensor):
The encoded embedding tensor of size (B, N), where B is the batch size and N = hidden_size.

Returns:

outputs (Dict):
The prediction dictionary produced by DistributionHead’s MLP.
Necessary Keys:
logit (torch.Tensor): Logit tensor of size (B, M), where M = output_size; it has the same size as input x only when output_size equals hidden_size.

distribution (torch.Tensor): Distribution tensor of size (B, M, n_atom)

Examples:

>>> head = DistributionHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default n_atom is 51
>>> assert outputs['distribution'].shape == torch.Size([4, 64, 51])
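In categorical distributional RL (the C51 formulation), a scalar value is usually recovered as the expectation of the atom distribution over a fixed support running from v_min to v_max. The sketch below assumes that convention; the source does not state how DistributionHead derives logit from distribution, so `make_support` and `expected_value` are hypothetical helpers on plain lists, not library functions.

```python
import math
from typing import List


def make_support(v_min: float, v_max: float, n_atom: int) -> List[float]:
    """Evenly spaced atom values from v_min to v_max inclusive."""
    step = (v_max - v_min) / (n_atom - 1)
    return [v_min + i * step for i in range(n_atom)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_value(probs: List[float], support: List[float]) -> float:
    """Expectation of the support under the given atom probabilities."""
    return sum(p * z for p, z in zip(probs, support))


# Defaults from the signature above: v_min=-10, v_max=10, n_atom=51.
support = make_support(-10.0, 10.0, 51)
uniform = softmax([0.0] * 51)  # equal scores give a uniform distribution
value = expected_value(uniform, support)
print(round(value, 6))
```

With a uniform distribution over a symmetric support the expectation is numerically zero; shifting probability mass toward the upper atoms raises it toward v_max.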

RainbowHead

class ding.model.common.head.RainbowHead(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = -10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = True, eps: Optional[float] = 1e-06)

__init__(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = -10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = True, eps: Optional[float] = 1e-06) → None

Overview:

Init the Head according to arguments.

Arguments:

hidden_size (int): The hidden size of the embedding passed to RainbowHead
output_size (int): The number of outputs
layer_num (int): The number of layers used in the network to compute the Q value output
activation (nn.Module):
The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()
norm_type (str):
The type of normalization to use, see ding.torch_utils.fc_block for more details
noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor) → Dict

Overview:

Use the encoded embedding tensor to predict Rainbow output by running it through RainbowHead’s MLP.

Arguments:

x (torch.Tensor):
The encoded embedding tensor of size (B, N), where B is the batch size and N = hidden_size.

Returns:

outputs (Dict):
The prediction dictionary produced by RainbowHead’s MLP.
Necessary Keys:
logit (torch.Tensor): Logit tensor of size (B, M), where M = output_size; it has the same size as input x only when output_size equals hidden_size.

distribution (torch.Tensor): Distribution tensor of size (B, M, n_atom)

Examples:

>>> head = RainbowHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default n_atom is 51
>>> assert outputs['distribution'].shape == torch.Size([4, 64, 51])

QRDQNHead

class ding.model.common.head.QRDQNHead(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)

__init__(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) → None

Overview:

Init the Head according to arguments.

Arguments:

hidden_size (int): The hidden size of the embedding passed to QRDQNHead
output_size (int): The number of outputs
layer_num (int): The number of layers used in the network to compute the Q value output
activation (nn.Module):
The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()
norm_type (str):
The type of normalization to use, see ding.torch_utils.fc_block for more details
noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor) → Dict

Overview:

Use the encoded embedding tensor to predict QRDQN output by running it through QRDQNHead’s MLP.

Arguments:

x (torch.Tensor):
The encoded embedding tensor of size (B, N), where B is the batch size and N = hidden_size.

Returns:

outputs (Dict):
The prediction dictionary produced by QRDQNHead’s MLP.
Necessary Keys:
logit (torch.Tensor): Logit tensor of size (B, M), where M = output_size; it has the same size as input x only when output_size equals hidden_size.

q (torch.Tensor): Q value tensor of size (B, M, num_quantiles)

tau (torch.Tensor): tau tensor of size (B, num_quantiles, 1)

Examples:

>>> head = QRDQNHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default num_quantiles is 32
>>> assert outputs['q'].shape == torch.Size([4, 64, 32])
>>> assert outputs['tau'].shape == torch.Size([4, 32, 1])
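The tau shape asserted in the example, one entry per quantile for every batch element, is consistent with a fixed set of quantile fractions shared across the batch. A minimal sketch, assuming the midpoint fractions tau_i = (2i + 1) / (2N) used in quantile regression DQN and a scalar estimate taken as the mean over quantiles; both are assumptions about the convention rather than statements about the library's code, and the helper names are hypothetical.

```python
from typing import List


def quantile_midpoints(num_quantiles: int) -> List[float]:
    """Midpoints of num_quantiles equal-width bins on [0, 1]."""
    return [(2 * i + 1) / (2 * num_quantiles) for i in range(num_quantiles)]


def mean_over_quantiles(q: List[float]) -> float:
    """Collapse the per-quantile values of one action into a scalar estimate."""
    return sum(q) / len(q)


tau = quantile_midpoints(4)
print(tau)  # [0.125, 0.375, 0.625, 0.875]

q_for_one_action = [1.0, 2.0, 3.0, 4.0]
print(mean_over_quantiles(q_for_one_action))  # 2.5
```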

QuantileHead

class ding.model.common.head.QuantileHead(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, quantile_embedding_size: int = 128, beta_function_type: Optional[str] = 'uniform', activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)

__init__(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, quantile_embedding_size: int = 128, beta_function_type: Optional[str] = 'uniform', activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) → None

Overview:

Init the Head according to arguments.

Arguments:

hidden_size (int): The hidden size of the embedding passed to QuantileHead
output_size (int): The number of outputs
layer_num (int): The number of layers used in the network to compute the Q value output
activation (nn.Module):
The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()
norm_type (str):
The type of normalization to use, see ding.torch_utils.fc_block for more details
noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor, num_quantiles: Optional[int] = None) → Dict

Overview:

Use the encoded embedding tensor to predict Quantile output by running it through QuantileHead’s MLP.

Arguments:

x (torch.Tensor):
The encoded embedding tensor of size (B, N), where B is the batch size and N = hidden_size.
num_quantiles (Optional[int]):
The number of quantiles; when left as None, the example below yields the constructor default of 32.

Returns:

outputs (Dict):
The prediction dictionary produced by QuantileHead’s MLP.
Necessary Keys:
logit (torch.Tensor): Logit tensor of size (B, M), where M = output_size; it has the same size as input x only when output_size equals hidden_size.

q (torch.Tensor): Q value tensor of size (num_quantiles, B, M)

quantiles (torch.Tensor): quantiles tensor of size (quantile_embedding_size, 1)

Examples:

>>> head = QuantileHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default num_quantiles is 32
>>> assert outputs['q'].shape == torch.Size([32, 4, 64])
>>> assert outputs['quantiles'].shape == torch.Size([128, 1])

quantile_net(quantiles: torch.Tensor) → torch.Tensor

Overview:

Deterministic parametric function trained to reparameterize samples from a base distribution. By repeated Bellman update iterations of Q-learning, the optimal action-value function is estimated.

Arguments:

quantiles (torch.Tensor): The encoded embedding tensor of parametric samples

Returns:

(torch.Tensor):
QN output tensor after reparameterization of shape (quantile_embedding_size, output_size)

Examples:

>>> head = QuantileHead(64, 64)
>>> quantiles = torch.randn(128,1)
>>> qn_output = head.quantile_net(quantiles)
>>> assert isinstance(qn_output, torch.Tensor)
>>> # default quantile_embedding_size: int = 128,
>>> assert qn_output.shape == torch.Size([128, 64])
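Implicit quantile networks commonly embed each sampled quantile fraction with a bank of cosine features before a learned layer maps them to the hidden width. The sketch below shows only that cosine step, under the assumption that the features are cos(pi * i * tau) for i = 1..embedding_size; the index range and the helper name `cosine_embedding` are illustrative and not taken from the library.

```python
import math
from typing import List


def cosine_embedding(quantiles: List[float], embedding_size: int) -> List[List[float]]:
    """Return one row of cos(pi * i * tau) features per quantile fraction tau."""
    return [
        [math.cos(math.pi * i * tau) for i in range(1, embedding_size + 1)]
        for tau in quantiles
    ]


# Three quantile fractions, four cosine features each.
features = cosine_embedding([0.0, 0.5, 1.0], embedding_size=4)
print(len(features), len(features[0]))  # 3 4
```

Each quantile fraction becomes a fixed-length feature row, so a following linear layer can map any number of sampled fractions to the same width.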

DuelingHead

class ding.model.common.head.DuelingHead(hidden_size: int, output_size: int, layer_num: int = 1, a_layer_num: Optional[int] = None, v_layer_num: Optional[int] = None, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)

__init__(hidden_size: int, output_size: int, layer_num: int = 1, a_layer_num: Optional[int] = None, v_layer_num: Optional[int] = None, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) → None

Overview:

Init the Head according to arguments.

Arguments:

hidden_size (int): The hidden size of the embedding passed to DuelingHead
output_size (int): The number of outputs
a_layer_num (int): The number of layers used in the network to compute the action output
v_layer_num (int): The number of layers used in the network to compute the value output
activation (nn.Module):
The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()
norm_type (str):
The type of normalization to use, see ding.torch_utils.fc_block for more details
noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor) → Dict

Overview:

Use the encoded embedding tensor to predict Dueling output by running it through DuelingHead’s MLP.

Arguments:

x (torch.Tensor):
The encoded embedding tensor of size (B, N), where B is the batch size and N = hidden_size.

Returns:

outputs (Dict):
The prediction dictionary produced by DuelingHead’s MLP.
Necessary Keys:
logit (torch.Tensor): Logit tensor of size (B, M), where M = output_size; it has the same size as input x only when output_size equals hidden_size.

Examples:

>>> head = DuelingHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
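DuelingHead takes separate layer counts for an action branch (a_layer_num) and a value branch (v_layer_num). In the dueling architecture the two branches are usually merged as Q = V + A - mean(A); the sketch below assumes that aggregation, which the source does not spell out, and uses a hypothetical `dueling_aggregate` helper on plain lists.

```python
from typing import List


def dueling_aggregate(value: float, advantage: List[float]) -> List[float]:
    """Combine a scalar state value with per-action advantages: V + A - mean(A)."""
    mean_adv = sum(advantage) / len(advantage)
    return [value + a - mean_adv for a in advantage]


q_values = dueling_aggregate(value=2.0, advantage=[1.0, 2.0, 3.0])
print(q_values)  # [1.0, 2.0, 3.0]
```

Subtracting the mean advantage makes the decomposition identifiable: the average of the resulting Q values equals the state value.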

RegressionHead

class ding.model.common.head.RegressionHead(hidden_size: int, output_size: int, layer_num: int = 2, final_tanh: Optional[bool] = False, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None)

__init__(hidden_size: int, output_size: int, layer_num: int = 2, final_tanh: Optional[bool] = False, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None) → None

Overview:

Init the Head according to arguments.

Arguments:

hidden_size (int): The hidden size of the embedding passed to RegressionHead
output_size (int): The number of outputs
final_tanh (Optional[bool]): Whether a final tanh layer is needed
layer_num (int): The number of layers used in the network to compute the Q value output
activation (nn.Module):
The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()
norm_type (str):
The type of normalization to use, see ding.torch_utils.fc_block for more details

forward(x: torch.Tensor) → Dict

Overview:

Use the encoded embedding tensor to predict Regression output by running it through RegressionHead’s MLP.

Arguments:

x (torch.Tensor):
The encoded embedding tensor of size (B, N), where B is the batch size and N = hidden_size.

Returns:

outputs (Dict):
The prediction dictionary produced by RegressionHead’s MLP.
Necessary Keys:
pred (torch.Tensor): Tensor of predicted values; it has the same size as input x when output_size equals hidden_size, as in the example below.

Examples:

>>> head = RegressionHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['pred'].shape == torch.Size([4, 64])
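The final_tanh flag suggests an optional tanh applied to the prediction, which bounds every output to the open interval (-1, 1). A minimal sketch of that effect, with a hypothetical `maybe_tanh` helper standing in for the head's last step:

```python
import math
from typing import List


def maybe_tanh(pred: List[float], final_tanh: bool) -> List[float]:
    """Apply tanh element-wise when final_tanh is True, else return a copy of pred."""
    return [math.tanh(p) for p in pred] if final_tanh else list(pred)


raw = [-3.0, 0.0, 0.5, 10.0]
bounded = maybe_tanh(raw, final_tanh=True)
print(all(-1.0 < b < 1.0 for b in bounded))  # True
```

This kind of bounding is convenient when the regression target, such as a normalized continuous action, is known to lie in a fixed range.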

ReparameterizationHead

class ding.model.common.head.ReparameterizationHead(hidden_size: int, output_size: int, layer_num: int = 2, sigma_type: Optional[str] = None, fixed_sigma_value: Optional[float] = 1.0, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, bound_type: Optional[str] = None)

__init__(hidden_size: int, output_size: int, layer_num: int = 2, sigma_type: Optional[str] = None, fixed_sigma_value: Optional[float] = 1.0, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, bound_type: Optional[str] = None) → None

Overview:

Init the Head according to arguments.

Arguments:

hidden_size (int): The hidden size of the embedding passed to ReparameterizationHead
output_size (int): The number of outputs
layer_num (int): The number of layers used in the network to compute the Q value output
sigma_type (Optional[str]): Sigma type, one of ['fixed', 'independent', 'conditioned']
fixed_sigma_value (Optional[float]):
When the fixed type is chosen, the tensor output['sigma'] is filled with this value.
activation (nn.Module):
The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()
norm_type (str):
The type of normalization to use, see ding.torch_utils.fc_block for more details

forward(x: torch.Tensor) → Dict

Overview:

Use the encoded embedding tensor to predict Reparameterization output by running it through ReparameterizationHead’s MLP.

Arguments:

x (torch.Tensor):
The encoded embedding tensor of size (B, N), where B is the batch size and N = hidden_size.

Returns:

outputs (Dict):
The prediction dictionary produced by ReparameterizationHead’s MLP.
Necessary Keys:
mu (torch.Tensor): Tensor of mu values of size (B, M), where M = output_size; it has the same size as x only when output_size equals hidden_size.

sigma (torch.Tensor): Tensor of sigma values, with the same size as mu.

Examples:

>>> head = ReparameterizationHead(64, 64, sigma_type='fixed')
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['mu'].shape == torch.Size([4, 64])
>>> assert outputs['sigma'].shape == torch.Size([4, 64])
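The mu and sigma outputs are the parameters needed for the reparameterization trick: a sample is written as mu + sigma * eps with eps drawn from a standard normal, so the randomness is kept separate from the predicted parameters. The sketch below assumes a diagonal Gaussian and uses a hypothetical `reparameterize` helper on plain lists; the constant sigma mirrors what the fixed sigma_type with fixed_sigma_value = 1.0 would provide.

```python
import random
from typing import List


def reparameterize(mu: List[float], sigma: List[float], rng: random.Random) -> List[float]:
    """Draw one sample per dimension as mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


rng = random.Random(0)
mu = [0.0, 1.0, -2.0]
sigma = [1.0, 1.0, 1.0]  # e.g. a fixed sigma of 1.0 in every dimension
action = reparameterize(mu, sigma, rng)
print(len(action))  # 3

# With sigma = 0 the sample collapses to mu: all noise enters through eps.
assert reparameterize(mu, [0.0, 0.0, 0.0], rng) == mu
```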

MultiHead

class ding.model.common.head.MultiHead(head_cls: type, hidden_size: int, output_size_list: ding.utils.type_helper.SequenceType, **head_kwargs)

__init__(head_cls: type, hidden_size: int, output_size_list: ding.utils.type_helper.SequenceType, **head_kwargs) → None

Overview:

Init the MultiHead according to arguments.

Arguments:

head_cls (type):
The class of head, like DuelingHead, DistributionHead, QuantileHead, etc.
hidden_size (int): The hidden size of the embedding passed to each head
output_size_list (SequenceType):
The collection of output_size values, e.g. [2, 3, 5] for a multi-discrete action
head_kwargs (dict): Class-specific arguments

forward(x: torch.Tensor) → Dict

Overview:

Use encoded embedding tensor to predict multi discrete output

Arguments:

x (torch.Tensor): The encoded embedding tensor, usually with shape (B, N)

Returns:

outputs (Dict):
Prediction output dict
Necessary Keys:
logit (torch.Tensor):
A sequence of logit tensors, one per entry of output_size_list, each accessed as ['logit'][i]. Given output_size_list == [o1, o2, o3, ...], ['logit'][i] is of size (B, oi).

Examples:

>>> head = MultiHead(DuelingHead, 64, [2, 3, 5], v_layer_num=2)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> # output_size_list is [2, 3, 5] as set
>>> # Therefore each dim of logit is as follows
>>> outputs['logit'][0].shape
torch.Size([4, 2])
>>> outputs['logit'][1].shape
torch.Size([4, 3])
>>> outputs['logit'][2].shape
torch.Size([4, 5])
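The list-valued logit can be pictured as one independent branch per entry of output_size_list, all fed the same embedding. The sketch below mimics only the resulting shapes with a hypothetical `branch_logits` helper that emits zeros; it does not call the real head.

```python
from typing import List, Sequence, Tuple


def branch_logits(batch_size: int, output_size_list: Sequence[int]) -> List[List[List[float]]]:
    """Return one zero-filled (batch_size, o_i) logit matrix per branch."""
    return [[[0.0] * o for _ in range(batch_size)] for o in output_size_list]


def shapes(logits: List[List[List[float]]]) -> List[Tuple[int, int]]:
    """Return the (rows, columns) shape of every branch."""
    return [(len(m), len(m[0])) for m in logits]


logit = branch_logits(batch_size=4, output_size_list=[2, 3, 5])
print(shapes(logit))  # [(4, 2), (4, 3), (4, 5)]
```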