common.head

Please refer to ding/ding/docs/source/api_doc/model/common/head.py for usage.

head_cls_map = {
    # discrete
    'discrete': DiscreteHead,
    'dueling': DuelingHead,
    'distribution': DistributionHead,
    'rainbow': RainbowHead,
    'qrdqn': QRDQNHead,
    'quantile': QuantileHead,
    # continuous
    'regression': RegressionHead,
    'reparameterization': ReparameterizationHead,
    # multi
    'multi': MultiHead,
}
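As a minimal sketch of how such a mapping is typically consumed, the snippet below resolves a string key to a class and instantiates it. The three classes are empty stand-ins and `get_head_cls` is a hypothetical helper introduced for illustration; neither is part of ding.

```python
# Stand-in classes: hypothetical placeholders, not the real ding heads.
class DiscreteHead:
    pass


class DuelingHead:
    pass


class RegressionHead:
    pass


# A reduced copy of the mapping above, limited to the stand-ins.
head_cls_map = {
    'discrete': DiscreteHead,
    'dueling': DuelingHead,
    'regression': RegressionHead,
}


def get_head_cls(name):
    """Return the head class registered under `name` (hypothetical helper)."""
    try:
        return head_cls_map[name]
    except KeyError:
        known = sorted(head_cls_map)
        raise KeyError(f"unknown head type {name!r}; choose from {known}") from None


head_cls = get_head_cls('dueling')
head = head_cls()
```

Keeping the string-to-class mapping in one dictionary lets a config file select a head by name without importing the class directly.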

DiscreteHead

class ding.model.common.head.DiscreteHead(hidden_size: int, output_size: int, layer_num: int = 1, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)[source]
__init__(hidden_size: int, output_size: int, layer_num: int = 1, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) None[source]
Overview:

Init the Head according to arguments.

Arguments:
  • hidden_size (int): The hidden_size of the embedding fed into DiscreteHead

  • output_size (int): The number of outputs

  • layer_num (int): The number of layers used in the network to compute Q value output

  • activation (nn.Module):

    The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()

  • norm_type (str):

    The type of normalization to use, see ding.torch_utils.fc_block for more details

  • noise (bool): Whether to use NoiseLinearLayer as layer_fn in the Q network’s MLP

forward(x: torch.Tensor) Dict[source]
Overview:

Use the encoded embedding tensor to predict discrete output via a forward pass through DiscreteHead’s MLP.

Arguments:
  • x (torch.Tensor):

    The encoded embedding tensor of shape (B, N), where N = hidden_size.

Returns:
  • outputs (Dict):

    The prediction dictionary produced by running the MLP with DiscreteHead’s setup.

    Necessary Keys:
    • logit (torch.Tensor): Logit tensor of shape (B, output_size).

Examples:
>>> head = DiscreteHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict) and outputs['logit'].shape == torch.Size([4, 64])

DistributionHead

class ding.model.common.head.DistributionHead(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = - 10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False, eps: Optional[float] = 1e-06)[source]
__init__(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = - 10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False, eps: Optional[float] = 1e-06) None[source]
Overview:

Init the Head according to arguments.

Arguments:
  • hidden_size (int): The hidden_size of the embedding fed into DistributionHead

  • output_size (int): The number of outputs

  • layer_num (int): The number of layers used in the network to compute Q value output

  • activation (nn.Module):

    The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()

  • norm_type (str):

    The type of normalization to use, see ding.torch_utils.fc_block for more details

  • noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor) Dict[source]
Overview:

Use the encoded embedding tensor to predict Distribution output via a forward pass through DistributionHead’s MLP.

Arguments:
  • x (torch.Tensor):

    The encoded embedding tensor of shape (B, N), where N = hidden_size.

Returns:
  • outputs (Dict):

    The prediction dictionary produced by running the MLP with DistributionHead’s setup.

    Necessary Keys:
    • logit (torch.Tensor): Logit tensor of shape (B, output_size).

    • distribution (torch.Tensor): Distribution tensor of shape (B, output_size, n_atom)

Examples:
>>> head = DistributionHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default n_atom is 51
>>> assert outputs['distribution'].shape == torch.Size([4, 64, 51])
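The relationship between the `distribution` and `logit` outputs can be sketched in plain Python, assuming the head follows the common categorical (C51-style) formulation: `n_atom` evenly spaced atoms between `v_min` and `v_max`, with the scalar value taken as the expectation over them. This section does not spell the formulation out, so treat it as an assumption; the helper names are hypothetical.

```python
import math


def support(v_min=-10.0, v_max=10.0, n_atom=51):
    """Evenly spaced atoms between v_min and v_max (assumed C51-style support)."""
    step = (v_max - v_min) / (n_atom - 1)
    return [v_min + i * step for i in range(n_atom)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def expected_value(atom_logits, atoms):
    """Expectation of the categorical distribution defined by atom_logits."""
    probs = softmax(atom_logits)
    return sum(p * z for p, z in zip(probs, atoms))


atoms = support()
# Uniform probabilities over a symmetric support average out to about zero.
uniform_q = expected_value([0.0] * 51, atoms)
# Nearly all mass on the last atom gives a value close to v_max.
peaked_q = expected_value([0.0] * 50 + [100.0], atoms)
```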

RainbowHead

class ding.model.common.head.RainbowHead(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = - 10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = True, eps: Optional[float] = 1e-06)[source]
__init__(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = - 10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = True, eps: Optional[float] = 1e-06) None[source]
Overview:

Init the Head according to arguments.

Arguments:
  • hidden_size (int): The hidden_size of the embedding fed into RainbowHead

  • output_size (int): The number of outputs

  • layer_num (int): The number of layers used in the network to compute Q value output

  • activation (nn.Module):

    The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()

  • norm_type (str):

    The type of normalization to use, see ding.torch_utils.fc_block for more details

  • noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor) Dict[source]
Overview:

Use the encoded embedding tensor to predict Rainbow output via a forward pass through RainbowHead’s MLP.

Arguments:
  • x (torch.Tensor):

    The encoded embedding tensor of shape (B, N), where N = hidden_size.

Returns:
  • outputs (Dict):

    The prediction dictionary produced by running the MLP with RainbowHead’s setup.

    Necessary Keys:
    • logit (torch.Tensor): Logit tensor of shape (B, output_size).

    • distribution (torch.Tensor): Distribution tensor of shape (B, output_size, n_atom)

Examples:
>>> head = RainbowHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default n_atom is 51
>>> assert outputs['distribution'].shape == torch.Size([4, 64, 51])

QRDQNHead

class ding.model.common.head.QRDQNHead(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)[source]
__init__(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) None[source]
Overview:

Init the Head according to arguments.

Arguments:
  • hidden_size (int): The hidden_size of the embedding fed into QRDQNHead

  • output_size (int): The number of outputs

  • layer_num (int): The number of layers used in the network to compute Q value output

  • activation (nn.Module):

    The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()

  • norm_type (str):

    The type of normalization to use, see ding.torch_utils.fc_block for more details

  • noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor) Dict[source]
Overview:

Use the encoded embedding tensor to predict QRDQN output via a forward pass through QRDQNHead’s MLP.

Arguments:
  • x (torch.Tensor):

    The encoded embedding tensor of shape (B, N), where N = hidden_size.

Returns:
  • outputs (Dict):

    The prediction dictionary produced by running the MLP with QRDQNHead’s setup.

    Necessary Keys:
    • logit (torch.Tensor): Logit tensor of shape (B, output_size).

    • q (torch.Tensor): Q value tensor of shape (B, output_size, num_quantiles)

    • tau (torch.Tensor): tau tensor of shape (B, num_quantiles, 1)

Examples:
>>> head = QRDQNHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default num_quantiles is 32
>>> assert outputs['q'].shape == torch.Size([4, 64, 32])
>>> assert outputs['tau'].shape == torch.Size([4, 32, 1])
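The shapes asserted in the example can be reproduced with nested lists. This is a minimal sketch under two assumptions that this section does not confirm: that `logit` is the mean of `q` over the quantile axis, and that `tau` holds the usual QR-DQN quantile midpoints (2i + 1) / (2 * num_quantiles). The helper names are hypothetical.

```python
def quantile_midpoints(num_quantiles=32):
    """Midpoints of num_quantiles equal-width bins on [0, 1] (assumed tau values)."""
    return [(2 * i + 1) / (2 * num_quantiles) for i in range(num_quantiles)]


def mean_over_quantiles(q):
    """Reduce a (B, N, num_quantiles) nested list to (B, N) by averaging."""
    return [[sum(qs) / len(qs) for qs in row] for row in q]


B, N, K = 4, 64, 32
# Dummy quantile values 0..K-1 for every (batch, action) pair.
q = [[[float(k) for k in range(K)] for _ in range(N)] for _ in range(B)]
logit = mean_over_quantiles(q)                                   # (B, N)
tau = [[[t] for t in quantile_midpoints(K)] for _ in range(B)]   # (B, K, 1)
```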

QuantileHead

class ding.model.common.head.QuantileHead(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, quantile_embedding_size: int = 128, beta_function_type: Optional[str] = 'uniform', activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)[source]
__init__(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, quantile_embedding_size: int = 128, beta_function_type: Optional[str] = 'uniform', activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) None[source]
Overview:

Init the Head according to arguments.

Arguments:
  • hidden_size (int): The hidden_size of the embedding fed into QuantileHead

  • output_size (int): The number of outputs

  • layer_num (int): The number of layers used in the network to compute Q value output

  • activation (nn.Module):

    The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()

  • norm_type (str):

    The type of normalization to use, see ding.torch_utils.fc_block for more details

  • noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor, num_quantiles: Optional[int] = None) Dict[source]
Overview:

Use the encoded embedding tensor to predict Quantile output via a forward pass through QuantileHead’s MLP.

Arguments:
  • x (torch.Tensor):

    The encoded embedding tensor of shape (B, N), where N = hidden_size.

Returns:
  • outputs (Dict):

    The prediction dictionary produced by running the MLP with QuantileHead’s setup.

    Necessary Keys:
    • logit (torch.Tensor): Logit tensor of shape (B, output_size).

    • q (torch.Tensor): Q value tensor of shape (num_quantiles, B, output_size)

    • quantiles (torch.Tensor): quantiles tensor of shape (quantile_embedding_size, 1)

Examples:
>>> head = QuantileHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default num_quantiles is 32
>>> assert outputs['q'].shape == torch.Size([32, 4, 64])
>>> assert outputs['quantiles'].shape == torch.Size([128, 1])
quantile_net(quantiles: torch.Tensor) torch.Tensor[source]
Overview:

Deterministic parametric function trained to reparameterize samples from a base distribution. By repeated Bellman update iterations of Q-learning, the optimal action-value function is estimated.

Arguments:
  • quantiles (torch.Tensor): The encoded embedding tensor of the parametric sample

Returns:
  • (torch.Tensor):

    QN output tensor after reparameterization of shape (quantile_embedding_size, output_size)

Examples:
>>> head = QuantileHead(64, 64)
>>> quantiles = torch.randn(128,1)
>>> qn_output = head.quantile_net(quantiles)
>>> assert isinstance(qn_output, torch.Tensor)
>>> # default quantile_embedding_size: int = 128,
>>> assert qn_output.shape == torch.Size([128, 64])

DuelingHead

class ding.model.common.head.DuelingHead(hidden_size: int, output_size: int, layer_num: int = 1, a_layer_num: Optional[int] = None, v_layer_num: Optional[int] = None, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)[source]
__init__(hidden_size: int, output_size: int, layer_num: int = 1, a_layer_num: Optional[int] = None, v_layer_num: Optional[int] = None, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) None[source]
Overview:

Init the Head according to arguments.

Arguments:
  • hidden_size (int): The hidden_size of the embedding fed into DuelingHead

  • output_size (int): The number of outputs

  • a_layer_num (int): The number of layers used in the network to compute action output

  • v_layer_num (int): The number of layers used in the network to compute value output

  • activation (nn.Module):

    The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()

  • norm_type (str):

    The type of normalization to use, see ding.torch_utils.fc_block for more details

  • noise (bool): Whether to use a noisy fc_block

forward(x: torch.Tensor) Dict[source]
Overview:

Use the encoded embedding tensor to predict Dueling output via a forward pass through DuelingHead’s MLPs.

Arguments:
  • x (torch.Tensor):

    The encoded embedding tensor of shape (B, N), where N = hidden_size.

Returns:
  • outputs (Dict):

    The prediction dictionary produced by running the MLPs with DuelingHead’s setup.

    Necessary Keys:
    • logit (torch.Tensor): Logit tensor of shape (B, output_size).

Examples:
>>> head = DuelingHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
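How the action branch (`a_layer_num`) and value branch (`v_layer_num`) might be merged into a single `logit` can be sketched as below. This assumes the standard dueling aggregation Q = V + A - mean(A), which this section does not state explicitly; `dueling_q` is a hypothetical helper working on one sample.

```python
def dueling_q(value, advantages):
    """Combine a scalar state value with per-action advantages.

    Assumed aggregation: Q(a) = V + A(a) - mean(A).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


# One sample with three actions: V = 2.0, A = [1.0, 0.0, -1.0].
q = dueling_q(2.0, [1.0, 0.0, -1.0])
```

Subtracting the mean advantage keeps the average of the resulting Q values equal to the state value.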

RegressionHead

class ding.model.common.head.RegressionHead(hidden_size: int, output_size: int, layer_num: int = 2, final_tanh: Optional[bool] = False, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None)[source]
__init__(hidden_size: int, output_size: int, layer_num: int = 2, final_tanh: Optional[bool] = False, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None) None[source]
Overview:

Init the Head according to arguments.

Arguments:
  • hidden_size (int): The hidden_size of the embedding fed into RegressionHead

  • output_size (int): The number of outputs

  • final_tanh (Optional[bool]): Whether a final tanh layer is needed

  • layer_num (int): The number of layers used in the network to compute Q value output

  • activation (nn.Module):

    The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()

  • norm_type (str):

    The type of normalization to use, see ding.torch_utils.fc_block for more details

forward(x: torch.Tensor) Dict[source]
Overview:

Use the encoded embedding tensor to predict Regression output via a forward pass through RegressionHead’s MLP.

Arguments:
  • x (torch.Tensor):

    The encoded embedding tensor of shape (B, N), where N = hidden_size.

Returns:
  • outputs (Dict):

    The prediction dictionary produced by running the MLP with RegressionHead’s setup.

    Necessary Keys:
    • pred (torch.Tensor): Tensor of predicted values, of shape (B, output_size).

Examples:
>>> head = RegressionHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['pred'].shape == torch.Size([4, 64])
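The effect of `final_tanh` can be illustrated on plain floats. This is a sketch assuming the flag simply applies tanh to the last layer's output; `apply_final_tanh` is a hypothetical helper, not a ding function.

```python
import math


def apply_final_tanh(pred, final_tanh=False):
    """Optionally squash raw predictions into (-1, 1) with tanh."""
    if final_tanh:
        return [math.tanh(p) for p in pred]
    return list(pred)


raw = [-3.0, 0.0, 3.0]
unbounded = apply_final_tanh(raw)                   # left unchanged
bounded = apply_final_tanh(raw, final_tanh=True)    # squashed into (-1, 1)
```

A bounded output of this kind is convenient when the regression target is a normalized continuous action.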

ReparameterizationHead

class ding.model.common.head.ReparameterizationHead(hidden_size: int, output_size: int, layer_num: int = 2, sigma_type: Optional[str] = None, fixed_sigma_value: Optional[float] = 1.0, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, bound_type: Optional[str] = None)[source]
__init__(hidden_size: int, output_size: int, layer_num: int = 2, sigma_type: Optional[str] = None, fixed_sigma_value: Optional[float] = 1.0, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, bound_type: Optional[str] = None) None[source]
Overview:

Init the Head according to arguments.

Arguments:
  • hidden_size (int): The hidden_size of the embedding fed into ReparameterizationHead

  • output_size (int): The number of outputs

  • layer_num (int): The number of layers used in the network to compute Q value output

  • sigma_type (Optional[str]): Sigma type, one of ['fixed', 'independent', 'conditioned']

  • fixed_sigma_value (Optional[float]):

    When sigma_type is 'fixed', the tensor output['sigma'] is filled with this value.

  • activation (nn.Module):

    The type of activation function to use in the MLP after layer_fn; if None, defaults to nn.ReLU()

  • norm_type (str):

    The type of normalization to use, see ding.torch_utils.fc_block for more details

forward(x: torch.Tensor) Dict[source]
Overview:

Use the encoded embedding tensor to predict Reparameterization output via a forward pass through ReparameterizationHead’s MLP.

Arguments:
  • x (torch.Tensor):

    The encoded embedding tensor of shape (B, N), where N = hidden_size.

Returns:
  • outputs (Dict):

    The prediction dictionary produced by running the MLP with ReparameterizationHead’s setup.

    Necessary Keys:
    • mu (torch.Tensor): Tensor of mu values, of shape (B, output_size).

    • sigma (torch.Tensor): Tensor of sigma values, of shape (B, output_size).

Examples:
>>> head =  ReparameterizationHead(64, 64, sigma_type='fixed')
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['mu'].shape == torch.Size([4, 64])
>>> assert outputs['sigma'].shape == torch.Size([4, 64])
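A downstream consumer would typically turn `mu` and `sigma` into an action with the reparameterization trick, a = mu + sigma * eps with eps drawn from a standard normal. The sketch below shows that step in plain Python for one sample; `sample_action` is a hypothetical helper and the sampling step is not part of the head itself.

```python
import random


def sample_action(mu, sigma, rng):
    """Reparameterized sample: mu + sigma * eps, with eps ~ N(0, 1) per dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


rng = random.Random(0)  # seeded generator so the sketch is repeatable
mu = [0.5, -0.5]

# With sigma = 0 the noise term vanishes and the sample equals mu.
deterministic = sample_action(mu, [0.0, 0.0], rng)
# With sigma = 1 (the default fixed_sigma_value) the sample is mu plus unit noise.
noisy = sample_action(mu, [1.0, 1.0], rng)
```

Writing the sample as a deterministic function of `mu`, `sigma`, and external noise is what lets gradients flow back into the head's parameters.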

MultiHead

class ding.model.common.head.MultiHead(head_cls: type, hidden_size: int, output_size_list: ding.utils.type_helper.SequenceType, **head_kwargs)[source]
__init__(head_cls: type, hidden_size: int, output_size_list: ding.utils.type_helper.SequenceType, **head_kwargs) None[source]
Overview:

Init the MultiHead according to arguments.

Arguments:
  • head_cls (type):

    The class of head, like DuelingHead, DistributionHead, QuantileHead, etc

  • hidden_size (int): The size of the hidden layer

  • output_size_list (SequenceType):

    The collection of output_size, e.g.: multi discrete action, [2, 3, 5]

  • head_kwargs (dict): Class-specific arguments

forward(x: torch.Tensor) Dict[source]
Overview:

Use encoded embedding tensor to predict multi discrete output

Arguments:
  • x (torch.Tensor): The encoded embedding tensor, usually with shape (B, N)

Returns:
  • outputs (Dict):

    Prediction output dict

    Necessary Keys:
    • logit (torch.Tensor):

      A list of logit tensors, one per sub-head, accessed as ['logit'][i]. Given output_size_list == [o1, o2, o3, ...], ['logit'][i] has shape (B, oi)

Examples:
>>> head = MultiHead(DuelingHead, 64, [2, 3, 5], v_layer_num=2)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> # output_size_list is [2, 3, 5] as set
>>> # Therefore each dim of logit is as follows
>>> outputs['logit'][0].shape
torch.Size([4, 2])
>>> outputs['logit'][1].shape
torch.Size([4, 3])
>>> outputs['logit'][2].shape
torch.Size([4, 5])
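The way one sub-head per entry of `output_size_list` yields a list under each output key can be sketched with stand-ins. `StubHead` and `StubMultiHead` are hypothetical classes written for illustration; they mimic only the constructor signature and output layout described above, not the real ding implementation.

```python
class StubHead:
    """Hypothetical single head: returns zero logits of width output_size."""

    def __init__(self, hidden_size, output_size):
        self.output_size = output_size

    def __call__(self, x):
        return {'logit': [[0.0] * self.output_size for _ in x]}


class StubMultiHead:
    """Hypothetical wrapper: one sub-head per entry of output_size_list."""

    def __init__(self, head_cls, hidden_size, output_size_list, **head_kwargs):
        self.heads = [head_cls(hidden_size, n, **head_kwargs) for n in output_size_list]

    def __call__(self, x):
        outputs = [head(x) for head in self.heads]
        # Merge per-head dicts: each key maps to a list with one entry per sub-head.
        return {key: [out[key] for out in outputs] for key in outputs[0]}


x = [[0.0] * 64 for _ in range(4)]  # batch of 4 embeddings, hidden_size 64
outputs = StubMultiHead(StubHead, 64, [2, 3, 5])(x)
shapes = [(len(logit), len(logit[0])) for logit in outputs['logit']]
```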