common.head
Please refer to ding/ding/docs/source/api_doc/model/common/head.py for usage.
head_cls_map = {
    # discrete
    'discrete': DiscreteHead,
    'dueling': DuelingHead,
    'distribution': DistributionHead,
    'rainbow': RainbowHead,
    'qrdqn': QRDQNHead,
    'quantile': QuantileHead,
    # continuous
    'regression': RegressionHead,
    'reparameterization': ReparameterizationHead,
    # multi
    'multi': MultiHead,
}
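The mapping above associates a short string key with each head class. A minimal sketch of how such a registry is typically consumed follows. The two stand-in classes and the build_head helper are hypothetical names used only for illustration; they are not part of ding.

```python
# Stand-in classes so the sketch runs without ding installed (hypothetical).
class DiscreteHead:
    def __init__(self, hidden_size, output_size):
        self.hidden_size = hidden_size
        self.output_size = output_size


class DuelingHead(DiscreteHead):
    pass


# Same layout as head_cls_map above, restricted to the two stand-ins.
head_cls_map = {
    'discrete': DiscreteHead,
    'dueling': DuelingHead,
}


def build_head(head_type, **kwargs):
    """Look up a head class by its string key and instantiate it (hypothetical helper)."""
    if head_type not in head_cls_map:
        raise KeyError(f"unknown head type {head_type!r}")
    return head_cls_map[head_type](**kwargs)


head = build_head('dueling', hidden_size=64, output_size=6)
print(type(head).__name__, head.output_size)
```

Keeping the lookup in one helper means a configuration file only needs to carry the string key.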
DiscreteHead
- class ding.model.common.head.DiscreteHead(hidden_size: int, output_size: int, layer_num: int = 1, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)[source]¶
- __init__(hidden_size: int, output_size: int, layer_num: int = 1, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) None [source]¶
- Overview:
Init the Head according to arguments.
- Arguments:
  - hidden_size (int): The hidden_size used before connecting to DiscreteHead.
  - output_size (int): The number of outputs.
  - layer_num (int): The number of layers used in the network to compute the Q value output.
  - activation (nn.Module): The activation function used in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (str): The type of normalization to use; see ding.torch_utils.fc_block for more details.
  - noise (bool): Whether to use NoiseLinearLayer as layer_fn in the Q network's MLP.
- forward(x: torch.Tensor) Dict [source]¶
- Overview:
Use the encoded embedding tensor to predict the discrete output by running the forward pass of DiscreteHead's MLP.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor of shape (B, N), where N = hidden_size.
- Returns:
  - outputs (Dict): The prediction dictionary obtained by running the MLP with DiscreteHead's setup.
    - Necessary Keys:
      - logit (torch.Tensor): Logit tensor of shape (B, M), where M = output_size.
- Examples:
>>> import torch
>>> from ding.model.common.head import DiscreteHead
>>> head = DiscreteHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict) and outputs['logit'].shape == torch.Size([4, 64])
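The logit returned by forward holds one score per discrete output for each batch element. As a minimal sketch of how such an output is commonly consumed, the snippet below picks the highest-scoring index per row. It uses plain nested lists in place of tensors, and greedy_actions is a hypothetical helper, not a ding function.

```python
def greedy_actions(logit):
    """Return the index of the largest score in each row of a (B, M) nested list."""
    return [max(range(len(row)), key=row.__getitem__) for row in logit]


# Toy stand-in for the dictionary returned by a discrete head (B=2, M=3).
outputs = {'logit': [[0.1, 2.0, -1.0], [0.5, 0.2, 0.9]]}
actions = greedy_actions(outputs['logit'])
print(actions)  # [1, 2]
```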
DistributionHead
- class ding.model.common.head.DistributionHead(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = - 10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False, eps: Optional[float] = 1e-06)[source]¶
- __init__(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = - 10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False, eps: Optional[float] = 1e-06) None [source]¶
- Overview:
Init the Head according to arguments.
- Arguments:
  - hidden_size (int): The hidden_size used before connecting to DistributionHead.
  - output_size (int): The number of outputs.
  - layer_num (int): The number of layers used in the network to compute the Q value output.
  - activation (nn.Module): The activation function used in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (str): The type of normalization to use; see ding.torch_utils.fc_block for more details.
  - noise (bool): Whether to use a noisy fc_block.
- forward(x: torch.Tensor) Dict [source]¶
- Overview:
Use the encoded embedding tensor to predict the distribution output by running the forward pass of DistributionHead's MLP.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor of shape (B, N), where N = hidden_size.
- Returns:
  - outputs (Dict): The prediction dictionary obtained by running the MLP with DistributionHead's setup.
    - Necessary Keys:
      - logit (torch.Tensor): Logit tensor of shape (B, M), where M = output_size.
      - distribution (torch.Tensor): Distribution tensor of shape (B, M, n_atom).
- Examples:
>>> import torch
>>> from ding.model.common.head import DistributionHead
>>> head = DistributionHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default n_atom is 51
>>> assert outputs['distribution'].shape == torch.Size([4, 64, 51])
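The n_atom, v_min and v_max parameters suggest a categorical value distribution over evenly spaced atoms. The sketch below shows, in plain Python, how such a distribution can be collapsed into a scalar value per output. It assumes the usual categorical formulation; whether DistributionHead computes its logit exactly this way is an assumption, and all helper names are hypothetical.

```python
import math


def make_support(v_min, v_max, n_atom):
    """Evenly spaced atom values from v_min to v_max inclusive."""
    step = (v_max - v_min) / (n_atom - 1)
    return [v_min + i * step for i in range(n_atom)]


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_value(atom_scores, support):
    """Collapse the atom scores of one output into the mean of the distribution."""
    probs = softmax(atom_scores)
    return sum(p * z for p, z in zip(probs, support))


support = make_support(v_min=-10, v_max=10, n_atom=51)
# Equal scores give a uniform distribution, whose mean on a symmetric support is close to 0.
uniform_value = expected_value([0.0] * 51, support)
print(len(support), round(uniform_value, 6))
```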
RainbowHead
- class ding.model.common.head.RainbowHead(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = - 10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = True, eps: Optional[float] = 1e-06)[source]¶
- __init__(hidden_size: int, output_size: int, layer_num: int = 1, n_atom: int = 51, v_min: float = - 10, v_max: float = 10, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = True, eps: Optional[float] = 1e-06) None [source]¶
- Overview:
Init the Head according to arguments.
- Arguments:
  - hidden_size (int): The hidden_size used before connecting to RainbowHead.
  - output_size (int): The number of outputs.
  - layer_num (int): The number of layers used in the network to compute the Q value output.
  - activation (nn.Module): The activation function used in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (str): The type of normalization to use; see ding.torch_utils.fc_block for more details.
  - noise (bool): Whether to use a noisy fc_block.
- forward(x: torch.Tensor) Dict [source]¶
- Overview:
Use the encoded embedding tensor to predict the Rainbow output by running the forward pass of RainbowHead's MLP.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor of shape (B, N), where N = hidden_size.
- Returns:
  - outputs (Dict): The prediction dictionary obtained by running the MLP with RainbowHead's setup.
    - Necessary Keys:
      - logit (torch.Tensor): Logit tensor of shape (B, M), where M = output_size.
      - distribution (torch.Tensor): Distribution tensor of shape (B, M, n_atom).
- Examples:
>>> import torch
>>> from ding.model.common.head import RainbowHead
>>> head = RainbowHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default n_atom is 51
>>> assert outputs['distribution'].shape == torch.Size([4, 64, 51])
QRDQNHead
- class ding.model.common.head.QRDQNHead(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)[source]¶
- __init__(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) None [source]¶
- Overview:
Init the Head according to arguments.
- Arguments:
  - hidden_size (int): The hidden_size used before connecting to QRDQNHead.
  - output_size (int): The number of outputs.
  - layer_num (int): The number of layers used in the network to compute the Q value output.
  - activation (nn.Module): The activation function used in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (str): The type of normalization to use; see ding.torch_utils.fc_block for more details.
  - noise (bool): Whether to use a noisy fc_block.
- forward(x: torch.Tensor) Dict [source]¶
- Overview:
Use the encoded embedding tensor to predict the QRDQN output by running the forward pass of QRDQNHead's MLP.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor of shape (B, N), where N = hidden_size.
- Returns:
  - outputs (Dict): The prediction dictionary obtained by running the MLP with QRDQNHead's setup.
    - Necessary Keys:
      - logit (torch.Tensor): Logit tensor of shape (B, M), where M = output_size.
      - q (torch.Tensor): Q value tensor of shape (B, M, num_quantiles).
      - tau (torch.Tensor): tau tensor of shape (B, num_quantiles, 1).
- Examples:
>>> import torch
>>> from ding.model.common.head import QRDQNHead
>>> head = QRDQNHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default num_quantiles is 32
>>> assert outputs['q'].shape == torch.Size([4, 64, 32])
>>> assert outputs['tau'].shape == torch.Size([4, 32, 1])
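The q output carries num_quantiles estimates per output, while logit carries a single score per output. In quantile regression methods the scalar value is commonly taken as the mean over the quantile axis; the sketch below shows that reduction on nested lists. That QRDQNHead derives logit from q in exactly this way is an assumption, and quantile_mean is a hypothetical helper.

```python
def quantile_mean(q):
    """Average a (B, M, num_quantiles) nested list over its last axis, giving (B, M)."""
    return [[sum(quantiles) / len(quantiles) for quantiles in per_output] for per_output in q]


# Toy input with B=1, M=2, num_quantiles=3.
q = [[[1.0, 2.0, 3.0], [0.0, 0.0, 6.0]]]
logit = quantile_mean(q)
print(logit)  # [[2.0, 2.0]]
```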
QuantileHead
- class ding.model.common.head.QuantileHead(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, quantile_embedding_size: int = 128, beta_function_type: Optional[str] = 'uniform', activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)[source]¶
- __init__(hidden_size: int, output_size: int, layer_num: int = 1, num_quantiles: int = 32, quantile_embedding_size: int = 128, beta_function_type: Optional[str] = 'uniform', activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) None [source]¶
- Overview:
Init the Head according to arguments.
- Arguments:
  - hidden_size (int): The hidden_size used before connecting to QuantileHead.
  - output_size (int): The number of outputs.
  - layer_num (int): The number of layers used in the network to compute the Q value output.
  - activation (nn.Module): The activation function used in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (str): The type of normalization to use; see ding.torch_utils.fc_block for more details.
  - noise (bool): Whether to use a noisy fc_block.
- forward(x: torch.Tensor, num_quantiles: Optional[int] = None) Dict [source]¶
- Overview:
Use the encoded embedding tensor to predict the Quantile output by running the forward pass of QuantileHead's MLP.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor of shape (B, N), where N = hidden_size.
- Returns:
  - outputs (Dict): The prediction dictionary obtained by running the MLP with QuantileHead's setup.
    - Necessary Keys:
      - logit (torch.Tensor): Logit tensor of shape (B, M), where M = output_size.
      - q (torch.Tensor): Q value tensor of shape (num_quantiles, B, M).
      - quantiles (torch.Tensor): quantiles tensor of shape (quantile_embedding_size, 1).
- Examples:
>>> import torch
>>> from ding.model.common.head import QuantileHead
>>> head = QuantileHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
>>> # default num_quantiles is 32
>>> assert outputs['q'].shape == torch.Size([32, 4, 64])
>>> assert outputs['quantiles'].shape == torch.Size([128, 1])
- quantile_net(quantiles: torch.Tensor) torch.Tensor [source]¶
- Overview:
Deterministic parametric function trained to reparameterize samples from a base distribution. By repeated Bellman update iterations of Q-learning, the optimal action-value function is estimated.
- Arguments:
  - quantiles (torch.Tensor): The encoded embedding tensor of the parametric sample.
- Returns:
  - (torch.Tensor): QN output tensor after reparameterization, of shape (quantile_embedding_size, output_size).
- Examples:
>>> import torch
>>> from ding.model.common.head import QuantileHead
>>> head = QuantileHead(64, 64)
>>> quantiles = torch.randn(128, 1)
>>> qn_output = head.quantile_net(quantiles)
>>> assert isinstance(qn_output, torch.Tensor)
>>> # default quantile_embedding_size is 128
>>> assert qn_output.shape == torch.Size([128, 64])
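One common way to turn sampled quantile fractions into features, used in implicit quantile networks, is a cosine basis over each fraction. The sketch below computes only that basis in plain Python. Whether quantile_net uses this exact basis or index range is an assumption, and cosine_features is a hypothetical helper; a learned linear layer would normally follow and is omitted here.

```python
import math


def cosine_features(quantiles, embedding_size):
    """Map each fraction tau to [cos(pi * i * tau) for i in 0..embedding_size-1]."""
    return [
        [math.cos(math.pi * i * tau) for i in range(embedding_size)]
        for tau in quantiles
    ]


# Three sampled fractions embedded into four cosine features each.
features = cosine_features([0.0, 0.5, 1.0], embedding_size=4)
print(len(features), len(features[0]))  # 3 4
```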
DuelingHead
- class ding.model.common.head.DuelingHead(hidden_size: int, output_size: int, layer_num: int = 1, a_layer_num: Optional[int] = None, v_layer_num: Optional[int] = None, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False)[source]¶
- __init__(hidden_size: int, output_size: int, layer_num: int = 1, a_layer_num: Optional[int] = None, v_layer_num: Optional[int] = None, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, noise: Optional[bool] = False) None [source]¶
- Overview:
Init the Head according to arguments.
- Arguments:
  - hidden_size (int): The hidden_size used before connecting to DuelingHead.
  - output_size (int): The number of outputs.
  - a_layer_num (int): The number of layers used in the network to compute the action output.
  - v_layer_num (int): The number of layers used in the network to compute the value output.
  - activation (nn.Module): The activation function used in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (str): The type of normalization to use; see ding.torch_utils.fc_block for more details.
  - noise (bool): Whether to use a noisy fc_block.
- forward(x: torch.Tensor) Dict [source]¶
- Overview:
Use the encoded embedding tensor to predict the Dueling output by running the forward pass of DuelingHead's MLPs.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor of shape (B, N), where N = hidden_size.
- Returns:
  - outputs (Dict): The prediction dictionary obtained by running the MLPs with DuelingHead's setup.
    - Necessary Keys:
      - logit (torch.Tensor): Logit tensor of shape (B, M), where M = output_size.
- Examples:
>>> import torch
>>> from ding.model.common.head import DuelingHead
>>> head = DuelingHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
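The separate a_layer_num and v_layer_num arguments reflect the two branches of a dueling architecture, one for per-action outputs and one for a state value. A minimal sketch of the standard way the two are combined follows, using plain Python numbers. That DuelingHead applies this exact mean-subtracted form is an assumption, and dueling_aggregate is a hypothetical helper.

```python
def dueling_aggregate(value, advantage):
    """Combine a scalar state value with per-action advantages: Q = V + (A - mean(A))."""
    mean_adv = sum(advantage) / len(advantage)
    return [value + a - mean_adv for a in advantage]


q_values = dueling_aggregate(value=1.0, advantage=[1.0, 2.0, 3.0])
print(q_values)  # [0.0, 1.0, 2.0]
```

Subtracting the mean advantage keeps the value and advantage branches identifiable: adding a constant to every advantage leaves the result unchanged.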
RegressionHead
- class ding.model.common.head.RegressionHead(hidden_size: int, output_size: int, layer_num: int = 2, final_tanh: Optional[bool] = False, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None)[source]¶
- __init__(hidden_size: int, output_size: int, layer_num: int = 2, final_tanh: Optional[bool] = False, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None) None [source]¶
- Overview:
Init the Head according to arguments.
- Arguments:
  - hidden_size (int): The hidden_size used before connecting to RegressionHead.
  - output_size (int): The number of outputs.
  - final_tanh (Optional[bool]): Whether a final tanh layer is needed.
  - layer_num (int): The number of layers used in the network to compute the Q value output.
  - activation (nn.Module): The activation function used in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (str): The type of normalization to use; see ding.torch_utils.fc_block for more details.
- forward(x: torch.Tensor) Dict [source]¶
- Overview:
Use the encoded embedding tensor to predict the Regression output by running the forward pass of RegressionHead's MLP.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor of shape (B, N), where N = hidden_size.
- Returns:
  - outputs (Dict): The prediction dictionary obtained by running the MLP with RegressionHead's setup.
    - Necessary Keys:
      - pred (torch.Tensor): Tensor of predicted values with shape (B, M), where M = output_size.
- Examples:
>>> import torch
>>> from ding.model.common.head import RegressionHead
>>> head = RegressionHead(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['pred'].shape == torch.Size([4, 64])
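The final_tanh flag controls whether a tanh is applied to the prediction. As a small illustration of what that implies for the output range, the sketch below applies tanh to a list of raw predictions in plain Python; squash is a hypothetical helper and not the head's actual code.

```python
import math


def squash(pred, final_tanh=False):
    """Optionally pass each raw prediction through tanh, bounding it to [-1, 1]."""
    return [math.tanh(p) if final_tanh else p for p in pred]


raw = [0.0, 2.5, -40.0]
bounded = squash(raw, final_tanh=True)
unbounded = squash(raw, final_tanh=False)
print(bounded[0], unbounded[2])  # 0.0 -40.0
```

A bounded output of this kind is often used when the target, such as a normalized continuous action, lives in a fixed interval.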
ReparameterizationHead
- class ding.model.common.head.ReparameterizationHead(hidden_size: int, output_size: int, layer_num: int = 2, sigma_type: Optional[str] = None, fixed_sigma_value: Optional[float] = 1.0, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, bound_type: Optional[str] = None)[source]¶
- __init__(hidden_size: int, output_size: int, layer_num: int = 2, sigma_type: Optional[str] = None, fixed_sigma_value: Optional[float] = 1.0, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None, bound_type: Optional[str] = None) None [source]¶
- Overview:
Init the Head according to arguments.
- Arguments:
  - hidden_size (int): The hidden_size used before connecting to ReparameterizationHead.
  - output_size (int): The number of outputs.
  - layer_num (int): The number of layers used in the network to compute the Q value output.
  - sigma_type (Optional[str]): The sigma type, one of ['fixed', 'independent', 'conditioned'].
  - fixed_sigma_value (Optional[float]): When the fixed type is chosen, the tensor output['sigma'] is filled with this value.
  - activation (nn.Module): The activation function used in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (str): The type of normalization to use; see ding.torch_utils.fc_block for more details.
- forward(x: torch.Tensor) Dict [source]¶
- Overview:
Use the encoded embedding tensor to predict the Reparameterization output by running the forward pass of ReparameterizationHead's MLP.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor of shape (B, N), where N = hidden_size.
- Returns:
  - outputs (Dict): The prediction dictionary obtained by running the MLP with ReparameterizationHead's setup.
    - Necessary Keys:
      - mu (torch.Tensor): Tensor of mu values with shape (B, M), where M = output_size.
      - sigma (torch.Tensor): Tensor of sigma values with shape (B, M), where M = output_size.
- Examples:
>>> import torch
>>> from ding.model.common.head import ReparameterizationHead
>>> head = ReparameterizationHead(64, 64, sigma_type='fixed')
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> assert outputs['mu'].shape == torch.Size([4, 64])
>>> assert outputs['sigma'].shape == torch.Size([4, 64])
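The mu and sigma outputs are the ingredients of the reparameterization trick, where a sample is written as mu + sigma * eps with eps drawn from a standard normal. A minimal sketch in plain Python follows. The head itself only returns mu and sigma, so the sampling step and the reparameterized_sample helper are illustrative assumptions about downstream use.

```python
import random


def reparameterized_sample(mu, sigma, rng):
    """Draw one sample per element as mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


rng = random.Random(0)  # seeded so repeated runs give the same draws
mu = [0.5, -1.0]
sample = reparameterized_sample(mu, [1.0, 1.0], rng)
# With sigma fixed at zero the noise term vanishes and the sample equals mu.
deterministic = reparameterized_sample(mu, [0.0, 0.0], rng)
print(len(sample), deterministic)
```

Because the randomness enters only through eps, the sample stays a differentiable function of mu and sigma.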
MultiHead
- class ding.model.common.head.MultiHead(head_cls: type, hidden_size: int, output_size_list: ding.utils.type_helper.SequenceType, **head_kwargs)[source]¶
- __init__(head_cls: type, hidden_size: int, output_size_list: ding.utils.type_helper.SequenceType, **head_kwargs) None [source]¶
- Overview:
Init the MultiHead according to arguments.
- Arguments:
  - head_cls (type): The class of head, such as DuelingHead, DistributionHead, QuantileHead, etc.
  - hidden_size (int): The hidden layer size.
  - output_size_list (SequenceType): The collection of output_size values, e.g. [2, 3, 5] for a multi-discrete action.
  - head_kwargs (dict): Class-specific arguments.
- forward(x: torch.Tensor) Dict [source]¶
- Overview:
Use encoded embedding tensor to predict multi discrete output
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor, usually with shape (B, N).
- Returns:
  - outputs (Dict): The prediction output dictionary.
    - Necessary Keys:
      - logit (torch.Tensor): Logit tensors, one per entry of output_size_list, each accessed at ['logit'][i]. Given output_size_list == [o1, o2, o3, ...], ['logit'][i] has shape (B, oi).
- Examples:
>>> import torch
>>> from ding.model.common.head import DuelingHead, MultiHead
>>> head = MultiHead(DuelingHead, 64, [2, 3, 5], v_layer_num=2)
>>> inputs = torch.randn(4, 64)
>>> outputs = head(inputs)
>>> assert isinstance(outputs, dict)
>>> # output_size_list is [2, 3, 5] as set
>>> # Therefore each dim of logit is as follows
>>> outputs['logit'][0].shape
torch.Size([4, 2])
>>> outputs['logit'][1].shape
torch.Size([4, 3])
>>> outputs['logit'][2].shape
torch.Size([4, 5])
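The indexing pattern above, one logit per entry of output_size_list, can be pictured as running one sub-head per output size and collecting the results under a shared key. The sketch below reproduces that pattern with plain Python lists. ToyHead and multi_forward are hypothetical stand-ins, and the real MultiHead may organize its sub-heads differently.

```python
class ToyHead:
    """Stand-in head that returns zeros of shape (B, output_size) (hypothetical)."""

    def __init__(self, hidden_size, output_size):
        self.hidden_size = hidden_size
        self.output_size = output_size

    def __call__(self, x):
        return {'logit': [[0.0] * self.output_size for _ in x]}


def multi_forward(heads, x):
    """Run every head on the same input and gather their logits into one list."""
    results = [head(x) for head in heads]
    return {'logit': [r['logit'] for r in results]}


heads = [ToyHead(64, size) for size in [2, 3, 5]]
x = [[0.0] * 64 for _ in range(4)]  # B=4, N=64
outputs = multi_forward(heads, x)
shapes = [(len(logit), len(logit[0])) for logit in outputs['logit']]
print(shapes)  # [(4, 2), (4, 3), (4, 5)]
```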