Nets

simple_img_conv_pool

paddle.v2.fluid.nets.simple_img_conv_pool(input, num_filters, filter_size, pool_size, pool_stride, act, param_attr=None, pool_type='max', use_cudnn=True)

img_conv_group

paddle.v2.fluid.nets.img_conv_group(input, conv_num_filter, pool_size, conv_padding=1, conv_filter_size=3, conv_act=None, param_attr=None, conv_with_batchnorm=False, conv_batchnorm_drop_rate=0.0, pool_stride=1, pool_type=None, use_cudnn=True)

Image convolution group, used in VGG-style networks.

sequence_conv_pool

paddle.v2.fluid.nets.sequence_conv_pool(input, num_filters, filter_size, param_attr=None, act='sigmoid', pool_type='max')

glu

paddle.v2.fluid.nets.glu(input, dim=-1)

The gated linear unit, composed of a split, a sigmoid activation, and an elementwise multiplication. Specifically, the input is split into two equal-sized parts \(a\) and \(b\) along the given dimension, and the output is computed as:

\[{GLU}(a, b)= a \otimes \sigma(b)\]

Refer to Language Modeling with Gated Convolutional Networks.

Parameters:
  • input (Variable) – The input variable which is a Tensor or LoDTensor.
  • dim (int) – The dimension along which to split. If \(dim < 0\), the dimension to split along is \(rank(input) + dim\).
Returns:

The Tensor variable with half the size of the input along the split dimension.

Return type:

Variable

Examples

# x is a Tensor variable with shape [3, 6, 9]
fluid.nets.glu(input=x, dim=1)  # shape of output: [3, 3, 9]
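The GLU computation above can be sketched in plain NumPy (a reference implementation of the formula, not the fluid operator itself; `glu_ref` is a hypothetical helper name):

```python
import numpy as np

def glu_ref(x, dim=-1):
    """Reference GLU: split x into halves a, b along `dim`,
    then return a * sigmoid(b)."""
    a, b = np.split(x, 2, axis=dim)
    return a * (1.0 / (1.0 + np.exp(-b)))

# Mirrors the example above: input shape [3, 6, 9], split along dim=1.
x = np.random.randn(3, 6, 9)
out = glu_ref(x, dim=1)
print(out.shape)  # (3, 3, 9)
```

Note that the split dimension must have an even size, since the input is divided into two equal halves.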

dot_product_attention

paddle.v2.fluid.nets.dot_product_attention(querys, keys, values)

The dot-product attention.

Attention mechanism can be seen as mapping a query and a set of key-value pairs to an output. The output is computed as a weighted sum of the values, where the weight assigned to each value is computed by a compatibility function (dot-product here) of the query with the corresponding key.

The dot-product attention can be implemented through (batch) matrix multiplication as follows:

\[Attention(Q, K, V)= softmax(QK^\mathrm{T})V\]

Refer to Attention Is All You Need.

Note that batch data containing sequences of different lengths is not supported, because of the (batch) matrix multiplication.

Parameters:
  • querys (Variable) – The input variable which is a Tensor or LoDTensor.
  • keys (Variable) – The input variable which is a Tensor or LoDTensor.
  • values (Variable) – The input variable which is a Tensor or LoDTensor.
Returns:

The Tensor variables representing the output and attention scores.

Return type:

tuple

Examples

# Suppose q, k, v are tensor variables with the following shape:
# q: [3, 5, 9], k: [3, 6, 9], v: [3, 6, 10]
out, attn_scores = fluid.nets.dot_product_attention(q, k, v)
out.shape  # [3, 5, 10]
attn_scores.shape  # [3, 5, 6]
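The formula \(softmax(QK^\mathrm{T})V\) can be sketched in plain NumPy (a reference implementation of the math, not the fluid operator itself; `dot_product_attention_ref` is a hypothetical helper name):

```python
import numpy as np

def dot_product_attention_ref(q, k, v):
    """Reference dot-product attention over batched inputs:
    weights = softmax(q @ k^T), output = weights @ v."""
    logits = np.matmul(q, np.transpose(k, (0, 2, 1)))  # [B, Lq, Lk]
    logits -= logits.max(axis=-1, keepdims=True)       # for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
    return np.matmul(weights, v), weights

# Mirrors the example above: q [3, 5, 9], k [3, 6, 9], v [3, 6, 10].
q = np.random.randn(3, 5, 9)
k = np.random.randn(3, 6, 9)
v = np.random.randn(3, 6, 10)
out, attn_scores = dot_product_attention_ref(q, k, v)
print(out.shape)          # (3, 5, 10)
print(attn_scores.shape)  # (3, 5, 6)
```

The query and key must share the same last dimension (9 here) for the dot product, while the value's last dimension (10) sets the output width; the attention weights in each row sum to 1.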