Base¶

Layer¶

class paddle::Layer¶

Base class for all layers. It defines the variables and functions that every layer needs.

Subclassed by paddle::AddtoLayer, paddle::AgentLayer, paddle::BatchNormBaseLayer, paddle::BilinearInterpLayer, paddle::BlockExpandLayer, paddle::BootBiasLayer, paddle::ConcatenateLayer, paddle::ConcatenateLayer2, paddle::ConvBaseLayer, paddle::ConvexCombinationLayer, paddle::ConvShiftLayer, paddle::CosSimLayer, paddle::CosSimVecMatLayer, paddle::CostLayer, paddle::CRFLayer, paddle::CTCLayer, paddle::DataLayer, paddle::DataNormLayer, paddle::EosIdCheckLayer, paddle::ExpandLayer, paddle::FeatureMapExpandLayer, paddle::FullyConnectedLayer, paddle::GatedRecurrentLayer, paddle::GatherAgentLayer, paddle::GetOutputLayer, paddle::GruStepLayer, paddle::HierarchicalSigmoidLayer, paddle::InterpolationLayer, paddle::LambdaCost, paddle::LstmLayer, paddle::LstmStepLayer, paddle::MaxIdLayer, paddle::MaxOutLayer, paddle::MixedLayer, paddle::MultiplexLayer, paddle::NCELayer, paddle::NormLayer, paddle::OuterProdLayer, paddle::ParameterReluLayer, paddle::PoolLayer, paddle::PowerLayer, paddle::PrintLayer, paddle::RankingCost, paddle::RecurrentLayer, paddle::RecurrentLayerGroup, paddle::ResizeLayer, paddle::SamplingIdLayer, paddle::ScalingLayer, paddle::ScatterAgentLayer, paddle::SelectiveFullyConnectedLayer, paddle::SequenceConcatLayer, paddle::SequencePoolLayer, paddle::SequenceReshapeLayer, paddle::SlopeInterceptLayer, paddle::SpatialPyramidPoolLayer, paddle::SubSequenceLayer, paddle::SumCostLayer, paddle::SumToOneNormLayer, paddle::TensorLayer, paddle::TransLayer, paddle::ValidationLayer

Public Functions

void waitInputValue()¶: Wait until all input values are ready. Called before the Layer::forward() function.

void copyOutputToOtherDevice()¶: Copy the layer’s output_ to other devices. Called after the Layer::forward() function if the output layer is on another device.

void waitAndMergeOutputGrad()¶: Wait until all output grads are ready and merge them into output_.grad. Called before the Layer::backward() function.

void markAllInputGrad()¶: Notify the previous layers that the output grad is ready. Called after the Layer::backward() function.

Layer(const LayerConfig &config, bool useGpu = FLAGS_use_gpu)¶

virtual ~Layer()¶

bool needGradient() const¶: Get the flag indicating whether the layer needs to compute gradients.

void setNeedGradient(bool need)¶: Set the flag indicating whether the layer needs to compute gradients.

void setNeedSequenceInfo(bool need)¶: Set the flag indicating whether the layer needs to re-compute sequence information, which includes sequenceStartPositions or subSequenceStartPositions.

const std::string &getName() const¶: Get layer’s name.

const std::string &getType() const¶: Get layer’s type.

size_t getSize() const¶: Get layer’s size.

int getDeviceId() const¶: Get layer’s deviceId.

void addPrev(LayerPtr l)¶: Add an input layer.

const LayerPtr &getPrev(size_t i)¶: Get the i-th input layer.

const MatrixPtr &getOutputValue()¶: Get the forward-output value.

const IVectorPtr &getOutputLabel()¶: Get the forward-output label.

const MatrixPtr &getOutputGrad()¶: Get the gradient of the output, used in backward propagation.

void setOutput(const std::string &name, Argument *output)¶: If the layer has multiple outputs, store this output in outputMap_ under the given name.

Argument &getOutput(const std::string &str = "")¶: Get the output with the given name.

const Argument &getOutput(int deviceId) const¶: Get the output based on deviceId.

const std::vector<ParameterPtr> &getParameters()¶: Get layer’s parameters.

const ParameterPtr &getBiasParameter()¶: Get layer’s bias-parameters.

void resizeOutput(size_t height, size_t width)¶: Resize the output matrix size.

void reserveOutput(size_t height, size_t width)¶: Resize the output matrix size, and reset value to zero.

void resetOutput(size_t height, size_t width)¶: Resize the output matrix size, and reset value and grad to zero.

void zeroGrad()¶: Clear the gradient of output.

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void initSubNetwork(NeuralNetwork *rootNetwork, const ModelConfig &config, const std::vector<ParameterType> &parameterTypes, bool useGpu)¶

Initialization of the sub-network, if the layer has one.

Parameters

rootNetwork: root network
config: model config
parameterTypes: parameter’s type
useGpu: whether to use gpu or not

virtual void accessSubNetwork(const std::function<void(NeuralNetwork&)> &callback)¶

Access the sub-network object. If a sub-network exists, invoke callback with it.

Parameters

callback: invoked if the sub-network exists.

virtual void prefetch()¶: If a sparse row matrix is used as a parameter, prefetch the feature ids in the input label.

virtual void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void resetState()¶

Reset the internal state variables, allocating them if they have not been allocated yet. This function needs to be called before Layer::forward() when generating a sequence.

This is used for sequence generation. When generating a sequence, the calculation at the current time step depends on the state from the previous time step, so the model needs to keep information about the previous time step in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

virtual void setState(LayerStatePtr state)¶: Set layer state.

virtual LayerStatePtr getState()¶

Get layer state.

Return: A copy of internal state.

void showOutputStats()¶: Show output statistics.

virtual void backward(const UpdateCallback &callback = nullptr) = 0¶: Backward propagation. Should only be called after Layer::forward() function.

virtual void onPassEnd()¶: One pass is finished.

Public Static Functions

LayerPtr create(const LayerConfig &config)¶: Create pointer of layer.

Public Static Attributes

ClassRegistrar<Layer, LayerConfig> registrar_¶: Register a Layer.

Protected Functions

void markInputGrad(int inputIndex)¶: Notify the specified input layer that the output grad is ready. Called in the backward function. If you mark input grads inside the backward function, you must ensure that all input grads are marked there.

const Argument &getInput(size_t inputIndex) const¶: Get the argument of input layer.

const Argument &getInput(const Layer &inputLayer) const¶: Get the argument of input layer.

const MatrixPtr &getInputValue(int inputIndex)¶: Get the forward-input value.

const MatrixPtr &getInputValue(const Layer &inputLayer)¶: Get the forward-input value.

const MatrixPtr &getInputGrad(int inputIndex)¶: Get the gradient of the input.

const MatrixPtr &getInputGrad(const Layer &inputLayer)¶: Get the gradient of the input.

const IVectorPtr &getInputLabel(const Layer &inputLayer)¶: Get the forward-input label.

void resetSpecifyOutput(Argument &output, size_t height, size_t width, bool isValueClean, bool isGradClean)¶: Change the size of output (value, grad). The value is reset to zero if isValueClean = true, and the grad is reset to zero if isGradClean = true.

void addOutputArgument(int deviceId)¶: Add output argument to other devices.

void forwardActivation()¶: Forward of activation function.

void backwardActivation()¶: Backward of activation function.

void forwardDropOut()¶: Forward of dropOut.

void initNeedFlags()¶: Initialize the needGradient_ flag.

Protected Attributes

LayerConfig config_¶: Layer config.

bool useGpu_¶: whether to use GPU

int deviceId_¶: Device Id. CPU is -1, and GPU is 0, 1, 2 ...

std::vector<LayerPtr> inputLayers_¶: Input layers.

std::vector<std::string> inputArgument_¶: Argument of input layers.

std::vector<ParameterPtr> parameters_¶: Parameter for each input layer. parameters_[i] is nullptr if inputLayers_[i] does not need a parameter.

ParameterPtr biasParameter_¶: nullptr if bias is not needed.

Argument output_¶: Output.

std::vector<Argument> outputOtherDevice_¶: Outputs stored on different devices, recorded by deviceId_. Used in the ‘parallel_nn’ case.

std::map<std::string, Argument *> outputMap_¶: If there are several outputs, map them by each name.

MatrixPtr tmpGrad_¶: Used to merge grad on different devices.

std::unique_ptr<ActivationFunction> activation_¶

PassType passType_¶: Current passType, PASS_TRAIN or PASS_TEST.

MatrixPtr dropOutMask_¶: Random 0-1 matrix for dropOut.

bool needGradient_¶: Whether the layer needs to compute gradients.

bool needSequenceInfo_¶: Whether the layer needs to re-compute sequence information.

std::vector<bool> markInBackward_¶: Whether each input grad is marked inside (true) or outside (false) the backward function.

Projection¶

class paddle::Projection¶

A projection takes one Argument as input, calculates the result, and adds it to the output Argument.
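The add-to-output behavior can be sketched in a few lines. This is a simplified illustration rather than the paddle::Projection code: it assumes a dense full-matrix projection on plain row-major std::vector buffers, and projectAndAccumulate and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: out += in * weight, for a single input row.
// `in` has inDim entries, `weight` is row-major (inDim x outDim), and
// `out` has outDim entries that are accumulated into, not overwritten.
void projectAndAccumulate(const std::vector<float>& in,
                          const std::vector<float>& weight,
                          std::vector<float>& out) {
  const std::size_t inDim = in.size();
  const std::size_t outDim = out.size();
  assert(weight.size() == inDim * outDim);
  for (std::size_t i = 0; i < inDim; ++i) {
    for (std::size_t j = 0; j < outDim; ++j) {
      out[j] += in[i] * weight[i * outDim + j];
    }
  }
}
```

Because the result is added rather than assigned, several projections can write into the same output buffer one after another.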

Subclassed by paddle::ContextProjection, paddle::ConvProjection, paddle::DotMulProjection, paddle::FullMatrixProjection, paddle::IdentityOffsetProjection, paddle::IdentityProjection, paddle::PoolProjection, paddle::ScalingProjection, paddle::TableProjection, paddle::TransposedFullMatrixProjection

Public Functions

Projection(const ProjectionConfig &config, ParameterPtr parameter, bool useGpu)¶

virtual ~Projection()¶

const std::string &getName() const¶

void forward(const Argument *in, const Argument *out, PassType passType)¶

Forward propagation. If backward() will be called, in and out must be kept valid until then.

Parameters

in: input of projection
out: output of projection
passType: PASS_TRAIN or PASS_TEST

virtual void prefetch(const Argument *in)¶

virtual void forward() = 0¶

virtual void backward(const UpdateCallback &callback) = 0¶

virtual void resetState()¶: See comment in Layer.h for the function with the same name.

virtual void setState(LayerStatePtr state)¶: Set layer state.

virtual LayerStatePtr getState()¶: Get layer state. A copy of internal state is returned.

size_t getOutputSize() const¶: Get output size of projection.

Public Static Functions

Projection *create(const ProjectionConfig &config, ParameterPtr parameter, bool useGpu)¶

Public Static Attributes

ClassRegistrar<Projection, ProjectionConfig, ParameterPtr, bool> registrar_¶: Register a projection.

Protected Attributes

ProjectionConfig config_¶: Config of projection.

ParameterPtr parameter_¶: Parameter of projection.

bool useGpu_¶

const Argument *in_¶: Store in passed to forward()

const Argument *out_¶: Store out passed to forward()

PassType passType_¶: Store passType passed to forward()

Operator¶

class paddle::Operator¶

An Operator is like a Projection, but takes more than one Argument as input.

Note: Operator can’t have parameters.

Subclassed by paddle::ConvOperator, paddle::DotMulOperator

Public Functions

Operator(const OperatorConfig &config, bool useGpu)¶

virtual ~Operator()¶

const OperatorConfig &getConfig() const¶

void forward(std::vector<const Argument *> ins, Argument *out, PassType passType)¶

Forward propagation. If backward() will be called, ins and out must be kept valid until then.

Parameters

ins: inputs of operator
out: output of operator
passType: PASS_TRAIN or PASS_TEST

virtual void prefetch(const Argument *in)¶

virtual void forward() = 0¶

virtual void backward() = 0¶

virtual void resetState()¶: See comment in Layer.h for the function with the same name.

virtual void setState(LayerStatePtr state)¶: Set layer state.

virtual LayerStatePtr getState()¶: Get layer state.

Public Static Functions

Operator *create(const OperatorConfig &config, bool useGpu)¶

Public Static Attributes

ClassRegistrar<Operator, OperatorConfig, bool> registrar_¶

Protected Attributes

OperatorConfig config_¶: Config of operator.

bool useGpu_¶

std::vector<const Argument *> ins_¶: Store ins passed to forward()

Argument *out_¶: Store out passed to forward()

PassType passType_¶: Store passType passed to forward()

Data Layer¶

class paddle::DataLayer¶

This layer just copies data to the output and has no backward propagation.

The config file api is data_layer.

Inherits from paddle::Layer

Public Functions

DataLayer(const LayerConfig &config)¶

virtual void setData(const Argument &data)¶

void prefetch()¶: Prefetch sparse matrix/ids only.

virtual void forward(PassType passType)¶: Forward propagation. Copy data_ (value, in, grad, ids, cpuSequenceDims, sequenceStartPositions, subSequenceStartPositions, strs) to output_.

virtual void backward(const UpdateCallback &callback)¶: The data layer’s backward propagation does nothing.

virtual void copyOutputToOtherDevice()¶: Copy the layer’s output_ to other devices. Called after the Layer::forward() function if the output layer is on another device.

Protected Attributes

Argument data_¶

Fully Connected Layers¶

FullyConnectedLayer¶

class paddle::FullyConnectedLayer¶

A layer with full connections to all neurons in the previous layer. It computes an inner product with a set of learned weights, and (optionally) adds biases.

The config file api is fc_layer.

Inherits from paddle::Layer

Public Functions

FullyConnectedLayer(const LayerConfig &config)¶

~FullyConnectedLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

Weight &getWeight(int idx)¶

void prefetch()¶: If a sparse row matrix is used as a parameter, prefetch the feature ids in the input label.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

WeightList weights_¶

std::unique_ptr<Weight> biases_¶

SelectiveFullyConnectedLayer¶

class paddle::SelectiveFullyConnectedLayer¶

The SelectiveFullyConnectedLayer class.

SelectiveFullyConnectedLayer differs from FullyConnectedLayer in that it requires an additional input indicating several selected columns, and it only computes the multiplications between the input matrices and the selected columns of this layer’s parameter matrices. If the selected columns are not specified, SelectiveFullyConnectedLayer acts exactly like FullyConnectedLayer.

The config file api is selective_fc_layer.
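A minimal sketch of the column selection described above, for one input row. It is an illustration under simplifying assumptions, not the layer's implementation: selectiveFc is a hypothetical helper, the bias is omitted, an empty selection list stands in for "no selection", and unselected outputs are simply left at zero in a dense vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: one input row times a row-major (inDim x outDim)
// weight matrix, restricted to the selected output columns.
// An empty `selected` list means every column is computed, as in a plain
// fully connected layer.
std::vector<float> selectiveFc(const std::vector<float>& in,
                               const std::vector<float>& weight,
                               std::size_t outDim,
                               const std::vector<std::size_t>& selected) {
  const std::size_t inDim = in.size();
  assert(weight.size() == inDim * outDim);
  std::vector<float> out(outDim, 0.0f);  // unselected columns stay zero
  auto computeColumn = [&](std::size_t col) {
    assert(col < outDim);
    float sum = 0.0f;
    for (std::size_t i = 0; i < inDim; ++i) {
      sum += in[i] * weight[i * outDim + col];
    }
    out[col] = sum;
  };
  if (selected.empty()) {
    for (std::size_t col = 0; col < outDim; ++col) computeColumn(col);
  } else {
    for (std::size_t col : selected) computeColumn(col);
  }
  return out;
}
```

With a 2 x 3 weight matrix and the selection {2}, only the third column's inner product is evaluated; the saving grows with the number of columns that are skipped.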

Inherits from paddle::Layer

Public Functions

SelectiveFullyConnectedLayer(const LayerConfig &config)¶

~SelectiveFullyConnectedLayer()¶

void prefetch()¶: If a sparse row matrix is used as a parameter, prefetch the feature ids in the input label.

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

Weight &getWeight(int idx)¶

void reserveOutput(size_t height, size_t width, size_t nnz)¶: Resize the output matrix and reset its value to zero.

void fillSelectiveData(const std::shared_ptr<std::vector<std::pair<int *, size_t>>> &candidates)¶

Fill candidates to select several activations as output.

Note

Currently, this method is only used for beam search.

Parameters

candidates: specifies several selected columns of the parameter matrices of this layer. Only the multiplications between the input matrices and the selected columns are computed. If candidates is nullptr, the selective fc layer acts exactly like the fully connected layer.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

WeightList weights_¶

std::unique_ptr<Weight> biases_¶

Conv Layers¶

ConvBaseLayer¶

class paddle::ConvBaseLayer¶

A Base Convolution Layer, which convolves the input image with learned filters and (optionally) adds biases.

Inherits from paddle::Layer

Subclassed by paddle::CudnnConvLayer, paddle::ExpandConvBaseLayer

Public Functions

ConvBaseLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

size_t calOutputSize()¶: imgSizeH_ and imgSizeW_ will be set according to the previous input layers in this function. Then it will calculate outputH_ and outputW_ and set them into output argument.

Weight &getWeight(int idx)¶

Protected Types

typedef std::vector<int> IntV¶

Protected Attributes

bool isDeconv_¶: True if it’s deconv layer, false if it’s convolution layer.

int numFilters_¶: The number of filters.

IntV padding_¶: The x dimension of the padding.

IntV paddingY_¶: The y dimension of the padding.

IntV stride_¶: The x dimension of the stride.

IntV strideY_¶: The y dimension of the stride.

IntV filterSize_¶: The x dimension of a filter kernel.

IntV filterSizeY_¶: The y dimension of a filter kernel.

IntV channels_¶: The number of channels of the convolution input.

IntV imgSizeH_¶: The spatial dimensions of input feature map height.

IntV imgSizeW_¶: The spatial dimensions of input feature map width.

IntV filterPixels_¶: filterPixels_ = filterSize_ * filterSizeY_.

IntV filterChannels_¶: filterChannels_ = channels_/groups_.

IntV outputH_¶: The spatial dimensions of output feature map height.

IntV outputW_¶: The spatial dimensions of output feature map width.

IntV groups_¶: Group size, refer to grouped convolution in Alex Krizhevsky’s paper: when group=2, the first half of the filters are only connected to the first half of the input channels, and the second half only connected to the second half.
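The grouping rule can be made concrete with a small sketch that returns which input channels a given filter reads. inputChannelRange is a hypothetical helper written for this illustration; it assumes the filters and the channels are both split evenly and in order across the groups.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: half-open range [first, second) of input channels
// connected to filter `filterIndex` in a grouped convolution.
std::pair<int, int> inputChannelRange(int filterIndex, int numFilters,
                                      int channels, int groups) {
  assert(groups > 0);
  assert(channels % groups == 0 && numFilters % groups == 0);
  assert(filterIndex >= 0 && filterIndex < numFilters);
  const int filtersPerGroup = numFilters / groups;
  const int filterChannels = channels / groups;  // channels per group
  const int group = filterIndex / filtersPerGroup;
  return {group * filterChannels, (group + 1) * filterChannels};
}
```

With 8 filters, 4 input channels and 2 groups, filters 0 to 3 read channels [0, 2) and filters 4 to 7 read channels [2, 4).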

bool sharedBiases_¶: Whether the bias is shared by all features in each channel.

WeightList weights_¶: shape of weight: (numChannels * filterPixels_, numFilters)

std::unique_ptr<Weight> biases_¶: If sharedBiases_ is true, the shape of the bias is (numFilters_, 1), one bias per filter. If sharedBiases_ is false, the shape of the bias is (numFilters_ * outputX * outputY, 1).

bool caffeMode_¶: True by default. The only difference is the calculation of output size.
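Since caffeMode_ only affects the output size, the difference can be shown as a rounding choice. The formulas below (floor division in Caffe mode, ceiling division otherwise) are an assumption, because this reference does not spell them out, and convOutputSize is a hypothetical helper.

```cpp
#include <cassert>

// Assumed formulas (not stated in this reference):
//   caffeMode == true : floor((img + 2 * pad - filter) / stride) + 1
//   caffeMode == false: ceil ((img + 2 * pad - filter) / stride) + 1
int convOutputSize(int imgSize, int filterSize, int padding, int stride,
                   bool caffeMode) {
  assert(stride > 0);
  const int span = imgSize + 2 * padding - filterSize;
  assert(span >= 0);
  const int steps = caffeMode ? span / stride : (span + stride - 1) / stride;
  return steps + 1;
}
```

Under these assumptions the two modes agree whenever the stride divides the span exactly, and differ by one output position otherwise.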

ConvOperator¶

class paddle::ConvOperator¶

ConvOperator takes two inputs to perform the convolution. The first input is the image, and the second input is the convolution kernel. Both inputs have the same height. Each sample of the first input is convolved independently with the matching sample of the second input.

The config file api is conv_operator.

Inherits from paddle::Operator

Public Functions

ConvOperator(const OperatorConfig &config, bool useGpu)¶

virtual ~ConvOperator()¶: Free workspace in device and destroy cudnn tensor descriptor.

void forward()¶

void backward()¶

ConvShiftLayer¶

class paddle::ConvShiftLayer¶

A layer for circular convolution of two vectors, which is used in the Neural Turing Machine.

Input: two vectors; the first is the data (batchSize x dataDim) and the second is the shift weights (batchSize x shiftDim).
Output: a vector (batchSize x dataDim).

Assume that:
a[in]: contains M elements.
b[in]: contains N elements (N should be odd).
c[out]: contains M elements.

\[ c_i = \sum_{j=-(N-1)/2}^{(N-1)/2} a_{i+j} \, b_j \]

In this formula:

a’s index is computed modulo M.
b’s index is computed modulo N.
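The formula can be checked with a direct, unoptimized sketch for a single pair of vectors. It follows the formula exactly as written, with both indices wrapped by a modulo; circularConv and wrapIndex are hypothetical names, and the layer's own kernel may index the shift weights differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Wrap a possibly negative index into [0, n).
std::size_t wrapIndex(long index, long n) {
  return static_cast<std::size_t>(((index % n) + n) % n);
}

// c[i] = sum over j in [-(N-1)/2, (N-1)/2] of a[(i + j) mod M] * b[j mod N].
std::vector<float> circularConv(const std::vector<float>& a,
                                const std::vector<float>& b) {
  const long m = static_cast<long>(a.size());
  const long n = static_cast<long>(b.size());
  assert(m > 0 && n % 2 == 1);  // N should be odd
  const long half = (n - 1) / 2;
  std::vector<float> c(a.size(), 0.0f);
  for (long i = 0; i < m; ++i) {
    for (long j = -half; j <= half; ++j) {
      c[static_cast<std::size_t>(i)] +=
          a[wrapIndex(i + j, m)] * b[wrapIndex(j, n)];
    }
  }
  return c;
}
```

With a = (1, 2, 3, 4), the shift weights (1, 0, 0) leave a unchanged, while (0, 1, 0) puts all weight on j = 1 and rotates a by one position.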

The config file api is conv_shift_layer.

Inherits from paddle::Layer

Public Functions

ConvShiftLayer(const LayerConfig &config)¶

~ConvShiftLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

CudnnConvLayer¶

class paddle::CudnnConvLayer¶

A 2-dimensional conv layer implemented with cuDNN. It only supports GPU mode. If the type is set to “conv”, CudnnConvLayer is selected automatically for GPU mode and ExpandConvLayer for CPU mode. Users can also specify the type “exconv” or “cudnn_conv” to request a particular implementation.

The config file api is img_conv_layer.

Inherits from paddle::ConvBaseLayer

Public Functions

CudnnConvLayer(const LayerConfig &config)¶

~CudnnConvLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

void addBiases()¶

void bpropBiases()¶

Protected Attributes

std::vector<std::unique_ptr<ProjectionConfig>> projConf_¶

std::vector<std::unique_ptr<Projection>> projections_¶

hl_tensor_descriptor biasDesc_¶

hl_tensor_descriptor outputDesc_¶

int biasOffset_¶

int outputOffset_¶

ExpandConvLayer¶

class paddle::ExpandConvLayer¶

A subclass of the convolution layer. This layer expands the input and uses matrix multiplication to calculate the convolution.

The config file api is img_conv_layer.
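The expand-then-multiply idea can be sketched as follows. This is an illustration only, with strong simplifications that are assumptions of the sketch: one input channel, one filter, stride 1, no padding, and no filter flipping. expandImage and convByExpansion are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Expand a single-channel (height x width) image into one row per output
// position; each row holds the filterH * filterW pixels under the filter.
std::vector<std::vector<float>> expandImage(const std::vector<float>& img,
                                            std::size_t height,
                                            std::size_t width,
                                            std::size_t filterH,
                                            std::size_t filterW) {
  assert(img.size() == height * width);
  assert(filterH >= 1 && filterH <= height);
  assert(filterW >= 1 && filterW <= width);
  const std::size_t outH = height - filterH + 1;
  const std::size_t outW = width - filterW + 1;
  std::vector<std::vector<float>> rows;
  rows.reserve(outH * outW);
  for (std::size_t oy = 0; oy < outH; ++oy) {
    for (std::size_t ox = 0; ox < outW; ++ox) {
      std::vector<float> patch;
      patch.reserve(filterH * filterW);
      for (std::size_t fy = 0; fy < filterH; ++fy) {
        for (std::size_t fx = 0; fx < filterW; ++fx) {
          patch.push_back(img[(oy + fy) * width + (ox + fx)]);
        }
      }
      rows.push_back(std::move(patch));
    }
  }
  return rows;
}

// Convolution as (expanded matrix) x (flattened filter).
std::vector<float> convByExpansion(const std::vector<float>& img,
                                   std::size_t height, std::size_t width,
                                   const std::vector<float>& filter,
                                   std::size_t filterH, std::size_t filterW) {
  assert(filter.size() == filterH * filterW);
  const auto rows = expandImage(img, height, width, filterH, filterW);
  std::vector<float> out;
  out.reserve(rows.size());
  for (const auto& patch : rows) {
    float sum = 0.0f;
    for (std::size_t k = 0; k < patch.size(); ++k) {
      sum += patch[k] * filter[k];
    }
    out.push_back(sum);
  }
  return out;
}
```

The expansion duplicates overlapping pixels, trading memory for the ability to reuse a single matrix multiplication routine.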

Inherits from paddle::ExpandConvBaseLayer

Public Functions

ExpandConvLayer(const LayerConfig &config)¶

~ExpandConvLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

ContextProjection¶

class paddle::ContextProjection¶

Context projection concatenates features from adjacent time steps in a sequence. The i-th row of the output is the concatenation of context_length rows of the input: the consecutive rows starting at row i + shift_start.

For example, assume the input (x) has 4 words and each word representation has dimension 2. If we pad with zeros instead of learned weights, context_length is 3, and shift_start is -1, the output (y) is:

x = [a1, a2;
     b1, b2;
     c1, c2;
     d1, d2]
y = [0,  0,  a1, a2, b1, b2;
     a1, a2, b1, b2, c1, c2;
     b1, b2, c1, c2, d1, d2;
     c1, c2, d1, d2, 0,  0]
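The example above can be reproduced with a short sketch of the zero-padding case for a single sequence. contextProject and the Matrix alias are hypothetical names for this illustration; learned padding weights and batches of several sequences are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Zero-padded context projection over a single sequence.
// Row i of the result concatenates input rows i + contextStart, ...,
// i + contextStart + contextLength - 1; rows outside the sequence are zeros.
Matrix contextProject(const Matrix& x, int contextStart, int contextLength) {
  assert(!x.empty() && contextLength > 0);
  const int steps = static_cast<int>(x.size());
  const std::size_t dim = x[0].size();
  const std::size_t length = static_cast<std::size_t>(contextLength);
  Matrix y(x.size(), std::vector<float>(dim * length, 0.0f));
  for (int i = 0; i < steps; ++i) {
    for (int k = 0; k < contextLength; ++k) {
      const int src = i + contextStart + k;
      if (src < 0 || src >= steps) continue;  // keep the zero padding
      for (std::size_t d = 0; d < dim; ++d) {
        y[static_cast<std::size_t>(i)][static_cast<std::size_t>(k) * dim + d] =
            x[static_cast<std::size_t>(src)][d];
      }
    }
  }
  return y;
}
```

Calling contextProject(x, -1, 3) on a 4 x 2 input yields the 4 x 6 matrix y shown above, with zeros in the first two columns of the first row and the last two columns of the last row.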

The config file api is context_projection.

Inherits from paddle::Projection

Public Functions

ContextProjection(const ProjectionConfig &config, ParameterPtr parameter, bool useGpu)¶: Constructor. If context_start is zero and context_length is one, trainable_padding is set to false. trainable_padding is an optional argument; if it is set, the constructor sets up the learned weight that is used to pad the output.

void forward()¶

void backward(const UpdateCallback &callback)¶

void resetState()¶: See comment in Layer.h for the function with the same name.

void setState(LayerStatePtr state)¶: Set layer state.

LayerStatePtr getState()¶: Get layer state. A copy of internal state is returned.

Protected Attributes

std::unique_ptr<Weight> weight_¶

size_t beginPad_¶: number of extra timesteps added at the beginning

size_t endPad_¶: number of extra timesteps added at the end

MatrixPtr state_¶: state_ and state2_ are used in sequence generation and save previous inputs.

MatrixPtr state2_¶

Pooling Layers¶

PoolLayer¶

class paddle::PoolLayer¶

Base class for pooling layers. Pools the input within regions.

Inherits from paddle::Layer

Subclassed by paddle::CudnnPoolLayer, paddle::PoolProjectionLayer

Public Functions

PoolLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

Public Static Functions

Layer *create(const LayerConfig &config)¶: create pooling layer by pool_type

Protected Attributes

size_t channels_¶

size_t sizeX_¶

size_t stride_¶

size_t outputX_¶

size_t imgSize_¶

int confPadding_¶

size_t sizeY_¶

size_t imgSizeY_¶

size_t strideY_¶

size_t outputY_¶

int confPaddingY_¶

std::string poolType_¶

PoolProjectionLayer¶

class paddle::PoolProjectionLayer¶

Basic parent layer of different kinds of pooling.

Inherits from paddle::PoolLayer

Public Functions

PoolProjectionLayer(const LayerConfig &config)¶

size_t getSize()¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

size_t imgSizeH_¶

size_t imgSizeW_¶

size_t outputH_¶

size_t outputW_¶

std::unique_ptr<PoolProjection> poolProjection_¶

ProjectionConfig projectionConfig_¶

CudnnPoolLayer¶

class paddle::CudnnPoolLayer¶

CudnnPoolLayer is a subclass of PoolLayer implemented with the cuDNN API. It only supports GPU.

The config file api is img_pool_layer.

Inherits from paddle::PoolLayer

Public Functions

CudnnPoolLayer(const LayerConfig &config)¶

~CudnnPoolLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void reshape(int batchSize)¶: Reshape the input and output tensor descriptors. The batch size may change during training in the last batch of each pass, so reshaping is needed.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Public Static Functions

bool typeCheck(const std::string &poolType, hl_pooling_mode_t *mode = nullptr)¶

Protected Attributes

int windowHeight¶

int windowWidth¶

int heightPadding¶

int widthPadding¶

int strideHeight¶

int strideWidth¶

int imageH_¶

int imageW_¶

int outputH_¶

int outputW_¶

hl_pooling_mode_t mode_¶: mode_ is the pooling type, including “cudnn-max-pool”, “cudnn-avg-pool”, and “cudnn-avg-excl-pad-pool”.

hl_tensor_descriptor inputDesc_¶: cudnn tensor descriptor for input.

hl_tensor_descriptor outputDesc_¶: cudnn tensor descriptor for output.

hl_pooling_descriptor poolingDesc_¶: A description of a pooling operation.

Norm Layers¶

NormLayer¶

class paddle::NormLayer¶

Base class for normalization layers.

Note: Normalizes the input within a local region.

Inherits from paddle::Layer

Subclassed by paddle::ResponseNormLayer

Public Functions

NormLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

Public Static Functions

Layer *create(const LayerConfig &config)¶: create norm layer by norm_type

CMRProjectionNormLayer¶

class paddle::CMRProjectionNormLayer¶

Response normalization across feature maps, that is, normalization over a window of size_ channels.

Inherits from paddle::ResponseNormLayer

Public Functions

CMRProjectionNormLayer(const LayerConfig &config)¶

~CMRProjectionNormLayer()¶

size_t getSize()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

DataNormLayer¶

class paddle::DataNormLayer¶

A layer for data normalization.

Input: One and only one input layer is accepted. The input layer must be DataLayer with dense data type.
Output: The normalization of the input data

Reference: LA Shalabi, Z Shaaban, B Kasasbeh. Data mining: A preprocessing engine

Three data normalization methods are considered:

z-score: y = (x-mean)/std
min-max: y = (x-min)/(max-min)
decimal-scaling: y = x/10^j, where j is the smallest integer such that max(|y|)<1
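The three formulas translate directly into scalar helpers. This is a sketch on single values with the statistics passed in by the caller; the function names are hypothetical, and decimalScaling additionally assumes that j is non-negative.

```cpp
#include <cassert>

// z-score: y = (x - mean) / std
double zScore(double x, double mean, double stddev) {
  assert(stddev > 0.0);
  return (x - mean) / stddev;
}

// min-max: y = (x - min) / (max - min)
double minMax(double x, double minValue, double maxValue) {
  assert(maxValue > minValue);
  return (x - minValue) / (maxValue - minValue);
}

// decimal scaling: y = x / 10^j, with j the smallest non-negative integer
// (an assumption of this sketch) such that maxAbs / 10^j < 1, where maxAbs
// is the largest |x| in the data.
double decimalScaling(double x, double maxAbs) {
  assert(maxAbs >= 0.0);
  double scale = 1.0;
  while (maxAbs / scale >= 1.0) {
    scale *= 10.0;
  }
  return x / scale;
}
```

For data whose largest absolute value is 999, decimal scaling divides by 1000, so 250 maps to 0.25. The protected attributes named rangeReciprocal_, stdReciprocal_ and decimalReciprocal_ suggest that the layer stores reciprocals and multiplies instead of dividing.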

Inherits from paddle::Layer

Public Types

enum NormalizationStrategy¶

Values:

kZScore = 0¶

kMinMax = 1¶

kDecimalScaling = 2¶

Public Functions

DataNormLayer(const LayerConfig &config)¶

~DataNormLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

int mode_¶

std::unique_ptr<Weight> weight_¶

MatrixPtr min_¶

MatrixPtr rangeReciprocal_¶

MatrixPtr mean_¶

MatrixPtr stdReciprocal_¶

MatrixPtr decimalReciprocal_¶

ResponseNormLayer¶

class paddle::ResponseNormLayer¶

Response normalization within feature maps, that is, normalization within each channel independently. The original implementation was deleted during code refactoring and needs to be implemented in the future.

Inherits from paddle::NormLayer

Subclassed by paddle::CMRProjectionNormLayer

Public Functions

ResponseNormLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

size_t channels_¶

size_t size_¶

size_t outputX_¶

size_t imgSize_¶

float scale_¶

float pow_¶

MatrixPtr denoms_¶

BatchNormBaseLayer¶

class paddle::BatchNormBaseLayer¶

Batch normalization layer, which normalizes the input across the batch.

By default, the global mean and variance statistics are calculated via a running average during the training period. The pre-calculated global mean and variance are then used for testing.

The moving mean and variance are stored in Parameter objects at construction, and the calculation changes them. On GPU, only the global mean and variance of one thread on the first node are currently saved. The calculation on CPU is different because parameters are shared by multiple threads; there, ShareCpuMatrix with a lock is used. With multiple machines, the global mean and variance are still saved on the first node on CPU.

[1] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.” arXiv preprint arXiv:1502.03167 (2015).
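The running-average scheme can be sketched as below. The exact update rule is not given in this reference, so the exponential moving average used here, and the way the fraction weights the old value, are assumptions; updateMovingStat and normalizeWithGlobalStats are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed update rule, applied per channel during training:
//   moving = fraction * moving + (1 - fraction) * batchStatistic
void updateMovingStat(std::vector<double>& moving,
                      const std::vector<double>& batch, double fraction) {
  assert(moving.size() == batch.size());
  assert(fraction >= 0.0 && fraction <= 1.0);
  for (std::size_t c = 0; c < moving.size(); ++c) {
    moving[c] = fraction * moving[c] + (1.0 - fraction) * batch[c];
  }
}

// At test time, normalize with the stored global statistics instead of the
// statistics of the current batch.
double normalizeWithGlobalStats(double x, double mean, double var,
                                double eps) {
  assert(var + eps > 0.0);
  return (x - mean) / std::sqrt(var + eps);
}
```

The same update would be applied to both the mean and the variance; the scale (gamma) and bias (beta) of the original paper are applied after this normalization and are omitted here.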

Inherits from paddle::Layer

Subclassed by paddle::BatchNormalizationLayer, paddle::CudnnBatchNormLayer

Public Functions

BatchNormBaseLayer(const LayerConfig &config)¶

~BatchNormBaseLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void calFeatureMapSize()¶: Calculate feature map size. Some input uses frameHeight and frameWidth to store feature size.

Public Static Functions

static Layer *create(const LayerConfig &config)¶: Create a BatchNorm layer by norm_type, which includes batch_norm and cudnn_batch_norm. If norm_type is not set, cudnn_batch_norm is selected automatically for GPU and batch_norm for CPU.

Protected Attributes

std::unique_ptr<Weight> weight_¶: Batch normalization scale parameter, which is referred to as gamma in the original paper.

std::unique_ptr<Weight> movingMean_¶: Moving average of mean.

std::unique_ptr<Weight> movingVar_¶: Moving average of variance.

std::unique_ptr<Weight> biases_¶: Batch normalization bias parameter, which is referred to as beta in the original paper.

MatrixPtr savedMean_¶: Saves intermediate results computed during the forward pass; these can then be reused to speed up the backward pass.

MatrixPtr savedInvVar_¶

int imgSize_¶: Height or width of the input image feature; currently the height is equal to the width. imgSize_ is 1 if the input is a fully-connected layer.

int imageH_¶

int imageW_¶

int imgPixels_¶: Height * Width.

int channels_¶: Feature dimension. If the input layer is a conv layer, it is the number of channels of that layer’s feature map. If the input layer is a fully-connected layer, it is the dimension of the fc layer.

bool useGlobalStats_¶

real movingAvgFraction_¶

BatchNormalizationLayer¶

class paddle::BatchNormalizationLayer¶

An implementation class that inherits from the batch normalization base layer. It supports both CPU and GPU.

The config file api is batch_norm_layer.

Inherits from paddle::BatchNormBaseLayer

Public Functions

BatchNormalizationLayer(const LayerConfig &config)¶

~BatchNormalizationLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Functions

void setMeanAndStd()¶: Load pre-calculated mean and std.

void calMeanAndStd(const MatrixPtr &mat)¶: Calculate mean and std.

void calMovingMeanAndVar()¶: Calculate moving mean and variance.

void expandMat(const MatrixPtr &in, MatrixPtr &out)¶: Expand a Matrix from batch, channels * imagePixels to batch * imagePixels * channels.

void shrinkMat(const MatrixPtr &in, MatrixPtr &out)¶: Shrink a Matrix from batch * imagePixels * channels to batch, channels * imagePixels.
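The layout change performed by expandMat and shrinkMat can be illustrated with plain row-major buffers. This is a sketch under assumptions: the expanded matrix is taken to have batch * imagePixels rows and channels columns, and each packed row is taken to be stored channel by channel. The names expandLayout and shrinkLayout are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Row-major buffers. Assumed layouts (for illustration only):
//   packed  : [batch, channels * imgPixels], column = c * imgPixels + p
//   expanded: [batch * imgPixels, channels], row    = b * imgPixels + p
std::vector<float> expandLayout(const std::vector<float>& packed,
                                std::size_t batch, std::size_t channels,
                                std::size_t imgPixels) {
  std::vector<float> expanded(batch * imgPixels * channels);
  for (std::size_t b = 0; b < batch; ++b) {
    for (std::size_t c = 0; c < channels; ++c) {
      for (std::size_t p = 0; p < imgPixels; ++p) {
        expanded[(b * imgPixels + p) * channels + c] =
            packed[(b * channels + c) * imgPixels + p];
      }
    }
  }
  return expanded;
}

// Inverse of expandLayout: back to [batch, channels * imgPixels].
std::vector<float> shrinkLayout(const std::vector<float>& expanded,
                                std::size_t batch, std::size_t channels,
                                std::size_t imgPixels) {
  std::vector<float> packed(batch * channels * imgPixels);
  for (std::size_t b = 0; b < batch; ++b) {
    for (std::size_t c = 0; c < channels; ++c) {
      for (std::size_t p = 0; p < imgPixels; ++p) {
        packed[(b * channels + c) * imgPixels + p] =
            expanded[(b * imgPixels + p) * channels + c];
      }
    }
  }
  return packed;
}
```

In the expanded layout every row holds the values of all channels at one pixel, so per-channel statistics reduce to column-wise operations.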

Protected Attributes

MatrixPtr tmpMat_¶

MatrixPtr tmpGrad_¶

MatrixPtr expandedIn_¶

MatrixPtr expandedOut_¶

MatrixPtr expandedInGrad_¶

MatrixPtr expandedOutGrad_¶

MatrixPtr inGrad_¶

MatrixPtr normIn_¶

MatrixPtr normInGrad_¶

MatrixPtr meanGrad_¶

MatrixPtr stdGrad_¶

bool firstTest_¶: Flag ensuring that the mean and variance are loaded only once.

Protected Static Attributes

const real EPS¶: Epsilon value used in the batch normalization formula.

CudnnBatchNormLayer¶

class paddle::CudnnBatchNormLayer¶

Batch normalization layer implemented with the cuDNN library.

The config file api is batch_norm_layer.

Note: The cuDNN version must be >= v4.0; using the latest version (v5.1) is recommended.

Inherits from paddle::BatchNormBaseLayer

Public Functions

CudnnBatchNormLayer(const LayerConfig &config)¶

~CudnnBatchNormLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void reshape(int batchSize)¶: Reshape the tensor described by ioDesc_.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

hl_tensor_descriptor ioDesc_¶: Input/output tensor descriptor.

hl_tensor_descriptor bnParamDesc_¶: Shared tensor descriptor for the 6 tensors: bnScale, bnBias, running mean/var, save_mean/var.

MatrixPtr tmpWGrad_¶: The gradients of weight and bias passed to the cuDNN api cannot be empty. If is_static is set for the weight or bias, no memory is allocated for them and the gradient is NULL. In this case, the two matrices tmpWGrad_ and tmpBiasGrad_ are used instead.

MatrixPtr tmpBiasGrad_¶

Protected Static Attributes

const double EPS¶: Epsilon value used in the batch normalization formula. Minimum allowed value is CUDNN_BN_MIN_EPSILON defined in cudnn.h. Same epsilon value should be used in forward and backward functions.

SumToOneNormLayer¶

class paddle::SumToOneNormLayer¶

A layer for sum-to-one normalization, which is used in the Neural Turing Machine.

\[ out[i] = \frac {in[i]} {\sum_{k=1}^N in[k]} \]

where \(in\) is a (batchSize x dataDim) input vector, and \(out\) is a (batchSize x dataDim) output vector.

The config file api is sum_to_one_norm_layer.

Inherits from paddle::Layer

Public Functions

SumToOneNormLayer(const LayerConfig &config)¶

~SumToOneNormLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr reciprocalRowSum_¶: reciprocalRowSum_ = \(1 / \sum_{k=1}^N in[k]\)

MatrixPtr dotSum_¶: dotSum = output_.grad \(.*\) output_.value

Activation Layer¶

ParameterReluLayer¶

class paddle::ParameterReluLayer¶

ParameterReluLayer activates inputs with a learnable parameter weight_. forward:

\[ y = x > 0 ? x : w .* x \]

backward:

\[\begin{split} dx = x > 0 ? dy : w .* dy \\ dw = x > 0 ? 0 : dy.*x \end{split}\]

Here, x is the input, w is the weight, and y is the output. dx, dw, and dy are the corresponding gradients.

Inherits from paddle::Layer

Public Functions

ParameterReluLayer(const LayerConfig &config)¶

~ParameterReluLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> weight_¶

size_t partialSum_¶

partialSum_ makes a group of inputs share the same weight:

partialSum_ = 1: element-wise activation; each element has its own weight_.
partialSum_ = number of elements in one channel: channel-wise parameter activation; elements in a channel share the same weight_.
partialSum_ = number of outputs: all elements share the same weight_.

Recurrent Layers¶

RecurrentLayer¶

class paddle::RecurrentLayer¶

RecurrentLayer takes 1 input layer. The output size is the same as that of the input layer. For each sequence [start, end] it performs the following computation:

\[\begin{split} out_{i} = act(in_{i}) \ \ \text{for} \ i = start \\ out_{i} = act(in_{i} + out_{i-1} * W) \ \ \text{for} \ start < i <= end \end{split}\]

If reversed is true, the order is reversed:

\[\begin{split} out_{i} = act(in_{i}) \ \ \text{for} \ i = end \\ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start <= i < end \end{split}\]

There are two ways to compute the rnn. One is to process the input one sequence at a time. The other is to reorganize the input into batches and then process one batch at a time. Users can select between them with the rnn_use_batch flag.
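The recurrence above, including the reversed variant, can be sketched for a single sequence. This is an illustration, not the layer's implementation: act is assumed to be tanh, W is assumed to be a row-major [size x size] matrix applied to the previous output as a row vector, and recurrentForward is a hypothetical name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Frame = std::vector<float>;

// Computes out_i = act(in_i + out_prev * W) over one sequence, where
// out_prev is out_{i-1} (forward) or out_{i+1} (reversed). The first
// processed frame has no recurrent term. act is assumed to be tanh.
std::vector<Frame> recurrentForward(const std::vector<Frame>& in,
                                    const std::vector<float>& W,
                                    bool reversed) {
  const std::size_t n = in.size();
  std::vector<Frame> out(n);
  for (std::size_t step = 0; step < n; ++step) {
    const std::size_t i = reversed ? n - 1 - step : step;
    Frame pre = in[i];
    if (step > 0) {
      const Frame& prev = out[reversed ? i + 1 : i - 1];
      const std::size_t size = pre.size();
      for (std::size_t k = 0; k < size; ++k) {
        for (std::size_t j = 0; j < size; ++j) {
          pre[j] += prev[k] * W[k * size + j];  // row vector times W
        }
      }
    }
    for (float& v : pre) {
      v = std::tanh(v);
    }
    out[i] = pre;
  }
  return out;
}
```

Processing one sequence at a time follows the formula directly; the batch method computes the same values but groups the frames of different sequences that share a time step.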

Inherits from paddle::Layer

Public Functions

RecurrentLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

void resetState()¶

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating sequences.

This is used for sequence generation. When generating sequence, the calculation at current timestamp depends on the state from previous timestamp. The model needs to keep the information about the previous timestamp in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

void setState(LayerStatePtr state)¶: Set layer state.

LayerStatePtr getState()¶

Get layer state.

Return: A copy of internal state.

Protected Functions

void forwardSequence(int batchSize, size_t numSequences, const int *starts)¶

If rnn_use_batch is not set to true, the rnn forward pass is computed one sequence at a time by default.

Parameters

batchSize: Total number of words of all samples in this batch.
numSequences: The number of samples.
starts: The start position of each sample.

void forwardOneSequence(int start, int length)¶

Compute rnn forward by one sequence.

Parameters

start: The start position of this sequence (or sample).
length: The length of this sequence (or sample), namely the number of words in this sequence.

void backwardSequence(int batchSize, size_t numSequences, const int *starts)¶

Compute rnn backward one sequence at a time.

Parameters

batchSize: Total number of words of all samples in this batch.
numSequences: The number of samples.
starts: The start position of each sample.

void backwardOneSequence(int start, int length)¶

Compute rnn backward by one sequence.

Parameters

start: The start position of this sequence (or sample).
length: The length of this sequence (or sample), namely the number of words in this sequence.

void forwardBatch(int batchSize, size_t numSequences, const int *starts)¶

Reorganize input into batches and compute rnn forward batch by batch. The batch-shaped result is converted back to sequences after the forward pass finishes. For details on the batch layout, refer to the SequenceToBatch class.

Parameters

batchSize: Total number of words of all samples in this batch.
numSequences: The number of samples.
starts: The start position of each sample.

void backwardBatch(int batchSize, size_t numSequences, const int *starts)¶

Reorganize input into batches and compute rnn backward batch by batch.

Parameters

batchSize: Total number of words of all samples in this batch.
numSequences: The number of samples.
starts: The start position of each sample.

Protected Attributes

std::unique_ptr<Weight> weight_¶

std::unique_ptr<Weight> bias_¶

std::vector<Argument> frameOutput_¶: frameOutput_[i] is used to hold the i-th sample of output_

MatrixPtr prevOutput_¶

bool reversed_¶: Whether to compute the rnn in reverse order.

std::unique_ptr<SequenceToBatch> batchValue_¶: When computing batch by batch, batchValue_ is used to save the reorganized input value.

std::unique_ptr<SequenceToBatch> batchGrad_¶: When computing batch by batch, batchGrad_ is used to save the gradient with respect to the reorganized input value.

SequenceToBatch¶

class paddle::SequenceToBatch¶

Public Functions

SequenceToBatch(bool useGpu)¶

void resizeOrCreateBatch(int batchSize, size_t numSequences, const int *seqStarts, bool reversed, bool prevBatchState = false)¶

void copy(Matrix &seqValue, Matrix &batchValue, bool seq2batch)¶

void add(Matrix &seqValue, Matrix &batchValue, bool seq2batch)¶

MatrixPtr getBatchValue(Matrix &batchValue, int batchId, int numRows = 0)¶

size_t getNumBatch() const¶

void resizeOrCreate(Matrix &seqValue)¶

void copyFromSeq(Matrix &seqValue)¶

void copyBackSeq(Matrix &seqValue)¶

MatrixPtr getBatchValue(int batchId, int numRows = 0)¶

MatrixPtr getBatchValue()¶

void prevOutput2Batch(Matrix &src, Matrix &dst)¶

void getSeqOutputFromBatch(Matrix &sequence, Matrix &batch)¶

void shareIndexWith(const SequenceToBatch &seq2batch)¶

Protected Functions

void sequence2BatchCopy(Matrix &batch, Matrix &sequence, IVector &seq2BatchIdx, bool seq2batch)¶

void sequence2BatchAdd(Matrix &batch, Matrix &sequence, IVector &seq2BatchIdx, bool seq2batch)¶

Protected Attributes

IVectorPtr batchStartPositions_¶

IVectorPtr seq2BatchIdx_¶

IVectorPtr cpuSeq2BatchIdx_¶

IVectorPtr cpuSeqIdx_¶

IVectorPtr cpuSeqEndIdxInBatch_¶

IVectorPtr seqIdx_¶

IVectorPtr seqEndIdxInBatch_¶

size_t numBatch_¶

bool useGpu_¶

MatrixPtr batchValue_¶

LSTM¶

LstmLayer¶

class paddle::LstmLayer¶

LstmLayer takes 1 input layer with size * 4. The input layer is divided into 4 equal parts: (input_s, input_ig, input_fg, input_og)

For each sequence [start, end] it performs the following computation:

output_{i} = actState(state_{i}) * actGate(outputGate_{i})
state_{i} = actInput(input_s_{i} + bias_s +
            output_{i-1} * recurrIW) * actGate(inputGate_{i}) +
            actGate(forgetGate_{i}) * state_{i-1}
inputGate_{i} = input_ig_{i} + bias_ig + output_{i-1} * recurrIGW +
                state_{i-1} * inputCheck
outputGate_{i} = input_og_{i} + bias_og + output_{i-1} * recurrOGW +
                 state_{i} * outputCheck
forgetGate_{i} = input_fg_{i} + bias_fg + output_{i-1} * recurrFGW +
                 state_{i-1} * forgetCheck

parameter[0] consists of (recurrIW, recurrIGW, recurrFGW, recurrOGW)
biasParameter consists of (bias_s, bias_ig, bias_og, bias_fg, inputCheck, forgetCheck, outputCheck)
actInput is defined by config active_type.
actState is defined by config active_state_type.
actGate is defined by config active_gate_type.

There are two ways to compute: one sequence at a time or one batch at a time. By default, when pre_batch_state is not set to true, it computes batch by batch.

The formula in the paper is as follows:

\[\begin{split} i_t = \sigma(W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i) \\ f_t = \sigma(W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f) \\ \tilde{c_t} = tanh (W_{xc}x_t+W_{hc}h_{t-1} + b_c) \\ o_t = \sigma(W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}c_t + b_o) \\ c_t = f_t * c_{t-1} + i_t * \tilde{c_t} \\ h_t = o_t tanh(c_t) \end{split}\]

The weight ([size, 4*size]) contains \(W_{hi}, W_{hf}, W_{hc}, W_{ho}\). The bias contains \(b_i, b_f, b_c, b_o\) and \(W_{ci}, W_{cf}, W_{co}\).

Note: The operations \(W_{xi}x_{t}, W_{xf}x_{t}, W_{xc}x_{t}, W_{xo}x_{t}\) on the input sequence are NOT included in LstmLayer, so users should apply fc_layer or mixed_layer before this layer.
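One time step of the formula from the paper can be sketched as follows. The sketch makes several assumptions: the peephole weights \(W_{ci}, W_{cf}, W_{co}\) act element-wise (consistent with their being stored in the bias), the gate activation is the logistic sigmoid, the input and state activations are tanh, and each pre* argument already contains \(W_x x_t + W_h h_{t-1} + b\) for its gate. All names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct LstmState {
  std::vector<float> c;  // cell state c_t
  std::vector<float> h;  // hidden output h_t
};

inline float sigmoid(float v) { return 1.0f / (1.0f + std::exp(-v)); }

// preC, preI, preF, preO: per-unit pre-activations, assumed to already
// hold W_x x_t + W_h h_{t-1} + b for the candidate and the three gates.
// wCi, wCf, wCo: element-wise peephole weights (an assumption).
LstmState lstmStep(const std::vector<float>& preC,
                   const std::vector<float>& preI,
                   const std::vector<float>& preF,
                   const std::vector<float>& preO,
                   const std::vector<float>& cPrev,
                   const std::vector<float>& wCi,
                   const std::vector<float>& wCf,
                   const std::vector<float>& wCo) {
  const std::size_t n = cPrev.size();
  LstmState s{std::vector<float>(n), std::vector<float>(n)};
  for (std::size_t k = 0; k < n; ++k) {
    const float i = sigmoid(preI[k] + wCi[k] * cPrev[k]);
    const float f = sigmoid(preF[k] + wCf[k] * cPrev[k]);
    const float cTilde = std::tanh(preC[k]);
    s.c[k] = f * cPrev[k] + i * cTilde;
    // The output gate peeks at the new state c_t, not c_{t-1}.
    const float o = sigmoid(preO[k] + wCo[k] * s.c[k]);
    s.h[k] = o * std::tanh(s.c[k]);
  }
  return s;
}
```

Note the ordering inside the loop: the input and forget gates use the previous state, while the output gate can only be evaluated after the new state has been computed.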

Inherits from paddle::Layer, paddle::LstmCompute

Subclassed by paddle::MDLstmLayer

Public Functions

LstmLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

void resetState()¶

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating sequences.

This is used for sequence generation. When generating sequence, the calculation at current timestamp depends on the state from previous timestamp. The model needs to keep the information about the previous timestamp in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

void setState(LayerStatePtr state)¶: Set layer state.

LayerStatePtr getState()¶

Get layer state.

Return: A copy of internal state.

Protected Functions

void forwardSequence(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)¶

Compute lstm forward one sequence by one sequence.

Parameters

batchSize: Not equal to the batch_size in the config file. It is the total number of words of all samples in this forward batch.
numSequences: The number of samples. It is equal to the batch_size in the config file.
starts: The start position of each sample.
inputValue: The input values.

void backwardSequence(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputGrad)¶: Compute lstm backward one sequence by one sequence.

void forwardBatch(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)¶

Compute lstm forward one batch at a time. The batch value is reorganized by the SequenceToBatch class, and the batch output value is converted back into a sequence value after the forward pass finishes. Here, one batch contains one word of each sample. If the samples differ in length, the batch is not padded with zeros and simply contains fewer words. The total number of batches is the maximum sequence length. For details, refer to the SequenceToBatch class. In GPU mode, the GPU kernel is launched inside the following loop:

for (int i = 0; i < numBatch; ++i) {  // numBatch equals max_sequence_length
  // compute one batch
}
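The reorganization into batches described above can be sketched by computing, for every time step, which word positions belong to that batch. The sketch assumes that starts holds numSequences + 1 offsets, so that sequence s covers [starts[s], starts[s + 1]). The function name is hypothetical, and the real SequenceToBatch class may order sequences differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// For each batch t, collects the position of the t-th word of every
// sequence that is longer than t. No padding is added, so later
// batches contain fewer words.
// Assumption: starts has numSequences + 1 entries.
std::vector<std::vector<int>> sequenceToBatchIndices(
    const std::vector<int>& starts) {
  std::vector<std::vector<int>> batches;
  if (starts.size() < 2) {
    return batches;
  }
  const std::size_t numSequences = starts.size() - 1;
  int maxLength = 0;
  for (std::size_t s = 0; s < numSequences; ++s) {
    maxLength = std::max(maxLength, starts[s + 1] - starts[s]);
  }
  batches.resize(static_cast<std::size_t>(maxLength));
  for (int t = 0; t < maxLength; ++t) {
    for (std::size_t s = 0; s < numSequences; ++s) {
      if (t < starts[s + 1] - starts[s]) {
        batches[static_cast<std::size_t>(t)].push_back(starts[s] + t);
      }
    }
  }
  return batches;
}
```

For three sequences of lengths 3, 2 and 1 this yields three batches with 3, 2 and 1 words, which matches the statement that the number of batches equals the maximum sequence length.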

void backwardBatch(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputGrad)¶: Compute lstm backward one batch by one batch.

void forwardSeqParallel(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)¶: This function only supports GPU. It does not need to reorganize the input into batch values. It launches one kernel to compute forward propagation in parallel at the sequence level.

void backwardSeqParallel(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputGrad)¶: Backward propagation corresponding to forwardSeqParallel.

void getPrevBatchOutput(size_t numSequences)¶: This function is used for sequence generation and gets the output after forwardBatch.

void getPrevBatchState(size_t numSequences)¶: This function is used for sequence generation and gets the state after forwardBatch.

Protected Attributes

std::unique_ptr<Weight> weight_¶: Learned parameters, shape: (size, 4*size). The weight ([size, 4*size]) contains \(W_{hi}, W_{hf}, W_{hc}, W_{ho}\).

std::unique_ptr<Weight> bias_¶: Learned bias parameter, shape: (1, 7 * size). The bias contains \(b_i, b_f, b_c, b_o\) and \(W_{ci}, W_{cf}, W_{co}\).

MatrixPtr localBias_¶: The real bias, pointing to \(b_i, b_f, b_c, b_o\).

MatrixPtr checkIg_¶: The peephole connection for input gate.

MatrixPtr checkFg_¶: The peephole connection for forget gate.

MatrixPtr checkOg_¶: The peephole connection for output gate.

MatrixPtr localBiasGrad_¶: The gradient of real bias.

MatrixPtr checkIgGrad_¶: The gradient of peephole connection for input gates.

MatrixPtr checkFgGrad_¶: The gradient of peephole connection for forget gates.

MatrixPtr checkOgGrad_¶: The gradient of peephole connection for output gates.

Argument state_¶: Stores the cell state of previous time step, namely \(c_{t-1}\).

Argument preOutput_¶: Stores the hidden output of the previous time step, namely \(h_{t-1}\).

Argument gate_¶: Stores the value and gradient of four gates, namely \(i_t, f_t, o_t, c_t\).

bool reversed_¶: Whether it is reversed lstm.

bool useBatch_¶: Whether to use batch method to compute.

bool useSeqParallel_¶: Whether to use the sequence-parallel method to compute.

std::unique_ptr<SequenceToBatch> batchValue_¶: batchValue_ is used in the batch calculation method. It stores the batch values obtained by reorganizing the input.

std::unique_ptr<SequenceToBatch> batchGrad_¶: The gradient of batchValue_.

MatrixPtr prevState_¶: Used in generation and stores the state of previous time step.

MatrixPtr prevOutput_¶: Used in generation and stores the output of previous time step.

MatrixPtr prevBatchOutput2_¶

MatrixPtr totalState_¶: The total state.

LstmStepLayer¶

class paddle::LstmStepLayer¶

Inherits from paddle::Layer, paddle::LstmCompute

Public Functions

LstmStepLayer(const LayerConfig &config)¶

~LstmStepLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

Argument state_¶

Argument gate_¶

Argument stateActive_¶

MatrixPtr checkIg_¶

MatrixPtr checkFg_¶

MatrixPtr checkOg_¶

MatrixPtr checkIgGrad_¶

MatrixPtr checkFgGrad_¶

MatrixPtr checkOgGrad_¶

std::unique_ptr<Weight> weight_¶

LstmCompute¶

class paddle::LstmCompute¶

Subclassed by paddle::LstmLayer, paddle::LstmStepLayer

Public Functions

void init(LayerConfig &config)¶

template <bool useGpu>
void forwardBatch(hl_lstm_value value, int frameSize, int batchSize)¶: LstmLayer batch compute API (forwardBatch, backwardBatch). If the batch compute api is used, the lstm value (and grad) need to be in batch structure. Compute order:

forwardBatch: for 0 <= id < numBatch
backwardBatch: for numBatch > id >= 0

template <bool useGpu>
void backwardBatch(hl_lstm_value value, hl_lstm_grad grad, int frameSize, int batchSize)¶

template <bool useGpu>
void forwardOneSequence(hl_lstm_value value, int frameSize)¶: LstmLayer sequence compute API (forwardOneSequence, backwardOneSequence). Compute order (for each sequence):

forwardOneSequence:
if (!reversed) for 0 <= seqId < seqLength
if (reversed) for seqLength > seqId >= 0
backwardOneSequence:
if (!reversed) for seqLength > seqId >= 0
if (reversed) for 0 <= seqId < seqLength

template <bool useGpu>
void backwardOneSequence(hl_lstm_value value, hl_lstm_grad grad, int frameSize)¶

template <>
void forwardOneSequence(hl_lstm_value value, int frameSize)¶

template <>
void backwardOneSequence(hl_lstm_value value, hl_lstm_grad grad, int frameSize)¶

template <>
void forwardBatch(hl_lstm_value value, int frameSize, int batchSize)¶

template <>
void backwardBatch(hl_lstm_value value, hl_lstm_grad grad, int frameSize, int batchSize)¶

template <>
void forwardBatch(hl_lstm_value value, int frameSize, int batchSize)

template <>
void backwardBatch(hl_lstm_value value, hl_lstm_grad grad, int frameSize, int batchSize)

template <>
void forwardOneSequence(hl_lstm_value value, int frameSize)

template <>
void backwardOneSequence(hl_lstm_value value, hl_lstm_grad grad, int frameSize)

Public Members

hl_activation_mode_t activeNode_¶

hl_activation_mode_t activeGate_¶

hl_activation_mode_t activeState_¶

MDLSTM¶

MDLstmLayer¶

class paddle::MDLstmLayer¶

Inherits from paddle::LstmLayer

Public Functions

MDLstmLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Functions

void forwardOneSequence(int start, CoordIterator &coordIter)¶

void backwardOneSequence(int start, CoordIterator &coordIter)¶

void forwardGate2OutputSequence(int start, CoordIterator &coordIter)¶

void backwardGate2OutputSequence(int start, CoordIterator &coordIter)¶

Protected Attributes

std::vector<Argument> frameInputGate_¶

std::vector<Argument> frameForgetGate_¶

std::vector<Argument> frameOutputGate_¶

std::vector<Argument> frameInputNode_¶

std::vector<Argument> frameGate_¶

std::vector<Argument> frameState_¶

std::vector<Argument> framePreOutput_¶

std::vector<Argument> frameOutput_¶

std::unique_ptr<ActivationFunction> activationGate_¶

std::unique_ptr<ActivationFunction> activationState_¶

int numDims_¶

size_t numBlocks_¶

std::vector<bool> directions_¶

std::vector<int> delays_¶

std::vector<std::vector<int>> dimsV_¶

CoordIterator¶

class paddle::CoordIterator¶

Public Functions

void step(size_t d, bool reversed)¶

CoordIterator(std::vector<int> dim, std::vector<bool> directions)¶

CoordIterator &operator++()¶

CoordIterator &operator--()¶

std::vector<int> &curPos()¶

int offset()¶

int offset(const std::vector<int> &pos)¶

std::vector<int> &begin()¶

std::vector<int> &rbegin()¶

bool end()¶

bool getPrePos(const std::vector<int> &delays, int idx, std::vector<int> &prePos)¶

bool getNextPos(const std::vector<int> &delays, int idx, std::vector<int> &nextPos)¶

Public Members

std::vector<int> dims_¶

std::vector<bool> directions_¶

std::vector<int> curPos_¶

bool end_¶

GRU¶

GatedRecurrentLayer¶

class paddle::GatedRecurrentLayer¶

Please refer to “Junyoung Chung, Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling”.

GatedRecurrentLayer takes 1 input layer with size * 3. The input layer is divided into 3 equal parts: (xz_t, xr_t, xi_t). parameter and biasParameter are also divided into 3 equal parts:

parameter consists of (U_z, U_r, U)
biasParameter consists of (bias_z, bias_r, bias_o)

\[\begin{split} update \ gate: z_t = actGate(xz_t + U_z * h_{t-1} + bias_z) \\ reset \ gate: r_t = actGate(xr_t + U_r * h_{t-1} + bias_r) \\ output \ candidate: \tilde{h}_t = actNode(xi_t + U * dot(r_t, h_{t-1}) + bias_o) \\ hidden \ activation: h_t = dot((1-z_t), h_{t-1}) + dot(z_t, \tilde{h}_t) \end{split}\]

The config file api is grumemory.

Note

dot denotes “element-wise multiplication”.
actNode is defined by config active_type
actGate is defined by config active_gate_type
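One GRU time step following the equations above can be sketched as follows. The sketch assumes actGate is the logistic sigmoid and actNode is tanh, treats each of U_z, U_r and U as a row-major [size x size] matrix multiplied with a column vector, and interprets the output candidate as a separate quantity from the final h_t. The struct and function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// y = M * v for a row-major [size x size] matrix M (assumed layout).
inline Vec matVec(const Vec& M, const Vec& v) {
  const std::size_t size = v.size();
  Vec y(size, 0.0f);
  for (std::size_t j = 0; j < size; ++j) {
    for (std::size_t k = 0; k < size; ++k) {
      y[j] += M[j * size + k] * v[k];
    }
  }
  return y;
}

inline float logistic(float v) { return 1.0f / (1.0f + std::exp(-v)); }

struct GruParams {
  Vec Uz, Ur, U;            // recurrent weights
  Vec biasZ, biasR, biasO;  // biases
};

// xz, xr, xi: the three equal parts of the input at time t.
Vec gruStep(const Vec& xz, const Vec& xr, const Vec& xi, const Vec& hPrev,
            const GruParams& p) {
  const std::size_t size = hPrev.size();
  const Vec uzh = matVec(p.Uz, hPrev);
  const Vec urh = matVec(p.Ur, hPrev);
  Vec z(size), resetPrev(size);
  for (std::size_t j = 0; j < size; ++j) {
    z[j] = logistic(xz[j] + uzh[j] + p.biasZ[j]);             // update gate
    const float r = logistic(xr[j] + urh[j] + p.biasR[j]);    // reset gate
    resetPrev[j] = r * hPrev[j];                              // dot(r_t, h_{t-1})
  }
  const Vec uCand = matVec(p.U, resetPrev);
  Vec h(size);
  for (std::size_t j = 0; j < size; ++j) {
    const float candidate = std::tanh(xi[j] + uCand[j] + p.biasO[j]);
    h[j] = (1.0f - z[j]) * hPrev[j] + z[j] * candidate;
  }
  return h;
}
```

The intermediate resetPrev plays the role suggested by the resetOutput_ attribute: the reset gate is applied to the previous output before the candidate is computed.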

Inherits from paddle::Layer, paddle::GruCompute

Public Functions

GatedRecurrentLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

void resetState()¶

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating sequences.

This is used for sequence generation. When generating sequence, the calculation at current timestamp depends on the state from previous timestamp. The model needs to keep the information about the previous timestamp in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

void setState(LayerStatePtr state)¶: Set layer state.

LayerStatePtr getState()¶

Get layer state.

Return: A copy of internal state.

Protected Functions

void forwardSequence(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)¶

void backwardSequence(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputGrad)¶

void forwardBatch(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)¶

void backwardBatch(int batchSize, MatrixPtr inputGrad)¶

Protected Attributes

std::unique_ptr<Weight> weight_¶

std::unique_ptr<Weight> gateWeight_¶

std::unique_ptr<Weight> stateWeight_¶

std::unique_ptr<Weight> bias_¶

Argument gate_¶

Argument resetOutput_¶

bool reversed_¶

bool useBatch_¶

std::unique_ptr<SequenceToBatch> batchValue_¶

std::unique_ptr<SequenceToBatch> batchGrad_¶

std::unique_ptr<ActivationFunction> activationGate_¶

MatrixPtr prevOutput_¶

GruStepLayer¶

class paddle::GruStepLayer¶

GruStepLayer is like GatedRecurrentLayer, but is used in a recurrent layer group. GruStepLayer takes 2 input layers.

input[0] has size * 3 and is divided into 3 equal parts: (xz_t, xr_t, xi_t).
input[1] has size: {prev_out}.

parameter and biasParameter are also divided into 3 equal parts:

parameter consists of (U_z, U_r, U)
biasParameter consists of (bias_z, bias_r, bias_o)

\[\begin{split} update \ gate: z_t = actGate(xz_t + U_z * prev\_out + bias_z) \\ reset \ gate: r_t = actGate(xr_t + U_r * prev\_out + bias_r) \\ output \ candidate: \tilde{h}_t = actNode(xi_t + U * dot(r_t, prev\_out) + bias_o) \\ output: h_t = dot((1-z_t), prev\_out) + dot(z_t, \tilde{h}_t) \end{split}\]

The config file api is gru_step_layer.

Note

dot denotes “element-wise multiplication”.
actNode is defined by config active_type
actGate is defined by config active_gate_type

Inherits from paddle::Layer, paddle::GruCompute

Public Functions

GruStepLayer(const LayerConfig &config)¶

~GruStepLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

Argument gate_¶

Argument resetOutput_¶

std::unique_ptr<Weight> weight_¶

std::unique_ptr<Weight> bias_¶

GruCompute¶

class paddle::GruCompute¶

Subclassed by paddle::GatedRecurrentLayer, paddle::GruStepLayer

Public Functions

void init(LayerConfig &config)¶

template <bool useGpu>
void forward(hl_gru_value value, int frameSize, int batchSize = 1)¶

template <bool useGpu>
void backward(hl_gru_value value, hl_gru_grad grad, int frameSize, int batchSize = 1)¶

template <>
void forward(hl_gru_value value, int frameSize, int batchSize)¶

template <>
void backward(hl_gru_value value, hl_gru_grad grad, int frameSize, int batchSize)¶

template <>
void forward(hl_gru_value value, int frameSize, int batchSize)

template <>
void backward(hl_gru_value value, hl_gru_grad grad, int frameSize, int batchSize)

Public Members

hl_activation_mode_t activeNode_¶

hl_activation_mode_t activeGate_¶

Recurrent Layer Group¶

AgentLayer¶

class paddle::AgentLayer¶

AgentLayer is used as a virtual input of another layer in the config. Before forward/backward is executed, setRealLayer() should be called to set one and only one real layer.

Inherits from paddle::Layer

Subclassed by paddle::SequenceAgentLayer

Public Functions

AgentLayer(const LayerConfig &config)¶

~AgentLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void setRealLayer(LayerPtr layer, int numSamples = 0)¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

LayerPtr realLayer_¶

int numSamples_¶

SequenceAgentLayer¶

class paddle::SequenceAgentLayer¶

Like AgentLayer, but uses the first numSamples sequences.

Inherits from paddle::AgentLayer

Public Functions

SequenceAgentLayer(const LayerConfig &config)¶

~SequenceAgentLayer()¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

GatherAgentLayer¶

class paddle::GatherAgentLayer¶

Like AgentLayer, but it can gather many real layers. Each real layer gives a few rows of a sequence; after gathering all real layers, GatherAgentLayer collects a complete sequence.

Inherits from paddle::Layer

Subclassed by paddle::SequenceGatherAgentLayer

Public Functions

GatherAgentLayer(const LayerConfig &config)¶

virtual ~GatherAgentLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void copyIdAndSequenceInfo(const Argument &input, const IVectorPtr &allIds, const std::vector<int> &idIndex)¶

void addRealLayer(LayerPtr layer)¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::vector<LayerPtr> realLayers_¶

std::vector<IVectorPtr> idsVec_¶

IVectorPtr allIds_¶

std::vector<int> idIndex_¶

SequenceGatherAgentLayer¶

class paddle::SequenceGatherAgentLayer¶

Like GatherAgentLayer, but selects a few sequences in the real layer. The ids in addRealLayer() are the ids of the selected sequences. It is used to reorder sequence output.

Inherits from paddle::GatherAgentLayer

Public Functions

SequenceGatherAgentLayer(const LayerConfig &config)¶

virtual ~SequenceGatherAgentLayer()¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

ScatterAgentLayer¶

class paddle::ScatterAgentLayer¶

Like AgentLayer, but selects only a few rows in the real layer. The range [idIndex, idIndex + idSize) of ids in setRealLayerAndOutput() gives the selected row ids. It is used to scatter one layer’s output to many small submodels. ScatterAgentLayer also supports a real layer that holds ids; in that case, the agent selects a few ids in the real layer.
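The row selection described above can be sketched on a row-major buffer. This is an illustration only: selectRows is a hypothetical helper, and the real layer may share memory with the real layer's output instead of copying.

```cpp
#include <cstddef>
#include <vector>

// Copies the rows named by ids[idIndex .. idIndex + idSize) out of a
// row-major matrix with `width` columns. Assumes the id range and every
// row id are in bounds.
std::vector<float> selectRows(const std::vector<float>& real,
                              std::size_t width, const std::vector<int>& ids,
                              std::size_t idIndex, std::size_t idSize) {
  std::vector<float> out;
  out.reserve(idSize * width);
  for (std::size_t n = idIndex; n < idIndex + idSize; ++n) {
    const std::size_t row = static_cast<std::size_t>(ids[n]);
    for (std::size_t j = 0; j < width; ++j) {
      out.push_back(real[row * width + j]);
    }
  }
  return out;
}
```

Several agents can be given different [idIndex, idIndex + idSize) windows over the same ids vector, so that each submodel receives its own subset of rows.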

Inherits from paddle::Layer

Subclassed by paddle::SequenceScatterAgentLayer

Public Functions

ScatterAgentLayer(const LayerConfig &config)¶

virtual ~ScatterAgentLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void setRealLayer(LayerPtr layer, const std::vector<int> &ids, bool copyId = false)¶

Set the real layer in generation.

Parameters

layer[input]: realLayer
ids[input]: row id in real layer
copyId[input]: whether to copy a CPU version of ids; false (default) in ScatterAgentLayer, and true in SequenceScatterAgentLayer.

void setRealLayerAndOutput(LayerPtr layer, const Argument &outArg, const IVectorPtr &ids, int idIndex, int idSize)¶

void setSequenceStartPositions(const ICpuGpuVectorPtr &sequenceStartPositions, int seqStartPosIndex, int numSequences)¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

LayerPtr realLayer_¶

IVectorPtr ids_¶

IVectorPtr cpuIds_¶

Argument realOutArg_¶

int idIndex_¶

int idSize_¶

int seqStartPosIndex_¶

int numSequences_¶

SequenceScatterAgentLayer¶

class paddle::SequenceScatterAgentLayer¶

Like ScatterAgentLayer, but selects a few sequences in the real layer. The ids in setRealLayer() or setRealLayerAndOutput() are the ids of the selected sequences. It is used to reorder sequence input.

Inherits from paddle::ScatterAgentLayer

Public Functions

SequenceScatterAgentLayer(const LayerConfig &config)¶

virtual ~SequenceScatterAgentLayer()¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

ICpuGpuVectorPtr inputStartPos_¶

GetOutputLayer¶

class paddle::GetOutputLayer¶

Inherits from paddle::Layer

Public Functions

GetOutputLayer(const LayerConfig &config)¶

~GetOutputLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Mixed Layer¶

class paddle::MixedLayer¶

A mixed layer has multiple input layers. Each input layer is processed by a Projection or an Operator. The results of all Projections and Operators are summed together with the bias (if configured), and then go through an activation function and dropout (if configured).

The config file api is mixed_layer.

Inherits from paddle::Layer

Public Functions

MixedLayer(const LayerConfig &config)¶

~MixedLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void prefetch()¶: If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

void resetState()¶

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating a sequence.

This is used for sequence generation. When generating sequence, the calculation at current timestamp depends on the state from previous timestamp. The model needs to keep the information about the previous timestamp in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

void setState(LayerStatePtr state)¶: setState() should be called after getState(). The state argument consists of the states of all projections.

LayerStatePtr getState()¶: Return the state, which consists of the states of all projections.

Protected Attributes

std::vector<std::unique_ptr<Projection>> projections_¶

std::vector<std::unique_ptr<Operator>> operators_¶

std::vector<int> projectionStateMatrixSize_¶: the matrix size of projection state

std::unique_ptr<Weight> biases_¶

bool sharedBias_¶

DotMulProjection¶

class paddle::DotMulProjection¶

DotMulProjection performs element-wise multiplication with weight:

\[ out.row[i] += in.row[i] .* weight \]

where \(.*\) means element-wise multiplication.
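
The accumulation expressed by the formula above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Paddle implementation: `Row`, `Mat`, and `dotMulProjectionForward` are hypothetical names, and plain `std::vector` stands in for Paddle's matrix types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Paddle's matrix types.
using Row = std::vector<float>;
using Mat = std::vector<Row>;

// Adds in.row[i] .* weight into out.row[i] for every row i.
// Note the "+=": the result is accumulated into whatever out already holds.
void dotMulProjectionForward(const Mat& in, const Row& weight, Mat& out) {
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    assert(in[i].size() == weight.size());
    assert(out[i].size() == weight.size());
    for (std::size_t j = 0; j < weight.size(); ++j) {
      out[i][j] += in[i][j] * weight[j];
    }
  }
}
```

Because the result is added to the existing output rather than assigned, several projections can write into the same output buffer, which matches the summation described for MixedLayer.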

The config file api is dotmul_projection.

Inherits from paddle::Projection

Public Functions

DotMulProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)¶

void forward()¶

void backward(const UpdateCallback &callback)¶

Protected Attributes

std::unique_ptr<Weight> weight_¶: shared memory with parameter

DotMulOperator¶

class paddle::DotMulOperator¶

DotMulOperator takes two inputs and performs element-wise multiplication:

\[ out.row[i] += scale * (in1.row[i] .* in2.row[i]) \]

where \(.*\) means element-wise multiplication, and scale is a configured scalar whose default value is one.

The config file api is dotmul_operator.

Inherits from paddle::Operator

Public Functions

DotMulOperator(const OperatorConfig &config, bool useGpu)¶

void forward()¶

void backward()¶

FullMatrixProjection¶

class paddle::FullMatrixProjection¶

FullMatrixProjection performs full matrix multiplication:

\[ out.row[i] += in.row[i] * weight \]

The config file api is full_matrix_projection.

Inherits from paddle::Projection

Public Functions

FullMatrixProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)¶

void forward()¶

void backward(const UpdateCallback &callback)¶

Protected Attributes

std::unique_ptr<Weight> weight_¶

IdentityProjection¶

class paddle::IdentityProjection¶

IdentityProjection performs addition:

\[ out.row[i] += in.row[i] \]

The config file api is identity_projection.

Inherits from paddle::Projection

Public Functions

IdentityProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)¶

Constructor.

Note: IdentityProjection should not have any parameter.

void forward()¶

void backward(const UpdateCallback &callback)¶

IdentityOffsetProjection¶

class paddle::IdentityOffsetProjection¶

IdentityOffsetProjection is like IdentityProjection, but the layer size may be smaller than the input size. It selects dimensions [offset, offset+layer_size) from the input to perform addition:

\[ out.row[i] += in.row[i + \textrm{offset}] \]
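
The selection can be sketched as below. The sketch follows the prose description, in which the offset applies to the dimensions (columns) of each row. `Row`, `Mat`, and `identityOffsetForward` are hypothetical names, and `std::vector` stands in for Paddle's matrix types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Paddle's matrix types.
using Row = std::vector<float>;
using Mat = std::vector<Row>;

// For every row, adds input columns [offset, offset + layerSize) to the output.
// layerSize is taken from the width of the output rows.
void identityOffsetForward(const Mat& in, std::size_t offset, Mat& out) {
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    const std::size_t layerSize = out[i].size();
    assert(offset + layerSize <= in[i].size());
    for (std::size_t j = 0; j < layerSize; ++j) {
      out[i][j] += in[i][offset + j];
    }
  }
}
```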

The config file api is identity_projection.

Inherits from paddle::Projection

Public Functions

IdentityOffsetProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)¶

Constructor.

Note: IdentityOffsetProjection should not have any parameter.

void forward()¶

void backward(const UpdateCallback &callback)¶

TableProjection¶

class paddle::TableProjection¶

Table projection takes index data as input. It selects rows from the parameter where row_id is in input_ids:

\[ out.row[i] += table.row[ids[i]] \]

where \(out\) is out, \(table\) is parameter, \(ids\) is input_ids, and \(i\) is row_id.

The config file api is table_projection.

Note: If \(ids[i] = -1\), it will be ignored.
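
A minimal sketch of the lookup, including the handling of an id of -1, might look like this. `Row`, `Mat`, and `tableProjectionForward` are hypothetical names chosen for illustration, and `std::vector` stands in for Paddle's parameter and matrix types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Paddle's matrix types.
using Row = std::vector<float>;
using Mat = std::vector<Row>;

// Adds table.row[ids[i]] into out.row[i]; rows whose id is -1 are left untouched.
void tableProjectionForward(const Mat& table, const std::vector<int>& ids, Mat& out) {
  assert(ids.size() == out.size());
  for (std::size_t i = 0; i < ids.size(); ++i) {
    if (ids[i] == -1) continue;  // ignored, as stated in the note above
    assert(ids[i] >= 0 && static_cast<std::size_t>(ids[i]) < table.size());
    const Row& src = table[static_cast<std::size_t>(ids[i])];
    assert(out[i].size() == src.size());
    for (std::size_t j = 0; j < src.size(); ++j) {
      out[i][j] += src[j];
    }
  }
}
```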

Inherits from paddle::Projection

Public Functions

TableProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)¶

void prefetch(const Argument *in)¶: If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

void forward()¶

void backward(const UpdateCallback &callback)¶

Protected Attributes

std::unique_ptr<Weight> table_¶

TransposedFullMatrixProjection¶

class paddle::TransposedFullMatrixProjection¶

TransposedFullMatrixProjection performs full matrix multiplication:

\[ out.row[i] += in.row[i] * weight.transpose \]

The config file api is trans_full_matrix_projection.

Inherits from paddle::Projection

Public Functions

TransposedFullMatrixProjection(const ProjectionConfig &config, ParameterPtr parameter, bool useGPu)¶

void forward()¶

void backward(const UpdateCallback &callback)¶

Protected Attributes

std::unique_ptr<Weight> weight_¶

Aggregate Layers¶

Aggregate¶

AverageLayer¶

class paddle::AverageLayer¶

A layer for “internal average” for sequence input.

Input: one or more sequences. Each sequence contains some instances.
If SequenceLevel = kNonSeq:
- Output: output size is the number of input sequences (NOT input instances)
- output[i] = average_{for each instance in this sequence}{input[i]}
If SequenceLevel = kSeq:
- The input sequence must have sub-sequences
- Output: output size is the number of input sub-sequences
- output[i] = average_{for each instance in this sub-sequence}{input[i]}

The config file api is pooling_layer.

Inherits from paddle::SequencePoolLayer

Public Types

enum AverageStrategy¶

Values:

kAverage = 0¶

kSum = 1¶

kAverageSquareRootN = 2¶
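
The reference does not spell out what each strategy computes. The sketch below assumes, from the enumerator names only, that kAverage divides the per-sequence sum by the sequence length n, kSum leaves the sum as-is, and kAverageSquareRootN divides by the square root of n. `poolSequences` and its `starts` argument are hypothetical, and each instance is reduced to a single scalar to keep the example short.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum AverageStrategy { kAverage = 0, kSum = 1, kAverageSquareRootN = 2 };

// Pools each sequence of a flat input into one value.
// starts holds the sequence start offsets plus a final end offset,
// so starts = {0, 2, 6} describes two sequences: [0, 2) and [2, 6).
std::vector<float> poolSequences(const std::vector<float>& input,
                                 const std::vector<std::size_t>& starts,
                                 AverageStrategy mode) {
  assert(starts.size() >= 2 && starts.back() == input.size());
  std::vector<float> out;
  for (std::size_t s = 0; s + 1 < starts.size(); ++s) {
    assert(starts[s] < starts[s + 1]);
    float sum = 0.f;
    for (std::size_t i = starts[s]; i < starts[s + 1]; ++i) sum += input[i];
    const float n = static_cast<float>(starts[s + 1] - starts[s]);
    switch (mode) {
      case kSum: out.push_back(sum); break;                                // assumed: plain sum
      case kAverage: out.push_back(sum / n); break;                        // assumed: sum / n
      case kAverageSquareRootN: out.push_back(sum / std::sqrt(n)); break;  // assumed: sum / sqrt(n)
    }
  }
  return out;
}
```

The output has one entry per sequence rather than one per instance, which is the point made in the class description for SequenceLevel = kNonSeq.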

Public Functions

AverageLayer(const LayerConfig &config)¶

~AverageLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr outMtx_¶

MatrixPtr dataMtx_¶

int mode_¶

MaxLayer¶

class paddle::MaxLayer¶

A layer for “internal max” for sequence input.

Input: one or more sequences. Each sequence contains some instances.
If SequenceLevel = kNonSeq:
- Output: output size is the number of input sequences (NOT input instances)
- output[i] = max_{for each instance in this sequence}{input[i]}
If SequenceLevel = kSeq:
- The input sequence must have sub-sequences
- Output: output size is the number of input sub-sequences
- output[i] = max_{for each instance in this sub-sequence}{input[i]}

The config file api is pooling_layer.

Inherits from paddle::SequencePoolLayer

Public Functions

MaxLayer(const LayerConfig &config)¶

~MaxLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

IVectorPtr maxIndex_¶

SequenceLastInstanceLayer¶

class paddle::SequenceLastInstanceLayer¶

A layer for extracting the last instance of the input sequence.

Input: a sequence.
If SequenceLevel = kNonSeq:
- Output: a sequence containing only the last instance of the input sequence
If SequenceLevel = kSeq:
- The input sequence must have sub-sequences
- Output: a sequence containing only the last instance of each sub-sequence of the input sequence

The config file api is last_seq and first_seq.

Inherits from paddle::SequencePoolLayer

Public Functions

SequenceLastInstanceLayer(const LayerConfig &config)¶

~SequenceLastInstanceLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpSrc_¶

MatrixPtr tmpDest_¶

Concat¶

ConcatenateLayer¶

class paddle::ConcatenateLayer¶

A concatenate layer has multiple input layers. It concatenates the rows of each input into one row for the output of this layer and applies the activation.

Inherits from paddle::Layer

Public Functions

ConcatenateLayer(const LayerConfig &config)¶

~ConcatenateLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

ConcatenateLayer2¶

class paddle::ConcatenateLayer2¶

The concat2 layer is like the concat layer, but each input layer is processed by a Projection.

Inherits from paddle::Layer

Public Functions

ConcatenateLayer2(const LayerConfig &config)¶

~ConcatenateLayer2()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::vector<std::unique_ptr<Projection>> projections_¶

std::vector<Argument> projOutput_¶

std::vector<std::pair<size_t, size_t>> projCol_¶

bool sharedBias_¶

std::unique_ptr<Weight> biases_¶

SequenceConcatLayer¶

class paddle::SequenceConcatLayer¶

A layer for concatenating two sequences, with the second sequence following the first.

Input: two sequences, each containing some instances.
Output: a concatenated sequence of the two input sequences.

Inherits from paddle::Layer

Public Functions

SequenceConcatLayer(const LayerConfig &config)¶

~SequenceConcatLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> biases_¶

Subset¶

SubSequenceLayer¶

class paddle::SubSequenceLayer¶

A layer for taking a subsequence according to a given offset and size.

Input: original sequence, offset, size.
Output: subsequence.

Inherits from paddle::Layer

Public Functions

SubSequenceLayer(const LayerConfig &config)¶

~SubSequenceLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> biases_¶

MatrixPtr tmpSrc_¶

MatrixPtr tmpDest_¶

Reshaping Layers¶

BlockExpandLayer¶

class paddle::BlockExpandLayer¶

Expand feature map to minibatch matrix.

matrix width is: blockH_ * blockW_ * channels_
matrix height is: outputH_ * outputW_

\[\begin{split} outputH\_ = 1 + (2 * paddingH\_ + imgSizeH\_ - blockH\_ + strideH\_ - 1) / strideH\_ \\ outputW\_ = 1 + (2 * paddingW\_ + imgSizeW\_ - blockW\_ + strideW\_ - 1) / strideW\_ \end{split}\]
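
The output-size formula can be evaluated with a small helper. This is a sketch: `blockExpandOutputSize` is a hypothetical name, and integer (floor) division is assumed because the corresponding attributes are declared as `size_t`.

```cpp
#include <cassert>
#include <cstddef>

// Number of block positions along one axis:
//   1 + (2 * padding + imgSize - block + stride - 1) / stride
// With integer division this equals 1 + ceil((2 * padding + imgSize - block) / stride).
std::size_t blockExpandOutputSize(std::size_t imgSize, std::size_t block,
                                  std::size_t stride, std::size_t padding) {
  assert(stride > 0);
  assert(2 * padding + imgSize >= block);  // avoids unsigned underflow
  return 1 + (2 * padding + imgSize - block + stride - 1) / stride;
}
```

Applying the helper once with the height parameters and once with the width parameters gives outputH_ and outputW_; their product is the number of time steps described for this layer.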

The expand method is the same as in ExpandConvLayer, but the transposed value is saved. After expanding, output_.sequenceStartPositions stores the timeline. The number of time steps is outputH_ * outputW_ and the dimension of each time step is blockH_ * blockW_ * channels_. This layer can be used after a convolutional neural network and before a recurrent neural network.

The config file api is block_expand_layer.

Inherits from paddle::Layer

Public Functions

BlockExpandLayer(const LayerConfig &config)¶

~BlockExpandLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Functions

size_t getBlockNum()¶

Calculate outputH_ and outputW_ and return the number of blocks, which is the number of time steps.

Return: time steps, outputH_ * outputW_.

Protected Attributes

size_t blockH_¶

size_t blockW_¶

size_t strideH_¶

size_t strideW_¶

size_t paddingH_¶

size_t paddingW_¶

size_t imgSizeH_¶

size_t imgSizeW_¶

size_t outputH_¶

size_t outputW_¶

size_t channels_¶

MatrixPtr outVTrans_¶: auxiliary variable, which saves the transposed output value.

ExpandLayer¶

class paddle::ExpandLayer¶

A layer for “Expand Dense data or (sequence data where the length of each sequence is one) to sequence data.”

It should have exactly 2 inputs, one for data and one for size:

first one for data
- If ExpandLevel = kNonSeq: dense data
- If ExpandLevel = kSeq: sequence data where the length of each sequence is one
second one only for sequence info
- should be sequence data with or without sub-sequence.

The output size is the batch size (not the number of instances) of the second input.

The config file api is expand_layer.

Inherits from paddle::Layer

Public Functions

ExpandLayer(const LayerConfig &config)¶

~ExpandLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Types

enum ExpandLevel¶

if input[0] is dense data, ExpandLevel=kNonSeq; if input[0] is sequence data, ExpandLevel=kSeq

Values:

kNonSeq = 0¶

kSeq = 1¶

Protected Attributes

std::unique_ptr<Weight> biases_¶

int type_¶: store the ExpandLevel

ICpuGpuVectorPtr expandStartsPos_¶: expanded sequenceStartPositions or subSequenceStartPositions of input[1]

FeatureMapExpandLayer¶

class paddle::FeatureMapExpandLayer¶

A layer for expanding a batch of images to feature maps. Each input sample is a 2-dimensional matrix. Each element of the matrix is replicated num_filters times to create a feature map with num_filters channels.

Input: the input should be dense image data.
Output: expanded feature maps.
\[ y.row[i] = x.row[i \mod x.width], i = 0,1,..., (x.width * num\_filters - 1) \]
For example, num_filters = 4:
```
x = [a1,a2;
     b1,b2]
y = [a1, a2, a1, a2, a1, a2, a1, a2;
     b1, b2, b1, b2, b1, b2, b1, b2;]
```
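
The same replication can be written as a short function operating on one row. This is a sketch of the indexing formula above only: `Row` and `featureMapExpandRow` are hypothetical names, and `std::vector` stands in for Paddle's matrix rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one row of a Paddle matrix.
using Row = std::vector<float>;

// Builds y with y[i] = x[i mod x.width] for i in [0, x.width * numFilters),
// i.e. the input row repeated numFilters times.
Row featureMapExpandRow(const Row& x, std::size_t numFilters) {
  assert(!x.empty());
  Row y(x.size() * numFilters);
  for (std::size_t i = 0; i < y.size(); ++i) {
    y[i] = x[i % x.size()];
  }
  return y;
}
```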

Inherits from paddle::Layer

Public Functions

FeatureMapExpandLayer(const LayerConfig &config)¶

~FeatureMapExpandLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

ResizeLayer¶

class paddle::ResizeLayer¶

A layer for resizing a minibatch matrix h*w to h’*w’.

Note: the original matrix is (height * width); the resized matrix is (height * width / size) * size.

Inherits from paddle::Layer

Public Functions

ResizeLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

SequenceReshapeLayer¶

class paddle::SequenceReshapeLayer¶

A layer for reshaping the sequence.

Input: a sequence.
Output: a sequence.

Inherits from paddle::Layer

Public Functions

SequenceReshapeLayer(const LayerConfig &config)¶

~SequenceReshapeLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> biases_¶

MatrixPtr reshapedOutputGrad¶

Math Layers¶

AddtoLayer¶

class paddle::AddtoLayer¶

This layer simply adds all input layers together, then applies the activation to the sum. All inputs of this layer should have the same size, which is also the output size of this layer.

\[ y=f(\sum_{i}x_i + b) \]

where \(y\) is output, \(x\) is input, \(b\) is bias, and \(f\) is activation function.

The config file api is addto_layer.

Inherits from paddle::Layer

Public Functions

AddtoLayer(const LayerConfig &config)¶

~AddtoLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization of AddtoLayer.

void forward(PassType passType)¶

Forward propagation.

Note: There is no weight matrix for each input, because it is just a simple add operation.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation.

Protected Attributes

std::unique_ptr<Weight> biases_¶

ConvexCombinationLayer¶

class paddle::ConvexCombinationLayer¶

A layer for weighted sum of vectors, which is used in NEURAL MACHINE TRANSLATION BY JOINTLY LEARNING TO ALIGN AND TRANSLATE.

Input: the size of the first input is weightDim, and the size of the second input is weightDim * dataDim.
Output: the size of the output is dataDim.
\[ out(j) = \sum_{i}(in0(i) * in1(i,j + i * dataDim)), i = 0,1,...,(weightDim-1); j = 0, 1,...,(dataDim-1) \]
Note that the above computation is for one sample. Multiple samples are processed in one batch.
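
For a single sample, the weighted sum can be sketched as follows. The second input is treated as a flat array of weightDim * dataDim values indexed by j + i * dataDim, as in the formula. `Row` and `convexCombination` are hypothetical names, not Paddle API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one row of a Paddle matrix.
using Row = std::vector<float>;

// out[j] = sum_i weights[i] * data[j + i * dataDim]
// weights has weightDim entries; data has weightDim * dataDim entries.
Row convexCombination(const Row& weights, const Row& data, std::size_t dataDim) {
  assert(data.size() == weights.size() * dataDim);
  Row out(dataDim, 0.f);
  for (std::size_t i = 0; i < weights.size(); ++i) {
    for (std::size_t j = 0; j < dataDim; ++j) {
      out[j] += weights[i] * data[j + i * dataDim];
    }
  }
  return out;
}
```

With weights {0.5, 0.5} and data {1, 2, 3, 4} (two vectors of dimension 2), the result is the element-wise mean {2, 3}.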

The config file api is linear_comb_layer.

Inherits from paddle::Layer

Public Functions

ConvexCombinationLayer(const LayerConfig &config)¶

~ConvexCombinationLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpMtx0¶: A matrix pointer pointing to second input.

MatrixPtr tmpRow0¶: A matrix pointer pointing to first input.

MatrixPtr tmpRow1¶: A matrix pointer pointing to output.

InterpolationLayer¶

class paddle::InterpolationLayer¶

A layer for linear interpolation with two inputs, which is used in NEURAL TURING MACHINE.

\[ y.row[i] = w[i] * x_1.row[i] + (1 - w[i]) * x_2.row[i] \]

where \(x_1\) and \(x_2\) are two (batchSize x dataDim) inputs, \(w\) is (batchSize x 1) weight vector, and \(y\) is (batchSize x dataDim) output.
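
A row-wise sketch of the interpolation, with one weight per batch row, is shown below. `Row`, `Mat`, and `interpolate` are hypothetical names, and `std::vector` stands in for Paddle's matrix types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Paddle's matrix types.
using Row = std::vector<float>;
using Mat = std::vector<Row>;

// y.row[i] = w[i] * x1.row[i] + (1 - w[i]) * x2.row[i]
Mat interpolate(const Row& w, const Mat& x1, const Mat& x2) {
  assert(w.size() == x1.size() && w.size() == x2.size());
  Mat y(x1.size());
  for (std::size_t i = 0; i < x1.size(); ++i) {
    assert(x1[i].size() == x2[i].size());
    y[i].resize(x1[i].size());
    for (std::size_t j = 0; j < x1[i].size(); ++j) {
      y[i][j] = w[i] * x1[i][j] + (1.f - w[i]) * x2[i][j];
    }
  }
  return y;
}
```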

The config file api is interpolation_layer.

Inherits from paddle::Layer

Public Functions

InterpolationLayer(const LayerConfig &config)¶

~InterpolationLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr weightLast_¶: weightLast = 1 - weight

MatrixPtr tmpMatrix¶

MultiplexLayer¶

class paddle::MultiplexLayer¶

This layer multiplexes multiple layers according to the index, which is provided by the first input layer.

Input[0]: the index of the layer to output, of size batchSize.
Input[1:N]: the candidate output data. For each index i from 0 to batchSize - 1, the output is the i-th row of the (index[i] + 1)-th layer.

For each i-th row of output:

\[ y[i][j] = x_{x_{0}[i] + 1}[i][j], j = 0,1, ... , (x_{1}.width - 1) \]

where \(y\) is the output, \(x_{k}\) is the k-th input layer, and \(k = x_{0}[i] + 1\).
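
The row selection can be sketched as follows. In this illustration `candidates[k]` plays the role of input layer k + 1, so the "+ 1" in the formula disappears from the indexing. `Row`, `Mat`, and `multiplex` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Paddle's matrix types.
using Row = std::vector<float>;
using Mat = std::vector<Row>;

// For each batch row i, copies row i of the candidate selected by index[i].
// candidates[k] corresponds to input layer k + 1 in the description above.
Mat multiplex(const std::vector<int>& index, const std::vector<Mat>& candidates) {
  Mat out(index.size());
  for (std::size_t i = 0; i < index.size(); ++i) {
    assert(index[i] >= 0 && static_cast<std::size_t>(index[i]) < candidates.size());
    const Mat& chosen = candidates[static_cast<std::size_t>(index[i])];
    assert(i < chosen.size());
    out[i] = chosen[i];
  }
  return out;
}
```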

Inherits from paddle::Layer

Public Functions

MultiplexLayer(const LayerConfig &config)¶

~MultiplexLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::vector<CopyInfo> copySchedule_¶: A list of CopyInfo used to save copy information.

MatrixPtr tmpSrc_¶: Temporary matrix pointer to point to input data.

MatrixPtr tmpDest_¶: Temporary matrix pointer to point to output data.

struct CopyInfo¶

A struct used to save the copy information, including the input layer index and the copy size.

Public Functions

CopyInfo(int inStartIdx, int inLength, int inCopyIdx)¶

Public Members

int startIdx¶: The start row of input.

int length¶: Number of rows. If the layer index in Input[0] is not consecutive, the length is one. Otherwise, the length is > 1 and multiple rows are copied at once.

int copyIdx¶: The copied layer index, which needs to add 1.

OuterProdLayer¶

class paddle::OuterProdLayer¶

A layer for computing the outer product of two vectors.

Note: used in NEURAL TURING MACHINE.

Input1: vector (batchSize * dim1)
Input2: vector (batchSize * dim2)
Output: a matrix: (batchSize * (dim1*dim2))
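
A per-sample sketch of the outer product is given below. The row-major flattening of the dim1 x dim2 result (index i * dim2 + j) is an assumption made for illustration, since the reference only states the output size. `Row`, `Mat`, and `outerProd` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Paddle's matrix types.
using Row = std::vector<float>;
using Mat = std::vector<Row>;

// For each batch row b, out[b][i * dim2 + j] = x1[b][i] * x2[b][j].
// The row-major layout of the flattened product is an assumption.
Mat outerProd(const Mat& x1, const Mat& x2) {
  assert(x1.size() == x2.size());
  Mat out(x1.size());
  for (std::size_t b = 0; b < x1.size(); ++b) {
    const std::size_t dim1 = x1[b].size();
    const std::size_t dim2 = x2[b].size();
    out[b].resize(dim1 * dim2);
    for (std::size_t i = 0; i < dim1; ++i) {
      for (std::size_t j = 0; j < dim2; ++j) {
        out[b][i * dim2 + j] = x1[b][i] * x2[b][j];
      }
    }
  }
  return out;
}
```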

Inherits from paddle::Layer

Public Functions

OuterProdLayer(const LayerConfig &config)¶

~OuterProdLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpMtx0¶

MatrixPtr tmpRow0¶

MatrixPtr tmpRow1¶

PowerLayer¶

class paddle::PowerLayer¶

This layer applies a power function to a vector element-wise, which is used in NEURAL TURING MACHINE.

\[ y = x^w \]

where \(x\) is an input vector, \(w\) is a scalar weight, and the output \(y\) is a vector.

The config file api is power_layer.

Inherits from paddle::Layer

Public Functions

PowerLayer(const LayerConfig &config)¶

~PowerLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpMtx¶

ScalingLayer¶

class paddle::ScalingLayer¶

A layer that multiplies each row of a matrix by an element of a vector, which is used in NEURAL TURING MACHINE.

\[ y.row[i] = w[i] * x.row[i] \]

where \(x\) is (batchSize x dataDim) input, \(w\) is (batchSize x 1) weight vector, and \(y\) is (batchSize x dataDim) output.

The config file api is scaling_layer.

Inherits from paddle::Layer

Public Functions

ScalingLayer(const LayerConfig &config)¶

~ScalingLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

SlopeInterceptLayer¶

class paddle::SlopeInterceptLayer¶

A layer for applying a slope and an intercept to the input element-wise. This layer is used in NEURAL TURING MACHINE.

\[ y = ax + b \]

Note: There is no activation and weight in this layer.

Here, a is scale and b is offset, which are provided as attributes of the layer.

The config file api is slope_intercept_layer.

Inherits from paddle::Layer

Public Functions

SlopeInterceptLayer(const LayerConfig &config)¶

~SlopeInterceptLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

TensorLayer¶

class paddle::TensorLayer¶

TensorLayer takes two input vectors.

\[ y_{i} = x_{1} * W_{i} * x_{2}^{\rm T}, i=0, 1, ...,K-1 \]


\(x_{1}\): the first input, size is M.
\(x_{2}\): the second input, size is N.
y: output, size is K.
\(y_{i}\): i-th element of y.
\(W_{i}\): the i-th learned weight, dimensions: [M, N].
\(x_{2}^{\rm T}\): the transpose of \(x_{2}\).
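
For one sample, the bilinear form can be sketched as below, using the sizes M, N, and K defined above. `Row`, `Mat`, and `tensorForward` are hypothetical names, and the bias and activation that the layer may also apply are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Paddle's matrix types.
using Row = std::vector<float>;
using Mat = std::vector<Row>;

// y[k] = x1 * W[k] * x2^T, where x1 has M entries, x2 has N entries,
// and each W[k] is an M x N matrix. The output has K = W.size() entries.
Row tensorForward(const Row& x1, const Row& x2, const std::vector<Mat>& W) {
  Row y(W.size(), 0.f);
  for (std::size_t k = 0; k < W.size(); ++k) {
    assert(W[k].size() == x1.size());
    for (std::size_t m = 0; m < x1.size(); ++m) {
      assert(W[k][m].size() == x2.size());
      for (std::size_t n = 0; n < x2.size(); ++n) {
        y[k] += x1[m] * W[k][m][n] * x2[n];
      }
    }
  }
  return y;
}
```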

The config file api is tensor_layer.

Inherits from paddle::Layer

Public Functions

TensorLayer(const LayerConfig &config)¶

~TensorLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

Weight &getWeight(int idx)¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

WeightList weights_¶

std::unique_ptr<Weight> biases_¶

TransLayer¶

class paddle::TransLayer¶

A layer for transposition.

\[ y = x^\mathrm{T} \]

where \(x\) is (M x N) input, and \(y\) is (N x M) output.

The config file api is trans_layer.

Inherits from paddle::Layer

Public Functions

TransLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Sampling Layers¶

MultinomialSampler¶

class paddle::MultinomialSampler¶

Given the probability of N objects, the sampler randomly selects one of the objects.

The space requirement is O(N)=O(N * sizeof(Interval)). The computational complexity of generating one sample is O(1).

Note: prob does not have to be normalized.
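
O(N) space and O(1) sampling from a table of (otherId, thresh) pairs are the properties of the alias method. The sketch below implements that technique as an illustration and is not taken from the Paddle source. `buildAliasTable` and `sampleAlias` are hypothetical names, and `thresh` is stored as `double` here rather than `real`.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct Interval {
  int otherId;    // id returned when the draw falls above thresh
  double thresh;  // share of the interval that belongs to the interval's own id
};

// Builds an alias table from possibly unnormalized, non-negative weights.
std::vector<Interval> buildAliasTable(const std::vector<double>& prob) {
  const std::size_t n = prob.size();
  assert(n > 0);
  double sum = 0.0;
  for (double p : prob) { assert(p >= 0.0); sum += p; }
  assert(sum > 0.0);
  std::vector<double> scaled(n);
  std::vector<std::size_t> small, large;
  std::vector<Interval> table(n);
  for (std::size_t i = 0; i < n; ++i) {
    scaled[i] = prob[i] * static_cast<double>(n) / sum;  // mean becomes 1
    (scaled[i] < 1.0 ? small : large).push_back(i);
    table[i] = Interval{static_cast<int>(i), 1.0};  // default: always return i
  }
  while (!small.empty() && !large.empty()) {
    const std::size_t s = small.back();
    small.pop_back();
    const std::size_t l = large.back();
    table[s] = Interval{static_cast<int>(l), scaled[s]};
    scaled[l] -= 1.0 - scaled[s];  // l donates the mass that fills interval s
    if (scaled[l] < 1.0) { large.pop_back(); small.push_back(l); }
  }
  return table;
}

// Draws one id in O(1): pick an interval uniformly, then compare with thresh.
template <typename URNG>
int sampleAlias(const std::vector<Interval>& table, URNG& g) {
  std::uniform_real_distribution<double> dist(0.0, static_cast<double>(table.size()));
  double r = dist(g);
  std::size_t i = static_cast<std::size_t>(r);
  if (i >= table.size()) i = table.size() - 1;  // guard against rounding at the upper end
  r -= static_cast<double>(i);
  return r < table[i].thresh ? static_cast<int>(i) : table[i].otherId;
}
```

Each of the N intervals is chosen with probability 1/N, which matches the description of intervals_ below; the threshold then decides between the interval's own id and its alias.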

Public Functions

MultinomialSampler(const real *prob, int size)¶

template <typename URNG>

int gen(URNG &g)¶

Generate a random sample.

Return

Random integer.

Parameters

g: is a random number engine. See <random>.

Protected Functions

template <typename Rand>

int gen1(Rand rand)¶

Generation.

Return

random int number or intervals_[random_int_number].otherId.

Parameters

rand: rand is a real random number distribution for the range [0, size).

Protected Attributes

std::vector<Interval> intervals_¶: The probability of each interval will be 1./size.

std::uniform_real_distribution<double> rand_¶

struct Interval¶

Public Members

int otherId¶

real thresh¶
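The O(N) space, O(1) sampling, and the (otherId, thresh) pair per interval match the shape of the alias method. The sketch below shows one way such a table could be built and sampled. It is an assumption about the technique, not the paddle source; AliasInterval, buildAliasTable, sampleAlias and impliedProbabilities are hypothetical names.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for the Interval struct documented above.
struct AliasInterval {
  int otherId;    // id returned when the draw lands above thresh
  double thresh;  // share of the interval that belongs to its own index
};

// Build one interval per object from non-negative, possibly unnormalized weights.
std::vector<AliasInterval> buildAliasTable(const std::vector<double>& prob) {
  const int n = static_cast<int>(prob.size());
  double sum = 0.0;
  for (double p : prob) sum += p;

  std::vector<AliasInterval> table(n);
  std::vector<double> scaled(n);
  std::vector<int> small, large;
  for (int i = 0; i < n; ++i) {
    scaled[i] = prob[i] * n / sum;  // rescale so the average weight is 1
    table[i] = {i, 1.0};            // default: the interval is entirely its own
    (scaled[i] < 1.0 ? small : large).push_back(i);
  }
  // Pair each under-full interval with an over-full donor.
  while (!small.empty() && !large.empty()) {
    const int s = small.back(); small.pop_back();
    const int l = large.back(); large.pop_back();
    table[s] = {l, scaled[s]};
    scaled[l] -= 1.0 - scaled[s];
    (scaled[l] < 1.0 ? small : large).push_back(l);
  }
  return table;
}

// Draw one id in O(1): pick an interval, then choose between its two ids.
template <typename URNG>
int sampleAlias(const std::vector<AliasInterval>& table, URNG& g) {
  const int n = static_cast<int>(table.size());
  std::uniform_real_distribution<double> dist(0.0, static_cast<double>(n));
  double r = dist(g);
  int i = static_cast<int>(r);
  if (i >= n) i = n - 1;  // guard against rounding up to the upper bound
  r -= i;
  return r < table[i].thresh ? i : table[i].otherId;
}

// Probability of each id implied by the table; useful for checking it.
std::vector<double> impliedProbabilities(const std::vector<AliasInterval>& table) {
  const std::size_t n = table.size();
  std::vector<double> p(n, 0.0);
  for (std::size_t i = 0; i < n; ++i) {
    p[i] += table[i].thresh / n;
    p[table[i].otherId] += (1.0 - table[i].thresh) / n;
  }
  return p;
}
```

Each interval carries total mass 1/size, which is consistent with the note on intervals_. The weights are divided by their sum during construction, so callers need not normalize them first.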

MaxIdLayer¶

class paddle::MaxIdLayer¶

A layer for finding the id which has the maximal value for each sample. The result is stored in output_.ids.

The config file api is maxid_layer.
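The per-sample arg-max described above can be sketched as follows, assuming the input is a row-major (rows x cols) buffer with one sample per row. maxIdPerRow is a hypothetical helper, not a paddle function.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// For each row of a row-major (rows x cols) matrix, return the column index
// of the largest value. Ties resolve to the first maximal column.
std::vector<int> maxIdPerRow(const std::vector<float>& value,
                             std::size_t rows, std::size_t cols) {
  std::vector<int> ids(rows);
  for (std::size_t r = 0; r < rows; ++r) {
    const float* begin = value.data() + r * cols;
    ids[r] = static_cast<int>(std::max_element(begin, begin + cols) - begin);
  }
  return ids;
}
```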

Inherits from paddle::Layer

Public Functions

MaxIdLayer(const LayerConfig &config)¶

virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

SamplingIdLayer¶

class paddle::SamplingIdLayer¶

A layer for sampling an id from the multinomial distribution given by the input layer. One id is sampled for each sample. The result is stored in output_.ids.

The config file api is sampling_id_layer.
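One common way to draw an id from a row of probabilities is to walk the cumulative sum until it passes a uniform draw. The sketch below assumes that approach and takes the uniform value u as a parameter so the behaviour is deterministic. sampleIdFromRow is a hypothetical helper, and the source does not state which sampling scheme the layer uses.

```cpp
#include <cstddef>
#include <vector>

// Pick an id from one row of probabilities using a uniform draw u in [0, 1).
int sampleIdFromRow(const std::vector<double>& prob, double u) {
  double cumulative = 0.0;
  for (std::size_t j = 0; j < prob.size(); ++j) {
    cumulative += prob[j];
    if (u < cumulative) return static_cast<int>(j);
  }
  // Fallback when rounding leaves the total slightly below u.
  return static_cast<int>(prob.size()) - 1;
}
```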

Inherits from paddle::Layer

Public Functions

SamplingIdLayer(const LayerConfig &config)¶

virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void forwardImp(const Argument &input)¶

virtual void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

Cost Layers¶

CostLayer¶

class paddle::CostLayer¶

Base class for a particular type of cost layer. This type of cost should have one data layer, one label layer and an optional weight layer as input. The derived class should implement forwardImp() and backwardImp(), which calculate the cost for data and label. The weight is automatically handled by the base class.

Inherits from paddle::Layer

Subclassed by paddle::HuberTwoClass, paddle::MultiBinaryLabelCrossEntropy, paddle::MultiClassCrossEntropy, paddle::MultiClassCrossEntropyWithSelfNorm, paddle::SoftBinaryClassCrossEntropy, paddle::SumOfSquaresCostLayer

Public Functions

CostLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

LayerPtr getOutputLayer()¶

LayerPtr getLabelLayer()¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

virtual void forwardImp(Matrix &outputValue, Argument &label, Matrix &cost) = 0¶

virtual void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad) = 0¶

Protected Attributes

LayerPtr weightLayer_¶

real coeff_¶

HuberTwoClass¶

class paddle::HuberTwoClass¶

Huber loss for robust two-class classification.

For label={0, 1}, let y=2*label-1. Given output f, the loss is:

\[\begin{split} Loss = \left\{\begin{matrix} -4 * y * f & \textit{if} \ \ y * f < -1 \\ (1 - y * f)^2 & \textit{if} \ \ -1 \le y * f < 1 \\ 0 & \textit{otherwise} \end{matrix}\right. \end{split}\]

Inherits from paddle::CostLayer

Public Functions

HuberTwoClass(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forwardImp(Matrix &output, Argument &label, Matrix &cost)¶

void forwardImpIn(Matrix &output, Argument &label, Matrix &cost)¶

void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)¶

void backwardImpIn(Matrix &outputValue, Argument &label, Matrix &outputGrad)¶

LambdaCost¶

class paddle::LambdaCost¶

LambdaRank is a method for learning arbitrary information retrieval measures. It can be applied to any algorithm that learns through gradient descent. LambdaRank is a listwise method, in that the cost depends on the sorted order of the documents. LambdaRank gives the gradient of the cost function:

\[ \lambda_{ij} = \frac{1}{1 + e^{o_i - o_j}} \left| \Delta_{NDCG} \right| \]

[1] Christopher J.C. Burges, Robert Ragno, Quoc Viet Le. Learning to Rank with Nonsmooth Cost Functions.
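The gradient formula above can be evaluated for a single document pair as follows. The sketch takes the NDCG change for swapping the pair as an input, since the source does not spell out how it is computed. lambdaIJ is a hypothetical helper.

```cpp
#include <cmath>

// lambda_ij for one pair: oi and oj are the model outputs for documents i and j,
// deltaNdcg is the change in NDCG obtained by swapping them.
double lambdaIJ(double oi, double oj, double deltaNdcg) {
  return std::fabs(deltaNdcg) / (1.0 + std::exp(oi - oj));
}
```

When the two outputs are equal, the sigmoid factor is 0.5, so the pair contributes half of |ΔNDCG|.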

Inherits from paddle::Layer

Public Functions

LambdaCost(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

LayerPtr getOutputLayer()¶

LayerPtr getScoreLayer()¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

void onPassEnd()¶: One pass is finished.

real calcNDCG(const real *outputScore, const real *score, int size)¶

void calcGrad(const real *outputScore, const real *score, real *gradData, int size)¶

MultiBinaryLabelCrossEntropy¶

class paddle::MultiBinaryLabelCrossEntropy¶

Cross entropy for multi binary labels.

\[ cost[i] = -\sum_{j} \left( label[i][j] * \log(output[i][j]) + (1 - label[i][j]) * \log(1 - output[i][j]) \right) \]
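For one sample i, the cost above can be sketched as below, assuming label holds 0 or 1 per output dimension. multiBinaryLabelCrossEntropy is a hypothetical free function; branching on the label avoids evaluating 0 * log(0).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Cross entropy for one sample with a binary label per output dimension.
double multiBinaryLabelCrossEntropy(const std::vector<double>& output,
                                    const std::vector<int>& label) {
  double cost = 0.0;
  for (std::size_t j = 0; j < output.size(); ++j) {
    // Only the term selected by the label contributes.
    cost -= label[j] ? std::log(output[j]) : std::log(1.0 - output[j]);
  }
  return cost;
}
```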

Inherits from paddle::CostLayer

Public Functions

MultiBinaryLabelCrossEntropy(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forwardImp(Matrix &output, Argument &label, Matrix &cost)¶

void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)¶

Protected Attributes

MatrixPtr targetPerDim_¶

MultiClassCrossEntropy¶

class paddle::MultiClassCrossEntropy¶

The cross-entropy loss for multi-class classification tasks. The loss function is:

\[ L = - \sum_{k}{t_{k} * log(P(y=k))} \]
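A sketch of the loss for a single sample, summing over the classes k. It assumes t is a target vector (one-hot for a hard label) and prob holds the predicted class probabilities. multiClassCrossEntropy is a hypothetical helper.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// L = -sum_k t_k * log(P(y = k)) for one sample.
double multiClassCrossEntropy(const std::vector<double>& prob,
                              const std::vector<double>& target) {
  double loss = 0.0;
  for (std::size_t k = 0; k < prob.size(); ++k) {
    if (target[k] != 0.0) loss -= target[k] * std::log(prob[k]);  // skip zero targets
  }
  return loss;
}
```

With a one-hot target the sum collapses to the negative log probability of the labeled class.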

Inherits from paddle::CostLayer

Public Functions

MultiClassCrossEntropy(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forwardImp(Matrix &output, Argument &label, Matrix &cost)¶

void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)¶

MultiClassCrossEntropyWithSelfNorm¶

class paddle::MultiClassCrossEntropyWithSelfNorm¶

The cross-entropy with self-normalization for multi-class classification.

The loss function is:

\[ L = \sum_{i}[-log(P(x_{i})) + \alpha * (log(Z(x_{i})))^2] \]

The \(Z(x)\) is the softmax normalizer.

[1] Jacob Devlin, Rabih Zbib, Zhongqiang Huang, Thomas Lamar, Richard Schwartz, and John Makhoul. Fast and robust neural network joint models for statistical machine translation. In Proceedings of the ACL 2014 Conference.
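The per-sample term of this loss can be sketched as below. Two assumptions are made: the penalty is read as alpha times the squared log of the normalizer, as in the cited paper, and the input is a vector of unnormalized positive scores so that P = score[label] / Z with Z the sum of the scores. selfNormCrossEntropy is a hypothetical helper.

```cpp
#include <cmath>
#include <vector>

// -log(P(label)) + alpha * (log Z)^2 for one sample.
// score: unnormalized positive scores (assumption); Z is their sum.
double selfNormCrossEntropy(const std::vector<double>& score, int label,
                            double alpha) {
  double z = 0.0;
  for (double s : score) z += s;
  const double logZ = std::log(z);
  return -std::log(score[label] / z) + alpha * logZ * logZ;
}
```

The penalty vanishes when Z = 1, which is what pushes the model toward producing scores that are already normalized.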

Inherits from paddle::CostLayer

Public Functions

MultiClassCrossEntropyWithSelfNorm(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forwardImp(Matrix &output, Argument &label, Matrix &cost)¶

void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)¶

Protected Attributes

MatrixPtr sftMaxSum_¶

MatrixPtr sumInv_¶

RankingCost¶

class paddle::RankingCost¶

A cost layer for learning to rank (LTR) tasks. This layer takes at least three inputs.

\[\begin{split} C_{i,j} = -\tilde{P}_{i,j} * o_{i,j} + log(1 + e^{o_{i,j}}) \\ o_{i,j} = o_i - o_j \\ \tilde{P}_{i,j} \in \left \{0, 0.5, 1 \right \} \ or \ \left \{0, 1 \right \} \end{split}\]

[1] Chris Burges, Tal Shaked, Erin Renshaw, et al. Learning to Rank using Gradient Descent.
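The pairwise cost above can be computed for one pair as follows. The sketch rewrites log(1 + e^o) in an overflow-safe form, which is a numerical choice made for this example and not something the source states. rankingCost is a hypothetical helper.

```cpp
#include <algorithm>
#include <cmath>

// C_ij = -pTilde * o + log(1 + e^o), with o = oi - oj.
double rankingCost(double oi, double oj, double pTilde) {
  const double o = oi - oj;
  // log(1 + e^o) = max(o, 0) + log1p(e^{-|o|}), which avoids overflow for large o.
  const double softplus = std::max(o, 0.0) + std::log1p(std::exp(-std::fabs(o)));
  return -pTilde * o + softplus;
}
```

For equal outputs the cost is log 2 regardless of the target, and it falls toward 0 as the output gap grows in the direction the target prefers.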

Inherits from paddle::Layer

Public Functions

RankingCost(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

LayerPtr getOutputLayer(size_t i)¶

LayerPtr getLabelLayer()¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

void onPassEnd()¶: One pass is finished.

void forwardImp(Matrix &output, Argument &label, Matrix &cost)¶

void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)¶

SoftBinaryClassCrossEntropy¶

class paddle::SoftBinaryClassCrossEntropy¶

The cross-entropy for soft binary class.

\[ L = \sum_i (\sum_j -y_j(i)*log(x_j(i))-(1-y_j(i))*log(1-x_j(i))) \]

Inherits from paddle::CostLayer

Public Functions

SoftBinaryClassCrossEntropy(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forwardImp(Matrix &output, Argument &label, Matrix &cost)¶

void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)¶

Protected Attributes

MatrixPtr targetPerDim_¶

SumOfSquaresCostLayer¶

class paddle::SumOfSquaresCostLayer¶

This cost layer computes the Euclidean (L2) loss for real-valued regression tasks.

\[ L = \sum_{i=1}^N {|| \hat{y}_i - y_i||_2^2} \]

Inherits from paddle::CostLayer

Public Functions

SumOfSquaresCostLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forwardImp(Matrix &output, Argument &label, Matrix &cost)¶

void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)¶

SumCostLayer¶

class paddle::SumCostLayer¶

This cost layer computes the sum of its input as the loss.

\[ o(i) = \sum_{j=1}^D y_{ij} \]

Inherits from paddle::Layer

Public Functions

SumCostLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

CosSimLayer¶

class paddle::CosSimLayer¶

A layer for calculating the cosine similarity between two vectors.

\[ f(x,y)=scale\frac{x_1y_1+x_2y_2+...+x_ny_n}{\sqrt{x_1^2+x_2^2+... +x_n^2}\sqrt{y_1^2+y_2^2+...+y_n^2}} \]

Input1: A vector (batchSize * dataDim)
Input2: A vector (batchSize * dataDim) or (1 * dataDim)
Output: A vector (batchSize * 1)

The config file api is cos_sim.
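A sketch of the scaled cosine similarity for row-major inputs, returning one value per row of Input1 and reusing Input2 for every row when it holds a single (1 * dataDim) vector. cosSim is a hypothetical helper, and returning 0 for a zero-norm vector is an assumption made here to avoid dividing by zero.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// x: (batchSize x dim), y: (batchSize x dim) or (1 x dim), both row-major.
std::vector<double> cosSim(const std::vector<double>& x,
                           const std::vector<double>& y,
                           std::size_t dim, double scale) {
  const std::size_t batchSize = x.size() / dim;
  const bool broadcast = (y.size() == dim);  // a single row shared by all samples
  std::vector<double> out(batchSize, 0.0);
  for (std::size_t b = 0; b < batchSize; ++b) {
    const double* xr = x.data() + b * dim;
    const double* yr = y.data() + (broadcast ? 0 : b * dim);
    double dot = 0.0, xx = 0.0, yy = 0.0;
    for (std::size_t k = 0; k < dim; ++k) {
      dot += xr[k] * yr[k];
      xx += xr[k] * xr[k];
      yy += yr[k] * yr[k];
    }
    const double denom = std::sqrt(xx) * std::sqrt(yy);
    out[b] = denom > 0.0 ? scale * dot / denom : 0.0;  // assumption: zero vectors give 0
  }
  return out;
}
```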

Inherits from paddle::Layer

Public Functions

CosSimLayer(const LayerConfig &config)¶

~CosSimLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

CosSimVecMatLayer¶

class paddle::CosSimVecMatLayer¶

A layer for computing the cosine similarity between a vector and each row of a matrix: out[i] = cos_scale * cos(in1, in2(i,:)).

Input1: a vector (batchSize * dataDim)

Input2: a matrix in vector form (batchSize * (weightDim*dataDim))

Output: a vector (batchSize * weightDim)

Note: used in the Neural Turing Machine.
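The shapes above can be made concrete with a short sketch: for each sample, the flattened matrix is read as weightDim rows of dataDim values, and each row is compared with the sample's vector. cosSimVecMat is a hypothetical helper; the row-major layout of the flattened matrix and the zero result for zero-norm rows are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// in1: (batchSize x dataDim); in2: (batchSize x weightDim*dataDim), row-major.
// Returns (batchSize x weightDim): out[b][i] = cosScale * cos(in1[b], in2[b](i, :)).
std::vector<double> cosSimVecMat(const std::vector<double>& in1,
                                 const std::vector<double>& in2,
                                 std::size_t dataDim, std::size_t weightDim,
                                 double cosScale) {
  const std::size_t batchSize = in1.size() / dataDim;
  std::vector<double> out(batchSize * weightDim, 0.0);
  for (std::size_t b = 0; b < batchSize; ++b) {
    const double* v = in1.data() + b * dataDim;
    for (std::size_t i = 0; i < weightDim; ++i) {
      const double* row = in2.data() + (b * weightDim + i) * dataDim;
      double dot = 0.0, vv = 0.0, rr = 0.0;
      for (std::size_t k = 0; k < dataDim; ++k) {
        dot += v[k] * row[k];
        vv += v[k] * v[k];
        rr += row[k] * row[k];
      }
      const double denom = std::sqrt(vv) * std::sqrt(rr);
      out[b * weightDim + i] = denom > 0.0 ? cosScale * dot / denom : 0.0;
    }
  }
  return out;
}
```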

Inherits from paddle::Layer

Public Functions

CosSimVecMatLayer(const LayerConfig &config)¶

~CosSimVecMatLayer()¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpMtx0¶

MatrixPtr tmpMtx1¶

MatrixPtr tmpRow0¶

MatrixPtr tmpRow1¶

MatrixPtr tmpRow2¶

MatrixPtr tmpRow3¶

CRFDecodingLayer¶

class paddle::CRFDecodingLayer¶

A layer for calculating the decoding sequence of a sequential conditional random field model. The decoding sequence is stored in output_.ids. It also calculates the error: output_.value[i] is 1 for incorrect decoding or 0 for correct decoding. See LinearChainCRF.h for the details of the CRF formulation.

Inherits from paddle::CRFLayer

Public Functions

CRFDecodingLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<LinearChainCRF> crf_¶

CRFLayer¶

class paddle::CRFLayer¶

A layer for calculating the cost of a sequential conditional random field model. See class LinearChainCRF for the details of the CRF formulation.

Inherits from paddle::Layer

Subclassed by paddle::CRFDecodingLayer

Public Functions

CRFLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

size_t numClasses_¶

ParameterPtr parameter_¶

std::vector<LinearChainCRF> crfs_¶

LayerPtr weightLayer_¶

real coeff_¶

CTCLayer¶

class paddle::CTCLayer¶

Inherits from paddle::Layer

Public Functions

CTCLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void forwardImp(const Argument &softmaxSeqs, const Argument &labelSeqs)¶

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

void backwardImp(const UpdateCallback &callback, const Argument &softmaxSeqs, const Argument &labelSeqs)¶

Protected Attributes

size_t numClasses_¶

bool normByTimes_¶

std::vector<LinearChainCTC> ctcs_¶

std::vector<Argument> tmpCpuInput_¶

HierarchicalSigmoidLayer¶

class paddle::HierarchicalSigmoidLayer¶

Organize the classes into a binary tree. At each node, a sigmoid function is used to calculate the probability of belonging to the right branch. This idea is from “F. Morin, Y. Bengio (AISTATS 05): Hierarchical Probabilistic Neural Network Language Model.”

Here we use a simple way of making the binary tree. Assuming the number of classes C = 6, the classes are organized as a binary tree in the following way:

*-*-*- 2
| | |- 3
| |
| |-*- 4
|   |- 5
|
|-*- 0
  |- 1

where * indicates an internal node, and each leaf node represents a class.

Node 0 ... C-2 are internal nodes.
Node C-1 ... 2C-2 are leaf nodes.
Class c is represented by leaf node \(c+C-1\).

We assign an id for each node:

the id of the root is 0.
the left child of a node i is 2*i+1.
the right child of a node i is 2*i+2.

It’s easy to see that:

the parent of node i is \(\left\lfloor(i-1)/2\right\rfloor\).
the j-th level ancestor of node i is \(\left\lfloor(i+1)/2^{j+1}\right\rfloor - 1\), counting the parent as level j = 0.
A node i is a left child of its parent if \((i-1)\%2==0\).
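The node arithmetic above translates directly into integer operations. The sketch below uses hypothetical helper names and also includes the code length formula given for the codeLength_ attribute of this class. The ancestor formula is applied with the parent counted as level j = 0, which is what makes it agree with the parent formula.

```cpp
// Leaf node that represents class c when there are numClasses classes.
int leafNodeOfClass(int c, int numClasses) { return c + numClasses - 1; }

// Parent of node i (i >= 1): floor((i - 1) / 2).
int parentOf(int i) { return (i - 1) / 2; }

// j-th level ancestor of node i, with the parent at j = 0:
// floor((i + 1) / 2^(j + 1)) - 1.
int ancestorOf(int i, int j) { return ((i + 1) >> (j + 1)) - 1; }

// True when node i (i >= 1) is the left child of its parent.
bool isLeftChild(int i) { return (i - 1) % 2 == 0; }

// 1 + floor(log2(numClasses - 1)), for numClasses >= 2.
int codeLength(int numClasses) {
  int len = 1;
  for (int v = numClasses - 1; v > 1; v >>= 1) ++len;
  return len;
}
```

For C = 6, class 2 maps to leaf node 7, whose ancestors are nodes 3, 1 and 0. That path has three internal nodes, matching a code length of 3.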

The config file api is hsigmod_layer.

Inherits from paddle::Layer

Public Functions

HierarchicalSigmoidLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

Protected Functions

LayerPtr getLabelLayer()¶: The last of inputs is label layer.

Protected Attributes

WeightList weights_¶

std::unique_ptr<Weight> biases_¶

size_t numClasses_¶: number of classes

int codeLength_¶: codeLength_ = \(1 + \left\lfloor log_{2}(numClasses-1)\right\rfloor\)

Argument preOutput_¶: temporary result of output_

LinearChainCRF¶

class paddle::LinearChainCRF¶

Public Functions

LinearChainCRF(int numClasses, real *para, real *grad)¶

The size of para and grad must be \((numClasses + 2) * numClasses\). The first numClasses values of para are for starting weights ( \(a\)). The next numClasses values of para are for ending weights ( \(b\)). The remaining values are for transition weights ( \(w\)).

The probability of a state sequence s of length \(L\) is defined as: \(P(s) = (1/Z) exp(a_{s_1} + b_{s_L} + \sum_{l=1}^L x_{s_l} + \sum_{l=2}^L w_{s_{l-1},s_l})\) where \(Z\) is a normalization value so that the sum of \(P(s)\) over all possible sequences is \(1\), and \(x\) is the input feature to the CRF.

real forward(real *x, int *s, int length)¶: Calculate the negative log likelihood of s given x. The size of x must be length * numClasses. Each consecutive numClasses values are the features for one time step.
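To make the definition of P(s) concrete, the sketch below computes the negative log likelihood with the standard forward recursion in log space. It follows the documented parameter layout (a, then b, then w) and the layout of x. The row-major indexing of w as w[i * numClasses + j] for a move from state i to state j is an assumption, and crfNegLogLikelihood is a hypothetical free function, not the class method.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable log(sum_k exp(v[k])).
double logSumExp(const std::vector<double>& v) {
  const double m = *std::max_element(v.begin(), v.end());
  double sum = 0.0;
  for (double e : v) sum += std::exp(e - m);
  return m + std::log(sum);
}

// Negative log likelihood of the state sequence s (length >= 1) given x.
// para layout: a (C values), b (C values), then w (C * C values).
// Assumption: w[i * C + j] scores a transition from state i to state j.
double crfNegLogLikelihood(const std::vector<double>& para, int numClasses,
                           const std::vector<double>& x,
                           const std::vector<int>& s) {
  const int C = numClasses;
  const int L = static_cast<int>(s.size());
  const double* a = para.data();
  const double* b = a + C;
  const double* w = b + C;

  // Unnormalized log score of the given sequence.
  double score = a[s[0]] + b[s[L - 1]];
  for (int l = 0; l < L; ++l) score += x[l * C + s[l]];
  for (int l = 1; l < L; ++l) score += w[s[l - 1] * C + s[l]];

  // log Z via the forward recursion.
  std::vector<double> alpha(C), next(C), tmp(C);
  for (int k = 0; k < C; ++k) alpha[k] = a[k] + x[k];
  for (int l = 1; l < L; ++l) {
    for (int j = 0; j < C; ++j) {
      for (int i = 0; i < C; ++i) tmp[i] = alpha[i] + w[i * C + j];
      next[j] = logSumExp(tmp) + x[l * C + j];
    }
    alpha.swap(next);
  }
  for (int k = 0; k < C; ++k) alpha[k] += b[k];
  return logSumExp(alpha) - score;
}
```

With all weights and features set to zero, every one of the numClasses^length sequences is equally likely, so the result is length * log(numClasses).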

void backward(real *x, real *dx, int *s, int length)¶

Calculate the gradient with respect to x, a, b, and w. The gradient of x will be stored in dx. backward() can only be called after a corresponding call to forward() with the same x, s and length.

Note: The gradient is added to dx and grad (provided at constructor).

void decode(real *x, int *s, int length)¶: Find the most probable sequence given x. The result will be stored in s.
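Finding the most probable sequence is usually done with the Viterbi algorithm. The sketch below assumes that approach together with the documented parameter layout and a row-major transition matrix. crfDecode is a hypothetical free function that returns the sequence instead of writing into s.

```cpp
#include <cstddef>
#include <vector>

// Most probable state sequence for features x (length * C values, length >= 1).
// para layout: a (C values), b (C values), then w (C * C values).
// Assumption: w[i * C + j] scores a transition from state i to state j.
std::vector<int> crfDecode(const std::vector<double>& para, int numClasses,
                           const std::vector<double>& x, int length) {
  const int C = numClasses;
  const double* a = para.data();
  const double* b = a + C;
  const double* w = b + C;

  std::vector<double> best(C), next(C);
  std::vector<int> track(static_cast<std::size_t>(length) * C, 0);
  for (int k = 0; k < C; ++k) best[k] = a[k] + x[k];

  // best[j] holds the top score of any prefix ending in state j.
  for (int l = 1; l < length; ++l) {
    for (int j = 0; j < C; ++j) {
      int arg = 0;
      double m = best[0] + w[j];
      for (int i = 1; i < C; ++i) {
        const double v = best[i] + w[i * C + j];
        if (v > m) { m = v; arg = i; }
      }
      next[j] = m + x[l * C + j];
      track[l * C + j] = arg;  // remember the best predecessor
    }
    best.swap(next);
  }

  // Add the ending weights, pick the best final state, then backtrack.
  int cur = 0;
  for (int k = 1; k < C; ++k) {
    if (best[k] + b[k] > best[cur] + b[cur]) cur = k;
  }
  std::vector<int> s(length);
  s[length - 1] = cur;
  for (int l = length - 1; l > 0; --l) {
    cur = track[l * C + cur];
    s[l - 1] = cur;
  }
  return s;
}
```

The normalizer Z is the same for every sequence, so decoding only needs the unnormalized scores.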

Protected Attributes

int numClasses_¶

MatrixPtr a_¶

MatrixPtr b_¶

MatrixPtr w_¶

MatrixPtr da_¶

MatrixPtr db_¶

MatrixPtr dw_¶

MatrixPtr ones_¶

MatrixPtr expX_¶

MatrixPtr alpha_¶

MatrixPtr beta_¶

MatrixPtr maxX_¶

MatrixPtr expW_¶

IVectorPtr track_¶

LinearChainCTC¶

class paddle::LinearChainCTC¶

Public Functions

LinearChainCTC(int numClasses, bool normByTimes)¶

real forward(real *softmaxSeq, int softmaxSeqLen, int *labelSeq, int labelSeqLen)¶

void backward(real *softmaxSeq, real *softmaxSeqGrad, int *labelSeq, int labelSeqLen)¶

Protected Functions

void segmentRange(int &start, int &end, int time)¶

Protected Attributes

int numClasses_¶

int blank_¶

int totalSegments_¶

int totalTime_¶

bool normByTimes_¶

bool isInvalid_¶

MatrixPtr logActs_¶

MatrixPtr forwardVars_¶

MatrixPtr backwardVars_¶

MatrixPtr gradTerms_¶

real logProb_¶

NCELayer¶

class paddle::NCELayer¶

Noise-contrastive estimation. Implements the method in the following paper: A fast and simple algorithm for training neural probabilistic language models.

The config file api is nce_layer.

Inherits from paddle::Layer

Public Functions

NCELayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void prepareSamples()¶

void prefetch()¶: If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.

void forwardBias()¶

void backwardBias(const UpdateCallback &callback)¶

void forwardOneInput(int layerId)¶

void backwardOneInput(int layerId, const UpdateCallback &callback)¶

void forwardCost()¶

void backwardCost()¶

Validation Layers¶

ValidationLayer¶

class paddle::ValidationLayer¶

Inherits from paddle::Layer

Subclassed by paddle::AucValidation, paddle::PnpairValidation

Public Functions

ValidationLayer(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

LayerPtr getOutputLayer()¶

LayerPtr getLabelLayer()¶

LayerPtr getInfoLayer()¶

void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

void backward(const UpdateCallback &callback = nullptr)¶: Backward propagation. Should only be called after Layer::forward() function.

virtual void validationImp(MatrixPtr outputValue, IVectorPtr label) = 0¶

virtual void onPassEnd() = 0¶: One pass is finished.

AucValidation¶

class paddle::AucValidation¶

Inherits from paddle::ValidationLayer

Public Functions

AucValidation(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void validationImp(MatrixPtr outputValue, IVectorPtr label)¶

void onPassEnd()¶: One pass is finished.

Public Members

std::vector<PredictionResult> predictArray_¶

struct PredictionResult¶

Public Functions

PredictionResult(real __out, int __label)¶

Public Members

real out¶

int label¶

PnpairValidation¶

class paddle::PnpairValidation¶

Inherits from paddle::ValidationLayer

Public Functions

PnpairValidation(const LayerConfig &config)¶

bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

void validationImp(MatrixPtr outputValue, IVectorPtr label)¶

void onPassEnd()¶: One pass is finished.

Check Layers¶

EosIdCheckLayer¶

class paddle::EosIdCheckLayer¶

A layer for checking EOS for each sample:

output_id = (input_id == conf.eos_id)

The result is stored in output_.ids. It is used by recurrent layer group.
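The check above amounts to an element-wise comparison. A minimal sketch, with eosIdCheck as a hypothetical helper and eosId standing in for conf.eos_id:

```cpp
#include <cstddef>
#include <vector>

// output_id = (input_id == eosId), stored as 1 or 0 per sample.
std::vector<int> eosIdCheck(const std::vector<int>& inputIds, int eosId) {
  std::vector<int> out(inputIds.size());
  for (std::size_t i = 0; i < inputIds.size(); ++i) {
    out[i] = (inputIds[i] == eosId) ? 1 : 0;
  }
  return out;
}
```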

Inherits from paddle::Layer

Public Functions

EosIdCheckLayer(const LayerConfig &config)¶

virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)¶: Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)¶: Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)¶: Backward propagation. Should only be called after Layer::forward() function.