Base

Layer

class paddle::Layer

Base class for all layers. Defines the variables and functions that every layer needs.

Subclassed by paddle::AddtoLayer, paddle::AgentLayer, paddle::AverageLayer, paddle::BatchNormBaseLayer, paddle::BlockExpandLayer, paddle::BootBiasLayer, paddle::ConcatenateLayer, paddle::ConcatenateLayer2, paddle::ConvBaseLayer, paddle::ConvexCombinationLayer, paddle::ConvShiftLayer, paddle::CosSimLayer, paddle::CosSimVecMatLayer, paddle::CostLayer, paddle::CRFLayer, paddle::CTCLayer, paddle::DataLayer, paddle::DataNormLayer, paddle::EosIdCheckLayer, paddle::ExpandLayer, paddle::FeatureMapExpandLayer, paddle::FullyConnectedLayer, paddle::GatedRecurrentLayer, paddle::GatherAgentLayer, paddle::GetOutputLayer, paddle::GruStepLayer, paddle::HierarchicalSigmoidLayer, paddle::InterpolationLayer, paddle::LambdaCost, paddle::LstmLayer, paddle::LstmStepLayer, paddle::MaxIdLayer, paddle::MaxLayer, paddle::MixedLayer, paddle::MultiplexLayer, paddle::NCELayer, paddle::NormLayer, paddle::OuterProdLayer, paddle::ParameterReluLayer, paddle::PoolLayer, paddle::PowerLayer, paddle::PrintLayer, paddle::RankingCost, paddle::RecurrentLayer, paddle::RecurrentLayerGroup, paddle::ResizeLayer, paddle::SamplingIdLayer, paddle::ScalingLayer, paddle::ScatterAgentLayer, paddle::SelectiveFullyConnectedLayer, paddle::SequenceConcatLayer, paddle::SequenceLastInstanceLayer, paddle::SequenceReshapeLayer, paddle::SlopeInterceptLayer, paddle::SubSequenceLayer, paddle::SumToOneNormLayer, paddle::TensorLayer, paddle::TransLayer, paddle::ValidationLayer

Public Functions

virtual void waitInputValue()

Wait until all input values are ready. Called before the Layer::forward() function.

virtual void copyOutputToOtherDevice()

Copy the layer’s output_ to other devices. If an output layer is on another device, this is called after the Layer::forward() function.

virtual void waitAndMergeOutputGrad()

Wait until all output grads are ready and merge them into output_.grad. Called before the Layer::backward() function.

virtual void markAllInputGrad()

Notify the previous layers that the output grad is ready. Called after the Layer::backward() function.
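
The four hooks above bracket forward() and backward(). The call order they describe can be sketched with a hypothetical stand-in class; MockLayer, its log member and runOneStep are illustrative names, not part of paddle, and the sketch ignores the condition under which copyOutputToOtherDevice() is actually needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a layer: every method only records its own
// name, so the call order around forward() and backward() is visible.
struct MockLayer {
  std::vector<std::string> log;
  void waitInputValue() { log.push_back("waitInputValue"); }
  void forward() { log.push_back("forward"); }
  void copyOutputToOtherDevice() { log.push_back("copyOutputToOtherDevice"); }
  void waitAndMergeOutputGrad() { log.push_back("waitAndMergeOutputGrad"); }
  void backward() { log.push_back("backward"); }
  void markAllInputGrad() { log.push_back("markAllInputGrad"); }
};

// One forward/backward step for a single layer, in the documented order.
inline void runOneStep(MockLayer& layer) {
  const std::size_t before = layer.log.size();
  layer.waitInputValue();           // before forward()
  layer.forward();
  layer.copyOutputToOtherDevice();  // after forward()
  layer.waitAndMergeOutputGrad();   // before backward()
  layer.backward();
  layer.markAllInputGrad();         // after backward()
  assert(layer.log.size() == before + 6);
}
```

Calling runOneStep on a fresh MockLayer leaves the six method names in log in exactly this order.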

Layer(const LayerConfig &config, bool useGpu = FLAGS_use_gpu)
virtual ~Layer()
bool needGradient() const

Get the flag indicating whether the layer needs to compute gradients.

void setNeedGradient(bool need)

Set the flag indicating whether the layer needs to compute gradients.

void setNeedSequenceInfo(bool need)

Set the flag indicating whether the layer needs to re-compute sequence information, which includes sequenceStartPositions or subSequenceStartPositions.

const std::string &getName() const

Get layer’s name.

const std::string &getType() const

Get layer’s type.

size_t getSize() const

Get layer’s size.

int getDeviceId() const

Get layer’s deviceId.

void addPrev(LayerPtr l)

Add the inputLayer.

const LayerPtr &getPrev(size_t i)

Get the i-th input layer.

const MatrixPtr &getOutputValue()

Get the forward-output value.

const IVectorPtr &getOutputLabel()

Get the forward-output label.

const MatrixPtr &getOutputGrad()

Get the output gradient.

void setOutput(const std::string &name, Argument *output)

If the layer has multiple outputs, store the given output in outputMap_ under its name.

Argument &getOutput(const std::string &str = "")

Get the output by name.

const Argument &getOutput(int deviceId) const

Get the output based on deviceId.

const std::vector<ParameterPtr> &getParameters()

Get layer’s parameters.

const ParameterPtr &getBiasParameter()

Get layer’s bias-parameters.

void resizeOutput(size_t height, size_t width)

Resize the output matrix size.

void reserveOutput(size_t height, size_t width)

Resize the output matrix size, and reset value to zero.

void resetOutput(size_t height, size_t width)

Resize the output matrix size, and reset value and grad to zero.

void zeroGrad()

Clear the gradient of output.

virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void initSubNetwork(NeuralNetwork *rootNetwork, const ModelConfig &config, const std::vector<ParameterType> &parameterTypes, bool useGpu)

Initialization for the sub-network, if there is one.

Parameters
  • rootNetwork -

    root network

  • config -

    model config

  • parameterTypes -

    parameter’s type

  • useGpu -

    whether to use gpu or not

virtual void accessSubNetwork(const std::function<void(NeuralNetwork&)> &callback)

Access the SubNetwork object. If a sub-network exists, invoke callback with it.

Parameters
  • callback -

    if a sub-network exists, the callback is invoked.

virtual void prefetch()

If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void resetState()

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating a sequence.

This is used for sequence generation. When generating a sequence, the calculation at the current time step depends on the state from the previous time step. The model needs to keep the information about the previous time step in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

virtual void setState(LayerStatePtr state)

Set layer state.

virtual LayerStatePtr getState()

Get layer state.

Return
A copy of internal state.

void showOutputStats()

Show output statistics.

virtual void backward(const UpdateCallback &callback = nullptr) = 0

Backward propagation. Should only be called after Layer::forward() function.

virtual void onPassEnd()

One pass is finished.

Public Static Functions

LayerPtr create(const LayerConfig &config)

Create pointer of layer.

Public Static Attributes

ClassRegistrar<Layer, LayerConfig> registrar_

Register a Layer.

Protected Functions

void markInputGrad(int inputIndex)

Notify the specified input layer that the output grad is ready. Called in the backward function. If you mark input grads in the backward function, you should ensure that all input grads are marked there.

const Argument &getInput(size_t inputIndex) const

Get the argument of input layer.

const Argument &getInput(const Layer &inputLayer) const

Get the argument of input layer.

const MatrixPtr &getInputValue(int inputIndex)

Get the forward-input value.

const MatrixPtr &getInputValue(const Layer &inputLayer)

Get the forward-input value.

const MatrixPtr &getInputGrad(int inputIndex)

Get the input grad.

const MatrixPtr &getInputGrad(const Layer &inputLayer)

Get the input grad.

const IVectorPtr &getInputLabel(const Layer &inputLayer)

Get the forward-input label.

void resetSpecifyOutput(Argument &output, size_t height, size_t width, bool isValueClean, bool isGradClean)

Change the size of output (value, grad). Reset the value to zero if isValueClean = true; reset the grad to zero if isGradClean = true.

void addOutputArgument(int deviceId)

Add output argument to other devices.

void forwardActivation()

Forward of activation function.

void backwardActivation()

Backward of activation function.

void forwardDropOut()

Forward of dropOut.

void initNeedFlags()

Initialize the needGradient_ flag.

Protected Attributes

LayerConfig config_

Layer config.

bool useGpu_

Whether to use GPU.

int deviceId_

Device Id. CPU is -1, and GPU is 0, 1, 2 ...

std::vector<LayerPtr> inputLayers_

Input layers.

std::vector<std::string> inputArgument_

Argument of input layers.

std::vector<ParameterPtr> parameters_

Parameter for each input layer. parameters_[i] is nullptr if inputLayers_[i] does not need a parameter.

ParameterPtr biasParameter_

nullptr if bias is not needed.

Argument output_

Output.

std::vector<Argument> outputOtherDevice_

Several outputs stored on different devices, used in the ‘parallel_nn’ case and recorded by deviceId_.

std::map<std::string, Argument *> outputMap_

If there are several outputs, map them by each name.

MatrixPtr tmpGrad_

Used to merge grad on different devices.

std::unique_ptr<ActivationFunction> activation_
PassType passType_

Current passType, PASS_TRAIN or PASS_TEST.

MatrixPtr dropOutMask_

Random 0-1 matrix for dropOut.

bool needGradient_

Whether the layer needs to compute gradients.

bool needSequenceInfo_

Whether the layer needs to re-compute sequence information.

std::vector<bool> markInBackward_

Whether each input grad is marked inside (true) or outside (false) the backward function.

Projection

class paddle::Projection

A projection takes one Argument as input, calculates the result and adds it to the output Argument.

Subclassed by paddle::ContextProjection, paddle::DotMulProjection, paddle::FullMatrixProjection, paddle::IdentityOffsetProjection, paddle::IdentityProjection, paddle::TableProjection, paddle::TransposedFullMatrixProjection
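
Because a projection adds its result to the output instead of overwriting it, more than one projection can contribute to the same output. The sketch below illustrates only that accumulate-into-output behavior; Vec, identityProject and scaleProject are hypothetical stand-ins and do not reproduce the paddle::Projection interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: a plain vector plays the role of an Argument value.
using Vec = std::vector<double>;

// Identity-style projection: adds the input to the output element-wise.
inline void identityProject(const Vec& in, Vec& out) {
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i) out[i] += in[i];
}

// Scaling projection: adds scale * input to the output.
inline void scaleProject(const Vec& in, double scale, Vec& out) {
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i) out[i] += scale * in[i];
}
```

Starting from a zeroed output, applying identityProject and then scaleProject with scale 2 to the input {1, 2} leaves {3, 6} in the output.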

Public Functions

Projection(const ProjectionConfig &config, ParameterPtr parameter, bool useGpu)
virtual ~Projection()
const std::string &getName() const
void forward(const Argument *in, const Argument *out, PassType passType)

Forward propagation. If backward() will be called, in and out must be kept valid until then.

Parameters
  • in -

    input of projection

  • out -

    output of projection

  • passType -

    PASS_TRAIN or PASS_TEST

virtual void prefetch(const Argument *in)
virtual void forward() = 0
virtual void backward(const UpdateCallback &callback) = 0
virtual void resetState()

See comment in Layer.h for the function with the same name.

virtual void setState(LayerStatePtr state)

Set layer state.

virtual LayerStatePtr getState()

Get layer state. A copy of internal state is returned.

size_t getOutputSize() const

Get output size of projection.

Public Static Functions

Projection *create(const ProjectionConfig &config, ParameterPtr parameter, bool useGpu)

Public Static Attributes

ClassRegistrar<Projection, ProjectionConfig, ParameterPtr, bool> registrar_

Register a projection.

Protected Attributes

ProjectionConfig config_

Config of projection.

ParameterPtr parameter_

Parameter of projection.

bool useGpu_
const Argument *in_

Store in passed to forward()

const Argument *out_

Store out passed to forward()

PassType passType_

Store passType passed to forward()

Operator

class paddle::Operator

Operator is like Projection, but takes more than one Argument as input.

Note
Operator can’t have parameters.

Subclassed by paddle::ConvOperator, paddle::DotMulOperator

Public Functions

Operator(const OperatorConfig &config, bool useGpu)
virtual ~Operator()
const OperatorConfig &getConfig() const
void forward(std::vector<const Argument *> ins, Argument *out, PassType passType)

Forward propagation. If backward() will be called, ins and out must be kept valid until then.

Parameters
  • ins -

    inputs of operator

  • out -

    output of operator

  • passType -

    PASS_TRAIN or PASS_TEST

virtual void prefetch(const Argument *in)
virtual void forward() = 0
virtual void backward() = 0
virtual void resetState()

See comment in Layer.h for the function with the same name.

virtual void setState(LayerStatePtr state)

Set layer state.

virtual LayerStatePtr getState()

Get layer state.

Public Static Functions

Operator *create(const OperatorConfig &config, bool useGpu)

Public Static Attributes

ClassRegistrar<Operator, OperatorConfig, bool> registrar_

Protected Attributes

OperatorConfig config_

Config of operator.

bool useGpu_
std::vector<const Argument *> ins_

Store ins passed to forward()

Argument *out_

Store out passed to forward()

PassType passType_

Store passType passed to forward()

Data Layer

class paddle::DataLayer

This layer just copies data to the output, and has no backward propagation.

The config file api is data_layer.

Inherits from paddle::Layer

Public Functions

DataLayer(const LayerConfig &config)
virtual void setData(const Argument &data)
virtual void prefetch()

Prefetch sparse matrix/ids only.

virtual void forward(PassType passType)

Forward propagation. Copy data_ (value, in, grad, ids, cpuSequenceDims, sequenceStartPositions, subSequenceStartPositions, strs) to output_.

virtual void backward(const UpdateCallback &callback)

Data layer’s backward propagation does nothing.

virtual void copyOutputToOtherDevice()

Copy the layer’s output_ to other devices. If an output layer is on another device, this is called after the Layer::forward() function.

Protected Attributes

Argument data_

Fully Connected Layers

FullyConnectedLayer

class paddle::FullyConnectedLayer

A layer that has full connections to all neurons in the previous layer. It computes an inner product with a set of learned weights, and (optionally) adds biases.

The config file api is fc_layer.

Inherits from paddle::Layer

Public Functions

FullyConnectedLayer(const LayerConfig &config)
~FullyConnectedLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

Weight &getWeight(int idx)
virtual void prefetch()

If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

WeightList weights_
std::unique_ptr<Weight> biases_

SelectiveFullyConnectedLayer

class paddle::SelectiveFullyConnectedLayer

The SelectiveFullyConnectedLayer class.

SelectiveFullyConnectedLayer differs from FullyConnectedLayer in that it requires an additional input indicating several selected columns, and only computes the multiplications between the input matrices and the selected columns of this layer’s parameter matrices. If the selected columns are not specified, SelectiveFullyConnectedLayer acts exactly like FullyConnectedLayer.

The config file api is selective_fc_layer.
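
The column selection described above can be sketched for a single input row as follows. This is a minimal illustration under stated assumptions: selectiveFc and the Matrix alias are hypothetical names, biases and activation are left out, and the result is returned as a compact vector of only the selected activations, which says nothing about how the real layer stores its output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // [row][col]

// Computes x * W for one input row x, restricted to the selected columns.
// An empty selection means "all columns", i.e. the fully connected case.
inline std::vector<double> selectiveFc(
    const std::vector<double>& x, const Matrix& w,
    const std::vector<std::size_t>& selected) {
  assert(!w.empty() && x.size() == w.size());
  const std::size_t numCols = w[0].size();
  std::vector<std::size_t> cols = selected;
  if (cols.empty()) {
    for (std::size_t c = 0; c < numCols; ++c) cols.push_back(c);
  }
  std::vector<double> out;
  out.reserve(cols.size());
  for (std::size_t c : cols) {
    assert(c < numCols);
    double sum = 0.0;
    for (std::size_t r = 0; r < x.size(); ++r) sum += x[r] * w[r][c];
    out.push_back(sum);  // inner product with one selected column
  }
  return out;
}
```

With x = {1, 2} and W = {{1, 2, 3}, {4, 5, 6}}, an empty selection yields all three products, while the selection {2, 0} computes only columns 2 and 0.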

Inherits from paddle::Layer

Public Functions

SelectiveFullyConnectedLayer(const LayerConfig &config)
~SelectiveFullyConnectedLayer()
virtual void prefetch()

If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

Weight &getWeight(int idx)
void reserveOutput(size_t height, size_t width, size_t nnz)

Resize the output matrix size. And reset value to zero.

void fillSelectiveData(const std::shared_ptr<std::vector<std::pair<int *, size_t>>> &candidates)

Fill candidates to select several activations as output.

Note
CURRENTLY, THIS METHOD IS ONLY USED FOR BEAM SEARCH
Parameters
  • candidates -

    specifies several selected columns of the parameter matrices of this layer. Multiplications only between the input matrices and the selected columns are computed. If the candidates is a nullptr, selective fc layer acts exactly like the fully connected layer.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

WeightList weights_
std::unique_ptr<Weight> biases_

Conv Layers

ConvBaseLayer

class paddle::ConvBaseLayer

A Base Convolution Layer, which convolves the input image with learned filters and (optionally) adds biases.

Inherits from paddle::Layer

Subclassed by paddle::CudnnConvLayer, paddle::ExpandConvLayer

Public Functions

ConvBaseLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

Weight &getWeight(int idx)
int outputSize(int imageSize, int filterSize, int padding, int stride)

Calculate output size based on caffeMode_.

  • input(+padding): 0123456789
  • imageSize(+padding) = 10;
  • filterSize = 3;
  • stride = 2;
  • caffeMode_ is true:
    • output: (012), (234), (456), (678)
    • outputSize = 4;
  • caffeMode_ is false:
    • output: (012), (234), (456), (678), (9)
    • outputSize = 5;
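
The two cases above amount to rounding down (caffeMode_ true) or rounding up (caffeMode_ false) when the last window does not fit completely. A minimal sketch of a formula consistent with the worked example follows; the free function convOutputSize and its explicit caffeMode argument are hypothetical, and treating padding as applied on both sides is an assumption of this sketch.

```cpp
#include <cassert>

// Output size along one dimension. imageSize excludes padding; padding is
// assumed to be applied on both sides.
inline int convOutputSize(int imageSize, int filterSize, int padding,
                          int stride, bool caffeMode) {
  assert(stride > 0);
  const int span = imageSize - filterSize + 2 * padding;
  assert(span >= 0);
  if (caffeMode) {
    return span / stride + 1;  // floor: drop a partial trailing window
  }
  return (span + stride - 1) / stride + 1;  // ceil: keep a partial window
}
```

For the example above (padded size 10, filterSize 3, stride 2) this gives 4 with caffeMode true and 5 with caffeMode false.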

Protected Types

typedef std::vector<int> IntV

Protected Attributes

int numFilters_

The number of filters.

IntV padding_

The x dimension of the padding.

IntV paddingY_

The y dimension of the padding.

IntV stride_

The x dimension of the stride.

IntV strideY_

The y dimension of the stride.

IntV filterSize_

The x dimension of a filter kernel.

IntV filterSizeY_

The y dimension of a filter kernel.

IntV channels_

The number of channels of the convolution input.

IntV imgSize_

The spatial dimensions of input feature map.

IntV imgPixels_

The total pixel size of input feature map. imgPixels_ = imgSizeX_ * imgSizeY_.

IntV filterPixels_

filterPixels_ = filterSizeX_ * filterSizeY_.

IntV filterChannels_

filterChannels_ = channels_/groups_.

IntV outputX_

The spatial dimensions of output feature map.

IntV outputs_

The spatial dimensions of output feature map.

IntV groups_

Group size, refer to grouped convolution in Alex Krizhevsky’s paper: when group=2, the first half of the filters are only connected to the first half of the input channels, and the second half only connected to the second half.

bool sharedBiases_

Whether the bias is shared for feature in each channel.

WeightList weights_

shape of weight: (numChannels * filterPixels_, numFilters)

std::unique_ptr<Weight> biases_

If sharedBiases_ is true, the shape of the bias is (numFilters_, 1). If sharedBiases_ is false, the shape of the bias is (numFilters_ * outputX * outputY, 1).

bool caffeMode_

True by default. The only difference is the calculation of output size.

ConvOperator

class paddle::ConvOperator

ConvOperator takes two inputs to perform the convolution. The first input is the image, and the second input is the convolution kernel. The data heights of the two inputs are the same. Each data item of the first input is convolved with each data item of the second input independently.

The config file api is conv_operator.

Inherits from paddle::Operator

Public Functions

ConvOperator(const OperatorConfig &config, bool useGpu)
virtual ~ConvOperator()

Free workspace in device and destroy cudnn tensor descriptor.

virtual void forward()
virtual void backward()

ConvShiftLayer

class paddle::ConvShiftLayer

A layer for the circular convolution of two vectors, which is used in the Neural Turing Machine.

  • Input: two vectors, the first is data (batchSize x dataDim), the second is shift weights (batchSize x shiftDim)

  • Output: a vector (batchSize x dataDim)

Assume that:

  • a[in]: contains M elements.

  • b[in]: contains N elements (N should be odd).

  • c[out]: contains M elements.

    \[ c[i] = \sum_{j=-(N-1)/2}^{(N-1)/2}a_{i+j} * b_{j} \]

In this formula:

  • a’s index is computed modulo M.
  • b’s index is computed modulo N.
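
The formula and the two index rules can be sketched for a single pair of vectors as below. This follows the formula as written, wrapping both indices with a modulo; circularConv is a hypothetical name, and the memory layout of b in the library’s actual kernel may differ from this literal reading.

```cpp
#include <cassert>
#include <vector>

// c[i] = sum over j in [-(N-1)/2, (N-1)/2] of a[(i+j) mod M] * b[j mod N]
inline std::vector<double> circularConv(const std::vector<double>& a,
                                        const std::vector<double>& b) {
  const int m = static_cast<int>(a.size());
  const int n = static_cast<int>(b.size());
  assert(m > 0 && n % 2 == 1);  // N should be odd
  const int half = (n - 1) / 2;
  std::vector<double> c(a.size(), 0.0);
  for (int i = 0; i < m; ++i) {
    for (int j = -half; j <= half; ++j) {
      const int ai = ((i + j) % m + m) % m;  // wrap a's index into [0, M)
      const int bj = (j % n + n) % n;        // wrap b's index into [0, N)
      c[i] += a[ai] * b[bj];
    }
  }
  return c;
}
```

Under this reading, b = {1, 0, 0} returns a unchanged, while b = {0, 1, 0} rotates a by one position, so that c[i] = a[i+1].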

The config file api is conv_shift_layer.

Inherits from paddle::Layer

Public Functions

ConvShiftLayer(const LayerConfig &config)
~ConvShiftLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

CudnnConvLayer

class paddle::CudnnConvLayer

A subclass of ConvBaseLayer implemented with cuDNN. It only supports GPU mode. If you set the type to “conv”, CudnnConvLayer is selected automatically for GPU mode and ExpandConvLayer for CPU mode. Users can also specify the type “exconv” or “cudnn_conv” to choose a particular implementation.

The config file api is img_conv_layer.

Inherits from paddle::ConvBaseLayer

Public Functions

CudnnConvLayer(const LayerConfig &config)
~CudnnConvLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. Initialize member variables and create tensor descriptors.

void reshape(int batchSize)

Reshape is done on each forward pass. Reshapes the tensor descriptors inputDesc_, outputDesc_ and convDesc_, and searches for the faster algorithm, or the fastest algorithm within a given memory limit.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

void addBiases()
void bpropBiases()

Protected Attributes

int imageH_
int imageW_
int outputH_
int outputW_
hl_tensor_descriptor biasDesc_

Cudnn tensor descriptor for bias.

std::vector<hl_tensor_descriptor> inputDesc_

Cudnn tensor descriptor for input.

std::vector<hl_tensor_descriptor> outputDesc_

Cudnn tensor descriptor for output.

std::vector<hl_filter_descriptor> filterDesc_

Cudnn tensor descriptor for filter.

std::vector<hl_convolution_descriptor> convDesc_

Cudnn tensor descriptor for a convolution operation.

IntV inputOffset_

One sample offset of input data.

IntV outputOffset_

One sample offset of output data.

IntV weightOffset_

One group offset of weight.

int biasOffset_

One group offset of bias.

std::vector<int> fwdAlgo_

Save the algorithm for forward convolution, which is obtained by cudnn api to search the best suited algorithm.

std::vector<int> bwdFilterAlgo_

Save the algorithm for computing convolution gradient with respect to filter coefficients.

std::vector<int> bwdDataAlgo_

Save the algorithm for computing convolution gradient with respect to the output.

std::vector<size_t> fwdLimitBytes_

Amount of GPU memory needed as workspace to be able to execute a forward convolution with the specified algo.

std::vector<size_t> bwdFilterLimitBytes_

Amount of GPU memory needed as workspace to be able to execute a backwardFilter with the specified algo.

std::vector<size_t> bwdDataLimitBytes_

Amount of GPU memory needed as workspace to be able to execute a backwardData with the specified algo.

std::vector<void *> workSpace_

Device work space address for each group.

int maxGroups_

Max number of groups.

void *workSpaceData_

Total work space address in device for all groups.

size_t workSpaceInBytes_

Size of total work space.

bool isSelectAlgo_

Whether to select the convolution algorithm.

int batchNum_

batchNum is used to record batch size. If the batch size is changed, the selection algorithm will be called.

ExpandConvLayer

class paddle::ExpandConvLayer

A subclass of convolution layer. This layer expands the input and uses matrix multiplication to calculate the convolution operation.

The config file api is img_conv_layer.
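
The expand-then-multiply idea can be sketched for one single-channel image without padding. This is an illustration of the general technique, not the layer’s code: expandImage, convByMatmul and the Matrix alias are hypothetical names, groups and multiple channels are omitted, and the filter is applied without flipping.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // [row][col]

// Expands one single-channel image (no padding) so that each column holds
// the pixels under one filter position. Shape: (fh * fw, outH * outW).
inline Matrix expandImage(const Matrix& image, int fh, int fw, int stride) {
  assert(!image.empty() && fh > 0 && fw > 0 && stride > 0);
  const int h = static_cast<int>(image.size());
  const int w = static_cast<int>(image[0].size());
  assert(h >= fh && w >= fw);
  const int outH = (h - fh) / stride + 1;
  const int outW = (w - fw) / stride + 1;
  Matrix expanded(fh * fw, std::vector<double>(outH * outW, 0.0));
  for (int oy = 0; oy < outH; ++oy)
    for (int ox = 0; ox < outW; ++ox)
      for (int ky = 0; ky < fh; ++ky)
        for (int kx = 0; kx < fw; ++kx)
          expanded[ky * fw + kx][oy * outW + ox] =
              image[oy * stride + ky][ox * stride + kx];
  return expanded;
}

// One flattened filter (length fh * fw) times the expanded input gives one
// flattened output feature map (length outH * outW).
inline std::vector<double> convByMatmul(const std::vector<double>& filter,
                                        const Matrix& expanded) {
  assert(filter.size() == expanded.size());
  std::vector<double> out(expanded.empty() ? 0 : expanded[0].size(), 0.0);
  for (std::size_t r = 0; r < expanded.size(); ++r)
    for (std::size_t c = 0; c < out.size(); ++c)
      out[c] += filter[r] * expanded[r][c];
  return out;
}
```

For a 3 x 3 image and a 2 x 2 filter with stride 1, the expanded matrix has 4 rows and 4 columns, and multiplying it by an all-ones filter sums each 2 x 2 window.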

Inherits from paddle::ConvBaseLayer

Public Functions

ExpandConvLayer(const LayerConfig &config)
~ExpandConvLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

size_t getSize()
void resetExpandInput(size_t height, size_t width)

Create or resize expandInput_.

void resetConvOutput(size_t batchSize, int inIdx)

Create or resize transOutValue_.

void expandOneFrame(MatrixPtr image, size_t startIdx, int inIdx)

Expand one input sample.

void expandFwdOnce(MatrixPtr image, int inIdx, int startIdx)

Expand one input sample and perform matrix multiplication.

void addSharedBias()

Add shared bias.

void addUnsharedBias()

Add unshared bias.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

void bpropSharedBias(MatrixPtr biases, MatrixPtr v)
void bpropBiases(MatrixPtr v)
virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

void bpropWeights(MatrixPtr v, int inpIdx)
void bpropActs(MatrixPtr v, int inpIdx)

Protected Attributes

IntV subM_

For expand convolution. subM_ = numFilters_ / groups_.

IntV subN_

subN_ = outputH_ * outputW_.

IntV subK_

subK_ = channels_ * filterPixels_ * groups_.

IntV imgSizeH_

The spatial dimensions of height of input feature map.

IntV imgSizeW_

The spatial dimensions of width of input feature map.

IntV outputH_

The spatial dimensions of height of output feature map.

IntV outputW_

The spatial dimensions of width of output feature map.

MatrixPtr expandInput_

Expand one sample at a time. shape: (numChannels * filterPixels_, outputSizeH * outputSizeW)

MatrixPtr transOutValue_

The transpose of output, which is an auxiliary matrix.

ContextProjection

class paddle::ContextProjection

Context projection concatenates features in adjacent time steps of a sequence. The i-th row of the output is the concatenation of context_length rows of the input. The context_length rows are the consecutive rows starting from row i+shift_start.

For example, assume the input (x) has 4 words and the dimension of each word representation is 2. If we pad with zeros instead of learned weights, and the context_length is 3, the output (y) is:

x = [a1, a2;
     b1, b2;
     c1, c2;
     d1, d2]
y = [0,  0,  a1, a2, b1, b2;
     a1, a2, b1, b2, c1, c2;
     b1, b2, c1, c2, d1, d2;
     c1, c2, d1, d2, 0,  0]
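
The zero-padded case in this example can be sketched as follows. The example output matches a context start of -1 (row 0 begins with one padded step), which is inferred from the matrices rather than stated; contextProject and the Matrix alias are hypothetical names, and learned padding is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // [row][col]

// Output row i is the concatenation of input rows i+contextStart, ...,
// i+contextStart+contextLength-1; rows outside the sequence become zeros.
inline Matrix contextProject(const Matrix& x, int contextStart,
                             int contextLength) {
  assert(!x.empty() && contextLength > 0);
  const int rows = static_cast<int>(x.size());
  const std::size_t dim = x[0].size();
  Matrix y(x.size(), std::vector<double>(dim * contextLength, 0.0));
  for (int i = 0; i < rows; ++i) {
    for (int k = 0; k < contextLength; ++k) {
      const int src = i + contextStart + k;
      if (src < 0 || src >= rows) continue;  // keep the zero padding
      for (std::size_t d = 0; d < dim; ++d) {
        y[i][k * dim + d] = x[src][d];
      }
    }
  }
  return y;
}
```

Calling contextProject(x, -1, 3) on a 4 x 2 input reproduces the 4 x 6 layout of y shown above.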

The config file api is context_projection.

Inherits from paddle::Projection

Public Functions

ContextProjection(const ProjectionConfig &config, ParameterPtr parameter, bool useGpu)

Constructor. If context_start is zero and context_length is one, it sets trainable_padding to false. trainable_padding is an optional argument; if it is set, the constructor sets up a learned weight, which is used to pad the output.

virtual void forward()
virtual void backward(const UpdateCallback &callback)
virtual void resetState()

See comment in Layer.h for the function with the same name.

virtual void setState(LayerStatePtr state)

Set layer state.

virtual LayerStatePtr getState()

Get layer state. A copy of internal state is returned.

Protected Attributes

std::unique_ptr<Weight> weight_
size_t beginPad_

number of extra timesteps added at the beginning

size_t endPad_

number of extra timesteps added at the end

MatrixPtr state_

state_ and state2_ are used in sequence generation to save previous inputs.

MatrixPtr state2_

Pooling Layers

PoolLayer

class paddle::PoolLayer

Basic parent layer of pooling. Pools the input within regions.

Inherits from paddle::Layer

Subclassed by paddle::CudnnPoolLayer, paddle::PoolProjectionLayer

Public Functions

PoolLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

int outputSize(int imageSize, int windowSize, int padding, int stride)

Calculate the output size according to the window size and padding size.

Public Static Functions

Layer *create(const LayerConfig &config)

Create a pooling layer by pool_type.

Protected Attributes

size_t channels_
size_t sizeX_
size_t stride_
size_t outputX_
size_t imgSize_
int confPadding_
size_t sizeY_
size_t imgSizeY_
size_t strideY_
size_t outputY_
int confPaddingY_
std::string poolType_

PoolProjectionLayer

class paddle::PoolProjectionLayer

Basic parent layer of different kinds of pooling.

Inherits from paddle::PoolLayer

Subclassed by paddle::AvgPoolProjectionLayer, paddle::MaxPoolProjectionLayer

Public Functions

size_t getSize()
PoolProjectionLayer(const LayerConfig &config)

Protected Attributes

size_t imgSizeH_
size_t imgSizeW_
size_t outputH_
size_t outputW_

CudnnPoolLayer

class paddle::CudnnPoolLayer

CudnnPoolLayer is a subclass of PoolLayer, which is implemented with the cuDNN API and only supports GPU.

The config file api is img_pool_layer.

Inherits from paddle::PoolLayer

Public Functions

CudnnPoolLayer(const LayerConfig &config)
~CudnnPoolLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

void reshape(int batchSize)

Reshape the input and output tensor descriptors. The batch size may change during training in the last batch of each pass, so reshaping is needed.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Public Static Functions

bool typeCheck(const std::string &poolType, hl_pooling_mode_t *mode = nullptr)

Protected Attributes

int windowHeight
int windowWidth
int heightPadding
int widthPadding
int strideHeight
int strideWidth
int imageH_
int imageW_
int outputH_
int outputW_
hl_pooling_mode_t mode_

mode_ is the pooling type, including “cudnn-max-pool”, “cudnn-avg-pool” and “cudnn-avg-excl-pad-pool”.

hl_tensor_descriptor inputDesc_

cudnn tensor descriptor for input.

hl_tensor_descriptor outputDesc_

cudnn tensor descriptor for output.

hl_pooling_descriptor poolingDesc_

A description of a pooling operation.

Norm Layers

NormLayer

class paddle::NormLayer

Basic parent layer of normalization.

Note
Normalize the input in local region

Inherits from paddle::Layer

Subclassed by paddle::ResponseNormLayer

Public Functions

NormLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

Public Static Functions

Layer *create(const LayerConfig &config)

Create a norm layer by norm_type.

CMRProjectionNormLayer

class paddle::CMRProjectionNormLayer

Response normalization across feature maps, namely normalization over size_ channels.

Inherits from paddle::ResponseNormLayer

Public Functions

CMRProjectionNormLayer(const LayerConfig &config)
~CMRProjectionNormLayer()
size_t getSize()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

DataNormLayer

class paddle::DataNormLayer

A layer for data normalization.

  • Input: One and only one input layer is accepted. The input layer must be DataLayer with dense data type.
  • Output: The normalization of the input data

Reference: LA Shalabi, Z Shaaban, B Kasasbeh. Data mining: A preprocessing engine

Three data normalization methods are considered:

  • z-score: y = (x-mean)/std
  • min-max: y = (x-min)/(max-min)
  • decimal-scaling: y = x/10^j, where j is the smallest integer such that max(|y|)<1
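
The three formulas can be written out as small helper functions. This is a sketch with hypothetical names (zScore, minMax, decimalExponent, decimalScaling); the statistics are passed in as arguments, and restricting j to non-negative values in decimalExponent is an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>

// z-score: y = (x - mean) / std
inline double zScore(double x, double mean, double stddev) {
  assert(stddev > 0.0);
  return (x - mean) / stddev;
}

// min-max: y = (x - min) / (max - min)
inline double minMax(double x, double minV, double maxV) {
  assert(maxV > minV);
  return (x - minV) / (maxV - minV);
}

// Smallest non-negative integer j such that maxAbs / 10^j < 1.
inline int decimalExponent(double maxAbs) {
  assert(maxAbs >= 0.0);
  int j = 0;
  double scale = 1.0;
  while (maxAbs / scale >= 1.0) {
    scale *= 10.0;
    ++j;
  }
  return j;
}

// decimal-scaling: y = x / 10^j
inline double decimalScaling(double x, int j) {
  return x / std::pow(10.0, j);
}
```

For instance, a column whose largest absolute value is 999 gets j = 3, while a largest absolute value of exactly 1000 needs j = 4 because the scaled maximum must be strictly below 1.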

Inherits from paddle::Layer

Public Types

enum NormalizationStrategy

Values:

kZScore = 0
kMinMax = 1
kDecimalScaling = 2

Public Functions

DataNormLayer(const LayerConfig &config)
~DataNormLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

int mode_
std::unique_ptr<Weight> weight_
MatrixPtr min_
MatrixPtr rangeReciprocal_
MatrixPtr mean_
MatrixPtr stdReciprocal_
MatrixPtr decimalReciprocal_

ResponseNormLayer

class paddle::ResponseNormLayer

Response normalization within feature maps, namely normalization within each independent channel. The original implementation was deleted during code refactoring and needs to be re-implemented in the future.

Inherits from paddle::NormLayer

Subclassed by paddle::CMRProjectionNormLayer

Public Functions

ResponseNormLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

size_t channels_
size_t size_
size_t outputX_
size_t imgSize_
float scale_
float pow_
MatrixPtr denoms_

BatchNormBaseLayer

class paddle::BatchNormBaseLayer

Batch normalization layer, which normalizes the input across the batch.

By default, global mean and variance statistics are calculated via a running average during the training period. The pre-calculated global mean and variance are then used for testing.
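
A minimal sketch of this train/test behaviour for one feature is given below. The names `MovingStats` and `batchNorm`, the default values of `fraction` and `eps`, and the exact exponential update rule are assumptions made for illustration; they are not taken from the layer's source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Running statistics for one feature (hypothetical stand-in for the
// moving mean and moving variance parameters).
struct MovingStats {
  double mean = 0.0;
  double var = 1.0;
};

// Normalizes the values of one feature across a batch.
// training == true : use batch statistics and update the running averages.
// training == false: use the stored running averages only.
std::vector<double> batchNorm(const std::vector<double>& x, MovingStats& stats,
                              bool training, double gamma, double beta,
                              double fraction = 0.9, double eps = 1e-5) {
  assert(!x.empty());
  double mean = stats.mean;
  double var = stats.var;
  if (training) {
    const double n = static_cast<double>(x.size());
    mean = 0.0;
    for (double v : x) mean += v;
    mean /= n;
    var = 0.0;
    for (double v : x) var += (v - mean) * (v - mean);
    var /= n;
    // Exponential running average; this update rule is an assumption.
    stats.mean = fraction * stats.mean + (1.0 - fraction) * mean;
    stats.var = fraction * stats.var + (1.0 - fraction) * var;
  }
  const double invStd = 1.0 / std::sqrt(var + eps);
  std::vector<double> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = gamma * (x[i] - mean) * invStd + beta;
  }
  return y;
}
```

Calling `batchNorm` with `training = false` leaves `stats` untouched, which corresponds to using the pre-calculated global statistics at test time.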

The moving mean and variance are stored in a Parameter object at construction, and the calculation updates them. On GPU, only the global mean and variance of one thread on the first node are currently saved. The CPU calculation differs because parameters are shared by multiple threads, so ShareCpuMatrix with a lock is used for the calculation. In multi-machine CPU training, the global mean and variance are still saved on the first node.

[1] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.” arXiv preprint arXiv:1502.03167 (2015).

Inherits from paddle::Layer

Subclassed by paddle::BatchNormalizationLayer, paddle::CudnnBatchNormLayer

Public Functions

BatchNormBaseLayer(const LayerConfig &config)
~BatchNormBaseLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

void calFeatureMapSize()

Calculate feature map size. Some input uses frameHeight and frameWidth to store feature size.

Public Static Functions

static Layer *create(const LayerConfig &config)

Create a BatchNorm layer according to norm_type, which is either batch_norm or cudnn_batch_norm. If norm_type is not set, cudnn_batch_norm is selected automatically for GPU and batch_norm for CPU.

Protected Attributes

std::unique_ptr<Weight> weight_

Batch normalization scale parameter, which is referred to as gamma in the original paper.

std::unique_ptr<Weight> movingMean_

Moving average of mean.

std::unique_ptr<Weight> movingVar_

Moving average of variance.

std::unique_ptr<Weight> biases_

Batch normalization bias parameter, which is referred to as beta in the original paper.

MatrixPtr savedMean_

Intermediate results saved during the forward pass. They can be reused to speed up the backward pass.

MatrixPtr savedInvVar_
int imgSize_

Height or width of the input image feature. Currently the height is equal to the width. imgSize_ is 1 if the input is a fully-connected layer.

int imageH_
int imageW_
int imgPixels_

Height * Width.

int channels_

Feature dimension. If the input layer is a conv layer, it is the number of channels in the feature map of that layer. If the input layer is a fully-connected layer, it is the dimension of the fc layer.

bool useGlobalStats_
real movingAvgFraction_

BatchNormalizationLayer

class paddle::BatchNormalizationLayer

A subclass of the batch normalization base layer. It supports both CPU and GPU.

The config file api is batch_norm_layer.

Inherits from paddle::BatchNormBaseLayer

Public Functions

BatchNormalizationLayer(const LayerConfig &config)
~BatchNormalizationLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Functions

void setMeanAndStd()

Load pre-calculated mean and std.

void calMeanAndStd(const MatrixPtr &mat)

Calculate mean and std.

void calMovingMeanAndVar()

Calculate moving mean and variance.

void expandMat(const MatrixPtr &in, MatrixPtr &out)

Expand a Matrix from batch, channels * imagePixels to batch * imagePixels * channels.

void shrinkMat(const MatrixPtr &in, MatrixPtr &out)

Shrink a Matrix from batch * imagePixels * channels to batch, channels * imagePixels.
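
The layout change performed by expandMat and shrinkMat can be sketched on flat row-major buffers. The sketch assumes that the expanded form is a (batch * imagePixels) x channels matrix and that each input row is stored channel by channel; both are readings of the descriptions above rather than confirmed details, and the free functions below are hypothetical stand-ins for the member functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// in:  `batch` rows, each row laid out as [channel][pixel].
// out: `batch * pixels` rows, each row holding one value per channel.
std::vector<float> expandMat(const std::vector<float>& in, std::size_t batch,
                             std::size_t channels, std::size_t pixels) {
  assert(in.size() == batch * channels * pixels);
  std::vector<float> out(in.size());
  for (std::size_t b = 0; b < batch; ++b) {
    for (std::size_t c = 0; c < channels; ++c) {
      for (std::size_t p = 0; p < pixels; ++p) {
        out[(b * pixels + p) * channels + c] = in[(b * channels + c) * pixels + p];
      }
    }
  }
  return out;
}

// Inverse of expandMat: restores the [channel][pixel] layout per sample.
std::vector<float> shrinkMat(const std::vector<float>& in, std::size_t batch,
                             std::size_t channels, std::size_t pixels) {
  assert(in.size() == batch * channels * pixels);
  std::vector<float> out(in.size());
  for (std::size_t b = 0; b < batch; ++b) {
    for (std::size_t c = 0; c < channels; ++c) {
      for (std::size_t p = 0; p < pixels; ++p) {
        out[(b * channels + c) * pixels + p] = in[(b * pixels + p) * channels + c];
      }
    }
  }
  return out;
}
```

With this layout every row of the expanded matrix holds the channel values of one pixel, so per-channel statistics reduce to column-wise operations.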

Protected Attributes

MatrixPtr tmpMat_
MatrixPtr tmpGrad_
MatrixPtr expandedIn_
MatrixPtr expandedOut_
MatrixPtr expandedInGrad_
MatrixPtr expandedOutGrad_
MatrixPtr inGrad_
MatrixPtr normIn_
MatrixPtr normInGrad_
MatrixPtr meanGrad_
MatrixPtr stdGrad_
bool firstTest_

Flag that ensures the mean and variance are loaded only once.

Protected Static Attributes

const real EPS

Epsilon value used in the batch normalization formula.

CudnnBatchNormLayer

class paddle::CudnnBatchNormLayer

Batch normalization layer implemented with the cuDNN library.

The config file api is batch_norm_layer.

Note
The cuDNN version must be >= v4.0. Using the latest version (v5.1) is recommended.

Inherits from paddle::BatchNormBaseLayer

Public Functions

CudnnBatchNormLayer(const LayerConfig &config)
~CudnnBatchNormLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

void reshape(int batchSize)

Reshape the tensor of ioDesc_.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

hl_tensor_descriptor ioDesc_

Input/output tensor descriptor.

hl_tensor_descriptor bnParamDesc_

Shared tensor descriptor for the 6 tensors: bnScale, bnBias, running mean/var, save_mean/var.

MatrixPtr tmpWGrad_

The gradients of weight and bias in the cuDNN API cannot be empty. If is_static is set for the weight or bias, no memory is allocated for them and the gradient is NULL. In this case, two temporary matrices are used instead.

MatrixPtr tmpBiasGrad_

Protected Static Attributes

const double EPS

Epsilon value used in the batch normalization formula. Minimum allowed value is CUDNN_BN_MIN_EPSILON defined in cudnn.h. Same epsilon value should be used in forward and backward functions.

SumToOneNormLayer

class paddle::SumToOneNormLayer

A layer for sum-to-one normalization, as used in the Neural Turing Machine.

\[ out[i] = \frac {in[i]} {\sum_{k=1}^N in[k]} \]
where \(in\) is a (batchSize x dataDim) input vector, and \(out\) is a (batchSize x dataDim) output vector.

The config file api is sum_to_one_norm_layer.

Inherits from paddle::Layer

Public Functions

SumToOneNormLayer(const LayerConfig &config)
~SumToOneNormLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr reciprocalRowSum_

reciprocalRowSum_ = \(1 / \sum_{k=1}^N in[k]\)

MatrixPtr dotSum_

dotSum_ = output_.grad \(.*\) output_.value
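
The roles of the reciprocal row sum and the dot sum can be illustrated for a single row. This is a sketch under assumptions: the function names are hypothetical, and the backward formula is derived here from the forward definition (out[i] = in[i] / S with S the row sum, so the input gradient is (outGrad[i] - sum_k outGrad[k] * out[k]) / S) rather than copied from the layer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Forward pass for one row: out[i] = in[i] / sum_k in[k].
std::vector<double> sumToOneForward(const std::vector<double>& in) {
  double rowSum = 0.0;
  for (double v : in) rowSum += v;
  assert(rowSum != 0.0);
  const double reciprocalRowSum = 1.0 / rowSum;
  std::vector<double> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) out[i] = in[i] * reciprocalRowSum;
  return out;
}

// Backward pass for one row, given the forward input, the forward output,
// and the gradient with respect to the output.
std::vector<double> sumToOneBackward(const std::vector<double>& in,
                                     const std::vector<double>& out,
                                     const std::vector<double>& outGrad) {
  assert(in.size() == out.size() && out.size() == outGrad.size());
  double rowSum = 0.0;
  double dotSum = 0.0;  // sum_k outGrad[k] * out[k]
  for (std::size_t i = 0; i < in.size(); ++i) {
    rowSum += in[i];
    dotSum += outGrad[i] * out[i];
  }
  assert(rowSum != 0.0);
  std::vector<double> inGrad(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    inGrad[i] = (outGrad[i] - dotSum) / rowSum;
  }
  return inGrad;
}
```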

Activation Layer

ParameterReluLayer

class paddle::ParameterReluLayer

ParameterReluLayer activates its inputs with a learnable parameter weight_. Forward:

\[ y = x > 0 ? x : w .* x \]
Backward:
\[\begin{split} dx = x > 0 ? dy : w .* dy \\ dw = x > 0 ? 0 : dy.*x \end{split}\]
Here, x is the input, w is the weight, and y is the output. dx, dw, and dy are the corresponding gradients.

Inherits from paddle::Layer

Public Functions

ParameterReluLayer(const LayerConfig &config)
~ParameterReluLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> weight_
size_t partialSum_

partialSum_ makes a group of inputs share the same weight:

  • partialSum_ = 1: element-wise activation; each element has its own weight_.
  • partialSum_ = number of elements in one channel: channel-wise activation; the elements in a channel share the same weight_.
  • partialSum_ = number of outputs: all elements share the same weight_.
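
The weight sharing controlled by partialSum_ can be sketched for one sample as follows. It is assumed here that element i uses weight index i / partialSum; the names `preluForward`, `preluBackward`, and `PReluGrad` are hypothetical and only illustrate the forward and backward rules stated for this layer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct PReluGrad {
  std::vector<double> dx;  // gradient with respect to the input
  std::vector<double> dw;  // gradient with respect to the weights
};

// y = x > 0 ? x : w * x, with groups of `partialSum` consecutive elements
// sharing one weight (assumed index mapping: i / partialSum).
std::vector<double> preluForward(const std::vector<double>& x,
                                 const std::vector<double>& w,
                                 std::size_t partialSum) {
  assert(partialSum > 0 && x.size() == w.size() * partialSum);
  std::vector<double> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double wi = w[i / partialSum];
    y[i] = x[i] > 0.0 ? x[i] : wi * x[i];
  }
  return y;
}

// dx = x > 0 ? dy : w * dy;  dw accumulates dy * x where x <= 0.
PReluGrad preluBackward(const std::vector<double>& x,
                        const std::vector<double>& w,
                        const std::vector<double>& dy,
                        std::size_t partialSum) {
  assert(partialSum > 0 && x.size() == w.size() * partialSum);
  assert(dy.size() == x.size());
  PReluGrad g{std::vector<double>(x.size()), std::vector<double>(w.size(), 0.0)};
  for (std::size_t i = 0; i < x.size(); ++i) {
    const std::size_t k = i / partialSum;
    if (x[i] > 0.0) {
      g.dx[i] = dy[i];
    } else {
      g.dx[i] = w[k] * dy[i];
      g.dw[k] += dy[i] * x[i];
    }
  }
  return g;
}
```

With four inputs and partialSum = 2, elements 0 and 1 share the first weight and elements 2 and 3 share the second.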

Recurrent Layers

RecurrentLayer

class paddle::RecurrentLayer

RecurrentLayer takes 1 input layer. The output size is the same as that of the input layer. For each sequence [start, end], it performs the following computation:

\[\begin{split} out_{i} = act(in_{i}) \ \ \text{for} \ i = start \\ out_{i} = act(in_{i} + out_{i-1} * W) \ \ \text{for} \ start < i <= end \end{split}\]
If reversed is true, the order is reversed:
\[\begin{split} out_{i} = act(in_{i}) \ \ \text{for} \ i = end \\ out_{i} = act(in_{i} + out_{i+1} * W) \ \ \text{for} \ start <= i < end \end{split}\]
There are two ways to compute the RNN. One is to compute it sequence by sequence. The other is to reorganize the input into batches and then compute it batch by batch. Users can select between them with the rnn_use_batch flag.
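
The recurrence for a single sequence, in both directions, can be sketched as below. The sketch assumes tanh as the activation and omits the bias; `Vec`, `Mat`, and `recurrentForward` are hypothetical names, and the previous output is treated as a row vector multiplied by W.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // W[r][c], shape dim x dim

// Computes the outputs of one sequence; in[i] is the input frame at step i.
// The activation is assumed to be tanh for this sketch.
Mat recurrentForward(const Mat& in, const Mat& W, bool reversed) {
  const std::size_t len = in.size();
  Mat out(len);
  for (std::size_t step = 0; step < len; ++step) {
    // Visit frames from start to end, or from end to start when reversed.
    const std::size_t i = reversed ? len - 1 - step : step;
    Vec pre = in[i];
    if (step > 0) {
      const Vec& prev = out[reversed ? i + 1 : i - 1];
      assert(prev.size() == W.size());
      for (std::size_t c = 0; c < pre.size(); ++c) {
        for (std::size_t r = 0; r < prev.size(); ++r) {
          pre[c] += prev[r] * W[r][c];  // row vector times matrix
        }
      }
    }
    for (double& v : pre) v = std::tanh(v);
    out[i] = pre;
  }
  return out;
}
```

The first visited frame has no predecessor, so its output is simply the activation of its input, matching the first line of each formula.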

Inherits from paddle::Layer

Public Functions

RecurrentLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

virtual void resetState()

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating a sequence.

This is used for sequence generation. When generating sequence, the calculation at current timestamp depends on the state from previous timestamp. The model needs to keep the information about the previous timestamp in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

virtual void setState(LayerStatePtr state)

Set layer state.

virtual LayerStatePtr getState()

Get layer state.

Return
A copy of internal state.

Protected Functions

void forwardSequence(int batchSize, size_t numSequences, const int *starts)

If the user does not set rnn_use_batch=true, the RNN forward pass is computed sequence by sequence by default.

Parameters
  • batchSize -

    Total number of words across all samples in this batch.

  • numSequences -

    The number of samples.

  • starts -

    The start position of each sample.

void forwardOneSequence(int start, int length)

Compute the RNN forward pass for one sequence.

Parameters
  • start -

    The start position of this sequence (or sample).

  • length -

    The length of this sequence (or sample), namely the number of words in this sequence.
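
How batchSize, numSequences, and starts relate to the per-sequence start and length values can be sketched as follows. The sketch assumes that starts holds numSequences + 1 offsets with the last one equal to batchSize; `sequenceRanges` is a hypothetical helper, not part of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns (start, length) for every sequence. Assumes `starts` holds
// numSequences + 1 non-decreasing offsets, the last one being batchSize.
std::vector<std::pair<int, int>> sequenceRanges(int batchSize,
                                                std::size_t numSequences,
                                                const int* starts) {
  assert(starts != nullptr);
  assert(starts[numSequences] == batchSize);
  std::vector<std::pair<int, int>> ranges;
  ranges.reserve(numSequences);
  for (std::size_t i = 0; i < numSequences; ++i) {
    const int start = starts[i];
    const int length = starts[i + 1] - starts[i];
    assert(length >= 0);
    ranges.emplace_back(start, length);
  }
  return ranges;
}
```

A sequence-by-sequence pass would then handle each (start, length) pair in turn.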

void backwardSequence(int batchSize, size_t numSequences, const int *starts)

Compute the RNN backward pass sequence by sequence.

Parameters
  • batchSize -

    Total number of words across all samples in this batch.

  • numSequences -

    The number of samples.

  • starts -

    The start position of each sample.

void backwardOneSequence(int start, int length)

Compute the RNN backward pass for one sequence.

Parameters
  • start -

    The start position of this sequence (or sample).

  • length -

    The length of this sequence (or sample), namely the number of words in this sequence.

void forwardBatch(int batchSize, size_t numSequences, const int *starts)

Reorganize the input into batches and compute the RNN forward pass batch by batch. The batch-shaped output is converted back to sequence shape after the forward pass finishes. For details about the batches, refer to the SequenceToBatch class.

Parameters
  • batchSize -

    Total number of words across all samples in this batch.

  • numSequences -

    The number of samples.

  • starts -

    The start position of each sample.

void backwardBatch(int batchSize, size_t numSequences, const int *starts)

Reorganize the input into batches and compute the RNN backward pass batch by batch.

Parameters
  • batchSize -

    Total number of words across all samples in this batch.

  • numSequences -

    The number of samples.

  • starts -

    The start position of each sample.

Protected Attributes

std::unique_ptr<Weight> weight_
std::unique_ptr<Weight> bias_
std::vector<Argument> frameOutput_

frameOutput_[i] is used to hold the i-th sample of output_

MatrixPtr prevOutput_
bool reversed_

Whether to compute the RNN in reverse order.

std::unique_ptr<SequenceToBatch> batchValue_

When computing batch by batch, batchValue_ stores the reorganized input value.

std::unique_ptr<SequenceToBatch> batchGrad_

When computing batch by batch, batchGrad_ stores the gradient with respect to the reorganized input value.

SequenceToBatch

class paddle::SequenceToBatch

Public Functions

SequenceToBatch(bool useGpu)
void resizeOrCreateBatch(int batchSize, size_t numSequences, const int *seqStarts, bool reversed, bool prevBatchState = false)
void copy(Matrix &seqValue, Matrix &batchValue, bool seq2batch)
void add(Matrix &seqValue, Matrix &batchValue, bool seq2batch)
MatrixPtr getBatchValue(Matrix &batchValue, int batchId, int numRows = 0)
size_t getNumBatch() const
void resizeOrCreate(Matrix &seqValue)
void copyFromSeq(Matrix &seqValue)
void copyBackSeq(Matrix &seqValue)
MatrixPtr getBatchValue(int batchId, int numRows = 0)
MatrixPtr getBatchValue()
void prevOutput2Batch(Matrix &src, Matrix &dst)
void getSeqOutputFromBatch(Matrix &sequence, Matrix &batch)
void shareIndexWith(const SequenceToBatch &seq2batch)

Protected Functions

void sequence2BatchCopy(Matrix &batch, Matrix &sequence, IVector &seq2BatchIdx, bool seq2batch)
void sequence2BatchAdd(Matrix &batch, Matrix &sequence, IVector &seq2BatchIdx, bool seq2batch)

Protected Attributes

IVectorPtr batchStartPositions_
IVectorPtr seq2BatchIdx_
IVectorPtr cpuSeq2BatchIdx_
IVectorPtr cpuSeqIdx_
IVectorPtr cpuSeqEndIdxInBatch_
IVectorPtr seqIdx_
IVectorPtr seqEndIdxInBatch_
size_t numBatch_
bool useGpu_
MatrixPtr batchValue_

LSTM

LstmLayer

class paddle::LstmLayer

LstmLayer takes 1 input layer with size * 4. The input layer is divided into 4 equal parts: (input_s, input_ig, input_fg, input_og)

For each sequence [start, end] it performs the following computation:

output_{i} = actState(state_{i}) * actGate(outputGate_{i})
state_{i} = actInput(input_s_{i} + bias_s +
            output_{i-1} * recurrIW) * actGate(inputGate_{i}) +
            actGate(forgetGate_{i}) * state_{i-1}
inputGate_{i} = input_ig_{i} + bias_ig + output_{i-1} * recurrIGW +
                state_{i-1} * inputCheck
outputGate_{i} = input_og_{i} + bias_og + output_{i-1} * recurrOGW +
                 state_{i} * outputCheck
forgetGate_{i} = input_fg_{i} + bias_fg + output_{i-1} * recurrFGW +
                 state_{i-1} * forgetCheck

  • parameter[0] consists of (recurrIW, recurrIGW, recurrFGW, recurrOGW)
  • biasParameter consists of (bias_s, bias_ig, bias_og, bias_fg, inputCheck, forgetCheck, outputCheck)
  • actInput is defined by config active_type.
  • actState is defined by config active_state_type.
  • actGate is defined by config active_gate_type.

There are two ways to compute: sequence by sequence or batch by batch. By default, unless pre_batch_state is set to true, it computes batch by batch.

The formula in the paper is as follows:

\[\begin{split} i_t = \sigma(W_{xi}x_{t} + W_{hi}h_{t-1} + W_{ci}c_{t-1} + b_i) \\ f_t = \sigma(W_{xf}x_{t} + W_{hf}h_{t-1} + W_{cf}c_{t-1} + b_f) \\ \tilde{c_t} = tanh (W_{xc}x_t+W_{hc}h_{t-1} + b_c) \\ o_t = \sigma(W_{xo}x_{t} + W_{ho}h_{t-1} + W_{co}c_t + b_o) \\ c_t = f_t * c_{t-1} + i_t * \tilde{c_t} \\ h_t = o_t tanh(c_t) \end{split}\]

The weight ([size, 4*size]) contains \(W_{hi}, W_{hf}, W_{hc}, W_{ho}\). The bias contains \(b_i, b_f, b_c, b_o\) and \(W_{ci}, W_{cf}, W_{co}\).
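
One time step of the paper's formula can be sketched for the scalar case (size = 1). The struct and function names are hypothetical, the gate activation is assumed to be the logistic sigmoid and the other activations tanh, and the arguments xi, xf, xc, xo stand for input terms that have already been projected (the \(W_{x*}x_t\) products).

```cpp
#include <cmath>

// Scalar (size == 1) parameters, named after the symbols in the formula.
struct LstmParams {
  double Whi = 0.0, Whf = 0.0, Whc = 0.0, Who = 0.0;  // recurrent weights
  double Wci = 0.0, Wcf = 0.0, Wco = 0.0;             // peephole weights
  double bi = 0.0, bf = 0.0, bc = 0.0, bo = 0.0;      // biases
};

struct LstmState {
  double h = 0.0;  // hidden output h_t
  double c = 0.0;  // cell state c_t
};

inline double sigmoid(double v) { return 1.0 / (1.0 + std::exp(-v)); }

// xi, xf, xc, xo are the already projected input terms for this step.
LstmState lstmStep(double xi, double xf, double xc, double xo,
                   const LstmState& prev, const LstmParams& p) {
  const double i = sigmoid(xi + p.Whi * prev.h + p.Wci * prev.c + p.bi);
  const double f = sigmoid(xf + p.Whf * prev.h + p.Wcf * prev.c + p.bf);
  const double cTilde = std::tanh(xc + p.Whc * prev.h + p.bc);
  LstmState next;
  next.c = f * prev.c + i * cTilde;
  // The output gate peeks at the updated cell state c_t.
  const double o = sigmoid(xo + p.Who * prev.h + p.Wco * next.c + p.bo);
  next.h = o * std::tanh(next.c);
  return next;
}
```

The four recurrent weights correspond to the weight parameter, while the three peephole weights and four biases together make up the seven bias entries per unit.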

Note
These \(W_{xi}x_{t}, W_{xf}x_{t}, W_{xc}x_{t}, W_{xo}x_{t}\) operations on the input sequence are NOT included in LstmLayer, so users should place an fc_layer or mixed_layer before the LSTM layer.

Inherits from paddle::Layer, paddle::LstmCompute

Subclassed by paddle::MDLstmLayer

Public Functions

LstmLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

virtual void resetState()

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating a sequence.

This is used for sequence generation. When generating sequence, the calculation at current timestamp depends on the state from previous timestamp. The model needs to keep the information about the previous timestamp in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

virtual void setState(LayerStatePtr state)

Set layer state.

virtual LayerStatePtr getState()

Get layer state.

Return
A copy of internal state.

Protected Functions

void forwardSequence(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)

Compute the LSTM forward pass sequence by sequence.

Parameters
  • batchSize -

    This is not the same as batch_size in the config file. It is the total number of words across all samples in this forward batch.

  • numSequences -

    The number of samples. It is equal to batch_size in the config file.

  • starts -

    The start position of each sample.

  • inputValue -

    The input values.

void backwardSequence(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputGrad)

Compute lstm backward one sequence by one sequence.

void forwardBatch(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)

Compute the LSTM forward pass batch by batch. The batch value is reorganized by the SequenceToBatch class, and the batch output is converted back into sequence form after the forward pass finishes. Here, one batch contains one word from each sample. If the samples have different lengths, the batch is not zero-padded and simply contains fewer words. The total number of batches equals the maximum sequence length. For details, refer to the SequenceToBatch class. In GPU mode, a GPU kernel is launched inside the loop:

for (int i = 0; i < numBatch; ++i) {  // numBatch == max_sequence_length
  // compute one batch
}
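
The reorganization described above can be sketched as an index computation: batch t collects the row of the t-th word of every sequence that is long enough. Sorting the sequences by descending length and the helper name `sequenceToBatchRows` are assumptions of this sketch, not confirmed details of SequenceToBatch; `starts` is assumed to hold numSequences + 1 offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// For each batch t, lists the row indices (in the sequence-ordered matrix)
// of the t-th word of every sequence that has more than t words.
std::vector<std::vector<int>> sequenceToBatchRows(std::size_t numSequences,
                                                  const int* starts) {
  assert(starts != nullptr);
  auto length = [starts](std::size_t s) { return starts[s + 1] - starts[s]; };

  // Longest sequences first, so each batch is a prefix of `order`.
  std::vector<std::size_t> order(numSequences);
  for (std::size_t i = 0; i < numSequences; ++i) order[i] = i;
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return length(a) > length(b);
                   });

  // The number of batches equals the maximum sequence length.
  const int numBatch = numSequences == 0 ? 0 : length(order[0]);
  std::vector<std::vector<int>> batches(static_cast<std::size_t>(numBatch));
  for (int t = 0; t < numBatch; ++t) {
    for (std::size_t s : order) {
      if (length(s) <= t) break;  // all remaining sequences are shorter
      batches[static_cast<std::size_t>(t)].push_back(starts[s] + t);
    }
  }
  return batches;
}
```

For sequences of lengths 3, 2, and 4 this yields four batches of sizes 3, 3, 2, and 1, with no zero padding.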

void backwardBatch(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputGrad)

Compute lstm backward one batch by one batch.

void forwardSeqParallel(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)

This function only supports GPU. It does not need to reorganize the input into batch values. It launches a single kernel that computes forward propagation in parallel at the sequence level.

void backwardSeqParallel(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputGrad)

Backward propagation corresponding to forwardSeqParallel.

void getPrevBatchOutput(size_t numSequences)

Used for sequence generation. Gets the output after forwardBatch.

void getPrevBatchState(size_t numSequences)

Used for sequence generation. Gets the state after forwardBatch.

Protected Attributes

std::unique_ptr<Weight> weight_

Learned parameters, shape: (size, 4*size). The weight ([size, 4*size]) contains \(W_{hi}, W_{hf}, W_{hc}, W_{ho}\).

std::unique_ptr<Weight> bias_

Learned bias parameter, shape: (1, 7 * size). The bias contains \(b_i, b_f, b_c, b_o\) and \(W_{ci}, W_{cf}, W_{co}\).

MatrixPtr localBias_

The real bias, pointing to \(b_i, b_f, b_c, b_o\).

MatrixPtr checkIg_

The peephole connection for input gate.

MatrixPtr checkFg_

The peephole connection for forget gate.

MatrixPtr checkOg_

The peephole connection for output gate.

MatrixPtr localBiasGrad_

The gradient of real bias.

MatrixPtr checkIgGrad_

The gradient of peephole connection for input gates.

MatrixPtr checkFgGrad_

The gradient of peephole connection for forget gates.

MatrixPtr checkOgGrad_

The gradient of peephole connection for output gates.

Argument state_

Stores the cell state of previous time step, namely \(c_{t-1}\).

Argument preOutput_

Stores the hidden state of the previous time step, namely \(h_{t-1}\).

Argument gate_

Stores the value and gradient of four gates, namely \(i_t, f_t, o_t, c_t\).

bool reversed_

Whether it is reversed lstm.

bool useBatch_

Whether to use batch method to compute.

bool useSeqParallel_

Whether to use the sequence-parallel method to compute.

std::unique_ptr<SequenceToBatch> batchValue_

batchValue_ is used in the batch calculation method. It stores the input after it has been reorganized into batches.

std::unique_ptr<SequenceToBatch> batchGrad_

The gradient of batchValue_.

MatrixPtr prevState_

Used in generation and stores the state of previous time step.

MatrixPtr prevOutput_

Used in generation and stores the output of previous time step.

MatrixPtr prevBatchOutput2_
MatrixPtr totalState_

The total state.

LstmStepLayer

class paddle::LstmStepLayer

Inherits from paddle::Layer, paddle::LstmCompute

Public Functions

LstmStepLayer(const LayerConfig &config)
~LstmStepLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

Argument state_
Argument gate_
Argument stateActive_
MatrixPtr checkIg_
MatrixPtr checkFg_
MatrixPtr checkOg_
MatrixPtr checkIgGrad_
MatrixPtr checkFgGrad_
MatrixPtr checkOgGrad_
std::unique_ptr<Weight> weight_

LstmCompute

class paddle::LstmCompute

Subclassed by paddle::LstmLayer, paddle::LstmStepLayer

Public Functions

void init(LayerConfig &config)
template <bool useGpu>
void forwardBatch(hl_lstm_value value, int frameSize, int batchSize)

LstmLayer batch compute API (forwardBatch, backwardBatch). To use the batch compute API, the LSTM value (and gradient) must be in batch structure. Compute order:

  • forwardBatch: for 0 <= id < numBatch
  • backwardBatch: for numBatch > id >= 0

template <bool useGpu>
void backwardBatch(hl_lstm_value value, hl_lstm_grad grad, int frameSize, int batchSize)
template <bool useGpu>
void forwardOneSequence(hl_lstm_value value, int frameSize)

LstmLayer sequence compute API (forwardOneSequence, backwardOneSequence). Compute order (for each sequence):

  • forwardOneSequence: if (!reversed), for 0 <= seqId < seqLength; if (reversed), for seqLength > seqId >= 0
  • backwardOneSequence: if (!reversed), for seqLength > seqId >= 0; if (reversed), for 0 <= seqId < seqLength

template <bool useGpu>
void backwardOneSequence(hl_lstm_value value, hl_lstm_grad grad, int frameSize)
template <>
void forwardOneSequence(hl_lstm_value value, int frameSize)
template <>
void backwardOneSequence(hl_lstm_value value, hl_lstm_grad grad, int frameSize)
template <>
void forwardBatch(hl_lstm_value value, int frameSize, int batchSize)
template <>
void backwardBatch(hl_lstm_value value, hl_lstm_grad grad, int frameSize, int batchSize)
template <>
void forwardBatch(hl_lstm_value value, int frameSize, int batchSize)
template <>
void backwardBatch(hl_lstm_value value, hl_lstm_grad grad, int frameSize, int batchSize)
template <>
void forwardOneSequence(hl_lstm_value value, int frameSize)
template <>
void backwardOneSequence(hl_lstm_value value, hl_lstm_grad grad, int frameSize)

Public Members

hl_activation_mode_t activeNode_
hl_activation_mode_t activeGate_
hl_activation_mode_t activeState_

MDLSTM

MDLstmLayer

class paddle::MDLstmLayer

Inherits from paddle::LstmLayer

Public Functions

MDLstmLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

Protected Functions

void forwardOneSequence(int start, CoordIterator &coordIter)
void backwardOneSequence(int start, CoordIterator &coordIter)
void forwardGate2OutputSequence(int start, CoordIterator &coordIter)
void backwardGate2OutputSequence(int start, CoordIterator &coordIter)

Protected Attributes

std::vector<Argument> frameInputGate_
std::vector<Argument> frameForgetGate_
std::vector<Argument> frameOutputGate_
std::vector<Argument> frameInputNode_
std::vector<Argument> frameGate_
std::vector<Argument> frameState_
std::vector<Argument> framePreOutput_
std::vector<Argument> frameOutput_
std::unique_ptr<ActivationFunction> activationGate_
std::unique_ptr<ActivationFunction> activationState_
int numDims_
size_t numBlocks_
std::vector<bool> directions_
std::vector<int> delays_
std::vector<std::vector<int>> dimsV_

CoordIterator

class paddle::CoordIterator

Public Functions

void step(size_t d, bool reversed)
CoordIterator(std::vector<int> dim, std::vector<bool> directions)
CoordIterator &operator++()
CoordIterator &operator--()
std::vector<int> &curPos()
int offset()
int offset(const std::vector<int> &pos)
std::vector<int> &begin()
std::vector<int> &rbegin()
bool end()
bool getPrePos(const std::vector<int> &delays, int idx, std::vector<int> &prePos)
bool getNextPos(const std::vector<int> &delays, int idx, std::vector<int> &nextPos)

Public Members

std::vector<int> dims_
std::vector<bool> directions_
std::vector<int> curPos_
bool end_

GRU

GatedRecurrentLayer

class paddle::GatedRecurrentLayer

Please refer to “Junyoung Chung, Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling”.

GatedRecurrentLayer takes 1 input layer with size * 3. The input layer is divided into 3 equal parts: (xz_t, xr_t, xi_t). parameter and biasParameter are also divided into 3 equal parts:

  • parameter consists of (U_z, U_r, U)
  • biasParameter consists of (bias_z, bias_r, bias_o)

\[\begin{split} update \ gate: z_t = actGate(xz_t + U_z * h_{t-1} + bias_z) \\ reset \ gate: r_t = actGate(xr_t + U_r * h_{t-1} + bias_r) \\ output \ candidate: \tilde{h}_t = actNode(xi_t + U * dot(r_t, h_{t-1}) + bias_o) \\ hidden \ activation: h_t = dot((1-z_t), h_{t-1}) + dot(z_t, \tilde{h}_t) \\ \end{split}\]

The config file api is grumemory.

Note
  • dot denotes “element-wise multiplication”.
  • actNode is defined by config active_type
  • actGate is defined by config active_gate_type
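
One GRU step following the four equations of this layer can be sketched for the scalar case (size = 1). `GruParams` and `gruStep` are hypothetical names, actGate is assumed to be the logistic sigmoid and actNode tanh, and xz, xr, xi stand for the three parts of the already projected input.

```cpp
#include <cmath>

// Scalar (size == 1) parameters, named after the symbols in the formula.
struct GruParams {
  double Uz = 0.0, Ur = 0.0, U = 0.0;               // recurrent weights
  double biasZ = 0.0, biasR = 0.0, biasO = 0.0;     // biases
};

inline double sigmoid(double v) { return 1.0 / (1.0 + std::exp(-v)); }

// Returns h_t given the three input parts and the previous output h_{t-1}.
double gruStep(double xz, double xr, double xi, double hPrev,
               const GruParams& p) {
  const double z = sigmoid(xz + p.Uz * hPrev + p.biasZ);  // update gate
  const double r = sigmoid(xr + p.Ur * hPrev + p.biasR);  // reset gate
  // Output candidate: the reset gate scales the previous output first.
  const double hTilde = std::tanh(xi + p.U * (r * hPrev) + p.biasO);
  // Blend the previous output and the candidate with the update gate.
  return (1.0 - z) * hPrev + z * hTilde;
}
```

When the update gate saturates near 1 the result approaches the candidate, and when it is near 0 the previous output is carried over unchanged.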

Inherits from paddle::Layer, paddle::GruCompute

Public Functions

GatedRecurrentLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

virtual void resetState()

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating a sequence.

This is used for sequence generation. When generating sequence, the calculation at current timestamp depends on the state from previous timestamp. The model needs to keep the information about the previous timestamp in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

virtual void setState(LayerStatePtr state)

Set layer state.

virtual LayerStatePtr getState()

Get layer state.

Return
A copy of internal state.

Protected Functions

void forwardSequence(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)
void backwardSequence(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputGrad)
void forwardBatch(int batchSize, size_t numSequences, const int *starts, MatrixPtr inputValue)
void backwardBatch(int batchSize, MatrixPtr inputGrad)

Protected Attributes

std::unique_ptr<Weight> weight_
std::unique_ptr<Weight> gateWeight_
std::unique_ptr<Weight> stateWeight_
std::unique_ptr<Weight> bias_
Argument gate_
Argument resetOutput_
bool reversed_
bool useBatch_
std::unique_ptr<SequenceToBatch> batchValue_
std::unique_ptr<SequenceToBatch> batchGrad_
std::unique_ptr<ActivationFunction> activationGate_
MatrixPtr prevOutput_

GruStepLayer

class paddle::GruStepLayer

GruStepLayer is like GatedRecurrentLayer, but is used in a recurrent layer group. GruStepLayer takes 2 input layers.

  • input[0] with size * 3, divided into 3 equal parts: (xz_t, xr_t, xi_t).
  • input[1] with size: {prev_out}.

parameter and biasParameter are also divided into 3 equal parts:

  • parameter consists of (U_z, U_r, U)

  • biasParameter consists of (bias_z, bias_r, bias_o)

    \[\begin{split} update \ gate: z_t = actGate(xz_t + U_z * prev\_out + bias_z) \\ reset \ gate: r_t = actGate(xr_t + U_r * prev\_out + bias_r) \\ output \ candidate: \tilde{h}_t = actNode(xi_t + U * dot(r_t, prev\_out) + bias_o) \\ output: h_t = dot((1-z_t), prev\_out) + dot(z_t, \tilde{h}_t) \end{split}\]

The config file api is gru_step_layer.

Note
  • dot denotes “element-wise multiplication”.
  • actNode is defined by config active_type
  • actGate is defined by config active_gate_type

Inherits from paddle::Layer, paddle::GruCompute

Public Functions

GruStepLayer(const LayerConfig &config)
~GruStepLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

Argument gate_
Argument resetOutput_
std::unique_ptr<Weight> weight_
std::unique_ptr<Weight> bias_

GruCompute

class paddle::GruCompute

Subclassed by paddle::GatedRecurrentLayer, paddle::GruStepLayer

Public Functions

void init(LayerConfig &config)
template <bool useGpu>
void forward(hl_gru_value value, int frameSize, int batchSize = 1)
template <bool useGpu>
void backward(hl_gru_value value, hl_gru_grad grad, int frameSize, int batchSize = 1)
template <>
void forward(hl_gru_value value, int frameSize, int batchSize)
template <>
void backward(hl_gru_value value, hl_gru_grad grad, int frameSize, int batchSize)
template <>
void forward(hl_gru_value value, int frameSize, int batchSize)
template <>
void backward(hl_gru_value value, hl_gru_grad grad, int frameSize, int batchSize)

Public Members

hl_activation_mode_t activeNode_
hl_activation_mode_t activeGate_

Recurrent Layer Group

AgentLayer

class paddle::AgentLayer

AgentLayer is used as a virtual input of another layer in the config. Before forward/backward is executed, setRealLayer() should be called to set one and only one real layer.

Inherits from paddle::Layer

Subclassed by paddle::SequenceAgentLayer

Public Functions

AgentLayer(const LayerConfig &config)
~AgentLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

void setRealLayer(LayerPtr layer, int numSamples = 0)
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

LayerPtr realLayer_
int numSamples_

SequenceAgentLayer

class paddle::SequenceAgentLayer

Like AgentLayer, but uses only the first numSamples sequences.

Inherits from paddle::AgentLayer

Public Functions

SequenceAgentLayer(const LayerConfig &config)
~SequenceAgentLayer()
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

GatherAgentLayer

class paddle::GatherAgentLayer

Like AgentLayer, but it can gather many real layers. Each real layer gives a few rows of a sequence; after gathering all real layers, GatherAgentLayer collects a complete sequence.

Inherits from paddle::Layer

Subclassed by paddle::SequenceGatherAgentLayer

Public Functions

GatherAgentLayer(const LayerConfig &config)
virtual ~GatherAgentLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

void copyIdAndSequenceInfo(const Argument &input, const IVectorPtr &allIds, const std::vector<int> &idIndex)
void addRealLayer(LayerPtr layer)
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::vector<LayerPtr> realLayers_
std::vector<IVectorPtr> idsVec_
IVectorPtr allIds_
std::vector<int> idIndex_

SequenceGatherAgentLayer

class paddle::SequenceGatherAgentLayer

Like GatherAgentLayer, but it selects a few sequences in the real layer. The ids in addRealLayer() are the ids of the selected sequences. It is used to reorder the sequence output.

Inherits from paddle::GatherAgentLayer

Public Functions

SequenceGatherAgentLayer(const LayerConfig &config)
virtual ~SequenceGatherAgentLayer()
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

ScatterAgentLayer

class paddle::ScatterAgentLayer

Like AgentLayer, but it only selects a few rows in the real layer. [idIndex, idIndex + idSize) of ids in setRealLayerAndOutput() are the selected row ids. It is used to scatter one layer’s output to many small submodels. ScatterAgentLayer also supports a real layer that holds ids; in that case, the agent selects a few ids in the real layer.

Inherits from paddle::Layer

Subclassed by paddle::SequenceScatterAgentLayer

Public Functions

ScatterAgentLayer(const LayerConfig &config)
virtual ~ScatterAgentLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

void setRealLayer(LayerPtr layer, const std::vector<int> &ids, bool copyId = false)

Set the real layer in generation.

void setRealLayerAndOutput(LayerPtr layer, const Argument &outArg, const IVectorPtr &ids, int idIndex, int idSize)
void setSequenceStartPositions(const ICpuGpuVectorPtr &sequenceStartPositions, int seqStartPosIndex, int numSequences)
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

LayerPtr realLayer_
IVectorPtr ids_
IVectorPtr cpuIds_
Argument realOutArg_
int idIndex_
int idSize_
int seqStartPosIndex_
int numSequences_

SequenceScatterAgentLayer

class paddle::SequenceScatterAgentLayer

Like ScatterAgentLayer, but it selects a few sequences in the real layer. The ids in setRealLayer() or setRealLayerAndOutput() are the ids of the selected sequences. It is used to reorder the sequence input.

Inherits from paddle::ScatterAgentLayer

Public Functions

SequenceScatterAgentLayer(const LayerConfig &config)
virtual ~SequenceScatterAgentLayer()
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

ICpuGpuVectorPtr inputStartPos_

GetOutputLayer

class paddle::GetOutputLayer

Inherits from paddle::Layer

Public Functions

GetOutputLayer(const LayerConfig &config)
~GetOutputLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Mixed Layer

class paddle::MixedLayer

A mixed layer has multiple input layers. Each input layer is processed by a Projection or Operator. The results of all Projections and Operators are summed together with the bias (if configured), and then go through an activation function and dropout (if configured).

The config file api is mixed_layer.

Inherits from paddle::Layer

Public Functions

MixedLayer(const LayerConfig &config)
~MixedLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void prefetch()

If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

virtual void resetState()

Reset the internal state variables. Allocate them if they have not been allocated. This function needs to be called before Layer::forward() when generating a sequence.

This is used for sequence generation. When generating a sequence, the calculation at the current time step depends on the state from the previous time step. The model needs to keep the information about the previous time step in the state variables. Layers such as RecurrentLayer, LstmLayer and ContextLayer have state variables.

virtual void setState(LayerStatePtr state)

setState() should be called after getState(). The state argument consists of the states of all projections.

virtual LayerStatePtr getState()

Returns the state, which consists of the states of all projections.

Protected Attributes

std::vector<std::unique_ptr<Projection>> projections_
std::vector<std::unique_ptr<Operator>> operators_
std::vector<int> projectionStateMatrixSize_

the matrix size of projection state

std::unique_ptr<Weight> biases_

DotMulProjection

class paddle::DotMulProjection

DotMulProjection performs element-wise multiplication with weight:

\[ out.row[i] += in.row[i] .* weight \]
where \(.*\) means element-wise multiplication.

The config file api is dotmul_projection.

Inherits from paddle::Projection

Public Functions

DotMulProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)
virtual void forward()
virtual void backward(const UpdateCallback &callback)

Protected Attributes

std::unique_ptr<Weight> weight_

shared memory with parameter

DotMulOperator

class paddle::DotMulOperator

DotMulOperator takes two inputs, performs element-wise multiplication:

\[ out.row[i] += scale * (in1.row[i] .* in2.row[i]) \]
where \(.*\) means element-wise multiplication, and scale is a config scalar whose default value is one.

The config file api is dotmul_operator.

Inherits from paddle::Operator

Public Functions

DotMulOperator(const OperatorConfig &config, bool useGpu)
virtual void forward()
virtual void backward()

FullMatrixProjection

class paddle::FullMatrixProjection

FullMatrixProjection performs full matrix multiplication:

\[ out.row[i] += in.row[i] * weight \]

The config file api is full_matrix_projection.

Inherits from paddle::Projection

Public Functions

FullMatrixProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)
virtual void forward()
virtual void backward(const UpdateCallback &callback)

Protected Attributes

std::unique_ptr<Weight> weight_

IdentityProjection

class paddle::IdentityProjection

IdentityProjection performs addition:

\[ out.row[i] += in.row[i] \]

The config file api is identity_projection.

Inherits from paddle::Projection

Public Functions

IdentityProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)

Constructor.

Note
IdentityProjection should not have any parameter.

virtual void forward()
virtual void backward(const UpdateCallback &callback)

IdentityOffsetProjection

class paddle::IdentityOffsetProjection

IdentityOffsetProjection is like IdentityProjection, but the layer size may be smaller than the input size. It selects dimensions [offset, offset+layer_size) from the input to perform addition:

\[ out.row[i] += in.row[i + \textrm{offset}] \]

The config file api is identity_projection.

Inherits from paddle::Projection

Public Functions

IdentityOffsetProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)

Constructor.

Note
IdentityOffsetProjection should not have any parameter.

virtual void forward()
virtual void backward(const UpdateCallback &callback)

TableProjection

class paddle::TableProjection

TableProjection takes index data as input. It selects rows from the parameter where row_id is in input_ids:

\[ out.row[i] += table.row[ids[i]] \]
where \(out\) is out, \(table\) is parameter, \(ids\) is input_ids, and \(i\) is row_id.

The config file api is table_projection.

Note
If \(ids[i] = -1\), it will be ignored.
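
The lookup rule, including the skipped -1 ids, can be sketched as below. This is a minimal illustration on plain std::vector data; tableLookupAdd is a hypothetical helper and not part of the Paddle API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Adds table.row[ids[i]] into out.row[i]. Rows whose id is -1 are left untouched.
void tableLookupAdd(const Matrix& table, const std::vector<int>& ids, Matrix& out) {
  assert(ids.size() == out.size());
  for (std::size_t i = 0; i < ids.size(); ++i) {
    if (ids[i] == -1) continue;  // ignored id
    assert(ids[i] >= 0 && static_cast<std::size_t>(ids[i]) < table.size());
    const std::vector<float>& row = table[static_cast<std::size_t>(ids[i])];
    assert(row.size() == out[i].size());
    for (std::size_t j = 0; j < row.size(); ++j) out[i][j] += row[j];
  }
}
```

Because the projection accumulates with +=, any value already present in out is preserved for rows with id -1.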

Inherits from paddle::Projection

Public Functions

TableProjection(const ProjectionConfig &config, const ParameterPtr &parameter, bool useGpu)
virtual void prefetch(const Argument *in)

If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

virtual void forward()
virtual void backward(const UpdateCallback &callback)

Protected Attributes

std::unique_ptr<Weight> table_

TransposedFullMatrixProjection

class paddle::TransposedFullMatrixProjection

TransposedFullMatrixProjection performs full matrix multiplication with the transposed weight:

\[ out.row[i] += in.row[i] * weight^{\rm T} \]

The config file api is trans_full_matrix_projection.

Inherits from paddle::Projection

Public Functions

TransposedFullMatrixProjection(const ProjectionConfig &config, ParameterPtr parameter, bool useGPu)
virtual void forward()
virtual void backward(const UpdateCallback &callback)

Protected Attributes

std::unique_ptr<Weight> weight_

Aggregate Layers

Aggregate

AverageLayer

class paddle::AverageLayer

A layer for “internal average” over sequence input.

  • Input: one or more sequences. Each sequence contains some instances.
  • If AverageLevel = kNonSeq:
    • Output: the output size is the number of input sequences (NOT input instances).
    • output[i] = average_{for each instance in this sequence}{input[i]}
  • If AverageLevel = kSeq:
    • The input sequence must have sub-sequences.
    • Output: the output size is the number of input sub-sequences.
    • output[i] = average_{for each instance in this sub-sequence}{input[i]}
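
The per-sequence pooling can be sketched as follows for one scalar feature per instance. This is an illustrative sketch: poolSequences and the Strategy enum are hypothetical, sequence boundaries are assumed to be given as start offsets, and kAverageSquareRootN is assumed to divide the sum by the square root of the sequence length.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum class Strategy { kAverage, kSum, kAverageSquareRootN };

// starts holds numSequences + 1 offsets; sequence s covers rows
// [starts[s], starts[s + 1]) of input. Returns one value per sequence.
std::vector<float> poolSequences(const std::vector<float>& input,
                                 const std::vector<int>& starts,
                                 Strategy strategy) {
  assert(!starts.empty());
  std::vector<float> out;
  for (std::size_t s = 0; s + 1 < starts.size(); ++s) {
    const int len = starts[s + 1] - starts[s];
    assert(len > 0 && starts[s] >= 0);
    assert(static_cast<std::size_t>(starts[s + 1]) <= input.size());
    float sum = 0.0f;
    for (int i = starts[s]; i < starts[s + 1]; ++i) sum += input[static_cast<std::size_t>(i)];
    float scale = 1.0f;  // kSum keeps the plain sum
    if (strategy == Strategy::kAverage) scale = 1.0f / static_cast<float>(len);
    if (strategy == Strategy::kAverageSquareRootN)
      scale = 1.0f / std::sqrt(static_cast<float>(len));
    out.push_back(sum * scale);
  }
  return out;
}
```

The output has one entry per sequence, not per instance, which matches the output size described for kNonSeq.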

Inherits from paddle::Layer

Public Types

enum AverageStrategy

Values:

kAverage = 0
kSum = 1
kAverageSquareRootN = 2

enum AverageLevel

Values:

kNonSeq = 0
kSeq = 1

Public Functions

AverageLayer(const LayerConfig &config)
~AverageLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> biases_
MatrixPtr outMtx_
MatrixPtr dataMtx_
int mode_
int type_

MaxLayer

class paddle::MaxLayer

A layer for “internal max” over sequence input.

  • Input: one or more sequences. Each sequence contains some instances.
  • If MaxLevel = kNonSeq:
    • Output: the output size is the number of input sequences (NOT input instances).
    • output[i] = max_{for each instance in this sequence}{input[i]}
  • If MaxLevel = kSeq:
    • The input sequence must have sub-sequences.
    • Output: the output size is the number of input sub-sequences.
    • output[i] = max_{for each instance in this sub-sequence}{input[i]}

Inherits from paddle::Layer

Public Types

enum MaxLevel

Values:

kNonSeq = 0
kSeq = 1

Public Functions

MaxLayer(const LayerConfig &config)
~MaxLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> biases_
IVectorPtr maxIndex_
int type_

SequenceLastInstanceLayer

class paddle::SequenceLastInstanceLayer

A layer for extracting the last instance of the input sequence.

  • Input: a sequence.
  • If SequenceLevel = kNonSeq:
    • Output: a sequence containing only the last instance of the input sequence.
  • If SequenceLevel = kSeq:
    • The input sequence must have sub-sequences.
    • Output: a sequence containing only the last instance of each sub-sequence of the input sequence.

Inherits from paddle::Layer

Public Functions

SequenceLastInstanceLayer(const LayerConfig &config)
~SequenceLastInstanceLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Types

enum SequenceLevel

Values:

kNonSeq = 0
kSeq = 1

Protected Attributes

std::unique_ptr<Weight> biases_
MatrixPtr tmpSrc_
MatrixPtr tmpDest_
int type_

Concat

ConcatenateLayer

class paddle::ConcatenateLayer

A concatenate layer has multiple input layers. It concatenates the rows of each input into one row for the output of this layer and applies the activation.

Inherits from paddle::Layer

Public Functions

ConcatenateLayer(const LayerConfig &config)
~ConcatenateLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

ConcatenateLayer2

class paddle::ConcatenateLayer2

The concat2 layer is like the concat layer, but each input layer is processed by a Projection.

Inherits from paddle::Layer

Public Functions

ConcatenateLayer2(const LayerConfig &config)
~ConcatenateLayer2()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::vector<std::unique_ptr<Projection>> projections_
std::vector<Argument> projOutput_
std::vector<std::pair<size_t, size_t>> projCol_

SequenceConcatLayer

class paddle::SequenceConcatLayer

A layer for concatenating two sequences, with the second sequence following the first.

  • Input: two sequences, each containing some instances.
  • Output: a concatenated sequence of the two input sequences.

Inherits from paddle::Layer

Public Functions

SequenceConcatLayer(const LayerConfig &config)
~SequenceConcatLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> biases_

Subset

SubSequenceLayer

class paddle::SubSequenceLayer

A layer for taking the subsequence according to the given offset and size.

  • Input: original sequence, offset, size.
  • Output: subsequence.

Inherits from paddle::Layer

Public Functions

SubSequenceLayer(const LayerConfig &config)
~SubSequenceLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> biases_
MatrixPtr tmpSrc_
MatrixPtr tmpDest_

Reshaping Layers

BlockExpandLayer

class paddle::BlockExpandLayer

Expand feature map to minibatch matrix.

  • matrix width is: blockH_ * blockW_ * channels_

  • matrix height is: outputH_ * outputW_

    \[\begin{split} outputH\_ = 1 + (2 * paddingH\_ + imgSizeH\_ - blockH\_ + strideH\_ - 1) / strideH\_ \\ outputW\_ = 1 + (2 * paddingW\_ + imgSizeW\_ - blockW\_ + strideW\_ - 1) / strideW\_ \end{split}\]
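
The shape arithmetic can be sketched as below. This is an illustrative sketch that assumes the division in the formula is integer (floor) division; BlockExpandShape, outputSize, and blockExpandShape are hypothetical names, not Paddle functions.

```cpp
#include <cassert>
#include <cstddef>

struct BlockExpandShape {
  std::size_t outputH;    // block positions along the height
  std::size_t outputW;    // block positions along the width
  std::size_t timeSteps;  // matrix height: outputH * outputW
  std::size_t stepDim;    // matrix width: blockH * blockW * channels
};

// One spatial dimension: 1 + (2 * padding + imgSize - block + stride - 1) / stride.
static std::size_t outputSize(std::size_t imgSize, std::size_t block,
                              std::size_t stride, std::size_t padding) {
  assert(stride > 0);
  assert(2 * padding + imgSize >= block);  // guards against unsigned underflow
  return 1 + (2 * padding + imgSize - block + stride - 1) / stride;
}

BlockExpandShape blockExpandShape(std::size_t imgSizeH, std::size_t imgSizeW,
                                  std::size_t blockH, std::size_t blockW,
                                  std::size_t strideH, std::size_t strideW,
                                  std::size_t paddingH, std::size_t paddingW,
                                  std::size_t channels) {
  BlockExpandShape s;
  s.outputH = outputSize(imgSizeH, blockH, strideH, paddingH);
  s.outputW = outputSize(imgSizeW, blockW, strideW, paddingW);
  s.timeSteps = s.outputH * s.outputW;
  s.stepDim = blockH * blockW * channels;
  return s;
}
```

For a 4 x 4 image with 3 channels, 2 x 2 blocks, stride 2, and no padding, this gives 2 x 2 = 4 time steps of dimension 12.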

The expand method is the same as in ExpandConvLayer, but the transposed value is saved. After expanding, output_.sequenceStartPositions stores the timeline. The number of time steps is outputH_ * outputW_, and the dimension of each time step is blockH_ * blockW_ * channels_. This layer can be used after a convolutional neural network and before a recurrent neural network.

The config file api is block_expand_layer.

Inherits from paddle::Layer

Public Functions

BlockExpandLayer(const LayerConfig &config)
~BlockExpandLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Functions

size_t getBlockNum()

Calculate outputH_ and outputW_ and return the number of blocks, which is the number of time steps.

Return
time steps, outputH_ * outputW_.

Protected Attributes

size_t blockH_
size_t blockW_
size_t strideH_
size_t strideW_
size_t paddingH_
size_t paddingW_
size_t imgSizeH_
size_t imgSizeW_
size_t outputH_
size_t outputW_
size_t channels_
MatrixPtr outVTrans_

auxiliary variable, which saves the transposed output value.

ExpandLayer

class paddle::ExpandLayer

A layer that expands dense data (or sequence data where the length of each sequence is one) to sequence data.

It should have exactly two inputs, one for data and one for size:

  • first one for data
    • If ExpandLevel = kNonSeq: dense data
    • If ExpandLevel = kSeq: sequence data where the length of each sequence is one
  • second one only for sequence info
    • should be sequence data with or without sub-sequence.

The output size is the batch size (not the number of instances) of the second input.

The config file api is expand_layer.

Inherits from paddle::Layer

Public Functions

ExpandLayer(const LayerConfig &config)
~ExpandLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Types

enum ExpandLevel

if input[0] is dense data, ExpandLevel=kNonSeq; if input[0] is sequence data, ExpandLevel=kSeq

Values:

kNonSeq = 0
kSeq = 1

Protected Attributes

std::unique_ptr<Weight> biases_
int type_

store the ExpandLevel

ICpuGpuVectorPtr expandStartsPos_

expanded sequenceStartPositions or subSequenceStartPositions of input[1]

FeatureMapExpandLayer

class paddle::FeatureMapExpandLayer

A layer for expanding a batch of images to feature maps. Each input sample is a two-dimensional matrix. Each element of the matrix is replicated num_filters times to create a feature map with num_filters channels.

  • Input: the input should be dense image data.
  • Output: expanded feature maps.
    \[ y.row[i] = x.row[i \mod x.width], i = 0,1,..., (x.width * num\_filters - 1) \]
    For example, num_filters = 4:
    x = [a1,a2;
         b1,b2]
    y = [a1, a2, a1, a2, a1, a2, a1, a2;
         b1, b2, b1, b2, b1, b2, b1, b2;]
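
The replication in the example above can be sketched per input row as follows. This is a minimal illustration; expandRow is a hypothetical helper that tiles one row num_filters times, matching the index rule i mod x.width.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tiles one input row numFilters times: y[i] = x[i % x.size()].
std::vector<float> expandRow(const std::vector<float>& x, std::size_t numFilters) {
  assert(!x.empty());
  std::vector<float> y(x.size() * numFilters);
  for (std::size_t i = 0; i < y.size(); ++i) y[i] = x[i % x.size()];
  return y;
}
```

Applied to the row [a1, a2] with num_filters = 4, this produces the eight-element row shown in the example.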
    

Inherits from paddle::Layer

Public Functions

FeatureMapExpandLayer(const LayerConfig &config)
~FeatureMapExpandLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

ResizeLayer

class paddle::ResizeLayer

A layer for resizing a minibatch matrix h*w to h’*w’.

Note
The original matrix is (height * width); the resized matrix is (height * width / size) * size.

Inherits from paddle::Layer

Public Functions

ResizeLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

SequenceReshapeLayer

class paddle::SequenceReshapeLayer

A layer for reshaping the sequence.

  • Input: a sequence.
  • Output: a sequence.

Inherits from paddle::Layer

Public Functions

SequenceReshapeLayer(const LayerConfig &config)
~SequenceReshapeLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<Weight> biases_
MatrixPtr reshapedOutputGrad

Math Layers

AddtoLayer

class paddle::AddtoLayer

This layer simply adds all input layers together and then applies the activation to the sum. All inputs of this layer should have the same size, which is also the output size of this layer.

\[ y=f(\sum_{i}x_i + b) \]
where \(y\) is output, \(x\) is input, \(b\) is bias, and \(f\) is activation function.

The config file api is addto_layer.

Inherits from paddle::Layer

Public Functions

AddtoLayer(const LayerConfig &config)
~AddtoLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization of AddtoLayer.

virtual void forward(PassType passType)

Forward propagation.

Note
There is no weight matrix for each input, because it is just a simple add operation.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation.

Protected Attributes

std::unique_ptr<Weight> biases_

ConvexCombinationLayer

class paddle::ConvexCombinationLayer

A layer for weighted sum of vectors, which is used in NEURAL MACHINE TRANSLATION BY JOINTLY LEARNING TO ALIGN AND TRANSLATE.

  • Input: the size of the first input is weightDim, and the size of the second input is weightDim * dataDim.
  • Output: the size of the output is dataDim.
    \[ out(j) = \sum_{i}(in0(i) * in1(i,j + i * dataDim)), i = 0,1,...,(weightDim-1); j = 0, 1,...,(dataDim-1) \]
    Note that the above computation is for one sample. Multiple samples are processed in one batch.
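
The weighted sum for one sample can be sketched as follows. This is an illustrative sketch: convexCombination is a hypothetical helper, and the second input is assumed to be stored as a flat row in which the i-th vector occupies elements [i * dataDim, (i + 1) * dataDim).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// out[j] = sum_i weights[i] * data[i * dataDim + j]
std::vector<float> convexCombination(const std::vector<float>& weights,
                                     const std::vector<float>& data,
                                     std::size_t dataDim) {
  assert(data.size() == weights.size() * dataDim);
  std::vector<float> out(dataDim, 0.0f);
  for (std::size_t i = 0; i < weights.size(); ++i)
    for (std::size_t j = 0; j < dataDim; ++j)
      out[j] += weights[i] * data[i * dataDim + j];
  return out;
}
```

With weights {0.25, 0.75} and the two vectors (1, 2) and (3, 4), the result is (2.5, 3.5).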

The config file api is linear_comb_layer.

Inherits from paddle::Layer

Public Functions

ConvexCombinationLayer(const LayerConfig &config)
~ConvexCombinationLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpMtx0

A matrix pointer pointing to second input.

MatrixPtr tmpRow0

A matrix pointer pointing to first input.

MatrixPtr tmpRow1

A matrix pointer pointing to output.

InterpolationLayer

class paddle::InterpolationLayer

A layer for linear interpolation with two inputs, which is used in NEURAL TURING MACHINE.

\[ y.row[i] = w[i] * x_1.row[i] + (1 - w[i]) * x_2.row[i] \]
where \(x_1\) and \(x_2\) are two (batchSize x dataDim) inputs, \(w\) is (batchSize x 1) weight vector, and \(y\) is (batchSize x dataDim) output.
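
The row-wise interpolation can be sketched as below. This is a minimal illustration on std::vector data; interpolate is a hypothetical helper and not the Paddle implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// y.row[i] = w[i] * x1.row[i] + (1 - w[i]) * x2.row[i]
Matrix interpolate(const std::vector<float>& w, const Matrix& x1, const Matrix& x2) {
  assert(w.size() == x1.size() && x1.size() == x2.size());
  Matrix y(x1.size());
  for (std::size_t i = 0; i < x1.size(); ++i) {
    assert(x1[i].size() == x2[i].size());
    y[i].resize(x1[i].size());
    for (std::size_t j = 0; j < x1[i].size(); ++j)
      y[i][j] = w[i] * x1[i][j] + (1.0f - w[i]) * x2[i][j];
  }
  return y;
}
```

A weight of 1 returns the row of the first input unchanged, and a weight of 0 returns the row of the second input.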

The config file api is interpolation_layer.

Inherits from paddle::Layer

Public Functions

InterpolationLayer(const LayerConfig &config)
~InterpolationLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr weightLast_

weightLast = 1 - weight

MatrixPtr tmpMatrix

MultiplexLayer

class paddle::MultiplexLayer

This layer multiplexes multiple layers according to the index, which is provided by the first input layer.

  • Input[0]: the index of the layer to output, of size batchSize.
  • Input[1:N]: the candidate output data. For each index i from 0 to batchSize - 1, the output is the i-th row of the (index[i] + 1)-th layer.

For each i-th row of output:

\[ y[i][j] = x_{x_{0}[i] + 1}[i][j], j = 0,1, ... , (x_{1}.width - 1) \]
where \(y\) is the output, \(x_{k}\) is the k-th input layer, and \(k = x_{0}[i] + 1\).
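
The row selection can be sketched as follows. This is an illustrative sketch: multiplex is a hypothetical helper, and candidates[k] stands for input layer k + 1, so the index values from Input[0] are used directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// y.row[i] = candidates[index[i]].row[i]
Matrix multiplex(const std::vector<int>& index, const std::vector<Matrix>& candidates) {
  Matrix y(index.size());
  for (std::size_t i = 0; i < index.size(); ++i) {
    assert(index[i] >= 0 && static_cast<std::size_t>(index[i]) < candidates.size());
    const Matrix& chosen = candidates[static_cast<std::size_t>(index[i])];
    assert(i < chosen.size());
    y[i] = chosen[i];  // copy the i-th row of the selected candidate
  }
  return y;
}
```

This sketch copies one row at a time and does not batch consecutive rows that select the same layer.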

Inherits from paddle::Layer

Public Functions

MultiplexLayer(const LayerConfig &config)
~MultiplexLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::vector<CopyInfo> copySchedule_

A list of CopyInfo used to save copy information.

MatrixPtr tmpSrc_

Temporary matrix pointer to point to input data.

MatrixPtr tmpDest_

Temporary matrix pointer to point to output data.

struct CopyInfo

A struct used to save the copy information, including the input layer index and the copy size.

Public Functions

CopyInfo(int inStartIdx, int inLength, int inCopyIdx)

Public Members

int startIdx

The start row of input.

int length

Number of rows. If the layer index in Input[0] is not consecutive, the length is one. Otherwise, the length is > 1 and multiple rows are copied at once.

int copyIdx

The copied layer index, which needs to add 1.

OuterProdLayer

class paddle::OuterProdLayer

A layer for computing the outer product of two vectors.

Note
Used in NEURAL TURING MACHINE.

  • Input1: vector (batchSize * dim1)
  • Input2: vector (batchSize * dim2)
  • Output: a matrix (batchSize * (dim1*dim2))
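
The outer product for one sample can be sketched as below. This is a minimal illustration: outerProduct is a hypothetical helper, and the row-major flattening of the dim1 x dim2 result is an assumption about the layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the outer product of a (dim1) and b (dim2) flattened row-major:
// out[i * dim2 + j] = a[i] * b[j].
std::vector<float> outerProduct(const std::vector<float>& a, const std::vector<float>& b) {
  std::vector<float> out(a.size() * b.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    for (std::size_t j = 0; j < b.size(); ++j)
      out[i * b.size() + j] = a[i] * b[j];
  assert(out.size() == a.size() * b.size());
  return out;
}
```

For a batch, the same computation is applied to each pair of rows, giving a batchSize x (dim1*dim2) output.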

Inherits from paddle::Layer

Public Functions

OuterProdLayer(const LayerConfig &config)
~OuterProdLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpMtx0
MatrixPtr tmpRow0
MatrixPtr tmpRow1

PowerLayer

class paddle::PowerLayer

This layer applies a power function to a vector element-wise, which is used in NEURAL TURING MACHINE.

\[ y = x^w \]
where \(x\) is an input vector, \(w\) is a scalar weight, and the output \(y\) is a vector.

The config file api is power_layer.

Inherits from paddle::Layer

Public Functions

PowerLayer(const LayerConfig &config)
~PowerLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpMtx

ScalingLayer

class paddle::ScalingLayer

A layer that multiplies each row of a matrix by an element of a vector, which is used in NEURAL TURING MACHINE.

\[ y.row[i] = w[i] * x.row[i] \]
where \(x\) is (batchSize x dataDim) input, \(w\) is (batchSize x 1) weight vector, and \(y\) is (batchSize x dataDim) output.

The config file api is scaling_layer.

Inherits from paddle::Layer

Public Functions

ScalingLayer(const LayerConfig &config)
~ScalingLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

SlopeInterceptLayer

class paddle::SlopeInterceptLayer

A layer for applying a slope and an intercept to the input element-wise. This layer is used in the Neural Turing Machine.

\[ y = ax + b \]
Note
There is no activation and weight in this layer.

Here, a is scale and b is offset, which are provided as attributes of the layer.

The config file api is slope_intercept_layer.

Inherits from paddle::Layer

Public Functions

SlopeInterceptLayer(const LayerConfig &config)
~SlopeInterceptLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

TensorLayer

class paddle::TensorLayer

TensorLayer takes two input vectors.

\[ y_{i} = x_{1} * W_{i} * x_{2}^{\rm T}, i=0, 1, ...,K-1 \]

  • \(x_{1}\): the first input, size is M.
  • \(x_{2}\): the second input, size is N.
  • y: output, size is K.
  • \(y_{i}\): i-th element of y.
  • \(W_{i}\): the i-th learned weight, dimensions: [M, N].
  • \(x_{2}^{\rm T}\): the transpose of \(x_{2}\).

The config file api is tensor_layer.
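
The bilinear form above can be written out as a triple loop. This is only a sketch for a single pair of input vectors: the name tensorForward and the row-major layout of each \(W_{i}\) are assumptions, and the bias (biases_) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of paddle::TensorLayer.
// x1 has size M, x2 has size N, and W[k] is an (M x N) matrix stored
// row-major. Returns y of size K with y[k] = x1 * W[k] * x2^T.
std::vector<double> tensorForward(const std::vector<double>& x1,
                                  const std::vector<double>& x2,
                                  const std::vector<std::vector<double>>& W) {
  const std::size_t M = x1.size();
  const std::size_t N = x2.size();
  std::vector<double> y(W.size(), 0.0);
  for (std::size_t k = 0; k < W.size(); ++k) {
    assert(W[k].size() == M * N);
    for (std::size_t m = 0; m < M; ++m) {
      for (std::size_t n = 0; n < N; ++n) {
        y[k] += x1[m] * W[k][m * N + n] * x2[n];
      }
    }
  }
  return y;
}
```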

Inherits from paddle::Layer

Public Functions

TensorLayer(const LayerConfig &config)
~TensorLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

Weight &getWeight(int idx)
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

WeightList weights_
std::unique_ptr<Weight> biases_

TransLayer

class paddle::TransLayer

A layer for transposition.

\[ y = x^\mathrm{T} \]
where \(x\) is (M x N) input, and \(y\) is (N x M) output.

The config file api is trans_layer.

Inherits from paddle::Layer

Public Functions

TransLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Sampling Layers

MultinomialSampler

class paddle::MultinomialSampler

Given the probabilities of N objects, the sampler randomly selects one of the objects.

The space requirement is O(N) = O(N * sizeof(Interval)). The computational complexity of generating one sample is O(1).

Note
prob does not have to be normalized.
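
O(1) sampling with O(N) space is the characteristic of the alias method, and the Interval members (otherId, thresh) are consistent with an alias table. The sketch below shows the standard alias construction under that assumption; it is not the actual MultinomialSampler code, and buildAliasTable, sampleAlias, and the local Interval struct are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Illustrative stand-in for the sampler's Interval entries.
struct Interval {
  int otherId;    // outcome returned when the draw falls above thresh
  double thresh;  // fraction of the slot that belongs to the slot's own index
};

// Builds an alias table from (possibly unnormalized) weights.
std::vector<Interval> buildAliasTable(const std::vector<double>& prob) {
  const int n = static_cast<int>(prob.size());
  assert(n > 0);
  double sum = 0.0;
  for (double p : prob) sum += p;
  assert(sum > 0.0);

  // Rescale so that the average weight is exactly 1.
  std::vector<double> scaled(prob.size());
  std::vector<int> small, large;
  std::vector<Interval> table(prob.size());
  for (int i = 0; i < n; ++i) {
    scaled[i] = prob[i] * n / sum;
    table[i] = {i, 1.0};  // default: the slot always returns itself
    if (scaled[i] < 1.0) {
      small.push_back(i);
    } else {
      large.push_back(i);
    }
  }
  // Top up each underfull slot with mass taken from an overfull one.
  while (!small.empty() && !large.empty()) {
    const int s = small.back();
    small.pop_back();
    const int l = large.back();
    table[s] = {l, scaled[s]};
    scaled[l] -= 1.0 - scaled[s];
    if (scaled[l] < 1.0) {
      large.pop_back();
      small.push_back(l);
    }
  }
  return table;
}

// Draws one sample in O(1): pick a slot uniformly, then compare the
// fractional part of the draw against the slot's threshold.
template <typename URNG>
int sampleAlias(const std::vector<Interval>& table, URNG& g) {
  assert(!table.empty());
  const int n = static_cast<int>(table.size());
  std::uniform_real_distribution<double> rand(0.0, static_cast<double>(n));
  const double r = rand(g);
  int slot = static_cast<int>(r);
  if (slot >= n) slot = n - 1;  // guard against r rounding up to n
  return (r - slot) < table[slot].thresh ? slot : table[slot].otherId;
}
```

With weights {1, 3}, slot 0 keeps index 0 for half of its width and hands the rest to index 1, so index 0 is drawn with probability 1/2 * 1/2 = 1/4.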

Public Functions

MultinomialSampler(const real *prob, int size)
template <typename URNG>
int gen(URNG &g)

Generate a random sample.

Return
Random integer.
Parameters
  • g - a random number engine. See <random>.

Protected Functions

template <typename Rand>
int gen1(Rand rand)

Generation.

Return
random int number or intervals_[random_int_number].otherId.
Parameters
  • rand - a real random number distribution for the range [0, size).

Protected Attributes

std::vector<Interval> intervals_

The probability of each interval will be 1./size.

std::uniform_real_distribution<double> rand_
struct Interval

Public Members

int otherId
real thresh

MaxIdLayer

class paddle::MaxIdLayer

A layer for finding the id which has the maximal value for each sample. The result is stored in output_.ids.

The config file api is maxid_layer.
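
The per-sample argmax can be sketched as below. The name maxIdForward and the row-major input layout are assumptions for this example; ties are resolved in favor of the first maximal element, which may differ from the layer's behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of paddle::MaxIdLayer.
// value holds one row of `width` scores per sample (row-major).
// Returns, for each sample, the column index of its largest score.
std::vector<int> maxIdForward(const std::vector<double>& value,
                              std::size_t width) {
  assert(width > 0 && value.size() % width == 0);
  std::vector<int> ids;
  for (std::size_t start = 0; start < value.size(); start += width) {
    auto first = value.begin() + static_cast<std::ptrdiff_t>(start);
    auto last = first + static_cast<std::ptrdiff_t>(width);
    auto best = std::max_element(first, last);
    ids.push_back(static_cast<int>(best - first));
  }
  return ids;
}
```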

Inherits from paddle::Layer

Public Functions

MaxIdLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

SamplingIdLayer

class paddle::SamplingIdLayer

A layer for sampling ids from the multinomial distribution given by the input layer. One id is sampled for each sample. The result is stored in output_.ids.

The config file api is sampling_id_layer.

Inherits from paddle::Layer

Public Functions

SamplingIdLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

void forwardImp(const Argument &input)
virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

Cost Layers

CostLayer

class paddle::CostLayer

Base class for a particular type of cost layer. This type of cost should have one data layer, one label layer and an optional weight layer as input. The derived class should implement forwardImp() and backwardImp(), which calculate the cost for data and label. The weight is automatically handled by the base class.

Inherits from paddle::Layer

Subclassed by paddle::HuberTwoClass, paddle::MultiBinaryLabelCrossEntropy, paddle::MultiClassCrossEntropy, paddle::MultiClassCrossEntropyWithSelfNorm, paddle::SoftBinaryClassCrossEntropy, paddle::SumOfSquaresCostLayer

Public Functions

CostLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

LayerPtr getOutputLayer()
LayerPtr getLabelLayer()
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

virtual void forwardImp(Matrix &outputValue, Argument &label, Matrix &cost) = 0
virtual void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad) = 0

Protected Attributes

LayerPtr weightLayer_
real coeff_

HuberTwoClass

class paddle::HuberTwoClass

Huber loss for robust two-class classification.

For label={0, 1}, let y=2*label-1. Given output f, the loss is:

\[\begin{split} Loss = \left\{\begin{matrix} -4 * y * f & \textit{if} \ \ y* f < -1 \\ (1 - y * f)^2 & \textit{if} \ \ -1 \le y * f < 1 \\ 0 & \textit{otherwise} \end{matrix}\right. \end{split}\]
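
A per-sample sketch of this piecewise loss follows. It uses the sign convention of the standard modified Huber loss, in which the linear branch is -4 * y * f, so that the loss is non-negative and continuous at y * f = -1. The function name huberTwoClassLoss is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper, not part of paddle::HuberTwoClass.
// label is 0 or 1 and f is the raw output for one sample.
double huberTwoClassLoss(int label, double f) {
  assert(label == 0 || label == 1);
  const double y = 2.0 * label - 1.0;  // map {0, 1} to {-1, +1}
  const double a = y * f;
  if (a < -1.0) {
    return -4.0 * a;  // linear branch for badly misclassified samples
  }
  if (a < 1.0) {
    return (1.0 - a) * (1.0 - a);  // quadratic branch near the margin
  }
  return 0.0;  // correctly classified with margin at least 1
}
```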

Inherits from paddle::CostLayer

Public Functions

HuberTwoClass(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forwardImp(Matrix &output, Argument &label, Matrix &cost)
void forwardImpIn(Matrix &output, Argument &label, Matrix &cost)
virtual void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)
void backwardImpIn(Matrix &outputValue, Argument &label, Matrix &outputGrad)

LambdaCost

class paddle::LambdaCost

LambdaRank is a method for learning arbitrary information retrieval measures. It can be applied to any algorithm that learns through gradient descent. LambdaRank is a listwise method, in that the cost depends on the sorted order of the documents. LambdaRank gives the gradient of the cost function:

\[ \lambda_{ij} = \frac{1}{1 + e^{o_i - o_j}} \left| \Delta_{NDCG} \right| \]

[1] Christopher J.C. Burges, Robert Ragno, Quoc Viet Le. Learning to Rank with Nonsmooth Cost Functions.

Inherits from paddle::Layer

Public Functions

LambdaCost(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

LayerPtr getOutputLayer()
LayerPtr getScoreLayer()
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

virtual void onPassEnd()

One pass is finished.

real calcNDCG(const real *outputScore, const real *score, int size)
void calcGrad(const real *outputScore, const real *score, real *gradData, int size)

MultiBinaryLabelCrossEntropy

class paddle::MultiBinaryLabelCrossEntropy

Cross entropy for multi binary labels.

\[ cost[i] = -sum(label[i][j]*log(output[i][j]) + (1-label[i][j])*log(1-output[i][j])) \]
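
For one sample i, the sum over j can be sketched as below. The name multiBinaryLabelCrossEntropy is hypothetical, labels are assumed to be exactly 0 or 1, and outputs are assumed to lie strictly between 0 and 1. Branching on the label avoids evaluating 0 * log(0).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of paddle::MultiBinaryLabelCrossEntropy.
// label[j] is 0 or 1 and output[j] is the predicted probability of
// label j for a single sample. Returns that sample's cost.
double multiBinaryLabelCrossEntropy(const std::vector<int>& label,
                                    const std::vector<double>& output) {
  assert(label.size() == output.size());
  double cost = 0.0;
  for (std::size_t j = 0; j < label.size(); ++j) {
    assert(label[j] == 0 || label[j] == 1);
    // Only one of the two terms is non-zero for a binary label.
    cost -= label[j] == 1 ? std::log(output[j]) : std::log(1.0 - output[j]);
  }
  return cost;
}
```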

Inherits from paddle::CostLayer

Public Functions

MultiBinaryLabelCrossEntropy(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forwardImp(Matrix &output, Argument &label, Matrix &cost)
virtual void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)

Protected Attributes

MatrixPtr targetPerDim_

MultiClassCrossEntropy

class paddle::MultiClassCrossEntropy

The cross-entropy loss for multi-class classification tasks. The loss function is:

\[ L = - \sum_{k}{t_{k} * log(P(y=k))} \]

Inherits from paddle::CostLayer

Public Functions

MultiClassCrossEntropy(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forwardImp(Matrix &output, Argument &label, Matrix &cost)
virtual void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)

MultiClassCrossEntropyWithSelfNorm

class paddle::MultiClassCrossEntropyWithSelfNorm

The cross-entropy with self-normalization for multi-class classification.

The loss function is:

\[ L = \sum_{i}[-log(P(x_{i})) + \alpha * (log(Z(x_{i})))^2] \]

The \(Z(x)\) is the softmax normalizer.

[1] Jacob Devlin, Rabih Zbib, Zhongqiang Huang, Thomas Lamar, Richard Schwartz, and John Makhoul. Fast and robust neural network joint models for statistical machine translation. In Proceedings of the ACL 2014 Conference.

Inherits from paddle::CostLayer

Public Functions

MultiClassCrossEntropyWithSelfNorm(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forwardImp(Matrix &output, Argument &label, Matrix &cost)
virtual void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)

Protected Attributes

MatrixPtr sftMaxSum_
MatrixPtr sumInv_

RankingCost

class paddle::RankingCost

A cost layer for the learning to rank (LTR) task. This layer contains at least three inputs.

\[\begin{split} C_{i,j} = -\tilde{P}_{i,j} * o_{i,j} + log(1 + e^{o_{i,j}}) \\ o_{i,j} = o_i - o_j \\ \tilde{P}_{i,j} \in \left \{0, 0.5, 1 \right \} \ or \ \left \{0, 1 \right \} \end{split}\]

[1] Chris Burges, Tal Shaked, Erin Renshaw, et al. Learning to Rank using Gradient Descent.
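
The pairwise cost for a single pair (i, j) can be sketched as follows. The name rankingCost is hypothetical, and no overflow protection is included, so very large score differences would need extra care.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of paddle::RankingCost.
// oi and oj are the scores of the two items in the pair, and pTilde is
// the target probability that item i ranks above item j.
double rankingCost(double oi, double oj, double pTilde) {
  assert(pTilde >= 0.0 && pTilde <= 1.0);
  const double o = oi - oj;
  // log1p(exp(o)) evaluates log(1 + e^o).
  return -pTilde * o + std::log1p(std::exp(o));
}
```

When the scores are equal the cost is log(2) for any target, and with target 1 the cost shrinks as o_i grows relative to o_j.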

Inherits from paddle::Layer

Public Functions

RankingCost(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

LayerPtr getOutputLayer(size_t i)
LayerPtr getLabelLayer()
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

virtual void onPassEnd()

One pass is finished.

void forwardImp(Matrix &output, Argument &label, Matrix &cost)
void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)

SoftBinaryClassCrossEntropy

class paddle::SoftBinaryClassCrossEntropy

The cross-entropy for soft binary class.

\[ L = \sum_i (\sum_j -y_j(i)*log(x_j(i))-(1-y_j(i))*log(1-x_j(i))) \]

Inherits from paddle::CostLayer

Public Functions

SoftBinaryClassCrossEntropy(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forwardImp(Matrix &output, Argument &label, Matrix &cost)
virtual void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)

Protected Attributes

MatrixPtr targetPerDim_

SumOfSquaresCostLayer

class paddle::SumOfSquaresCostLayer

This cost layer computes the Euclidean (L2) loss for real-valued regression tasks.

\[ L = \frac{1}{2N} \sum_{i=1}^N {|| \hat{y}_i - y_i||_2^2} \]
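
The batch formula above can be sketched directly. The name sumOfSquaresCost and the vector-of-rows representation are assumptions for illustration; the layer itself may store a per-sample cost rather than the batch average.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of paddle::SumOfSquaresCostLayer.
// pred[i] and target[i] are the predicted and true vectors of sample i.
// Returns 1/(2N) times the sum of squared L2 distances.
double sumOfSquaresCost(const std::vector<std::vector<double>>& pred,
                        const std::vector<std::vector<double>>& target) {
  assert(!pred.empty() && pred.size() == target.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < pred.size(); ++i) {
    assert(pred[i].size() == target[i].size());
    for (std::size_t j = 0; j < pred[i].size(); ++j) {
      const double d = pred[i][j] - target[i][j];
      sum += d * d;
    }
  }
  return sum / (2.0 * static_cast<double>(pred.size()));
}
```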

Inherits from paddle::CostLayer

Public Functions

SumOfSquaresCostLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forwardImp(Matrix &output, Argument &label, Matrix &cost)
virtual void backwardImp(Matrix &outputValue, Argument &label, Matrix &outputGrad)

CosSimLayer

class paddle::CosSimLayer

A layer for calculating the cosine similarity between two vectors.

\[ f(x,y)=scale\frac{x_1y_1+x_2y_2+...+x_ny_n}{\sqrt{x_1^2+x_2^2+... +x_n^2}\sqrt{y_1^2+y_2^2+...+y_n^2}} \]

  • Input1: A vector (batchSize * dataDim)
  • Input2: A vector (batchSize * dataDim) or (1 * dataDim)
  • Output: A vector (batchSize * 1)

The config file api is cos_sim.
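
For a single pair of vectors, the scaled cosine similarity can be sketched as below. The name cosSim is hypothetical, and the sketch simply asserts that neither vector is all zeros rather than reproducing how the layer treats that case.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of paddle::CosSimLayer.
// Returns scale * (x . y) / (||x|| * ||y||) for two equal-length vectors.
double cosSim(const std::vector<double>& x, const std::vector<double>& y,
              double scale) {
  assert(x.size() == y.size());
  double xy = 0.0, xx = 0.0, yy = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    xy += x[i] * y[i];
    xx += x[i] * x[i];
    yy += y[i] * y[i];
  }
  assert(xx > 0.0 && yy > 0.0);  // the ratio is undefined for zero vectors
  return scale * xy / (std::sqrt(xx) * std::sqrt(yy));
}
```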

Inherits from paddle::Layer

Public Functions

CosSimLayer(const LayerConfig &config)
~CosSimLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

CosSimVecMatLayer

class paddle::CosSimVecMatLayer

A layer for computing the cosine similarity between a vector and each row of a matrix: out[i] = cos_scale * cos(in1, in2(i,:)).

  • Input1: a vector (batchSize * dataDim)
  • Input2: a matrix in vector form (batchSize * (weightDim*dataDim))
  • Output: a vector (batchSize * weightDim)

Note
Used in the Neural Turing Machine.

Inherits from paddle::Layer

Public Functions

CosSimVecMatLayer(const LayerConfig &config)
~CosSimVecMatLayer()
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

MatrixPtr tmpMtx0
MatrixPtr tmpMtx1
MatrixPtr tmpRow0
MatrixPtr tmpRow1
MatrixPtr tmpRow2
MatrixPtr tmpRow3

CRFDecodingLayer

class paddle::CRFDecodingLayer

A layer for calculating the decoding sequence of a sequential conditional random field model. The decoding sequence is stored in output_.ids. It also calculates the error: output_.value[i] is 1 for incorrect decoding or 0 for correct decoding. See LinearChainCRF.h for the details of the CRF formulation.

Inherits from paddle::CRFLayer

Public Functions

CRFDecodingLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

std::unique_ptr<LinearChainCRF> crf_

CRFLayer

class paddle::CRFLayer

A layer for calculating the cost of a sequential conditional random field model. See class LinearChainCRF for the details of the CRF formulation.

Inherits from paddle::Layer

Subclassed by paddle::CRFDecodingLayer

Public Functions

CRFLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

Protected Attributes

size_t numClasses_
ParameterPtr parameter_
std::vector<LinearChainCRF> crfs_
LayerPtr weightLayer_
real coeff_

CTCLayer

class paddle::CTCLayer

Inherits from paddle::Layer

Public Functions

CTCLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

void forwardImp(const Argument &softmaxSeqs, const Argument &labelSeqs)
virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

void backwardImp(const UpdateCallback &callback, const Argument &softmaxSeqs, const Argument &labelSeqs)

Protected Attributes

size_t numClasses_
bool normByTimes_
std::vector<LinearChainCTC> ctcs_
std::vector<Argument> tmpCpuInput_

HierarchicalSigmoidLayer

class paddle::HierarchicalSigmoidLayer

Organize the classes into a binary tree. At each node, a sigmoid function is used to calculate the probability of belonging to the right branch. This idea is from “F. Morin, Y. Bengio (AISTATS 05): Hierarchical Probabilistic Neural Network Language Model.”

Here we use a simple way of making the binary tree. Assuming the number of classes C = 6, the classes are organized as a binary tree in the following way:

*-*-*- 2
| | |- 3
| |
| |-*- 4
|   |- 5
|
|-*- 0
|- 1

where * indicates an internal node, and each leaf node represents a class.

  • Node 0 ... C-2 are internal nodes.
  • Node C-1 ... 2C-2 are leaf nodes.
  • Class c is represented by leaf node \(c+C-1\).

We assign an id for each node:

  • the id of the root is 0.
  • the left child of a node i is 2*i+1.
  • the right child of a node i is 2*i+2.

It’s easy to see that:

  • the parent of node i is \(\left\lfloor(i-1)/2\right\rfloor\).
  • the j-th level ancestor of node i is \(\left\lfloor(i+1)/2^{j+1}\right\rfloor - 1\).
  • A node i is a left child of its parent if \((i-1)\%2==0\).
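
The node numbering rules above translate into a few integer helpers. The function names are hypothetical and the helpers only illustrate the arithmetic; they are not taken from the layer's implementation.

```cpp
#include <cassert>

// Hypothetical helpers illustrating the heap-style node numbering.

// Leaf node id that represents class c when there are numClasses classes.
int leafNodeOfClass(int c, int numClasses) {
  assert(c >= 0 && c < numClasses);
  return c + numClasses - 1;
}

// Parent of node i: floor((i - 1) / 2). The root (id 0) has no parent.
int parentOf(int i) {
  assert(i > 0);
  return (i - 1) / 2;
}

// j-th level ancestor of node i: floor((i + 1) / 2^(j + 1)) - 1.
// j = 0 gives the parent; the result is -1 once j goes past the root.
int ancestorOf(int i, int j) {
  assert(i >= 0 && j >= 0);
  return ((i + 1) >> (j + 1)) - 1;
}

// A node is a left child of its parent if (i - 1) is even.
bool isLeftChild(int i) {
  assert(i > 0);
  return (i - 1) % 2 == 0;
}
```

For C = 6, class 2 maps to leaf node 7, whose ancestors are nodes 3, 1 and 0, matching the tree drawn above.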

The config file api is hsigmod_layer.

Inherits from paddle::Layer

Public Functions

HierarchicalSigmoidLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

Protected Functions

LayerPtr getLabelLayer()

The last of inputs is label layer.

Protected Attributes

WeightList weights_
std::unique_ptr<Weight> biases_
size_t numClasses_

number of classes

int codeLength_

codeLength_ = \(1 + \left\lfloor log_{2}(numClasses-1)\right\rfloor\)

Argument preOutput_

temporary result of output_

LinearChainCRF

class paddle::LinearChainCRF

Public Functions

LinearChainCRF(int numClasses, real *para, real *grad)

The size of para and grad must be \((numClasses + 2) * numClasses\). The first numClasses values of para are the starting weights (\(a\)). The next numClasses values of para are the ending weights (\(b\)). The remaining values are the transition weights (\(w\)).

The probability of a state sequence s of length \(L\) is defined as: \(P(s) = (1/Z) exp(a_{s_1} + b_{s_L} + \sum_{l=1}^L x_{s_l} + \sum_{l=2}^L w_{s_{l-1},s_l})\) where \(Z\) is a normalization value so that the sum of \(P(s)\) over all possible sequences is \(1\), and \(x\) is the input feature to the CRF.
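
The exponent in this definition, that is the unnormalized log score of a sequence, can be sketched as below. The name crfSequenceScore is hypothetical, and the sketch assumes that the transition weights are stored row-major and indexed as w[previous state][next state], which the text above does not specify. Computing \(Z\) is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of paddle::LinearChainCRF.
// para is laid out as [a (numClasses) | b (numClasses) | w (numClasses^2)],
// x holds numClasses features per time step, and s is the state sequence.
// Returns a_{s_1} + b_{s_L} + sum_l x_{s_l} + sum_l w_{s_{l-1}, s_l}.
double crfSequenceScore(const std::vector<double>& para,
                        const std::vector<double>& x,
                        const std::vector<int>& s, int numClasses) {
  const std::size_t n = static_cast<std::size_t>(numClasses);
  const std::size_t length = s.size();
  assert(numClasses > 0 && length > 0);
  assert(para.size() == (n + 2) * n);
  assert(x.size() == length * n);

  const double* a = para.data();
  const double* b = a + n;
  const double* w = b + n;  // assumed layout: w[prev * numClasses + next]

  double score = a[s[0]] + b[s[length - 1]];
  for (std::size_t l = 0; l < length; ++l) {
    assert(s[l] >= 0 && s[l] < numClasses);
    score += x[l * n + static_cast<std::size_t>(s[l])];
  }
  for (std::size_t l = 1; l < length; ++l) {
    score += w[static_cast<std::size_t>(s[l - 1]) * n +
               static_cast<std::size_t>(s[l])];
  }
  return score;
}
```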

real forward(real *x, int *s, int length)

Calculate the negative log likelihood of s given x. The size of x must be length * numClasses. Each consecutive numClasses values are the features for one time step.

void backward(real *x, real *dx, int *s, int length)

Calculate the gradient with respect to x, a, b, and w. The gradient of x will be stored in dx. backward() can only be called after a corresponding call to forward() with the same x, s and length.

Note
The gradient is added to dx and grad (provided at constructor).

void decode(real *x, int *s, int length)

Find the most probable sequence given x. The result will be stored in s.

Protected Attributes

int numClasses_
MatrixPtr a_
MatrixPtr b_
MatrixPtr w_
MatrixPtr da_
MatrixPtr db_
MatrixPtr dw_
MatrixPtr ones_
MatrixPtr expX_
MatrixPtr alpha_
MatrixPtr beta_
MatrixPtr maxX_
MatrixPtr expW_
IVectorPtr track_

LinearChainCTC

class paddle::LinearChainCTC

Public Functions

LinearChainCTC(int numClasses, bool normByTimes)
real forward(real *softmaxSeq, int softmaxSeqLen, int *labelSeq, int labelSeqLen)
void backward(real *softmaxSeq, real *softmaxSeqGrad, int *labelSeq, int labelSeqLen)

Protected Functions

void segmentRange(int &start, int &end, int time)

Protected Attributes

int numClasses_
int blank_
int totalSegments_
int totalTime_
bool normByTimes_
bool isInvalid_
MatrixPtr logActs_
MatrixPtr forwardVars_
MatrixPtr backwardVars_
MatrixPtr gradTerms_
real logProb_

NCELayer

class paddle::NCELayer

Noise-contrastive estimation. Implements the method in the following paper: A fast and simple algorithm for training neural probabilistic language models.

The config file api is nce_layer.

Inherits from paddle::Layer

Public Functions

NCELayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

void prepareSamples()
virtual void prefetch()

If a sparse row matrix is used as the parameter, prefetch the feature ids in the input label.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.

void forwardBias()
void backwardBias(const UpdateCallback &callback)
void forwardOneInput(int layerId)
void backwardOneInput(int layerId, const UpdateCallback &callback)
void forwardCost()
void backwardCost()

Validation Layers

ValidationLayer

class paddle::ValidationLayer

Inherits from paddle::Layer

Subclassed by paddle::AucValidation, paddle::PnpairValidation

Public Functions

ValidationLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

LayerPtr getOutputLayer()
LayerPtr getLabelLayer()
LayerPtr getInfoLayer()
virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback = nullptr)

Backward propagation. Should only be called after Layer::forward() function.

virtual void validationImp(MatrixPtr outputValue, IVectorPtr label) = 0
virtual void onPassEnd() = 0

One pass is finished.

AucValidation

class paddle::AucValidation

Inherits from paddle::ValidationLayer

Public Functions

AucValidation(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void validationImp(MatrixPtr outputValue, IVectorPtr label)
virtual void onPassEnd()

One pass is finished.

Public Members

std::vector<PredictionResult> predictArray_
struct PredictionResult

Public Functions

PredictionResult(real __out, int __label)

Public Members

real out
int label

PnpairValidation

class paddle::PnpairValidation

Inherits from paddle::ValidationLayer

Public Functions

PnpairValidation(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void validationImp(MatrixPtr outputValue, IVectorPtr label)
virtual void onPassEnd()

One pass is finished.

Check Layers

EosIdCheckLayer

class paddle::EosIdCheckLayer

A layer for checking EOS for each sample:

  • output_id = (input_id == conf.eos_id)

The result is stored in output_.ids. It is used by recurrent layer group.

Inherits from paddle::Layer

Public Functions

EosIdCheckLayer(const LayerConfig &config)
virtual bool init(const LayerMap &layerMap, const ParameterMap &parameterMap)

Initialization. For example, adding input layers from layerMap and parameterMap.

virtual void forward(PassType passType)

Forward propagation. All inherited implementations should call the Layer::forward() function.

virtual void backward(const UpdateCallback &callback)

Backward propagation. Should only be called after Layer::forward() function.