Gradient Machines

GradientMachine

class paddle::GradientMachine

Subclassed by paddle::MultiGradientMachine, paddle::NeuralNetwork

Public Types

enum CreateMode

Values:

kNormal = 0
kSgdSparseCpuTraining = 3
kTesting = 4
kCustom = 10

Public Functions

virtual ~GradientMachine()
virtual void prefetch(const std::vector<Argument> &inArgs)

Prefetch row ids of sparse parameter.

virtual void forward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType) = 0

Forward propagation.

Calculate the outputs (outArgs) based on the inputs (inArgs).

Note
If passType == PASS_TEST, backward() should not be called.

virtual void backward(const UpdateCallback &callback = nullptr) = 0

Backward propagation.

Calculate the gradients of inArgs and of the parameters.

This function should only be called after a corresponding forward() call. The caller is responsible for filling the correct grad for the outArgs obtained using forward().

It may also change the grad field of the inArgs supplied at forward().
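As a concrete illustration of this contract, here is a minimal sketch of one training step, assuming a ready-made machine and input arguments; the header path, the PASS_TRAIN constant, and the fillOutputGrad() helper are assumptions for the sketch, not verbatim API.

    // Sketch only: the call order required by the contract above.
    // Header path and the grad-filling helper are assumptions.
    #include <vector>
    #include "paddle/gserver/gradientmachines/GradientMachine.h"

    void trainStep(paddle::GradientMachine* machine,
                   const std::vector<paddle::Argument>& inArgs) {
      std::vector<paddle::Argument> outArgs;

      // Forward pass with a training pass type (PASS_TEST would
      // forbid the backward() call below).
      machine->forward(inArgs, &outArgs, PASS_TRAIN);

      // The caller must fill the grad field of each output argument
      // before backward(), e.g. with the loss derivative w.r.t. output.
      // fillOutputGrad(&outArgs);  // hypothetical helper

      // Backward pass: computes gradients of inArgs and parameters.
      machine->backward(/*callback=*/nullptr);
    }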

virtual void forwardBackward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType, const UpdateCallback &callback = nullptr)

Combine forward() and backward(). For multithread training, this may be faster.

Note
passType PASS_TEST is not allowed for forwardBackward().

virtual void resetState()
virtual void setState(const MachineState &machineState)
virtual void getState(MachineState &machineState)
virtual void onPassEnd() = 0
virtual Evaluator *makeEvaluator() = 0

Create an evaluator which can be used for eval()

virtual void eval(Evaluator *evaluator) = 0

evaluate using the given evaluator

std::vector<ParameterPtr> &getParameters()
std::vector<ParameterPtr> &getNonStaticParameters()
bool hasStaticParameters()
virtual void start(const TrainerConfig &config, DataProviderPtr dataProvider)

Used before formal training; starts work threads and sets trainer parameters.

Note
This function is only implemented and used in a multithreaded environment.

virtual void finish()

Set the training status to a “finished” value; the sub-work-threads will notice the change and then exit.

Note
This function is only implemented and used in a multithreaded environment.

virtual bool trainIsOn()

Check each work thread for failure/error/finish; if none, return true, otherwise return false.

Note
This function is only implemented and used in a multithreaded environment.

virtual void restart()

When all or some of the sub-work-threads are suspended, waiting for the controller’s instructions, the controller calls this function after its processing is done to wake up all the pending threads.

Note
This function is only implemented and used in a multithreaded environment.

virtual void setOutputGrad(const std::vector<Argument> &args)

Set the gradient of the output from outside.

void saveParameters(const std::string &dir) const
void loadParameters(const std::string &dir)
void randParameters()
virtual void getStats(real &cost, int64_t &numProcessed)

Public Static Functions

GradientMachine *create(const ModelConfig &config, int mode = kNormal, const std::vector<ParameterType> &parameterTypes = std::vector<ParameterType>{PARAMETER_VALUE, PARAMETER_GRADIENT, PARAMETER_MOMENTUM})

Create a gradient machine from ModelConfig. Each parameter will have the types given in parameterTypes.
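For orientation, a hedged usage sketch of this overload with the default mode and parameter types; loadModelConfig() is a hypothetical stand-in for however the ModelConfig is obtained.

    // Sketch: build a machine in normal mode with the default parameter
    // types (value, gradient, momentum). loadModelConfig() is hypothetical.
    ModelConfig config = loadModelConfig("trainer_config.conf");
    paddle::GradientMachine* machine =
        paddle::GradientMachine::create(config, paddle::GradientMachine::kNormal);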

GradientMachine *create(const std::string &modelFile, DataConfig *dataConfig)

Create a gradient machine from the merged model file. The merged model file can be generated using tools/merge_model. If dataConfig is not null, it will be filled with the DataConfig from the TrainerConfig.

GradientMachine *create(std::istream &is, DataConfig *dataConfig)

Create a gradient machine from a stream which contains the merged model file. The merged model file can be generated using tools/merge_model. If dataConfig is not null, it will be filled with the DataConfig from the TrainerConfig.

GradientMachine *create(const std::string &modelFile, TrainerConfig *trainerConfig)

Create a gradient machine from the merged model file. The merged model file can be generated using tools/merge_model. If trainerConfig is not null, it will be filled with the TrainerConfig.

GradientMachine *create(std::istream &is, TrainerConfig *trainerConfig)

Create a gradient machine from a stream which contains the merged model file. The merged model file can be generated using tools/merge_model. If trainerConfig is not null, it will be filled with the TrainerConfig.

Protected Functions

virtual void onLoadParameter()

Protected Attributes

std::vector<ParameterPtr> parameters_
std::vector<ParameterPtr> nonStaticParameters_

IGradientMachineMode

class paddle::IGradientMachineMode

Public Functions

virtual ~IGradientMachineMode()
virtual GradientMachine *create(const ModelConfig &config) = 0

Create the current mode’s gradient machine from the model config.

Parameters
  • config: model config

virtual bool shouldBeMe(const std::string &algo, size_t trainerCount, bool isLocal, bool isGpu) const = 0

shouldBeMe: whether the current mode of GradientMachine should be this mode.

Return
true if mode should be this mode.
Parameters
  • algo: training algorithm name.
  • trainerCount: trainer count.
  • isLocal: is local mode (without pserver)
  • isGpu: is using gpu.

virtual bool isDataMustInCpu(size_t trainerCount) const = 0

Whether data must be kept in CPU memory even when using GPU mode.

Return
true if data must be in CPU memory.
Parameters
  • trainerCount: trainer count

virtual bool needTrainWholeDataInOneBatch() const = 0

Indicates that the mini-batch method is not needed and that all data should be trained in one batch per pass.

Public Static Functions

static void regGradientMachineMode(int32_t mode, std::unique_ptr<IGradientMachineMode> &&ptr)

register a custom gradient machine mode.

Note
For users registering a custom gradient machine mode, the mode id should be >= kCustom.
Parameters
  • mode: mode id.
  • ptr: mode description object.
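A hedged sketch of what registering a custom mode might look like: a subclass overriding the four pure virtuals above, registered under an id >= kCustom (10). The class name, condition, and registration function wrapper are illustrative.

    // Sketch: a user-defined mode registered under an id >= kCustom.
    #include <memory>
    #include <string>

    class MyCustomMode : public paddle::IGradientMachineMode {
    public:
      paddle::GradientMachine* create(const ModelConfig& config) override {
        return nullptr;  // would build a machine tailored to this mode
      }
      bool shouldBeMe(const std::string& algo, size_t trainerCount,
                      bool isLocal, bool isGpu) const override {
        return algo == "my_algo" && isLocal;  // illustrative condition
      }
      bool isDataMustInCpu(size_t trainerCount) const override {
        return false;
      }
      bool needTrainWholeDataInOneBatch() const override { return false; }
    };

    void registerMyMode() {
      // kCustom is 10, so any otherwise unused id >= 10 works here.
      paddle::IGradientMachineMode::regGradientMachineMode(
          11, std::unique_ptr<paddle::IGradientMachineMode>(new MyCustomMode()));
    }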

static IGradientMachineMode *mode(int32_t mode)

get custom mode from mode id.

Return
mode description object.
Parameters
  • mode: mode id

static bool trainWholeDataInOneBatch(int32_t mode)

Helper function to test whether the given mode trains the whole dataset in one batch.

static bool tryGetMode(int *mode, const std::string &algo, int32_t trainerCount, bool isLocal, bool isGpu)

Try to get a custom mode matching the given conditions.

Return
true if there is a custom mode that fits these conditions.
Parameters
  • mode: the custom mode id.
  • algo: algorithm name
  • trainerCount: trainer count.
  • isLocal: is local or not
  • isGpu: using gpu or not.

static bool dataMustInCpu(int32_t mode, size_t trainerCount)

Helper function to test whether data must be in CPU for the given mode.

static GradientMachine *tryCreateGradientMachine(int32_t mode, const ModelConfig &config)

Try to create a gradient machine from the given mode and config.

Return
nullptr if no gradient machine can be created for this mode.

MultiGradientMachine

class paddle::MultiGradientMachine

A MultiGradientMachine is a synchronous GradientMachine that divides one data batch into several smaller batches and assigns each small batch to one computing thread. After the threads finish computation, the results (including the output Arguments and, during backward(), the gradients) are merged. It is essentially the same as a single-thread gradient machine, except that it uses multiple threads to do the computation.

It handles GPU and CPU parameters differently. On GPU, one computing thread generally corresponds to one GPU device, so each thread keeps a separate copy of the parameters in its own device’s memory. On CPU, only one copy of the parameters is kept in main memory. After each computing thread computes its own parameter gradients, the update process accumulates the gradients from all the computing threads and applies the accumulated gradient to the corresponding parameter value.

Each GPU parameter is assigned to a thread called its main thread. For each parameter, the accumulation of its gradients and the update of its value happen in its main thread. The main thread first gathers the parameter gradients from all the computing threads, then performs the parameter update. After a parameter is updated by its main thread, its value is scattered to all the computing threads so that the parameters in all the computing threads stay synchronized. The scatter and gather processes are implemented by ring-style communication. Assume we have N computing threads with thread ids 0, 1, ..., N-1. For each parameter, the id of its main thread is specified in paraMainThread_[pid], where pid is the id of the parameter. During gradient accumulation, each thread i only sends data to thread (i + 1) % N. For example, for a parameter gradient computed in thread 4 whose main thread is 2, the traveling process is 4, 5, ..., N-1, 0, 1, 2. In each step, the gradient buffer is added to the local gradient, and the local gradient is then copied to the gradient buffer of the next thread. At the end, main thread 2 holds the accumulated parameter gradient. For the same parameter, after its value is updated, the value travels in the opposite direction, 2, 1, 0, N-1, ..., 3, so that in the end all the computing threads have the updated parameter value.
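The hop sequence in that example can be checked with a few self-contained lines; this is illustrative arithmetic only, not library code.

    // Illustrative only: the hops a gradient takes from the thread that
    // computed it (4) to the parameter's main thread (2), N = 8 threads.
    #include <cstdio>

    int main() {
      const int N = 8;
      int src = 4, mainThread = 2;
      for (int t = src; t != mainThread; t = (t + 1) % N) {
        // Each hop: add the incoming buffer to the local gradient, then
        // copy the local gradient to the next thread's buffer.
        std::printf("%d -> %d\n", t, (t + 1) % N);  // 4->5 ... 1->2
      }
      return 0;
    }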

A computing thread (TrainerThread) uses 4 threads to do different jobs:

  1. computeThread(): performing forward(), backward(), prefetch().
  2. valueDispatchThread(): copying parameter values to the partner thread.
  3. copyGradToBufferThread(): copying parameter gradients to the partner thread.
  4. gradCollectThread(): merging the gradients from step 3 with the local gradient and calling the callback supplied by the user to update the parameter value.

CPU parameter values have only one copy; their gradients are merged at the end of backward().

  • Handling of sparse updates: currently, sparse update is only supported for CPU parameters.

A sparse update refers to gradient calculation where the gradient is sparse. For example, if the input argument to a ‘fc’ layer is sparse, the gradient of the layer’s weight matrix will be sparse. It is usually more efficient to treat the gradient explicitly as a sparse vector during the parameter update.

There are two types of sparse updates: local sparse update and remote sparse update.

For both types of sparse updates, there is one copy of the parameter value and gradient called the main parameter value and gradient, and one copy of the parameter value and gradient for each computing thread called the slave parameter value and gradient. The slave parameter values are always shared with the corresponding main parameter value. The slave parameter grad is a sparse row matrix. The sparse patterns of the slave parameter grads differ, because the small batches for each computing thread may have different sparsity patterns.

  1. Local sparse update

    Main parameter value type is MAT_NORMAL. It is a dense matrix.

    Main parameter grad type is MAT_SPARSE_ROW_IDS (SparseRowIdsCpuMatrix). It is also a dense matrix, but the updated values are specified by ids.

    Slave parameter value shares with main parameter value.

    Slave parameter grad type is MAT_SPARSE_ROW_AUTO_GROW (SparseAutoGrowRowCpuMatrix). It is a sparse row matrix.

    During backward() of each TrainerThread, SparseAutoGrowRowCpuMatrix gathers all the non-zero gradients. After backward(), they are merged into the main parameter grad (SparseRowIdsCpuMatrix), with indices indicating which rows have nonzero gradients.

  2. Remote sparse update

    Main parameter value type is MAT_SPARSE_ROW_PREFETCH(_FULL_SIZE) (SparsePrefetchRowCpuMatrix). MAT_SPARSE_ROW_PREFETCH is a sparse matrix. MAT_SPARSE_ROW_PREFETCH_FULL_SIZE is a dense matrix. However, only the parameter values that have been prefetched are up-to-date.

    Main parameter grad type is MAT_SPARSE_ROW (SparseRowCpuMatrix). It shares its sparse pattern with the value by sharing indexDictHandle_, an internal data structure used by SparseRowCpuMatrix to specify the sparsity pattern. The slave parameter value shares with the main parameter value.

    Slave parameter grad type is MAT_SPARSE_ROW_AUTO_GROW (SparseAutoGrowRowCpuMatrix). It is a sparse row matrix.

    During prefetch(), all the layers indicate which rows of each parameter are needed. The framework then retrieves those rows from the parameter server.

    During backward() of each TrainerThread, SparseAutoGrowRowCpuMatrix gathers all the non-zero gradients. After backward(), they are merged into the main parameter grad (SparseRowCpuMatrix), and the framework sends the merged gradient to the parameter server.
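To make the merge step in both variants concrete, here is a toy stand-in that merges per-thread sparse row gradients into a main gradient using a plain map; the real SparseAutoGrowRowCpuMatrix / SparseRowCpuMatrix structures are of course more elaborate.

    // Toy illustration, not the library's data structures: each "matrix"
    // is a map from row id to a dense row of width dim, and merging sums
    // only the rows that some thread actually touched.
    #include <unordered_map>
    #include <vector>

    using SparseRows = std::unordered_map<int, std::vector<float>>;

    void mergeSparseGrads(SparseRows& mainGrad,
                          const std::vector<SparseRows>& slaveGrads,
                          size_t dim) {
      for (const auto& slave : slaveGrads) {   // one per computing thread
        for (const auto& kv : slave) {         // only nonzero rows
          auto& row = mainGrad[kv.first];
          row.resize(dim, 0.0f);               // "auto grow" on first use
          for (size_t i = 0; i < dim; ++i) row[i] += kv.second[i];
        }
      }
    }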

Inherits from paddle::GradientMachine

Public Types

enum TaskType

Values:

TASK_FORWARD_BACKWARD = 0
TASK_FORWARD = 1
TASK_BACKWARD = 2
TASK_COPY_IN_ARGS = 3

Public Functions

MultiGradientMachine(const ModelConfig &config, bool useGpu)
void prefetch(const std::vector<Argument> &inArgs)

Prefetch row ids of sparse parameter.

void forward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType)

Forward propagation.

Calculate the outputs (outArgs) based on the inputs (inArgs).

Note
If passType == PASS_TEST, backward() should not be called.

void backward(const UpdateCallback &callback = nullptr)

Backward propagation.

Calculate the gradients of inArgs and of the parameters.

This function should only be called after a corresponding forward() call. The caller is responsible for filling the correct grad for the outArgs obtained using forward().

It may also change the grad field of the inArgs supplied at forward().

void forwardBackward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType, const UpdateCallback &callback)

Combine forward() and backward(). For multithread training, this may be faster.

Note
passType PASS_TEST is not allowed for forwardBackward().

void onPassEnd()
void finish()

Set the training status to a “finished” value; the sub-work-threads will notice the change and then exit.

Note
This function is only implemented and used in a multithreaded environment.

Evaluator *makeEvaluator()

Create an evaluator which can be used for eval()

void eval(Evaluator *evaluator)

evaluate using the given evaluator

bool useGpu() const
bool isPassGrad()

Return
whether to pass the gradients in outArgs_ to each thread.

void setPassGrad(bool isPass)

set whether to pass the gradients in outArgs_ to each thread.

void setOutputGrad(const std::vector<Argument> &args)

Set the gradients of the outputs. The gradients will be copied to each of the computing threads.

Protected Functions

std::vector<TrainerThreadPtr> &getAllThreads()
int logicalDeviceId2RealDeviceId(int logicalId, int threadId = 0) const

Calculate the real device id based on the logical device id and the thread id.

int realDeviceId2LogicalDeviceId(int realId, int threadId = 0) const

Calculate the logical device id based on the real device id and the thread id.

std::vector<const std::vector<ParameterPtr> *> getSlaveParameters()
bool hasNonstaticCpuParamters() const
void waitBeforeMerge()

Called by TrainerThread to wait before merging CPU parameter gradients.

void waitAfterMerge()

called by MultiGradientMachine and TrainerThread to wait after merging CPU parameter gradients.

void waitForCopyInArgs()

called by MultiGradientMachine and TrainerThread to wait for copyInArgs() to finish

TrainerThreadPtr &getThread(int threadId)
std::vector<GradBuffer> &getGradBuf(int threadId)
PassType getPassType() const
void notifyGradientTransfer(int paramId)

Called by TrainerThread to notify MultiGradientMachine that the gradient for paramId is ready

const std::vector<Argument> &getInArgs()
TaskType getTaskType() const
const UpdateCallback &getBackwardCallback() const
int getNumDevices() const
int getNumLogicalDevices() const
int getNumThreads() const
int paraMainThread(int pid) const
void forwardImp(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType, TaskType taskType)
void backwardImp(const UpdateCallback &callback = NULL)
void updateThreadParameters()

update all parameters

void startTask(TaskType taskType)
void getOutArgs(std::vector<Argument> *outArgs, PassType passType)
void allocGradBufs()

Protected Attributes

bool useGpu_
bool hasNonstaticCpuParamters_
std::unique_ptr<GradientMachine> gradientMachine_

stores the main parameters only

std::vector<TrainerThreadPtr> threads_
std::vector<int> paraMainThread_
std::vector<std::vector<GradBuffer>> gradBufs_
std::vector<size_t> bufferSizes_
PassType passType_
TaskType taskType_
PidQueue gradQueue_
std::vector<Argument> inArgs_
std::vector<Argument> outArgs_
hl_stream_t outArgStream_
std::vector<ParameterType> mergeTypes_

ParameterType which needs to be merged from each GPU.

int numDevices_
int numLogicalDevices_
int numThreads_
UpdateCallback backwardCallback_
ThreadBarrier trainerBarrier_

barrier for threads_

ThreadBarrier allBarrier_

barrier for both MultiGradientMachine and threads_

bool inArgsCopied_

indicate whether inArgs is copied before forward()

bool isPassGrad_

Whether to copy the gradient back from an external input.

Friends

friend paddle::MultiGradientMachine::TrainerThread

TrainerThread

class paddle::TrainerThread

Public Functions

TrainerThread(const ModelConfig &config, int threadId, MultiGradientMachine *multiMachine)
~TrainerThread()
void start()
void onPassEnd()
void waitOutArgsReady()
void notifyTaskReady()
int getDeviceId() const
GradientMachine *getGradientMachine()
const std::vector<ParameterPtr> &getParameters()
void stop()
void notifyValueReady(int paramId)
const VectorPtr &getValueBuf(int paramId)
const std::vector<Argument> &getOutArgs()
void incUpdateCounter(int n = 1)
void notifyGradientCollect(int paramId)
void notifyCopyGradToBuffer(int paramId)
void notifyValueDispatch(int paramId)
void prefetch()
void copyOutputGrad()

copy the output gradient from the main GradientMachine.

Protected Functions

void mergeCpuGradients()
void mergeGradSparse(Parameter *para, std::vector<const std::vector<ParameterPtr> *> &slaveParameters)
void mergeGradSparseRemote(Parameter *para, std::vector<const std::vector<ParameterPtr> *> &slaveParameters)
void mergeGradDense(Parameter *para, std::vector<const std::vector<ParameterPtr> *> &slaveParameters)
void computeThread()
void valueDispatchThread()
void copyGradToBufferThread()
void gradCollectThread()
void copyInArgs()
void forward()
void backward()
void backwardCallback(Parameter *para)
void doCallback(int pid)

call the actual callback supplied by the caller of GradientMachine::backward

Protected Attributes

MultiGradientMachine *multiMachine_
ModelConfig config_
bool stopping_

whether the thread should stop

int partnerId_

the thread from which to collect gradients

int threadId_

from 0 to threads-1

int deviceId_
std::unique_ptr<GradientMachine> gradientMachine_
std::vector<ParameterPtr> parameters_
std::vector<ParameterType> mergeTypes_

ParameterType which needs to be merged from each GPU.

std::unique_ptr<std::thread> computeThread_

compute thread

std::vector<Argument> inArgs_
std::vector<Argument> outArgs_
Semaphore taskReadySem_
Semaphore outArgsReadySem_
std::unique_ptr<std::thread> copyThread_

copy thread

PidQueue gradBufQueue_

queue of gradients that need to be copied to the partner thread

hl_stream_t gradStream_
std::unique_ptr<std::thread> gradCollectThread_

grad merge thread

PidQueue gradQueue_

queue of gradients that need to be merged with the gradients copied by copyGradToBufferThread

UpdateCallback backwardCallback_
std::unique_ptr<std::thread> valueDispatchThread_

value dispatch thread

PidQueue valueReadyQueue_

queue of parameters whose values are ready for copying

LockedCondition valueReadyCond_

used to notify that all the parameter values are ready

hl_stream_t valueStream_
std::atomic<int> updateCounter_

how many parameters are updated

bool parameterUpdated_
bool inArgsCopied_

indicate whether inArgs is copied before forward()

Recurrent Gradient Machines

class paddle::RecurrentGradientMachine

Inherits from paddle::NeuralNetwork

Public Types

typedef std::function<void(const std::vector<std::vector<int> *>&, NeuralNetwork *, const int)> BeamSearchCandidatesAdjustCallback

BeamSearchCandidatesAdjustCallback.

Adjust search candidates to restrict beam search to a limited subset of all possible paths.

The first parameter is the prefixes of all formed paths in the current beam search step, whose type is essentially int[][].

The second parameter is a pointer to the network used to generate the sequence; the user can use this pointer to traverse each layer in the network and modify the behavior of a particular layer.

The third parameter is an integer indicating the iteration number of beam search, so that the user can customize different operations for different beam search iterations.

typedef std::function<bool(int seqId, const std::vector<int>&, const std::vector<real>&)> DropCallback

DropCallback.

Decide whether to drop a whole prefix or one candidate in beam search.

The first parameter is the sequence index in a batch.

The second parameter is one path in beam search, made up of node indices.

The third parameter contains the probabilities of each node in this path.

Return true if this prefix or candidate should be dropped.

typedef std::function<void(int seqId, const std::vector<int>&, std::vector<real>&, real *)> NormOrDropNodeCallback

NormOrDropNodeCallback.

Normalize a path’s probabilities, or drop it by modifying path.logProb.

The first parameter is the sequence index in a batch.

The second parameter is path.ids.

The third parameter contains the probabilities of each node in this path.

The fourth parameter is the probability of the whole path.

typedef std::function<void(int)> EachStepCallback

EachStepCallback.

Invoked with the index of the current beam search step.

Public Functions

RecurrentGradientMachine(const std::string &subModelName, NeuralNetwork *rootNetwork)
RecurrentGradientMachine(const RecurrentGradientMachine &other)
RecurrentGradientMachine &operator=(const RecurrentGradientMachine &other)
virtual ~RecurrentGradientMachine()
void init(const ModelConfig &config, ParamInitCallback callback, const std::vector<ParameterType> &parameterTypes, bool useGpu)
void prefetch(const std::vector<Argument> &inArgs)

Prefetch row ids of sparse parameter.

void forward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType)

Forward propagation.

Calculate the outputs (outArgs) based on the inputs (inArgs).

Note
If passType == PASS_TEST, backward() should not be called.

void backward(const UpdateCallback &callback = nullptr)

Backward propagation.

Calculate the gradients of inArgs and of the parameters.

This function should only be called after a corresponding forward() call. The caller is responsible for filling the correct grad for the outArgs obtained using forward().

It may also change the grad field of the inArgs supplied at forward().

void forwardBackward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType, const UpdateCallback &callback)

Combine forward() and backward(). For multithread training, this may be faster.

Note
passType PASS_TEST is not allowed for forwardBackward().

virtual void resetState()
void eval(Evaluator *evaluator)

evaluate using the given evaluator

const std::vector<int> &getParameterIds()
void registerBeamSearchControlCallbacks(const BeamSearchCandidatesAdjustCallback &adjustBeamSearch, const NormOrDropNodeCallback &normOrDropNode, const DropCallback &stopBeamSearch)

Register beam search control callbacks. Used for prediction.

Parameters
  • adjustBeamSearch: Given the sequences already formed, return the nodes expected to be expanded. Input: a pointer to an array holding the paths that have been expanded. Return: a pointer to an array holding the nodes to be expanded.
  • normOrDropNode: Early drop a node in one beam search step. Given the path formed and the probability history, decide whether a node should be dropped.
  • stopBeamSearch: Early stop a path in one beam search step. Given the path and the probability history, decide whether the path should be dropped.
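A hedged usage sketch wiring lambdas into the three slots; only the signatures follow the typedefs documented earlier, while the bodies and the rnnMachine pointer are placeholders.

    // Sketch: plug user-defined lambdas into the beam search hooks.
    rnnMachine->registerBeamSearchControlCallbacks(
        /*adjustBeamSearch=*/
        [](const std::vector<std::vector<int>*>& prefixes,
           paddle::NeuralNetwork* net, const int step) {
          // e.g. restrict candidates during the first few steps
        },
        /*normOrDropNode=*/
        [](int seqId, const std::vector<int>& ids,
           std::vector<paddle::real>& probs, paddle::real* pathLogProb) {
          // e.g. renormalize probs, or drop the path via *pathLogProb
        },
        /*stopBeamSearch=*/
        [](int seqId, const std::vector<int>& ids,
           const std::vector<paddle::real>& probs) -> bool {
          return false;  // never early-stop in this sketch
        });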

void removeBeamSearchControlCallbacks()

Remove user-customized beam search callbacks.

Makes sequence generation act like normal beam search.

void registerBeamSearchStatisticsCallbacks(const EachStepCallback &onEachStepStarted, const EachStepCallback &onEachStepStoped)

register statistics methods for performance profiling of beam search.

Parameters
  • onEachStepStarted: invoked when a beam search step starts. Its input is the index of the beam search step.
  • onEachStepStoped: invoked when a beam search step ends. Its input is the index of the beam search step.

void removeBeamSearchStatisticsCallbacks()

Remove beam search callbacks.

void stopBeamSearch()

Stop beam search for the current source.

Beam search will restart in the next forward().

const std::vector<std::vector<Path>> &getFinalPaths() const

access beam search results.

Return
beam search results.
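For completeness, a short hedged sketch of consuming these results; useResult() is a hypothetical consumer, while ids and logProb are the Path members documented below.

    // Sketch: getFinalPaths() returns one vector of Path per sequence.
    const auto& finalPaths = rnnMachine->getFinalPaths();
    for (size_t seqId = 0; seqId < finalPaths.size(); ++seqId) {
      for (const auto& path : finalPaths[seqId]) {
        useResult(seqId, path.ids, path.logProb);  // hypothetical consumer
      }
    }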

Protected Functions

void resizeOrCreateFrames(int numFrames)
void resizeBootFrame(int numSequences)
void generateSequence()
void oneWaySearch(size_t batchSize)
void beamSearch(size_t batchSize)
void createInFrameInfo(int inlinks_id, const Argument &input, PassType passType)
void createMemoryFrameInfo(MemoryFrameLine *memoryFrameLine, PassType passType)
void copyScattedId(std::vector<int> &srcIds, IVectorPtr *dstIds, int size)
void selectRowsOneTime(LayerPtr layer, const IVectorPtr &allIds, Argument *arg, PassType passType)
void createSeqPos(const std::vector<int> &sequenceStartPosition, ICpuGpuVectorPtr *sequenceStartPositions)

Protected Attributes

std::vector<InFrameLine> inFrameLines_
std::vector<OutFrameLine> outFrameLines_
std::vector<MemoryFrameLine> memoryFrameLines_
std::vector<Info> info_
std::vector<int> numSeqs_
std::vector<std::vector<Argument::SeqInfo>> seqInfos_
int targetInfoInlinkId_
std::unique_ptr<EosFrameLine> eosFrameLine_
Generator generator_
std::vector<std::unique_ptr<NeuralNetwork>> frames_
NeuralNetwork *rootNetwork_
bool reversed_
int maxSequenceLength_
bool useGpu_
bool stopBeamSearch_
std::vector<int> parameterIds_
std::unique_ptr<Evaluator> evaluator_
std::vector<Argument> dataArgs_
std::vector<std::vector<Argument>> dataArgsFrame_
size_t dataArgsSize_
IVectorPtr cpuId_
MatrixPtr cpuProb_
IVectorPtr cpuEos_
struct EosFrameLine

Public Members

std::vector<LayerPtr> layers
struct Generator

Public Members

GeneratorConfig config
std::vector<int> ids
Argument outArg
struct Info

Public Members

IVectorPtr allIds
std::vector<int> idIndex
ICpuGpuVectorPtr sequenceStartPositions
std::vector<int> seqStartPosIndex
struct InFrameLine

Public Members

std::string linkName
LayerPtr inLayer
std::vector<LayerPtr> agents
bool hasSubseq
Argument outArg
struct MemoryFrameLine

Public Members

std::string layerName
std::string linkName
LayerPtr bootLayer
LayerPtr biasLayer
LayerPtr rootLayer
LayerPtr rootAgent
std::vector<LayerPtr> frames
std::vector<LayerPtr> agents
std::vector<LayerPtr> scatterAgents
Argument outArg
bool is_sequence
IVectorPtr allIds
ICpuGpuVectorPtr sequenceStartPositions
struct OutFrameLine

Public Members

std::string layerName
LayerPtr agentLayer
std::vector<LayerPtr> frames
struct Path

Public Functions

Path()

Path default ctor; the initial logProb is 0.

Path(size_t seqId)
Path(Path &old, int newId, real logProb, int machineId, int topIndex)

Create a new path based on an old path and a new node with probability.

Parameters
  • old: old path
  • newId: index of the new node
  • logProb: log probability of the new node.
  • machineId: sample index of a frame in RNN
  • topIndex: index of MaxIdLayer output in one sample

bool operator<(const Path &other) const

operator <

Path a < Path b means the log probability of a is smaller than that of b.

void recordHistory()

Start recording history in this path.

void adjustProb(int calc_id, bool atEos = false)

Adjust the probability for the DIY beam search interface. In normal situations it does nothing.

Parameters
  • calc_id: the object id for DIY beam search interface.
  • atEos: at end of sequence or not.

bool isDropable() const

isDropable indicates whether the current node will be dropped in beam search.

Note
If logProb is -inf, the current node will be dropped.
Return
true to drop the current node.

Public Members

std::vector<int> ids

ids, path of beam search.

real logProb

logProb, the current log probability of the path.

int machineId
int topIndex
int seqId
std::vector<int> machineIdVec
std::vector<real> probHistory

A record of each node’s probability in a formed path in beam search.

Note
It can be empty when the history is not recorded. To record the history, recordHistory() MUST be invoked first.

Public Static Functions

static bool greaterPath(const Path &a, const Path &b)

Networks

NeuralNetwork

class paddle::NeuralNetwork

Inherits from paddle::GradientMachine

Subclassed by paddle::MultiNetwork, paddle::ParallelNeuralNetwork, paddle::RecurrentGradientMachine

Public Functions

void init(const ModelConfig &config, ParamInitCallback callback = nullptr, const std::vector<ParameterType> &parameterTypes = std::vector<ParameterType>{PARAMETER_VALUE, PARAMETER_GRADIENT, PARAMETER_MOMENTUM}, bool useGpu = FLAGS_use_gpu)
void connect(std::string agentLayerName, NeuralNetwork *srcNN, std::string realLayerName)
void prefetch(const std::vector<Argument> &inArgs)

Prefetch row ids of sparse parameter.

void forward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType)

Forward propagation.

Calculate the outputs (outArgs) based on the inputs (inArgs).

Note
If passType == PASS_TEST, backward() should not be called.

void backward(const UpdateCallback &callback = nullptr)

Backward propagation.

Calculate the gradients of inArgs and of the parameters.

This function should only be called after a corresponding forward() call. The caller is responsible for filling the correct grad for the outArgs obtained using forward().

It may also change the grad field of the inArgs supplied at forward().

MatrixPtr getLayerOutput(const std::string &layerName)
const LayerPtr &getLayer(const std::string &layerName) const
void onPassEnd()
Evaluator *makeEvaluator()

Create an evaluator which can be used for eval()

void eval(Evaluator *evaluator)

evaluate using the given evaluator

void resetState()
void setOutputGrad(const std::vector<Argument> &args)

Set the gradient of the output from outside.

void setState(const MachineState &machineState)

set machine state

void getState(MachineState &machineState)

get machine state

ParameterMap *getParameterMap()
template <typename T>
void forEachLayer(T callback)

Access each layer as in a for-each loop.

Parameters
  • callback: invoked with each layer.
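A hedged usage sketch: visiting every layer with a lambda. Whether the callback's return value stops the iteration is an assumption; returning false is the safe "keep going" choice, and getName() is an assumed accessor.

    // Sketch: print each layer's name.
    #include <iostream>

    network->forEachLayer([](paddle::LayerPtr& layer) {
      std::cout << layer->getName() << "\n";  // getName() assumed
      return false;  // assumed: returning true would stop early
    });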

Public Static Functions

void connect(LayerPtr agentLayer, LayerPtr realLayer, int height = 0)

Connect two submodels: the down-submodel’s output becomes the up-submodel’s input. By default the connection is one to one; if the agent height is smaller than the real layer’s, height has to be specified.

Parameters
  • realLayer: The down-submodel’s output layer.
  • agentLayer: The up-submodel’s input agent layer.
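A hedged sketch of both connect forms; the layer names and the upNet/downNet pointers are illustrative.

    // Static form: wire the up-submodel's agent layer to the
    // down-submodel's real output layer.
    paddle::NeuralNetwork::connect(upNet->getLayer("input_agent"),
                                   downNet->getLayer("fc_output"));

    // Member form: same connection, looking the real layer up by name.
    upNet->connect("input_agent", downNet, "fc_output");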

NeuralNetwork *create(const ModelConfig &config)
NeuralNetwork *newNeuralNetwork(const std::string &name = "", NeuralNetwork *rootNetwork = nullptr)

Protected Functions

NeuralNetwork(std::string subModelName = "", NeuralNetwork *rootNetwork = nullptr)

The constructor of NeuralNetwork. Sub-networks can get parameters_ and parameterMap_ from the base NeuralNetwork.

Parameters
  • subModelName: The name of sub-model.
  • rootNetwork: It is used in MultiNetwork.

Protected Attributes

std::string subModelName_
ModelConfig config_
std::vector<LayerPtr> layers_
ParameterMap parameterMap_
LayerMap layerMap_
std::vector<DataLayerPtr> dataLayers_
std::vector<LayerPtr> outputLayers_
NeuralNetwork *rootNetwork_
bool paramSelfInited_

Whether the parameters of this NN are initialized by itself (i.e., not by a callback supplied by the caller)

Protected Static Attributes

std::map<std::string, bool> dllInitMap

ParallelNeuralNetwork

class paddle::ParallelNeuralNetwork

A ParallelNeuralNetwork is capable of calculating a neural network through multiple threads in parallel.

Inherits from paddle::NeuralNetwork

Public Functions

ParallelNeuralNetwork(std::string subModelName = "", NeuralNetwork *rootNetwork = nullptr)
void init(const ModelConfig &config, ParamInitCallback callback = nullptr, const std::vector<ParameterType> &parameterTypes = std::vector<ParameterType>{PARAMETER_VALUE, PARAMETER_GRADIENT, PARAMETER_MOMENTUM}, bool useGpu = FLAGS_use_gpu)
void forward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType)

Forward propagation.

Calculate the outputs (outArgs) based on the inputs (inArgs).

Note
If passType == PASS_TEST, backward() should not be called.

void backward(const UpdateCallback &callback = nullptr)

Backward propagation.

Calculate the gradients of inArgs and of the parameters.

This function should only be called after a corresponding forward() call. The caller is responsible for filling the correct grad for the outArgs obtained using forward().

It may also change the grad field of the inArgs supplied at forward().

void forwardBackward(const std::vector<Argument> &inArgs, std::vector<Argument> *outArgs, PassType passType, const UpdateCallback &callback = NULL)

Combine forward() and backward(). For multithread training, this may be faster.

Note
passType PASS_TEST is not allowed for forwardBackward().

void start(const TrainerConfig &config, DataProviderPtr dataProvider)

Used before formal training; starts work threads and sets trainer parameters.

Note
This function is only implemented and used in a multithreaded environment.

void addComputeThread(int deviceId)
void dispatchByDeviceId(int deviceId, LayerPtr layer, TaskType task)
void waitAllThread()

Protected Attributes

bool useGpu_
int numDevices_

number of GPU devices

std::vector<std::unique_ptr<ParallelThread>> threads_