Optimizer¶
- 
namespace paddle¶
- 
class SgdOptimizer¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
SgdOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 
- 
 - 
class SparseMomentumParameterOptimizer¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
SparseMomentumParameterOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
virtual ParameterOptimizer::TraverseCallback needSpecialTraversal(const ParameterConfig &config) const¶
- The following hook is useful for sparse updates, because traversing a whole block is costly. Called by the Trainer after update() and before finishBatch(). The Trainer calls it like this: startBatch(); if (dense) { update(blockVec); } else { /* sparse */ for (row : rows_in_block) { update(rowVec); } } auto callback = needSpecialTraversal(); if (callback) { /* do the traversal, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : all_rows_in_block) { callback(); } } } finishBatch(); - Return
- a callback if traversal is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 
- 
 - 
class AdagradParameterOptimizer¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
AdagradParameterOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
virtual ParameterOptimizer::TraverseCallback needSpecialTraversal(const ParameterConfig &config) const¶
- The following hook is useful for sparse updates, because traversing a whole block is costly. Called by the Trainer after update() and before finishBatch(). The Trainer calls it like this: startBatch(); if (dense) { update(blockVec); } else { /* sparse */ for (row : rows_in_block) { update(rowVec); } } auto callback = needSpecialTraversal(); if (callback) { /* do the traversal, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : all_rows_in_block) { callback(); } } } finishBatch(); - Return
- a callback if traversal is needed, otherwise nullptr. The call must not change state.
 
 - Protected Attributes - 
int64_t numUpdates_¶
 - Protected Static Attributes - 
const int64_t kMaxNumAccumulates¶
 
- 
 - 
class AdaDeltaParameterOptimizer¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
AdaDeltaParameterOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 
- 
 - 
class RMSPropParameterOptimizer¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
RMSPropParameterOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - Protected Attributes - 
real rou_¶
 - 
real epsilon_¶
 - 
int64_t timer_¶
- Counts batches; no catch-up is needed. t (timer_) is the current time, and t0 (t0Vec_) is the last-touched time of row i. If one block is updated by multiple threads, the caller should hash sparse ids to avoid write conflicts in t0Vec_. 
 - 
std::vector<int64_t> t0Vec_¶
 
- 
 - 
class DecayedAdagradParameterOptimizer¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
DecayedAdagradParameterOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - Protected Attributes - 
real rou_¶
 - 
real epsilon_¶
 - 
int64_t timer_¶
- Counts batches; no catch-up is needed. t (timer_) is the current time, and t0 (t0Vec_) is the last-touched time of row i. If one block is updated by multiple threads, the caller should hash sparse ids to avoid write conflicts in t0Vec_. 
 - 
std::vector<int64_t> t0Vec_¶
 
- 
 - 
class AdamParameterOptimizer¶
- #include <FirstOrderOptimizer.h> Adam Optimizer. Reference paper: http://arxiv.org/abs/1412.6980, Algorithm 1. Inherits from paddle::ParameterOptimizer Public Functions - 
AdamParameterOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 
- 
 - 
class AdamaxParameterOptimizer¶
- #include <FirstOrderOptimizer.h> AdaMax Optimizer. Reference paper: http://arxiv.org/abs/1412.6980, Algorithm 2. Inherits from paddle::ParameterOptimizer Public Functions - 
AdamaxParameterOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 
- 
 - 
class AddOptimizer¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
AddOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 
- 
 - 
class DummyOptimizer¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
DummyOptimizer(const OptimizationConfig &optConfig)¶
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 
- 
 - 
class OptimizerWithGradientClipping¶
- Inherits from paddle::ParameterOptimizer - Public Functions - 
OptimizerWithGradientClipping(const OptimizationConfig &optConfig, ParameterOptimizer *optimizer)¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void startPass()¶
 - 
virtual void finishPass()¶
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual TraverseCallback needSpecialTraversal(const ParameterConfig &config) const¶
- The following hook is useful for sparse updates, because traversing a whole block is costly. Called by the Trainer after update() and before finishBatch(). The Trainer calls it like this: startBatch(); if (dense) { update(blockVec); } else { /* sparse */ for (row : rows_in_block) { update(rowVec); } } auto callback = needSpecialTraversal(); if (callback) { /* do the traversal, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : all_rows_in_block) { callback(); } } } finishBatch(); - Return
- a callback if traversal is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
virtual void setNoDecay()¶
 - Protected Attributes - 
std::unique_ptr<ParameterOptimizer> optimizer_¶
 
- 
 
- 
namespace paddle¶
- 
class AverageOptimizer¶
- Inherits from paddle::ParameterOptimizer - Subclassed by paddle::AverageSparseOptimizer - Public Functions - 
AverageOptimizer(const OptimizationConfig &optConfig, ParameterOptimizer *optimizer, bool useParameterApply)¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void startPass()¶
 - 
virtual void finishPass()¶
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
virtual ParameterOptimizer::TraverseCallback needSpecialTraversal(const ParameterConfig &config) const¶
- The following hook is useful for sparse updates, because traversing a whole block is costly. Called by the Trainer after update() and before finishBatch(). The Trainer calls it like this: startBatch(); if (dense) { update(blockVec); } else { /* sparse */ for (row : rows_in_block) { update(rowVec); } } auto callback = needSpecialTraversal(); if (callback) { /* do the traversal, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : all_rows_in_block) { callback(); } } } finishBatch(); - Return
- a callback if traversal is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual TraverseCallback startCatchUpWith() const¶
- The following hooks catch up with the current time for sparse updates. At the beginning, call startCatchUpWith() and check its return value; at the end, call finishCatchUpWith() to commit the state. The callback does the actual work and can be called many times for sparse data. The Trainer calls it like this: auto callback = startCatchUpWith(); if (callback) { /* do the catch-up, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : rows_in_block) { callback(); } } /* finish the catch-up on the main thread */ finishCatchUpWith(); } - Return
- a callback if catch-up is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void finishCatchUpWith()¶
 - 
virtual ParameterOptimizer::TraverseCallback apply()¶
- The following two hooks are used by the averager and apply to the final parameter value (PARAMETER_VALUE or PARAMETER_APPLY). - restore() will restore the original value if apply() wrote to PARAMETER_VALUE. The caller must ensure the optimizer is caught up with the current time before calling apply(). - Use the returned callback the same way as the callback returned by ParameterOptimizer::needSpecialTraversal(). 
 - 
virtual ParameterOptimizer::TraverseCallback restore()¶
 - 
virtual void setNoDecay()¶
 - Public Static Functions - 
ParameterOptimizer *create(const OptimizationConfig &optConfig, ParameterOptimizer *optimizer, bool isParameterSparse = false, bool useParameterApply = false)¶
 - Protected Attributes - 
std::unique_ptr<ParameterOptimizer> optimizer_¶
 - 
bool useApply_¶
 - 
int64_t numUpdates_¶
 - 
int64_t prevNumUpdates_¶
 - 
int64_t numAccumulates_¶
 - 
int64_t oldNumAccumulates_¶
 - 
int64_t minAverageWindow_¶
 - 
int64_t maxAverageWindow_¶
 - Protected Static Attributes - 
const int64_t kMaxNumAccumulates¶
 
- 
 - 
class AverageSparseOptimizer¶
- Inherits from paddle::AverageOptimizer - Public Functions - 
AverageSparseOptimizer(const OptimizationConfig &optConfig, ParameterOptimizer *optimizer, bool useParameterApply)¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
void catchUpWith(const VectorPtr vecs[], const ParameterConfig &paraConfig, size_t sparseId) const¶
 - 
virtual ParameterOptimizer::TraverseCallback startCatchUpWith() const¶
- The following hooks catch up with the current time for sparse updates. At the beginning, call startCatchUpWith() and check its return value; at the end, call finishCatchUpWith() to commit the state. The callback does the actual work and can be called many times for sparse data. The Trainer calls it like this: auto callback = startCatchUpWith(); if (callback) { /* do the catch-up, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : rows_in_block) { callback(); } } /* finish the catch-up on the main thread */ finishCatchUpWith(); } - Return
- a callback if catch-up is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void finishCatchUpWith()¶
 
- 
 
- 
namespace paddle¶
- 
class ParameterOptimizer¶
- #include <ParameterOptimizer.h> Some member functions are declared const for two reasons: - For sparse-update thread safety: update() and the traverse callback (const this) may be called many times, each time for one row, and these functions can be called in parallel by multiple workers to speed up large blocks.
- For predicate functions: needSpecialTraversal() and startCatchUpWith() may be called many times, and there should be no state change between calls.
 Subclassed by paddle::AdaDeltaParameterOptimizer, paddle::AdagradParameterOptimizer, paddle::AdamaxParameterOptimizer, paddle::AdamParameterOptimizer, paddle::AddOptimizer, paddle::AverageOptimizer, paddle::DecayedAdagradParameterOptimizer, paddle::DummyOptimizer, paddle::OptimizerWithGradientClipping, paddle::OptimizerWithRegularizer, paddle::RMSPropParameterOptimizer, paddle::SgdOptimizer, paddle::SparseMomentumParameterOptimizer Public Types - 
typedef std::function<void(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId)> TraverseCallback¶
 Public Functions - 
ParameterOptimizer(const OptimizationConfig &optConfig)¶
 - 
real calcLearningRate(int64_t numSamplesProcessed, int64_t pass)¶
 - 
virtual ~ParameterOptimizer()¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void startPass()¶
 - 
virtual void finishPass()¶
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual TraverseCallback needSpecialTraversal(const ParameterConfig &config) const¶
- The following hook is useful for sparse updates, because traversing a whole block is costly. Called by the Trainer after update() and before finishBatch(). The Trainer calls it like this: startBatch(); if (dense) { update(blockVec); } else { /* sparse */ for (row : rows_in_block) { update(rowVec); } } auto callback = needSpecialTraversal(); if (callback) { /* do the traversal, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : all_rows_in_block) { callback(); } } } finishBatch(); - Return
- a callback if traversal is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId = -1LU) const = 0¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
virtual TraverseCallback startCatchUpWith() const¶
- The following hooks catch up with the current time for sparse updates. At the beginning, call startCatchUpWith() and check its return value; at the end, call finishCatchUpWith() to commit the state. The callback does the actual work and can be called many times for sparse data. The Trainer calls it like this: auto callback = startCatchUpWith(); if (callback) { /* do the catch-up, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : rows_in_block) { callback(); } } /* finish the catch-up on the main thread */ finishCatchUpWith(); } - Return
- a callback if catch-up is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void finishCatchUpWith()¶
 - 
virtual TraverseCallback apply()¶
- The following two hooks are used by the averager and apply to the final parameter value (PARAMETER_VALUE or PARAMETER_APPLY). - restore() will restore the original value if apply() wrote to PARAMETER_VALUE. The caller must ensure the optimizer is caught up with the current time before calling apply(). - Use the returned callback the same way as the callback returned by ParameterOptimizer::needSpecialTraversal(). 
 - 
virtual TraverseCallback restore()¶
 - 
const std::vector<ParameterType> &getParameterTypes() const¶
- Return the parameter types used by this updater. 
 - 
void addParameterType(ParameterType type)¶
 - 
real getLearningRate() const¶
 - 
virtual void setNoDecay()¶
 Public Static Functions - 
ParameterOptimizer *create(const OptimizationConfig &optConfig, bool inPserver = false)¶
 Protected Types - 
typedef std::vector<ParameterOptimizer::TraverseCallback> TraverseCallbackVec¶
 Protected Attributes - 
bool applyDecay_¶
 - 
const OptimizationConfig &optConfig_¶
 - 
std::vector<ParameterType> parameterTypes_¶
 - 
real learningRate_¶
- Global learning rate; its initial value is opt_config.learning_rate. The sparse regularizer reads this value every batch, after startBatch() has been called; so if the learning rate changes in startBatch(), assign the new value to learningRate_. 
 - 
std::unique_ptr<LearningRateScheduler> learningRateScheduler_¶
 - 
int64_t pass_¶
 - 
bool firstTime_¶
 Protected Static Functions - 
static TraverseCallback composeCallbacks(const TraverseCallbackVec &callbacks)¶
 
 
- 
namespace paddle¶
- 
class OptimizerWithRegularizer¶
- Inherits from paddle::ParameterOptimizer - Subclassed by paddle::OptimizerWithRegularizerEveryNumBatches, paddle::OptimizerWithRegularizerSparse - Public Functions - 
OptimizerWithRegularizer(const OptimizationConfig &optConfig, ParameterOptimizer *optimizer, Regularizer *regularizer)¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void startPass()¶
 - 
virtual void finishPass()¶
 - 
virtual void startBatch(int64_t numSamplesProcessed)¶
- Called by the Trainer before forward() of a batch. 
 - 
virtual void finishBatch()¶
- Called by the Trainer after backward() of a batch. 
 - 
virtual TraverseCallback needSpecialTraversal(const ParameterConfig &config) const¶
- The following hook is useful for sparse updates, because traversing a whole block is costly. Called by the Trainer after update() and before finishBatch(). The Trainer calls it like this: startBatch(); if (dense) { update(blockVec); } else { /* sparse */ for (row : rows_in_block) { update(rowVec); } } auto callback = needSpecialTraversal(); if (callback) { /* do the traversal, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : all_rows_in_block) { callback(); } } } finishBatch(); - Return
- a callback if traversal is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - Public Static Functions - 
ParameterOptimizer *create(const OptimizationConfig &optConfig, const ParameterConfig &paraConfig, bool isParameterSparse, bool inPserver)¶
 - Protected Attributes - 
std::unique_ptr<ParameterOptimizer> optimizer_¶
 - 
Regularizer *regularizer_¶
 - 
int timer_¶
- Counts batches; cleared after catch-up. t (timer_) is the current time, and t0 (t0Vec_) is the last-touched time of row i. If one block is updated by multiple threads, the caller should hash sparse ids to avoid write conflicts in t0Vec_. 
 
- 
 - 
class OptimizerWithRegularizerEveryNumBatches¶
- Inherits from paddle::OptimizerWithRegularizer - Public Functions - 
OptimizerWithRegularizerEveryNumBatches(const OptimizationConfig &optConfig, ParameterOptimizer *optimizer, Regularizer *regularizer)¶
 - 
virtual void startPass()¶
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
virtual ParameterOptimizer::TraverseCallback needSpecialTraversal(const ParameterConfig &config) const¶
- The following hook is useful for sparse updates, because traversing a whole block is costly. Called by the Trainer after update() and before finishBatch(). The Trainer calls it like this: startBatch(); if (dense) { update(blockVec); } else { /* sparse */ for (row : rows_in_block) { update(rowVec); } } auto callback = needSpecialTraversal(); if (callback) { /* do the traversal, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : all_rows_in_block) { callback(); } } } finishBatch(); - Return
- a callback if traversal is needed, otherwise nullptr. The call must not change state.
 
 - 
void doTraversal(const VectorPtr vecs[], const ParameterConfig &config) const¶
 - 
void catchUpWith(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
 - 
virtual ParameterOptimizer::TraverseCallback startCatchUpWith() const¶
- The following hooks catch up with the current time for sparse updates. At the beginning, call startCatchUpWith() and check its return value; at the end, call finishCatchUpWith() to commit the state. The callback does the actual work and can be called many times for sparse data. The Trainer calls it like this: auto callback = startCatchUpWith(); if (callback) { /* do the catch-up, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : rows_in_block) { callback(); } } /* finish the catch-up on the main thread */ finishCatchUpWith(); } - Return
- a callback if catch-up is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void finishCatchUpWith()¶
 - Protected Functions - 
bool isRegularizationBatch(const ParameterConfig &config) const¶
 - Protected Attributes - 
int baseTimer_¶
- Records the timer_ value when catchUpWith() was called. 
 
- 
 - 
class OptimizerWithRegularizerSparse¶
- Inherits from paddle::OptimizerWithRegularizer - Public Functions - 
OptimizerWithRegularizerSparse(const OptimizationConfig &optConfig, ParameterOptimizer *optimizer, Regularizer *regularizer)¶
 - 
virtual void init(size_t numRows, const ParameterConfig *config)¶
- For sparse updates, the optimizer can maintain numRows timers (t0). Some sparse optimizers depend on the parameter config in functions such as startBatch(); the optimizer can obtain the config here. Note, however, that not all callers can pass a config here, so the optimizer should check that the config passed in is not a null pointer. 
 - 
virtual void update(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
- Between startBatch() and finishBatch(), update() is called by the Trainer multiple times, each call updating one Parameter with its gradient in PARAMETER_GRADIENT. sparseId is the row id; when sparseId is set, the update is sparse, one row at a time. 
 - 
void catchUpWith(const VectorPtr vecs[], const ParameterConfig &config, size_t sparseId) const¶
 - 
virtual ParameterOptimizer::TraverseCallback startCatchUpWith() const¶
- The following hooks catch up with the current time for sparse updates. At the beginning, call startCatchUpWith() and check its return value; at the end, call finishCatchUpWith() to commit the state. The callback does the actual work and can be called many times for sparse data. The Trainer calls it like this: auto callback = startCatchUpWith(); if (callback) { /* do the catch-up, maybe multi-threaded */ if (dense) { callback(); } else { /* sparse */ for (row : rows_in_block) { callback(); } } /* finish the catch-up on the main thread */ finishCatchUpWith(); } - Return
- a callback if catch-up is needed, otherwise nullptr. The call must not change state.
 
 - 
virtual void finishCatchUpWith()¶
 - Protected Attributes - 
std::vector<int32_t> t0Vec_¶
- t0Vec_ stores the last-touched time of each row. If one block is updated by multiple threads, the caller should hash sparse ids to avoid write conflicts in t0Vec_. 
 
- 
 