- 12 Aug, 2019 3 commits
-
-
Committed by joanna.wozna.intel
test=develop
-
Committed by chengduo
test=develop
-
Committed by gongweibao
Polish fleet API to support cuda collective mode and nccl2 mode
-
- 11 Aug, 2019 1 commit
-
-
Committed by yaoxuefeng
add save cache model api in fleet & add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871) * add ctr related metric layer test=develop * add save cache and slots shuffle test=develop * add save cache and slots shuffle test=develop * fix error * fix error * fix style for ci * fix for comments * change SlotsShuffle input to std::string for generality * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * fix style * change non-const reference to pointer * fix style * fix style * fix style test=develop * fix style test=develop * add return ins num in ctr metric op * change dtype to float in metric_op.py * fix error test=develop * fix style test=develop * fix API spec * fix API spec * fix API spec test=develop * add UT test=develop
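The "slots shuffle" named in this commit can be read as permuting the values of selected slots across instances while leaving every other slot untouched. The sketch below illustrates that reading in plain Python. It is not Paddle's implementation: the function `slots_shuffle`, the dict-based instance layout, and the slot names `user_id` and `item_id` are assumptions made for illustration. Slots are identified by name strings, in line with the commit's note about taking `std::string` input.

```python
import random

def slots_shuffle(instances, slots, seed=None):
    """Return a copy of `instances` with each named slot's values permuted across instances.

    Hypothetical helper: each instance is assumed to be a dict mapping slot name to value.
    """
    rng = random.Random(seed)
    result = [dict(ins) for ins in instances]  # copy so the input stays unchanged
    for slot in slots:
        values = [ins[slot] for ins in instances]
        rng.shuffle(values)  # permute this slot's values across instances
        for ins, value in zip(result, values):
            ins[slot] = value
    return result

# Illustrative data only.
instances = [
    {"user_id": [1], "item_id": [10], "label": 0},
    {"user_id": [2], "item_id": [20], "label": 1},
    {"user_id": [3], "item_id": [30], "label": 0},
]
shuffled = slots_shuffle(instances, ["item_id"], seed=0)
```

Only the shuffled slot loses its pairing with the labels, so comparing metrics before and after such a shuffle is one way to gauge how much a slot contributes.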
-
- 10 Aug, 2019 1 commit
-
-
Committed by hutuxian
* add a place field in DataFeed to denote which place it will feed data to. * abstract the copy process in CopyToFeedTensor function * add UT for float32 type and for CUDAPlace
-
- 09 Aug, 2019 2 commits
- 08 Aug, 2019 2 commits
-
-
Committed by jiaqi
* fix QueueDataset queue size, set queue size = batch size * 100, to avoid too many instances in channel when training is much slower than reading data.
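The sizing rule above (queue size = batch size * 100) can be illustrated with a bounded queue. This is a minimal sketch using Python's standard `queue.Queue`, not Paddle's channel code; the helper `make_channel` and the batch size of 32 are assumptions chosen for illustration.

```python
import queue

def make_channel(batch_size, factor=100):
    # Capacity rule from the commit message: queue size = batch size * 100.
    return queue.Queue(maxsize=batch_size * factor)

channel = make_channel(batch_size=32)

# Simulate a fast reader while the (slow) trainer consumes nothing.
accepted = 0
try:
    while True:
        channel.put_nowait(accepted)  # raises queue.Full once capacity is reached
        accepted += 1
except queue.Full:
    pass

# `accepted` is the number of instances buffered before the reader has to wait.
print(accepted)
```

With a bounded channel the reader stalls once the buffer is full, so memory held by pending instances stays proportional to the batch size.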
-
Committed by Leo Chen
* fix memory overlapping of fetch var (return of executor.run), test=develop * fix wrong usage of ParallelExecutor in op_test, test=develop * remove useless parameter and simplify code * avoid tensor destruct untimely, test=develop * add testcase independent of OpTest, test=develop
-
- 06 Aug, 2019 1 commit
-
-
Committed by Zeng Jinle
-
- 02 Aug, 2019 4 commits
-
-
Committed by Zeng Jinle
* open gc by default, test=develop * fix test_train_recognize_digits and disable gc when ngraph is enabled, test=develop * fix conditional_block op eager deletion bug, test=develop * add some comments to reviewers, test=develop
-
Committed by jiaqi
* support filelist size < trainer num * pull dense when stop, to make sure local dense params are same as pserver, so save paddle model will save dense model same as pserver * enable QueueDataset train same filelist for several times
-
Committed by chengduo
* Disable fuse optimization test=develop
-
Committed by 石晓伟
* add fusion_seqpool_cvm_concat test=develop * simplify pass, test=develop * fix code style, test=develop
-
- 01 Aug, 2019 1 commit
-
-
Committed by jiaqi
adjust ins weight according to nid slot; user can specify adjust_ins_weight in strategy
-
- 30 Jul, 2019 1 commit
-
- 29 Jul, 2019 2 commits
-
-
Committed by Zeng Jinle
* remove legacy memory optimization codes, test=develop * follow huihuang's comments, test=develop * follow luotao's comments, test=develop
-
Committed by Thunderbrook
* dump slot * test * proto * dump slot * test * proto * code style * code style * code style * style * add delete after unseen days * add unseen days * code style * conflict solve test=develop * add clear model * code style test=develop * code style test=develop
-
- 27 Jul, 2019 1 commit
-
-
Committed by chengduo
* open fuse optimization ops test=develop
-
- 26 Jul, 2019 1 commit
-
-
Committed by Zeng Jinle
* first version memory optimize pass, test=develop * remove move_tensor_sharing_pass, test=develop * refine code comments, add unittests, test=develop * turn off memory_optimize by default, test=develop * follow huihuang's comments, test=develop * follow chengduoZH's comments, test=develop * fix grammar error, add const qualifier, fix pass_test exception message, test=develop * follow chengduoZH's comments 2nd, test=develop
-
- 25 Jul, 2019 1 commit
-
-
Committed by fuyinno4
Fix FleetWrapper: 1. fix shrink dense: just scale show 2. add datanorm scale: divide datanorm's gradient by batch_size
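The second item, dividing datanorm's gradient by batch_size, amounts to an element-wise scaling. The sketch below shows only that arithmetic on a plain list; `scale_datanorm_grad` is a hypothetical name and the values are illustrative, not taken from FleetWrapper.

```python
def scale_datanorm_grad(grad, batch_size):
    """Divide each gradient element by the batch size (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [g / batch_size for g in grad]

# Illustrative values only.
scaled = scale_datanorm_grad([8.0, -4.0, 2.0], batch_size=4)
print(scaled)
```

Scaling by the batch size keeps the magnitude of the update independent of how many instances contributed to the accumulated gradient.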
-
- 24 Jul, 2019 2 commits
-
-
Committed by Zhaolong Xing
* update paddle-trt for: 1. fix bug: when batch > 2, core in split plugin. 2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.) 3. add new attr to dropout. 4. shuffle channel, swish, relu6 support test=develop * 1. fix ci test=develop
-
Committed by Thunderbrook
The change includes 2 things: 1. Saving the delta model and shrinking the table were controlled by the same parameter before; now delete_after_unseen_days is added to control shrinking the table. 2. Values in the sparse table had no slot before; now a slot is added to the sparse table, and DownpourCtrAccessor is added to support the new meta. test=develop
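One way to picture a shrink controlled by `delete_after_unseen_days` is a filter over sparse-table entries that each record their slot and how long they have gone unseen. The sketch below is an assumption-laden illustration, not the pslib accessor: the dict layout, the `unseen_days` and `show` fields, the helper `shrink_table`, and the choice to keep entries whose unseen days equal the threshold are all hypothetical.

```python
def shrink_table(table, delete_after_unseen_days):
    """Keep only entries seen recently enough (hypothetical helper).

    Assumption: an entry is dropped when its unseen_days exceeds the threshold.
    """
    return {
        key: value
        for key, value in table.items()
        if value["unseen_days"] <= delete_after_unseen_days
    }

# Illustrative sparse table: feature id -> value that now carries its slot.
sparse_table = {
    1001: {"slot": 6, "unseen_days": 0, "show": 12.0},
    1002: {"slot": 6, "unseen_days": 45, "show": 1.0},
    1003: {"slot": 9, "unseen_days": 30, "show": 3.0},
}
kept = shrink_table(sparse_table, delete_after_unseen_days=30)
print(sorted(kept))
```

Keeping this threshold separate from the delta-save setting lets the two behaviors be tuned independently, which is the point of the first item in the commit message.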
-
- 23 Jul, 2019 2 commits
-
-
Committed by jiaqi
(1) support patch data (merge slots of instances of same line id, modify dense layer which changes its size) (2) add fleet load_one_table interface, support load from paddle model and load from pslib model (3) fix push sparse bug which caused push sparse to cost more time (about 10% in my testcase) (4) when some slots are not in one of your networks (join/update, etc.), data feed, collect label info, and push/pull sparse will skip these slots instead of throwing an error. (5) add more debug info in TrainFilesWithProfiler
-
Committed by chengduo
* support sparse gradients test=develop
-
- 19 Jul, 2019 1 commit
-
-
Committed by Huihuang Zheng
Test PaddingRNN on V100 GPU device. Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU. GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR). Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR).
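The figures above can be turned into relative changes with a small calculation. The numbers come from the commit message; the helper `relative_change` is just an illustrative name.

```python
def relative_change(new, old):
    """Relative change of `new` with respect to `old` (negative means a reduction)."""
    return (new - old) / old

mem_change = relative_change(6414, 6837)      # GPU memory in MiB
speed_change = relative_change(10.28, 9.89)   # speed in steps/s

print(f"GPU memory: {mem_change:+.1%}")  # GPU memory: -6.2%
print(f"Speed: {speed_change:+.1%}")     # Speed: +3.9%
```

So the PR uses roughly 6% less GPU memory and runs roughly 4% faster in this single-GPU configuration.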
-
- 18 Jul, 2019 1 commit
-
-
Committed by Zeng Jinle
* feature/auto_growth_allocator, test=develop * add unittest of AlignedAllocator, test=develop * try to turn on auto_growth to test on CI, test=develop * fix segmentation fault in mixed_vector.h, test=develop * add unittests, test=develop
-
- 17 Jul, 2019 1 commit
-
-
Committed by guru4elephant
* remove async executor and add data_feed.proto to the deps of train demo
-
- 16 Jul, 2019 1 commit
-
-
Committed by chengduo
test=develop
-
- 12 Jul, 2019 2 commits
-
-
Committed by Leo Zhao
* not use transferscope cache in cpu case test=develop * adjust variable name and add comments test=develop * use correct format for class member in operator.h * use correct format for class member in operator.cc test=develop
-
Committed by 123malin
* fix int64_t * update fill constant op unittest * add empty line
-
- 11 Jul, 2019 2 commits
-
-
Committed by gongweibao
-
Committed by Zeng Jinle
* feature/buffer_shared_inplace, test=develop * refine code, test=develop * fix elementwise_add op cpu inplace and sum inplace bug, test=develop * add unittest and debug log, test=develop * fix parallel_executor scope bug, polish code, test=develop * fix sum op, activation op, single_in_place_inference bug, test=develop * remove kLocalExecScopeName, test=develop * fix unittest, test=develop * fix out_var first version bug, test=develop * follow comments, test=develop
-
- 10 Jul, 2019 1 commit
-
-
Committed by Zeng Jinle
* clean code of dim and place, test=develop * fix failed unittests, test=develop
-
- 09 Jul, 2019 1 commit
-
-
Committed by Jiabin Yang
* test=develop, fix docker with paddle nccl problem * test=develop, fix/gcc_4.8_ubt_link_error * test=develop, fix code format
-
- 08 Jul, 2019 3 commits
-
-
Committed by Zhaolong Xing
* Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop
-
Committed by Leo Zhao
-
Committed by gongweibao
-
- 04 Jul, 2019 1 commit
-
-
Committed by chengduo
-
- 03 Jul, 2019 1 commit
-
-
Committed by pkpk
test=develop
-