- 24 9月, 2019 6 次提交
-
-
由 whs 提交于
1. Support customize eval function instead of eval program. 2. Fix loading checkpoint in quantization strategy. 3. Support saving eval model when saving a checkpoint. 4. Fix decoder of loading context in PaddleSlim. 5. Fix restoring from the checkpoint of uniform prune strategy. 6. Support saving eval model and infer model during training. 7. Add ‘unitest’ for saving eval model, saving infer model and uniform pruning restoring from the checkpoint. 8. Fix pruning of depthwise_conv_grad op by updating the groups.
-
由 xujiaqi01 提交于
* support change shuffle thread num * support change train thread num * fix receive shuffle data of each channel * data norm stop gradient * add check thread_tensor type and root_tensor type when merge metric * remove sleep in shuffle, add config * add config of pslib client to client communication * fix xbox str * add data norm op testcase * add flush in trainer finalize
-
由 Kaipeng Deng 提交于
-
由 Ghost Under Moon 提交于
* give warnings when save a model without any parameters test=develop * delete one line comment test=develop
-
由 Zeng Jinle 提交于
* add py_reader combination unittest,test=develop * follow huihuang's comments, test=develop
-
由 Leo Chen 提交于
* make OpTest check grad inplace even if forward has no inplace, test=develop * do not run PE when enable_inplace is False, test=develop * add conv3d cuda kernel for float16 type, test=develop * refactor OpTest for inplace, test=develop * add comments, test=develop
-
- 23 9月, 2019 10 次提交
-
-
由 juncaipeng 提交于
* add fake_quant_dequant_op for average pool2d * add test
-
由 Zhang Ting 提交于
-
由 mapingshuo 提交于
* add recompute based checkpoints methods for large batch training test=develop * add append_backward_with_forward_recomputation test=develop * refine optimizer test=develop * update backward and optimizer test=develop * make Variable usable test=develop * add recompute code * refine optimizer test=develop * refine addup _append_backward_ops_with_checkpoints_ 1) for recompute part, just cache the grad_op_desc without appending to block 2) before appending grad_op_desc to backward part, addup_repetitive_vars, remove unused branch test=develop * make method private * add recompute strategy into DistributedStrategy test=develop * checkpoint version3 test=develop * remove some print information test=develop * remove unused sumop test=develop * try to fix recompute with graph building modules * add input names to vars should be held * add memory debug tool * backup backward * Fix bugs * add backward desc for op not in any segments * add exception info for sub_block test=develop * modify code style test=develop * modify code style test=develop * remove print functions test=develop * add API spec test=develop test=document_preview * make Recompute a child class of Optimizer test=develop test=document_preview * add API spec test=develop test=document_preview * modify API spec test=develop test=document_preview * add document for Recompute test=develop test=document_preview * change API doc of Rcompute test=develop test=document_preview * code cleaning test=develop test=document_preview * modify API spec * fix bugs when segments hold no element * add testcase for Recompute Optimizer test=develop test=document_preview * add test for apply_gradient, and code cleaning test=develop test=document_preview * add test case for load function * enable CI test=develop test=document * add test case test=develop test=document_preview * add sample code for 4 function of recompute optimizer test=develop test=document_preview
-
由 chengduo 提交于
* Add RecordHistoryLocalExecScopes test=develop
-
由 Ghost Under Moon 提交于
-
由 wopeizl 提交于
* optimize the error information when the input for while op has a wrong shape test=develop
-
由 ruri 提交于
* add mse_loss op
-
由 Tao Luo 提交于
* move tree_conv to fluid.contrib.layers test=develop * update API.spec for tree_conv test=develop * update tree_conv api to increase unit coverage test=develop
-
由 Zeng Jinle 提交于
* unify DataLoader APIs, test=develop * integrate iterable CPU Dataset, test=develop add GPU dataset supporting, test=develop * add unittests for dataset, test=develop * add more docs to dataloader apis, test=develop, test=document_preview * refine doc, test=develop * refine doc again, test=develop * increase coverage, test=develop
-
由 tangwei12 提交于
* optimize cloud rolemaker, test=develop
-
- 22 9月, 2019 1 次提交
-
-
由 lvmengsi 提交于
* add instance norm op
-
- 21 9月, 2019 4 次提交
-
-
由 Adam 提交于
* Initial, functional commit * Clean commit related files test=develop
-
由 Jiabin Yang 提交于
* refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * test=develop, refoctor name to make it easier to understand * test=develop, refoctor name to make it easier to understand * test=develop, fix multi-gpu failed problem , add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ * test=develop, fix ut failed on parallel se-resnext * test=develop, change one more PADDLE_ENFORCE * support auto prune in dygraph mode * test=develop, support auto prune * test=develop, merge develop conflict * test=develop, fix test_layer and test_tracer ut * test=develop, fix bug which may cause stop_gradient disabled with a list of backward inputs
-
由 Aurelius84 提交于
-
由 Zeng Jinle 提交于
-
- 20 9月, 2019 7 次提交
-
-
由 Zeng Jinle 提交于
-
由 Aurelius84 提交于
* support 2-level lod of input in sequence_pool test=develop * fix lod level bug in .cu test=develop
-
由 chengduo 提交于
test=developt
-
由 Zhang Ting 提交于
1. group_norm support data_layout=NHWC 2. modified doc of group_norm
-
由 Zhang Ting 提交于
modified interpolate_op to support tensor attribute 1. the parameter out_shape of image_resize、resize_nearest/bilinear/trilinear can be a list or a 1-D tensor variable. If a list, each element can be an integer or a tensor variable with shape: [1]. 2. the parameter scale of above Ops can be a 1-D tensor variable. modified document of image_resize, resize_nearest, resize_bilinear, resize_trilinear and add some code example.
-
由 Zhang Ting 提交于
add crop_tensor op. The main difference with crop is : 1. If the argument shape is a list, each element is an integer or a tensor variable with shape: [1]. This way is suitable for the case that the shape may be changed each iteration. 2. If the argument shape is a variable. Its rank must be 1. In crop op, the rank of shape must be the same as x offsets can be a list, in which each element is an integer or a tensor variavle with shape: [1].
-
由 chengduo 提交于
test=develop
-
- 19 9月, 2019 9 次提交
-
-
由 flame 提交于
-
由 Aurelius84 提交于
* Remove constraint that last dimension is forced to be 1 in cross_entropy test=develop * modify labels last dims test=develop
-
由 gongweibao 提交于
change _origin_program test=develop
-
由 wopeizl 提交于
* add precise roi pooling op test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * detail the description test=develop * test=develop * elaborate the doc for return type test=develop * test=develop
-
由 Yiqun Liu 提交于
* Add fc_elementwise_layernorm_fuse pass and unittest. * Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop * Apply fc_elementwise_layernorm_fuse_pass to GPU inference. * Add the setting of attrs in the definition of binary_op. test=develop * Add comment. * Implement the unittest. test=develop * Change the unittest name of layer_norm. test=develop
-
由 WangXi 提交于
distribute.launch use poll to query subprocess
-
由 chengduo 提交于
* Fix std::ostream& operator<<(std::ostream& os, const Tensor& t) test=develop * Fix test_dygraph_mnist_fp16 test=develop * disable test_dygraph_mnist_fp16 test=develop * revert tensor_util.cc fix test=develop
-
由 Jie Fang 提交于
Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus
-
由 wangchaochaohu 提交于
* strided_slice op basic function test=develop * test=develop rewrite and fix * fix bug test=develop * fix for the PADDLE_ENFORCE usage * add some unit testw * fix for the aip test and copright and fix test=develop * fix API.spec test=develop * fix API.spec test=develop * add axis parameter test=develop * fix for the build error test=develop * fix python api test=develop * fix the build test=develop * fix build test=develop * fix API spec test=develop * test=develop add some comment and single op test * fix API spece test=develop * fix test=develop * fix test=develop * fix api test=develop * fix api test=develop * fix API.spec test=develop * fix typo test=develop * fix API.spec test=develop * fix API typo test=develop * fix doc and API.spec test=develop
-
- 18 9月, 2019 3 次提交
-
-
由 Zeng Jinle 提交于
-
由 Huihuang Zheng 提交于
-
由 123malin 提交于
* rpc retry for asycsend/get/prefetch * test=develop, change retry vlog level to 3 * test=develop, set default grpc_retry_times is 3
-