1. 24 Sep, 2019 8 commits
    • Y
      Add float16 support to `sync_batch_norm_op` (#19681) · ebff68fa
      Yang Zhang committed
      * Add float16 support to `sync_batch_norm_op`
      
      test=develop
      
      * Add test for sync_bn with FP16 input
      
      test=develop
      ebff68fa
    • A
      Remove constraint that last dimension is forced to be 1 by adding lookup_table_v2 (#19735) · 039b9710
      Aurelius84 committed
      * Remove constraint that last dimension is forced to be 1 by add
      lookup_table_v2 test=develop
      
      * modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop
      
      * Revert "modify into PADDLE_ENFORCE_CUDA_SUCCESS test=develop"
      
      This reverts commit 8a960bfc61e51aa27c3c529df8fb90b93ebd19f9.
      
      * move api into fluid.embedding test=develop
      
      * fix example code test=develop
      
      * move one_hot into fluid.one_hot
      
      * modify api.spec test=develop
      
      * fix loss shape test=develop
      039b9710
    • W
      [PaddleSlim] Enhance compressor api in PaddleSlim (#19894) · bdb3e376
      whs committed
      
      1. Support a customized eval function instead of an eval program.
      2. Fix loading checkpoint in quantization strategy.
      3. Support saving eval model when saving a checkpoint.
      4. Fix decoder of loading context in PaddleSlim.
      5. Fix restoring from the checkpoint of uniform prune strategy.
      6. Support saving eval model and infer model during training.
      7. Add unit tests for saving the eval model, saving the infer model, and restoring uniform pruning from the checkpoint.
      8. Fix pruning of depthwise_conv_grad op by updating the groups.
      bdb3e376
    • X
      support change shuffle and train thread num (#19841) · cedc0477
      xujiaqi01 committed
      * support change shuffle thread num
      * support change train thread num
      * fix receive shuffle data of each channel
      * data norm stop gradient
      * add check thread_tensor type and root_tensor type when merge metric
      * remove sleep in shuffle, add config
      * add config of pslib client to client communication
      * fix xbox str
      * add data norm op testcase
      * add flush in trainer finalize
      cedc0477
    • K
      14625ffe
    • G
      give warnings when save a model without any parameters (#19931) · 790d5226
      Ghost Under Moon committed
      * give warnings when save a model without any parameters test=develop
      
      * delete one line comment test=develop
      790d5226
    • Z
      Add py_reader combination unittest (#19923) · f254b477
      Zeng Jinle committed
      * add py_reader combination unittest,test=develop
      
      * follow huihuang's comments, test=develop
      f254b477
    • L
      Make OpTest check grad inplace even if forward has no inplace (#19847) · 57606205
      Leo Chen committed
      * make OpTest check grad inplace even if forward has no inplace, test=develop
      
      * do not run PE when enable_inplace is False, test=develop
      
      * add conv3d cuda kernel for float16 type, test=develop
      
      * refactor OpTest for inplace, test=develop
      
      * add comments, test=develop
      57606205
  2. 23 Sep, 2019 10 commits
    • J
      add fake_quant_dequant_op for average pool2d, test=develop (#19880) · b0ceed6f
      juncaipeng committed
      * add fake_quant_dequant_op for average pool2d
      * add test
      b0ceed6f
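The commit above names a fake quantize-dequantize op but the log does not describe its arithmetic. As a rough illustration only, the sketch below rounds values onto a signed integer grid and maps them straight back to floats. The abs-max scaling scheme, the `bits` parameter, and the function name are assumptions made for this example and are not taken from the commit.

```python
def fake_quant_dequant_abs_max(values, bits=8):
    """Quantize to a signed integer grid and immediately dequantize.

    Assumption: the scale is the largest absolute value in `values`
    (abs-max scaling). The commit log does not state which scheme is used.
    """
    scale = max(abs(v) for v in values)
    if scale == 0:
        return list(values)  # all zeros: nothing to quantize
    levels = (1 << (bits - 1)) - 1  # 127 for 8 bits
    # round() snaps each value to the nearest integer level; multiplying
    # back by scale / levels returns to the original float range.
    return [round(v / scale * levels) * scale / levels for v in values]


inputs = [0.3, -1.0, 0.26]
outputs = fake_quant_dequant_abs_max(inputs)
print(outputs)
```

Each output differs from its input by at most about half a quantization step, which is how an op of this kind can expose quantization error during training while keeping tensors in floating point.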
    • M
      Forward recompute3 (#19913) · 9901f696
      mapingshuo committed
      * add recompute based checkpoints methods for large batch training
      test=develop
      
      * add append_backward_with_forward_recomputation
      test=develop
      
      * refine optimizer
      test=develop
      
      * update backward and optimizer
      test=develop
      
      * make Variable usable
      test=develop
      
      * add recompute code
      
      * refine optimizer
      test=develop
      
      * refine addup _append_backward_ops_with_checkpoints_
      1) for recompute part, just cache the grad_op_desc without appending to block
      2) before appending grad_op_desc to backward part, addup_repetitive_vars, remove unused branch
      test=develop
      
      * make method private
      
      * add recompute strategy into DistributedStrategy
      test=develop
      
      * checkpoint version3
      test=develop
      
      * remove some print information
      test=develop
      
      * remove unused sumop
      test=develop
      
      * try to fix recompute with graph building modules
      
      * add input names to vars should be held
      
      * add memory debug tool
      
      * backup backward
      
      * Fix bugs
      
      * add backward desc for op not in any segments
      
      * add exception info for sub_block
      
      test=develop
      
      * modify code style
      
      test=develop
      
      * modify code style
      
      test=develop
      
      * remove print functions
      
      test=develop
      
      * add API spec
      
      test=develop
      test=document_preview
      
      * make Recompute a child class of Optimizer
      
      test=develop
      test=document_preview
      
      * add API spec
      
      test=develop
      test=document_preview
      
      * modify API spec
      
      test=develop
      test=document_preview
      
      * add document for Recompute
      
      test=develop
      test=document_preview
      
      * change API doc of Recompute
      
      test=develop
      test=document_preview
      
      * code cleaning
      
      test=develop
      test=document_preview
      
      * modify API spec
      
      * fix bugs when segments hold no element
      
      * add testcase for Recompute Optimizer
      
      test=develop
      test=document_preview
      
      * add test for apply_gradient, and code cleaning
      
      test=develop
      test=document_preview
      
      * add test case for load function
      
      * enable CI
      
      test=develop
      test=document
      
      * add test case
      
      test=develop
      test=document_preview
      
      * add sample code for 4 functions of the recompute optimizer
      
      test=develop
      test=document_preview
      9901f696
    • C
      Delete local execution scopes (#19749) · d7251a8e
      chengduo committed
      * Add RecordHistoryLocalExecScopes
      test=develop
      d7251a8e
    • W
      optimize the error information when the input for while op has a wron… (#19872) · e606b175
      wopeizl committed
      * optimize the error information when the input for while op has a wrong shape test=develop
      e606b175
    • R
      add mse_loss (#19759) · d31c92a2
      ruri committed
      * add mse_loss op
      d31c92a2
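For readers unfamiliar with the quantity the `mse_loss` commit refers to, mean squared error is the average of the squared differences between predictions and labels. The sketch below computes it in plain Python. It is not the op added by the commit, and the function signature is an assumption for illustration.

```python
def mse_loss(predictions, labels):
    """Return the mean of squared differences between two equal-length sequences."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not predictions:
        raise ValueError("inputs must be non-empty")
    squared_errors = [(p - y) ** 2 for p, y in zip(predictions, labels)]
    return sum(squared_errors) / len(squared_errors)


loss = mse_loss([1.0, 2.0, 3.0], [1.0, 4.0, 5.0])
print(loss)
```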
    • T
      move tree_conv to fluid.contrib.layers (#19918) · a4919d36
      Tao Luo committed
      * move tree_conv to fluid.contrib.layers
      
      test=develop
      
      * update API.spec for tree_conv
      
      test=develop
      
      * update tree_conv api to increase unit coverage
      
      test=develop
      a4919d36
    • Z
      Unify DataLoader APIs (#19305) · 0436efd6
      Zeng Jinle committed
      * unify DataLoader APIs, test=develop
      
      * integrate iterable CPU Dataset, test=develop
      add GPU dataset supporting, test=develop
      
      * add unittests for dataset, test=develop
      
      * add more docs to dataloader apis, test=develop, test=document_preview
      
      * refine doc, test=develop
      
      * refine doc again, test=develop
      
      * increase coverage, test=develop
      0436efd6
    • T
      paddle cloud role maker fix (#19646) · 278dd003
      tangwei12 committed
      * optimize cloud rolemaker, test=develop
      278dd003
  3. 22 Sep, 2019 1 commit
  4. 21 Sep, 2019 4 commits
    • A
      Add support for other axes in MKLDNN softmax op (#19907) · cb65439d
      Adam committed
      * Initial, functional commit
      
      * Clean commit related files
      test=develop
      cb65439d
    • J
      Feature/auto prune in dygraph (#19757) · 45425411
      Jiabin Yang committed
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLEENFORCE to PADDLEENFORCE_EQ
      
      * test=develop, fix ut failed on parallel se-resnext
      
      * test=develop, change one more PADDLE_ENFORCE
      
      * support auto prune in dygraph mode
      
      * test=develop, support auto prune
      
      * test=develop, merge develop conflict
      
      * test=develop, fix test_layer and test_tracer ut
      
      * test=develop, fix bug which may cause stop_gradient disabled with a list of backward inputs
      45425411
    • Z
      e2372750
  5. 20 Sep, 2019 8 commits
  6. 19 Sep, 2019 9 commits
    • F
      hide with inference optim API (#17355) · fe18cfdb
      flame committed
      fe18cfdb
    • A
      Remove constraint that last dimension is forced to be 1 in cross_entropy (#19606) · b125e327
      Aurelius84 committed
      * Remove constraint that last dimension is forced to be 1 in cross_entropy
      test=develop
      
      * modify labels last dims test=develop
      b125e327
    • G
      change _origin_program test=develop (#19863) · e8d3745c
      gongweibao committed
      change _origin_program test=develop
      e8d3745c
    • W
      add precise roi pooling op test=develop (#18960) · a7c440d3
      wopeizl committed
      * add precise roi pooling op test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * test=develop
      
      * detail the description test=develop
      
      * test=develop
      
      * elaborate the doc for return type test=develop
      
      * test=develop
      a7c440d3
    • Y
      Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6
      Yiqun Liu committed
      * Add fc_elementwise_layernorm_fuse pass and unittest.
      
      * Add fused_fc_elementwise_layernorm op and its GPU kernel.
      test=develop
      
      * Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
      
      * Add the setting of attrs in the definition of binary_op.
      test=develop
      
      * Add comment.
      
      * Implement the unittest.
      test=develop
      
      * Change the unittest name of layer_norm.
      test=develop
      3cd985a6
    • W
      distribute.launch use poll to query subprocess (#19853) · 8c2c8dc6
      WangXi committed
      distribute.launch use poll to query subprocess
      8c2c8dc6
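The commit above says only that the launcher queries its subprocesses with poll. A minimal sketch of that general pattern with the standard library follows: `Popen.poll()` returns `None` while a child is still running, so a loop can watch several children without blocking on any single one. The helper name `wait_all` and the polling interval are hypothetical and do not come from the commit.

```python
import subprocess
import sys
import time


def wait_all(procs, interval=0.05):
    """Poll every child until all have exited; return exit codes in order."""
    codes = [None] * len(procs)
    while any(code is None for code in codes):
        for i, proc in enumerate(procs):
            if codes[i] is None:
                codes[i] = proc.poll()  # None while the child is still running
        if any(code is None for code in codes):
            time.sleep(interval)  # avoid a busy loop between polls
    return codes


# Start two short-lived children that exit with known codes.
procs = [
    subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(%d)" % code])
    for code in (0, 3)
]
exit_codes = wait_all(procs)
print(exit_codes)
```

Compared with calling `wait()` on each child in turn, polling makes it possible to notice a failed worker early even when an earlier worker in the list is still running.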
    • C
      Disable test_dygraph_mnist_fp16.py (#19844) · 8e927327
      chengduo committed
      * Fix std::ostream& operator<<(std::ostream& os, const Tensor& t)
      test=develop
      
      * Fix test_dygraph_mnist_fp16
      test=develop
      
      * disable test_dygraph_mnist_fp16
      test=develop
      
      * revert tensor_util.cc fix
      test=develop
      8e927327
    • J
      Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus. (#19714) · d9db94d7
      Jie Fang committed
      Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus
      d9db94d7
    • W
      Strided slice (#19642) · 47af618f
      wangchaochaohu committed
      * strided_slice op basic function test=develop
      
      * test=develop rewrite and fix
      
      * fix bug test=develop
      
      * fix for the PADDLE_ENFORCE usage
      
      * add some unit tests
      
      * fix for the API test and copyright and fix test=develop
      
      * fix API.spec test=develop
      
      * fix API.spec test=develop
      
      * add axis parameter test=develop
      
      * fix for the build error test=develop
      
      * fix python api  test=develop
      
      * fix the build test=develop
      
      * fix build test=develop
      
      * fix API spec test=develop
      
      * test=develop add some comment and single op test
      
      * fix API spec test=develop
      
      * fix test=develop
      
      * fix test=develop
      
      * fix api test=develop
      
      * fix api test=develop
      
      * fix API.spec test=develop
      
      * fix typo test=develop
      
      * fix API.spec test=develop
      
      * fix API typo test=develop
      
      * fix doc and API.spec test=develop
      47af618f
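The strided slice commit above mentions an axis parameter but the log gives no signature. As an illustration of the general idea, the sketch below applies a start, end and stride to selected axes of a nested list, following Python slice semantics. The argument names `axes`, `starts`, `ends` and `strides` are assumptions for this example and are not confirmed by the commit log.

```python
def strided_slice(data, axes, starts, ends, strides):
    """Slice a nested list along the given axes with Python slice semantics."""
    # Map each listed axis to its slice; axes not listed are kept whole.
    spec = {
        axis: slice(start, end, stride)
        for axis, start, end, stride in zip(axes, starts, ends, strides)
    }

    def apply(node, depth):
        if not isinstance(node, list):
            return node  # reached a scalar element
        picked = node[spec[depth]] if depth in spec else node
        return [apply(child, depth + 1) for child in picked]

    return apply(data, 0)


matrix = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
# Every second row, and every second column starting from column 1.
result = strided_slice(matrix, axes=[0, 1], starts=[0, 1], ends=[3, 4], strides=[2, 2])
print(result)
```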