1. 19 Sep, 2019 5 commits
    • Y
      Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6
      Yiqun Liu committed
      * Add fc_elementwise_layernorm_fuse pass and unittest.
      
      * Add fused_fc_elementwise_layernorm op and its GPU kernel.
      test=develop
      
      * Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
      
      * Add the setting of attrs in the definition of binary_op.
      test=develop
      
      * Add comment.
      
      * Implement the unittest.
      test=develop
      
      * Change the unittest name of layer_norm.
      test=develop
      3cd985a6
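
For orientation, here is a minimal reference sketch of the computation that the fc + elementwise_add + layer_norm pattern named in this commit presumably collapses into one op. It is written in plain Python rather than as a GPU kernel. The function names, argument layout, and the `eps` default are illustrative assumptions and are not taken from the Paddle source.

```python
import math


def fc(x, w, b):
    """Fully connected layer: x is [rows][in], w is [in][out], b is [out]."""
    return [
        [sum(xi * w[k][j] for k, xi in enumerate(row)) + b[j] for j in range(len(b))]
        for row in x
    ]


def layer_norm(row, scale, bias, eps=1e-5):
    """Normalize one row to zero mean and unit variance, then scale and shift."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * s + t for v, s, t in zip(row, scale, bias)]


def fc_elementwise_layernorm_reference(x, w, b, y, scale, bias, eps=1e-5):
    """Hypothetical unfused reference: layer_norm(fc(x) + y), row by row."""
    fc_out = fc(x, w, b)
    added = [[a + c for a, c in zip(r1, r2)] for r1, r2 in zip(fc_out, y)]
    return [layer_norm(r, scale, bias, eps) for r in added]
```

A fused kernel would produce the same values without materializing the intermediate `fc_out` and `added` tensors, which is the usual motivation for this kind of pass.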
    • W
      distribute.launch use poll to query subprocess (#19853) · 8c2c8dc6
      WangXi committed
      distribute.launch use poll to query subprocess
      8c2c8dc6
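
The commit message only states that the launcher now uses poll to query its subprocesses. The sketch below shows the general technique with the standard library's `subprocess.Popen.poll`. It is not the code from `distribute.launch`; the helper names `launch` and `wait_all` and the stop-on-first-failure policy are assumptions made for illustration.

```python
import subprocess
import sys
import time


def launch(snippets):
    """Start one Python child process per code snippet (illustrative helper)."""
    return [subprocess.Popen([sys.executable, "-c", s]) for s in snippets]


def wait_all(procs, interval=0.05):
    """Poll children without blocking; on the first failure, stop the rest.

    Returns 0 if every child exited cleanly, otherwise the first non-zero
    exit code that was observed.
    """
    alive = list(procs)
    while alive:
        still_running = []
        for p in alive:
            ret = p.poll()  # None while the child is still running
            if ret is None:
                still_running.append(p)
            elif ret != 0:
                # One worker failed: terminate the others so none is left hanging.
                for other in procs:
                    if other.poll() is None:
                        other.terminate()
                for other in procs:
                    other.wait()
                return ret
        alive = still_running
        if alive:
            time.sleep(interval)
    return 0
```

Compared with calling `wait()` on each child in order, polling lets the parent notice a failed worker even while an earlier worker is still running.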
    • C
      Disable test_dygraph_mnist_fp16.py (#19844) · 8e927327
      chengduo committed
      * Fix std::ostream& operator<<(std::ostream& os, const Tensor& t)
      test=develop
      
      * Fix test_dygraph_mnist_fp16
      test=develop
      
      * disable test_dygraph_mnist_fp16
      test=develop
      
      * revert tensor_util.cc fix
      test=develop
      8e927327
    • J
      Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus. (#19714) · d9db94d7
      Jie Fang committed
      Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus
      d9db94d7
    • W
      Strided slice (#19642) · 47af618f
      wangchaochaohu committed
      * strided_slice op basic function test=develop
      
      * test=develop rewrite and fix
      
      * fix bug test=develop
      
      * fix for the PADDLE_ENFORCE usage
      
      * add some unit tests
      
      * fix for the API test and copyright and fix test=develop
      
      * fix API.spec test=develop
      
      * fix API.spec test=develop
      
      * add axis parameter test=develop
      
      * fix for the build error test=develop
      
      * fix python api  test=develop
      
      * fix the build test=develop
      
      * fix build test=develop
      
      * fix API spec test=develop
      
      * test=develop add some comment and single op test
      
      * fix API spec test=develop
      
      * fix test=develop
      
      * fix test=develop
      
      * fix api test=develop
      
      * fix api test=develop
      
      * fix API.spec test=develop
      
      * fix typo test=develop
      
      * fix API.spec test=develop
      
      * fix API typo test=develop
      
      * fix doc and API.spec test=develop
      47af618f
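
As a rough illustration of what a strided slice with an axis parameter computes, the sketch below applies a start, end, and stride per selected axis to nested Python lists. It uses Python's own slice semantics; how the actual `strided_slice` op treats negative or out-of-range indices is not stated in the commit and may differ. The function name and signature here are assumptions.

```python
def strided_slice(data, axes, starts, ends, strides):
    """Slice nested lists along the given axes, like data[start:end:stride] per axis."""
    spec = {
        axis: slice(start, end, stride)
        for axis, start, end, stride in zip(axes, starts, ends, strides)
    }

    def recurse(node, depth):
        if not isinstance(node, list):
            return node  # reached a scalar element
        picked = node[spec[depth]] if depth in spec else node
        return [recurse(child, depth + 1) for child in picked]

    return recurse(data, 0)


data = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
# Every second row and every second column.
print(strided_slice(data, axes=[0, 1], starts=[0, 0], ends=[3, 4], strides=[2, 2]))
# → [[0, 2], [8, 10]]
```

Axes that are not listed in `axes` are left untouched, which is what distinguishes this from a plain slice over all dimensions.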
  2. 18 Sep, 2019 11 commits
  3. 17 Sep, 2019 20 commits
    • A
      Add MKLDNNhandlerT templatized class (#19801) · dfdd73cb
      Adam committed
      test=develop
      dfdd73cb
    • Z
      cabb9501
    • P
      zerocopytensor support uint8, analysis config support profile, analysis... · 9cbc1eff
      Pei Yang committed
      zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
      
      9cbc1eff
    • C
      add deformable conv v1 op and cpu version of deformable conv v2 (#18500) · 00efd1d8
      chengjuntao committed
      * add deformable conv v1 op, test=develop
      00efd1d8
    • T
      rm return in vfork (#19734) · 40c66f8d
      Thunderbrook committed
      * rm return in vfork
      
      * rm return in vfork
      test=develop
      40c66f8d
    • C
      Add fp16 support for dygraph (#19828) · b99fc38c
      chengduo committed
      * Add fp16 support for dygraph
      test=develop
      
      * Add unit test
      test=develop
      b99fc38c
    • Z
      fix memory optimization type (#19781) · 110be57c
      Zhaolong Xing committed
      test=develop
      110be57c
    • L
      Enhance OpTest to support double grad inplace check (#19826) · 5fbf03d6
      Leo Chen committed
      * update OpTest to support double grad inplace check, test=develop
      
      * keep consistency of _calc_output function, test=develop
      5fbf03d6
    • X
      fix libps.so path problem (#19768) · 6045541e
      xujiaqi01 committed
      * fix libps.so path problem of  1/2/3 dir and third_party
      * test = develop
      6045541e
    • L
      fix pow op, support tensor for argument factor. (#19313) · 677e7144
      liym27 committed
      improve pow op according to reviews:
      1. Delete unnecessary judgement statements in PowGradOpDescMaker;
      2. Improve test of test_api;
      
      overload GetKernelTypeForVar
      
      add stop_gradient=True when attr(factor) is tensor Variable, change examples in API pow.
      test=develop,test=document_preview
      677e7144
    • L
      add tensor support for argument shape in reshape op; (#19268) · bd89a273
      liym27 committed
      add support parameter inference when argument shape is a list containing integer and tensor variable;
      test=develop
      
      fix reshape op according to reviews:
      1. improve error message;
      2. improve test of test_api.
      test=develop,test=document_preview
      
      fix reshape op: Add error message in nn.py, test=develop
      
      add stop_gradient=True when attr(shape) is tensor Variable.
      change examples in API reshape.
      test=develop,test=document_preview
      bd89a273
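
The reshape commit above is about resolving the `shape` argument when some entries are only known at run time. As background, the sketch below shows the static inference rule documented for Paddle's reshape, where `0` copies the input dimension at the same position and a single `-1` is inferred from the remaining element count. The helper name `infer_reshape` is hypothetical, and the sketch covers only integer entries, not tensor-valued ones.

```python
def infer_reshape(in_shape, target):
    """Resolve 0 (copy input dim) and a single -1 (inferred) in a target shape."""
    out = []
    for i, d in enumerate(target):
        if d == 0:
            if i >= len(in_shape):
                raise ValueError("0 at index %d has no matching input dimension" % i)
            out.append(in_shape[i])
        else:
            out.append(d)

    if out.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")

    total = 1
    for d in in_shape:
        total *= d
    known = 1
    for d in out:
        if d != -1:
            known *= d

    if -1 in out:
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        out[out.index(-1)] = total // known
    elif known != total:
        raise ValueError("element count mismatch")
    return out


print(infer_reshape([2, 4, 6], [0, -1, 3]))  # → [2, 8, 3]
```

When an entry of `shape` is itself a tensor variable, its value is not available at graph-construction time, so a check like this can only be completed at run time.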
    • L
      add tensor(tensor and tensor in list) support for argument starts and ends in slice op; (#19208) · 88628016
      liym27 committed
      add support parameter inference when arguments starts or ends is a list containing integer and tensor variable;
      test=develop,test=document_preview
      
      improve slice op according to review(from hongyu). test=develop
      
      fix slice op according to review: infer_flags, test=develop
      
      fix slice op: improve overload operator __getitem__ to support attrs(starts and ends) are Variable.
      test=develop,test=document_preview
      
      fix test_slice_op: add TestSliceOp_decs_dim_6 to resolve conflict with test_slice_ngraph_op. test=develop
      
      add stop_gradient=True when attr(starts) or attr(ends) is tensor Variable.
      test=develop,test=document_preview
      88628016
    • L
      fix expand op: (#19302) · e9e3c087
      liym27 committed
      1. add tensor support for argument expand_times in expand op;
      2. add support parameter inference when argument expand_times is a list containing integer and tensor variable;
      
      improve expand op according to reviews:
      1. add doc of ExpandTimes in expand_op.cc;
      2. improve the test of test_api.
      
      add stop_gradient=True when attr(expand_times) is tensor Variable, change code examples.
      test=develop,test=document_preview
      e9e3c087
    • X
      support preload thread, optimize hdfs log, fix master+patch bug (#19695) · 6bf298bf
      xujiaqi01 committed
      * support preload thread
      * sleep before fleet wrapper exit for pslib core dump
      * optimize hdfs log
      * fix master+patch bug
      6bf298bf
    • J
      Feature/add transform data dygraph (#19707) · cc311bdf
      Jiabin Yang committed
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * add transform_data to dygraph
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, refactor name to make it easier to understand
      
      * add test and change input to const ref for safety
      
      * test=develop, fix multi-gpu failed problem, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ
      
      * add ut for data transform
      
      * refine ut for data_transform
      
      * test=develop, fix ut failed on parallel se-resnext
      
      * test=develop, change one more PADDLE_ENFORCE
      
      * add test_tracer on multiple devices
      
      * test=develop, change place to mutable for data transform
      
      * test=develop, add transform data on same place test and remove useless log
      
      * test=develop, Add TODO for data layout and ut for conv2d with no bias
      cc311bdf
    • L
      cpu Conv double grad (#19672) · b76343c3
      lvmengsi committed
      * cpu conv_grad_grad
      b76343c3
    • Z
      Implement FusedEmbeddingSeqPoolGradKernel with cblas_saxpy (#19770) · 93c85c93
      翟飞跃 committed
      * Implement the operator with sparse matrix multiply
      
      * Update the URL of mklml library.
      
      test=develop
      
      * Disable MKLML implementation on non-Linux platforms.
      
      test=develop
      
      * optimize bp with mkl sparse matrix
      test=develop
      
      * tmp add fused_emb_seq layer
      
      * Add the support of padding_idx attribute.
      
      test=develop
      
      * add padding_idx support
      test=develop
      
      * implement grad refer lego
      test=develop
      93c85c93
    • C
      Fix example error of Variable and Operator (#19821) · 2729c174
      chengduo committed
      * fix example error
      test=develop
      
      * Remove set_desc
      test=develop
      2729c174
  4. 16 Sep, 2019 4 commits
    • C
      Fix warning info of build_strategy (#19805) · 82814970
      chengduo committed
      * fix warning info
      test=develop
      
      * fix bug of all_reduce_deps_pass
      test=develop
      82814970
    • R
      add unittest for square error cost op (#19746) · a0e9b7b9
      ruri committed
      * add unit test for square error cost op
      a0e9b7b9
    • Z
      fix retry allocator bug, test=develop (#19794) · b34933d9
      Zeng Jinle committed
      b34933d9
    • Y
      Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733) · c67c8758
      Yiqun Liu committed
      * Refine the codes related to fc op.
      
      * Add GPU implementation for fc functor.
      
      * Apply fc_fuse_pass in GPU inference.
      test=develop
      
      * Change the cmake for fc op.
      
      * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
      
      * Add an attribute to set the activation type in fc_op.
      
      * Enhance the unittest of fc_op.
      test=develop
      
      * Move the declaration of FCOpGrad back to the header file.
      test=develop
      
      * Set default value for newly added arguments in test_fc_op.
      test=develop
      
      * Enhance fc_fuse_pass to enable fusing relu.
      
      * Allow printing the shapes of var_desc in graph.
      test=develop
      
      * Enhance fc_fuse_pass_tester.
      
      * Remove the use of PADDLE_ENFORCE.
      test=develop
      
      * Correct the number of ops after fusing.
      test=develop
      
      * Fix a typo.
      test=develop
      
      * Set activation_type to null when there is no relu in fc.
      test=develop
      
      * Refine fc_fuse_pass's codes.
      
      * Enable the set of shape for tensor.
      
      * Refine repeated_fc_relu_pass and add unittest.
      test=develop
      c67c8758