1. 28 Sep, 2019 2 commits
    • Enable users to create custom cpp op outside framework. (#19256) · 1a3eef02
      qingqing01 committed
      * Custom ops must be written according to the framework OP spec.
      * Package fluid_framework.so and headers into whl.
      * Add paddle.sysconfig.get_include() and paddle.sysconfig.get_lib() to get include dir and lib dir.
      * Export some C-APIs to merge OpInfo between core.so and custom_op.so.
      * Add unit testing.
      * Update API.spec.
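The two sysconfig helpers named in this commit return the header and library directories needed to compile a custom op outside the framework. A minimal sketch of how they could feed a compiler command line follows. The helper `build_custom_op_command`, the source file name, and the library name passed to `-l` are hypothetical placeholders rather than anything the commit specifies; check the directory returned by `paddle.sysconfig.get_lib()` for the actual library name.

```python
def build_custom_op_command(include_dir, lib_dir, sources, output, framework_lib):
    """Assemble a g++ command line for a custom-op shared library.

    Hypothetical helper for illustration only. `framework_lib` is the
    library name handed to -l and must match what is actually shipped
    in `lib_dir`.
    """
    return [
        "g++", "-std=c++11", "-shared", "-fPIC",
        *sources,
        "-o", output,
        "-I", include_dir,       # headers packaged into the wheel
        "-L", lib_dir,           # directory holding the framework shared library
        "-l" + framework_lib,    # placeholder name, not confirmed by the commit
    ]


def resolve_paddle_dirs():
    """Return (include_dir, lib_dir) using the helpers added in this commit.

    Requires an installed paddle build that provides paddle.sysconfig.
    """
    from paddle import sysconfig
    return sysconfig.get_include(), sysconfig.get_lib()
```

With paddle installed, `build_custom_op_command(*resolve_paddle_dirs(), ["relu_op.cc"], "relu_op.so", "paddle_framework")` yields an argument list that can be passed to `subprocess.run`; the file and library names here are assumptions.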
    • Follow comment of Merged QAT PR 18970 (#19979) · 9de67725
      bingyanghuang committed
      * Follow Wangzhen's comment in PR 18970, test=develop
      
      * Review comments, test=develop
      
      * Leave fake quantization around mul
      
      test=develop
      
      * Replace Fake with Real Quantized Mul
      
      test=develop
      
      * Fix bug in quantize placement pass
      
      Nodes in the graph are now checked by type instead of by node name when they are marked for quantization.
      test=develop
  2. 27 Sep, 2019 5 commits
    • update operator compatible info, test=develop (#19978) · 01b9d079
      石晓伟 committed
      * update operator compatible info, test=develop
      
      * revert cmake/version.cmake, test=develop
      
      * add unit_tests and fix bugs, test=develop
      
      * update ../paddle/fluid/framework/framework.proto, test=develop
      
      * fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop
      
      * update paddle/fluid/framework/version_test.cc, test=develop
      
      * add comments and rename interfaces, test=develop
    • Disable conv requant squash (#20041) · f5221ac1
      joanna.wozna.intel committed
      * Fix conv2d+dequantize squash for residual fusion
      
      test=develop
      
      * Disable conv-requant squash
      
      test=develop
    • codegen code for reconstruction (#19728) · c9ea317b
      wangchaochaohu committed
      * codegen code for reconstruction test=develop
      
      * fix the cmake test=develop
      
      * fix review advice test=develop
    • the integrated communicator (#19849) · 8f0b3c05
      tangwei12 committed
      * add a base class for the Communicator
      * add AsyncCommunicator Impl for async distributed training
    • Paddle error message stack shaping and optimization (#19895) · b9163350
      Chen Weihang committed
      * shape and optimize paddle error message stack, test=develop
      
      * limit exception type & add unittest, test=develop
      
      * fix multi-platform problem, test=develop
      
      * fix related unittest failures, test=develop
      
      * add doc & fix unittest errors, test=develop
      
      * fix function name error, test=develop
      
      * update tensor test exception msg compare, test=develop
      
      * remove unittest on win32, the dir format is different, test=develop
      
      * remove useless package, test=develop
      
      * add paddle enforce handler unittest, test=develop
      
      * add exception checkout, test=develop
      
      * fix coverage failed, test=develop
      
      * fix op registry test failed, test=develop
      
      * refactor whole pr, test=develop
      
      * remove test in CMakelist, test=develop
      
      * fix coverage, test=develop
  3. 26 Sep, 2019 3 commits
    • disable fuse_all_optimizer_ops (#19966) · 2450d15b
      chengduo committed
      test=develop
    • Add dtype for coalesce_tensor_op (#20016) · 101a2b61
      chengduo committed
      Add dtype for coalesce_tensor_op
    • Add new data layer (#19916) · 88af4ab6
      Huihuang Zheng committed
      The new "fluid.data" differs from the old "fluid.layers.data" as follows:
      
      1. Add shape and dtype check.
      2. Remove the "append_batch_size" parameter. The new data layer does not offer it because other deep learning platforms have no such pre-processing in their data layers, and it may confuse users.
      3. Remove the "stop gradient" parameter because the data layer doesn't do back-propagation.
      
      TODO:
      Data fed to the data layer by the executor is now checked. Should the feed data of readers also be checked in the future?
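The shape and dtype check in item 1 can be pictured with the sketch below. It is not Paddle's implementation: the function `check_feed` is hypothetical, and treating `None` or a negative number as a dimension of unknown size is an assumption made for the illustration.

```python
def check_feed(name, declared_shape, declared_dtype, actual_shape, actual_dtype):
    """Raise if fed data does not match a declared data layer (illustrative only).

    Assumption: a declared dimension of None or a negative value means
    "any size", e.g. a batch dimension that varies between runs.
    """
    if actual_dtype != declared_dtype:
        raise TypeError(
            "%s: expected dtype %s, got %s" % (name, declared_dtype, actual_dtype))
    if len(actual_shape) != len(declared_shape):
        raise ValueError(
            "%s: expected rank %d, got %d"
            % (name, len(declared_shape), len(actual_shape)))
    for axis, (want, got) in enumerate(zip(declared_shape, actual_shape)):
        if want is None or want < 0:
            continue  # unknown dimension, accept any size
        if want != got:
            raise ValueError(
                "%s: dimension %d expected %d, got %d" % (name, axis, want, got))


# A batch of 8 images matches a declaration with an unknown batch dimension.
check_feed("image", [None, 3, 224, 224], "float32", (8, 3, 224, 224), "float32")
```

Checking at feed time surfaces a mismatch where the data enters the program instead of inside a later operator.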
  4. 25 Sep, 2019 1 commit
  5. 24 Sep, 2019 3 commits
  6. 23 Sep, 2019 3 commits
  7. 20 Sep, 2019 2 commits
  8. 19 Sep, 2019 4 commits
  9. 18 Sep, 2019 3 commits
  10. 17 Sep, 2019 4 commits
    • rm return in vfork (#19734) · 40c66f8d
      Thunderbrook committed
      * rm return in vfork
      
      * rm return in vfork
      test=develop
    • support preload thread, optimize hdfs log, fix master+patch bug (#19695) · 6bf298bf
      xujiaqi01 committed
      * support preload thread
      * sleep before fleet wrapper exit for pslib core dump
      * optimize hdfs log
      * fix master+patch bug
    • Feature/add transform data dygraph (#19707) · cc311bdf
      Jiabin Yang committed
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * add transform_data to dygraph
      
      * test=develop, refactor name to make it easier to understand
      
      * test=develop, refactor name to make it easier to understand
      
      * add test and change input to const ref for safety
      
      * test=develop, fix multi-gpu failure, add Tracer tests, change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ
      
      * add ut for data transform
      
      * refine ut for data_transform
      
      * test=develop, fix ut failed on parallel se-resnext
      
      * test=develop, change one more PADDLE_ENFORCE
      
      * add test_tracer on multiple devices
      
      * test=develop, change place to mutable for data transform
      
      * test=develop, add transform data on same place test and remove useless log
      
      * test=develop, add a TODO for data layout and a ut for conv2d with no bias
  11. 16 Sep, 2019 3 commits
    • Fix warning info of build_strategy (#19805) · 82814970
      chengduo committed
      * fix warning info
      test=develop
      
      * fix bug of all_reduce_deps_pass
      test=develop
    • Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733) · c67c8758
      Yiqun Liu committed
      * Refine the codes related to fc op.
      
      * Add GPU implementation for fc functor.
      
      * Apply fc_fuse_pass in GPU inference.
      test=develop
      
      * Change the cmake for fc op.
      
      * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
      
      * Add an attribute to set the activation type in fc_op.
      
      * Enhance the unittest of fc_op.
      test=develop
      
      * Remove the declaration of FCOpGrad back to the header file.
      test=develop
      
      * Set default value for newly added arguments in test_fc_op.
      test=develop
      
      * Enhance fc_fuse_pass to enable fusing relu.
      
      * Allow printing the shapes of var_desc in graph.
      test=develop
      
      * Enhance fc_fuse_pass_tester.
      
      * Remove the use of PADDLE_ENFORCE.
      test=develop
      
      * Correct the number of ops after fusing.
      test=develop
      
      * Fix a typo.
      test=develop
      
      * Set activation_type to null when there is no relu in fc.
      test=develop
      
      * Refine fc_fuse_pass's codes.
      
      * Enable the set of shape for tensor.
      
      * Refine repeated_fc_relu_pass and add unittest.
      test=develop
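The idea behind fusing relu into the fc op can be shown with plain Python lists: applying relu inside fc gives the same result as running fc and then a separate relu, while saving one op in the graph. This is only a sketch of the equivalence, not the pass or kernel from this commit. The functions `fc` and `relu` are hypothetical, and the `activation_type` argument merely mirrors the attribute name mentioned in the commit message.

```python
def relu(matrix):
    """Element-wise max(0, v) over a list-of-lists matrix."""
    return [[max(0, v) for v in row] for row in matrix]


def fc(x, w, b, activation_type=""):
    """Fully connected layer: x (n x k) times w (k x m) plus bias b (m).

    Illustrative only. When activation_type is "relu" the activation is
    applied inside the same function, which is what fusing amounts to.
    """
    out = [
        [sum(x_row[k] * w[k][j] for k in range(len(w))) + b[j]
         for j in range(len(b))]
        for x_row in x
    ]
    return relu(out) if activation_type == "relu" else out


x = [[1, -2], [3, 4]]
w = [[1, 0], [0, -1]]
b = [0, 1]

unfused = relu(fc(x, w, b))                 # two ops: fc, then relu
fused = fc(x, w, b, activation_type="relu") # one op with the activation attribute
print(fused)  # [[1, 3], [3, 0]]
```

Both paths produce the same values, which is why a graph pass can replace the fc and relu pair with a single fc op carrying an activation attribute.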
  12. 14 Sep, 2019 1 commit
  13. 13 Sep, 2019 1 commit
    • Open fuse all reduce option (#19765) · 056fdedd
      chengduo committed
      * Open fuse all reduce op
      test=develop
      
      * Add Fuse optimization op log
      
      * Add log in fuse_optimizer op pass and fuse all_reduce op pass
      
      * replace with boost::optional<bool>
      test=develop
      
      * Polish code
      test=develop
      
      * fix code coverage
      test=develop
  14. 11 Sep, 2019 5 commits
    • Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Huihuang Zheng committed
      TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, removing it can improve memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, so no singleton is needed.
      
      Also added data_feed_proto to operator to fix CI in CPU compilation.
    • Make leaky relu inplacable (#19676) · 0daa5c97
      Zeng Jinle committed
      * make leaky relu inplacable, test=develop
      
      * force add unittests to pass coverage, test=develop
    • Open fuse broadcast option (#18833) · e506c99c
      chengduo committed
      * fix vlog level and fuse option type
      test=develop
    • Implement the GPU kernel of fc operator (#19687) · a65c728e
      Yiqun Liu committed
      * Refine the codes related to fc op.
      
      * Add GPU implementation for fc functor.
      
      * Apply fc_fuse_pass in GPU inference.
      test=develop
      
      * Change the cmake for fc op.
      
      * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
      
      * Add an attribute to set the activation type in fc_op.
      
      * Enhance the unittest of fc_op.
      test=develop
      
      * Remove the declaration of FCOpGrad back to the header file.
      test=develop
      
      * Set default value for newly added arguments in test_fc_op.
      test=develop
    • Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418) · 5866a7a5
      chengduo committed
      * Enable fused_all_reduce_op_handle support GPU and CPU Gradients