1. 20 Mar, 2018 1 commit
    • CMake refine for HIP support. · e50205e7
      sabreshao committed
      1. Add option WITH_AMD_GPU.
      2. Add cmake/hip.cmake for HIP toolchain.
      3. Some external modules, such as Eigen, may need a HIP port.
      4. Add macro hip_library/hip_binary/hip_test to cmake/generic.cmake.
      5. Add one HIP source concat.hip.cu as an example. Each .cu may have its corresponding .hip.cu.
  2. 16 Mar, 2018 2 commits
  3. 15 Mar, 2018 1 commit
    • Implement Select OP (#9088) · 1e4c504e
      Thuan Nguyen committed
      * Fix old documentation for channel_recv
      
      * Initial design of CSP select
      
      * Redesign channel implementation for Select Op
      
      * Remove unnecessary header
      
      * Initial checkin of select op, currently will read all the conditional_op in the cases block and also pull out all channels involved in the select.
      
      * Init python select op API
      
      * Python select bug fix when checking op creates block
      
      * Add case_to_execute as (a) input to select, (b) into the passed inputs into the select op
      
      * Add additional code for select op
      
      * Init fibonacci test from python
      
      * implement fibonacci sequence test
      
      * update fib unit test
      
      * Improve select test cases
      
      * Shorten non-pep-8-ed lines
      
      * Add methods on channel needed by select op
      
      * Fix compile issues, finish implementation, still need to debug code
      
      * Fix issue with fibonacci test, it works now!
      
      * Change QueueMessage callback to take in a ChannelAction enum, fix select unit test
      
      * Fix case attributes
      
      * Fix issue with select control flow
      
      * Make cases - previously on each selectcase conditional_block - attributes to select
      
      * Use class constants for type of channel
      
      * Change select op to take in "cases" attribute
      
      * return boolean from select callback function to tell Channel if this RECV or SEND should be executed
      
      * Improve attributes and inputs comments on select op
      
      * Fix issues with python unit test
      
      * Assert fibonacci final output
      
      * Fix issue when channel name / channel var is null for "default" case in select op
      
      * Assert base select test output
      
      * Make QueueMessage use shared pointer and modify the order of the callback
      
      * Fixing the order in which the callback is called
      
      * Move channel utility methods to paddle/fluid/operators/concurrency/channel_util
      
      * Create channel_util and move channel util methods
      
      * Fix crash when calling select_op
      
      * Fix deadlock
      
      * Fix issue of channel destructor deadlock
      
      * Fix precommit issues
      
      * Accidentally checked in changes to beam_search_op, reverting change.
      
      * Fix dependency issue in concurrency cmake
      
      * add device_context dependency for concurrency target
  4. 13 Mar, 2018 1 commit
    • Repair nccl op test (#8575) · 7287630e
      QI JUN committed
      * fix nccl op unit test
      
      * fix build error
      
      * format code
      
      * refine nccl related unit test
      
      * fix build error
      
      * add setGPUData
      
      * clean up
      
      * follow comments
      
      * rm test_nccl.cu
      
      * follow comment
      
      * rm wait
  5. 07 Mar, 2018 3 commits
  6. 06 Mar, 2018 1 commit
  7. 02 Mar, 2018 3 commits
  8. 27 Feb, 2018 3 commits
  9. 16 Feb, 2018 1 commit
  10. 10 Feb, 2018 2 commits
  11. 07 Feb, 2018 1 commit
  12. 01 Feb, 2018 1 commit
  13. 31 Jan, 2018 1 commit
    • Add variant of new load and save ops for storing model params in a single file (#7909) · 2e907c36
      Siddharth Goyal committed
      * Add save_combine_op
      
      * Add load_combine_op and test
      
      * Add unit-test
      
      * Add a delete to free buffer memory
      
      * Add new variant of load/save
      
      * Fix unit-test
      
      * Add another unit test for compatibility with original save/load
      
      * Address review comments and simplify logic
      
      * Address review comments and simplify code - part 2
      
      * Fix naming issues and CMake problems
      
      * Address review comments
      
      * Fix LoD information in tests
      
      * Address review comments: round 2
  14. 30 Jan, 2018 1 commit
  15. 28 Jan, 2018 1 commit
  16. 23 Jan, 2018 1 commit
  17. 22 Jan, 2018 1 commit
  18. 18 Jan, 2018 1 commit
  19. 14 Jan, 2018 1 commit
    • "cudnn operators change to cudnn kernel" (#6660) · 5ad1aef0
      dzhwinter committed
      * "unified operators"
      
      * "add CUDNN register"
      
      * "add use cudnn attribute"
      
      * "add attribute"
      
      * "test conv transpose op"
      
      * "remove duplicated attr"
      
      * "fix op test"
      
      * "add attribute to set cudnn"
      
      * "add more log"
      
      * "need layout op register support"
      
      * "add more log"
      
      * "change GetExpectedKernelType "
      
      * "fix Get attr in conv_op"
      
      * "fix CI"
      
      * "fix tests"
      
      * "removed kernel priority fallback"
      
      * "fix CI"
      
      * "fix stack pointer bug"
      
      * "refine buggy interface"
      
      * "add const cast to save life"
      
      * "fix get_output_with_grad"
      
      * "fix op test with dataformat"
      
      * "fix pooling"
      
      * "fix pooling test"
      
      * "fix CI"
      
      * "fix with_gpu error"
      
      * "add transform needed functional check"
      
      * "fix unpack list error"
      
      * "comment out parallel.do temporarily"
      
      * "fix CI"
      
      * "fix compile doc error"
      
      * "make threshold larger"
  20. 12 Jan, 2018 1 commit
  21. 11 Jan, 2018 1 commit
  22. 09 Jan, 2018 1 commit
    • Port WarpCTC Operator (#5107) · b5fda272
      Yiqun Liu committed
      * Add Seq2BatchFunctor, which will be used in WarpCTCOp.
      
      * Implement WarpCTCFunctor and WarpCTCKernel.
      
      * Add unittest of warpctc_op.
      
      * Modify the check_output interface in the python unittest framework to allow checking a subset of outputs.
      
      * Use absolute offset lod in warpctc_op and related functors.
      
      * Refine the comments of warpctc_op.
      
      * The new python unittest supports checking a subset of the outputs, so revoke the previous change.
      
      * Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.
      
      * Update to the newest codes.
      
      * Rename the PaddingSequenceFunctor to PaddingLoDTensorFunctor and move the computation of dimensions out of the functors.
  23. 03 Jan, 2018 2 commits
  24. 02 Jan, 2018 4 commits
  25. 29 Dec, 2017 1 commit
  26. 27 Dec, 2017 2 commits
  27. 25 Dec, 2017 1 commit