1. 31 Jan, 2018 1 commit
    • Add variant of new load and save ops for storing model params in a single file (#7909) · 2e907c36
      Siddharth Goyal committed
      * Add save_combine_op
      
      * Add load_combine_op and test
      
      * Add unit-test
      
      * Add a delete to free buffer memory
      
      * Add new variant of load/save
      
      * Fix unit-test
      
      * Add another unit test for compatibility with original save/load
      
      * Address review comments and simplify logic
      
      * Address review comments and simplify code - part 2
      
      * Fix naming issues and CMake problems
      
      * Address review comments
      
      * Fix LoD information in tests
      
      * Address review comments: round 2
  2. 30 Jan, 2018 1 commit
  3. 28 Jan, 2018 1 commit
  4. 23 Jan, 2018 1 commit
  5. 22 Jan, 2018 1 commit
  6. 18 Jan, 2018 1 commit
  7. 14 Jan, 2018 1 commit
    • "cudnn operators change to cudnn kernel" (#6660) · 5ad1aef0
      dzhwinter committed
      * "unified operators"
      
      * "add CUDNN register"
      
      * "add use cudnn attribute"
      
      * "add attribute"
      
      * "test conv transpose op"
      
      * "remove duplicated attr"
      
      * "fix op test"
      
      * "add attribute to set cudnn"
      
      * "add more log"
      
      * "need layout op register support"
      
      * "add more log"
      
      * "change GetExpectedKernelType "
      
      * "fix Get attr in conv_op"
      
      * "fix CI"
      
      * "fix tests"
      
      * "removed kernel priority fallback"
      
      * "fix CI"
      
      * "fix stack pointer bug"
      
      * "refine buggy interface"
      
      * "add const cast to save life"
      
      * "fix get_output_with_grad"
      
      * "fix op test with dataformat"
      
      * "fix pooling"
      
      * "fix pooling test"
      
      * "fix CI"
      
      * "fix with_gpu error"
      
      * "add transform needed functional check"
      
      * "fix unpack list error"
      
      * "comment out parallel.do temporarily"
      
      * "fix CI"
      
      * "fix compile doc error"
      
      * "make threshold larger"
  8. 12 Jan, 2018 1 commit
  9. 11 Jan, 2018 1 commit
  10. 09 Jan, 2018 1 commit
    • Port WarpCTC Operator (#5107) · b5fda272
      Yiqun Liu committed
      * Add Seq2BatchFunctor, which will be used in WarpCTCOp.
      
      * Implement WarpCTCFunctor and WarpCTCKernel.
      
      * Add unittest of warpctc_op.
      
      * Modify the check_output interface in the Python unittest framework to allow checking a subset of outputs.
      
      * Use absolute offset lod in warpctc_op and related functors.
      
      * Refine the comments of warpctc_op.
      
      * The new Python unittest supports checking a subset of the outputs, so revert the previous change.
      
      * Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.
      
      * Update to the newest code.
      
      * Rename PaddingSequenceFunctor to PaddingLoDTensorFunctor and move the computation of dimensions out of the functors.
  11. 03 Jan, 2018 2 commits
  12. 02 Jan, 2018 4 commits
  13. 29 Dec, 2017 1 commit
  14. 27 Dec, 2017 2 commits
  15. 25 Dec, 2017 1 commit
  16. 19 Dec, 2017 1 commit
  17. 12 Dec, 2017 2 commits
    • modify for some update in trunk · a3addcdc
      sweetsky0901 committed
    • Refine device context (#6433) · 61ec0b95
      QI JUN committed
      The main changes are:
      
      - take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
      - remove `eigen_device` interface in base class `DeviceContext`
      - remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
      - remove unused `platform::EigenDeviceConverter`
      - rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
      - rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
  18. 01 Dec, 2017 3 commits
  19. 28 Nov, 2017 1 commit
    • Send recv op (#5520) · 0a8a86e0
      武毅 committed
      * WIP send recv op
      
      * WIP send recv
      
      * put grpc impl in details
      
      * put grpc impl in details
      
      * update wip
      
      * update proto
      
      * update proto
      
      * update proto
      
      * clean cmake
      
      * wip on op implementations
      
      * wip on op implementations
      
      * compile ok adding ut
      
      * wip unit test
      
      * add extern cares for linking
      
      * wip add ut
      
      * working version send recv
      
      * revert optimizer.py
      
      * update test cmake
      
      * add libtool to dockerfile
      
      * update cmake dependency
      
      * update cmake depends
      
      * update cmake grpc depends
      
      * fix cmake dependency
      
      * fix compile error
      
      * fix compile
      
      * follow comments
      
      * update
      
      * update copyfrom
  20. 27 Nov, 2017 2 commits
  21. 26 Nov, 2017 1 commit
    • Feature/copytensor (#5455) · 45062fe5
      dzhwinter committed
      * "make global tensor function independently"
      
      * "replace functor"
      
      * "fix inline template error"
      
      * "fix tensor array with CopyFrom"
      
      * "fix other case use CopyFrom"
      
      * "move the op interface hardly"
      
      * "fix operators"
      
      * "fix typo"
      
      * "delete dynamic recurrent rnn and fix gru_unit in debugmode"
      
      * "fix unique_ptr copy"
      
      * "fix cuda copy"
      
      * "fix namespace error"
      
      * "removed nccl python test"
      
      * "fix include error"
      
      * "fix typo"
      
      * fix copy util test
  22. 22 Nov, 2017 1 commit
  23. 21 Nov, 2017 2 commits
  24. 18 Nov, 2017 1 commit
  25. 16 Nov, 2017 1 commit
  26. 13 Nov, 2017 3 commits
    • add conv3d_trans_cudnn_op · 3a507b44
      chengduoZH committed
    • BeamSearchDecodeOp (#5498) · a4106278
      Qiao Longfei committed
      * init trieconcat_op
      
      * add basic implementation
      
      * add test
      
      * add more test
      
      * update unit test
      
      * add PackAllSteps test
      
      * fix PackAllSteps
      
      * all test passed
      
      * clean code
      
      * remove state inside helper
      
      * rename prob to score
      
      * optimize RemoveFromEnd
      
      * use destructor to delete BeamNode recursively
      
      * optimize interface
      
      * add comment to interface
      
      * optimize data structure
      
      * use template to define the type of score
      
      * use template parameter for BeamHelper
      
      * change father to parent
      
      * rename TrieConcat to BeamSearchOutConcat
      
      * use LoDTensorArray
      
      * rename BeamSearchOutConcat to BeamSearchDecode
      
      * refine code
      
      * retain all candidate sentences in beam_search_decode_op; do not consider endid
      
      * use unique_ptr
      
      * fix compare bug
      
      * fix lod compile problem
    • Fix compiling for softmax_with_cross_entropy_op. · 91d4fc69
      dangqingqing committed
  27. 11 Nov, 2017 2 commits