1. 12 Dec, 2017 1 commit
    • Refine device context (#6433) · 61ec0b95
      Committed by QI JUN
      The main changes are:
      
      - take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
      - remove `eigen_device` interface in base class `DeviceContext`
      - remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
      - remove unused `platform::EigenDeviceConverter`
      - rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
      - rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
  2. 01 Dec, 2017 3 commits
  3. 28 Nov, 2017 1 commit
    • Send recv op (#5520) · 0a8a86e0
      Committed by 武毅
      * WIP send recv op
      
      * WIP send recv
      
      * put grpc impl in details
      
      * put grpc impl in details
      
      * update wip
      
      * update proto
      
      * update proto
      
      * update proto
      
      * clean cmake
      
      * wip on op implementations
      
      * wip on op implementations
      
      * compile ok adding ut
      
      * wip unittest
      
      * add extern cares for linking
      
      * wip add ut
      
      * working version send recv
      
      * revert optimizer.py
      
      * update test cmake
      
      * add libtool to dockerfile
      
      * update cmake dependency
      
      * update cmake depends
      
      * update cmake grpc depends
      
      * fix cmake dependency
      
      * fix compile error
      
      * fix compile
      
      * follow comments
      
      * update
      
      * update copyfrom
  4. 27 Nov, 2017 2 commits
  5. 26 Nov, 2017 1 commit
    • Feature/copytensor (#5455) · 45062fe5
      Committed by dzhwinter
      * "make global tensor function independently"
      
      * "replace functor"
      
      * "fix inline template error"
      
      * "fix tensor array with CopyFrom"
      
      * "fix other case use CopyFrom"
      
      * "move the op interface hardly"
      
      * "fix operators"
      
      * "fix typo"
      
      * "delete dynamic recurrent rnn and fix gru_unit in debugmode"
      
      * "fix unique_ptr copy"
      
      * "fix cuda copy"
      
      * "fix namespace error"
      
      * "removed nccl python test"
      
      * "fix include error"
      
      * "fix typo"
      
      * fix copy util test
  6. 22 Nov, 2017 1 commit
  7. 21 Nov, 2017 2 commits
  8. 18 Nov, 2017 1 commit
  9. 16 Nov, 2017 1 commit
  10. 13 Nov, 2017 3 commits
    • add conv3d_trans_cudnn_op · 3a507b44
      Committed by chengduoZH
    • BeamSearchDecodeOp (#5498) · a4106278
      Committed by Qiao Longfei
      * init trieconcat_op
      
      * add basic implementation
      
      * add test
      
      * add more test
      
      * update unit test
      
      * add PackAllSteps test
      
      * fix PackAllSteps
      
      * all test passed
      
      * clean code
      
      * remove state inside helper
      
      * rename prob to score
      
      * optimize RemoveFromEnd
      
      * use destructor to delete BeamNode recursively
      
      * optimize interface
      
      * add comment to interface
      
      * optimize data structure
      
      * use template to define the type of score
      
      * use template parameter for BeamHelper
      
      * change father to parent
      
      * rename TrieConcat to BeamSearchOutConcat
      
      * use LoDTensorArray
      
      * rename BeamSearchOutConcat to BeamSearchDecode
      
      * refine code
      
      * keep all candidate sentences in beam_search_decode_op, do not consider endid
      
      * use unique_ptr
      
      * fix compare bug
      
      * fix lod compile problem
    • Fix compiling for softmax_with_cross_entropy_op. · 91d4fc69
      Committed by dangqingqing
  11. 11 Nov, 2017 2 commits
  12. 08 Nov, 2017 3 commits
    • Feature/rnn to array to lod tensor (#5411) · f72729d4
      Committed by Yu Yang
      * Add LoDRankTable
      
      LoD Rank Table stores one `level` of `lod`, with its sequences ordered by
      length in descending order. It is useful when implementing dynamic RNN and
      is shared by the dynamic RNN memory, dynamic RNN slice input, and dynamic
      RNN slice output operators.
      
      * Add skeleton for array_to_lod_tensor and lod_tensor_to_array
      
      * Add VarType::LoDTensorArray
      
      * Add PyBind of LoDTensorArray
      
      * Add InferVarType
      
      * Add first unittest
      
      * Add ut
      
      * Add unittest
      
      * Add unittest
      
      * Add unittests
      
      * update
      
      * init
      
      * add infershape for lod_tensor_to_array_op
      
      * complete array_to_lod_tensor_op
      
      * copy data
      
      * clean code
      
      * clean code
      
      * Fix unittest data
      
      * fix bugs
      
      * fix compile error
      
      * Refine TensorToArrayOp
      
      * refactor array_to_lod_tensor
      
      * Unittest
      
      * fix bugs
      
      * Fix unittest
      
      * Fix unittest
      
      * debug
      
      * Debug
      
      * Fix unittest
      
      * clean code
      
      * refactor
      
      * use ostream
      
      * update test
      
      * fix gpu build error
      
      * make gpu test pass
    • Add gtest for drnn · db3b49fe
      Committed by Yu Yang
    • Compare Operator (#5325) · f74fb790
      Committed by Yu Yang
      * Compare Operator
      
      * Follow comments
  13. 07 Nov, 2017 1 commit
    • ReadFromArray/WriteToArray op (#5407) · c9b57dcc
      Committed by Yu Yang
      * Use stable_sort in lod_rank_table
      
      Using `stable_sort` makes debugging and testing easier, and the time
      complexity is unchanged.
      
      * Add LoDTensorArray
      
      * Stash
      
      * Better debug message for IsInitialized
      
      * Stash
      
      * Better debug message for IsInitialized
      
      * Complete array read/write op unittests
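
The `stable_sort` note in this commit can be illustrated with a small sketch. This is not the code from the commit: it uses Python's built-in `sorted` (which is guaranteed to be stable) as a stand-in for C++ `std::stable_sort`, and the `(index, length)` pairs and the name `rank_by_length` are hypothetical.

```python
# Illustrative sketch only; not PaddlePaddle code.
# Python's sorted() is stable and stands in for std::stable_sort here.

def rank_by_length(lengths):
    """Return (original_index, length) pairs, longest sequence first.

    Because the sort is stable, sequences of equal length keep their
    original relative order, so the result is deterministic.
    """
    items = list(enumerate(lengths))
    return sorted(items, key=lambda item: -item[1])


ranked = rank_by_length([2, 3, 2, 3])
print(ranked)  # [(1, 3), (3, 3), (0, 2), (2, 2)]
```

With an unstable sort, the two length-3 entries could come out in either order, which makes expected test outputs harder to pin down.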
  14. 04 Nov, 2017 1 commit
    • Add LoDRankTable (#5349) · 74849158
      Committed by Yu Yang
      * Add LoDRankTable
      
      LoD Rank Table stores one `level` of `lod`, with its sequences ordered by
      length in descending order. It is useful when implementing dynamic RNN and
      is shared by the dynamic RNN memory, dynamic RNN slice input, and dynamic
      RNN slice output operators.
      
      * Add InferVarType
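
As a rough illustration of the LoD Rank Table description in this commit, the sketch below builds a rank table from one LoD level and shows why descending order can help a dynamic RNN. It assumes the LoD level is a list of sequence start offsets (for example `[0, 3, 4, 8]`); the function names `build_rank_table` and `active_sequences_per_step` are hypothetical and are not the operators added here.

```python
# Illustrative sketch only; names and data layout are assumptions.

def build_rank_table(offsets):
    """Build (sequence_index, length) pairs sorted by length, descending."""
    lengths = [end - start for start, end in zip(offsets, offsets[1:])]
    # sorted() is stable, so equal-length sequences keep their input order.
    return sorted(enumerate(lengths), key=lambda item: -item[1])


def active_sequences_per_step(rank_table):
    """For each time step, count the sequences that still have data."""
    if not rank_table:
        return []
    max_len = rank_table[0][1]  # the longest sequence comes first
    return [sum(1 for _, length in rank_table if length > step)
            for step in range(max_len)]


table = build_rank_table([0, 3, 4, 8])
print(table)                             # [(2, 4), (0, 3), (1, 1)]
print(active_sequences_per_step(table))  # [3, 2, 2, 1]
```

Because the table is sorted, the sequences still active at a given step are always the leading entries of the table, so a step can be handled as one contiguous batch that shrinks over time.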
  15. 03 Nov, 2017 1 commit
  16. 02 Nov, 2017 1 commit
    • Rewrite StaticRNN with Executor (#5224) · 0a32e74d
      Committed by Yu Yang
      * Init commit
      
      * Make executor use ProgramDescBind
      
      * Change Attribute from BlockDesc to BlockDescBind
      
      * Since the RNN needs access to the program desc, BlockDesc alone is not
        enough.
      
      * Add DeviceContext to Executor API
      
      * Rewrite RNN
      
      * Pass Python
      
      * AddBiasOp does not care num_flatten_dims
      
      * Stash
      
      * Fix MacOS Compile
      
      * Pass RNN forward
      
      * add python test
      
      * refactor test
      
      * Make compile pass
      
      * add gradopmaker
      
      * First draft done
      
      * Polish code
      
      * add grad op maker and grad infershape
      
      * Polish code
      
      * Fix backward.cc bug
      
      * Fix infershape
      
      * Rename function
      
      * add backward test
      
      * simplify recurrent test
      
      * Update
      
      * Pass unittest
      
      * Add comments & refine test
      
      * Add comments
      
      * refactor test
      
      * Complete Unittest
      
      * fix StepScopes enforce
      
      * Remove unused unittest
      
      * no type error
      
      * Update
      
      * Make RNN Pass unittest
  17. 31 Oct, 2017 1 commit
  18. 30 Oct, 2017 1 commit
  19. 27 Oct, 2017 2 commits
    • write together · 51113cfe
      Committed by chengduoZH
    • add sparse support for sum op (#5093) · 7f8574c0
      Committed by QI JUN
      * add sparse support for sum op
      
      * typo fix
      
      * fix gpu build error
      
      * fix unittest error
      
      * typo fix
      
      * infer var type and shape in op_test
      
      * follow comments
      
      * fix build error
      
      * bypass some unittests that depend on NetOp
  20. 26 Oct, 2017 6 commits
  21. 25 Oct, 2017 2 commits
    • "Serialize LoDTensor, Save/Restore model" (#4602) · fd2eb550
      Committed by dzhwinter
      * "add model format design doc"
      
      * "add restore function"
      
      * "add parse protobuf"
      
      * "move necessary information to saver.proto"
      
      * "format code"
      
      * "add gpu option"
      
      * "add lod info"
      
      * "add saveop python test wrapper"
      
      * "checkpoint reuse save operator"
      
      * "rewrite model format design doc"
      
      * "async support needed"
      
      * "fix run once"
      
      * "fix doc based on comments"
      
      * "refine based on comments"
      
      * "fix based on comments"
      
      * "remove persistable flag from framework.proto"
      
      * "add IndicateDataType to restore op"
      
      * "add save test"
      
      * "modify save restore code"
      
      * "modified the restore logic"
      
      * rm checkpoint_op.cc
      
      * rm test_checkpoint_op.py
      
      * "get inputs outputs name from execution context"
      
      * Saving each variable to an independent file
      
      * Fix bugs
      
      * Rewrite save_restore_op_test with new Python framework
      
      * Move `SaveOp` and `RestoreOp` from OpWithKernel to OpBase
      
      * Refine unit test of SaveOp and RestoreOp
      
      * fix compile error
    • write nccl c++ test case · ef257e6d
      Committed by Dong Zhihong
  22. 24 Oct, 2017 2 commits
  23. 23 Oct, 2017 1 commit