1. 10 Jan, 2018 1 commit
  2. 08 Jan, 2018 1 commit
    • cpu gpu transform function (#7191) · 0f353ab4
      Qiao Longfei committed
      * add rename guard
      
      * add device_data_transform
      
      * add device_data_transform_test
      
      * modify GetExpectedKernelType
      
      * update operator.run
      
      * support test test_label_semantic_roles
      
      * optimize code
      
      * optimize code
      
      * rename GetActualKernelType to GetExpectedKernelType
      
      * fix chunk_eval_op and device_data_transform_test
      
      * add is_same_place to place
      
      * optimize code, refine rename_guard
      
      * refine rename guard, add GetKernelTypeForVar
      
      * optimize code
      
      * add some log
      
      * rename guard
      
      * use sub scope to create var
      
      * fix compile
      
      * add IsInitialized for Tensor
      
      * add VarIsTensor
      
      * fix op_registry_test
      
      * test
      
      * tmp disable priority
      
      * restore switch_kernel.md
      
      * code clean
  3. 05 Jan, 2018 2 commits
    • Add COWPtr and its unittest · 0cfb5465
      Yang Yu committed
      It will be used for the LoD information in LoDTensor, since LoD is a
      copy-on-write field.
      
      Copying LoD information between operators is quite slow. For ResNet
      it costs roughly 10% of the total time, including reading data.
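
The copy-on-write idea described in this commit message can be sketched as follows. This is a minimal Python illustration, not the `COWPtr` added by the commit: the method names `share`, `data`, and `mutable_data` and the offset-style LoD value are assumptions made for the example.

```python
import copy


class COWPtr:
    """Copy-on-write handle (hypothetical sketch): readers share one
    payload, and a writer detaches to a private copy before mutating."""

    def __init__(self, payload):
        # The payload sits in a one-element list ("box") so that several
        # handles can point at the same box.
        self._box = [payload]
        # Shared counter of handles that point at the box.
        self._owners = [1]

    def share(self):
        """Return a new handle to the same payload without copying it."""
        other = COWPtr.__new__(COWPtr)
        other._box = self._box
        other._owners = self._owners
        self._owners[0] += 1
        return other

    def data(self):
        """Access intended for reading; never copies. Python cannot enforce
        read-only use, so callers are trusted not to mutate the result."""
        return self._box[0]

    def mutable_data(self):
        """Access for writing; copies the payload first if it is shared."""
        if self._owners[0] > 1:
            self._owners[0] -= 1
            self._box = [copy.deepcopy(self._box[0])]
            self._owners = [1]
        return self._box[0]


# Sharing is cheap; the copy happens only on the first write.
lod_a = COWPtr([[0, 3, 4, 8]])
lod_b = lod_a.share()
lod_b.mutable_data()[0].append(10)
```

Handles that are garbage collected are not subtracted from the counter in this sketch, which can only cause an unnecessary copy, never a missed one.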
    • Feature/use cudnn (#7141) · 5593858d
      dzhwinter committed
      * "add c++ side kernel selection"
      
      * "add multiple kernel op test"
      
      * "kernel selection only support cudnn"
      
      * "better formatter"
      
      * "small fix with UseCPU"
      
      * "depends on change interface Get(Place, Library)"
      
      * "fix CI"
      
      * "fix python cudnn test"
      
      * "leave the register cudnn op to another PR"
      
      * "fix CI"
      
      * "use all kernel by default"
      
      * "fix CI"
  4. 04 Jan, 2018 1 commit
  5. 03 Jan, 2018 1 commit
  6. 02 Jan, 2018 1 commit
    • Feature/transform (#7111) · 899a79cc
      dzhwinter committed
      * "fix data transform"
      
      * "data transformer"
      
      * "add device pool"
      
      * "add test"
      
      * "fix ci"
      
      * "fix datalayout implementation "
      
      * "fix based on comment"
  7. 28 Dec, 2017 4 commits
  8. 27 Dec, 2017 2 commits
  9. 26 Dec, 2017 2 commits
    • Add data transform fn (#6953) · f97f69fe
      Qiao Longfei committed
      * init data_transform
      
      * complete DataTransform
      
      * fix build error
      
      * add data_transform_test
      
      * add a register test for data_transform_fn
      
      * use function to simulate registration macro
      
      * add register macro
      
      * update test
      
      * clean code
      
      * restore unrelated code
      
      * update data transform test
      
      * generate unique name for REGISTER_DATA_TRANSFORM_FN
      
      * add const
      
      * follow comment
      
      * update KernelTypePair hash function
    • "fix threadpool style" (#7017) · 80dafdf5
      dzhwinter committed
      * "fix threadpool style"
      
      * "remove header"
  10. 25 Dec, 2017 2 commits
    • Implement a simple threadpool (#6684) · 127bc2e0
      Yancey committed
      * implement a simple threadpool
      
      * unlock before cv.notify
      
      * add done function
      
      * add lock with GetAvailable function
      
      * delete done_
      
      * using call_once in GetInstance
      
      * update by comment
      
      * update comment
      
      * enhance unit test for multi threads task
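
The bullets above mention a shared instance obtained through `GetInstance` and tasks executed by worker threads. A rough Python sketch of that shape follows. It is an illustration under assumptions, not the C++ implementation from the commit: the class name `SimpleThreadPool` and the methods `get_instance`, `run`, and `wait` are hypothetical, and a lock around lazy creation stands in for `call_once`.

```python
import queue
import threading
from concurrent.futures import Future


class SimpleThreadPool:
    """Hypothetical fixed-size pool: workers pull tasks from a queue."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self, num_threads):
        # queue.Queue does its own locking and notification internally.
        self._tasks = queue.Queue()
        self._workers = [
            threading.Thread(target=self._work, daemon=True)
            for _ in range(num_threads)
        ]
        for worker in self._workers:
            worker.start()

    @classmethod
    def get_instance(cls, num_threads=4):
        """Create the shared pool on first use. num_threads only matters
        on the first call; later calls return the existing pool."""
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls(num_threads)
        return cls._instance

    def _work(self):
        # Each worker loops forever; daemon threads end with the process.
        while True:
            fn, args, future = self._tasks.get()
            try:
                future.set_result(fn(*args))
            except Exception as exc:
                future.set_exception(exc)
            finally:
                self._tasks.task_done()

    def run(self, fn, *args):
        """Queue a task and return a Future holding its eventual result."""
        future = Future()
        self._tasks.put((fn, args, future))
        return future

    def wait(self):
        """Block until every queued task has finished."""
        self._tasks.join()


pool = SimpleThreadPool.get_instance()
futures = [pool.run(pow, i, 2) for i in range(10)]
pool.wait()
squares = [f.result() for f in futures]
```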
    • add op_kernel_type_test · 313afc9c
      qiaolongfei committed
  11. 24 Dec, 2017 1 commit
    • Feature/operator run place (#6783) · 735eba29
      dzhwinter committed
      * "change operator interface"
      
      * "move devicepool to device_context"
      
      * "fix operator test"
      
      * "fix op_registry Run interface"
      
      * "net op passed. Need to fix nccl multi-Context"
      
      * "add nccl group function"
      
      * "add nccl group function"
      
      * "fix gpu count exceed 32 error"
      
      * "fix recurrent op, nccl op"
      
      * "change the other operators interface with Place"
      
      * "fix typo"
      
      * "fix pybind"
      
      * "fix device in python side"
      
      * "fix pybind failed"
      
      * "add init for test"
      
      * "fix CI"
  12. 18 Dec, 2017 1 commit
    • Feature/global context (#6537) · 24fda392
      dzhwinter committed
      * "add DeviceContextPool"
      
      * "add devicecontextpool in pybind"
      
      * "add comments in python side "
      
      * "fix static link error"
      
      * "fix CI error"
      
      * "add executor.py"
      
      * "fix CI error"
      
      * "add with gpu macro"
      
      * "remove comment out codes"
      
      * "add TODO items"
      
      * "update init devices"
  13. 26 Nov, 2017 1 commit
    • Feature/copytensor (#5455) · 45062fe5
      dzhwinter committed
      * "make global tensor function independently"
      
      * "replace functor"
      
      * "fix inline template error"
      
      * "fix tensor array with CopyFrom"
      
      * "fix other case use CopyFrom"
      
      * "move the op interface hardly"
      
      * "fix operators"
      
      * "fix typo"
      
      * "delete dynamic recurrent rnn and fix gru_unit in debugmode"
      
      * "fix unique_ptr copy"
      
      * "fix cuda copy"
      
      * "fix namespace error"
      
      * "removed nccl python test"
      
      * "fix include error"
      
      * "fix typo"
      
      * fix copy util test
  14. 15 Nov, 2017 1 commit
  15. 04 Nov, 2017 1 commit
    • Add LoDRankTable (#5349) · 74849158
      Yu Yang committed
      * Add LoDRankTable
      
      LoD Rank Table stores the `level` of `lod`, ordered by sequence
      length in descending order. It is useful when implementing dynamic RNN
      and is shared by the dynamic RNN memory, dynamic RNN slice input, and
      dynamic RNN slice output operators.
      
      * Add InferVarType
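
The ordering described for the LoD Rank Table can be illustrated with a short sketch. It assumes that a LoD level is a list of sequence start offsets, so that consecutive differences give the sequence lengths; the function name `build_lod_rank_table` and the `(index, length)` item layout are hypothetical choices for this example.

```python
def build_lod_rank_table(lod, level=0):
    """Return (sequence_index, length) pairs for one LoD level, sorted by
    length in descending order.

    Assumption: lod[level] holds offsets, e.g. [0, 3, 4, 8] describes
    three sequences of lengths 3, 1 and 4.
    """
    offsets = lod[level]
    items = [
        (index, offsets[index + 1] - offsets[index])
        for index in range(len(offsets) - 1)
    ]
    # sorted() is stable, so sequences of equal length keep their
    # original relative order even with reverse=True.
    return sorted(items, key=lambda item: item[1], reverse=True)


table = build_lod_rank_table([[0, 3, 4, 8]])
```

Sorting longest-first lets a dynamic RNN process a shrinking batch: at each time step only a prefix of the table is still active.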
  16. 31 Oct, 2017 1 commit
  17. 29 Oct, 2017 1 commit
  18. 28 Oct, 2017 1 commit
  19. 27 Oct, 2017 2 commits
    • add sparse support for sum op (#5093) · 7f8574c0
      QI JUN committed
      * add sparse support for sum op
      
      * typo fix
      
      * fix gpu build error
      
      * fix unittest error
      
      * typo fix
      
      * infer var type and shape in op_test
      
      * follow comments
      
      * fix build error
      
      * bypass some unittests that depend on NetOp
    • Gradient check use graph (#5027) · be00b0c4
      Yu Yang committed
      * Simplify Gradient Check
      
      * Stash
      
      * Extract apply_backward_pass to backward.py
      
      Rename apply_backward_pass to append_backward_ops
      
      * Use graph API to check gradient
      
      * Fix ci
      
      * Fix CI
      
      * Fix backward for double precision
      
      * Stash
      
      * Fix CI
      
      * Fix ci
      
      * Ignore GRU test
      
      * Ignore xe op
      
      * Fix CI
      
      * Fix softmax with xe gradient
      
      The correct equation should be IG = OG * (d_softmax_with_xe())
      
      * Fix typo
      
      * Fix merge error
      
      * Disable LRN
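
The commit above states that the softmax-with-cross-entropy gradient should be IG = OG * (d_softmax_with_xe()), that is, the upstream gradient times the local derivative. The sketch below checks that relation numerically. It assumes hard integer labels and the standard derivative (softmax minus one-hot); the function names are hypothetical and this is not the operator's actual code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_xe(logits, label):
    """Cross-entropy loss of softmax(logits) against an integer label."""
    return -math.log(softmax(logits)[label])


def softmax_with_xe_grad(logits, label, out_grad=1.0):
    """Analytic input gradient: IG = OG * (softmax - one_hot(label))."""
    probs = softmax(logits)
    return [
        out_grad * (p - (1.0 if i == label else 0.0))
        for i, p in enumerate(probs)
    ]


def numeric_grad(logits, label, out_grad=1.0, eps=1e-6):
    """Central finite differences, scaled by the upstream gradient OG."""
    grads = []
    for i in range(len(logits)):
        plus = list(logits)
        minus = list(logits)
        plus[i] += eps
        minus[i] -= eps
        diff = softmax_with_xe(plus, label) - softmax_with_xe(minus, label)
        grads.append(out_grad * diff / (2 * eps))
    return grads


analytic = softmax_with_xe_grad([0.5, -1.0, 2.0], label=2, out_grad=3.0)
numeric = numeric_grad([0.5, -1.0, 2.0], label=2, out_grad=3.0)
```

Using an upstream gradient other than 1.0 in the check is what exposes a missing OG factor.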
  20. 26 Oct, 2017 1 commit
    • Feature/save op (#5090) · efc2464f
      Yu Yang committed
      * Init
      
      * Stash
      
      * Polish SaveLoadOp
      
      * Fix CI
      
      * Polish code
      
      * Save GPU Tensor
      
      * Stash
      
      * Fix CI
  21. 25 Oct, 2017 1 commit
    • "Serialize LoDTensor, Save/Restore model" (#4602) · fd2eb550
      dzhwinter committed
      * "add model format design doc"
      
      * "add restore function"
      
      * "add parse protobuf"
      
      * "move necessary information to saver.proto"
      
      * "format code"
      
      * "add gpu option"
      
      * "add lod info"
      
      * "add saveop python test wrapper"
      
      * "checkpoint reuse save operator"
      
      * "rewrite model format design doc"
      
      * "async support needed"
      
      * "fix run once"
      
      * "fix doc based on comments"
      
      * "refine based on comments"
      
      * "fix based comments"
      
      * "remove persistable flag from framework.proto"
      
      * "add IndicateDataType to restore op"
      
      * "add save test"
      
      * "modify save restore code"
      
      * "modified the restore logic"
      
      * rm checkpoint_op.cc
      
      * rm test_checkpoint_op.py
      
      * "get inputs outputs name from execution context"
      
      * Saving each variable to an independent file
      
      * Fix bugs
      
      * Rewrite save_restore_op_test with new Python framework
      
      * Move `SaveOp` and `RestoreOp` from OpWithKernel to OpBase
      
      * Refine unit test of SaveOp and RestoreOp
      
      * fix compile error
  22. 23 Oct, 2017 1 commit
  23. 21 Oct, 2017 1 commit
  24. 20 Oct, 2017 1 commit
    • Feature/py executor test (#4922) · 3db52783
      Yu Yang committed
      * Implement FC layer with helper
      
      * Update LayerHelper
      
      * Add debug string for Python ProtoBuf
      
      and Rename `Sync` to `Flush`
      
      * Add check of ProtoBuf initialization
      
      * Layer wrapper for FC
      
      * Fix unittest
      
      * Fix CI
      
      * Add code generator
      
      * AttributeChecker: better error log and specialize bool
      
      Since lots of types can be cast to bool
      
      * Complete mlp, fit_a_line
      
      * Expose get global scope
      
      * Make global scope not thread-safe
      
      1. There is no need to make the global scope thread-safe, since it is
      only invoked from the Python main thread.
      2. Do not free the global scope when the C++ program exits. Let the OS
      free the memory; otherwise we would need to handle destruction
      dependencies.
      
      See
      https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables
      
      * Fix
      
      * Implementation of simple conv_2d layer
      
      * Stash
      
      * Remove private data members in OpRegister
      
      * Fix bugs
      
      * Stash
      
      * Expose FeedFetchList as VarType
      
      * Change ProgramDesc not a global variable
      
      * Polish code style
      
      * Stash
      
      * Correct implement BlockDesc destructor
      
      * Correct implement BlockDesc destructor
      
      * Unify program as parameter name
      
      * Fix bugs
      
      * Add unittest
      
      * Fix unit test error
      
      * Remove unused functions
      
      * Add clone for Python Program
      
      * Working on executor
      
      * Stash
      
      * Add glog as dependencies of ops
      
      * Use VLOG to log information that is helpful when debugging Paddle
      
      * Expose VarDesc::persistable to Python
      
      * Test executor
      
      * Complete unittest
      
      * Polish code
      
      * Fix merge error
      
      * Follow comment
      
      * Polish Python Code
  25. 19 Oct, 2017 3 commits
    • Add missing file. · a461bf13
      dangqingqing committed
    • Copy Constructor for ProgramDesc (#4895) · 47f773dd
      Yu Yang committed
      * Implement FC layer with helper
      
      * Update LayerHelper
      
      * Add debug string for Python ProtoBuf
      
      and Rename `Sync` to `Flush`
      
      * Add check of ProtoBuf initialization
      
      * Layer wrapper for FC
      
      * Fix unittest
      
      * Fix CI
      
      * Add code generator
      
      * AttributeChecker: better error log and specialize bool
      
      Since lots of types can be cast to bool
      
      * Complete mlp, fit_a_line
      
      * Implementation of simple conv_2d layer
      
      * Fix bugs
      
      * Change ProgramDesc not a global variable
      
      * Polish code style
      
      * Stash
      
      * Correct implement BlockDesc destructor
      
      * Correct implement BlockDesc destructor
      
      * Unify program as parameter name
      
      * Fix bugs
      
      * Add unittest
      
      * Fix unit test error
      
      * Remove unused functions
      
      * Add clone for Python Program
      
      * Compare OpDescBind directly
    • Add glog as dependencies of ops (#4908) · e9249d16
      Yu Yang committed
      * Add glog as dependencies of ops
      
      * Use VLOG to log information that is helpful when debugging Paddle
      
      * Fix Unittests
  26. 18 Oct, 2017 2 commits
  27. 17 Oct, 2017 2 commits
  28. 15 Oct, 2017 1 commit
    • create grad_var when run Backward pass (#4796) · d7383c6d
      Qiao Longfei committed
      * add target to Backward, generate var in block when call backward
      
      * modify backward_test
      
      * fix executor_test
      
      * set var desc default type to LOD_TENSOR
      
      * update backward_test
      
      * insert loss in the top level of backward
      
      * create grad vars for all blocks in current program
      
      * optimize code
      
      * update test_program.py
      
      * only create var for newly created blocks when backward