1. 13 May, 2019 1 commit
    • Optimize the elementwise op using eigen (#15494) · dcda2023
      Committed by Yiqun Liu
      * Optimize the elementwise op with CUDA kernels.
      test=develop
      
      * Support setting of attr in op config file.
      test=develop
      
      * Add support for setting dtype and initializer in config.
      test=develop
      
      * Save workspace.
      
      * Add initializer "zeros".
      test=develop
      
      * Fix compiling error.
      
      * Support using an existing file to initialize a tensor in op_tester.
      
      * Use Eigen to optimize elementwise_add/mul for the case where x and y have the same dims.
      test=develop
  2. 07 March, 2019 2 commits
  3. 26 February, 2019 1 commit
    • Optimize the CUDA implementation of sequence_expand op by reducing the number of times lod data is copied from CPU to GPU. (#15493) · f4634d76
      Committed by Yiqun Liu
      * Optimize the CUDA implementation of sequence_expand op by reducing the number of times lod data is copied from CPU to GPU.
      test=develop
      
      * Refine the op benchmark to support setting lod in config.
      test=develop
  4. 22 February, 2019 1 commit
  5. 20 December, 2018 1 commit
  6. 16 November, 2018 1 commit
    • Refine operator cmake (#14413) · a2d9b344
      Committed by Wu Yi
      * wip simplify operator framework
      
      * wip
      
      * wip
      
      * done test=develop
      
      * clean test=develop
      
      * fix test=develop
      
      * fix deps test=develop
      
      * fix cpu build test=develop
      
      * fix tensorrt build test=develop
      
      * fix tests test=develop
      
      * fix test=develop
      
      * fix cpu build test=develop
  7. 27 September, 2018 1 commit
    • Added initial pass for embedding-fc-lstm · 7ab5626d
      Committed by Jacek Czaja
      - Added draft of new operator
      
      - Added fused embedding fc lstm files
      
      - First time embedding_fc_lstm_fuse_pass was invoked in
        test_text_classification
      
      - Added Embedding pattern
      
      - Not crashing
      
      - Enabled draft of embedding_fc_lstm pass (does its job)
      
      - First working (Seqcompute only) version
      
      - Removed diagnostic comment
      
      - First enabling of BatchCompute
      
      - Disabling pass for embedding with is_sparse and is_distributed
      
      - Cosmetics
      
      - Style
      
      - Style
  8. 22 August, 2018 2 commits
  9. 15 August, 2018 2 commits
  10. 08 May, 2018 1 commit
    • Clean OpProtoAndCheckerMaker · 0e78cb69
      Committed by Yu Yang
      Do not use ctor
      
      * Reduce lines of code.
      * We can use virtual function for Maker now.
      * The implementation does not care what the maker holds, so it is easier to
      refactor later.
  11. 03 April, 2018 2 commits