Commits · dcda20233cedcc700a7556ec3fb7dbf689da6c15 · s920243400 / PaddleDetection

13 May, 2019 1 commit

Optimize the elementwise op using eigen (#15494) · dcda2023

Committed by Yiqun Liu on May 13, 2019

* Optimize the elementwise op with CUDA kernels.
test=develop

* Support setting of attr in op config file.
test=develop

* Add support for setting dtype and initializer in config.
test=develop

* Save workspace.

* Add initializer "zeros".
test=develop

* Fix compiling error.

* Support using an existing file to initialize tensors in op_tester.

* Use eigen to optimize the elementwise_add/mul for the case that x and y have the same dims.
test=develop

dcda2023

07 Mar, 2019 2 commits

Enhance the op benchmark: (#16066) · f31d515c

Committed by Yiqun Liu on Mar 07, 2019

- Support setting attr in config
- Support setting dtype and initializer for input in config
test=develop

f31d515c

Enhance the op benchmark: (#16066) · 36e2d324

Committed by Yiqun Liu on Mar 07, 2019

- Support setting attr in config
- Support setting dtype and initializer for input in config
test=develop

36e2d324

26 Feb, 2019 1 commit

Optimize the CUDA implementation of sequence_expand op by reducing the number of... · f4634d76

Committed by Yiqun Liu on Feb 26, 2019

Optimize the CUDA implementation of sequence_expand op by reducing the number of times lod data is copied from CPU to GPU. (#15493)

* Optimize the CUDA implementation of sequence_expand op by reducing the number of times lod data is copied from CPU to GPU.
test=develop

* Refine the op benchmark to support setting lod in config.
test=develop

f4634d76

22 Feb, 2019 1 commit

Initialize the benchmark tester for operator. (#15772) · 7d96c74a

Committed by Yiqun Liu on Feb 22, 2019

* Initialize the benchmark tester for operator.
test=develop

* Rearrange the codes.
test=develop

7d96c74a

20 Dec, 2018 1 commit

Add Quantize OP · 019dbf7f

Committed by xiaoli.liu@intel.com on Dec 20, 2018

test=develop

019dbf7f

16 Nov, 2018 1 commit

Refine operator cmake (#14413) · a2d9b344

Committed by Wu Yi on Nov 16, 2018

* wip simplify operator framework

* wip

* wip

* done test=develop

* clean test=develop

* fix test=develop

* fix deps test=develop

* fix cpu build test=develop

* fix tensorrt build test=develop

* fix tests test=develop

* fix test=develop

* fix cpu build test=develop

a2d9b344

27 Sep, 2018 1 commit

- Added initial pass for embedding-fc-lstm · 7ab5626d

Committed by Jacek Czaja on Sep 13, 2018

- Added draft of new operator

- Added fused embedding fc lstm files

- First time embedding_fc_lstm_fuse_pass was invoked in
  test_text_classification

- Added Embedding pattern

- Not crashing

- Enabled draft of embedding_fc_lstm pass (does its job)

- First working (Seqcompute only) version

- Removed diagnostic comment

- First enabling of BatchCompute

- Disabling pass for embedding with is_sparse and is_distributed

- Cosmetics

- Style

- Style

7ab5626d

22 Aug, 2018 2 commits

implement attention lstm cpu forward · 508548f8

Committed by tensor-tang on Aug 22, 2018

508548f8

init attention lstm · 9affc36c

Committed by tensor-tang on Aug 20, 2018

9affc36c

15 Aug, 2018 2 commits

fuse fc in lstm · 8f913295

Committed by tensor-tang on Aug 15, 2018

8f913295

init fusion lstm op · ddb05dff

Committed by tensor-tang on Aug 15, 2018

ddb05dff

08 May, 2018 1 commit

Clean OpProtoAndCheckerMaker · 0e78cb69

Committed by Yu Yang on May 08, 2018

Do not use ctor

* Reduce lines of code.
* We can use virtual function for Maker now.
* The implementation does not care what maker holds, it is easier to
refactor later.

0e78cb69

03 Apr, 2018 2 commits

Added new fc files, register fc kernel · 34a80843

Committed by mozga-intel on Mar 29, 2018

34a80843

Implementation of MKLDNN FC · 2811ea44

Committed by mozga-intel on Mar 28, 2018

2811ea44

s920243400 / PaddleDetection is in sync with the fork source project