Commits · 084d4a9e9eed55d6df0cb8a8ad1da4682da07af2 · Crayon鑫 / Paddle

20 Aug, 2018 1 commit

Optimize CRF Decoding with AVX/AVX2/AVX512F instruction (#12767) · 084d4a9e

Committed by Yihua Xu on Aug 20, 2018

* Optimize CRF decoding with AVX/AVX2 instruction

* Enable the AVX2 flags for compiling

* Clean the code and decrease the count of multiply calculation

* Add the support of AVX512 instruction to optimize CRF Decoding

* Clean the code

* Enable the AVX512f flags for compiling

* Clean the code for the invaluable switch

* Fixed the issue to check AVX512F status

* Clean the code

* Add some explanation of the key points

11 Apr, 2018 1 commit

Fix cpplint errors (#9800) · cea39121

Committed by Siddharth Goyal on Apr 10, 2018

12 Feb, 2018 1 commit

Fix the grammar in copyright. (#8403) · 24509f4a

Committed by qingqing01 on Feb 12, 2018

10 Feb, 2018 2 commits

Correct #include path · fc374821

Committed by Yi Wang on Feb 09, 2018

Move file to fluid/; Edit CMakeLists.txt · 90648f33

Committed by Yi Wang on Feb 09, 2018

08 Jan, 2018 1 commit

cpu gpu transform function (#7191) · 0f353ab4

Committed by Qiao Longfei on Jan 08, 2018

* add rename guard

* add device_data_transform

* add device_data_transform_test

* modify GetExpectedKernelType

* update operator.run

* support test test_label_semantic_roles

* optimize code

* optimize code

* rename GetActualKernelType to GetExpectedKernelType

* fix chunk_eval_op and device_data_transform_test

* add is_same_place to place

* optimize code, refine rename_guard

* refine rename guard, add GetKernelTypeForVar

* optimize code

* add some log

* rename guard

* use sub scope to create var

* fix compile

* add IsInitialized for Tensor

* add VarIsTensor

* fix op_registry_test

* test

* tmp disable priority

* restore switch_kernel.md

* code clean

12 Dec, 2017 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are as follows:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class  `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`

05 Dec, 2017 1 commit

add crf_decoding layer (#6274) · 45c8a88a

Committed by Qiao Longfei on Dec 05, 2017

* add crf_decoding layer

* fix some typo

* fix test_crf_decoding_op

04 Nov, 2017 1 commit

Add the crf_decoding operator. (#5352) · 45eabb8c

Committed by Cao Ying on Nov 03, 2017

* proj init.

* add unittest and implementation.
