Commits · 3df38f5cdd0866c1e78f1c2674d3d6cf3166d35f · 机器未来 / Paddle

10 Jan, 2020 · 1 commit

[cherry-pick] Add FC padding, ernie test unit and layernorm parallel (#22198) · 3df38f5c

Committed by GaoWei8 on Jan 10, 2020

* Optimize the kernel implementation of layernorm with openmp (#20895)

* Add ernie c++ inference test (#21015)

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* remove ngraph

* optimize gpu test
test=develop

* optimize codes
test=develop

* fix cmake fails on inference_download_and_uncompress (#21185)

* solve cmake fails on inference_download_and_uncompress
test=develop

* solve cmake fails on inference_download_and_uncompress
test=develop

* Add fc padding to improve MKL GEMM's performance when N and K are multiples of 128. (#20972)

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

* Polish the codes of fc when needs padding (#21378)

test=develop

* Add ernie large c++ inference test (#21365)

* add ernie-large test
test=develop

* add ernie large c++ inference test
test=develop

* Modify padding strategy: remove weight copy in fc padding (#21650)

test=develop

* optimize fc jit (#21878)

test=develop
Co-authored-by: Yihua Xu <yihuaxu@hotmail.com>
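
The fc padding idea named in #20972 can be sketched as below. This is a minimal illustration under stated assumptions, not Paddle's implementation: the helper names `NeedsPadding`, `PaddedDim`, `PadRowMajor`, and `Gemm`, and the pad width `kPad = 4`, are hypothetical. The commit message only says that padding helps MKL GEMM when N and K are multiples of 128. The sketch shows the general technique: give the weight matrix a leading dimension slightly larger than its column count, then pass that leading dimension to the GEMM so the result is unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pad width in elements; an assumption for illustration only.
constexpr int kPad = 4;

// Padding is considered only when both N and K are multiples of 128.
inline bool NeedsPadding(int n, int k) { return n % 128 == 0 && k % 128 == 0; }

// Leading dimension to use for a row-major matrix with `dim` columns.
inline int PaddedDim(int dim, bool pad) { return pad ? dim + kPad : dim; }

// Copy a dense rows x cols row-major matrix into a buffer whose leading
// dimension is ld >= cols. The extra columns are zero-filled and never read.
inline std::vector<float> PadRowMajor(const std::vector<float>& src, int rows,
                                      int cols, int ld) {
  assert(ld >= cols);
  assert(src.size() == static_cast<std::size_t>(rows) * cols);
  std::vector<float> dst(static_cast<std::size_t>(rows) * ld, 0.0f);
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      dst[static_cast<std::size_t>(r) * ld + c] =
          src[static_cast<std::size_t>(r) * cols + c];
    }
  }
  return dst;
}

// Reference GEMM with explicit leading dimensions:
// C[m x n] = A[m x k] * B[k x n], all row-major.
inline void Gemm(int m, int n, int k, const float* a, int lda, const float* b,
                 int ldb, float* c, int ldc) {
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (int p = 0; p < k; ++p) {
        acc += a[i * lda + p] * b[p * ldb + j];
      }
      c[i * ldc + j] = acc;
    }
  }
}
```

A caller would check `NeedsPadding(N, K)`, build the padded weight with `PadRowMajor(w, K, N, PaddedDim(N, true))`, and pass the padded leading dimensions to the GEMM. The sketch copies the weight for clarity; a real kernel would want to avoid repeating that copy on every call.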


11 Sep, 2019 · 1 commit

Implement the GPU kernel of fc operator (#19687) · a65c728e

Committed by Yiqun Liu on Sep 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Move the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop


19 Feb, 2019 · 1 commit

update comment · f2262d73

Committed by xuezhong on Feb 19, 2019

test=develop

30 Jan, 2019 · 1 commit

add sample_logits op · 58ad40cc

Committed by xuezhong on Jan 30, 2019

12 Feb, 2018 · 1 commit

Fix the grammar in copyright. (#8403) · 24509f4a

Committed by qingqing01 on Feb 12, 2018

10 Feb, 2018 · 2 commits

Correct #include path · fc374821

Committed by Yi Wang on Feb 09, 2018

Move file to fluid/; Edit CMakeLists.txt · 90648f33

Committed by Yi Wang on Feb 09, 2018

12 Dec, 2017 · 1 commit

Refine device context (#6433) · 61ec0b95

Committed by QI JUN on Dec 12, 2017

The main changes are:

- take `DeviceContext` as the template parameter of math functors and OpKernel instead of `Place`
- remove `eigen_device` interface in base class `DeviceContext`
- remove `GetEigenDevice` interface in `ExecutionContext` and base class `DeviceContext`
- remove unused `platform::EigenDeviceConverter`
- rename `REGISTER_OP_GPU_KERNEL` to `REGISTER_OP_CUDA_KERNEL`
- rename `USE_GPU_ONLY_OP` to `USE_CUDA_ONLY_OP`
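
The first item in the list, templating math functors on `DeviceContext` rather than `Place`, can be illustrated with a small sketch. The types here are hypothetical stand-ins, not Paddle's real classes: `CPUDeviceContext` is an empty placeholder and `ScaleFunctor` is an invented functor. The point shown is only the pattern: the device context type selects the specialization, and the functor receives the context object directly.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a device context type (not Paddle's class).
struct CPUDeviceContext {};

// Primary template: a math functor parameterized on the device context type
// and the element type. Each supported device provides a specialization.
template <typename DeviceContext, typename T>
struct ScaleFunctor;

// CPU specialization: multiplies every element of *data by factor in place.
template <typename T>
struct ScaleFunctor<CPUDeviceContext, T> {
  void operator()(const CPUDeviceContext& ctx, T factor,
                  std::vector<T>* data) const {
    (void)ctx;  // A real context would expose streams, handles, and so on.
    for (auto& v : *data) {
      v *= factor;
    }
  }
};
```

A call site would then look like `ScaleFunctor<CPUDeviceContext, float>()(ctx, 2.0f, &values);`, with a GPU build adding a second specialization for its own context type.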


26 Oct, 2017 · 1 commit

follow comments · 99c6f44a

Committed by chengduoZH on Oct 26, 2017

23 Oct, 2017 · 1 commit

Add sequence_project_functor · 0ab2c436

Committed by chengduoZH on Oct 23, 2017

28 Sep, 2017 · 1 commit

Add SoftmaxGradFunctor, and use SoftmaxGradFunctor in softmax_op instead. · 05ed8ee8

Committed by Liu Yiqun on Sep 28, 2017

27 Sep, 2017 · 1 commit

fix SoftmaxWithCrossEntropyOp · 325ee637

Committed by qiaolongfei on Sep 26, 2017

26 Sep, 2017 · 1 commit

fix implementations of supporting soft labels. · 8b8ad6b1

Committed by caoying03 on Sep 25, 2017

22 Sep, 2017 · 1 commit

support soft labels. · f1d5fb3b

Committed by caoying03 on Sep 21, 2017

13 Sep, 2017 · 1 commit

softmax as functor. · c6366c81

Committed by caoying03 on Sep 12, 2017

12 Sep, 2017 · 1 commit

softmax as function. · c0cef849

Committed by caoying03 on Sep 12, 2017
