1. 15 Feb 2022 (1 commit)
    • [PTen] Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Committed by Aurelius84
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
* transfer MakePenScalarArray, MakePtenScalar, ResetHolderWithType
      
* passes compilation; the next step is to remove VarType from Pten
      
* fix all and remove VarType from pten; succeeds on Linux. Next task: other platforms
      
      * fix conflict with develop
      
      * fix compiled error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compiled problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compiled error
      
      * fix npu
      
* compile npu successfully
      
      * fix conflict
      
      * fix conflict
Co-authored-by: xiongkun <xiongkun03@baidu.com>
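A minimal sketch of the type() -> dtype() call-site migration described in the commit above, using stand-in enums; TransToProtoVarType here illustrates the conversion shim placed in fluid/framework and is not necessarily the exact Paddle signature:

```cpp
// Stand-in types; conversion lives at the framework boundary, so pten
// never sees proto::VarType.
#include <cassert>

enum class ProtoVarType { FP32, FP64 };    // stand-in for proto::VarType::Type
enum class DataType { FLOAT32, FLOAT64 };  // stand-in for pten's DataType

ProtoVarType TransToProtoVarType(DataType dtype) {
  return dtype == DataType::FLOAT32 ? ProtoVarType::FP32 : ProtoVarType::FP64;
}

struct DenseTensor {
  DataType dtype() const { return DataType::FLOAT32; }
  // Before the PR, callers used tensor->type() and got proto::VarType::Type.
};

int main() {
  DenseTensor t;
  // Before: proto::VarType::Type ty = t.type();
  // After:  query dtype() and convert on demand at the framework boundary.
  ProtoVarType ty = TransToProtoVarType(t.dtype());
  assert(ty == ProtoVarType::FP32);
  return 0;
}
```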
  2. 26 Jan 2022 (1 commit)
  3. 10 Jan 2022 (1 commit)
    • [Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea
      Committed by Zhanlue Yang
      * Added shared_ptr<Allocation> member & corresponding interfaces to Storage
      
      * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
      
      * Fixed issues with storage offset
      
      * Used place to malloc allocation for TensorStorage
      
      * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
      
      * Fixed issues with place
      
      * Added comments
      
      * Moved mutable_data with stream argument to DenseTensor
      
      * Added set_offset interface
      
      * Fixed CI issues,test=allcases
      
      * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
      
      * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
      
      * Modified framework::Tensor to inherit from DenseTensor
      
* Reverted changes to the pten_layout() interface
      
      * Removed friend classes
      
* Rearranged function calls from tensor.data<void>() to tensor.data()
      
      * Fixed CI issues
      
      * Fixed lite issues
      
      * Fixed data() interface issues,test=allcases
      
      * Resolved IsInitialized() issues
      
      * Fixed ResetHolder() issues
      
      * Fixed MKLDNN & Storage issues
      
      * Resolved ShareBufferWith() issues
      
      * Fixed LoD issues
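A simplified sketch of the class relationship after this PR: framework::Tensor adds no storage of its own and reuses DenseTensor's allocation, offset, and data() members. These are stand-in classes, not the real Paddle definitions:

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

namespace pten {
class DenseTensor {
 public:
  virtual ~DenseTensor() = default;
  void* data() { return buffer_.data(); }             // unified data() interface
  size_t offset() const { return offset_; }
  void set_offset(size_t offset) { offset_ = offset; }

 private:
  std::vector<char> buffer_ = std::vector<char>(64);  // stand-in allocation
  size_t offset_ = 0;
};
}  // namespace pten

namespace framework {
class Tensor : public pten::DenseTensor {};  // inherits storage and interfaces
}  // namespace framework

int main() {
  framework::Tensor t;
  t.set_offset(16);
  std::cout << "offset=" << t.offset() << " data=" << t.data() << "\n";
  return 0;
}
```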
  4. 17 Dec 2021 (1 commit)
    • Refine some AMP operators for BERT (#37923) · d80fe268
      Committed by sneaxiy
      * support multi precision update for LAMB
      
      * hide some api
      
      * fix ci uts
      
      * fix lamb output of dygraph
      
* remove some changes that belong to another PR
      
      * try to fix Py3 CI compile error
      
      * fix test_imperative_optimizer, add lars ut, add layer_norm ut
      
      * fix ut, fix format
      
      * fix ut
      
      * fix windows ci
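The "multi precision update for LAMB" bullet refers to the standard AMP master-weight technique: the optimizer step runs in fp32 even when parameters are stored in fp16, so tiny updates are not rounded away. A hedged sketch of the idea; round_fp16 is a crude stand-in for fp16 storage, not Paddle code:

```cpp
#include <cmath>
#include <cstdio>

float round_fp16(float x) {  // emulate fp16's ~2^-10 relative resolution
  return std::ldexp(std::round(std::ldexp(x, 10)), -10);
}

int main() {
  float naive = 1.0f;                 // parameter updated directly in "fp16"
  float master = 1.0f;                // fp32 master weight
  const float lr_times_grad = 1e-5f;  // update below fp16 resolution near 1.0
  for (int step = 0; step < 1000; ++step) {
    naive = round_fp16(naive - lr_times_grad);  // rounds back to 1.0 each step
    master -= lr_times_grad;                    // accumulates correctly
  }
  float fp16_view = round_fp16(master);  // cast back for the fp16 forward pass
  std::printf("naive=%g master=%g fp16_view=%g\n", naive, master, fp16_view);
  // naive stays 1.0; master reaches ~0.99 and the fp16 copy tracks it.
  return 0;
}
```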
  5. 23 Aug 2021 (1 commit)
    • Refactor the organization of layer_norm cuda impl. (#34883) · 7f5eb533
      Committed by Li Min
      Refactor the organization of the layer_norm CUDA impl so that it can be reused in the fused attention op:
      
      * Extract the layer_norm CUDA impl from layer_norm_op.cu to layer_norm_kernel.cu.h.
      
      * Define fused/attention_layer_norm.h, which can be used in the fused attention op in the next PR.
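A CPU-only sketch of the reuse pattern this refactor enables: one templated implementation in a shared header, included by both the standalone op and the fused attention op. Paddle's real kernel is CUDA; the names, paths, and scalar loop below are illustrative only:

```cpp
#include <cmath>
#include <cstdio>

// What layer_norm_kernel.cu.h provides: a single templated implementation.
template <typename T>
void LayerNormKernel(const T* x, T* y, int n, T eps) {
  T mean = 0, var = 0;
  for (int i = 0; i < n; ++i) mean += x[i];
  mean /= n;
  for (int i = 0; i < n; ++i) var += (x[i] - mean) * (x[i] - mean);
  var /= n;
  T inv_std = 1 / std::sqrt(var + eps);
  for (int i = 0; i < n; ++i) y[i] = (x[i] - mean) * inv_std;
}

// layer_norm_op.cu and fused/attention_layer_norm.h become thin callers.
void LayerNormOp(const float* x, float* y, int n) { LayerNormKernel(x, y, n, 1e-5f); }
void FusedAttentionLayerNorm(const float* x, float* y, int n) { LayerNormKernel(x, y, n, 1e-5f); }

int main() {
  float x[4] = {1, 2, 3, 4}, y[4];
  FusedAttentionLayerNorm(x, y, 4);
  for (float v : y) std::printf("%g ", v);
  std::printf("\n");
  return 0;
}
```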
  6. 24 Jun 2021 (1 commit)
  7. 22 Jun 2021 (1 commit)
  8. 15 Jun 2021 (1 commit)
  9. 12 Jun 2021 (1 commit)
    • Fix LayerNorm Problem (#33420) · fe94db6c
      Committed by zhiboniu
* Eliminate numerical differences in LayerNorm; fix LayerNorm NaN bug with large data input
      
* fix bug with large input shapes
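The commit message does not spell out the numerical fix, but a classic source of LayerNorm NaNs on large inputs is the one-pass variance E[x^2] - E[x]^2, which cancellation can push to zero or below before the rsqrt. A hedged sketch of that failure mode next to the stable two-pass form:

```cpp
#include <cmath>
#include <cstdio>

int main() {
  const double x[2] = {1e8, 1e8 + 1};  // large, nearly equal inputs
  double mean = (x[0] + x[1]) / 2;

  // One-pass: E[x^2] - E[x]^2 suffers catastrophic cancellation here.
  double sq_mean = (x[0] * x[0] + x[1] * x[1]) / 2;
  double var_one_pass = sq_mean - mean * mean;  // ~0 or negative; truth is 0.25

  // Two-pass: subtract the mean first; the result cannot go negative.
  double var_two_pass =
      ((x[0] - mean) * (x[0] - mean) + (x[1] - mean) * (x[1] - mean)) / 2;

  std::printf("one-pass=%g two-pass=%g\n", var_one_pass, var_two_pass);
  std::printf("one-pass rsqrt=%g\n", 1.0 / std::sqrt(var_one_pass));  // inf/NaN
  return 0;
}
```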
  10. 08 Jun 2021 (1 commit)
    • add dynamic layer_norm plugin (#33293) · 45d1ae21
      Committed by Shang Zhizhou
      * add dynamic layer_norm plugin
      
      * fix bug
      
      * fix numpy.allclose
      
      * fix format
      
      * fix code style
      
* remove shape in dynamic shape
      
      * code format
      
      * remove layer norm fp16
      
      * fix format
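A framework-free sketch of what "dynamic" means for this plugin: output dimensions are derived from whatever input shape arrives at runtime rather than being fixed at engine-build time. This mimics the role of TensorRT's getOutputDimensions but is not the TensorRT API; since layer_norm preserves shape, the plugin can forward the input dims unchanged:

```cpp
#include <cstdio>
#include <vector>

struct Dims {
  std::vector<int> d;
};

// Identity shape rule: layer_norm's output has the input's dimensions.
Dims GetOutputDims(const Dims& input) { return input; }

int main() {
  for (int batch : {1, 8, 32}) {  // different runtime batch sizes
    Dims in{{batch, 128, 768}};
    Dims out = GetOutputDims(in);
    std::printf("batch=%d -> out=[%d, %d, %d]\n", batch, out.d[0], out.d[1],
                out.d[2]);
  }
  return 0;
}
```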
  11. 19 Mar 2021 (1 commit)
  12. 02 Mar 2021 (1 commit)
  13. 15 Jan 2021 (1 commit)
  14. 14 Dec 2020 (1 commit)
  15. 10 Dec 2020 (1 commit)
  16. 07 Dec 2020 (1 commit)
  17. 02 Dec 2020 (1 commit)
    • Layer norm fp16 (#29169) · 7584bb50
      Committed by furnace
      * add fp16 for layer_norm op
      
      * revert layernorm api
      
      * fix forward
      
      * fix forward
      
      * fix backward for layernorm with fp16
      
      * fix unit test for layernorm with fp16
      
      * fix with_mkldnn compile error for layernorm with fp16
      
      * 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
      
      * fix with_mkldnn compile error for layernorm with fp16
      
      * fix with_mkldnn compile error for layernorm with fp16
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
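A toy sketch of the static_cast<U> pattern mentioned in the commit above: kernels are templated on a storage type T (possibly fp16) and a wider accumulation type U, and reductions cast element-wise so fp16 data is summed in fp32. The float16 below is a fixed-point stand-in, not Paddle's platform::float16:

```cpp
#include <cstdio>

struct float16 {  // toy 8.8 fixed-point stand-in for real half precision
  short q;
  float16(float f = 0) : q(static_cast<short>(f * 256.0f)) {}
  operator float() const { return q / 256.0f; }
};

// Accumulate x in type U: U=float keeps fp16 inputs accurate; a hard-coded
// static_cast<float> would forbid using the same kernel with U=double.
template <typename T, typename U>
U MeanAs(const T* x, int n) {
  U sum = 0;
  for (int i = 0; i < n; ++i) sum += static_cast<U>(x[i]);
  return sum / static_cast<U>(n);
}

int main() {
  float16 x[3] = {1.5f, 2.5f, 3.25f};
  std::printf("mean=%g\n", MeanAs<float16, float>(x, 3));  // prints 2.41667
  return 0;
}
```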
  18. 14 May 2020 (1 commit)
  19. 20 Apr 2020 (1 commit)
  20. 06 Jan 2020 (1 commit)
  21. 05 Sep 2018 (1 commit)
  22. 08 Aug 2018 (1 commit)
  23. 12 Feb 2018 (1 commit)
  24. 10 Feb 2018 (2 commits)
  25. 03 Feb 2018 (2 commits)