Commits · b7b231a668ac51365cdce11dfafe6f7da04b2350 · PaddlePaddle / Paddle

30 Sep, 2022 · 1 commit

support pure bfloat16 for more ops (#46364) · b7b231a6

Committed by sneaxiy on Sep 30, 2022

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* add bfloat16 to selu_grad to pass CI

* fix selu grad compilation error

29 Sep, 2022 · 24 commits
- Optimize softmax's performance when dim_size >= 100000. (#46535) · 9012787f
  Committed by carryyu on Sep 29, 2022
- fix mpi include bug (#46601) · 7057093e
  Committed by Xinger on Sep 29, 2022
- [AutoParallel] fix amp when predict (#46637) · 6bc855d8
  Committed by zhaoyingli on Sep 29, 2022
- update docs for ResNetBasicBlock, test=kunlun (#44607) · 09569323
  Committed by zhangyikun02 on Sep 29, 2022
- Move valid check from python to kernel (#46412) · 37bc2d7b
  Committed by Zhang Zheng on Sep 29, 2022
```
* Move valid check from python to kernel

* fix error throw

* fix

* invalid label check

* fix

* Revert "fix"

This reverts commit 79fad6799cfa4b30423dbc84e67d7d843d22b84a.

* Revert "invalid label check"

This reverts commit 402a9707390ad5386b3222e85844b92d2e9b9fa4.

* Revert "fix"

This reverts commit 09ba3080ee0587447f875c19cdf060485f15ae3b.

* Revert "fix error throw"

This reverts commit a901bfcc2179d5c120ec29af766f392b122dab52.

* Revert "Move valid check from python to kernel"

This reverts commit baa03cc4ef82d8d45516c30dfb52bf5aead30748.

* final fix

* fix

* fix
```
- [GPUPS]add afs OpenWriter (#46611) · c7d60ce4
  Committed by zmxdream on Sep 29, 2022
```
* add afs OpenWriter

* update
```
- [Hackathon No.18] Add frexp API to Paddle (#46401) · 1e2af54c
  Committed by Zheng_Bicheng on Sep 29, 2022
```
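* The previous PR merged a large amount of incorrect code; resubmitting

* The previous PR merged a large amount of incorrect code; resubmitting

* fix formatting issues

* revert to the original format

* revise as requested

* revise formatting as requested

* fix comment issues

* update formatting

* test automatic formatting

* fix English comments

* fix docs build error

* pre-commit

* for docs build

* for docs build

* fix a bug in the mantissa calculation

* fix the case where the exponent was wrongly assumed to possibly be negative, which increased the amount of computation
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>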
```
- Add index_select, index_select_grad, reduce_min kernel and their unittests for... · 9a1855ff
  Committed by Leo Guo on Sep 29, 2022
```
Add index_select, index_select_grad, reduce_min kernel and their unittests for kunlun. Add registers of index_select, index_select_grad, reduce_min, sqrt, sqrt_grad to xpu2_op_list.test=kunlun. (#46557)
```
- [CodeStyle][F401] update flake8 F401 config (unittest/asp,interpreter,autograd,collective) (#46616) · 98deee29
  Committed by Nyakku Shigure on Sep 29, 2022
- [CodeStyle][F401] remove unused imports in unittests/collective (#46615) · 0ef7a02f
  Committed by Nyakku Shigure on Sep 29, 2022
```
* [CodeStyle][F401] remove unused import in unittests/collective

* empty commit, test=document_fix

* empty commit
```
- fix P40 topk: Make the optimized topk compatible with P40. (#46547) · 667082c0
  Committed by carryyu on Sep 29, 2022
```
* fix P40 topk: Make the optimized topk compatible with P40.

* fix P40 topk: Make the optimized topk compatible with P40.

* fix P40 topk: Make the optimized topk compatible with P40.
```
- Remove calibration file path when deploy quantize model (#46283) · d71f1b3f
  Committed by yeliang2258 on Sep 29, 2022
```
* remove calibration file path

* remove useless code
```
- [MLU] add mlu kernel for add_reduce_max_grad (#45651) · 1ef1cace
  Committed by 光明和真理 on Sep 29, 2022
```
Co-authored-by: liupeiyu <liupeiyu@cambricon.com>
```
- [AutoParallel] fix reshard when train with eval (#46605) · 8e9c719d
  Committed by zhaoyingli on Sep 29, 2022
```
* [AutoParallel] fix reshard when train with eval

* fix mppp
```
- add register for strided_slice_grad (#46549) · 40ab6faf
  Committed by ming1753 on Sep 29, 2022
- [CodeStyle][F401] remove unused import in unittests/{asp,autograd,interpreter} (#46376) · f6039929
  Committed by Nyakku Shigure on Sep 29, 2022
- fix uniform_rand_kernel FP16 support in dygraph mode (#46212) · ccab0e2a
  Committed by 傅剑寒 on Sep 29, 2022
- [OptLayoutSelect] Select the highest priority layout (#46598) · 596d8209
  Committed by HongyuJia on Sep 29, 2022
```
* select highest priority layout

* opt performance, save virtual table find
```
- [Fix KernelKeyParser] Unify the logic of `operator()` in `KernelKeyParser` (#46560) · 4140d7ec
  Committed by HongyuJia on Sep 29, 2022
```
* add datatype check for ParseKernelKeyByInputArgs

* polish error message

* Actually, einsum has vector<Tensor> inpute with DataType::COMPLEX64, see test_einsum_v2.py

* headerfile remove enforce.h
```
- Improve the python file annotation check strategy for precise testing (#46559) · 3e0a1765
  Committed by zhangbo9674 on Sep 29, 2022
```
* test

* test

* refine check pr is_comment chanege

* test
```
- [CustomDevice] add to_static, amp ut (#46536) · acf785b6
  Committed by ronnywang on Sep 29, 2022
```
* [CustomDevice] add to_static, amp ut

* update

* fix failed ut

* update
```
- [Eager, Performance optimization] support mod / matmul ( % and @ operator) to... · 7d7444cc
  Committed by Weilong Wu on Sep 29, 2022
```
[Eager, Performance optimization] support mod / matmul ( % and @ operator) to sink to Cpp layer (#46565)

* [Eager, Performance optimization] support mod ( % operator) to sink to Cpp layer

* fix mod logic

* support matmul math operator

* rm LOG(warning), use VLOG(6)

* fix conflicts mistake
```
- [XPU] update xpu cmake to 0928. (#46437) · 58a478f8
  Committed by houj04 on Sep 29, 2022
```
* [XPU] update xpu cmake to 0923. test=kunlun

* [XPU] update xpu cmake to 0928. test=kunlun
```
- check change of unittest before checking coverage rate,test=coverage (#46593) · 2f76ddd7
  Committed by risemeup1 on Sep 29, 2022
```
* check change of unittest before checking coverage rate,test=coverage

* modify paddle_build.sh

* adding test_list.py
```

28 Sep, 2022 · 15 commits
- fix collective helper (#46582) · bd10211c
  Committed by sneaxiy on Sep 28, 2022
- [AutoParallel] fix process_mesh (#46583) · 7a7826b7
  Committed by zhaoyingli on Sep 28, 2022
- [AutoParallel] fix sharding (#46572) · e65cdaee
  Committed by zhaoyingli on Sep 28, 2022
- [AutoParallel] fix dist_split (#46505) · e87f65c3
  Committed by zhaoyingli on Sep 28, 2022
```
* [AutoParallel] fix dist_split

* add unittest

* update cmakelist
```
- Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e
  Committed by Chen Weihang on Sep 28, 2022
```
* remove needless using tensor

* remove needless using tensor

* resolve conflict

* replace tensor using

* fix format error

* revert needless changing

* fix rocm and npu compile error

* fix cinn compile error

* fix format error

* fix mkldnn format error

* fix mkldnn format error

* fix cinn compile error

* fix cinn compile error

* fix cinn compile error

* resolve conflict
```
- Convert GradMergeAllReduceOpHandle in GraphToBlock (#46544) · 6a706e63
  Committed by Ruibiao Chen on Sep 28, 2022
```
* Convert GradMergeAllReduceOpHandle in GraphToBlock

* Set FLAGS_CONVERT_GRAPH_TO_PROGRAM to False
```
- [New AD] Fix p_norm n=1 issue (#46514) · 3fc4fa29
  Committed by Jiabin Yang on Sep 28, 2022
```
* fix p_norm n=1 issue

* fix p norm test error
```
- [dygraph sharding] Overlap the reduce and the caculation for sharding stage 2. (#46495) · 9c01eaed
  Committed by Yuang Liu on Sep 28, 2022
- rename filenames from pten to phi (#46579) · 3f8585a9
  Committed by HongyuJia on Sep 28, 2022
- [phi Backend] Change BackendSet from uint64_t to uint32_t (#46532) · f6f8c935
  Committed by HongyuJia on Sep 28, 2022
```
* change BackendSet from 64bits to 32bits

* fix _MSC_VER error, __lzcnt32->__lzcnt

* fix __GNUC__ error, __builtin_clzl->__builtin_clz
```
- [Eager, Performance optimization] support less_than & less_equal( < & <=... · 7d238139
  Committed by Weilong Wu on Sep 28, 2022
```
[Eager, Performance optimization] support less_than & less_equal( < & <= operator) to sink to Cpp layer (#46542)
```
- [GPUPS]fix ChannelReader (#46575) · 2aec65be
  Committed by zmxdream on Sep 28, 2022
- remove const qualifier in function return (#46546) · 8c5b9cf8
  Committed by Leo Chen on Sep 28, 2022
- Replacing set_format with set_mem_desc in FC onednn kernel (#46372) · 844d9855
  Committed by Jacek Czaja on Sep 28, 2022
```
* added fc int8 tests

* CI fix

* added skipping UTs for GPUs

* fixes for CI

* added support for residual connections inside fc

* fix for quant int8 bias

* - lint
Co-authored-by: jakpiase <jakpia21@gmail.com>
```
- first commit (#46525) · 806b252c
  Committed by limingshu on Sep 28, 2022

PaddlePaddle / Paddle · synced successfully about 1 year ago
