Commits · 0b39b244f1567f5fb8dc89e888ded57f5daf792c · PaddlePaddle / Paddle

17 Oct, 2022 · 2 commits

Support BF16 training for sharding (#46846) · 0b39b244

Committed by Ghost Screaming on Oct 17, 2022

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
is wrong.

* support pure bfloat16

* support bf16 linear

* update PR to pass CI

* tiny fix where_grad_kernel.cu

* Support bfloat16 type for reducer and sharding.

* Fix some bug.

* Polish code.

* Polish code.

* Add bfloat16 datatype in fill_grad kernels.

Co-authored-by: sneaxiy <sneaxiy@126.com>

[PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
Committed by YuanRisheng on Oct 17, 2022
```
* namespace modify

* update by comment
```
13 Oct, 2022 · 2 commits
- [Zero-Dim] support 0D for paddle.transpose/reshape/stack/tile/unsqueeze (#46555) · 78add057
  Committed by zhouweiwei2014 on Oct 13, 2022
- Revert #46111 (#46961) · cf9ca61d
  Committed by Zhang Ting on Oct 13, 2022
```
* Revert "[Hackathon No.56&38] Add float16 data type support and forward-pass speedup for the deformable_conv_v1 operator (#46111)"
```
11 Oct, 2022 · 1 commit
- set_value_op: add support for complex types (#46884) · 34c7e3e3
  Committed by Feiyu Chan on Oct 11, 2022
10 Oct, 2022 · 1 commit
- [Hackathon No.56&38] Add float16 data type support and forward-pass speedup for the deformable_conv_v1 operator (#46111) · 5e0614a1
  Committed by Rayman on Oct 10, 2022
```
support fp16 for deformable conv
```
23 Sep, 2022 · 1 commit
- move selected_rows_functor (#46373) · b6c6f4f9
  Committed by YuanRisheng on Sep 23, 2022
20 Sep, 2022 · 4 commits

move reduce func (#46248) · 6b47507d
Committed by YuanRisheng on Sep 20, 2022

[Eager Bug fix]Fix Detection (#46147) · 192e7ccf

Committed by Jiabin Yang on Sep 20, 2022

* fix linspace error in amp

* fix log

* fix amp error

* Revert "Simplify size op impl (#45808)"

This reverts commit c252b1de.

* fix_seg

* fix detection

Co-authored-by: Chen Weihang <sunny_cwh@163.com>

[Eager] Fix ocr (#46124) · d13a4a25

Committed by Jiabin Yang on Sep 20, 2022

* fix linspace error in amp

* fix log

* fix amp error

* fix ocr error which caused by amp

* add more check

* rename dtype ns

[PolishComments] Polish some code comments (#46032) · 56f9452c
Committed by HongyuJia on Sep 20, 2022
```
* polish code comments

* polish data_device_transform.cc
```
19 Sep, 2022 · 3 commits
- [PHI]Move sum op to PHI (#45860) · 4b3f2af1
  Committed by YuanRisheng on Sep 19, 2022
```
* move sum

* fix ci bugs

* fix ci bugs

* fix set_lod bugs

* fix infershape bugs

* fix ci bugs

* fix ci unittest bug

* fix ci bugs

* perfect code

* update code according comment

* add unittest

* fix ci bugs
```
- Revert "Simplify size op impl (#45808)" (#46123) · d963e2e4
  Committed by Chen Weihang on Sep 19, 2022
```
This reverts commit c252b1de.
```
- [vision.ops.nms] Fix return order error and duplicate results with specific inputs (#46148) · 2b76db99
  Committed by RichardWooSJTU on Sep 19, 2022
```
* fix return order error and duplicate results with specific inputs
```
18 Sep, 2022 · 1 commit
- Delete redundant param in SoftmaxFunctor (#46003) · 7f346a76
  Committed by YuanRisheng on Sep 18, 2022
```
* perfect softmax functor

* fix compile bugs

* fix ci bugs
```
14 Sep, 2022 · 1 commit
- Support fp16 for index_select and index_add (#45601) · 61012a76
  Committed by Li Min on Sep 14, 2022
13 Sep, 2022 · 2 commits
- add softmax infer kernel (#45955) · 01888482
  Committed by JingZhuangzhuang on Sep 13, 2022
```
* add softmax infer kernel
```
- migrate squeeze kernel to phi, test=kunlun (#45968) · d3366853
  Committed by ykkk2333 on Sep 13, 2022
09 Sep, 2022 · 1 commit
- Simplify size op impl (#45808) · c252b1de
  Committed by Chen Weihang on Sep 09, 2022
```
* simplify size op

* trans to cuda manually

* fix copy error
```
07 Sep, 2022 · 1 commit
- [OpAttr]Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose (#45620) · fe169bf1
  Committed by WangZhen on Sep 07, 2022
```
Adapt tensor output_size for conv2d_transpose and depthwise_conv2d_transpose
```
06 Sep, 2022 · 2 commits
- migrate unsqueeze kernels to phi, test=kunlun (#45673) · 4acf1ef7
  Committed by ykkk2333 on Sep 06, 2022
- Completes basic dtypes for collective api in eager mode (#45574) · 7a92e74b
  Committed by Wen Sun on Sep 06, 2022
01 Sep, 2022 · 1 commit

[phi] Migrate uniform_random XPU kernel to PHI (#45583) · ded33b58

Committed by HongyuJia on Sep 01, 2022

* copy kernel file to phi

* delete some code

* migrate uniform_random, test=kunlun

* fix input error, test=kunlun

* fix gpu register error, test=kunlun

* add include file, test=kunlun

* try fix error from CI, test=kunlun

* polish other PR

* fix CI-coverage error, test=kunlun

31 Aug, 2022 · 6 commits

enhance grid_sampler cpu kernel to 5D input (#45578) · 663ebd5f
Committed by duanyanhui on Aug 31, 2022
```
* enhance grid_sampler cpu kernel to 5D input

* fix bug when 5D input tensor running on the cudnn kernel
```
[PHI]Move elementwise div/mul of XPU kernel to PHI (#45581) · f41b8566

Committed by YuanRisheng on Aug 31, 2022

* move elementwise test=kunlun

* move add/sub/mul/div kernel to elementwise_kernel, test=kunlun

* fix ci bugs,test=kunlun

* fix ci bugs

* test=kunlun

[phi] Migrate truncated_gaussian_random XPU kernel to PHI (#45529) · c2942144

Committed by HongyuJia on Aug 31, 2022

* migrate truncated_gaussian_random kernel to phi, test=kunlun

* reuse CPU kernel, test=kunlun

* debug kernel, test=kunlun

* migrate truncated_gaussian_random kernel to phi, test=kunlun

* split truncated_normal, test=kunlun

* try fix error from CI, test=kunlun

[OpAttr]output_size of unpool support Tensor type (#45543) · 236ac0d0
Committed by Aurelius84 on Aug 31, 2022
```
* [OpAttr]output_size of unpool support Tensor type

* fix coverage

* fix contain_var

* fix coverage
```
Fix split api bug (#45396) · 4a25b60d

Committed by Charles-hit on Aug 31, 2022

* fix split bug

* solve function redefine

* fix fluid.layers.split and add unit test

* delete splitInferMeta register in unary.cc

* modify test_split_op GPU unit test

* modify test_split_op GPU unit test place param

* refactor split op and fix infershape bugs

* add () in && and ||

* fix split C++ unit test

* fix split infershape

Add index add API (#45176) · 45171911
Committed by Li Min on Aug 31, 2022

30 Aug, 2022 · 4 commits
- [OpAttr]Adapt tensor axis for argmin/max (#45453) · 6fc15986
  Committed by WangZhen on Aug 30, 2022
```
* Adapt tensor axis for argmin/max

* Add UT

* Polish UT
```
- [OpAttr]Adapt tensor axis for reduce_min/max/mean/sum/prod (#45078) · 32f42e94
  Committed by WangZhen on Aug 30, 2022
```
* [OpAttr]Adapt tensor axis for reduce_min/max/mean/sum/prod
```
- Adapt tensor num_samples for multinomial (#45522) · c857841e
  Committed by WangZhen on Aug 30, 2022
- rename mod c api name (#45476) · ad96fe2c
  Committed by Chen Weihang on Aug 29, 2022
29 Aug, 2022 · 1 commit

[geometric]Move graph-related incubate api to geometric (#44970) · 8f657f74

Committed by Siming Dai on Aug 29, 2022

* move incubate to geometric

* add paddle.geometric

* fix unittest bug

* add float16 support for segment op

* change reindex and sample neighbors flag name

* add heter graph reindex

* move sample_neighbors.py to neighbors.py

* delete khop_sampler in geometric

* delete unused code

* change sample_neighbors api input order

* fix en doc

* fix unittest

* fix unittest

* change reindex

* fix division by 0

* delete unnecessary input argument

* delete final_state

25 Aug, 2022 · 4 commits

Enable OMP multithreading in lookup_table_v2 (#45249) · 0c363de8

Committed by piotrekobi on Aug 25, 2022

* Add omp parallel for directives

* Revert "Add omp parallel for directives"

This reverts commit f4e4f8ddb12454018d9c1e49c074af2543659de6.

* Add #pragma omp parallel for to correct file

* Add check for _OPENMP definition

* Disable omp on gpu

* Trigger CI

* Readd check for _OPENMP definition

* Change macro disabling changes on GPU

* Improve macro readability

[OpAttr]min/max of uniform_random support Tensor type (#45417) · c8955d0d
Committed by Aurelius84 on Aug 25, 2022
```
* [OpAttr]min/max of Uniform_rand support Tensor type

* fix typo
```
make full_like support double_max in dygraph (#45385) · edd66f2e
Committed by Sing_chan on Aug 25, 2022
```
* make full_like support double_max in dygraph

* fix bug
```
[triu_indices] add triu_indices_op (#45168) · a410c397
Committed by Rayman on Aug 25, 2022

24 Aug, 2022 · 2 commits

make tensor_util contains no cuda code (#45256) · 78916a7a

Committed by Leo Chen on Aug 24, 2022

* make tensor_util contains no cuda code

* refine isfinite

* revert ut

* move isfinite function to its op

* fix test

* fix compile

* std::isnan is not defined for int type on windows

* fix windows compile

* fix fp16

* fix rocm compile

* revert gradient node

Adapt tensor axis for cumsum (#45372) · 7f49b9ba
Committed by WangZhen on Aug 24, 2022

PaddlePaddle / Paddle · synced successfully about 1 year ago
