Commits · 6d78524c27732fdc4f3505815d392d8f24b2dca8 · PaddlePaddle / Paddle

May 27, 2022 · 1 commit

[Phi] Change optional tensor from `optional<const Tensor&>` to `optional<Tensor>` (#42939) · 6d78524c

Committed by zyfncg on May 27, 2022

* refactor the optional tensor

* remove optional<MetaTensor> in InferMeta

* fix bug

* fix optional<vector<Tensor>>

* fix bug

* fix rmsprop

* fix amp of eager_gen

* polish code

* fix deleted code

* fix merge conflict

* polish code

* remove is_nullopt_

* fix merge conflict

* fix merge conflict

6d78524c
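
The change from `optional<const Tensor&>` to `optional<Tensor>` can be pictured with a small sketch. It rests on assumptions: `Tensor` here is a hypothetical stand-in whose copies share one buffer, `SumWithOptionalBias` is an invented helper, and `std::optional` is used in place of Paddle's own optional type. The commit message does not state its motivation, so this only shows what an optional input held by value looks like.

```cpp
#include <memory>
#include <optional>
#include <vector>

// Hypothetical stand-in for a tensor handle (not Paddle's Tensor class):
// copying it copies a shared pointer, not the underlying buffer.
struct Tensor {
  std::shared_ptr<std::vector<float>> data;
};

// An optional input held by value. With std::optional, the reference form
// std::optional<const Tensor&> is not available, because std::optional does
// not accept reference types.
float SumWithOptionalBias(const Tensor& x, const std::optional<Tensor>& bias) {
  float total = 0.0f;
  for (float v : *x.data) total += v;
  if (bias.has_value()) {
    // The bias is only read when the caller actually supplied one.
    for (float v : *bias->data) total += v;
  }
  return total;
}
```

A caller passes either a `Tensor` or `std::nullopt`; because the handle is cheap to copy in this sketch, holding it by value costs little.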

Mar 16, 2022 · 1 commit
- [Ops] segment pool op support for int int64 kernel. (#40577) · 6849d33b
  Committed by Zhong Hui on Mar 16, 2022
```
* segment pool support for int int64 kernel.

* add support in python api
```
  6849d33b
Mar 10, 2022 · 1 commit
- [PHI] Move segment_pool to phi. (#40099) · a07f19ee
  Committed by Zhong Hui on Mar 10, 2022
```
* move segment_pool to phi.

* mark summed ids as optional tensor.

* fix as reviews.
```
  a07f19ee
Mar 02, 2022 · 1 commit
- Move gather.h/gather.cu.h/scatter.h/scatter.cu.h to the phi library (#40043) · 09258040
  Committed by sneaxiy on Mar 02, 2022
```
* move gather.h gather.cu.h scatter.h scatter.cu.h to phi library

* fix CI

* fix rocm ci
```
  09258040
Feb 20, 2022 · 1 commit

[PTen->Phi PR1] Change pten dirname and namespace to phi (#39748) · dcfe1986

Committed by Chen Weihang on Feb 20, 2022

* rename pten dir to phi

* rename namespace to phi

* rename infrt pten dir to phi

* resolve conflict

* rename pten to phi in cmake

* revert all infrt change

* change needed files

* fix infrt failed

* fix inference failed

dcfe1986

Feb 11, 2022 · 1 commit
- [Pten] move operators/math/math_function_* to pten/kernels/func (#39300) · d25a7f9e
  Committed by Feiyu Chan on Feb 11, 2022
```
* move operators/math/math_function_* to pten/kernels/func
* namespace from `paddle::operators::math` to `pten::funcs`
```
  d25a7f9e
Dec 17, 2021 · 1 commit

add launch bound to limit the registers usage for volta architecture (#38113) · 18a59822

Committed by zlsh80826 on Dec 17, 2021

According to --ptxas-options=-v, SegmentOpsKernel uses 66 registers per thread.
There are two ways to resolve this problem:

* Reduce the threads-per-block launch configuration.

* Add __launch_bounds__ so that the nvcc compiler has the information it needs to reduce register usage.

This PR chooses the __launch_bounds__ solution because changing gpu_launch_config may affect other ops.

18a59822
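
The register pressure described in this commit can be checked with simple arithmetic. The sketch below assumes that the 66-register figure is per thread and that one block may use at most 65,536 32-bit registers, the limit NVIDIA publishes for compute capability 7.0 (Volta); that limit and both helper names are assumptions and do not come from the commit message. Register allocation granularity is ignored, so the real limit may be lower.

```cpp
#include <cstdint>

// Assumed hardware limits for compute capability 7.0 (Volta). These values
// are not taken from the commit message.
constexpr std::int64_t kRegistersPerBlock = 65536;  // 32-bit registers per block
constexpr std::int64_t kWarpSize = 32;

// Largest block size, rounded down to whole warps, that fits the register
// budget when every thread needs `regs_per_thread` registers.
constexpr std::int64_t MaxThreadsPerBlock(std::int64_t regs_per_thread) {
  return (kRegistersPerBlock / regs_per_thread) / kWarpSize * kWarpSize;
}

// Highest per-thread register count that still lets `threads_per_block`
// threads launch together in one block.
constexpr std::int64_t MaxRegistersPerThread(std::int64_t threads_per_block) {
  return kRegistersPerBlock / threads_per_block;
}
```

Under these assumptions `MaxThreadsPerBlock(66)` gives 992, so a hypothetical 1024-thread block would not fit, while `MaxRegistersPerThread(1024)` gives 64. Declaring a launch bound of 1024 threads is a way of asking the compiler to stay within that per-thread budget instead of shrinking the block.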

Dec 03, 2021 · 1 commit
- refine structure for cuda and rocm (#37202) · a6d2fddb
  Committed by ronnywang on Dec 03, 2021
```
* refine structure for cuda and rocm

* update

* update

* update

* update
```
  a6d2fddb
Apr 27, 2021 · 1 commit
- [OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#32596) · 1afe1ac9
  Committed by Zhong Hui on Apr 27, 2021
```
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
```
  1afe1ac9
Oct 20, 2020 · 1 commit
- refine gpu kernel config for Paddle (#28085) · 463c72c2
  Committed by wangchaochaohu on Oct 20, 2020
  463c72c2
Sep 26, 2020 · 1 commit
- fix cpplint error for the atomic max/min · a85592bc
  Committed by Zhong Hui on Sep 26, 2020
```
fix cpplint error for the atomic max/min
```
  a85592bc
Sep 24, 2020 · 1 commit
- Add GPU Kernels of Segment Ops, support sum, max, min, mean · 4a9d21de
  Committed by Zhong Hui on Sep 24, 2020
```
Add GPU Kernels of Segment Ops, support sum, max, min, mean
```
  4a9d21de
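
For readers unfamiliar with segment ops, the four reductions named in this commit can be sketched as a CPU reference. It assumes the usual definition, in which a sorted array of segment ids groups consecutive elements and each group is reduced to one value. The function name, the string pool-type tags, and the 1-D layout are illustrative only and are not Paddle's kernel interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// CPU reference for 1-D segment pooling (illustrative, not Paddle's API).
// `ids` must be non-negative and sorted in non-decreasing order; ids[i] names
// the segment that data[i] belongs to. Segments that receive no element are
// left at 0 in this sketch.
std::vector<double> SegmentPool(const std::vector<double>& data,
                                const std::vector<int>& ids,
                                const std::string& pool_type) {
  if (data.size() != ids.size()) throw std::invalid_argument("size mismatch");
  if (data.empty()) return {};
  const std::size_t num_segments = static_cast<std::size_t>(ids.back()) + 1;
  std::vector<double> out(num_segments, 0.0);
  std::vector<std::size_t> count(num_segments, 0);
  for (std::size_t i = 0; i < data.size(); ++i) {
    const std::size_t s = static_cast<std::size_t>(ids[i]);
    const bool first = (count[s] == 0);  // first element seen for segment s
    if (pool_type == "SUM" || pool_type == "MEAN") {
      out[s] += data[i];
    } else if (pool_type == "MAX") {
      out[s] = first ? data[i] : std::max(out[s], data[i]);
    } else if (pool_type == "MIN") {
      out[s] = first ? data[i] : std::min(out[s], data[i]);
    } else {
      throw std::invalid_argument("unknown pool type");
    }
    ++count[s];
  }
  if (pool_type == "MEAN") {
    // Turn the per-segment sums into averages.
    for (std::size_t s = 0; s < num_segments; ++s) {
      if (count[s] > 0) out[s] /= static_cast<double>(count[s]);
    }
  }
  return out;
}
```

With data `{1, 2, 3, 4}` and ids `{0, 0, 1, 1}`, the sum is `{3, 7}` and the mean is `{1.5, 3.5}`. A GPU kernel has to combine partial results from many threads that may write to the same segment, which is where atomic operations typically come in.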
