Commits · 7a0b8625bd9b0581b9c4cf9c5ec9ce08cc76fe67 · PaddlePaddle / Paddle

28 Nov, 2022 1 commit

Cherrypick NV fixes to release/2.4 (#48263) · 7a0b8625

Committed by zlsh80826 on Nov 28, 2022

* Reduce squeeze2_matmul_fuse_pass, flattent tests time (#47098)

* Add missing fp32 config and reduce the testing combination

* Reduce trt matmul pass test max examples

* Loose TRT fp16 tests tolerance (#47100)

* Loose TRT half test tolerance to 1e-3 (#47101)

* Loose TRT half test tolerance to 1e-3 (#47106)

* Update distributed_strategy.proto (#46531)

* Close popen pipe after used (#47053)

* Add launch_bounds (#47285)

* Fix TRT UT failures (#47488)

* Format cherry-picked commits

* CudnnNormConvolution is no longer supported on NVIDIA Hopper GPUs (#48203)

* Skip tests that use fused_ops on H100

* Add error message to FusedOps on H100

Co-authored-by: Shijie <505749828@qq.com>
Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
Co-authored-by: Tian Zheng <tizheng@nvidia.com>

7a0b8625

19 Sep, 2022 1 commit

Add INT8 support for fused_multi_transformer_op (#45284) (#46169) · db368d5b

Committed by minghaoBD on Sep 19, 2022

Co-authored-by: RichardWooSJTU <37864677+RichardWooSJTU@users.noreply.github.com>

db368d5b

01 Aug, 2022 1 commit

unify gpu context (#44740) · 86763023

Committed by Leo Chen on Aug 01, 2022

* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes

86763023

26 Jun, 2022 1 commit

format all files in fluid using new config (#43776) · 576236a0

Committed by Sing_chan on Jun 26, 2022

576236a0

05 Jun, 2022 1 commit

【code format check upgrade】 step2：clang-format (#42840) · a3730dc8

Committed by Sing_chan on Jun 05, 2022

a3730dc8

04 Mar, 2022 1 commit

clean distribution_helper, index_impl, aligned_vector code in fluid (#40071) · b9672a1e

Committed by Leo Chen on Mar 04, 2022

* clean distribution_helper, index_impl, aligned_vector code in fluid

* fix conflicts

b9672a1e

17 Sep, 2021 1 commit

add a fusion op: fused_layernorm_residual_dropout_bias (#35151) · 7975dfcf

Committed by zhangkaihuo on Sep 17, 2021

Fuses elementwise_add, dropout, elementwise_add, and layer_norm into one operator; only the forward pass is supported.
No Python API changes.
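
For readers unfamiliar with this fusion pattern, the four steps named in the commit message can be written out as an unfused reference in plain Python. This is only a sketch: it assumes the ops are applied in the order listed, i.e. `layer_norm(residual + dropout(x + bias))`, that dropout uses upscale-in-train scaling, and that layer_norm has no learnable scale or shift. The function name `reference_layernorm_residual_dropout_bias` and its parameters are hypothetical and are not taken from the Paddle source.

```python
import math
import random


def reference_layernorm_residual_dropout_bias(
    x, residual, bias, dropout_prob=0.0, epsilon=1e-5, rng=None
):
    """Unfused reference for one row (hypothetical helper, not Paddle's API).

    Assumed order: layer_norm(residual + dropout(x + bias)).
    """
    rng = rng or random.Random(0)
    keep_prob = 1.0 - dropout_prob

    # Step 1: elementwise_add of the bias.
    biased = [xi + bi for xi, bi in zip(x, bias)]

    # Step 2: dropout. Kept values are scaled by 1 / keep_prob
    # (upscale-in-train, an assumption about the dropout mode).
    if dropout_prob > 0.0:
        dropped = [
            v / keep_prob if rng.random() < keep_prob else 0.0 for v in biased
        ]
    else:
        dropped = biased

    # Step 3: elementwise_add of the residual.
    summed = [r + d for r, d in zip(residual, dropped)]

    # Step 4: layer_norm over the row, without scale or shift.
    mean = sum(summed) / len(summed)
    var = sum((v - mean) ** 2 for v in summed) / len(summed)
    inv_std = 1.0 / math.sqrt(var + epsilon)
    return [(v - mean) * inv_std for v in summed]


if __name__ == "__main__":
    row = reference_layernorm_residual_dropout_bias(
        x=[1.0, 2.0, 3.0, 4.0],
        residual=[0.5, 0.5, 0.5, 0.5],
        bias=[0.1, 0.1, 0.1, 0.1],
        dropout_prob=0.0,
    )
    print(row)
```

A fused kernel would produce the same result in a single pass over the data instead of materializing each intermediate list, which is the usual motivation for this kind of operator.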

7975dfcf

16 Sep, 2021 1 commit

add a fusion op: fused_dropout_act_bias (#35129) · cee70434

Committed by zhangkaihuo on Sep 16, 2021

cee70434
