Commits · release/2.3 · 机器未来 / Paddle

11 Oct, 2022 (1 commit)
- fix_slice_convert_varlen (#46874) · a5875319
  Authored by Wangzheee on Oct 11, 2022
12 Aug, 2022 (1 commit)
- [Cherry-pick] Support nvcc lazy on for cpu. (#45090) · 4596b9a2
  Authored by xiaoxiaohehe001 on Aug 12, 2022
```
* nvcc_lazy__

* nvcc_lazy__

* nvcc_lazy__

* nvcc_lazy__

* nvcc_lazy__

* nvcc_lazy__
```
10 Aug, 2022 (4 commits)
- disable_skip_layernorm_fp16 (#45041) · 1bec83f4
  Authored by Wangzheee on Aug 10, 2022
- [Paddle-TRT] fix conv2d/int64 (#45023) · 9a04540c
  Authored by zhoutianzi666 on Aug 10, 2022
```
* fix_conv2d_2_3

* commit

* fix_conv2d_2_3

* fix_conv2d_2_3

* fix_conv2d_2_3
```
- [Cherry pick] fix quant scale name (#44903) · cbab0184
  Authored by ceci3 on Aug 10, 2022
```
* fix quant scale name (#44116)

* fix acc diff problem caused by pr #44116 (#44311)
Co-authored-by: handiz <35895648+ZhangHandi@users.noreply.github.com>
```
- [Cherry Pick] Nvcc lazy linux fix. (#44997) · 26762817
  Authored by xiaoxiaohehe001 on Aug 10, 2022
```
* nvcclazylinuxfix

* nvcclazylinuxfix
```
09 Aug, 2022 (2 commits)

[Cherry-pick] Several bugs fix (#44991) · e00aa903

Authored by Chen Weihang on Aug 08, 2022

* fix device context init error (#43910)

* Fix core so name mismatch error (#43977)

* fix core avx soname error

* remove print info

* add clip_extra (#44008)

* fix tensor stream error in custom op (#44500)

* fix custom op attr names size error (#44938)

add post layer norm (#44931) · c5f4a9cc
Authored by carryyu on Aug 09, 2022

08 Aug, 2022 (2 commits)
- add trt int8 dynamic support (#44800) · 9336dd3e
  Authored by JingZhuangzhuang on Aug 08, 2022
```
* add trt int8 dynamic support

* just support trt7+

* just for trt7.1.3.a

* Update tensorrt_subgraph_pass.cc

* delete trt_engine when it not use
```
- nvcclazylinux (#44957) · 210fa777
  Authored by xiaoxiaohehe001 on Aug 08, 2022
05 Aug, 2022 (2 commits)
- fix conflict (#44891) · 30b66f03
  Authored by zhaoyingli on Aug 05, 2022
- commit (#44887) · 247002ec
  Authored by zhoutianzi666 on Aug 05, 2022
04 Aug, 2022 (3 commits)
- [cherry-pick] fix QuantizeLinear pass and support reduce_max in quantization (#44872) · 24b3bbde
  Authored by Guanghua Yu on Aug 04, 2022
```
* fix QuantizeLinear kernel and pass in QAT (#44784)

* Add Reduce Max in Quant (#44825)
Co-authored-by: Chang Xu <molixu7@gmail.com>
```
- [Paddle-TRT][cherry pick] Slice to 2.3 (#44757) · 245005d4
  Authored by zhoutianzi666 on Aug 04, 2022
```
* slice_to_2.3
```
- [cherry pick] add cast trt convert (#44837) · 7cdce09b
  Authored by ccrrong on Aug 04, 2022
```
* add cast trt convert

* skip cast trt convert when input dtype is bool

* code format

* fix bug

* update unittest

* fix bug
```
03 Aug, 2022 (1 commit)
- Adjust the relative error of QR's grad (#44785) · 627e5bd5
  Authored by Yulong Ao on Aug 03, 2022
```
* Adjust the relative error of QR's grad (#42221)

* Fix the format
```
02 Aug, 2022 (4 commits)

paddle2onnx upgrade version to 1.0.0rc2 (#44759) (#44791) · cd59df5f
Authored by heliqi on Aug 02, 2022

[cherry-pick]Ort backend optimizer(#44136 #44703 #44724) (#44766) · 35297bd8

Authored by heliqi on Aug 02, 2022

* [Inference]ort backend optimizer (#44136)

* add ort clone interface

* paddle2onnx update to 1.0.0rc

* ort input_tensor use mutable data of scope

* clone ort_predictor reuse session (#44703)

* ort backend support output mutable data (#44724)

* 2.3 interface is different from the Develop interface

* 2.3 interface is different from the Develop interface

* 2.3 interface is different from the Develop interface

Pass NVIDIA_TF32_OVERRIDE to internal (#43646) (#44796) · e7547ca7
Authored by Yuang Liu on Aug 02, 2022
```
Co-authored-by: gongweibao <gongweibao@baidu.com>
```

Fix operator type record in profiler [cherry-pick PR44582] (#44654) · 6de20581

Authored by chenjian on Aug 02, 2022

* fix record event for operator type in new dygraph (#44582)

* fix new dygraph record event for op

* update unit test

* fix file mode

01 Aug, 2022 (1 commit)

[UT]fix test_poisson op random fail (#44763) · b71833ea

Authored by zhouweiwei2014 on Aug 01, 2022

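Fix random failures of the poisson op unit test.

Cause: the numerical correctness of a random op cannot be verified directly, so this unit test draws 1,000,000 random samples, counts how many fall into each histogram bin, derives a rough probability density function from those counts, and compares it with the reference probability density function. This way of testing carries some inherent error.
The smaller the sample count, the larger the error, so this PR increases the sample count (1,000,000 -> 2,000,000), which reduces the error further and keeps it within rtol.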
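
The idea described above can be illustrated with a small standalone sketch. It is not the Paddle unit test: it uses only the Python standard library, the helper names (`sample_poisson`, `poisson_pmf`, `max_histogram_error`), the rate `lam=5.0`, the bin count, and the scaled-down sample counts are all assumptions chosen so that it runs quickly. It draws Poisson samples, builds a histogram, and reports the largest gap between the empirical frequencies and the reference probabilities.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def poisson_pmf(k, lam):
    """Reference probability of observing exactly k under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def max_histogram_error(n_samples, lam=5.0, n_bins=16, seed=0):
    """Largest gap between empirical frequency and reference probability
    over the bins k = 0 .. n_bins - 1 (hypothetical helper)."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_samples):
        k = sample_poisson(lam, rng)
        if k < n_bins:
            counts[k] += 1
    return max(abs(counts[k] / n_samples - poisson_pmf(k, lam))
               for k in range(n_bins))


if __name__ == "__main__":
    # Scaled-down sample counts (assumption); the real test uses millions.
    for n in (1_000, 10_000, 100_000):
        print(n, max_histogram_error(n))
```

The sampling error of each bin frequency shrinks roughly with the square root of the sample count, which is why a larger sample makes a fixed tolerance easier to meet.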

25 Jul, 2022 (1 commit)
- [cherry-pick]remove unuse cuSparse function (#44511) · 684b12ee
  Authored by zhouweiwei2014 on Jul 25, 2022
```
cherry-pick #43626
```
19 Jul, 2022 (1 commit)

Record op shape data for profiler [cherry-pick PR43405 43578 43822] (#44384) · a2240190

Authored by chenjian on Jul 19, 2022

* add serialization for new field in event node (#43405)

* add serialization for new field in event node

* fix a bug

* add more field to memory record (#43578)

* Add infer shape in dygraph (#43822)

* record memory and op supplement info

* update

* update

* fix a bug

* fix memory recording

* fix a bug

* update

* update

* fix a bug

* update

* fix a bug

* fix a bug

* fix a bug

* update dygraph record

* add infer shape record

* fix

* fix

* fix

* add comments

* fix a bug

* fix

* fix

* add record op info

* fix file mode

* add op input shape info

* fix dependency

12 Jul, 2022 (1 commit)

add new field for event node (#43223) (#44245) · 94271bc2

Authored by chenjian on Jul 12, 2022

* add new field for event node

* fix

* fix bug

* fix bug

* fix clang

* fix clang format

* fix code format

01 Jul, 2022 (1 commit)
- make only win32 and 11.6 use external/cub (#44005) · 3cc6ae69
  Authored by Sing_chan on Jul 01, 2022
30 Jun, 2022 (4 commits)
- [Cherry-pick] Apply IOU to test_parallel_executor_seresnext_base_gpu … (#43925) · fde34eb8
  Authored by Huihuang Zheng on Jun 30, 2022
```
* [Cherry-pick] Apply IOU to test_parallel_executor_seresnext_base_gpu (#43812)
1. Fix the conflict between #43812 and current release/2.3 branch
2. test_parallel_executor_seresnext_base_gpu failed on 2 P100 GPUs with `470.82` driver.
```
- cherry pick 43934 and not format (#43935) · 83520fd2
  Authored by Sing_chan on Jun 30, 2022
- [Paddle Inference ]Fix emb pass for ernie3.0 (#43948) · 35abeda7
  Authored by Wangzheee on Jun 30, 2022
```
* fix emb pass for ernie3.0

* fix emb pass for ernie3.0

* fix emb pass for ernie3.0
```
- modify graph_pattern to thread_local (#43945) · 1ea9971a
  Authored by JingZhuangzhuang on Jun 30, 2022
29 Jun, 2022 (2 commits)

Fix elementwise_div UT by providing user defined gradients (#43536) (#43909) · 26187c27

Authored by Qi Li on Jun 29, 2022

Cherry-pick of #43536

Background in #43262

In elementwise_div UT, the numeric gradient (validation) has large relative error in comparison to analytic gradient (Paddle OP).

The default rtol for UTs is 0.005
The rtol for float32 and float64 elementwise_div OP is set to be 0.05
The rtol for float16 and bfloat16 elementwise_div OP is set to be 1.0

The relative error is too large, so this PR provides user defined gradients to test elementwise_div followed by the analytic method.
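
To make the comparison concrete, here is a minimal sketch of a gradient check for `out = x / y` on plain Python floats. It is not Paddle's OpTest machinery: the function names and the step sizes (`delta`) are assumptions for illustration only. It compares central-difference (numeric) gradients with the closed-form (analytic) ones and reports the largest relative error.

```python
def analytic_grads(x, y):
    """Closed-form gradients of out = x / y with respect to x and y."""
    return 1.0 / y, -x / (y * y)


def numeric_grads(x, y, delta):
    """Central-difference estimates of the same two gradients.

    delta must be smaller than |y| so the division stays well defined.
    """
    dx = ((x + delta) / y - (x - delta) / y) / (2.0 * delta)
    dy = (x / (y + delta) - x / (y - delta)) / (2.0 * delta)
    return dx, dy


def max_relative_error(x, y, delta):
    """Largest relative gap between numeric and analytic gradients."""
    pairs = zip(numeric_grads(x, y, delta), analytic_grads(x, y))
    return max(abs(n - a) / abs(a) for n, a in pairs)


if __name__ == "__main__":
    # Small step: the numeric gradient tracks the analytic one closely.
    print(max_relative_error(3.0, 2.0, 1e-6))
    # Coarse step relative to |y|: the numeric gradient w.r.t. y drifts.
    print(max_relative_error(1.0, 1.0, 0.5))
```

Because the division is nonlinear in `y`, the finite-difference estimate degrades as the step grows relative to `|y|`, whereas the closed-form gradients do not depend on a step size at all.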

cherry pick 43890 (#43892) · 69e82d83
Authored by ronnywang on Jun 29, 2022
```
* cherry pick 43890
```

28 Jun, 2022 (4 commits)

[cherry-pick] Fix code examples (#43904) · dc12605d

Authored by Chen Long on Jun 28, 2022

* Update api docs (#42725)

* Fix max_pool3d doc, test=document_fix (#42715)

* fix pooling doc

* fix typo test=document_fix

* fix doc typo, test=document_fix

* fix adaptive_avg_pool1d doc bug (#42721)

* fix adaptive_avg_pool1d doc bug

* fix adaptive_avg_pool1d doc bug

* fix spectral_norm en doc (#42728)

* Fix example code bugs (#42739)

* update readme test=document_fix

* fix api docs bugs test=document_fix

* fix code example bugs;test=document_fix
Co-authored-by: Linjie Chen <40840292+linjieccc@users.noreply.github.com>
Co-authored-by: Wei Shengyu <weisy11@163.com>
Co-authored-by: Walter <dongshl1226@hotmail.com>
Co-authored-by: wangna11BD <79366697+wangna11BD@users.noreply.github.com>

[Docs] Fix doc of kaiming initializer (#43823) (#43827) · 63458e5b

Authored by Jackwaterveg on Jun 28, 2022

* Update kaiming.py

* Update initializer.py

* fix doc bug;test=document_fix

* fix doc;test=document_fix

* Update initializer.py

* Update kaiming.py

* for ci;test=document_fix
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>

Cherry-pick PR43834, support mac m1 arm compile in paddle_build (#43834) (#43872) · 61bededd
Authored by pangyoki on Jun 28, 2022

[Inference TRT] elementwise layer support (#43851) · 17a2003d
Authored by zhoutianzi666 on Jun 28, 2022
```
* elementwise support

* commit
```

27 Jun, 2022 (2 commits)

[cherry-pick]Update quantization round and clip calculation methods (#43829) · ff70a269
Authored by Guanghua Yu on Jun 27, 2022
```
* update quantization clip and round

* fix quantization clip and round Attribute

* fix typo
```

[Cherry-pick] Fix incompatible error for place type (#43830) · 9e776f62

Authored by Chen Weihang on Jun 27, 2022

* Create Tensor by paddle::empty  in custom operator (#41840)

* create tensor by empty in custom op

* fix some bug

* update relu custom op demo (#43173)

* Fix incompatible error for custom op Placetype (#43749)

* fix incompatible error

* remove default constructor

* add macro

* fix cpu make error

* add DefaultGPUPlace api
Co-authored-by: zyfncg <zhangyunfei07@baidu.com>

25 Jun, 2022 (2 commits)
- Upgrade onnxruntime to 1.11.1 (#43797) · 51240331
  Authored by heliqi on Jun 25, 2022
- [new-exec] lazy creating work queue (#43551) (#43768) · 0c44dd64
  Authored by Leo Chen on Jun 25, 2022
```
* lazy creating work queue

* fix dry_run
```
24 Jun, 2022 (1 commit)

[cherry-pick] NVIDIA fixes (#43780) · 9edbe4aa

Authored by Aganlengzi on Jun 24, 2022

* Use all sitepackages path as the library/include path (#42940)

* Fix several unit tests and increase the unit tests stability (#43670)

* Reduce gather op unit tests size and increase the timeout

* Add NVIDIA_TF32_OVERRIDE for multi-processes environment

* Remove record test for device event ut

* Fix 3 unittest errors (#43532)

* Fix test_fuse_resnet_unit failure

* Fix test_imperative_auto_mixed_precision failure

* Fix sparse_attention_op error

* Fix sparse_attention_op error

* Use fixed random seed (#43659)

* for CI test_collective_sendrecv_api
Co-authored-by: zlsh80826 <rewang@nvidia.com>
Co-authored-by: Shijie <505749828@qq.com>


机器未来 / Paddle: in sync with the fork source project