Commits · 99a4ff8fe4be92c982177b735b176aa8f55fae71 · BaiXuePrincess / Paddle

30 Jun, 2022 8 commits
- [new-exec] support running with different scope and the same program using scope_guard (#43962) · 99a4ff8f
  Committed by Leo Chen on Jun 30, 2022
```
* support scope_guard

* fix test
```
  99a4ff8f
- [Dy2Static] Add non-local for while and for. (#43864) · 8279dfea
  Committed by xiongkun on Jun 30, 2022
```
* merge and add base support for non-local for

* for and while non-local support

* fix ci errors: v1

* fix bug

* fix

* fix code

* fix

* fix

* fix
```
  8279dfea
- Remove boost::variant for FetchResultType (#43932) · f720e231
  Committed by Ruibiao Chen on Jun 30, 2022
```
* Remove boost::variant for FetchResultType

* Fix pybind errors
```
  f720e231
- modify graph_pattern to thread_local (#43942) · 6467ca0d
  Committed by JingZhuangzhuang on Jun 30, 2022
```
* modify graph_pattern to thread_local

* modify graph_pattern to thread_local
```
  6467ca0d
- Add new attr of fused_multi_transformer (#43730) · c2a5bb91
  Committed by Zhang Zheng on Jun 30, 2022
```
* Add new attr of fused_multi_transformer

* fix format

* add note

* add in layer

* fixfixfixfix
```
  c2a5bb91
- [phi]add relu6 kernel and yaml (#43549) · a9bba5ba
  Committed by chentianyu03 on Jun 30, 2022
```
* add relu6 kernel and yaml

* format files

* format code and fix bug

* fix build failed
```
  a9bba5ba
- [MLU] add rnn forward kernel. (#43894) · 2616d51a
  Committed by Chenxiao Niu on Jun 30, 2022
  
  2616d51a
- fix deriv with inplace (#43930) · 1efc80c6
  Committed by Jiabin Yang on Jun 30, 2022
  
  1efc80c6
29 Jun, 2022 17 commits
- fix compiling error in cuda 11.6 windows (#43934) · 77d75aa4
  Committed by Sing_chan on Jun 29, 2022
  
  77d75aa4
- [GPUPS]Optimize dymf kernel (#43911) · a7a4843c
  Committed by zmxdream on Jun 29, 2022
  
  a7a4843c
- Support code auto-gene for optimizer api in yaml (#43915) · aa45f931
  Committed by zyfncg on Jun 29, 2022
```
* support complexd selected_rows kernel in yaml

* support configuring optimizer api in yaml

* fix data transform bug
```
  aa45f931
- pr file count is greater than 30 need to run all cases (#43906) · 78023658
  Committed by zhangchunle on Jun 29, 2022
  
  78023658
- convert to mixed model python api (#43881) · cbaebb04
  Committed by Wilber on Jun 29, 2022
  
  cbaebb04
- [Auto parallel] Bug fixed for GPT3 benchmark (#43793) · 74c9b57b
  Committed by JZ-LIANG on Jun 29, 2022
```
* fixed bug for pass & engine

* fixed bug for benchmark GPT-3
```
  74c9b57b
- Update the lock logic used in CinnCompiler::Compile. (#43876) · ccfde2da
  Committed by Zhen Wang on Jun 29, 2022
```
* Update the lock logic used in CinnCompiler::Compile.
```
  ccfde2da
- Add test ut cicheck_py37 (#43804) · 8bd69193
  Committed by tianshuo78520a on Jun 29, 2022
  
  8bd69193
- Change sparse Copy from Kernel to basic component utils (#43916) · 148fa05e
  Committed by zhangkaihuo on Jun 29, 2022
  
  148fa05e
- add equal trt converter (#43461) · 1dbbe20e
  Committed by ccrrong on Jun 29, 2022
```
* add comparisons trt converter
```
  1dbbe20e
- add kernel_decalre for xpu kp kernels (#43920) · 6132476d
  Committed by Leo Chen on Jun 29, 2022
  
  6132476d
- [new-exec] remove variable scope, stage 1 (#43865) · 9f74363f
  Committed by Leo Chen on Jun 29, 2022
```
* separate variable scope and scope

* hot fix for lod_tensor_blocking_queue

* fix bug that variable exists in global scope
```
  9f74363f
- fix device context init error (#43910) · d1ac85e5
  Committed by Chen Weihang on Jun 29, 2022
  
  d1ac85e5
- inference support mixed-precision model [1]. (#43814) · c7694b82
  Committed by Wilber on Jun 29, 2022
```
* inference add convert to mixed model ability.
```
  c7694b82
- Move apis(cross, diagonal) legacy_api.yaml to api.yaml (#43893) · 8fa8e17e
  Committed by zyfncg on Jun 29, 2022
```
* move cross from legacy_api.yaml to api.yaml

* move diagonal to api.yaml
```
  8fa8e17e
- fix custom_device log (#43890) · fb1a93a8
  Committed by ronnywang on Jun 29, 2022
  
  fb1a93a8
- skip xpu conv2d fp16 unittest (#43547) · bceca47a
  Committed by QingshuChen on Jun 29, 2022
```
* skip xpu conv2d fp16 unittest
*test=kunlun

* minor
*test=kunlun
```
  bceca47a
28 Jun, 2022 15 commits

[fused_transformer] update transformer fusion for dygraph, test=allcases (#43858) · 99b3727d
Committed by Yuang Liu on Jun 28, 2022

99b3727d

make eager_utils.h in the beginning of all headers (#43896) · 72116696
Committed by Sing_chan on Jun 28, 2022

72116696

[Dy2Stat]Enhance Python if-else by pruning useless no_return variable (#43880) · 6e0aa776
Committed by Aurelius84 on Jun 28, 2022

6e0aa776
[Dy2Stat]Unify all API name in _jst import path to improve readability (#43868) · 6cb24967
Committed by Aurelius84 on Jun 28, 2022
```
* [Dy2Stat]Polish all API name of _jst
```
6cb24967
add unittest for PR43688 (#43747) · 13451615
Committed by xiongkun on Jun 28, 2022
```
* add unittest for PR43688
```
13451615

[UPDATE FLUID API] PaddleRec-related only (#43862) · 4ef704eb

Committed by wangzhen38 on Jun 28, 2022

* [UPDATE FLUID API] only reference in paddlerec

* change lr

* [UPDATE FLUID API] only reference in paddlerec

* update by reviews

4ef704eb

change the condition to find python interpreter to avoid skipping the find process. (#43888) · 6547d256

Committed by Feiyu Chan on Jun 28, 2022

* change the condition to find python interpreter to avoid skipping the find process.
PYTHONINTERP_FOUND is the best signal that python interpreter is found.

6547d256

fix code example bugs;test=document_fix (#43905) · 323160df
Committed by Chen Long on Jun 28, 2022

323160df

Enable Bert on bfloat16 datatype (#43455) · 6d31dc93

Committed by Tomasz Socha on Jun 28, 2022

* Remove output arguments from functions.
Replace pointers with references

* Name used bool flags

* Reorder functions

* Enable bfloat16 data type

* Give declarations some space

* Style

* Style

6d31dc93

[MLU]: add roi_align and roi_align_grad kernel (#43757) · 99ea0a9c
Committed by zhaoying9105 on Jun 28, 2022

99ea0a9c

Apply IOU to test_parallel_executor_seresnext_base_gpu (#43812) · c0cf5cb7

Committed by Ming-Xu Huang on Jun 28, 2022

1. test_parallel_executor_seresnext_base_gpu failed on 2 P100 GPUs with `470.82` driver.
```
======================================================================
FAIL: test_seresnext_with_learning_rate_decay (test_parallel_executor_seresnext_base_gpu.TestResnetGPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/paddle/paddle/build/python/paddle/fluid/tests/unittests/test_parallel_executor_seresnext_base_gpu.py", line 32, in test_seresnext_with_learning_rate_decay
    self._compare_result_with_origin_model(
  File "/opt/paddle/paddle/build/python/paddle/fluid/tests/unittests/seresnext_test_base.py", line 56, in _compare_result_with_origin_model
    self.assertAlmostEquals(
AssertionError: 6.8825445 != 6.882531 within 1e-05 delta (1.335144e-05 difference)
----------------------------------------------------------------------
```
2. To evaluate loss convergence more accurately, we propose applying IOU as the metric instead of comparing the first and last loss values.
3. As discussed offline, we also evaluated convergence on P100 and A100 over 1000 iterations to make sure this UT has the same convergence property on both devices. The curves are shown below.
![A100-Single, P100-Single and Diff (1)](https://user-images.githubusercontent.com/13541238/175461920-25df6101-6dd8-4387-862c-d1c8e9299c57.png)
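
The failure in item 1 comes down to a delta comparison: the two reported values differ by slightly more than the `1e-05` tolerance. A minimal sketch of that check, assuming it behaves like the `delta` form of `unittest`'s `assertAlmostEqual`. The helper name `within_delta` and the looser `2e-05` tolerance are hypothetical and used only for illustration; this does not reproduce the IOU metric proposed in item 2.

```python
def within_delta(first: float, second: float, delta: float) -> bool:
    """Boolean mirror of unittest's assertAlmostEqual(first, second, delta=delta)."""
    return abs(first - second) <= delta


if __name__ == "__main__":
    # Values copied from the AssertionError in the log above.
    reported_a, reported_b = 6.8825445, 6.882531
    # The gap is a little over 1e-05, so the strict tolerance rejects the pair.
    print(within_delta(reported_a, reported_b, 1e-5))
    # A hypothetical looser tolerance accepts the same pair.
    print(within_delta(reported_a, reported_b, 2e-5))
```

The sketch only shows how narrowly the point-to-point comparison missed its tolerance on this hardware and driver combination.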

c0cf5cb7

[MLU]add mlu kernel for where_index op (#43720) · ec5f8cfd
Committed by fuyou765 on Jun 28, 2022

ec5f8cfd

fixes a bug, test=develop (#43884) · 5369378b
Committed by 石晓伟 on Jun 28, 2022

5369378b
【Sparse】add SparseTensor mv kernel(csr*dense_vec->dence_vec, coo*dense_vec->dense_vec) (#43668) · 5161a047
Committed by zhouweiwei2014 on Jun 28, 2022
```
* [Sparse]add SparseTensor mv kernel(csr*dense_vec->dence_vec, coo*dense_vec->dense_vec)

* fix CI
```
5161a047

[GPUPS]Optimize hbm for dymf (#43863) · 1dc2117f

Committed by zmxdream on Jun 28, 2022

* fix merge_grad&push_sparse

* change typo

* fix code format. test=develop

* fix code format. test=develop

* fix code format. test=develop

* fix debug info

* optimize hbm

* fix size_t

* fix size_t

1dc2117f
