Commits · 8ac0344a4ba3f291fa170145504cbfd9ead03d2c · PaddlePaddle / Paddle

01 Dec, 2021 2 commits
- Add paddle.rad2deg and paddle.deg2rad (#37598) · 8ac0344a
  Committed by Tao Luo on Dec 01, 2021
  
  8ac0344a
- Modify ShareTensorWithCinnBuffer by callback to save memory (#37493) · 661dbdbe
  Committed by Huihuang Zheng on Dec 01, 2021
```
Modify ShareTensorWithCinnBuffer by callback to save memory
```
  661dbdbe
30 Nov, 2021 20 commits
- [fleet_executor] interceptor run from python interface (#37693) · 8a4460f5
  Committed by WangXi on Nov 30, 2021
  
  8a4460f5
- add matmul_v2_transpose_reshape_fuse_pass to quant2_int8_mkldnn_pass.py (#37619) · 82b55961
  Committed by Sylwester Fraczek on Nov 30, 2021
  
  82b55961
- refactoring matmul_v2 mkldnn hierarchy (#37622) · fab92824
  Committed by Sylwester Fraczek on Nov 30, 2021
```
* refactoring matmul hierarchy

* review fix

* review fix

* review_FIX-part2
```
  fab92824
- add pten_transpose dependence device_context (#37705) · 5747fd1e
  Committed by chentianyu03 on Nov 30, 2021
  
  5747fd1e
- Add new unittests for gIOHW format in conv_transpose_mkldnn_op (#37344) · d93ee063
  Committed by Sławomir Siwek on Nov 30, 2021
```
* Add new unittests

* Replace I with O channel for filter groups

* Undo changes affecting other operators

* Fix oneDNN namespace typo

* Fix code format error
```
  d93ee063
- [opt] Add regularation and Nesterov for mergerd_momentum op (#37527) · c8ffdecb
  Committed by zhangbo9674 on Nov 30, 2021
```
* add regularation and Nesterov for mergerd_momentum

* refine unittest for use_nesterov attr

* refine op check

* refine code

* fix bug

* refine code of regularization_flag

* delete useless code
```
  c8ffdecb
- [Auto Parallel] elastic support auto parallel re-launch (#37523) · 5440d2f9
  Committed by xiayanming on Nov 30, 2021
```
* [Auto Parallel] elastic support auto parallel re-launch

* [Auto Parallel] elastic support auto parallel re-launch

* fix ci issue

* fix ci issue

* fix rank mapping unittest

* fix rank mapping unittest

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue

* fix ci issue
```
  5440d2f9
- Eager dygraph egr_utils_api namespace refactor (#37654) · 3d2ec707
  Committed by Zhanlue Yang on Nov 30, 2021
```
* Refactored eager legacy namespace

* Fixed namespace issues
```
  3d2ec707
- Enabled performance benchmark tests for Eager Dygraph (#37653) · eb9e3305
  Committed by Zhanlue Yang on Nov 30, 2021
```
* Enabled performance benchmark tests for Eager Dygraph

* Protected CUDA tests with macro

* Fixed dependency issues for windows-ci
```
  eb9e3305
- pscore global shuffle&default accessor config (#37626) · 1514eec6
  Committed by zhaocaibei123 on Nov 30, 2021
  
  1514eec6
- Add diff op (#37441) · 2f4c089b
  Committed by andyjpaddle on Nov 30, 2021
```
* add diff op, test=develop

* rm some notes, test=develop

* update diff doc

* update sample code

* fix diff api params and example code, test=develop
```
  2f4c089b
- add scale api and test (#37683) · 0c8b9994
  Committed by Chen Weihang on Nov 30, 2021
  
  0c8b9994
- Open trt in windows (#37397) · 5f916c37
  Committed by Sing_chan on Nov 30, 2021
```
* modify for wincheck-inference case

* modify according to zhouwei's comment

* open with_trt and block failed unittests in windows

* test
```
  5f916c37
- [fleet_executor] pass the env from carrier to interceptor (#37691) · 809a6452
  Committed by Yuang Liu on Nov 30, 2021
  
  809a6452
- support data_format='NHWC' for prelu channel mode (#37019) · 3f2a665a
  Committed by Guoxia Wang on Nov 30, 2021
```
* support data_format='NHWC' for prelu channel mode
```
  3f2a665a
- fix overflow in some cuda ops (#37670) · 0c82e3a0
  Committed by Yang on Nov 30, 2021
  
  0c82e3a0
- [Auto Parallel] Do the physical mapping between the process graph and the cluster graph (#37094) · b0dff05d
  Committed by Yulong Ao on Nov 30, 2021
```
* [Auto Parallel]  Add the unified cluster representation

* [Auto Parallel] Add the graph class for physical mapping

* [Auto Parallel] Add the simple physical mapper

* Set the timeout of the mapper

* Merge the upstream develop unittests cmake files

* Fix a bug of the process group

* Remove mapper unittest from platforms which is not GPU

* Move the instantiation of process group after resharding

* Add the local id for devices

* Update the rank mapping format

* Add some comments

* Remove the related files about mapping

* Update the unittest for auto mapping

* Remove unused rank_mapping unittest

* Improve the unittest coverage

* Improve the unittest coverage
```
  b0dff05d
- [Fleet_Executor] Passing runtime scope and place (#37603) · 87e65a99
  Committed by LiYuRio on Nov 30, 2021
  
  87e65a99
- open pten tensor test (#37673) · 0156669e
  Committed by Chen Weihang on Nov 29, 2021
  
  0156669e
- Fix test calc gradient (#37672) · a0631364
  Committed by xiongkun on Nov 30, 2021
```
* add scope_guard

* 1. fix control flow cases 2. fix calc_gradient
```
  a0631364
29 Nov, 2021 18 commits

Refactored eager legacy namespace (#37659) · 74fdba7c
Committed by Zhanlue Yang on Nov 29, 2021

74fdba7c

DLTP-40731 [Bug] xpu1+x86 environment, develop paddle package, nlp case glue_xpu1_dy_bert_bs32 (#37666) · 46c71f2c
Committed by taixiurong on Nov 29, 2021

46c71f2c

[Pten] Add reduce mean kernel, replace with mean API (#37559) · f9e9fd19

Committed by chentianyu03 on Nov 29, 2021

* add pten reduce kernel

* add reduce_sum kernel

* update attribute args and order

* make out dtype undefined

* fix empty input error

* merge develop branch

* rename sum as reduce function

* rename sum as reduce function

* fix reducekernelImpl args error

* add reduce cuda kernel

* modify dims type to const &

* remove unsed log

* fix reduce_all out eigen function error

* remove unused codes

* add the missing sum api define and testcase

* merge develop branch

* fix sum test axis value error

* replace pten mean kernel with reduce_mean

* revcover meam cuda to original implement

f9e9fd19

add expand_v2/expand_as_v2 for kunlun (#37592) · dae4e7f2

Committed by TTerror on Nov 29, 2021

* add expand_v2/expand_as_v2 for kunlun

* update expand_as_v2

* update expand_as_v2

* support float16/bool

* update xpu.cmake

dae4e7f2

continue if transform not support dtype, test=develop (#37661) · 1b00fc48
Committed by wanghuancoder on Nov 29, 2021

1b00fc48

Add third batch of deprecated mkldnn namespace name changes (#37558) · 1ba81500
Committed by piotrekobiIntel on Nov 29, 2021

1ba81500

test=document_fix (#37652) · 6b8a6220
Committed by tianshuo78520a on Nov 29, 2021

6b8a6220

update (#37620) · 1d456659
Committed by lilong12 on Nov 29, 2021

1d456659

[Pten] add cuda implement of cast kernel (#37610) · 9956763e
Committed by chentianyu03 on Nov 29, 2021
```
* add cuda implement of cast kernel

* remove bfloat16 when defined paddle_with_hip
```
9956763e

[AMP] For `amp.decorate()` optimizers set to None is ok (#37541) · 2bb3f0b5

Committed by zhangbo9674 on Nov 29, 2021

* amp.decorate optimizers set to None is ok

* refine unittest

* add unittest and refine example code

* refine unittest

2bb3f0b5

put time compute of unittests to test_unit block (#37602) · 92276ef8
Committed by Sing_chan on Nov 29, 2021

92276ef8

Support fetch lodtensor array (#37580) · a0678eb1

Committed by wanghuancoder on Nov 29, 2021

* suport fetch lodtensor array, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

a0678eb1

[fleet_executor] Hold the carrier while running for one micro step. (#37605) · 74ca89ef
Committed by Yuang Liu on Nov 29, 2021

74ca89ef

[HeterPs] fix allocation (#37476) · 27a5f52b
Committed by Thunderbrook on Nov 29, 2021
```
* auc temp

* cuballocator

* code format

* code format
```
27a5f52b

[fleet_executor] Interceptor run op (#37623) · 5b962bd9
Committed by WangXi on Nov 29, 2021

5b962bd9

[NPU] fix compile (#37648) · b6307742
Committed by Aganlengzi on Nov 29, 2021

b6307742

unity variable name in third_party cmake files (#37590) · ba6f645e
Committed by Sing_chan on Nov 29, 2021

ba6f645e

[New features] Support batch_jacobian and batch_hessian (#37547) · 4d24d352

Committed by Weilong Wu on Nov 29, 2021

* native commit for triple grad of sigmod

* Updated unittests files

* init functional jacobian api

* Updated trible_test func

* Updated gradient_checker & test_script

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* fix dygraph grad to support high differential

* polish API docstring

* Updated gradient checker and some related files

* fix double grad strip error for high differential

* fix double grad strip error for high differential

* Add Sigmoid triple grad tests

* fix dygraph double grad dtype error when calling for high differential senario

* Updated triple grad teses func

* Use np.random to initialize ddx

* Updated triple_grad_check func

* add todo for gradient checker and refine some comments

* remove additional code

* add test for warnging in backward.py

* format python code

* support multi input in triple gradient checker

* Add matmul triple grad kernel

* Updated comments of TODO

* Supported some special tests

* Change code-format to follow CI std

* Updated gradient_checker.py

* Fix conflicts

* Removed unnecessary printing log

* Change code style to follow CI std

* support batch in jacobian and hessian

* add batch jacobian and batch hessian

* Add batch_jacobian test, draft version

* [New features] Add elementwise_mul triple grad kernel (#37152)

* Add elementwise_mul triple grad kernel

* Removed InplaceInferer and polished code

* Add numerical_batch_jacobian,numerical_batch_hessian and tests

* Support batch_jacobian and batch_numerical

* Use pre-commit to check code format

* Update doc, polish code, add unit test

* Reset the TIMEOUT properties of test_jacobian to pass CI

Co-authored-by: levi131 <limaolin01@baidu.com>
Co-authored-by: Jiabin Yang <360788950@qq.com>

4d24d352

PaddlePaddle / Paddle synced successfully about 1 year ago