Commits · b937cdc51794ee5112f9ec948c4518b9931b72c9 · 机器未来 / Paddle

09 Apr, 2022 5 commits
- Autotune the workspace_size_limit in conv. (#40338) · b937cdc5
  Committed by limingshu on Apr 09, 2022
```
* Using the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode.

* Use the system cudaMalloc and cudaFree to allocate workspace during searching.

* Enable switch of two kinds of workspace setting methods.
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
```
- fix_ci_problem3 (#41484) · 9cb2287c
  Committed by Jiabin Yang on Apr 09, 2022
```
* fix_ci_problem3

* support windows no default error
```
- fix pylayer mem leak, test=develop (#41559) · be11648a
  Committed by wanghuancoder on Apr 09, 2022
- [new-exec] fix bug that no thread is waked up when adding task to threadpool (#41567) · f581f5bf
  Committed by Leo Chen on Apr 09, 2022
```
* fix bug that no thread is waked up when adding task to threadpool

* fix typo
```
- [fleet executor] Add sink interceptor and test (#41497) · b3e79731
  Committed by LiYuRio on Apr 09, 2022
08 Apr, 2022 8 commits
- Fix fake quant cuda kernel (#41305) · 330582e2
  Committed by whs on Apr 08, 2022
- fix group_norm (#41531) · 04a4bdf8
  Committed by crystal on Apr 08, 2022
```
fix group_norm vectorized address misalignment
```
- modify unittest of lstm forward, *test=kunlun (#41534) · d4710dfe
  Committed by z8hanghuan on Apr 08, 2022
```
* modify unittest of lstm forward, *test=kunlun

* modify unittest of lstm forward, *test=kunlun
```
- [Eager]Fix segment_pool/allclose/isclose/scale API bug (#41506) · 0a6fe699
  Committed by Aurelius84 on Apr 08, 2022
```
* [Eager]Fix segment_pool/allclose/isclose/scale API bug

* fix kernel register problem
```
- [ROCm] fix dcu error in device event base, test=develop (#41521) · 14dba636
  Committed by Qi Li on Apr 08, 2022
```
* [ROCm] fix dcu error in device event base, test=develop

* fix, test=develop
```
- xpu mul unittest *test=kunlun (#41140) · 770ce7cf
  Committed by taixiurong on Apr 08, 2022
- pybind support CustomPlace (#41136) · 0cd577cf
  Committed by ronnywang on Apr 08, 2022
- Add conj pixel shuffle yaml (#41499) · bc88fbb5
  Committed by hong on Apr 08, 2022
```
* add conj flip yaml

* add flip conj pixel shuffle
```
07 Apr, 2022 16 commits
- [GPUPS] bind afs wrapper (#41227) · b3bcebbe
  Committed by Thunderbrook on Apr 07, 2022
```
* afs wrapper

* format

* format

* macro
```
- remove FLAGS_use_curand and change all random op CUDA implementation (#41308) · 9714878c
  Committed by zhouweiwei2014 on Apr 07, 2022
- [Phi]Add hard_swish/kron/linspace/logit yaml file (#41298) · 90cb337e
  Committed by YuanRisheng on Apr 07, 2022
```
* add yaml

* perfect coverage
```
- use group id to differentiate keys for tcp store (#41496) · 75227c9e
  Committed by lilong12 on Apr 07, 2022
- Profile Executors (#41100) · dfb47986
  Committed by liutiexing on Apr 07, 2022
```
* Profile Executors

* update

* fix ut

* fix names

* update

* update
```
- add send/recv to/from switch module for ProcessGroupHeter (#41285) · 633ac4e6
  Committed by lilong12 on Apr 07, 2022
- Add Output(Step) to DistributedFusedLamb optimizer (#41249) · e4459a40
  Committed by sneaxiy on Apr 07, 2022
```
* add Output(Step) to distributed fused lamb op

* add _set_step
```
- Add Sparse API to_dense, to_sparse_coo and values (#41394) · f78cc3da
  Committed by zhangkaihuo on Apr 07, 2022
- Fix dygraph record event position (#41445) · 8fba68d3
  Committed by chenjian on Apr 07, 2022
```
* no

* maintain old profiler

* fix old dygraph record event
```
- ignore some failed test for KL2 (#41342) · 81389c51
  Committed by QingshuChen on Apr 07, 2022
```
* ignore some failed test for KL2
*test=kunlun

* minor
*test=kunlun

* minor
*test=kunlun
```
- modify inference model test build method to support multi version (#41027) · c9e0e10e
  Committed by Sing_chan on Apr 07, 2022
```
* change inference demo_test build method to ninja to choose visual studio version automatically

* notest;test=windows_ci_inference

* set cuda of demo_ci by arg,fix bug of ninja compile,test=document_fix;test=windows_ci;test=windows_ci_inference

* fix bug;test=document_fix;test=windows_ci;test=windows_ci_inference

* fix bug;test=document_fix;test=windows_ci_inference

* set lib_path according to generator
```
- remove cudnn_deterministic=True (#41341) · cefa91fd
  Committed by Zhang Jun on Apr 07, 2022
- momentum support l2decay for xpu. test=kunlun (#41325) · 533c649f
  Committed by houj04 on Apr 07, 2022
```
* momentum support l2decay for xpu. test=kunlun

* fix include file. test=kunlun

* fix cmake for device_worker. test=kunlun
```
- modify infer gpu memory strategy (#41427) · 56e72b20
  Committed by JingZhuangzhuang on Apr 07, 2022
```
* modify infer gpu memory strategy

* modify infer gpu memory strategy
```
- fix bugs of reshape double grad infermeta (#41459) · 53409bcd
  Committed by YuanRisheng on Apr 07, 2022
- Add GPU memory usage information in the print of profiler. (#41440) · 516160a4
  Committed by Yiqun Liu on Apr 07, 2022
```
* Add GPU memory usage information in the print of profiler.

* Add ifdef.
```
06 Apr, 2022 7 commits

- Fix eager try catch (#41438) · 55e26637
  Committed by 0x45f on Apr 06, 2022
- fix device_id bug for final_state op in multiprocess testcase (#41407) · b25f25d0
  Committed by pangyoki on Apr 06, 2022
```
* support final_state in multiprocess

* fix no place.device

* set device_id in eager_gen
```
- add div plugin and add filter (#41243) · 0c968b9d
  Committed by feng_shuai on Apr 06, 2022
- [IPU] remove paddle_ipu shared library (#41307) · 229e91bf
  Committed by Allen Guo on Apr 06, 2022
```
* remove paddle_ipu shared library

* fix unique_name
```

- [Eager] Support test_layers's test cases switch to eager mode (#41216) · 5ae8babb
  Committed by Weilong Wu on Apr 06, 2022

* [Eager] Support test_layers's test cases switch to eager mode

* Update batch_norm _C_ops action to fix CI

* Use None instead of new EmptyTensor

* Updated var name

* Make sure to switch eager mode, Fix Coverage_CI

* Remove _non_static_mode statement

* Remove batch_norm dispensable input statement

* Polish batch_norm code

* Fix CI issue


- Add conv yaml (#41354) · 7ed7c6c7
  Committed by hong on Apr 06, 2022

* update

* add conv yaml

* add backward

* remove useless code

* fix bug

* fix bug

* revert fluid dygraph conv2d

* remove useless infermeta function

* fix meta fn duplicate error

* conv using custom impl

* remove amp include

* fix bug

* use cudnn = true

* fix test mkldnn caching bug


- fix split and concat out (#41419) · a057df50
  Committed by wanghuancoder on Apr 06, 2022

05 Apr, 2022 4 commits

- Fix bug of data transform in inference executor (#41349) · 91212104
  Committed by zyfncg on Apr 05, 2022
```
* fix bug of data transform in inference executor

* fix bug
```

- Table refine: remove table/accessor unuseful (#41400) · a288fcab
  Committed by zhaocaibei123 on Apr 05, 2022

* update name

* update name

* fix test

* fix fleet bind

* update name

* update name

* fix test

* fix gpups wrapper

* remove Push/Pull/Load/Save with context in client and wrapper base class

* fix

* fix

* remove some interface

* fix

* remove

* code style

* recover

* fix

* remove code unused

* remove some unused table & accessor & CommonDenseTable => MemoryDenseTable

* fix

* fix

* fix

* recover

* remove unused code
Co-authored-by: esythan <esythan@126.com>


- add fake index and unittest for multiclass_nms3 trt (#41344) · 1bd8125f
  Committed by wangxinxin08 on Apr 05, 2022
```
* add fake index and unittest for multiclass_nms3 trt

* modify unittest
```

- [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul (#41387) · d8a10977
  Committed by Zhanlue Yang on Apr 05, 2022

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues

* [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul

* Fixed issues with phi kernel

* Added triple grad test case

* Fixed minor issue

