Commits · c64d9a44127cfd0ef7b08d31a94466024997c0f3 · 机器未来 / Paddle

11 Apr, 2022 · 4 commits
- support more ops (#41421) · fc621dfe
  Committed by Allen Guo on Apr 11, 2022
  
  fc621dfe
- fix for gaussian random (#41572) · 8fc9c412
  Committed by jakpiase on Apr 11, 2022
  
  8fc9c412
- fix arg_max for int type, *test=kunlun (#41522) · 368f1dda
  Committed by ykkk2333 on Apr 11, 2022
  
  368f1dda
- [Yaml] add yaml for Uniform random and add unit test. (#41517) · cd2a4cdf
  Committed by xiongkun on Apr 11, 2022
```
* gather op

* add mod

* [Yaml] final state for uniform and uniform_random
```
  cd2a4cdf
10 Apr, 2022 · 3 commits
- [KP]fix bug when TruncatedNormal cannot fall back in cpu (#41565) · c1394c6a
  Committed by Liu-xiandong on Apr 10, 2022
```
* [KP]fix bug when TruncatedNormal cannot fall back in cpu

* delete useless comment

* delete useless comment
```
  c1394c6a
- add mkldnn compute_propagate_scales int8 pass (#41592) · c00d869b
  Committed by baoachun on Apr 10, 2022
  
  c00d869b
- add mkldnn int8 pass [step1] (#41579) · e68da187
  Committed by baoachun on Apr 10, 2022
```
* add mkldnn int8 pass

* add mkldnn int8 pass

* update pass
```
  e68da187
09 Apr, 2022 · 7 commits

Committed by zhaocaibei123 on Apr 09, 2022

* update name

* update name

* fix test

* fix fleet bind

* update name

* update name

* fix test

* fix gpups wrapper

* remove Push/Pull/Load/Save with context in client and wrapper base class

* fix

* fix

* remove some interface

* fix

* remove

* code style

* recover

* fix

* remove unused code

* remove some unused table & accessor & CommonDenseTable => MemoryDenseTable

* fix

* fix

* fix

* recover

* remove unused code

* recover unittest

* fix

* remove

* fix

* remove useless code

* remove

* fix

* recover

* remove
Co-authored-by: esythan <esythan@126.com>

7a07c4a5

modify the block size of the group_norm backward (#41570) · ff2fba39
Committed by crystal on Apr 09, 2022

ff2fba39

Autotune the workspace_size_limit in conv. (#40338) · b937cdc5

Committed by limingshu on Apr 09, 2022

* Using the maximum workspace_size of all algorithms to limit the workspace size in exhaustive search mode.

* Use the system cudaMalloc and cudaFree to allocate workspace during searching.

* Enable switching between the two kinds of workspace setting methods.
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>

b937cdc5

fix_ci_problem3 (#41484) · 9cb2287c
Committed by Jiabin Yang on Apr 09, 2022
```
* fix_ci_problem3

* support windows no default error
```
9cb2287c
fix pylayer mem leak, test=develop (#41559) · be11648a
Committed by wanghuancoder on Apr 09, 2022

be11648a
[new-exec] fix bug that no thread is waked up when adding task to threadpool (#41567) · f581f5bf
Committed by Leo Chen on Apr 09, 2022
```
* fix bug that no thread is waked up when adding task to threadpool

* fix typo
```
f581f5bf
[fleet executor] Add sink interceptor and test (#41497) · b3e79731
Committed by LiYuRio on Apr 09, 2022

b3e79731

08 Apr, 2022 · 8 commits
- Fix fake quant cuda kernel (#41305) · 330582e2
  Committed by whs on Apr 08, 2022
  
  330582e2
- fix group_norm (#41531) · 04a4bdf8
  Committed by crystal on Apr 08, 2022
```
fix group_norm vectorized address misalignment
```
  04a4bdf8
- modify unittest of lstm forward, *test=kunlun (#41534) · d4710dfe
  Committed by z8hanghuan on Apr 08, 2022
```
* modify unittest of lstm forward, *test=kunlun

* modify unittest of lstm forward, *test=kunlun
```
  d4710dfe
- [Eager]Fix segment_pool/allclose/isclose/scale API bug (#41506) · 0a6fe699
  Committed by Aurelius84 on Apr 08, 2022
```
* [Eager]Fix segment_pool/allclose/isclose/scale API bug

* fix kernel register problem
```
  0a6fe699
- [ROCm] fix dcu error in device event base, test=develop (#41521) · 14dba636
  Committed by Qi Li on Apr 08, 2022
```
* [ROCm] fix dcu error in device event base, test=develop

* fix, test=develop
```
  14dba636
- xpu mul unittest *test=kunlun (#41140) · 770ce7cf
  Committed by taixiurong on Apr 08, 2022
  
  770ce7cf
- pybind support CustomPlace (#41136) · 0cd577cf
  Committed by ronnywang on Apr 08, 2022
  
  0cd577cf
- Add conj pixel shuffle yaml (#41499) · bc88fbb5
  Committed by hong on Apr 08, 2022
```
* add conj flip yaml

* add flip conj pixel shuffle
```
  bc88fbb5
07 Apr, 2022 · 16 commits
- [GPUPS] bind afs wrapper (#41227) · b3bcebbe
  Committed by Thunderbrook on Apr 07, 2022
```
* afs wrapper

* format

* format

* macro
```
  b3bcebbe
- remove FLAGS_use_curand and change all random op CUDA implementation (#41308) · 9714878c
  Committed by zhouweiwei2014 on Apr 07, 2022
  
  9714878c
- [Phi]Add hard_swish/kron/linspace/logit yaml file (#41298) · 90cb337e
  Committed by YuanRisheng on Apr 07, 2022
```
* add yaml

* perfect coverage
```
  90cb337e
- use group id to differentiate keys for tcp store (#41496) · 75227c9e
  Committed by lilong12 on Apr 07, 2022
  
  75227c9e
- Profile Executors (#41100) · dfb47986
  Committed by liutiexing on Apr 07, 2022
```
* Profile Executors

* update

* fix ut

* fix names

* update

* update
```
  dfb47986
- add send/recv to/from switch module for ProcessGroupHeter (#41285) · 633ac4e6
  Committed by lilong12 on Apr 07, 2022
  
  633ac4e6
- Add Output(Step) to DistributedFusedLamb optimizer (#41249) · e4459a40
  Committed by sneaxiy on Apr 07, 2022
```
* add Output(Step) to distributed fused lamb op

* add _set_step
```
  e4459a40
- Add Sparse API to_dense, to_sparse_coo and values (#41394) · f78cc3da
  Committed by zhangkaihuo on Apr 07, 2022
  
  f78cc3da
- Fix dygraph record event position (#41445) · 8fba68d3
  Committed by chenjian on Apr 07, 2022
```
* no

* maintain old profiler

* fix old dygraph record event
```
  8fba68d3
- ignore some failed test for KL2 (#41342) · 81389c51
  Committed by QingshuChen on Apr 07, 2022
```
* ignore some failed test for KL2
*test=kunlun

* minor
*test=kunlun

* minor
*test=kunlun
```
  81389c51
- modify inference model test build method to support multi version (#41027) · c9e0e10e
  Committed by Sing_chan on Apr 07, 2022
```
* change inference demo_test build method to ninja to choose visual studio version automatically

* notest;test=windows_ci_inference

* set cuda of demo_ci by arg,fix bug of ninja compile,test=document_fix;test=windows_ci;test=windows_ci_inference

* fix bug;test=document_fix;test=windows_ci;test=windows_ci_inference

* fix bug;test=document_fix;test=windows_ci_inference

* set lib_path according to generator
```
  c9e0e10e
- remove cudnn_deterministic=True (#41341) · cefa91fd
  Committed by Zhang Jun on Apr 07, 2022
  
  cefa91fd
- momentum support l2decay for xpu. test=kunlun (#41325) · 533c649f
  Committed by houj04 on Apr 07, 2022
```
* momentum support l2decay for xpu. test=kunlun

* fix include file. test=kunlun

* fix cmake for device_worker. test=kunlun
```
  533c649f
- modify infer gpu memory strategy (#41427) · 56e72b20
  Committed by JingZhuangzhuang on Apr 07, 2022
```
* modify infer gpu memory strategy

* modify infer gpu memory strategy
```
  56e72b20
- fix bugs of reshape double grad infermeta (#41459) · 53409bcd
  Committed by YuanRisheng on Apr 07, 2022
  
  53409bcd
- Add GPU memory usage information in the print of profiler. (#41440) · 516160a4
  Committed by Yiqun Liu on Apr 07, 2022
```
* Add GPU memory usage information in the print of profiler.

* Add ifdef.
```
  516160a4
06 Apr, 2022 · 2 commits
- Fix eager try catch (#41438) · 55e26637
  Committed by 0x45f on Apr 06, 2022
  
  55e26637
- fix device_id bug for final_state op in multiprocess testcase (#41407) · b25f25d0
  Committed by pangyoki on Apr 06, 2022
```
* support final_state in multiprocess

* fix no place.device

* set device_id in eager_gen
```
  b25f25d0

机器未来 / Paddle · in sync with the fork's source project