Commits · b9675acc9d4326b73f5b3167265a1d3f6e98dac9 · PaddlePaddle / Paddle

Feb 23, 2022 · 16 commits

change CUDA implementaion of bernoulli OP (#39732) · b9675acc
Authored by zhouweiwei2014 on Feb 23, 2022
```
* change CUDA implementaion of bernoulli OP

* fix CI
```
refactor range unittest for kunlun (#39800) · 69a04209
Authored by zhangxiaoci on Feb 23, 2022
```
*test=kunlun
```
[phi] migrate atan2_op into phi (#39806) · b089e7cd
Authored by ronnywang on Feb 23, 2022

[phi] move unbind to phi (#39789) · dba694f4
Authored by Leo Chen on Feb 23, 2022

* move unbind to phi

* revert infer shape

* add header file

* move concat_and_split to phi

[KP] Add elementwise add xpu after phi, test=develop (#39787) · 1a1a2ce8
Authored by Liu-xiandong on Feb 23, 2022

* [KP] Add elementwise add xpu, test=develop

* modify the File Permissions

* modify the copyright time

* modify code style

* modify code style

[Phi] Migrate lable_smooth_op into Phi (#39796) · b7bcd0f6
Authored by Aurelius84 on Feb 23, 2022
```
* [Phi] Migrate lable_smooth_op into Phi

* fix PT->PD
```
[IPU] update inference demos (#39792) · 24f55aed
Authored by Allen Guo on Feb 23, 2022
```
* update inference part

* restore white space
```
update gather_nd trt converter ut (#39584) · 4130b640
Authored by baoachun on Feb 23, 2022
```
* update gather_nd trt converter ut

* update ut
```
refactoring gather/masked_select/arg_max unittests for kunlun, *test=kunlun (#39711) · da492a13
Authored by TTerror on Feb 23, 2022

fix 'is with a literal' warning (#39798) · 22abb6b3
Authored by Leo Chen on Feb 23, 2022
```
* fix 'is with a literal'

* fix typo
```
fix activation ut typo xpu. test=kunlun (#39813) · 9880595a
Authored by houj04 on Feb 23, 2022

[Eager] Support Eager mode for some model testcase (#39248) · abe232d8
Authored by wanghuancoder on Feb 23, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop
Co-authored-by: JiabinYang <360788950@qq.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>

[bf16] add bf16 kernel: elementwise_div (#39602) · ca4df333
Authored by zhangbo9674 on Feb 23, 2022

* add elementwise_div

* refine rocm

* refine code

* refine op register

* solve conflict

* refine unittest

* refine unittest precision

* add rocm

Update record interface using part3 (#39695) · 1fcaab45
Authored by chenjian on Feb 23, 2022

* fix RecordEvent interface

* modify default level to 4

* update interface use

* add const default trace level

* update record event interface using

* update record event interface using

* update record event interface using

* update operator.cc

* update part2

* update part1

* update part3

* fix include profiler.h header in ps server

* fix include profiler.h header in ps server

* fix profiler.h header

* fix profiler.h header

* fix merge buf

* update

* fix bug

* fix bug

Supported intermediate outputs for eager final state codegen (#39767) · 94243789
Authored by Zhanlue Yang on Feb 23, 2022
```
* Supported intermediate outputs for eager final state codegen

* Added validation check for intermediate tensors
```
[PHI] Remove fill_any_like kernel register in fluid (#39807) · 69e9e9d5
Authored by zyfncg on Feb 23, 2022

* remove fill_any_like kernel in fluid and fix data transform bug

* support scalar in infershpe

* recover infershape in fill_and_like


Feb 22, 2022 · 24 commits
- [custom kernel]Delete useless and upgrade (#39791) · edc3ba13
  Authored by Aganlengzi on Feb 22, 2022
```
* [custom kernel]Delete useless

* change RegType enum names

* mod notes

* merge

* update
```
- import llvm::ArrayRef and add test (#39802) · a167a143
  Authored by chentianyu03 on Feb 22, 2022
  
- unset fluid in tensor (#35082) · 42eb56e2
  Authored by zhiboniu on Feb 22, 2022
  
- Auto Parallel support conditional block (#39612) · a08ee62a
  Authored by JZ-LIANG on Feb 22, 2022
```
* add subblock logic for context and partitioner

* partitioner support sub blocks

* revise typos

* fixed param init bug for while

* chmod 644

* add unitest

* mv forward parser

* update unitest

* update dist op ctx

* update dist op ctx

* fixed bug in dist op ctx

* fixed bug for recompute subblock
```
- disable some distribute test case when in CPU test env (#39801) · ae8c811a
  Authored by YUNSHEN XIE on Feb 22, 2022
  
- Move real and imag op to phi (#39777) · 345cc8fa
  Authored by From00 on Feb 22, 2022
```
* Move Real OP to phi

* Move Imag OP to phi

* Move Real and Imag InferShape to phi

* Move Real and Imag to complex_kernel

* Change PT_REGISTER_XXX to PD_REGISTER_XXX
```
- added round fwd onednn kernel (#39653) · 74c0bc1c
  Authored by jakpiase on Feb 22, 2022
  
- Add the implementation of TCP Store (#39384) · b95cd3b7
  Authored by lilong12 on Feb 22, 2022
```
* add tcp_socket and tcp_store
```
- delete gather_ut skip_case (#39657) · da43e065
  Authored by feng_shuai on Feb 22, 2022
```
* delete gather_ut skip_case

* add trt version limit
```
- Adapt to batch_norm_grad op and add align function in roi_align op for kunlun (#39685) · f33ae206
  Authored by Leo Guo on Feb 22, 2022
```
* Adapt to batch_norm_grad op and add align function in
roi_align op for kunlun, *test=kunlun

* Adapt to batch_norm, batch_norm_grad op api for kunlun, and add unit-tests of batch_norm, roi_align. *test=kunlun
```
- change Vector to std::vector and provide MixVector class as a helper … (#39559) · 728c0624
  Authored by xiongkun on Feb 22, 2022
```
* change Vector to std::vector and provide MixVector class as a helper wrapper class

* solve the multi-gpu hang problem

* remove the duplicate template instantialize

* Copy vector to cpu

* add CopyToCPU

* xxx

* final version: fix the problem of all reduce

* remove mixvector dependence

* fix

* merge

* fix code

* fix by CI
```
- fix bug in new the_one_ps (#39505) · d56a0a1b
  Authored by wangguanqun on Feb 22, 2022
```
* fix benchmark and communicator config

* fix bugs of the_one_ps

* multi program and fix bug in optimizer

* multi program in the_one_ps

* public commcontext
```
- add pten convert pass.test=develop (#39664) · a6abb6e7
  Authored by 王明冬 on Feb 22, 2022
  
- unset fluid in nn.others (#34935) · a710738e
  Authored by zhiboniu on Feb 22, 2022
  
- [Phi] Migrate unfold_op into phi (#39778) · 1aa67778
  Authored by Aurelius84 on Feb 22, 2022
```
* [Phi] Migrate unfold_op into phi

* fix im2col CPUContext template instantial

* fix unfold_op.h header include problem

* fix unittest

* fix PT->PD
```
- [CustomRuntime] fix CustomDeviceContext (#39766) · 60fc555e
  Authored by ronnywang on Feb 22, 2022
  
- Update profiler (#39779) · c5d15655
  Authored by liutiexing on Feb 22, 2022
```
* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* add log for Executor

* update the profiler
Co-authored-by: liutiexing <liutiexing@google.com>
```
- build_cinn_pass: fix bug because of output control var (#39782) · 62ae5f62
  Authored by TeFeng Chen on Feb 22, 2022
  
- update unittests for nearest_interp_v2_op_xpu: 'sync' from gpu. test=kunlun (#39768) · e89bf25b
  Authored by houj04 on Feb 22, 2022
  
- [Paddle-Inference] fix pass and convert_op for preln_ernie (#39733) · 574f3402
  Authored by Wangzheee on Feb 22, 2022
```
* fix pass and convert_op for preln_ernie and add preln_ernie'flag in pass
```
- [Auto Parallel] Add the high-level Engine API (#39709) · 5595fdbb
  Authored by Yulong Ao on Feb 22, 2022
```
* [Auto Parallel] Add the high-level Engine API

* Update the test cmakefile
```
- refactor reshape2/shape unittest for kunlun (#39665) · c8d6c146
  Authored by zhangxiaoci on Feb 22, 2022
```
*test=kunlun
```
- [GPUPS]Config fleet optimize 2 (#39783) · 0efa64c8
  Authored by zmxdream on Feb 22, 2022
```
* update. test=develop

* update. test=develop

* fix. test=develop

* update. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* update. test=develop

* update. test=develop
```
- Modify the implementation of BlockXReduce to fit more scenes (#39554) · 85a11c47
  Authored by Zhang Zheng on Feb 22, 2022
```
* Modify the implementation of BlockYReduce to fit more scenes

* fix

* fix
```

PaddlePaddle / Paddle · synced successfully more than 1 year ago