Commits · 0b2058172ffab252f011fe59cddc75ab0d92faf8 · PaddlePaddle / Paddle

Feb 23, 2022 · 18 commits

Add ProcessGroupNCCL for distributed training (#39737) · 0b205817
Committed by ShenLiang on Feb 23, 2022
```
* add processgroup_nccl
```

Support dispensable inputs for eager final state codegen (#39743) · ca11a0e5
Committed by Zhanlue Yang on Feb 23, 2022

move trunc_op's infere shape to phi (#39772) · 95280a36
Committed by Sing_chan on Feb 23, 2022
```
* move trunc_op's infere shape

* modify according to risheng's comment
```

[phi] move randperm to phi (#39816) · 30992ea0
Committed by Leo Chen on Feb 23, 2022
```
* move randperm to phi

* fix npu

* fix memory::Copy
```

[Phi] move flip op to phi kernel (#39822) · ad294a81
Committed by Yang on Feb 23, 2022

[Phi] Polish default signature attr and output select impl (#39810) · 64ed92bd
Committed by Chen Weihang on Feb 23, 2022
```
* polish default sig impl

* revert dispenable out
```

[MLU] add cncl parallel context and mlu resource pool (#39803) · 6241913b
Committed by mhhhh1 on Feb 23, 2022
```
* [MLU] add cncl parallel context and mlu resource pool

* [MLU] fix the cncl_context_test
```

change CUDA implementaion of bernoulli OP (#39732) · b9675acc
Committed by zhouweiwei2014 on Feb 23, 2022
```
* change CUDA implementaion of bernoulli OP

* fix CI
```

[phi] migrate atan2_op into phi (#39806) · b089e7cd
Committed by ronnywang on Feb 23, 2022

[phi] move unbind to phi (#39789) · dba694f4
Committed by Leo Chen on Feb 23, 2022

* move unbind to phi

* revert infer shape

* add header file

* move concat_and_split to phi

[KP] Add elementwise add xpu after phi, test=develop (#39787) · 1a1a2ce8
Committed by Liu-xiandong on Feb 23, 2022

* [KP] Add elementwise add xpu, test=develop

* modify the File Permissions

* modify the copyright time

* modify code style

* modify code style

[Phi] Migrate lable_smooth_op into Phi (#39796) · b7bcd0f6
Committed by Aurelius84 on Feb 23, 2022
```
* [Phi] Migrate lable_smooth_op into Phi

* fix PT->PD
```

[IPU] update inference demos (#39792) · 24f55aed
Committed by Allen Guo on Feb 23, 2022
```
* update inference part

* restore white space
```

[Eager] Support Eager mode for some model testcase (#39248) · abe232d8
Committed by wanghuancoder on Feb 23, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

Co-authored-by: JiabinYang <360788950@qq.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>

[bf16] add bf16 kernel: elementwise_div (#39602) · ca4df333
Committed by zhangbo9674 on Feb 23, 2022

* add elementwise_div

* refine rocm

* refine code

* refine op register

* solve conflict

* refine unittest

* refine unittest precision

* add rocm

Update record interface using part3 (#39695) · 1fcaab45
Committed by chenjian on Feb 23, 2022

* fix RecordEvent interface

* modify default level to 4

* update interface use

* add const default trace level

* update record event interface using

* update record event interface using

* update record event interface using

* update operator.cc

* update part2

* update part1

* update part3

* fix include profiler.h header in ps server

* fix include profiler.h header in ps server

* fix profiler.h header

* fix profiler.h header

* fix merge buf

* update

* fix bug

* fix bug

Supported intermediate outputs for eager final state codegen (#39767) · 94243789
Committed by Zhanlue Yang on Feb 23, 2022
```
* Supported intermediate outputs for eager final state codegen

* Added validation check for intermediate tensors
```

[PHI] Remove fill_any_like kernel register in fluid (#39807) · 69e9e9d5
Committed by zyfncg on Feb 23, 2022

* remove fill_any_like kernel in fluid and fix data transform bug

* support scalar in infershpe

* recover infershape in fill_and_like


Feb 22, 2022 · 22 commits
- [custom kernel]Delete useless and upgrade (#39791) · edc3ba13
  Committed by Aganlengzi on Feb 22, 2022
```
* [custom kernel]Delete useless

* change RegType enum names

* mod notes

* merge

* update
```
- Move real and imag op to phi (#39777) · 345cc8fa
  Committed by From00 on Feb 22, 2022
```
* Move Real OP to phi

* Move Imag OP to phi

* Move Real and Imag InferShape to phi

* Move Real and Imag to complex_kernel

* Change PT_REGISTER_XXX to PD_REGISTER_XXX
```
- added round fwd onednn kernel (#39653) · 74c0bc1c
  Committed by jakpiase on Feb 22, 2022
  
- Add the implementation of TCP Store (#39384) · b95cd3b7
  Committed by lilong12 on Feb 22, 2022
```
* add tcp_socket and tcp_store
```
- delete gather_ut skip_case (#39657) · da43e065
  Committed by feng_shuai on Feb 22, 2022
```
* delete gather_ut skip_case

* add trt version limit
```
- Adapt to batch_norm_grad op and add align function in roi_align op for kunlun (#39685) · f33ae206
  Committed by Leo Guo on Feb 22, 2022
```
* Adapt to batch_norm_grad op and add align function in
roi_align op for kunlun, *test=kunlun

* Adapt to batch_norm, batch_norm_grad op api for kunlun, and add unit-tests of batch_norm, roi_align. *test=kunlun
```
- change Vector to std::vector and provide MixVector class as a helper … (#39559) · 728c0624
  Committed by xiongkun on Feb 22, 2022
```
* change Vector to std::vector and provide MixVector class as a helper wrapper class

* solve the multi-gpu hang problem

* remove the duplicate template instantialize

* Copy vector to cpu

* add CopyToCPU

* xxx

* final version: fix the problem of all reduce

* remove mixvector dependence

* fix

* merge

* fix code

* fix by CI
```
- fix bug in new the_one_ps (#39505) · d56a0a1b
  Committed by wangguanqun on Feb 22, 2022
```
* fix benchmark and communicator config

* fix bugs of the_one_ps

* multi program and fix bug in optimizer

* multi program in the_one_ps

* public commcontext
```
- [Phi] Migrate unfold_op into phi (#39778) · 1aa67778
  Committed by Aurelius84 on Feb 22, 2022
```
* [Phi] Migrate unfold_op into phi

* fix im2col CPUContext template instantial

* fix unfold_op.h header include problem

* fix unittest

* fix PT->PD
```
- [CustomRuntime] fix CustomDeviceContext (#39766) · 60fc555e
  Committed by ronnywang on Feb 22, 2022
  
- Update profiler (#39779) · c5d15655
  Committed by liutiexing on Feb 22, 2022
```
* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* add log for Executor

* update the profiler
Co-authored-by: liutiexing <liutiexing@google.com>
```
- build_cinn_pass: fix bug because of output control var (#39782) · 62ae5f62
  Committed by TeFeng Chen on Feb 22, 2022
  
- update unittests for nearest_interp_v2_op_xpu: 'sync' from gpu. test=kunlun (#39768) · e89bf25b
  Committed by houj04 on Feb 22, 2022
  
- [Paddle-Inference] fix pass and convert_op for preln_ernie (#39733) · 574f3402
  Committed by Wangzheee on Feb 22, 2022
```
* fix pass and convert_op for preln_ernie and add preln_ernie'flag in pass
```
- [GPUPS]Config fleet optimize 2 (#39783) · 0efa64c8
  Committed by zmxdream on Feb 22, 2022
```
* update. test=develop

* update. test=develop

* fix. test=develop

* update. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* update. test=develop

* update. test=develop
```
- Support NoNeedBuffer for final state codegen (#39628) · 911cb2ea
  Committed by Zhanlue Yang on Feb 22, 2022
```
* Support NoNeedBuffer for final state codegen

* Replaced pten with phi
```
- add hard_swish in xpu2_op_list.h and update xpu.cmake,test=kunlun (#39586) · 8d1d0bdf
  Committed by zhangyikun02 on Feb 22, 2022
  
- sync recent changes (#39763) · d945e24c
  Committed by Allen Guo on Feb 22, 2022
  
- make enable_program_desc_tracing_ thread_local (#39776) · ec21bf98
  Committed by Leo Chen on Feb 22, 2022
  
- Modified RandomKernel with Kernel Primitive API (#39666) · 9f94821b
  Committed by niuliling123 on Feb 22, 2022
```
* Modified RandomKernel with Kernel Primitive API

* update pten.h to phi.h

* update

* update fullKernel
```
- [PTen->Phi PR2] Rename PT_REGISTER macro to PD_REGISTER (#39790) · 4a338796
  Committed by Chen Weihang on Feb 22, 2022
```
* unify register macro

* rename declare macro

* fix infrt error
```
- [fleet exe] supprot fp16 feed and fetch on cpp side (#39758) · 73bf9673
  Committed by Yuang Liu on Feb 22, 2022
  

PaddlePaddle / Paddle · synced successfully about 1 year ago
