Commits · f33ae2060320fe68a1aa0465de503bc882febc8c · Crayon鑫 / Paddle

22 Feb, 2022 31 commits
- Adapt to batch_norm_grad op and add align function in roi_align op for kunlun (#39685) · f33ae206
  Committed by Leo Guo on Feb 22, 2022
```
* Adapt to batch_norm_grad op and add align function in roi_align op for kunlun, *test=kunlun

* Adapt to batch_norm, batch_norm_grad op api for kunlun, and add unit-tests of batch_norm, roi_align. *test=kunlun
```
- change Vector to std::vector and provide MixVector class as a helper … (#39559) · 728c0624
  Committed by xiongkun on Feb 22, 2022
```
* change Vector to std::vector and provide MixVector class as a helper wrapper class

* solve the multi-gpu hang problem

* remove the duplicate template instantialize

* Copy vector to cpu

* add CopyToCPU

* xxx

* final version: fix the problem of all reduce

* remove mixvector dependence

* fix

* merge

* fix code

* fix by CI
```
- fix bug in new the_one_ps (#39505) · d56a0a1b
  Committed by wangguanqun on Feb 22, 2022
```
* fix benchmark and communicator config

* fix bugs of the_one_ps

* multi program and fix bug in optimizer

* multi program in the_one_ps

* public commcontext
```
- add pten convert pass.test=develop (#39664) · a6abb6e7
  Committed by 王明冬 on Feb 22, 2022
- unset fluid in nn.others (#34935) · a710738e
  Committed by zhiboniu on Feb 22, 2022
- [Phi] Migrate unfold_op into phi (#39778) · 1aa67778
  Committed by Aurelius84 on Feb 22, 2022
```
* [Phi] Migrate unfold_op into phi

* fix im2col CPUContext template instantial

* fix unfold_op.h header include problem

* fix unittest

* fix PT->PD
```
- [CustomRuntime] fix CustomDeviceContext (#39766) · 60fc555e
  Committed by ronnywang on Feb 22, 2022
- Update profiler (#39779) · c5d15655
  Committed by liutiexing on Feb 22, 2022
```
* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* add log for Executor

* update the profiler
Co-authored-by: Nliutiexing <liutiexing@google.com>
```
- build_cinn_pass: fix bug because of output control var (#39782) · 62ae5f62
  Committed by TeFeng Chen on Feb 22, 2022
- update unittests for nearest_interp_v2_op_xpu: 'sync' from gpu. test=kunlun (#39768) · e89bf25b
  Committed by houj04 on Feb 22, 2022
- [Paddle-Inference] fix pass and convert_op for preln_ernie (#39733) · 574f3402
  Committed by Wangzheee on Feb 22, 2022
```
* fix pass and convert_op for preln_ernie and add preln_ernie'flag in pass
```
- [Auto Parallel] Add the high-level Engine API (#39709) · 5595fdbb
  Committed by Yulong Ao on Feb 22, 2022
```
* [Auto Parallel] Add the high-level Engine API

* Update the test cmakefile
```
- refactor reshape2/shape unittest for kunlun (#39665) · c8d6c146
  Committed by zhangxiaoci on Feb 22, 2022
```
*test=kunlun
```
- [GPUPS]Config fleet optimize 2 (#39783) · 0efa64c8
  Committed by zmxdream on Feb 22, 2022
```
* update. test=develop

* update. test=develop

* fix. test=develop

* update. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* update. test=develop

* update. test=develop
```
- Modify the implementation of BlockXReduce to fit more scenes (#39554) · 85a11c47
  Committed by Zhang Zheng on Feb 22, 2022
```
* Modify the implementation of BlockYReduce to fit more scenes

* fix

* fix
```
- [phi] add dtype fetcher for scalar (#39775) · 7fa29a6b
  Committed by Yuang Liu on Feb 22, 2022
- dont show warn msg default (#39730) · 9b9d52e0
  Committed by 0x45f on Feb 22, 2022
- delete skip_case for dropout_ut (#39629) · cdf05dfc
  Committed by feng_shuai on Feb 22, 2022
- fix:Modify matrix latitude (#39686) · b8dbffb7
  Committed by feng_shuai on Feb 22, 2022
- [pten]add check for using HostAlloc (#39771) · 12c6d06a
  Committed by chentianyu03 on Feb 22, 2022
```
* add check for using HostAlloc

* add check for using HostAlloc
```
- Support NoNeedBuffer for final state codegen (#39628) · 911cb2ea
  Committed by Zhanlue Yang on Feb 22, 2022
```
* Support NoNeedBuffer for final state codegen

* Replaced pten with phi
```
- add hard_swish in xpu2_op_list.h and update xpu.cmake,test=kunlun (#39586) · 8d1d0bdf
  Committed by zhangyikun02 on Feb 22, 2022
- sync recent changes (#39763) · d945e24c
  Committed by Allen Guo on Feb 22, 2022
- update precision catalog (#39717) · df1dbff1
  Committed by zhangchunle on Feb 22, 2022
- make enable_program_desc_tracing_ thread_local (#39776) · ec21bf98
  Committed by Leo Chen on Feb 22, 2022
- fix usage of paddle.version.cuda() (#39780) · 38f87238
  Committed by Leo Chen on Feb 22, 2022
- Modified RandomKernel with Kernel Primitive API (#39666) · 9f94821b
  Committed by niuliling123 on Feb 22, 2022
```
* Modified RandomKernel with Kernel Primitive API

* update pten.h to phi.h

* update

* update fullKernel
```
- Add Sort API for Kernel Primitive API (#39734) · f4e74887
  Committed by niuliling123 on Feb 22, 2022
```
* Add Sort API for Kernel Primitive API

* update & -> ptr
```
- [Dy2St]Fix gym library version update problem with unittest (#39785) · de760d2c
  Committed by Aurelius84 on Feb 22, 2022
- [PTen->Phi PR2] Rename PT_REGISTER macro to PD_REGISTER (#39790) · 4a338796
  Committed by Chen Weihang on Feb 22, 2022
```
* unify register macro

* rename declare macro

* fix infrt error
```
- [fleet exe] supprot fp16 feed and fetch on cpp side (#39758) · 73bf9673
  Committed by Yuang Liu on Feb 22, 2022
21 Feb, 2022 9 commits
- [PluggableDevice]custom kernel to phi core structs (#39690) · 68631ed4
  Committed by Aganlengzi on Feb 21, 2022
```
* [PluggableDevice]custom kernel to pten core structs

* mod extension.h for custom op

* compatible python for CI

* support custom context

* refactor to pten

* fix windows and ut
```
- Move Abs InferShape to phi (#39762) · 9c51eee1
  Committed by From00 on Feb 21, 2022
```
* Move Abs InferShaper to phi

* Fix CI error
```
- [Pten] Migrate huber_loss into phi (#39761) · 6aafb2fa
  Committed by Aurelius84 on Feb 21, 2022
```
* migrate huber_loss into phi

* migrate infershape

* modify pten into phi
```
- Add loss conversion from uint16 to float in ProgressBar class (#39231) · 740cfa94
  Committed by piotrekobiIntel on Feb 21, 2022
```
* Add loss conversion from uint16 to float in progressbar class

* Fix test coverage

* Actually fix coverage

* Fix format error
```
- [Dy2St]Fix cond grad error when handle tensor array (#39689) · a863b32e
  Committed by 0x45f on Feb 21, 2022
```
* fix cond grad error when handle tensor array

* add UT
```
- gpu ps graph engine (#39699) · 05982c10
  Committed by seemingwang on Feb 21, 2022
```
* gpu ps graph engine

* remove logs
```
- [pten]rm reduce_sum and reduce_mean raw kernel (#39484) · 2bb5aae8
  Committed by chentianyu03 on Feb 21, 2022
```
* rm reduce_sum raw kernel

* remove reduce_mean kernel

* remove reduce_mean kernel

* reduce support int and int64_t

* mean support int and int64_t type
```
- disable some distribute test case when in CPU test env (#39682) · 941bdb41
  Committed by wanghuancoder on Feb 21, 2022
```
* disable some distribute test case when in CPU test env, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop
```
- fix fill_constant bug, *test=kunlun (#39681) · b1805727
  Committed by z8hanghuan on Feb 21, 2022
```
* fix fill_constant bug, *test=kunlun

* fix fill_constant bug,*test=kunlun
```

Crayon鑫 / Paddle is in sync with the fork's source project