Commits · aec493c0d30dd739bc7e9d39269ebca0357a10ca · PaddlePaddle / Paddle

Jan 06, 2022 · 14 commits
- fix expand_v2 and expand_as_v2 bug (#38677) · aec493c0
  Committed by Thomas Young on Jan 06, 2022
- [pten]move reduce files and dev_api (#38715) · c48bd3ff
  Committed by chentianyu03 on Jan 06, 2022
```
* move eigen/reduce.h imple into cpu/reduce.h

* ctx to dev_ctx
```
- fix slot, test=develop (#38738) · 4514f16d
  Committed by wanghuancoder on Jan 06, 2022
- Handled special sum_grad_op code gen in Eager Dygraph (#38573) · d422a1ed
  Committed by Zhanlue Yang on Jan 06, 2022
```
* Handled special sum_grad_op code gen in Eager Dygraph

* Fixed merge issues
```
- add mkldnn matmulv2 ut (#38749) · 89c0877e
  Committed by baoachun on Jan 06, 2022
- Revert "Remove useless headers for some grad ops (#38732)" (#38743) · fc990d08
  Committed by limingshu on Jan 06, 2022
```
This reverts commit c0e2b98e.
```
- Add NPU dockerfile (#38659) · a28eb0f0
  Committed by tianshuo78520a on Jan 06, 2022
- [Paddle-ASP]Asp sharding (#37725) · aec6e8a9
  Committed by minghaoBD on Jan 06, 2022
- nearest_interp_v2 bug fix (#38725) · 9c1167cf
  Committed by wenbin on Jan 06, 2022
```
* bug fix

* remove blank
```
- fix bugs: output of splited fc is wrong (#38724) · 35213c64
  Committed by Roc on Jan 06, 2022
- Remove useless headers for some grad ops (#38732) · c0e2b98e
  Committed by limingshu on Jan 06, 2022
```
* fix the wrong filename

* first commit
```
- 【PTen】Adjust the format of full kernel (#38596) · 0c02d2ed
  Committed by zyfncg on Jan 06, 2022
```
* adjust the full kernel

* remove creation.h

* use Empty to create tensor in full
```
- [Pten]Move GPU_implementation of elementwise kernel in new directory (#38696) · c1adced7
  Committed by YuanRisheng on Jan 06, 2022
```
* move gpu_impl of elementwise kernel

* change copyright to 2022
```
- Added exp FP32 FWD/BWD oneDNN kernel and optimized other oneDNN grad kernels (#38624) · 718183f1
  Committed by jakpiase on Jan 06, 2022
```
* added exp activation and use_dst_for_bwd kernels

* CI RERUN

* minor change
```
Jan 05, 2022 · 18 commits

optimize elementwise_mul_grad using new interfaces (#37728) · 36a102f8

Committed by Lijunhui on Jan 05, 2022

* init commit: new elem_mul_grad

* add template speciallization for complex in multiply

* reply review comments

* correct dx and dy computation when T is complex

* reply review comments

* update to new ReduceRunctor

* mul-output broadcast

* call functions

* call functions with comments

* remove comments

Fix bug for UT GetAllocatorInterfaceTest (#38720) · 905c8022

Committed by From00 on Jan 05, 2022

* Fix bug of GetAllocatorInterfaceTest

* Replace some shared_ptr with unique_ptr

* Change Alloc call

Make post training quant API support dataloader (#38686) · 0af1a87b
Committed by Jiaqi Liu on Jan 05, 2022
```
* make post training quant API support dataloader
```
Add input data type checking in BF16 placement pass (#38702) · 60c51de5
Committed by joanna.wozna.intel on Jan 05, 2022

[XPU] update XPU run check scripts, test=develop (#38698) · bbe83ed1
Committed by Qi Li on Jan 05, 2022

update masked_select_op for kunlun (#38678) · 40078103
Committed by TTerror on Jan 05, 2022

[Eager] Support test imperative basic in eager test_empty_grad (#38376) · 9108e777

Committed by wanghuancoder on Jan 05, 2022

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Adjusted function generation/call between Python-C API & Dygraph API

* Synchronized auto-generated Python-C API with Dygraph Forward Functions

* support more eager tensor api

* fix merge compile error

* fix compile error and fit develop code

* support pure CPU

* fix some logic error in eager_mode

* support _varbase_creator in eager mode

* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs

* for eager mode

* refine

* support multiple constructor for eager tensor

* add place related code

* polish code

* specific randint with dtype of int64

* Support pure cpu test

* eager logic

* refine test in pure cpu

* eager logic

* eager logic

* eager logic, test=develop

* skip core.eager when in inference, test=develop

* refine, test=develop

* refine, test=develop

* call RetainGrad after run forward kernel, test=develop

* refine, test=develop

* support dygraph util, meta, guard test

* eager test case

* support inference test

* refine test and fix initializer failed

* modify eagertensor patch method

* add eagertensor.clear_grandint, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* call monkey_patch_varbase in _test_eager_guard, test=develop

* split clear_gradient to clear_gradient and zero_grads, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop
Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: JiabinYang <360788950@qq.com>

add depthwise_conv2d op for mkldnn (#38484) · e1cc2236
Committed by wangxinxin08 on Jan 05, 2022

[pten]Move reduce code new (#38648) · 7a4a512d

Committed by chentianyu03 on Jan 05, 2022

* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* fix compile bugs

* move reduce files by new rule

* add set header

* format code style

* merge develop and fix conflict

* merge develop and fix conflict
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

add the examples for the mm (#38669) · c90a652d
Committed by wawltor on Jan 05, 2022
```
* add the examples for the mm

* fix the document of paddle.mm
```
[PTen] Polish infermeta filename (#38695) · d6df5bd9
Committed by Chen Weihang on Jan 05, 2022
```
* polish infermeta filename

* polish infermeta filename
```
Fix for matmul_v2 oneDNN op broadcasting when inputs dims have different lengths (#38665) · 67923124
Committed by jakpiase on Jan 05, 2022
```
* fix for matmul_v2 broadcasting

* fix for output shape not broadcasted
```
inference c_api support std::string (#38667) · f289cf85
Committed by Wilber on Jan 05, 2022
```
* c_api support std::string

* update

* update

* add NOTE

* fix delete error.
```

Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d

Committed by joanna.wozna.intel on Jan 05, 2022

* Quantize nearest_interp and nearest_interp_v2

* Check if avx_core supported

* Add depthwise_conv2d to supported quantization list

add huber_loss for kunlun (#38589) · a268c7ce

Committed by TTerror on Jan 05, 2022

* add huber_loss for kunlun

* update xpu.cmake

* update unitests

* update unitests

* update elementwise_add

* update elementwise_add

* update elementwise_add

Support EagerTensor initialization with kwargs (#38488) · 4ba6d4e4

Committed by Weilong Wu on Jan 05, 2022

* Support EagerTensor init with kwargs

* Updated comments

* Updated unit tests case

* Refactor InitTensor related code to reduce duplicate code

* Updated the error reporting msg

* Updated VLOG msg

* Merge develop and Update EagerTensor init func

* Polish switch case, reduce some code

* Add SyntaxError unit test case

* Refactor the related initialization func of EagerTensor

* Remove ParseStopGradient and ParseZeroCopy and ParsePersistable, construct ParseBooleanArgs instead.

* Updated error msg to pass CI

* Updated PADDLE_ENFORCE error type

implementation of broadcast div backward by reduce (#38044) · 55cd9cb8

Committed by crystal on Jan 05, 2022

* add elementwise div

* move mul and div grad functor

* Combine multiple CUDA kernels

* Update the reduce interface call

* add multi-output

* add multi-output div

* add branch judge

* Package branch

* Combine the x and y functions into one

[infrt] optimize the infrt rewriter pattern format. test=develop (#38694) · d1dc677a
Committed by 王明冬 on Jan 05, 2022

Jan 04, 2022 · 8 commits
- Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with... · 6eac06e3
  Committed by niuliling123 on Jan 04, 2022
```
Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast (#38500)
```
- [new-exec] avoid adding_feed_fetch in each run (#38672) · 1345a456
  Committed by Leo Chen on Jan 04, 2022
- [XPU] update XPU device info, test=develop (#37884) · e1187e50
  Committed by Qi Li on Jan 04, 2022
- Fix memcpyD2H sync behavior with other stream (#38647) · c0c54ba3
  Committed by Aurelius84 on Jan 04, 2022
```
* Fix memcpyD2H sync behavior with other stream

* add wait

* add wait

* add wait
```
- [Pten]Move CPU_implementation of elementwise kernel in new directory (#38651) · 7c020c71
  Committed by YuanRisheng on Jan 04, 2022
```
* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* move cpu_impl of elementwise kernel to new directory
```
- [NPU] add pad and pad_grad (#38658) · 6e9714a2
  Committed by furnace on Jan 04, 2022
```
[NPU] add pad and pad_grad
```
- [fleet_executor] Support multi carriers (#38650) · 2273471d
  Committed by LiYuRio on Jan 04, 2022
- added sqrt bf16 fwd/bwd (#38599) · 2d2609ea
  Committed by jakpiase on Jan 04, 2022

PaddlePaddle / Paddle · synced successfully more than 1 year ago