Commits · 7a4a512daa172062068c7fab669bd321f1926274 · 机器未来 / Paddle

05 Jan, 2022 · 10 commits

[pten]Move reduce code new (#38648) · 7a4a512d

Committed by chentianyu03 on Jan 05, 2022

* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* fix compile bugs

* move reduce files by new rule

* add set header

* format code style

* merge develop and fix conflict

* merge develop and fix conflict
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>

add the examples for the mm (#38669) · c90a652d
Committed by wawltor on Jan 05, 2022
```
* add the examples for the mm

* fix the document of paddle.mm
```
[PTen] Polish infermeta filename (#38695) · d6df5bd9
Committed by Chen Weihang on Jan 05, 2022
```
* polish infermeta filename

* polish infermeta filename
```
Fix for matmul_v2 oneDNN op broadcasting when inputs dims have different lengths (#38665) · 67923124
Committed by jakpiase on Jan 05, 2022
```
* fix for matmul_v2 broadcasting

* fix for output shape not broadcasted
```
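For readers unfamiliar with the problem this fix addresses, the following is a minimal sketch of how the output shape of a batched matmul is usually derived when the two inputs have a different number of dimensions. It assumes NumPy-style broadcasting of the leading batch dimensions; the function name `matmul_output_shape` is hypothetical and this is not the oneDNN kernel code from the commit.

```python
def matmul_output_shape(x_shape, y_shape):
    """Broadcast output shape of a batched matmul (illustrative sketch only)."""
    if len(x_shape) < 2 or len(y_shape) < 2:
        raise ValueError("this sketch only handles inputs with at least 2 dims")

    m, k_x = x_shape[-2], x_shape[-1]
    k_y, n = y_shape[-2], y_shape[-1]
    if k_x != k_y:
        raise ValueError(f"inner dims differ: {k_x} vs {k_y}")

    x_batch, y_batch = list(x_shape[:-2]), list(y_shape[:-2])
    # Left-pad the shorter batch shape with 1s so both have the same rank.
    rank = max(len(x_batch), len(y_batch))
    x_batch = [1] * (rank - len(x_batch)) + x_batch
    y_batch = [1] * (rank - len(y_batch)) + y_batch

    out_batch = []
    for a, b in zip(x_batch, y_batch):
        if a != b and a != 1 and b != 1:
            raise ValueError(f"batch dims {a} and {b} cannot be broadcast")
        # A dim of size 1 stretches to match the other operand.
        out_batch.append(b if a == 1 else a)
    return tuple(out_batch) + (m, n)
```

With this rule, a 4-D input of shape `(2, 3, 4, 5)` multiplied by a 2-D input of shape `(5, 6)` yields `(2, 3, 4, 6)`.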
inference c_api support std::string (#38667) · f289cf85
Committed by Wilber on Jan 05, 2022
```
* c_api support std::string

* update

* update

* add NOTE

* fix delete error.
```

Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d

Committed by joanna.wozna.intel on Jan 05, 2022

* Quantize nearest_interp and nearest_interp_v2

* Check if avx_core supported

* Add depthwise_conv2d to supported quantization list


add huber_loss for kunlun (#38589) · a268c7ce

Committed by TTerror on Jan 05, 2022

* add huber_loss for kunlun

* update xpu.cmake

* update unit tests

* update unit tests

* update elementwise_add

* update elementwise_add

* update elementwise_add


Support EagerTensor initialization with kwargs (#38488) · 4ba6d4e4

Committed by Weilong Wu on Jan 05, 2022

* Support EagerTensor init with kwargs

* Updated comments

* Updated unit tests case

* Refactor InitTensor related code to reduce duplicate code

* Updated the error reporting msg

* Updated VLOG msg

* Merge develop and Update EagerTensor init func

* Polish switch case, reduce some code

* Add SyntaxError unit test case

* Refactor the related initialization func of EagerTensor

* Remove ParseStopGradient and ParseZeroCopy and ParsePersistable, construct ParseBooleanArgs instead.

* Updated error msg to pass CI

* Updated PADDLE_ENFORCE error type


implementation of broadcast div backward by reduce (#38044) · 55cd9cb8

Committed by crystal on Jan 05, 2022

* add elementwise div

* move mul and div grad functor

* Combine multiple CUDA kernels

* Update the reduce interface call

* add multi-output

* add multi-output div

* add branch judge

* Package branch

* Combine the x and y functions into one
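As background for the commit above, here is a minimal sketch of the math behind a broadcast division backward pass computed with a reduction. It assumes `x` has shape `(rows, cols)` and `y` has shape `(cols,)` and is broadcast across rows; the function name `div_backward_broadcast` is hypothetical, and the sketch says nothing about how the actual CUDA kernels are organized.

```python
def div_backward_broadcast(x, y, dout):
    """Gradients of out = x / y where y (1-D) is broadcast over the rows of x (2-D).

    dx has the shape of x. dy has the shape of y, so the per-element
    gradient must be summed (reduced) over the broadcast axis.
    """
    rows, cols = len(x), len(y)

    # d(out)/dx = 1 / y, applied elementwise; no reduction needed.
    dx = [[dout[i][j] / y[j] for j in range(cols)] for i in range(rows)]

    # d(out)/dy = -x / y^2 elementwise, then reduce over the row axis.
    dy = [0.0] * cols
    for i in range(rows):
        for j in range(cols):
            dy[j] += -dout[i][j] * x[i][j] / (y[j] * y[j])
    return dx, dy
```

The reduction is what makes `dy` match the shape of `y`: every row of `x` that reused the same `y[j]` contributes to that element's gradient.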


[infrt] optimize the infrt rewriter pattern format. test=develop (#38694) · d1dc677a
Committed by 王明冬 on Jan 05, 2022


04 Jan, 2022 · 18 commits
- Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with... · 6eac06e3
  Committed by niuliling123 on Jan 04, 2022
```
Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast (#38500)
```
- [new-exec] avoid adding_feed_fetch in each run (#38672) · 1345a456
  Committed by Leo Chen on Jan 04, 2022
- [XPU] update XPU device info, test=develop (#37884) · e1187e50
  Committed by Qi Li on Jan 04, 2022
- Fix memcpyD2H sync behavior with other stream (#38647) · c0c54ba3
  Committed by Aurelius84 on Jan 04, 2022
```
* Fix memcpyD2H sync behavior with other stream

* add wait

* add wait

* add wait
```
- [Pten]Move CPU_implementation of elementwise kernel in new directory (#38651) · 7c020c71
  Committed by YuanRisheng on Jan 04, 2022
```
* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* move cpu_impl of elementwise kernel to new directory
```
- [NPU] add pad and pad_grad (#38658) · 6e9714a2
  Committed by furnace on Jan 04, 2022
```
[NPU] add pad and pad_grad
```
- [fleet_executor] Support multi carriers (#38650) · 2273471d
  Committed by LiYuRio on Jan 04, 2022
- added sqrt bf16 fwd/bwd (#38599) · 2d2609ea
  Committed by jakpiase on Jan 04, 2022
- [Dy2st]Fix error when set buffer in forward (#38540) · 1e3f01ed
  Committed by 0x45f on Jan 04, 2022
```
* fix error when set buffer in forward

* add unittest

* refine class name

* refine not framework.in_dygraph_mode() in if

* fix UT error

* add comment

* refine code

* remove useless import
```
- Modify macro definition to support arm (#38642) · 719f7419
  Committed by zhangkaihuo on Jan 04, 2022
- [infrt] add trt_graph_split_pass for infrt. test=develop (#38494) · 9f0958fa
  Committed by 王明冬 on Jan 04, 2022
- [Unify Tensors PR #3]Port framework::Tensor members & interfaces to... · dfdc9960
  Committed by Zhanlue Yang on Jan 04, 2022
```
[Unify Tensors PR #3]Port framework::Tensor members & interfaces to pten::DenseTensor, test=allcases (#38473)

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Reverted changes to pten_layout() interface

* Removed friend classes
```
- Support test_imperative container_sequential and signal_handler with eager_guard (#38614) · a7b13d38
  Committed by Weilong Wu on Jan 04, 2022
- remove sigmoid cross entropy with logits from kl1 oplist. (#38641) · 30be9317
  Committed by houj04 on Jan 04, 2022
- [PTen] Move inner empty and cast api to kernel.h (#38587) · 64538c8d
  Committed by Chen Weihang on Jan 04, 2022
```
* move inner cast api to cast_kernel.h

* resolve conflict
```
- heter context support dynamic mf dim (#38487) · 59888bba
  Committed by yaoxuefeng on Jan 04, 2022
```
heter context support dynamic mf dim
```
- [Eager] Fix benchmark Performance (#38610) · 08b7f17d
  Committed by wanghuancoder on Jan 04, 2022
- plugin terminate should be called by TensorRT (#38374) · ba411960
  Committed by zlsh80826 on Jan 04, 2022

31 Dec, 2021 · 12 commits
- [XPU]add split op for kunlun2,*test=kunlun (#38277) · 26b845e2
  Committed by Zhangjingyu06 on Dec 31, 2021
```
* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun2,*test=kunlun

* [XPU]add split op for kunlun,*test=kunlun
Co-authored-by: QingshuChen <chenqingshu@baidu.com>
```
- [new API] add paddle.kthvalue and paddle.Tensor.kthvalue (#38386) · 538b5721
  Committed by JYChen on Dec 31, 2021
```
* add new api/op kthvalue

* kthvalue cuda kernel to cub sorting

* fix example code error

* throw errors instead of LOG in cuda sort

* throw errors by PADDLE_ENFORCE
```
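To illustrate what a k-th value operation computes, here is a minimal pure-Python sketch for a single 1-D row. It assumes `k` is 1-based and that the result is the k-th smallest element together with its original index; the helper name `kth_value` is hypothetical and this is not the CUDA/cub implementation from the commit.

```python
def kth_value(row, k):
    """Return (value, index) of the k-th smallest element of a 1-D sequence.

    k is 1-based in this sketch. Ties are broken by original position
    because Python's sort is stable.
    """
    if not 1 <= k <= len(row):
        raise ValueError(f"k must be in [1, {len(row)}], got {k}")
    # Sort the indices by their values instead of sorting the values,
    # so the original position of the k-th element is kept.
    order = sorted(range(len(row)), key=lambda i: row[i])
    idx = order[k - 1]
    return row[idx], idx
```

Applying the helper to each row of a nested list gives the behavior along the last axis of a 2-D input.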
- add mul_gru_fuse_pass ut (#37772) · bc827307
  Committed by baoachun on Dec 31, 2021
```
* add mul_gru_fuse_pass ut

* update ut

* update ut

* update ut timeout setting

* update ut
```
- Fix for MKLDNNDeviceContext error in matmul_v2_transpose_reshape fuse pass when GLOG_v set (#38554) · 1d31764e
  Committed by jakpiase on Dec 31, 2021
```
* glog fix

* changed approach
```
- Fix for undefined format for 6 dim tensor (#38553) · 730ccd9e
  Committed by jakpiase on Dec 31, 2021
```
* 6 dims fix

* removed limitations of max dims
```
- [Pten]Fix bugs of compilation when use pten::add/subtract (#38631) · 31efec53
  Committed by YuanRisheng on Dec 31, 2021
```
* change 'math' to 'math_kernel'

* fix compile bugs

* merge develop

* fix compile bugs

* fix compile bugs
```
- Probability distribution API of Beta and KL-Divergence (#38558) · 4794a44f
  Committed by Xiaoxu Chen on Dec 31, 2021
```
* add beta distribution
* add kl_divergence and register_kl api
```
- fix compile error for fleetwrapper with -DWITH_TESTING=ON (#38603) · 761055f0
  Committed by zmxdream on Dec 31, 2021
- fix_CUDA_ARCH_BIN (#38601) · 98b13322
  Committed by tianshuo78520a on Dec 31, 2021
- [MLU]support calling mlu op from python interface (#38292) · b6bf650a
  Committed by fwenguang on Dec 31, 2021
```
* [MLU]support calling mlu op from python interface

* [MLU]fix

* fix

* [mlu]fix mlu_places

* [mlu]fix required mlu

* fix

* [MLU]fix tensor copy

* [mlu] fix MLUPlace call path
```
- fix python ascend run error. (#38605) · 1df354e7
  Committed by Wilber on Dec 31, 2021
- [new api] add new api paddle.quantile and paddle.Tensor.quantile (#38567) · 20dc1ac2
  Committed by JYChen on Dec 31, 2021
```
* add new api paddle.quantile and paddle.Tensor.quantile

* add take_todo and fix UT
```
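As a rough illustration of what a quantile operation computes, the following pure-Python sketch handles a flat list of numbers. It assumes linear interpolation between the two closest ranks (the default in NumPy); the commit message does not state which method is used, and the helper name `quantile` here is a stand-in, not the Paddle implementation.

```python
import math


def quantile(values, q):
    """q-th quantile (0 <= q <= 1) of a non-empty sequence, linear interpolation."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")

    data = sorted(values)
    # Fractional position of the quantile in the sorted data.
    pos = q * (len(data) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    # Interpolate between the two neighbouring order statistics.
    return data[lo] + (data[hi] - data[lo]) * frac
```

For example, the median (`q = 0.5`) of `[1, 2, 3, 4]` falls halfway between 2 and 3.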