Commits · d666c7df39f78e5a0aa4a5ec18de47406940c56b · BaiXuePrincess / Paddle

12 Dec, 2022 (10 commits)
- [PHI] OneDNN version of Copy (#48539) · d666c7df
  Committed by Paulina Gacek on Dec 12, 2022
```
* OneDNN version of Copy, transpose kernels adjusted

* style fixes in transpose_grad

* redundant headers deleted
```
- Enhance check_nan_inf implementation for CPU. (#48591) · 69e695b7
  Committed by Yiqun Liu on Dec 12, 2022
```
* Enable to print device info.

* Enhance the nan and inf checking for cpu.

* Implement a common print function.

* Unify the check of complex numbers.

* Rewrite the omp method.

* Count and print the number of nan and inf.

* Change the print content.

* Add unittest.
```
- fix: Move the pass location to the appropriate location (#48951) · 6698e8d1
  Committed by feng_shuai on Dec 12, 2022
- forbid conv op whose weight is not a persistable weight into Paddle-TRT (#48763) · 60223894
  Committed by zhoutianzi666 on Dec 12, 2022
- [PHI decoupling] move norm_utils.cu.h from fluid to phi and remove norm_utils.h in fluid (#48930) · 3cb8db8f
  Committed by huangjiyi on Dec 12, 2022
```
* move norm_utils.cu.h from fluid to phi

* remove norm_utils.h in fluid

* fix bugs and replace mutable_data with Alloc

* replace mutable_data with Alloc
```
- add static_ops.yaml for static op (#48991) · 8f87f0c7
  Committed by zyfncg on Dec 12, 2022
- fix a bug in GetTrtWeight (#48993) · 93e36b06
  Committed by zhoutianzi666 on Dec 12, 2022
- Generate static graph code of some ops by yaml (#48771) · 4c0d46a8
  Committed by HappyHeavyRain on Dec 12, 2022
```
* generate static graph code of some ops by yaml, test = develop

* fix 'take_along_axis' yaml style

* reset scatter/scatter_nd_add

* delete the comments of put_along_axis
```
- Support cross-step stream synchronization for standalone executor (#48809) · 9455d146
  Committed by Ruibiao Chen on Dec 12, 2022
```
* Add UT

* Support cross-step stream synchronization for standalone executor

* Fix typos

* Fix typos

* Update UTs
```
- Add dynamic checks for collective communication on NCCL (#48915) · e7711592
  Committed by Wen Sun on Dec 12, 2022
```
* chore: unify `SingleTensor`

* feat: dynamic check
```
11 Dec, 2022 (2 commits)
- H2D data transfer optimization with usage of structure type for stack kernel (#48899) · a78f0a16
  Committed by limingshu on Dec 11, 2022
```
* first commit.

* refine performance with fast_divmod

* refine performance with fast_divmod
```
- fix for mkldnn (#48852) · 96e58f87
  Committed by Wilber on Dec 11, 2022
10 Dec, 2022 (1 commit)
- [Paddle-TRT] add cast between int64 tensor and Paddle-TRT (#45547) · fd373579
  Committed by zhoutianzi666 on Dec 10, 2022
```
* Add cast between int64 tensor and Paddle-TRT
* Add Unit testing.
```
09 Dec, 2022 (16 commits)
- support py3 in setup.py (#48905) · 2935ce07
  Committed by risemeup1 on Dec 09, 2022
```
* support py3 in setup.py

* support setup.py bdist_wheel in py3

* support py3 in setup.py

* modify run_setup
```
- [PHI] Migrate reshape kernel (#48749) · 7b2b0c1b
  Committed by Sławomir Siwek on Dec 09, 2022
```
* reshape

* typo

* remove header
```
- [Inference] optimize some code and fix some bug (#48780) · c0034b5b
  Committed by Yuanle Liu on Dec 09, 2022
```
* clean ir_pass_manager and fix map_depthwise_conv_to_conv_pass

* fix unit test timeout
```
- [Custom XPU Support] Custom extension support xpu backend (#48733) · 5ecd0ad5
  Committed by HongyuJia on Dec 09, 2022
```
* support custom_xpu

* update cmake to test xpu

* support custom_xpu, verify mechanism

* fix test_custom_relu_op_xpu_setup.py, test=kunlun

* fix FLAGS_init_allocated_mem

* cancel TIMEOUT property

* reset FLAGS_init_allocated_mem property
```
- [inference][trt] upgrade prelu op (#48528) · 98ab2433
  Committed by Zhang Jun on Dec 09, 2022
```
* add prelu
```
- fix scale type in alpha and beta (#48887) · c1cadcca
  Committed by MarDino on Dec 09, 2022
- move ops_extra_info_gen.py from phi to fluid (#48926) · c7d6d9f4
  Committed by huangjiyi on Dec 09, 2022
- mv fused_bias_dropout_residual_ln to fluid manual dir (#48824) · e0131224
  Committed by Weilong Wu on Dec 09, 2022
```
* mv fused_bias_dropout_residual_ln to fluid manual dir

* rm useless comments
```
- xpu support inplace flatten (#48909) · e6fdcd90
  Committed by james on Dec 09, 2022
```
This PR catches up with the latest xpu white list strategy
(https://github.com/PaddlePaddle/Paddle/pull/48606):
the original list only includes 'fluid' style names, but the new list
must include 'phi' style names as well.
Refer to paddle/phi/core/kernel_factory.cc for more details.
```
- temporarily disable set_value (#48942) · 905be668
  Committed by haosicheng on Dec 09, 2022
- Modified the Kernel policy. When the compute is NHWC (#48563) · 992250bf
  Committed by niuliling123 on Dec 09, 2022
- [Paddle Inference] add cutlass act set in conv_elementwise_add_act_fuse_pass (#48838) · 0f6c5459
  Committed by zhoutianzi666 on Dec 09, 2022
```
* add cutlass act set in conv_elementwise_add_act_fuse_pass
```
- Support static graph code-gen for scalar and int_array (#48792) · 58f08924
  Committed by zyfncg on Dec 09, 2022
```
* add support_tensor for code_gen to static graph

* support code-gen for int_array

* polish code

* fix bug of data_type
```
- [Kernel Selection] Simplify kernel selection process in phi, reduce search number to half (#47771) · ff8b2cb7
  Committed by HongyuJia on Dec 09, 2022
```
* simplify SelectKernelOrThrowError function in phi

* opt kernel_selection process

* polish code, fix backend error
```
- move share_buffer kernel to phi (#48858) · c2e77ba3
  Committed by Leo Chen on Dec 09, 2022
```
* move share_buffer kernel to phi

* fix ut

* add source file

* fix window links
```
- [PHI decoupling] move "flags.h" from fluid to phi (#48696) · 39ffef0d
  Committed by PuQing on Dec 09, 2022
08 Dec, 2022 (11 commits)
- fix paddle2cinn float16 type support bug (#48249) · 73bff10f
  Committed by jiangcheng on Dec 08, 2022
- first commit (#38143) · 2e7c172c
  Committed by limingshu on Dec 08, 2022
- fix 'BlasAXPBY unimplemented' error with custom device (#48762) · 127da101
  Committed by Kai Song on Dec 08, 2022
```
* fix 'BlasAXPBY unimplemented' error with custom device

* fix utils CmakeLists bug
```
- [XPU] add set_value and set_value_grad (#48845) · 94fe929a
  Committed by haosicheng on Dec 08, 2022
- rewrite delete_weight_dequant_linear_op_encoder/decoder pass (#48650) · 95332bef
  Committed by RichardWooSJTU on Dec 08, 2022
```
* rewrite delete_weight_dequant_linear_op_encoder/decoder pass
```
- opt kernel_selection error msg (#48864) · a14ae84b
  Committed by HongyuJia on Dec 08, 2022
- proper fix (#48360) · f95e9245
  Committed by jakpiase on Dec 08, 2022
```
Reenabled ext_reorder recording for TransDataLayoutFromOneDNN
```
- [Paddle Inference] General optimization for no_varlen embedding layernorm (#48580) · 22bfa579
  Committed by Wangzheee on Dec 08, 2022
```
* general optimization no_varlen embedding layernorm
```
- [XPU] add load op into oplist. (#48860) · 2bba3e18
  Committed by houj04 on Dec 08, 2022
```
* [XPU] add load op into oplist.

* remove test_sampling_id_op_xpu.py
```
- [PHI decoupling] move cuda_graph from fluid to phi (#48686) · a4d9851b
  Committed by huangjiyi on Dec 08, 2022
```
* move cuda_graph from fluid to phi

* move device_memory_aligment from fluid to phi

* Revert "move device_memory_aligment from fluid to phi"

This reverts commit b92fcd39a0a50fdac13278f49be0237a85f3a13f.

* update xpu cmake
```
- fix-gpups setup.py (#48888) · 91ff2071
  Committed by tianshuo78520a on Dec 08, 2022
```
* fix-gpups

* test=document_fix
```

BaiXuePrincess / Paddle is in sync with the upstream project it was forked from
