Commits · b8236b7bc102a77e8a4dfaca12e5d1ef09cdf4fc · BaiXuePrincess / Paddle

27 Mar, 2022 · 5 commits

Committed by hong on Mar 27, 2022

* move slice to pten

* merge develop; test=develop

* fix slice bug;

* update

* update

* fix error

* update

* fix bug

* polish code

* polish code

* polish code

* try to fix windows bug

* add gpu compile flag;

* try to fix

* remove template;

* polish code;

* fix npu bug;

* fix npu bug

* fix npu bug; test=develop

* fix slice bug;

* remove no need dep

b8236b7b

Make StreamSafeCUDAAllocator compatible with NaiveBestFit strategy (#40886) · 0ad2e192

Committed by From00 on Mar 27, 2022

* Make StreamSafeCUDAAllocator compatible with NaiveBestFit strategy

* Set FLAGS_use_stream_safe_cuda_allocator to false

* Update

* Remove unnecessary code

* Fix CI errors

* Add UT

0ad2e192

fix inplace bug in final_state eager_gen (#40979) · 25591674
Committed by pangyoki on Mar 27, 2022
```
* fix inplace bug in final_state eager_gen

* fix python_c_gen
```
25591674

Fix amp with optional api bug (#40980) · 52f07ab4
Committed by zhangbo9674 on Mar 27, 2022
```
* fix amp with optional api bug

* refine optional code for amp
```
52f07ab4

Add StringTensor (#39830) · 0695e1ac

Committed by Jack Zhou on Mar 27, 2022

* add string tensor and case convert kernels

* Add strings empty kernel; Reorganize the structure of case convert kernel

* Add string infermeta

* Update mutable_data of string tensor

* rename kernel name

* add string copy tmp

* Fix strings copy device bug

* add utf8 gpu converter

* add string tensor c++ api

* Remove mutable_data of string tensor

* update string tensor interface

* remove charcases_flag.h

* remove some fluid headers

* Add make_ddim

* __HIPCC__ -> PADDLE_WITH_HIP

* remove fluid headers

* fix cpu compile

* remove std::hash

* Fix cudaMalloc

* Remove strings/impl directory

* Fix infrt/get_phi_kernel_info.py;Add custom_kernels deps

* Add empty kernel test

* Remove some comments

* Modify lower/upper api encoding type: string->bool

* STRING->PSTRING; Add CreateInferLikeMeta

* Add code gen for C++ String API

* remove strings_api_utils.h

* Add ignore file (strings_api.h, strings_api.cc)

* update strings gen script

* change args order of case convert kernels

* Add comments for pstring, StringTensor

* cpstring_internal.h -> cpstring_impl.h

* Update according to comments:

1. Remove fluid headers
2. paddle::platform::errors -> phi::errors
3. Use 'place.GetType() == phi::AllocationType::GPU' instead of 'paddle::platform::is_cpu_space()'
4. Use camel code style

* Remove all singletons in strings kernels

* fix rocm compile

* Fix py3 compile

* Fix c++ coverage

* 1. Add pstring proto type
2. Add StringTensor debug info
3. Rename case_convert_kernel to strings_lower_upper
4. Remove serialize/deserialize strings kernel

* DataLayout::PSTRING -> DataLayout::PSTRING_UNION

* Register pstring data type

* Fix strings api gen

* Fix dense tensor register pstring dtype

* Fix error messages

* remove line

* add pstring unittest

* remove test string api unittest

* remove empty line

* Remove some headers to decrease the size of executable file

0695e1ac

26 Mar, 2022 · 2 commits
- [AMP] add amp for final_status_dygraph (#40945) · 3b895425
  Committed by zhangbo9674 on Mar 26, 2022
```
* add amp for final status

* solve compile error
```
  3b895425
- [Phi] Move mean infershape into phi (#40922) · b94cf842
  Committed by Chen Weihang on Mar 26, 2022
```
* move mean infershape into phi

* try to run ci

* share layout for mkldnn

* revert grad infershape

* revert grad infershape
```
  b94cf842

25 Mar, 2022 · 20 commits
- update eager code gen (#40924) · afe2fdd1
  Committed by hong on Mar 25, 2022
```
* update

* remove useless code

* remove label smooth test

* polish code

* polish code

* polish code

* remove _in_eager_mode error;
```
  afe2fdd1
- fix lars optimizer bug (#40892) · c006a609
  Committed by duanboqiang on Mar 25, 2022
```
* fix lars optimizer bug

* Update optimizer.py
```
  c006a609
- [MLU] add allreduce max/prod/min mlu kernel (#40792) · 9261dff4
  Committed by zn on Mar 25, 2022
  
  9261dff4
- [Refactor] refactored eager_gen.py PR #2 (#40907) · f027b2ad
  Committed by Zhanlue Yang on Mar 25, 2022
  
  f027b2ad
- move activation (#40913) · be5918e0
  Committed by YuanRisheng on Mar 25, 2022
  
  be5918e0
- [Phi] Migrate strided_slice into Phi (#40708) · c33b4f95
  Committed by Aurelius84 on Mar 25, 2022
```
* [Phi] Migrate strided_slice into Phi

* [Phi] Migrate strided_slice into Phi

* fix compilation problem
```
  c33b4f95
- [Phi] Migrate Adam and AdamW into Phi (#40351) · 56cd3407
  Committed by Aurelius84 on Mar 25, 2022
```
* [Phi] Migrate Adam and Adamw into Phi

* fix compile error and unittest ok

* fix compile error and unittest ok

* fix undefined reference to fLI::FLAGS

* test depend on operator

* fix cmake

* fix xpu compile

* fix infrt

* fix amp_type_traits

* fix amp_type_traits

* modify according to reviewer

* modify according to reviewer

* fix dtype float16

* fix typo

* fix Cmake

* fix code style
```
  56cd3407
- Thread data registry (#40912) · aeae81a7
  Committed by liutiexing on Mar 25, 2022
```
* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* Update ThreadDataRegistry
Co-authored-by: liutiexing <liutiexing@google.com>
```
  aeae81a7
- support multi_dims for tril_triu, *test=kunlun (#40712) · 9ffedcfd
  Committed by z8hanghuan on Mar 25, 2022
```
* support multi_dims for tril_triu, *test=kunlun

* support multi_dims for tril_triu, *test=kunlun

* support multi_dims for tril_triu, *test=kunlun

* update xpu.cmake date, support multi_dims for tril_triu, *test=kunlun
```
  9ffedcfd
- add maximum limit for grid of reduce, elementwise, gather and scatter (#40813) · 608a5f55
  Committed by FlyingQianMM on Mar 25, 2022
```
* add maximum limit for grid of reduce, elementwise and gather

* add {} after if
```
  608a5f55
- move mul op infershape (#40917) · 609077e9
  Committed by Chen Weihang on Mar 25, 2022
  
  609077e9
- [Phi] Move part sum op kernel (#40873) · 4ab8255a
  Committed by Chen Weihang on Mar 25, 2022
```
* move part sum op kernel

* remove deprecated names
```
  4ab8255a
- change CUDA implementation of dropout OP (#40874) · 1c01d1cc
  Committed by zhouweiwei2014 on Mar 25, 2022
  
  1c01d1cc
- Refactor Dygraph Flags (#40786) · 3085d5e4
  Committed by Jiabin Yang on Mar 25, 2022
```
* refactor eager flags

* fix flags error when we switch from eager to dygraph

* fix ci problem

* fix ci

* fix ci

* merge develop and fix code style

* merge develop and fix code style

* fix op test error

* fix op test error

* fix op test error

* fix op test error

* fix op test error

* merge develop
```
  3085d5e4
- move elementwise_max/min/mod into phi (#40590) · cfadf61b
  Committed by FlyingQianMM on Mar 25, 2022
  
  cfadf61b
- Fix loop index for FillZeroForEmptyGradInputs (#40909) · 3228fc34
  Committed by 0x45f on Mar 25, 2022
```
* Fix loop index for FillZeroForEmptyGradInputs

* Call fill zero in run_program_grad
```
  3228fc34
- fix dependency (#40901) · c7b69fd2
  Committed by seemingwang on Mar 25, 2022
  
  c7b69fd2
- [NPU] add merged_momentum (#40875) · 2b74b739
  Committed by Aganlengzi on Mar 25, 2022
```
* [NPU] add merged_momentum

* fix

* fix device
```
  2b74b739
- modify unit test in bn, stack and split. *test=kunlun (#40880) · 139a30ec
  Committed by Zhangjingyu06 on Mar 25, 2022
  
  139a30ec
- Scalar support marking data_type in yaml (#40867) · 04087012
  Committed by zyfncg on Mar 25, 2022
```
* Scalar support marking data_type in yaml

* fix code-gen bug
```
  04087012

24 Mar, 2022 · 13 commits

[Phi] Move mean op kernel into phi (#40872) · 8df91763
Committed by Chen Weihang on Mar 24, 2022
```
* add mean phi kernel

* remove original mean kernel

* add alias name
```
8df91763

[Phi] Move batch size like infershape into phi (#40847) · 6d3db9c7

Committed by Chen Weihang on Mar 24, 2022

* move batch size like infershape

* revert other op change

* call infermeta in infershape

* adjust batchsize like pos

6d3db9c7

p_norm transfer to phi kernels (#40819) · 92afe146
Committed by zhiboniu on Mar 24, 2022

92afe146

[new-exec] enable standalone_executor_test in coverage (#40846) · 22a5035e
Committed by Leo Chen on Mar 24, 2022

22a5035e

fix build_cinn_pass internal var may be control var problem (#40812) · 310b7dba
Committed by jiangcheng on Mar 24, 2022
```
* fix build_cinn_pass internal var may be control var problem

* add annotation and vlog by review advice
```
310b7dba

Support intermediate for Sparse API (#40840) · 98244a9a

Committed by zyfncg on Mar 24, 2022

* support intermediate for sparse api

* close intermediate in yaml

* fix dygraph_api dep for eager

98244a9a

[AMP] Support amp for Intermediate_dygraph (#40623) · c12f7d48

Committed by zhangbo9674 on Mar 24, 2022

* approve amp for intermediate_dygraph

* add amp_utils for intermediate_dygraph

* add amp needcast check for mlu & npu

* test unittest

* add SetGradNode for set_stop_gradient && add checktensor for GradientHooks

* refine code

* refine unittest of imperative_amp for new dygraph

* inplace api skip amp

* add test_imperative_qat_amp for intermediate amp

* refine code

* refine test_amp ci strategy

* refine unittest code

* refine amp_utils code

* refine amp getpromotetype for some special op

* refine unittest code

c12f7d48

Correct MultipleQuantizeSquash (#40717) · 753964a2
Committed by joanna.wozna.intel on Mar 24, 2022
```
* Correct MultipleQuantizeSquash

* Correct logging
```
753964a2

[MoE]Assign pos op (#40580) · 305f32d1

Committed by Roc on Mar 24, 2022

* # This is a combination of 10 commits.
# The first commit's message is:
add expert count op

add ut for expert_count

# This is the 2nd commit message:

update UT only for cuda

# This is the 3rd commit message:

fix for rocm

# This is the 4th commit message:

update ut

# This is the 5th commit message:

add moe module

# This is the 6th commit message:

add expert count op

add ut for expert_count

# This is the 7th commit message:

update UT only for cuda

# This is the 8th commit message:

update ut

# This is the 9th commit message:

add moe module

# This is the 10th commit message:

make expert count private

* add assign pos op

* fix upper num name

* add api _assign pos

* add ut for assign pos op

* update date

* fix for win

* update for test (timeout)

* fix ut

* update

* fix ut for number count
Co-authored-by: hlygit66666 <2570058140@qq.com>

305f32d1

Refine events waiter (#40876) · 36ee6dd3

Committed by liutiexing on Mar 24, 2022

* add align for WorkQueue

* add spinlock

* merge develop

* merge

* Add EventsWaiter

* Add EventsWaiter

* update

* Revert "Add EventsWaiter"

This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.

* update

* update Error MSG

* update EventsWaiter

* update
Co-authored-by: liutiexing <liutiexing@google.com>

36ee6dd3

Add sparse conversion api and sparse creation api (#40780) · a8f86600
Committed by zhangkaihuo on Mar 24, 2022

a8f86600

[Phi] Migrate InferShape of multiplex, qr, tril_triu (#40102) · 2e736531

Committed by caozhou on Mar 24, 2022

* migrate infershape

* fix tril_triu infershape error

* fix qr_op infershape

* add parse qr mode func

* move order

2e736531

[Refactor] refactored eager_gen.py PR #1 (#40815) · 68c9e3e4

Committed by Zhanlue Yang on Mar 24, 2022

* [Refactor] refactored eager_gen.py PR #1

* [Refactor] refactored eager_gen.py PR #1

* Refactored version 2

* Added automatic code generation utils

* Fixed merge issues

68c9e3e4

BaiXuePrincess / Paddle is in sync with the source project it was forked from
