Commits · 5e0f199ab02e1f1458e49a9318f40fede2c0439e · Crayon鑫 / Paddle

30 Sep, 2021 2 commits
- Fix raw optim (#36176) · 5e0f199a
  Committed by 李季 on Sep 30, 2021
```
* fix raw optim

* pre-commit test file
Co-authored-by: sneaxiy <sneaxiy@126.com>
```
- fix the undefined variable bug in dist_transformer file (#36211) · 8af939f1
  Committed by 李季 on Sep 30, 2021

29 Sep, 2021 10 commits

Add basic support for CUDA Graph (#36190) · 21b93c3d

Committed by Zeng Jinle on Sep 29, 2021

* add basic support for CUDA Graph

* fix ci compile error

* fix LOG print, fix windows CI

* follow comments and update

* small fix for default ctor

* fix rocm compile error

* fix CPU compile error

add optest for adamw (#36148) · 69eed34d
Committed by zhaoyingli on Sep 29, 2021
```
* update func name

* skip cpu

* update unittest

* update unittest
```
fix cusparse compile problem, test=develop (#36199) · 3eb50715
Committed by Liu-xiandong on Sep 29, 2021
```
* fix cusparse compile problem, test=develop

* Modify file permissions
```
Add functional autograd API:hessian (#36108) · 1f93582c

Committed by levi131 on Sep 29, 2021

* init functional jacobian api

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* init hessian API

* save status

* polish API docstring

* modify docstring

* add utils.py

* save status

* fix dygraph double grad dtype error when calling for high differential senario

* reinvoke ci

* test_hessian.py is ok

* polish hessian API

* init vhp

* Revert "init vhp"

This reverts commit cbd4d3b66abe82b0ac10721b9eddeb7d82e0a1c8.

* add test for partial_engine.cc

* modify numerical_delta with dtype float32

* merge fix for dtype float64

* spell fix

* polish code

* rm _stop_gradient_pre_process
Co-authored-by: JiabinYang <360788950@qq.com>

[npu] add box coder (#36171) · 83578cfa
Committed by zhulei on Sep 29, 2021
```
* [npu] add box coder

* [npu] add box coder
```
fix bug of top_k npu op (#36175) · 2b8fd704
Committed by pangyoki on Sep 29, 2021

[NPU] Add group norm (#35937) · c79de728

Committed by zhulei on Sep 29, 2021

* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group norm

* [NPU] Add group_norm op

[NPU] mod for model bert (#36165) · 7bddf2e8

Committed by Aganlengzi on Sep 29, 2021

* merge conflict of paddle_gtest_main.cc

* modify FLAGS_npu_precision_mode and default not to call aclSetCompileopt

[hybrid] Fix model parallel non-distributed param broadcast (#36186) · bec9fc9a
Committed by WangXi on Sep 29, 2021

Add op paddle.device.cuda.get_device_name and paddle.device.cuda.get_device_capability. (#35672) · f703558d

Committed by hlygit66666 on Sep 29, 2021

* add op paddle.device.cuda.get_device_name

* fix some bugs

* fix some bugs

* fix error message bugs

* fix en docs

* fix bugs

* fix bugs

* fix bugs

* add error message test case

* add get_device_name and get_device_capability

* fix review

* fix docs bug

* fix docs

* fix docs

28 Sep, 2021 8 commits

Add sparse_attention api, test=develop (#35676) · 6b587e93
Committed by Liu-xiandong on Sep 28, 2021
```
Add sparse_attention OPs, python api will be added in next pr
```
add API paddle.linalg.eig (#35674) · bc7e2b92

Committed by Lijunhui on Sep 28, 2021

* Add paddle.linalg.eig op

* remove comments

* remove comments

* extend batch_size to the origin

* add real times complex functor & destroy the backward complex output bug

* terminate output diff when input real tensors

* correct tiny doc errors

* move functions from eig_helper to svd_helper and remove eig_helper

* remove tensor.Resize

* remove no longer used code

* use existing lapack functions

* reply review comments 21/27

* remove .cu as this op is only executed on CPU

* remove const_cast & add const in argument list for read-only references

* fix sample code error in CI

* remove template typename Tbase and more

* remove eig exposure in paddle.*

* add 'name=None' in eig python implementation

* handle the unittest

* try to solve the unittest

* solve CI coverage

* remove no longer used code

* polish API doc and more

* reply review comments

* polish unittest, commit plan B

* polish unittest

[hybrid] seed and dropout op support force-cpu (#35820) · 58c8f6b3

Committed by xiayanming on Sep 28, 2021

* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid

* [HIP] fix op not support AMD GPU bug, the flag PADDLE_WITH_ROCM is invalid

* [HIP] fix op not support AMD GPU bug

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] seed and dropout op support force-cpu

* [hybrid] fix seed ci failed issue

* add AsExtra for force_cpu of seed op

remove new linalg api in paddle.__init__ (#36151) · 3bb4715e

Committed by zhiboniu on Sep 28, 2021

remove recent linalg api in paddle.init;
add args 'name' in some new linalg api interface
same change in develop branch to #36112

【Bug fix】Fix dygraph double grad dtype error (#36125) · af4f018a

Committed by Jiabin Yang on Sep 28, 2021

* fix dygraph double grad dtype error when calling for high differential senario

* reinvoke ci

* add test for partial_engine.cc

py2 to py3 bug and iface fix for pslib (#36102) · 0e07f20e
Committed by kuizhiqing on Sep 28, 2021

[hybrid] optimizer sharding support optimize cast (#35878) · eef0a943
Committed by WangXi on Sep 28, 2021

Add paddle.device.cuda.get_device_properties (#35661) · 4cbed9e5

Committed by Yanxing Shi on Sep 28, 2021

* Initial Commit

* add unittest and add error information

* modify doc

* fix some error

* fix some word

* fix bug cudaDeviceProp* and modify error explanation

* fix cudaDeviceProp* error and unnitest samples

* fix hip error and PADDLE_WITH_HIP

* update style

* fix error is_compiled_with_cuda

* fix paddle.device.cuda.get_device_properties

* fix error for multi thread safe

* update style

* merge conflict

* modify after mentor review

* update style

* delete word

* fix unittest error for windows

* support string input and modify some code

* modify doc to support string input

* fix error for express information

* fix error for express information

* fix unnitest for windows

* fix device.startswith('gpu:')

* format error and doc

* fix after review

* format code

* fix error for doc compile

* fix error for doc compile

* fix error for doc compile

* fix error for doc compile

* fix error for doc compile

* fix py2 error

* fix wrong words and doc

* fix _gpuDeviceProperties

27 Sep, 2021 4 commits

fix zero tensor for unique, unstack (#36021) · efd35384

Committed by Jiawei Wang on Sep 27, 2021

* fix extra op for expand, expand_as, tile, unstack

* fix unique unstack dim 0

* Update expand_v2_op.cc

* fix unique_op format

Added flatten and flatten2 BF16/FP32 FWD/BWD kernels (#35892) · e427a0f1

Committed by jakpiase on Sep 27, 2021

* refactored reshape multiop kernel and added flatten1/2 kernels

* added formatting for flatten tests

* CI fix

* disabled reshape_kernel ops after succesful CI run

* minor fix

Add functional autograd API: jacobian (#35917) · ec2f68e8

Committed by levi131 on Sep 27, 2021

* init functional jacobian api

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* polish API docstring

* modify docstring

support saving model defined parameters without add scale_op (#36119) · 8db6d221

Committed by Haipeng Wang on Sep 27, 2021

* add scale_op in model save step is not necessary, just fix the prune method to support static graph and inplace op

* fix jit.save, no need to add scale_op to each outputvar anymore.
fix prune_with_input, now it supports inplace op

* temporarily disable test_trt_dynamic_shape.TRTDynamicShapeOutOfBound2Test

* allow user to export parameters defined in model

26 Sep, 2021 5 commits
- [new api] add func/class API psroi_pool and UT (#35352) · e45d64ec
  Committed by JYChen on Sep 26, 2021
```
* add func/class API psroi_pool and UT

* add UT in static mode

* Remove redundant type checks in static mode

* More detailed description for test_psroi_pool_op

* fix code format of UT

* fix en-doc
```
- Correct the misspelled part of the unit test (#36044) · 991ae3b6
  Committed by LJQ❤️ on Sep 26, 2021
  
- update multi_dot exposure rules (#36018) · 52b45007
  Committed by zhangkaihuo on Sep 26, 2021
  
- set file_num in one shard (#35835) · 991dc67d
  Committed by Thunderbrook on Sep 26, 2021
```
* set file_num in one shard

* format
```
- Fixed errors in the sample code (#36041) · d70e45d9
  Committed by wangzhuang01 on Sep 26, 2021

24 Sep, 2021 11 commits

add gradient kernel of det op and slogdet op (#36013) · b91e8eec
Committed by jiangcheng on Sep 24, 2021
```
* add gradient kernel of det op and slogdet op

* fix CI APPROVAL problem
```
Added elementwise_sub_mkldnn operator (#35662) · 787273ed

Committed by piotrekobiIntel on Sep 24, 2021

* Add elementwise_sub_mkldnn_op without grad

* Add test to static_mode_white_list

* Refactor code, change license years

* Remove invalid grad implementation

* Fix element_wise_sub_op test

* Fix CI Approval error

* Remove unnecessary EltwiseSubMKLDNNGradKernel class

* Fix CI Approval 2

* Fix CI Approval 3

* Fix CI Approval Attempt #4

* Fix CI Approve Attempt #5

* Fix CI Approval Attempt #6

* Fix CI Approval Attemt #7

* Change test names containing add to sub

* Fix old tests testing add instead of sub

* Copy grad implementation from elementwise_add_mkldnn

* CI test fix attempt

* Revert "CI test fix attempt"

This reverts commit c647cacf41e6a87c715385a185de5cbf65fc8900.

* Fix CI attempt 2

* Fix elementwise_sub tests, temporary mkldnn broadcast test disable

* Add working implementation of elementwise_sub grad

* Fix build errors caused by pull

* Fix format error

* Fix format error 2

* Disable elementwise_sub_mkldnn test on GPU

* Apply fix for paddle.fluid import

* Revert changes of test_elementwise_sub and Fix mkldnn test

* Revert "Apply fix for paddle.fluid import"

This reverts commit fc3b122fec8e12f2bcb32928a2685ba4d20fd742.

* fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (#35862)

* Add changes suggested by reviewers

* Change @unittest.skipIf... to @OpTestTool.skip_if_not_cpu_bf16() to satisfy Approval CI

* Remove check_dygraph=False to satisify CI Approval
Co-authored-by: zhangbo9674 <82555433+zhangbo9674@users.noreply.github.com>

add update (#36017) · 1691dc7a
Committed by ShenLiang on Sep 24, 2021

add pool2d convert test (#35923) · 82f255d0

Committed by JingZhuangzhuang on Sep 24, 2021

* add pool2d convert test

* modify error

* modify error

* modify error

* modify error

* modify error

* modify error

fix undefined var in test_batch_sampler. test=develop (#35924) · 4f42e5d7
Committed by Kaipeng Deng on Sep 24, 2021

concat api support empty tensor. (#35845) · eb28a36d
Committed by wuhuachaocoding on Sep 24, 2021

Add paddle.linalg.solve OP (#35715) · 8caf951c

Committed by Weilong Wu on Sep 24, 2021

* Add linalg.solve op, test=develop

* Fix a bug caused by accidental deletion

* updated description and fix a bug: missing a comma

* Add linalg.solve op, test=develop

* updated solve op backward logic

* updated solve op backward logic again

* Add linalg.solve Op, test=develop

* Updated and modified to fit CI requirements

* Fix a bug

* 1)Add more test cases; 2)Fix a wrong usage in reduces operation; 3)Remove redundant code

* Remove redundant comments

* 1)Removed redundant code; 2)Updated to enhance code robustness

* Removed redundant code

* Updated API documents

fix distributed ops combining problems (#35942) · 4c35f515

Committed by seemingwang on Sep 24, 2021

* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segment fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node &test & change data-structure from linkedList to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return  py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundent threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table

* optimize get_feat function of graph engine

* fix long long count problem

* remove redandunt graph files

* remove unused shell

* recover dropout_op_pass.h

* fix potential stack overflow when request number is too large & node add & node clear & node remove

* when sample k is larger than neigbor num, return directly

* using random seed generator of paddle to speed up

* fix bug of random sample k

* fix code style

* fix code style

* add remove graph to fleet_py.cc

* fix blocking_queue problem

* fix style

* fix

* recover capacity check

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* fix distributed op combining problems

* optimize

* remove logs
Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>

add emb_eltwise_layernorm trt converter test case (#36027) · 0bbaf9bd
Committed by baoachun on Sep 24, 2021

add multihead_matmul trt converter test case (#36023) · fcaa64b3
Committed by baoachun on Sep 24, 2021
```
* add multihead_matmul trt converter test case

* move attribute check to op_teller
```
add the shape check for the matmul (#35791) · 8e19d1ba
Committed by wawltor on Sep 24, 2021
```
* add the shape check for the matmul

* remove the test case for the linear
```
