Commits · b9e543f885114d38f96058bcb5a98a1f5e5e1d0a · PaddlePaddle / Paddle

13 Apr, 2021 (7 commits)

upgrade to oneDNN2.2.1 (fix when prim descriptor or attr contain NaN) (#32227) · b9e543f8
Committed by lidanqing on Apr 13, 2021

b9e543f8

add statistics_UT_resource.sh for improving UT parallel level (#32220) · 1d5d3e47
Committed by Zhou Wei on Apr 13, 2021

1d5d3e47
Fix prec on windows for long args (#32218) · 7ab47e8d
Committed by YUNSHEN XIE on Apr 13, 2021
```
* fix error for long args

* remove unnecessary code
```
7ab47e8d

add layer.to api (#32040) · 6e946e9d

Committed by chentianyu03 on Apr 13, 2021

* add layer.to api

* add layer.to api

* add layer.to api

* add the doc for Layer.to

* add input type checking

* modify assert and import bug

* format code style

* format code style

* make place support str type

* add SetGradVarBase method to set the gradient after conversion

* modify argument place to device

* modify argument place to device

* modify doc of layers.to API

* add xpuplace to device argument

6e946e9d

[ROCM] fix depth conv2d in rocm, test=develop (#32170) · 693c7629
Committed by Qi Li on Apr 13, 2021

693c7629

optimize check_finite_and_unscale_op by fused kernel, test=develop (#31954) · fdf63b4e
Committed by jiangcheng on Apr 13, 2021

fdf63b4e

run the sample codes added by `add_sample_code` in ops.py (#31863) · 4a09c1a1

Committed by Ren Wei (任卫) on Apr 13, 2021

* skip paddle.Tensor.<lambda>

* some files may not exist, such as version.py, which is generated by setup.py

* debug mode

* add unittests for sampcd_processor.py

* add test cases for sampcd_processor

* add test cases for sampcd_processor

* add testcases

* add test cases

* add testcases

* add testcases

* refactor, add testcases

* add import

* all files map to pool. don't split manually

* __all__ += another list

* add testcases

* add testcases

* handle, my foot

* this line should not be removed

https://github.com/wadefelix/Paddle/commit/882e7f7c3be6c2415f58550f82be338b84f0c0ef#diff-cb0679475bf60202fd803ae05b9146989437c3f787d1502616be6c71c69d0fb1

* print -> logger

* regulate the logging information

* regulate the logging information

* logger to file

* logger

* threads or subprocesses number config

* follow the good code style

don't touch wlist.json

* run test_sampcd_processor.py, it's a unittest for sampcd_processor.py

* update unittest for sampcd_processor.py

test=document_fix

4a09c1a1

12 Apr, 2021 (9 commits)

polish custom api content for performance (#32209) · 0624ea56
Committed by Chen Weihang on Apr 12, 2021

0624ea56

[Rocm] fix python test of multinomial (#32158) · 4b5cb22f

Committed by zhulei on Apr 12, 2021

* [Rocm] fix python test of multinomial

* [Rocm] fix python test of multinomial

* [Rocm] fix python test of multinomial

* [Rocm] fix python test of multinomial

4b5cb22f

Optimize the process of obtaining prec_list on windows (#32123) · 8dacfb5e

Committed by YUNSHEN XIE on Apr 12, 2021

* test,test,notest,test=windows_ci

* test,notest,test=windows_ci

* test,notest,test=windows_ci

* test,notest,test=windows_ci

* remove test code

* delete some unnecessary logs

* fix format error

* turn on added ut check on windows

8dacfb5e

[CustomOp]Fix description of supporting MacOS (#32192) · bb3b7906
Committed by Aurelius84 on Apr 12, 2021

bb3b7906

[ROCM] fix some unittests (#32129) · bd2a4e23

Committed by ronnywang on Apr 12, 2021

* [ROCM] fix test_gru_rnn_op

* [ROCM] fix test_expand_op

* [ROCM] fix test_cross_entropy_loss

* [ROCM] fix test_conv_nn_grad

* [ROCM] fix test_bilinear_tensor_product_op

* [ROCM] fix elementwise_op_function

* [ROCM] fix test_lstm_cudnn_op

* [ROCM] fix test_gpu_package_without_gpu_device

* [ROCM] fix test_gru_unit_op

* [ROCM] fix test_imperative_optimizer

* [ROCM] fix rnn

* [ROCM] fix group_norm_op

* [ROCM] fix test_pool3d_api

* [ROCM] fix test_pool3d_op

bd2a4e23

Optimization of bilinear backward OP CUDA kernel. (#30950) · d8afe407
Committed by limingshu on Apr 12, 2021

d8afe407

follow comments to refine PR 32144 (#32174) · af374ae6
Committed by Leo Chen on Apr 12, 2021

af374ae6

remove PYTHON_ABI, test=document_fix (#32190) · 80698cad
Committed by wuhuanzhou on Apr 12, 2021

80698cad
fix concat_grad on kunlun (#32151) · a2387ef2
Committed by TTerror on Apr 12, 2021
```
* fix concat_grad on kunlun

* fix concat_grad on kunlun
```
a2387ef2

10 Apr, 2021 (2 commits)
- Optimize the performance of the forward of log_softmax when axis is -1 and dim <= 1024 (#31630) · f8bab5b0
  Committed by AshburnLee on Apr 10, 2021
  
  f8bab5b0
- Ci py3 gcc5.4 (#32045) · afa3720c
  Committed by tianshuo78520a on Apr 10, 2021
  
  afa3720c
09 Apr, 2021 (9 commits)

make high precision for avg_pool and adaptive_avg_pool when data_type is float16 (#31887) · ec2ffb68
Committed by niuliling123 on Apr 09, 2021
```
* make high precision for avg_pool
```
ec2ffb68

[NPU] cherry-pick basic NPU components/allocator/operator/executor supports from ascendrc (#32144) · ccf5709d

Committed by Leo Chen on Apr 09, 2021

* [feature] support npu allocator (#30840)

[feature] support npu allocator

* [feature] support npu operator (#30951)

[feature] support npu operator

* [feature] support npu allocator, part 2 (#30972)

* support npu allocator

* add npu device context

* fix some compile problem

* fix some compile problem

* add npu info

* compile ok

* fix include dir

* support naive_best_fit_allocator

* run ut ok, but failed to exit

* call aclrtResetDevice before exit

* fix aclFinilize

* add system allocator test

* add selected_gpus in gtest

* add tensor_test for npu

* support npu op, initial commit

* add npu stream

* add elementwise_add_op

* compile ok

* fix typo

* fix elementwise_add_op_npu_test

* support op run

* test can run but failed

* change aclopExecuteV2 to aclopCompileAndExecute

* support parsing ascend rank table file (#31000)

support parsing ascend rank table file

* Fix reshape on GE graph. (#31084)

Fix reshape on GE graph

* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)

* add npu sub op

* fix typo

* rename test

* fix bug

* fix bug

* add fp16 kernel

* fix typo

* support sub grad op

* support elementwise_sub_grad op
Co-authored-by: frankwhzhang <frankwhzhang@126.com>

* Fix compilation problem (#31100)

Fix compilation problem (#31100)

* fix compile

* fix code style

* remove const_cast

* support adding correct npu op in pybind.h (#31143)

* support adding correct npu op in pybind.h

* refine code

* [NPU] Support executor with NPU (#31057)

* [NPU] Support executor with NPU

* Fix code according to reviews

* Fix code

* Add unittest for sub op npu

* refactor npu device manager (#31154)

refactor npu device manager (#31154)

* fix selected npus

* fix compile

* fix reading flags from env

* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>

ccf5709d

fix unittest timeout (#32161) · a73cb679
Committed by Shang Zhizhou on Apr 09, 2021

a73cb679
[Dy2Stat] Fix undefined var used in For (#32153) · 4636d136
Committed by Aurelius84 on Apr 09, 2021
```
* fix undefined var in For

* fix code style
```
4636d136

Avoid CPU -> CPU memory copy when start, end, step are already on CPU. (#29088) · 95122ebe
Committed by Yiqun Liu on Apr 09, 2021

95122ebe
[CustomOp]Support MacOS platform and Remove libpaddle_custom_op.so dependency (#31976) · d815fbf9
Committed by Aurelius84 on Apr 09, 2021
```
* Remove old custom OP to reduce whl package volume

* [Custom OP]Remove old custom OP to reduce whl package volume

* support macos
```
d815fbf9
[Dy2Stat] Support DictCmp and zip grammar (#32159) · 55730d95
Committed by Aurelius84 on Apr 09, 2021
```
* support DictCmp and zip grammar

* fix code style
```
55730d95

Candidate fix to #31992 (#32136) · dabaca00
Committed by Jacek Czaja on Apr 09, 2021

dabaca00

[ROCM] update rocm skip ut list, test=develop (#32149) · 3822247f
Committed by Lei.C on Apr 09, 2021

3822247f

08 Apr, 2021 (6 commits)
- Support converting the model from fp32 to fp16 (#32112) · 1bae1e74
  Committed by cc on Apr 08, 2021
```
* Support converting the model from fp32 to fp16
```
  1bae1e74
- Add LayerDict class (#31951) · e45c3fa5
  Committed by chentianyu03 on Apr 08, 2021
```
* add layerdict class

* add docs and test cases for LayerDict class

* remove the argument types in the function definition

* add update inputs type check
```
  e45c3fa5
- 4D Hybrid Parallelism (#32134) · 54344964
  Committed by JZ-LIANG on Apr 08, 2021
  
  54344964
- The unsupported_fp16_list used in AMP will be created automatically at runtime. (#32102) · 6e65fe02
  Committed by Zhen Wang on Apr 08, 2021
```
* Use the runtime to create the unsupported_fp16_list used in AMP.

* Add more infos about supported ops.

* Add some comments for the function of OpSupportedInfos.

* Fix the unit test of test_multi_precision_fp16_train.
```
  6e65fe02
- fix bug (#32135) · 72302033
  Committed by ShenLiang on Apr 08, 2021
  
  72302033
- fix the XXX_GRAD_CASE bug by HexToString (#32004) · f74f9762
  Committed by Thomas Young on Apr 08, 2021
  
  f74f9762
07 Apr, 2021 (7 commits)

add uint8 type for flatten op (#32120) · 297290a8
Committed by danleifeng on Apr 07, 2021
```
* add uint8 type for flatten;test=develop
```
297290a8

move graph files (#32103) · 4935b8e7

Committed by seemingwang on Apr 07, 2021

* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segmentation fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node &test & change data-structure from linkedList to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return  py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundant threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table
Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>

4935b8e7

Check added ut on windows (#31826) · e09f4db9

Committed by YUNSHEN XIE on Apr 07, 2021

* added ut check on windows,notest,test=windows_ci

* debug,notest,test=windows_ci

* debug,notest,test=windows_ci

* fix bug,notest,test=windows_ci

* added ut check

* test for new ut add on windows

* test,notest,test=windows_ci

* fix bug,notest,test=windows_ci

* test

* test

* test

* test,notest,test=windows_ci

* test,notest,test=windows_ci

* check added ut on windows

* only fetch upstream develop

* modified according to comments

* Update run_unittests.sh

* Update run_unittests.sh

e09f4db9

bugfix for unit test test_segment_ops (#32116) · d91faf29
Committed by furnace on Apr 07, 2021

d91faf29

【NPU】Merge ascend GE&distributed code by 0208 from ascendrc (#31957) · 8c7c53b3

Committed by zhang wenhui on Apr 07, 2021

* Ascend rc (#30483)

* Fix compilation on CANN20.1 and older (#30494)

Fix compilation on CANN20.1 and older

* Add distribution supported (#30578)

Add distribution supported

* Build parser for Hcom* operators (#30627)

Build parser for Hcom* operators

* Pass device_ids info from launch to trainer. (#30632)

Pass device_ids info from launch to trainer

* Add Hccl program group (#30642)

Add Hccl program group

* Add startup bash files of test_ascend_group. (#30645)

Add startup bash files of test_ascend_group

* cleanup (#30646)

cleanup test_ascend_group.py

* [Feature] Build parser to support distributed training (#30658)

[Feature] Build parser to support distributed training

* fix compilation on ascend-20.1 (#30722)

fix compilation on ascend-20.1

* Dev/fix ascend string (#30749)

Dev/fix ascend string

* code style (#30781)

code style

* Merge ascend_optimizer and ascend_parser. (#30776)

Merge ascend_optimizer and ascend_parser.

* Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug  (#30797)

Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug

* Add paddle ascend distribution training supported (#30796)

Add paddle ascend distribution training supported

* pass cxx_flags to gloo cmake (#30857)

* Destroy session first. (#30954)

Destroy session first.

* merge

* fix, test=develop

* fix, test=develop

* fix style, test=develop

* fix, test=develop

* fix

* fix log fatal, test=develop

* fix enforce style, test=develop

* fix, test=develop

* fix, test=develop

* fix rccl, test=develop

* fix test, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix node_num, test=develop

* fix ids str, test=develop

* fix ids str, test=develop

* fix ids str, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix style code, test=develop

* fix style code, test=develop

* fix style code, test=develop

* fix style code, test=develop
Co-authored-by: hutuxian <hutuxian2011@sina.cn>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: Void Main <voidmain1313113@gmail.com>
Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
Co-authored-by: dingsiyu <18369187719@163.com>
Co-authored-by: OleNet <olenet@126.com>

8c7c53b3

[3D-parallelism] Hybrid Model Parallelism (#32074) · 1e60a0c4
Committed by JZ-LIANG on Apr 07, 2021

1e60a0c4
improve performance of DepthwiseConv(NHWC) (#31677) · 363b25aa
Committed by Ouyang Chao on Apr 07, 2021
```
* improve performance of DepthwiseConv(NHWC)
```
363b25aa

PaddlePaddle / Paddle: synced successfully more than 1 year ago