Commits · 8c7c53b3d5237bcdbcb42e492ec51bc581223549 · PaddlePaddle / Paddle

07 Apr, 2021 7 commits

【NPU】Merge ascend GE&distributed code by 0208 from ascendrc (#31957) · 8c7c53b3

Committed by zhang wenhui on Apr 07, 2021

* Ascend rc (#30483)

* Fix compilation on CANN20.1 and older (#30494)

Fix compilation on CANN20.1 and older

* Add distribution supported (#30578)

Add distribution supported

* Build parser for Hcom* operators (#30627)

Build parser for Hcom* operators

* Pass device_ids info from launch to trainer. (#30632)

Pass device_ids info from launch to trainer

* Add Hccl program group (#30642)

Add Hccl program group

* Add startup bash files of test_ascend_group. (#30645)

Add startup bash files of test_ascend_group

* cleanup (#30646)

cleanup test_ascend_group.py

* [Feature] Build parser to support distributed training (#30658)

[Feature] Build parser to support distributed training

* fix compilation on ascend-20.1 (#30722)

fix compilation on ascend-20.1

* Dev/fix ascend string (#30749)

Dev/fix ascend string

* code style (#30781)

code style

* Merge ascend_optimizer and ascend_parser. (#30776)

Merge ascend_optimizer and ascend_parser.

* Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug  (#30797)

Ascendrc add converted op : [range/equal/range/uniform_random/expand/squeeze], fix cast op bug

* Add paddle ascend distribution training supported (#30796)

Add paddle ascend distribution training supported

* pass cxx_flags to gloo cmake (#30857)

* Destroy session first. (#30954)

Destroy session first.

* merge

* fix, test=develop

* fix, test=develop

* fix style, test=develop

* fix, test=develop

* fix

* fix log fatal, test=develop

* fix enforce style, test=develop

* fix, test=develop

* fix, test=develop

* fix rccl, test=develop

* fix test, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix node_num, test=develop

* fix ids str, test=develop

* fix ids str, test=develop

* fix ids str, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

* fix style code, test=develop

* fix style code, test=develop

* fix style code, test=develop

* fix style code, test=develop

Co-authored-by: hutuxian <hutuxian2011@sina.cn>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: Void Main <voidmain1313113@gmail.com>
Co-authored-by: Leo Chen <chenqiuliang@baidu.com>
Co-authored-by: dingsiyu <18369187719@163.com>
Co-authored-by: OleNet <olenet@126.com>

[3D-parallelism] Hybrid Model Parallelism (#32074) · 1e60a0c4
Committed by JZ-LIANG on Apr 07, 2021

improve performance of DepthwiseConv(NHWC) (#31677) · 363b25aa
Committed by Ouyang Chao on Apr 07, 2021
```
* improve performance of DepthwiseConv(NHWC)
```
update the TraceLayer.save_inference_model method with add file suffix automatically (#31989) · 10af966a
Committed by CtfGo on Apr 07, 2021
```
As the title
```

update name of develop whl package and upgrade gcc 4.8.2 to gcc 5.4 (#31240) · f5186c3c

Committed by pangyoki on Apr 07, 2021

* update develop whl package name

* distinguish cpu and gpu name

* fix ref_gcc

* change whl name

* upgrade gcc 4.8 to 5.4 in ubuntu_dev

* update gcc4.8 to 5.4 in centos

* Upgrade pip from 18.0 to 20.0.1

* change 2.1.0_dev0 to 2.1.0.dev0 in gpu version
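The last item above switches the version suffix from `_dev0` to `.dev0`. The commit message gives no reason; one plausible reading (an assumption, not stated in the source) is that `2.1.0.dev0` is the normalized spelling of a developmental release under PEP 440. A minimal sketch of that normalization for this one suffix, using a hypothetical helper `normalize_dev_suffix` that is not part of Paddle:

```python
import re

# Hypothetical helper (not part of Paddle): rewrite a trailing dev-release
# suffix such as "_dev0", "-dev0", or "dev0" into the PEP 440 normal form
# ".dev0". Only the dev suffix is handled; other PEP 440 rules are ignored.
_DEV_SUFFIX = re.compile(r"[._-]?dev(\d+)$")


def normalize_dev_suffix(version: str) -> str:
    return _DEV_SUFFIX.sub(r".dev\1", version)


print(normalize_dev_suffix("2.1.0_dev0"))  # 2.1.0.dev0
print(normalize_dev_suffix("2.1.0.dev0"))  # 2.1.0.dev0
```

A version string without a dev suffix, such as `2.1.0`, passes through unchanged.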

print build summary (#32110) · e625f884

Committed by iducn on Apr 07, 2021

* print build summary

* print build summary

* print build summary

* print build summary

Struct SparseValue && Bug Fix (#31721) · a881b4d5

Committed by tangwei12 on Apr 07, 2021

* add PullSparseValue for pull sparse

* fix bug for PullSparseValue

* add test mode in lookuptable

* revert API change

* add comment for is_training

06 Apr, 2021 9 commits
- Del cudnn6 code2 (#31986) · b8b82b72
  Committed by tianshuo78520a on Apr 06, 2021
- fix fc doc (#32084) · a17c3691
  Committed by joejiong on Apr 06, 2021
- optimize compilation of operators using eigen (#31851) · 187bf412
  Committed by wuhuanzhou on Apr 06, 2021
- fix test of affine_grid with rocm (#32047) · 78af100c
  Committed by zhulei on Apr 06, 2021
```
* fix test of affine_grid with rocm

* fix test of affine_grid with rocm
```
- [PaddleTRT] Yolov3 bugfix (#32064) · b17e36a4
  Committed by zlsh80826 on Apr 06, 2021
```
* fix yolobox teller condition

* fix cuda double free bug
```
- remove pass restrictions for skip-ln pass (#32081) · 6d6ea569
  Committed by Pei Yang on Apr 06, 2021
- fix two error message (#32039) · 9e8f9037
  Committed by Kqnonrime on Apr 06, 2021
```
* fix two error message

* fix two error message

* fix error

* fix error

* fix error

* fix error
```
- [Hybrid Parallel] Add Topology for hybrid communicate (#32011) · 2e82b6c8
  Committed by ShenLiang on Apr 06, 2021
```
* support hyparallel, add topology

* fix utest
```
- [ROCM] fix the backward maxpool (#32030) · a3b08bad
  Committed by ronnywang on Apr 06, 2021

03 Apr, 2021 2 commits
- Optimize elementwise_add_grad op, test=develop (#32051) · 1e52f324
  Committed by jiangcheng on Apr 03, 2021
- delete temporary files (#32055) · 36687d7a
  Committed by WeiXin on Apr 03, 2021

02 Apr, 2021 11 commits

use busybox run test on windows openblas (#31728) · 290be88d

Committed by YUNSHEN XIE on Apr 02, 2021

* use busybox run test on windows openblas

* fix error

* fix disable_quick and nightly label issue

* add retry on windows openblas

* fix bug

* use one file to run cpu and gpu tests

* fix with grep warning

* fix syntax error

* change run_unittest to run_unittest_gpu

* Update run_unittests.sh

fix error

[3D-Parallel:Sharding] Optimizations for supporting ERNIE 3.0 training (#31884) · 69c874fd
Committed by JZ-LIANG on Apr 02, 2021

support save/load single tensor (#31756) · 43367e4b

Committed by WeiXin on Apr 02, 2021

* support save/load single tensor

* compatibility modification according to unittest

* Some python2.7 don't have 'copyreg' modules

* Handle a syntax error.

* Dealing with compatibility problems on Mac.

* Dealing with compatibility problems on Mac.

* edit unittest to improve coverage.

* Modify the code according to the review comments

* Reduce redundant code.

* support for static graph loading dygraph state_dict

* edit code according to CI

* edit unittest

* edit unittest

* delete redundant file

* edit code according to Comments

* edit english doc

* edit english doc

* edit English DOC.

* get/set_tensor->get/set_value; return_numpy=False

* get/set_tensor->get/set_value; return_numpy=False

* edit unittest

* edit unittest

* polish code.

fix decorator in py2 (#32043) · bf10d563
Committed by tianshuo78520a on Apr 02, 2021

Add more ops to calculate output scales (#32036) · cd74b207
Committed by cc on Apr 02, 2021

update plugin creator name (#32021) · ed49b418
Committed by Wilber on Apr 02, 2021

update trt engine addplugin name. (#32018) · d9187869
Committed by Wilber on Apr 02, 2021
```
* update trt engine addplugin name.

* update
```

graph engine (#31226) · 94736d60

Committed by seemingwang on Apr 02, 2021

* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segment fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node &test & change data-structure from linkedList to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return  py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundant threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>

[ROCM] fix softmax_with_cross_entropy_op (#31982) · 9e06a641
Committed by ronnywang on Apr 02, 2021

add leaky_relu forward and backward in activation_op.cu (#31841) · 4490e8af
Committed by niuliling123 on Apr 02, 2021
```
* add leaky_relu forward and backward in activation_op.cu
```
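For readers unfamiliar with the operator named in this commit, leaky ReLU passes positive inputs through unchanged and scales non-positive inputs by a small slope. The scalar sketch below shows the forward and backward rules in plain Python. The function names and the `alpha` parameter are illustrative assumptions and do not reflect the CUDA kernel in `activation_op.cu`.

```python
def leaky_relu_forward(x: float, alpha: float) -> float:
    # y = x for x > 0, otherwise y = alpha * x
    return x if x > 0 else alpha * x


def leaky_relu_backward(x: float, dout: float, alpha: float) -> float:
    # dy/dx is 1 for x > 0 and alpha otherwise; apply the chain rule with dout
    return dout if x > 0 else alpha * dout


print(leaky_relu_forward(3.0, 0.5))         # 3.0
print(leaky_relu_forward(-2.0, 0.5))        # -1.0
print(leaky_relu_backward(-2.0, 4.0, 0.5))  # 2.0
```

The backward rule needs only the sign of the input, which is why the gradient can be computed from either the input or the output when `alpha` is positive.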
fix random compile failed on windows (#32032) · 0b42f489
Committed by Zhou Wei on Apr 02, 2021

01 Apr, 2021 11 commits

delete test_data_generator (#31987) · 0e52cdfc
Committed by yaoxuefeng on Apr 01, 2021

fix typo in spawn (#32017) · df5aff80
Committed by Chen Weihang on Apr 01, 2021

[ROCM] fix depthwise conv failure on ROCM, test=develop (#31998) · a4b30a12
Committed by Qi Li on Apr 01, 2021

fix use_softmax=False does not work, test=develop · 68e7de26
Committed by chajchaj on Apr 01, 2021

fix doc problem (#32010) · 1b6c1d39
Committed by kuizhiqing on Apr 01, 2021

Support control flow in DataParallel (#31625) · 8460698b
Committed by ShenLiang on Apr 01, 2021
```
* support control flow

* support sync_parameters_buffers

* fix the bug of sparse embedding
```

fix doc of Pooling layers (#31977) · 40e6c57b

Committed by Wei Shengyu on Apr 01, 2021

* fix doc of MaxPool1D

* fix doc

* fix doc format error

* dbg

* fix doc

* dbg doc format test=document_fix

* fix format test=document_fix

* test doc

* remove - from doc

* fix indent

* remove space before bracket

* dbg format

* fix indent test=document_fix

* remove new line

* fix descrip of Shape test=document_fix

* add description for default value test=document_fix

* fix bug test=document_fix

add custom init grad for backward function (#31540) · 83b953f5

Committed by chentianyu03 on Apr 01, 2021

* add custom init grad for backward function

* add custom init grad for backward function

* handle when the grad_tensor is none

* handle when the grad_tensor is none

* fix the args type error on windows platform

* modify the args order and doc

* format code

* add grad_tensor to xpu

* modify the grad_tensor type check

* add paddle.backward api to support multi tensors gradient compute

* add paddle.backward api to support multi tensors gradient compute

* add paddle.autograd module and backward api

* change tensor.backward func args

* modify tensor backward api

* remove create_graph inputs args

* add doc and example code for backward api

* when have the same tensor, throw error

* modify test Init func args

* modify the execute.Init func args in test files

* add paddle.autograd package in setup.py.in

* modify error msg, remove _run_backward method in class Tensor

* add test cases for backward api

remove useless code (#32001) · 9c5d0286
Committed by hutuxian on Apr 01, 2021

LOG CLEAN (#31819) · 0589ed21
Committed by tangwei12 on Apr 01, 2021
```
* upgrade vlog

* train from dataset fetch optimize
```

[Paddle-TRT] add anchor generator op plugin (#31730) · b807e408

Committed by zlsh80826 on Apr 01, 2021

* add anchor generator op plugin

* add anchor generator unit_test

* remove dbg info

* remove redundant line

* replace assertion with paddle enforce

* dynamic plugin replaces assertion with paddle enforce

* anchor generator support dynamic shape on spatial axis

* anchor generator test with fp16, dynamic shape

* add anchor generator test all

* add back main

* reduce test input size to not exceed the timelimit of ci

* change super to InferencePassTest for python2 compatibility

* reuse paddle operator anchor generator

* move creator construct to header with default

* add cuda ifdef

* reduce line

* change super to InferencePassTest for python2 compatibility

* fix anchor generator fp16 serialize setting

* split unittest from test_all

* restrict anchor generator input format before version 7234

* anchor generator only support greater than trt7.1

* change min_graph_size to 2

* min_graph size to 3 if dynamic shape

* reduce dynamic shape size to avoid trt search tactic too long to exceed time limit

* remove anchor from fetch list

* anchor generator support all trt version

* fix memory not allocated but if serialized

PaddlePaddle / Paddle, synced successfully about 1 year ago
