Commits · 63f5c2d4624d0df5d9934ad7965ede7740a06c5c · PaddlePaddle / Paddle

11 Nov, 2021 17 commits

[Bug fixes] Add default arg to enhance varbase ClearGradient func (#36837) · 63f5c2d4

Committed by Weilong Wu on Nov 11, 2021

* Add default arg to enhance varbase ClearGradient func

* Removed default arg, use a Flag to enhance varbase ClearGradient func

* Renamed Flags to FLAGS_real_release

* Use default arg to enhance varbase ClearGradient func and expose two func to set/get gradient isEmpty

* Removed DECLARE_bool statement

* Polished Code

63f5c2d4

Add test property RUN_TYPE=CINN (#37114) · 7a0cc0a9

Committed by Huihuang Zheng on Nov 11, 2021

Add test property RUN_TYPE=CINN to CINN unit tests. It will restrict Paddle-CINN CI to run these unit tests only.

7a0cc0a9

[fleet_executor] interceptor send message through message_bus (#37106) · 8cdd5564
Committed by WangXi on Nov 11, 2021

8cdd5564

add where/where_index/masked_select for kunlun (#37053) · f5e7b02a
Committed by TTerror on Nov 11, 2021
```
* add where/where_index/masked_select for kunlun

* fix where/where_index

* update where/masked_select
```
f5e7b02a

Added softplus + activation oneDNN fuse pass (#36657) · a346c4dc

Committed by jakpiase on Nov 11, 2021

* added softplus + activation fuse plass

* minor change

* implemented reviewer suggestion

* minor fix

* minor fix

* added scale_out parameter

* minor fix

* fix for iScan CI

* conditionally disabled logs

* refactored pass builder

a346c4dc

fleet support elastic scale up/down (#36684) · 6af531b7

Committed by xiayanming on Nov 11, 2021

* fleet support elastic train

* fleet support elastic train

* support elastic

* add unittest

* fix unitest bug

* fix unittest bug

* fix unittest bug

* fix unittest coverage

* fix unittest coverage

* fix unittest coverage

* fix unittest coverage

* fix unittest coverage

* fix elastic bug

* fix ci fail

* fix ci fail

* fix elastic bug

* fix elastic bug

* fix joint debugging bug

* fix joint debugging bug

* fix windows ci failed

* fix windows ci failed

6af531b7

Add macro required by CINN. (#37066) · 9a9345fa
Committed by Zhen Wang on Nov 11, 2021
```
* Add macro required by CINN.

* Remove CMAKE_BUILD_TYPE form cinn.cmake.
```
9a9345fa

[Heterps]Refactor Heter Pipeline Parameter Server (#36845) · a2da1efa

Committed by zmx on Nov 11, 2021

* change username

* fix

* fix

* fix

* fix

* fix

* update

* update

* update unittests

* fix

* update

* fix

* update

* fix

* fix

* fix

* update

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update send_and_recv op. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* update. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix ut. test=develop

* fix unit. notest,test=coverage

* fix ut. notest, test=coverage

* update. notest,test=coverage

* fix ut. notest, test=coverage

* fix ut. notest, test=coverage

* fix. notest, test=coverage

* fix. notest, test=coverage

* fix ut. notest, test=coverage

* fix ut. notest, test=coverage

* fix ut. notest, test=coverage

* fix ut. notest, test=coverage

* add func. notest, test=coverage

* fix ut. notest, test=coverage

* fix. test=develop

* fix. test=develop

a2da1efa

[New features] Support VarBase to expose func (#36965) · 52645667

Committed by Weilong Wu on Nov 11, 2021

* Expose func for varbase

* Expose func for varbase and enhance varbase init func

* Change func name and add test case for _CopyGradientWith

* Rename func

* Add test cases to increase coverage

* Refine the logic of _to func

* Replace numel() with _numel(), Add test code

52645667

op_teller: add all convert_op to int8 (#37099) · 1580eae2
Committed by Wangzheee on Nov 11, 2021

1580eae2

reduce sample threads (#37098) · 1ecaa793

Committed by seemingwang on Nov 11, 2021

* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segment fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node &test & change data-structure from linkedList to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return  py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundent threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table

* optimize get_feat function of graph engine

* fix long long count problem

* remove redandunt graph files

* remove unused shell

* recover dropout_op_pass.h

* fix potential stack overflow when request number is too large & node add & node clear & node remove

* when sample k is larger than neigbor num, return directly

* using random seed generator of paddle to speed up

* fix bug of random sample k

* fix code style

* fix code style

* add remove graph to fleet_py.cc

* fix blocking_queue problem

* fix style

* fix

* recover capacity check

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* fix distributed op combining problems

* optimize

* remove logs

* fix MultiSlotDataGenerator error

* cache for graph engine

* fix type compare error

* more test&fix thread terminating problem

* remove header

* change time interval of shrink

* use cache when sample nodes

* remove unused function

* change unique_ptr to shared_ptr

* simplify cache template

* cache api on client

* fix

* reduce sample threads when cache is not used

Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>

1ecaa793

Get global cluster information (#37084) · 31673a92
Committed by LiYuRio on Nov 11, 2021

31673a92

update ut (#37089) · 6c183a8e
Committed by Wilber on Nov 11, 2021

6c183a8e

Add Sync Machanism for Scope and VaraibleScope. Fix test_fetch_var (#37085) · 32c3e61b
Committed by xiongkun on Nov 11, 2021

32c3e61b

fix 2 bug: 1.skip lodtensorarray; 2.delete feed op (#37090) · d5df6bdf
Committed by wanghuancoder on Nov 11, 2021
```
* fix 2 bug: 1.skip lodtensorarray; 2.delete feed op, test=develop

* program clone, test=develop
```
d5df6bdf

[PaddlePaddle Hackathon] add WideResNet (#36952) · 8395f573
Committed by Nyakku Shigure on Nov 11, 2021
```
* add wide resnet
* update pretrained weights link
```
8395f573

Enable FC int8 (#37078) · 498dbfa8
Committed by Jacek Czaja on Nov 10, 2021

498dbfa8

10 Nov, 2021 14 commits
- Added stack FP32 FWD oneDNN kernel (#37002) · 99f9224c
  Committed by jakpiase on Nov 10, 2021
```
* added stack oneDNN FP32 op

* minor change

* CI fix

* added skipping for gpus

* fix for stack op

* CI fix

* CI fix

* Added comment

* CI fix
```
  99f9224c
- [FleetExecutor]Add interceptor message handle (#37093) · 643fd2f4
  Committed by WangXi on Nov 10, 2021
  
  643fd2f4
- Fix inner_program in Executor (#37083) · 8a2ce0f2
  Committed by Aurelius84 on Nov 10, 2021
  
  8a2ce0f2
- fix recurrent_grad tmp variable@GRAD don't exsit in VariableScope (#37061) · 81cfbddc
  Committed by xiongkun on Nov 10, 2021
  
  81cfbddc
- [fleet_executor] Add retry to the message bus's send. Use unique_lock instead... · f5caf9c5
  Committed by Yuang Liu on Nov 10, 2021
```
[fleet_executor] Add retry to the message bus's send. Use unique_lock instead of calling lock(). (#37087)

* use unique lock, add retry

* bug fix
```
  f5caf9c5
- Add libcinnapi.so to setup.py.in (#37068) · b4e25436
  Committed by Huihuang Zheng on Nov 10, 2021
```
Add libcinnapi.so to setup.py.in
```
  b4e25436
- Simplify constructor of InterpreterCore (#37072) · 8b2c906a
  Committed by Aurelius84 on Nov 10, 2021
```
* Simplify constructor of InterpreterCore

* fix bool

* clean code
```
  8b2c906a
- [PTen] Compatible runtime performance optimization (#36946) · 76d2fd1d
  Committed by Chen Weihang on Nov 10, 2021
```
* resolve conflit with develop

* cache kernel context in tracer for perf up

* replace densetensor when build kernel context

* fix detail compile error

* append impl to static mode

* fix conflit error

* clear attrs after run kernel

* fix coverage failed

* fix cycle compile error

* remove multi-in&out adapt code

* remove tensor meta utils

* clear data when throw exception
```
  76d2fd1d
- Fix fused_attention_op scope. (#37065) · ad44a40c
  Committed by Li Min on Nov 10, 2021
```
att, bug fix
```
  ad44a40c
- fix multihead_matmul ut for tensorrt6 (#37073) · 48d53cfc
  Committed by baoachun on Nov 10, 2021
  
  48d53cfc
- Fix rnn grad bug in cpu when dropout is zero (#37080) · 211940eb
  Committed by Jack Zhou on Nov 10, 2021
```
* fix rnn grad bug when num_layers is set 2 and dropout_prob is set 0

* add more test for rnn
```
  211940eb
- [fleet_executor] Implementation of the message bus, the carrier and part of... · 072e7801
  Committed by Yuang Liu on Nov 10, 2021
```
[fleet_executor] Implementation of the message bus, the carrier and part of the interceptor (#37049)
```
  072e7801
- cancle threadpool before deconstruction interpretorcore (#37034) · f0c77378
  Committed by wanghuancoder on Nov 10, 2021
```
* cancle thread when exit, test=develop

* gc to unique_ptr, test=develop

* refine, test=develop

* fix namespace, test=develop
```
  f0c77378
- fix brpc dependences (#37064) · c9763006
  Committed by LiYuRio on Nov 10, 2021
  
  c9763006
09 Nov, 2021 9 commits
- fix bugs when build in windows with_inference_api_test=on (#36973) · fd15477f
  Committed by Sing_chan on Nov 09, 2021
  
  fd15477f
- Refactor InterpretorCore and Modify into BlockDesc (#37056) · a6e99dc7
  Committed by Aurelius84 on Nov 09, 2021
  
  a6e99dc7
- Refine param conversion logic in layer.to (#36862) · 993ec76a
  Committed by zhangbo9674 on Nov 09, 2021
```
* refine layer to

* delete comment

* refine logic

* refine code

* refine pure_fp16_init

* refine comment
```
  993ec76a
- optimize backward (#37055) · aac00f6a
  Committed by Haohongxiang on Nov 09, 2021
  
  aac00f6a
- PR to Add Paddle-CINN CI (#36989) · 71816707
  Committed by Huihuang Zheng on Nov 09, 2021
```
PR to Add Paddle-CINN CI
```
  71816707
- fix CompileProgram in Executor (#37036) · 77a8c94b
  Committed by Aurelius84 on Nov 09, 2021
  
  77a8c94b
- delete profiler.cuda_profiler (#36524) · d817388e
  Committed by wanghuancoder on Nov 09, 2021
```
* delete profiler.cuda_profiler, test=develop

* delete nvprof, test=develop

* add required: gpu, test=develop

* remove cuda_profiler, test=develop
```
  d817388e
- Try to fix CUDA Graph H2D copy bug (#36987) · 2a143f84
  Committed by Zeng Jinle on Nov 09, 2021
```
* try to fix CUDA Graph H2D copy bug

* remove useless code

* fix ci

* fix ROCM CI

* fix CUDA_VERSION

* improve CI coverage
```
  2a143f84
- add gather_nd/tile op for kunlun (#37029) · 819b9589
  Committed by TTerror on Nov 09, 2021
  
  819b9589

PaddlePaddle / Paddle: last synced successfully more than 1 year ago