Commits · 2e4cb27927a3ea0f58b25d534e90ac68989e8897 · PaddlePaddle / Paddle

28 Dec, 2021 9 commits
- fix ci problem (#38474) · 2e4cb279
  Committed by Wilber on Dec 28, 2021
- Add API and op for take_along_axis (#38396) · 3310f519
  Committed by huangxu96 on Dec 28, 2021
```
* add API and op for take_along_axis

* fix compile dependency problem and add example code and doc

* add unitest

* delete some code for CI coverage

* fix code style problem

* fix as review
```
- Add Amax and Amin API (#38417) · 340dfb26
  Committed by Tao Luo on Dec 28, 2021
```
* add amax/amin

* support axis is list
```
- [pten] remove in_type arg in cast kernel (#38486) · 0637b9a6
  Committed by chentianyu03 on Dec 28, 2021
```
* remove intype arg in cast kernel

* modify conj config in api.yaml by dictionary order

* rm unused code in cast_kernel.cu
```
- add reduce_prod_xpu. fix reduce_mean_xpu bug. (#38481) · 78836bb7
  Committed by houj04 on Dec 28, 2021
```
* add reduce_prod_xpu. fix reduce_mean_xpu bug.

* iadd reduce_prod_xpu. fix reduce_mean_xpu bug. test=kunlun
```
- add mul_lstm_fuse_pass ut (#37795) · 1db61c3e
  Committed by baoachun on Dec 28, 2021
```
* add mul_lstm_fuse_pass ut

* update mul_lstm_fuse_pass ut

* update ut

* update ut

* update ut

* add CPU ut cmake setting

* update ut
```
- add pass base unittest (#38504) · ee5f3641
  Committed by zhaoyingli on Dec 28, 2021
```
* add pass base unittest

* update gpt model
```
- fix compile dir conflict with include_dirs (#38479) · e42ed7d1
  Committed by sneaxiy on Dec 28, 2021
- Fix scatter_op fp16 perf problem. (#38499) · 33ce249f
  Committed by Li Min on Dec 28, 2021
```
* Fix scatter_op fp16 perf problem.

* Add scatter into black list.

* Add scatter into black list for dygraph.
```
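
For readers unfamiliar with the operation named in the `take_along_axis` commit (#38396) above, the following is a minimal pure-Python sketch of the gather semantics, modelled on NumPy's `take_along_axis` and restricted to 2-D nested lists without broadcasting. The helper name `take_along_axis_2d` is hypothetical; this log does not show the signature or behaviour of the Paddle op itself, so treat this only as an illustration of the general idea.

```python
def take_along_axis_2d(arr, indices, axis):
    """Gather values from a 2-D nested list `arr` along `axis`.

    Illustrative only: `indices` must be 2-D, and along the non-gather
    dimension it must have the same length as `arr` (no broadcasting).
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1 for a 2-D input")
    out = []
    for i, index_row in enumerate(indices):
        out_row = []
        for j, idx in enumerate(index_row):
            if axis == 0:
                # Pick the row given by idx, keep the current column j.
                out_row.append(arr[idx][j])
            else:
                # Keep the current row i, pick the column given by idx.
                out_row.append(arr[i][idx])
        out.append(out_row)
    return out


x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
picked_rows = take_along_axis_2d(x, [[0, 2, 1]], axis=0)
picked_cols = take_along_axis_2d(x, [[2], [0], [1]], axis=1)
print(picked_rows)  # [[1, 8, 6]]
print(picked_cols)  # [[3], [4], [8]]
```
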
27 Dec, 2021 9 commits

fix english doc of some API (#38468) · 5b6b88ab
Committed by zhouweiwei2014 on Dec 27, 2021

fix bugs in fp16 for dp (#38405) · 1ab5c511
Committed by ShenLiang on Dec 27, 2021

fix accumulator bug when multiple inplace OPs are executed continuously (#38406) · 113c8b93
Committed by pangyoki on Dec 27, 2021
```
* fix accumulator bug

* fix unittest
```

Refine clip_by_global_norm (#38209) · 65f7fa0d
Committed by zhangbo9674 on Dec 27, 2021
```
* refine clip

* delete unused code

* refine logic for clip
```
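
As background for the `clip_by_global_norm` commit above, here is a minimal sketch of the standard global-norm clipping rule: compute one L2 norm over all gradients, then scale every gradient by `clip_norm / max(global_norm, clip_norm)`. It uses plain Python lists, and the function name and argument names are illustrative assumptions. It is not the refined Paddle implementation, which this log does not show.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Scale a list of gradient vectors so their joint L2 norm is <= clip_norm.

    Illustrative sketch of the textbook rule, not Paddle's code.
    Returns the scaled gradients and the global norm before clipping.
    """
    # One norm over every element of every gradient.
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The scale is 1.0 when the norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, global_norm


grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
print(norm)  # 5.0
print(clipped)  # each element scaled by 1.0 / 5.0
```

Because a single scale factor is applied to all gradients, their relative directions are preserved; only the overall magnitude is limited.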

update mkldnn matmul_transpose_reshape fuse pass ut (#38467) · 9cfdae91
Committed by baoachun on Dec 27, 2021

add matmulv2_transpose_reshape_pass ut (#37416) · f664a533
Committed by baoachun on Dec 27, 2021

* update mkldnn matmul_v2_transpose_reshape_fuse_pass ut

* update mkldnn matmul_v2_transpose_reshape_fuse_pass ut

* update ut

* update ut

fix renorm (#38459) · b0c7144a
Committed by seemingwang on Dec 27, 2021

* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segment fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node &test & change data-structure from linkedList to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return  py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundent threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table

* optimize get_feat function of graph engine

* fix long long count problem

* remove redandunt graph files

* remove unused shell

* recover dropout_op_pass.h

* fix potential stack overflow when request number is too large & node add & node clear & node remove

* when sample k is larger than neigbor num, return directly

* using random seed generator of paddle to speed up

* fix bug of random sample k

* fix code style

* fix code style

* add remove graph to fleet_py.cc

* fix blocking_queue problem

* fix style

* fix

* recover capacity check

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* fix distributed op combining problems

* optimize

* remove logs

* fix MultiSlotDataGenerator error

* cache for graph engine

* fix type compare error

* more test&fix thread terminating problem

* remove header

* change time interval of shrink

* use cache when sample nodes

* remove unused function

* change unique_ptr to shared_ptr

* simplify cache template

* cache api on client

* fix

* reduce sample threads when cache is not used

* reduce cache memory

* cache optimization

* remove test function

* remove extra fetch function

* graph-engine data transfer optimization

* support graph_split load&query

* remove logs

* change shards to pointer vector

* use inference

* remove test code

* renorm op

* simplify renorm op

* recover local changes

* recover renorm op kernel

* fix init

* add blanklines in renorm doc

* fix import

* fix import

* add renorm to init.py

Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>

refine CUDA Graph (#38401) · 5f7e4a21
Committed by sneaxiy on Dec 27, 2021

[AMP] Fix amp.decorate bug: parameters for non leaf layers cannot be decotated (#38402) · 5d902954
Committed by zhangbo9674 on Dec 27, 2021
```
* fix bug

* refine code

* refine code

* refine code
```

24 Dec, 2021 15 commits

add nansum api to math (#38137) · 6554cc10
Committed by wangguanqun on Dec 24, 2021

* add nansum api

* delete layerhelper

* add nansum to all and tensor_method_func

* update doc

* update doc

* update doc

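To illustrate what a `nansum` reduction conventionally computes (a sum that treats NaN entries as zero), here is a minimal pure-Python sketch over a flat list of floats. The function below is a stand-in written for this note; the signature and axis handling of the Paddle API added in the commit above are not shown in this log and are not reproduced here.

```python
import math


def nansum(values):
    """Sum a flat list of floats, skipping NaN entries.

    Illustrative only; an all-NaN or empty input sums to 0.0.
    """
    return sum((v for v in values if not math.isnan(v)), 0.0)


data = [1.0, float("nan"), 2.5, float("nan")]
print(nansum(data))            # 3.5
print(nansum([float("nan")]))  # 0.0
print(sum(data))               # nan, the ordinary sum propagates NaN
```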

renorm op (#38130) · 6982871d
Committed by seemingwang on Dec 24, 2021

* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segment fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node &test & change data-structure from linkedList to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return  py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundent threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table

* optimize get_feat function of graph engine

* fix long long count problem

* remove redandunt graph files

* remove unused shell

* recover dropout_op_pass.h

* fix potential stack overflow when request number is too large & node add & node clear & node remove

* when sample k is larger than neigbor num, return directly

* using random seed generator of paddle to speed up

* fix bug of random sample k

* fix code style

* fix code style

* add remove graph to fleet_py.cc

* fix blocking_queue problem

* fix style

* fix

* recover capacity check

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* fix distributed op combining problems

* optimize

* remove logs

* fix MultiSlotDataGenerator error

* cache for graph engine

* fix type compare error

* more test&fix thread terminating problem

* remove header

* change time interval of shrink

* use cache when sample nodes

* remove unused function

* change unique_ptr to shared_ptr

* simplify cache template

* cache api on client

* fix

* reduce sample threads when cache is not used

* reduce cache memory

* cache optimization

* remove test function

* remove extra fetch function

* graph-engine data transfer optimization

* support graph_split load&query

* remove logs

* change shards to pointer vector

* use inference

* remove test code

* renorm op

* simplify renorm op

* recover local changes

* recover renorm op kernel

* fix init

* add blanklines in renorm doc

* fix import

* fix import

Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>

add gradient unittest and update code example for max/min (#38393) · ee69f437
Committed by Tao Luo on Dec 24, 2021
```
* add gradient unittest and update code example for max/min

* update docs

* remove _get_reduce_all_value
```

[AMP] Add multi_precision for sgd (#38231) · a4d07bb9
Committed by zhangbo9674 on Dec 24, 2021

set env for test_standalone_executor (#38430) · 5ab6ebaf
Committed by Leo Chen on Dec 24, 2021

[Auto Paralle] partitioner refactor (#37853) · c4fdb057
Committed by JZ-LIANG on Dec 24, 2021

new API inner&outer (#37706) · b463dff4
Committed by zhiboniu on Dec 24, 2021

add new API/OP:paddle.Tensor.exponential_ (#38256) · 33185000
Committed by zhouweiwei2014 on Dec 24, 2021
```
* add new API/OP:paddle.Tensor.exponential_

* fix CI
```

add pull gpups sparse op (#37124) · 572b3e90
Committed by yaoxuefeng on Dec 24, 2021
```
 add pull gpups sparse op
```

Add new API cholesky_solve (#38167) · 39f7c41f
Committed by zhiboniu on Dec 24, 2021

add new API/OP: paddle.poisson (#38117) · bcf86e5c
Committed by zhouweiwei2014 on Dec 24, 2021
```
* add new API/OP:paddle.poisson

* fix comment
```

[Dy2stat]Fix error when calling sublayer's non-forward func in dy2stat (#37296) · 7339a124
Committed by 0x45f on Dec 24, 2021

* fix error when calling sublayer's non-forward func in dy2stat

* fix circular import using an inelegant way

* deal with parameters

* remove param_guard in __call__

* remove comment

* fix error when jit.load

* rename block var

* remove wrong code

* add unit test

[Dy2Stat]Consider InputSpec.name to calculate Cachekey hash id (#38273) · 8e6d5d2b
Committed by Aurelius84 on Dec 24, 2021
```
* Consider InputSpec.name to calculate Cachekey hash id

* fix function
```

add conv+hard_sigmoid and conv+hard_swish fuse pass ut (#37553) · a858326a
Committed by baoachun on Dec 24, 2021

* add conv+hard_sigmoid fuse pass ut

* update conv_elementwise_add_mkldnn_fuse_pass ut

* update conv_hard_sigmoid_mkldnn_fuse_pass ut

* update conv+hard_sigmoid and conv+hard_swish fuse pass ut

* update ut

* update ut

Support test imperative basic in eager (#38313) · d48f7c89
Committed by Jiabin Yang on Dec 24, 2021

* Rearranged Eager AutoCodeGen directory structure

* Removed USE_OP in Eager AutoCodeGen

* Enabled generation for Operators without Grad/Inputs/Outputs

* Resolved operators without input

* Fixed merge conflicts

* Enabled Eager AutoCodeGen for 10+ more operators

* Refactored Eager AutoCodeGen with more organized helper objects

* Enabled Eager AutoCodeGen for operators with multiple OpBases

* Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument

* Handled Dispensable Inputs/Outputs in Eager AutoCodeGen

* Adjusted function generation/call between Python-C API & Dygraph API

* Synchronized auto-generated Python-C API with Dygraph Forward Functions

* support more eager tensor api

* fix merge compile error

* fix compile error and fit develop code

* support pure CPU

* fix some logic error in eager_mode

* support _varbase_creator in eager mode

* Added safe_initialized interface to EagerTensor for use in processing dispensable inputs

* for eager mode

* refine

* support multiple constructor for eager tensor

* add place related code

* polish code

* specific randint with dtype of int64

* Support pure cpu test

* eager logic

* refine test in pure cpu

* eager logic

* eager logic

* eager logic, test=develop

* skip core.eager when in inference, test=develop

* refine, test=develop

* refine, test=develop

* call RetainGrad after run forward kernel, test=develop

* refine, test=develop

* support dygraph util, meta, guard test

* support inference test

* refine test and fix initializer failed

Co-authored-by: jim19930609 <jim19930609@gmail.com>
Co-authored-by: Wang Huan <wanghuan29@baidu.com>


23 Dec, 2021 7 commits

move distribution.py into distribution package and split into different file... · a3e6f18c
Committed by Xiaoxu Chen on Dec 23, 2021
```
move distribution.py into distribution package and split into different file for better scalability (#38047)
```

add control/status API (#37885) · 21b7ed3e
Committed by wuhuanzhou on Dec 23, 2021

* add control/status API, test=develop

* fix import error, test=develop

* add is_grad_enabled unittest, test=develop

* add code comment for example code and API, test=develop

* add checking for type, test=develop

* add api description, test=develop

* fix docs index_en, test=document_fix

* fix doc of is_floating_point, test=document_fix

Add erfinv API (#38295) · 6b59b58c
Committed by wuhuanzhou on Dec 23, 2021

* add erfinv API, test=develop

* fix gradient accuracy error, test=develop

* fix cuda compilation error on Windows, test=develop

* fix M_2_SQRTPI undeclared identifier on Windows, test=develop

【PTen】Add empty and empty_like kernel in pten (#38334) · 4221cd33
Committed by zyfncg on Dec 23, 2021
```
* add empty and empty_like kernel in pten

* add empty dev_api
```

add mkldnn conv_elementwise_add_mkldnn_fuse_pass ut (#37612) · f88065d3
Committed by baoachun on Dec 23, 2021

* add mkldnn conv_elementwise_add_mkldnn_fuse_pass ut

* update mkldnn conv_elementwise_add_mkldnn_fuse_pass ut

* update conv_elementwise_add_mkldnn_fuse_pass ut

* update conv_elementwise_add_mkldnn_fuse_pass ut

* update conv_elementwise_add_mkldnn_fuse_pass ut

* restrict conv2d data_format in conv_elementwise_add_mkldnn_fuse_pass

* update conv_elementwise_add_mkldnn_fuse_pass OpCompat

* update conv_elementwise_add_mkldnn_fuse_pass ut

* update ut

Fixed corner case in fill_constant (#38284) · 4e4d58b3
Committed by Siming Dai on Dec 23, 2021

add new API: paddle.clone;Tensor.element_size;nn.utils.parameters_to_vector (#38020) · 0eb03ed7
Committed by zhouweiwei2014 on Dec 23, 2021
```
* add new API: paddle.clone;Tensor.element_size;nn.utils.parameters_to_vector

* fix comment
```

PaddlePaddle / Paddle synced successfully about 1 year ago
