Commits · 8caf951cd9795b9f02180236711490b203fe4ee5 · 机器未来 / Paddle

Sep 24, 2021 · 6 commits

Add paddle.linalg.solve OP (#35715) · 8caf951c

Committed by Weilong Wu on Sep 24, 2021

* Add linalg.solve op, test=develop

* Fix a bug caused by accidental deletion

* updated description and fix a bug: missing a comma

* Add linalg.solve op, test=develop

* updated solve op backward logic

* updated solve op backward logic again

* Add linalg.solve Op, test=develop

* Updated and modified to fit CI requirements

* Fix a bug

* 1) Add more test cases; 2) Fix a wrong usage in the reduce operation; 3) Remove redundant code

* Remove redundant comments

* 1) Removed redundant code; 2) Updated to enhance code robustness

* Removed redundant code

* Updated API documents

8caf951c
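
For readers unfamiliar with what a `solve` operator computes: given a square matrix A and a right-hand side b, it returns the x that satisfies Ax = b. The following is a minimal pure-Python sketch of that computation using Gaussian elimination with partial pivoting. It is an illustration only, not the kernel added in #35715, and the name `solve_linear_system` is hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b for a square matrix a (list of rows) and vector b."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution, from the last row upwards.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x
```

For example, `solve_linear_system([[3, 1], [1, 2]], [9, 8])` yields a result close to `[2.0, 3.0]`, up to floating-point rounding.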

fix distributed ops combining problems (#35942) · 4c35f515

Committed by seemingwang on Sep 24, 2021

* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segment fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node &test & change data-structure from linkedList to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return  py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundant threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table

* optimize get_feat function of graph engine

* fix long long count problem

* remove redundant graph files

* remove unused shell

* recover dropout_op_pass.h

* fix potential stack overflow when request number is too large & node add & node clear & node remove

* when sample k is larger than neighbor num, return directly

* using random seed generator of paddle to speed up

* fix bug of random sample k

* fix code style

* fix code style

* add remove graph to fleet_py.cc

* fix blocking_queue problem

* fix style

* fix

* recover capacity check

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* fix distributed op combining problems

* optimize

* remove logs
Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>

4c35f515
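
One entry in the list above reads "when sample k is larger than neighbor num, return directly". A minimal sketch of that short-circuit, assuming uniform sampling without replacement, could look as follows. The function name `sample_neighbors` and its parameters are hypothetical, and the graph engine's weighted sampler is not modelled here.

```python
import random


def sample_neighbors(neighbors, k, rng=None):
    """Return up to k distinct neighbors chosen uniformly at random."""
    rng = rng or random.Random()
    if k >= len(neighbors):
        # Nothing to choose from: hand back every neighbor without sampling.
        return list(neighbors)
    # Copy to a list so any iterable of neighbors can be sampled.
    return rng.sample(list(neighbors), k)
```

Returning early avoids drawing random numbers when the request already covers the whole neighbor list.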

fix cusparse compile bug in windows CUDA11.2, test=develop (#35941) · 7c3567ea
Committed by Liu-xiandong on Sep 24, 2021

7c3567ea

add emb_eltwise_layernorm trt converter test case (#36027) · 0bbaf9bd
Committed by baoachun on Sep 24, 2021

0bbaf9bd

add multihead_matmul trt converter test case (#36023) · fcaa64b3
Committed by baoachun on Sep 24, 2021
```
* add multihead_matmul trt converter test case

* move attribute check to op_teller
```
fcaa64b3

add the shape check for the matmul (#35791) · 8e19d1ba
Committed by wawltor on Sep 24, 2021
```
* add the shape check for the matmul

* remove the test case for the linear
```
8e19d1ba
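
The log does not say which condition the new matmul shape check enforces. The most basic one is that the inner dimensions of the two operands agree. A sketch of that check for the plain 2-D case, under that assumption and with the hypothetical name `check_matmul_shapes`:

```python
def check_matmul_shapes(x_shape, y_shape):
    """Return the output shape of a 2-D matmul, or raise if the shapes clash."""
    if len(x_shape) != 2 or len(y_shape) != 2:
        raise ValueError("this sketch only handles 2-D operands")
    m, k_x = x_shape
    k_y, n = y_shape
    # (m, k) @ (k, n) is only defined when both k values match.
    if k_x != k_y:
        raise ValueError(f"inner dimensions differ: {x_shape} vs {y_shape}")
    return (m, n)
```

Validating shapes before the kernel runs turns a potential out-of-bounds access into a readable error message.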

Sep 23, 2021 · 7 commits
- Optimize workqueue (#35931) · 4e7bd9c3
  Committed by liutiexing on Sep 23, 2021
```
* add align for WorkQueue

* WorkQueue update

* Revert "WorkQueue update"

This reverts commit 14ce793dbb204f8ddec63c34b3b72a73c7cdb93a.

* optimize WorkQueue
```
  4e7bd9c3
- fix ernie-int8 compile error on windows (#35972) · d6d2dafa
  Committed by Peihan on Sep 23, 2021
  
  d6d2dafa
- fix trt problem (#35938) · 8d0922ed
  Committed by Wilber on Sep 23, 2021
  
  8d0922ed
- Replace Eigen with Lapack library for eigvals OP kernel (#35909) · 9b8aafe5
  Committed by From00 on Sep 23, 2021
  
  9b8aafe5
- Add fused_attention_op: add impl wrappers. (#35903) · 88ea8e6f
  Committed by Li Min on Sep 23, 2021
  
  88ea8e6f
- add argmax and iou_similarity for kunlun (#35836) · 7bf84e2d
  Committed by TTerror on Sep 23, 2021
```
* add argmax and iou_similarity for kunlun

* add argmax and iou_similarity for kunlun

* add argmax and iou_similarity for kunlun
```
  7bf84e2d
- add pass_desc_py_proto depends, test=develop (#35864) · 1548407d
  Committed by wuhuanzhou on Sep 23, 2021
```
add pass_desc_py_proto depends
```
  1548407d
Sep 22, 2021 · 24 commits
- Fix copy elision warning (#35885) · 47d6bc86
  Committed by Tomasz Socha on Sep 22, 2021
```
* Fix copy elision warning

* Remove redundant code
```
  47d6bc86
- ResnetUnitOp implemented by cuDNN fused op(backend code) (#35557) · 736a7388
  Committed by Zhang Zheng on Sep 22, 2021
  
  736a7388
- move variable UPLOAD_TP_FILE to the beginning or it can't be initialized when... · 482f062d
  Committed by Sing_chan on Sep 22, 2021
```
move variable UPLOAD_TP_FILE to the beginning or it can't be initialized when running build-whl task (#35895)
```
  482f062d
- fix adamw DeprecationWarning (#35869) · f67a50bd
  Committed by zhaoyingli on Sep 22, 2021
  
  f67a50bd
- [AMP]split minimize and add unscale_ for GradScaler (#35825) · bf6f0e54
  Committed by zhangbo9674 on Sep 22, 2021
```
* split minimize() to step() + update()

* add unscale and step for grad_scaler

* add unittest

* refine code in minimize

* delete step in loss_scaler

* fix example bug

* refine comment

* refine unittest

* add unittest
```
  bf6f0e54
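
The commit above splits `minimize()` into `step()` plus `update()` and adds `unscale_`. The toy class below sketches why such a split is useful in dynamic loss scaling. It is a hypothetical pure-Python illustration, not Paddle's `GradScaler`: the class name, constructor arguments, and the list-based parameters are all assumptions made for the example.

```python
import math


class ToyGradScaler:
    """Minimal dynamic loss scaler (illustrative only, not Paddle's GradScaler)."""

    def __init__(self, init_scale=1024.0, growth=2.0, backoff=0.5):
        self.scale_value = init_scale
        self.growth = growth
        self.backoff = backoff
        self.found_inf = False
        self.unscaled = False

    def scale(self, loss):
        # Multiply the loss so small fp16 gradients do not underflow.
        return loss * self.scale_value

    def unscale_(self, grads):
        # Divide gradients in place and remember whether any is non-finite.
        for i, g in enumerate(grads):
            grads[i] = g / self.scale_value
        self.found_inf = any(not math.isfinite(g) for g in grads)
        self.unscaled = True

    def step(self, params, grads, lr):
        # Unscale lazily if the caller has not done so already.
        if not self.unscaled:
            self.unscale_(grads)
        # Skip the parameter update entirely when an overflow was seen.
        if not self.found_inf:
            for i, g in enumerate(grads):
                params[i] -= lr * g

    def update(self):
        # Shrink the scale after an overflow, otherwise grow it.
        # (A production scaler would usually grow only after several clean steps.)
        self.scale_value *= self.backoff if self.found_inf else self.growth
        self.found_inf = False
        self.unscaled = False

    def minimize(self, params, grads, lr):
        # The old one-shot entry point is just step() followed by update().
        self.step(params, grads, lr)
        self.update()
```

With the pieces exposed separately, a caller can invoke `unscale_` first, clip or inspect the true gradients, and only then call `step` and `update`.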
- [NPU] add randperm_op_npu (#35763) · 4f0c3278
  Committed by ronnywang on Sep 22, 2021
```
* add randperm_op_npu

* fix test_set_value_op_npu
```
  4f0c3278
- op:transpose_op supports bool type (#35886) · 0c6ee945
  Committed by TeslaZhao on Sep 22, 2021
```
* Pass compat of conv_transpose_bias_mkldnn_fuse_pass

* Fix a bug of strided_slice op, about the axes parameter access memory out of bounds

* Fix a bug of transpose op, about accessing memory out of bounds of the perm param

* op:transpose_op supports bool type
```
  0c6ee945
- Det &Slogdet (#34992) · 9ce45ddd
  Committed by huangxu96 on Sep 22, 2021
```
Add new API: paddle.linalg.det & paddle.linalg.slogdet

API alias: paddle.det & paddle.slogdet
```
  9ce45ddd
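
As background on the two quantities named in this commit: `det` is the determinant, while `slogdet` conventionally returns the sign and the natural log of the absolute determinant, which stays representable when the determinant itself would overflow or underflow. The following pure-Python sketch computes both via Gaussian elimination. The names `slogdet_sketch` and `det_sketch` are hypothetical and the code is not the Paddle implementation.

```python
import math


def slogdet_sketch(matrix):
    """Return (sign, log|det|) of a square matrix given as a list of rows."""
    n = len(matrix)
    rows = [list(map(float, row)) for row in matrix]
    sign, logabs = 1.0, 0.0
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if rows[pivot][col] == 0.0:
            return 0.0, float("-inf")  # singular matrix
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            sign = -sign  # a row swap flips the determinant's sign
        diag = rows[col][col]
        if diag < 0:
            sign = -sign
        # The determinant is the product of the pivots, so sum their logs.
        logabs += math.log(abs(diag))
        for row in range(col + 1, n):
            factor = rows[row][col] / diag
            for k in range(col, n):
                rows[row][k] -= factor * rows[col][k]
    return sign, logabs


def det_sketch(matrix):
    """Recover the determinant from its sign and log-magnitude."""
    sign, logabs = slogdet_sketch(matrix)
    return sign * math.exp(logabs)
```

For `[[1, 2], [3, 4]]` the sign is negative and the log-magnitude equals `log(2)`, matching a determinant of -2.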
- update paddle2onnx version to 0.8.2 in unittest_py/requirements.txt (#35837) · 00e0e358
  Committed by yeliang2258 on Sep 22, 2021
  
  00e0e358
- support ernie-int8 test and prune op attribute test (#35890) · e8789c11
  Committed by Peihan on Sep 22, 2021
```
* support ernie-int8 test and prune op attribute test

* remove using and use namespace

* remove macro and use shell instead

* Revert "remove macro and use shell instead"

This reverts commit 615964b149d7de7825b341936b42be22a4bc0091.

* fix grammar error

* fix shell error
```
  e8789c11
- add no need buffer check, test=develop (#35790) · 7ebbcbbc
  Committed by wanghuancoder on Sep 22, 2021
  
  7ebbcbbc
- refine FLAGS approval (#35904) · 7ba69249
  Committed by Zeng Jinle on Sep 22, 2021
  
  7ba69249
- [Inference] Support NNAdapter and ascend310 (#35226) · 10e53044
  Committed by JingZhuangzhuang on Sep 22, 2021
  
  10e53044
- fix: delete_quant_dequant_filter_op_pass, delete_quant_dequant_op_pass (#35879) · 5cda6b2b
  Committed by Wangzheee on Sep 22, 2021
  
  5cda6b2b
- fix conv2d convert test (#35627) · 1238115e
  Committed by JingZhuangzhuang on Sep 21, 2021
```
* support nnadapter and ascend310

* modify code

* add anchor_generator convert test

* add gelu convert test

* add conv2d convert test

* modify anchor_operator convert test

* modify conv2d test

* modify conv2d convert test

* modify conv2d convert test

* modify conv2d convert test

* modify conv2d test

* fix WITH_PYTHON compile error

* modify test file

* modify test file

* modify test file

* modify test file

* modify test file

* modify test file

* modify test file

* modify test file
Co-authored-by: xiaoxiaohehe001 <hiteezsf@163.com>
Co-authored-by: jiweibo <jiweibo@baidu.com>
```
  1238115e
- Add quant2 int8 lstm model test (#35887) · be4d0026
  Committed by joanna.wozna.intel on Sep 22, 2021
  
  be4d0026
- fix feed for new executor (#35803) · 4c2a06df
  Committed by wanghuancoder on Sep 21, 2021
```
* fix feed, test=develop

* delete one test case, test=develop
```
  4c2a06df
- add timeline(recordevent) for new executor, test=develop (#35831) · 5574c8cf
  Committed by wanghuancoder on Sep 21, 2021
  
  5574c8cf
- refine gc for new_executor (#35764) · fab1a029
  Committed by wanghuancoder on Sep 21, 2021
```
* refine gc for new_executor, test=develop

* refine, test=develop

* refine, test=develop

* merge, test=develop
```
  fab1a029
- Modify H2D and D2H as kQueue::Sync and Polish Schedule logic (#35866) · fe35496b
  Committed by Aurelius84 on Sep 22, 2021
```
* Modify H2D and D2H as kQueue::Sync

* fix interface error
```
  fe35496b
- [2.2]support extern third_party lapack API on Linux/Windows/Mac (#35690) · ae65257d
  Committed by zhouweiwei2014 on Sep 22, 2021
```
* support extern third_party lapack on Linux/Windows/Mac

* fix ci
```
  ae65257d
- disable tests for fft on windows with gpu (#35872) · 5af6081a
  Committed by Feiyu Chan on Sep 22, 2021
  
  5af6081a
- fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (#35862) · 12ab017e
  Committed by zhangbo9674 on Sep 22, 2021
  
  12ab017e
- add dilation check for conv (#35838) · 77134300
  Committed by wangguanzhong on Sep 22, 2021
  
  77134300
Sep 21, 2021 · 2 commits

support fp16 (#35888) · 087c23a9

Committed by Guoxia Wang on Sep 21, 2021

087c23a9

Reuse OneDNN handler for SGD and SUM for SelectedRows input tensors. (#35510) · 799f3861

Committed by Adam Osewski on Sep 20, 2021

* Create stateful OneDNNAXPYHandler object.

This makes it possible to call it multiple times without recreating the
oneDNN primitives every time.

* Prepare SGDOpKernel to reuse its implementation from OneDNN kernel.

* OneDNN SGD kernel.

* Update call to use new OneDNNAXPYHandler object api.

* Setup seed in proper place.

* Enable OneDNN kernel only for single case.

* For dense param and sparse grad.

* Small refactor.

* Enable oneDNN by op attr or by cmd line flag.

* Use int64_t type for number of elements.

* Support dense param and grad from OneDNN kernel.

* Enable SGD OneDNN kernel when use MP BF16 optimizer.

* Force non-copyable/movable OneDNNAXPYHandler.

* Reuse OneDNNAXPYHandler for sparse tensors in SUM op.

* Fix SFINAE rules.

* Remove recording event inside AXPY.

* Get rid of internal primitive caching.

* Stop using PP cache mechanism to store mem and primitive obj.
* Handler obj store and reuse needed desc & prim

* Do not derive from MKLDNNHandlerT

799f3861

Sep 19, 2021 · 1 commit

Optimization of pool2d grad (#35389) · 86685190

Committed by limingshu on Sep 19, 2021

* Optimization of pool2d grad, first commit.

* remove useless print codes

* refine codes

* refine codes

* seal more operation into template specialization

* fix template struct error in MaxPool2dGrad.

* Fix header including error

* refine code with comment

* Seal the param-preparation codes into function for common use.

* Seal the param-preparation codes into function for common use.

* Seal the param-preparation into function and make it common for other kernels

* polish code and erase useless template specialization

* Rerun trigger

* rerun trigger

86685190
