Commits · 79800978d923643249fde02840708edc13a0f2a6 · BaiXuePrincess / Paddle

23 Nov, 2021 18 commits
- [XPU] Reorganize xpu device codes in platform, test=develop (#37428) · 79800978
  Committed by Qi Li on Nov 23, 2021
```
* [XPU] Reorganize xpu device codes in platform, test=develop

* fix xpu_header.h, test=develop
```
- Add support bias is none for fused_attention op. (#37411) · 1a8786cf
  Committed by Li Min on Nov 23, 2021
```
Add support for bias is none for fused_attention op.
```
- set feed var skip inplace, test=develop (#37467) · 4812eda5
  Committed by wanghuancoder on Nov 23, 2021
- [fleet_executor] Update with collective (#37462) · df14dbf0
  Committed by Yuang Liu on Nov 23, 2021
- use ShareBufferWith instead of ShareDataWith for ops with view mechanism (#37464) · 81349970
  Committed by Feiyu Chan on Nov 23, 2021
- fix problem of dcnv2 trt (#37345) · e91141fb
  Committed by wangxinxin08 on Nov 23, 2021
```
* modify code about fp16 of dcnv2 trt
```
- Removed debug code (#37447) · 586bafbd
  Committed by Zhanlue Yang on Nov 23, 2021
- [new-exec] sync scope and variable_scope when init executor (#37445) · 33653195
  Committed by Leo Chen on Nov 23, 2021
```
* sync scope and variable_scope when init executor

* set var_desc for new var
```
- fix test_egr_ds_auotgrad_meta compile failed (#37459) · 399ddf99
  Committed by Chen Weihang on Nov 23, 2021
- [Paddle Inference] Fix_nearest: align_corners != true (#37368) · bc150edc
  Committed by Wangzheee on Nov 23, 2021
```
* fix_nearest

* fix_nearest

* fix_nearest

* fix_nearest
```
- fix CMakeLists. test=develop (#37454) · ccad31f5
  Committed by zmx on Nov 23, 2021
- Enhance the error message of scatter op (#37429) · 11b17c88
  Committed by sneaxiy on Nov 23, 2021
```
* enhance scatter err msg check

* fix ci error
```
- [PTen]Elementwise_div Kernel Refactor (#37418) · 32d9beef
  Committed by YuanRisheng on Nov 23, 2021
```
* elementwise_div refactor

* fix compile bugs in windows ci
```
- Refactor dygraph to eager -- Autograd info (#37406) · c5ad3d06
  Committed by Jiabin Yang on Nov 23, 2021
```
* Add EagerTensor and tests

* remove useless enforce

* remove comment in cmake

* support autograd meta

* support grad node info test

* support grad_node_info

* add more edge test

* remove Python.h

* refine error code

* add error type in error msg

* given default null name for tensor
```
- [NPU] Added HCCL backend support in dygraph mode (#36285) · 83e55cff
  Committed by ronnywang on Nov 23, 2021
```
* Added HCCL backend support in dynamic graph mode

* fix segmentation fault

* add ut
```
- Bug fix for snapshotting VariableWrapper with initialized tensor but e… (#37410) · e58ac121
  Committed by Zhanlue Yang on Nov 23, 2021
```
* Bug fix for snapshotting VariableWrapper with initialized tensor but empty allocation

* Added unittest for inplace&clear_gradient
```
- [PTen] Adapt to inference api dir for pten (#37415) · 73f4601d
  Committed by Chen Weihang on Nov 22, 2021
```
* adapt to inference api dir for pten

* fix conflict with develop

* fix test_egr_ds_eager_tensor compile failed
```
- [NewExe] Support layout/dtype transform by adding transfer_layout/transfer_dtype op (#37299) · 2a1f009e
  Committed by Aurelius84 on Nov 23, 2021
```
* Add transfer_layout/dtype op

* clean useless codes

* fix unused var

* add optest in white.txt

* split into data_transfer.cc

* fix cmake

* modify according to reviewer comments

* replace cast_op with transfer_dtype_op
```
22 Nov, 2021 13 commits

- disable copying of datatype when sharing buffer between two tensors. (#37247) · 9ec1432d
  Committed by Feiyu Chan on Nov 22, 2021
```
* disable copying of datatype when sharing buffer between two tensors.
* fix for mkldnn operator kernels (elementwise_add, sum, softplus, softmax, scale, activation), manually set the data type when reusing memory by ShareBufferWith.
```

- Add isclose op (#37135) · d2200e97
  Committed by andyjpaddle on Nov 22, 2021
```
* add isclose op, test=develop

* add isclose op, test=develop

* add isclose api, test=develop

* rm useless code

* rm useless code

* update python api of isclose

* add some unittest of isclose op, test=develop
```
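For context on the isclose commit above: a closeness test of this kind is conventionally defined elementwise as |x − y| ≤ atol + rtol · |y|. The pure-Python sketch below only illustrates that convention. The name `isclose_sketch`, the default tolerances, and the NaN/infinity handling are assumptions made for illustration, not details taken from the commit.

```python
import math


def isclose_sketch(x, y, rtol=1e-05, atol=1e-08, equal_nan=False):
    """Elementwise closeness test over two equal-length sequences of floats.

    Hypothetical helper: mirrors the common |x - y| <= atol + rtol * |y| rule.
    """
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    out = []
    for a, b in zip(x, y):
        if math.isnan(a) or math.isnan(b):
            # NaN only matches NaN, and only when the caller opts in.
            out.append(equal_nan and math.isnan(a) and math.isnan(b))
        elif math.isinf(a) or math.isinf(b):
            # Infinities are close only to an infinity of the same sign.
            out.append(a == b)
        else:
            out.append(abs(a - b) <= atol + rtol * abs(b))
    return out
```

Note that the rule is asymmetric: the relative tolerance scales with `y`, so swapping the arguments can change the result near the tolerance boundary.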

- elu support alpha < 0 (#37316) · e3503de8
  Committed by zhupengyang on Nov 22, 2021
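To make the elu commit above concrete: ELU is usually written as elu(x) = x for x > 0 and alpha · (exp(x) − 1) otherwise. The sketch below uses plain Python scalars and hypothetical function names; it is not Paddle's kernel. It shows what changes when alpha is negative: the negative branch then yields positive outputs, so the sign of the output no longer reveals the sign of the input.

```python
import math


def elu_sketch(x, alpha=1.0):
    """Scalar ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def elu_grad_sketch(x, alpha=1.0):
    """Derivative of elu_sketch with respect to x, computed from the input x."""
    return 1.0 if x > 0 else alpha * math.exp(x)


# With alpha = -1, a negative input produces a positive output.
positive_from_negative = elu_sketch(-1.0, alpha=-1.0)
```

Because of this, a gradient for negative alpha has to be computed from the input, as `elu_grad_sketch` does, since the output alone does not identify which branch was taken.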
- Support zero value in dimension for slice (#37313) · e788c7b5
  Committed by zyfncg on Nov 22, 2021
```
* support zero dim for slice op

* support zero dim Tensor in set_value op

* polish some debug log
```
- fix bug of indexing tensor with None (#37400) · de0cb386
  Committed by zyfncg on Nov 22, 2021
- Add backward function hook to dygraph (#37141) · 31344ab7
  Committed by Zhanlue Yang on Nov 22, 2021

- Renamed Func and removed ENFORCE statement (#37348) · 2702af21
  Committed by Weilong Wu on Nov 22, 2021
```
* Removed one ENFORCE statement

* Changed func name to _share_buffer_to

* Improve error reporting information

* Updated the logic of _is_share_buffer_to func
```

- Refactor dygraph to eager (#37405) · a258badb
  Committed by Jiabin Yang on Nov 22, 2021
```
* Add EagerTensor and tests

* remove useless enforce

* remove comment in cmake

* fix test_error

* add depends on python

* Remove python.h

* Merge develop and add Eager tensor with test back
```

- [PTen] Add variable transform to/from ptenTensor and add cast kernel (#36916) · 5caa6fc5
  Committed by chentianyu03 on Nov 22, 2021
```
* add cast kernel

* add cast cuda kernel

* add cast kernel

* make cast kernel output dtype undefined

* get cast dtype from vardesc

* move cast to manipulation and add test case

* add castinfershape

* avoid reinitialize variable

* InitializeVariable support datatype

* merge develop branch

* fix merge bug

* revert modify initializeVariable

* revert modify on InitializeVariable

* revert modify on InitializeVariable

* mutable support reset dtype

* enable make pten tensor from variable when def_arg.type is undefined

* fix build pten ctx start_idx error

* copy pten out tensor to variable

* merge develop branch

* fix non pten kernel cast failed

* add reset allocation place for remake tensor

* fix inplace realloc error

* add mutable on pten kernels and remove unused cast files

* rename function names

* fix output type error

* fix conflict with develop branch

* set data type to variable with pten's dtype

* fix test_cast_api type mismatch

* DenseTensor mutable_data support 0 bytes value

* fix the inplace bug of reshape kernel

* fix pten.backend != variable.place when moving storage, place mismatch bug

* fix conflict with develop branch

* Fix bug of paddle::experimental::MovesStorage

* fix ReMakePtenDenseTensor place mismatch bug

* Revert "fix ReMakePtenDenseTensor place mismatch bug"

This reverts commit 86336032f60b8a15eacd2c1ff2fa513f5d8dfd1a.

* fix ReMakePtenDenseTensor place mismatch bug

* reverts the set_lod interface, test=develop

* modify by the review options

* modify error message

* add & for const input arguments

* add reference in params

* elementwise_sub add mutable_data

* fix ResetHolderWithType check size bug

* add dependence pten_tensor to test_cast_api object

* remove unused code to pass ci coverage

Co-authored-by: Chen Weihang <chenweihang@baidu.com>
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
Co-authored-by: shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
```

- fix memeory_optimize_pass bug (#37324) · 075c22f6
  Committed by JingZhuangzhuang on Nov 22, 2021
- [new feature] add local scope for interpretercore (#37379) · 1f0512be
  Committed by Leo Chen on Nov 22, 2021
- [fleet_executor] Add compute interceptor (#37376) · 964e20e0
  Committed by WangXi on Nov 22, 2021
- fix cuda_virtual_mem_allocator a bug, test=develop (#37390) · e28d5b89
  Committed by wanghuancoder on Nov 22, 2021

20 Nov, 2021 1 commit
- Revert "Refactor dygraph to eager (#37318)" (#37386) · 128bdf66
  Committed by Jiabin Yang on Nov 20, 2021
19 Nov, 2021 8 commits

- Add corner case in scale calculation (#37352) · 4d891c00
  Committed by joanna.wozna.intel on Nov 19, 2021
- bug fix shard_index (#37042) · b505ff96
  Committed by lilong12 on Nov 19, 2021
- Fix runtime graph on gpt, add debug message (#37361) · af83e79a
  Committed by LiYuRio on Nov 19, 2021
- Optimize cinn_cache_key by replace GraphToProgram to Dot string (#37317) · edc3496f
  Committed by jiangcheng on Nov 19, 2021
```
* optimize cache-key by replace GraphToProgram to Dot string

* fix compile failure bug
```

- Add fuse_resnet_unit pass (#36818) · 3cd3bf29
  Committed by wuhuanzhou on Nov 19, 2021
```
* GeneratePass support attr condition and mapping, test=develop

* fix coverage, test=develop

* Add fuse_resnet_unit pass, test=develop

* fix CI errors, test=develop

* fix CI errors, test=develop

* fix unittest error when compiling without CUDA, test=develop

* fix static ci error, test=develop

* limit kernel size must equal 1, test=develop
```

- fix for cufft: some early versions of cufft do not define CUFFT_VERSION in the header (#37312) · d8191d06
  Committed by Feiyu Chan on Nov 19, 2021

- Refactor dygraph to eager (#37318) · b962f5fe
  Committed by Jiabin Yang on Nov 19, 2021
```
* Add EagerTensor and tests

* remove useless enforce

* remove comment in cmake

* fix test_error

* add depends on python
```

- optimize graph-engine sample api's data-transfer process (#37341) · 9fc11db7
  Committed by seemingwang on Nov 19, 2021
```
* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segment fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node &test & change data-structure from linkedList to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundant threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table

* optimize get_feat function of graph engine

* fix long long count problem

* remove redundant graph files

* remove unused shell

* recover dropout_op_pass.h

* fix potential stack overflow when request number is too large & node add & node clear & node remove

* when sample k is larger than neighbor num, return directly

* using random seed generator of paddle to speed up

* fix bug of random sample k

* fix code style

* fix code style

* add remove graph to fleet_py.cc

* fix blocking_queue problem

* fix style

* fix

* recover capacity check

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* add remove graph node; add set_feature

* fix distributed op combining problems

* optimize

* remove logs

* fix MultiSlotDataGenerator error

* cache for graph engine

* fix type compare error

* more test&fix thread terminating problem

* remove header

* change time interval of shrink

* use cache when sample nodes

* remove unused function

* change unique_ptr to shared_ptr

* simplify cache template

* cache api on client

* fix

* reduce sample threads when cache is not used

* reduce cache memory

* cache optimization

* remove test function

* remove extra fetch function

* graph-engine data transfer optimization

Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>
```
