Commits · b8e4ec7de1e04f1a7fcf6c85ce016f9fef37ee8d · PaddlePaddle / Paddle

24 May, 2021 4 commits

[oneDNN] bump up oneDNN to 2.2.2 (#32685) · b8e4ec7d

Committed by Jacek Czaja on May 24, 2021

* - bump up oneDNN to 2.2.2 (should reduce perf drops of mobilenet)

* - more recent onednn 2.2.2 (some more bugfixes)

b8e4ec7d

open launch ps test=develop (#33044) · d0d5586d
Committed by gongweibao on May 24, 2021

d0d5586d

fix potential overflow problem & node add & node remove & node clear (#33055) · 60ac1602

Committed by seemingwang on May 24, 2021

* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segmentation fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node & test & change data structure from linked list to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat return  py:bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundant threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table

* optimize get_feat function of graph engine

* fix long long count problem

* remove redundant graph files

* remove unused shell

* recover dropout_op_pass.h

* fix potential stack overflow when request number is too large & node add & node clear & node remove

Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>

60ac1602

Support OutType template argument in elementwise_broadcast branch (#33060) · d6aea4ac
Committed by limingshu on May 24, 2021

d6aea4ac

22 May, 2021 2 commits

refine conv2d doc (#33045) · a6dc68b7
Committed by wangguanzhong on May 22, 2021

a6dc68b7

Added oneDNN matmul grad BF16/FP32 kernel (#32968) · e2a3a6f7

Committed by jakpiase on May 22, 2021

* added support for most matmul cases

* added more functionality

* full functionality of matmul op, fp32 only

* added bf16 tests and functionality

* added formatting

* changes after review

* minor change

* added reviewers suggestions

e2a3a6f7

21 May, 2021 7 commits
- replace complex64/128 with complex template in cast Op (#33019) · 79d918d9
  Committed by chentianyu03 on May 21, 2021
```
* replace complex in set tensor from and to numpy

* replace complex template in cast op
```
  79d918d9
- add method for enhance pass,test=develop (#33004) · 79ed7177
  Committed by 王明冬 on May 21, 2021
  
  79ed7177
- optimize softmax with cross entropy hard label (#32290) · 7be6191b
  Committed by Feng Xing on May 21, 2021
```
* optimize softmax with cross entropy hard label

* label ignore_index cleaning
```
  7be6191b
- fix model_benchmark ci (#33035) · 0e5d832c
  Committed by tianshuo78520a on May 21, 2021
```
* fix model_benchmark ci

* fix model_benchmark ci
```
  0e5d832c
- paddle.to_tensor supports LoDTensor (#33027) · a85edddb
  Committed by Leo Chen on May 21, 2021
  
  a85edddb
- update conda build script for cuda11 (#29594) · 44668a7a
  Committed by YUNSHEN XIE on May 21, 2021
```
* update conda build script for cuda11

* update conda build script

* modified wheel name

* update conda_build

* fix error

* add cudnn8.1 for cuda11.2

* fix format error
```
  44668a7a
- [NPU] cast indices and label if their type is not consistent in accuracy npu op (#33016) · 70dc5f49
  Committed by pangyoki on May 21, 2021
```
* cast indices and label if their type is not consistent

* fix bug

* add unittest
```
  70dc5f49

20 May, 2021 8 commits

fix gather op and add logsumexp op on kunlun (#32931) · a96e8bc9

Committed by TTerror on May 20, 2021

* fix gather op and add logsumexp op on kunlun

* update xpu dependence

* update tests and fix elementwise_add

a96e8bc9

revert_matmulv2_npu (#33014) · be8e94aa
Committed by Baibaifan on May 20, 2021

be8e94aa

[Dy2Stat]Support convert sublayers in Sequential Container (#32978) · e409c7ce
Committed by Aurelius84 on May 20, 2021
```
* Support convert sublayers in Sequential Container

* remove paddle.jit.set_code_level
```
e409c7ce

Polish code for setitem and getitem (#32911) · 848cabfc
Committed by liym27 on May 20, 2021

848cabfc

Add complex template type (#32857) · 738bf20e

Committed by chentianyu03 on May 20, 2021

* add complex template file

* add numtraits for complex template

* add complex template type register

* modify specify template of complex

* modify specify template of complex

* modify specify template of complex

* modify specify template of complex

* make TensorCheckerVisitor support complex type

* fix operator= error

* add complex template

* add complex template type

* add complex template type to pyarray transform

* add complex template type to pyarray transform

* remove complex type for dlpack register

* set dlpack support complex type

* set dlpack support complex type

* set dlpack support complex type

* remove explicit for complex constructor

* add complex unit test file

738bf20e

remove unused shell (#32954) · 8854786a
Committed by seemingwang on May 20, 2021

8854786a

handle remove files in pr (#32940) · 7e27b5aa
Committed by zhangchunle on May 20, 2021

7e27b5aa

Binary functor invoking of elementwise broadcast (#32928) · 14949521
Committed by limingshu on May 20, 2021

14949521

19 May, 2021 10 commits

fix test_paddle_save_load and test_paddle_save_load_binary (#32949) · 6f8de31d

Committed by WeiXin on May 19, 2021

* fix test_paddle_save_load and test_paddle_save_load_binary

* fix unittest:test_paddle_save_load and test_paddle_save_load_binary

* delete *.pyc

* add comment for unittest

6f8de31d

CI skip inference test if only python files modified (#32962) · 7896b51a

Committed by wuhuanzhou on May 19, 2021

* CI skip inference test if only python files modified, test=develop

* fix compilation error on ROCM, test=develop

* fix cmake error on PR-CI-ROCM-Compile, test=develop

7896b51a

fix the jetson allocator strategy, test=develop (#32932) · 1e1600eb
Committed by 石晓伟 on May 19, 2021

1e1600eb

[Rocm] fix test of random_crop_op & logsumexp (#32824) · aa4a56fc

Committed by zhulei on May 19, 2021

* [Rocm] fix test of random_crop_op

* [Rocm] fix test of random_crop_op

* [Rocm] fix test of random_crop_op & simple_rnn_op

* [Rocm] fix test of random_crop_op & simple_rnn_op & logsumexp

* [Rocm] fix test of random_crop_op & simple_rnn_op & logsumexp

* [Rocm] fix test of random_crop_op & simple_rnn_op & logsumexp

* [Rocm] fix test of random_crop_op & logsumexp

aa4a56fc

Optimize 102Flowers dataset reading speed (#31408) · 67c2700f

Committed by GT-Zhang on May 19, 2021

* Fix slow data reading. In the old version, reading one epoch of this dataset took about 5371 seconds (MacBook Pro Retina, 13-inch, Early 2015, 2.7 GHz) and a single batch took 211 seconds, which was too painful to use. The data is now decompressed in advance (about 10 seconds). After that, reading one epoch takes about 3 seconds on the same machine, and a batch takes roughly 0.017 seconds.

* Run CI, test=allcase

* fix qq group number. test=document_fix

* fix qq group number. test=document_fix

67c2700f
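
The idea described in the commit above (pay a one-time decompression cost instead of reopening the compressed archive for every sample) can be sketched with the Python standard library alone. This is a minimal illustration, not the code from the pull request: `build_archive`, `read_item_slow`, `PreExtractedDataset`, and the `jpg/image_XXXXX.jpg` member names are all hypothetical and exist only for this example.

```python
import io
import os
import tarfile
import tempfile


def build_archive(path, n_items):
    """Create a small gzip-compressed tar archive holding n_items fake images."""
    with tarfile.open(path, "w:gz") as tar:
        for i in range(n_items):
            payload = f"image-{i}".encode()
            info = tarfile.TarInfo(name=f"jpg/image_{i:05d}.jpg")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def read_item_slow(archive_path, index):
    """Slow pattern: reopen and scan the compressed archive for each sample."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.extractfile(f"jpg/image_{index:05d}.jpg").read()


class PreExtractedDataset:
    """Fast pattern (hypothetical): extract once, then read plain files."""

    def __init__(self, archive_path, cache_dir):
        self.root = cache_dir
        # One-time cost, paid before any sample is requested.
        with tarfile.open(archive_path, "r:gz") as tar:
            for member in tar.getmembers():
                if not member.isfile():
                    continue
                target = os.path.join(cache_dir, os.path.basename(member.name))
                with open(target, "wb") as out:
                    out.write(tar.extractfile(member).read())
        self.files = sorted(os.listdir(cache_dir))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        # Per-sample access is now an ordinary file read.
        with open(os.path.join(self.root, self.files[index]), "rb") as f:
            return f.read()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    archive = os.path.join(workdir, "data.tgz")
    cache = os.path.join(workdir, "cache")
    os.makedirs(cache)
    build_archive(archive, 5)
    dataset = PreExtractedDataset(archive, cache)
    print(len(dataset), dataset[3] == read_item_slow(archive, 3))
```

Both access paths return the same bytes; the difference is that the slow path decompresses the archive again on every call, while the pre-extracted dataset only touches the already unpacked files. The actual speedup depends on archive size and hardware and is not measured here.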

remove ut from parallel_ut list (#32788) · f0b2f598
Committed by YUNSHEN XIE on May 19, 2021
```
* remove ut from parallel_ut list

* remove some timeout ut
```
f0b2f598

[Dy2Stat]BugFix StaticAnalysis with gast.Subscript (#32969) · c2852610
Committed by Aurelius84 on May 19, 2021
```
* BugFix StaticAnalysis with gast.Subscript

* remove codes
```
c2852610

[oneDNN] Pool softmax and LRN access to cache optimized (#32922) · 56008aa1
Committed by Jacek Czaja on May 19, 2021

56008aa1

add enforce check for set_value (#32972) · af89a943
Committed by Chen Weihang on May 19, 2021

af89a943

Fix Link unittest exe random fail (#32891) · d7d7fae1
Committed by Zhou Wei on May 19, 2021

d7d7fae1

18 May, 2021 9 commits
- [NPU] fix accuracy npu op bug and change top_k's output to int64 (#32935) · c66586b4
  Committed by pangyoki on May 18, 2021
```
* Output indices of top_k npu op change to int64

* fix accuracy npu bug

* fix errors

* change cast method to FillNpuTensorWithConstant

* change cast method to FillNpuTensorWithConstant
```
  c66586b4
- Update paths to Quant models (#32870) · 5d627488
  Committed by joanna.wozna.intel on May 18, 2021
```
* Update paths to Quant models

* Update description
```
  5d627488
- add uint8 for concat (#32850) · 53580bb4
  Committed by liuyuhui on May 18, 2021
  
  53580bb4
- relu supports bfloat16 data type (#32542) · bcd40f21
  Committed by wuhuanzhou on May 18, 2021
  
  bcd40f21
- [UnitTest]Enhance grep syntax to avoid random failed of test_dist_mnist_dgc_nccl (#32946) · b5882c6e
  Committed by Aurelius84 on May 18, 2021
```
* Enhance grep syntax to avoid random failed

* Enhance grep syntax to avoid random failed
```
  b5882c6e
- [Dy2Static] Refactor param_guard logic of @to_static (#32867) · b8d493df
  Committed by Aurelius84 on May 18, 2021
```
* Add param_guard in ParameterList to support @to_static

* Refactor param_guard of @to_static

* fix unittest failed

* add more unittest
```
  b8d493df
- update kunlun bkcl to support multi-machine (#32577) · 59b74ee7
  Committed by QingshuChen on May 18, 2021
  
  59b74ee7
- unit double (#32902) · 29bbeb07
  Committed by Thunderbrook on May 18, 2021
```
* unit double

* unit double
```
  29bbeb07
- notest;test=zcltest (#32821) · 59997d53
  Committed by zhangchunle on May 18, 2021
  
  59997d53

PaddlePaddle / Paddle · last synced successfully more than 1 year ago
