Commits · a373aa76451b83a9eeb7617f92bdc9c1ea6e9ff6 · BaiXuePrincess / Paddle

24 Feb, 2021 16 commits
- fix the bug in expand_v2 op (#30984) · a373aa76
  Committed by lilong12 on Feb 24, 2021
```
* update, test=develop
```
  a373aa76
- support multi node in heterps (#31102) · c4f279fe
  Committed by Thunderbrook on Feb 24, 2021
```
* push multi node

* multi node

* MultiThread

* remove log

* solve bug in 30829
```
  c4f279fe
- Add cublas_handle() to expose cublas_handle to ops (#31157) · ae2be49f
  Committed by liu zhengxi on Feb 24, 2021
```
* add get_cublas_handle() api

* update format

* add unittests

* alter function name
```
  ae2be49f
- [CustomOp] Support to specific extra_cflags and exctra_cuda_flags independently (#31059) · 406f4a75
  Committed by Aurelius84 on Feb 24, 2021
```
* split cxx/nvcc compile flags

* enhance input argument check

* rename extra_cflags into extrac_cxx_flags

* add name checking in setup

* fix test_dispatch failed

* fix word typo and rm useless import statement

* refine import statement

* fix unittest failed

* fix cuda flags error
```
  406f4a75
- Update doc for 2.0 API and some callback (#31180) · 572cc8bd
  Committed by qingqing01 on Feb 24, 2021
```
test=document_fix
```
  572cc8bd
- [Paddle-TRT] support group_norm (#31040) · 00b09e86
  Committed by Pei Yang on Feb 24, 2021
```
* add group norm plugin

* fix compile problems

* move concat axis check to trt op teller

* add nbDims for scale and bias nv dims

* add group norm unit test

* fix unittest

* add trt version restriction for group norm op teller

* fix unittest
```
  00b09e86
- change test_multiprocess_reader_exception cmake (#31174) · c209751c
  Committed by Chen Weihang on Feb 24, 2021
  
  c209751c
- fix ut timeout (#31061) · 15312145
  Committed by YUNSHEN XIE on Feb 24, 2021
  
  15312145
- [CustomOp] Add new paddle custom op so (#31141) · 1ce96fa1
  Committed by Chen Weihang on Feb 23, 2021
```
* add new custom op so

* fix use new method error

* fix test failed
```
  1ce96fa1
- fix entry (#31079) · ebbdf525
  Committed by tangwei12 on Feb 24, 2021
```
* fix entry

* fix distributed lookup table fuse case

* fix entry bug at first time

* move entry from paddle.fluid -> paddle.distributed

* fix ut with paddle.enable_static()

Co-authored-by: malin10 <malin10@baidu.com>
```
  ebbdf525
- [ROCM] update fluid collective op for rocm, test=develop (#31075) · ee76ea72
  Committed by Qi Li on Feb 24, 2021
  
  ee76ea72
- fix heter compile (#30518) · d8fa65a3
  Committed by yaoxuefeng on Feb 24, 2021
  
  d8fa65a3
- [CustomOp] Split build directory for each setup.py (#31124) · dce2db48
  Committed by Aurelius84 on Feb 24, 2021
```
* split build directory for each setup.py

* fix template string
```
  dce2db48
- [Custom OP]Fix problem of custom op unitests on Windows CI (#31114) · 4b220550
  Committed by Zhou Wei on Feb 24, 2021
```
* fix some problem of Windows custom op

* fix some problem of Windows custom op

* fix some problem of Windows custom op
```
  4b220550
- add warning message when dtypes of operator are not same (#31136) · 70131b47
  Committed by chentianyu03 on Feb 24, 2021
```
* add error msg when dtypes of operator are not same

* add error msg when dtypes of operator are not same

* change error msg to warning msg when dtypes of operator are not same

* modify test case to fit for python2
```
  70131b47
- support build whl and inference library nightly,test=windows3 (#30616) · be61c2d0
  Committed by Zhou Wei on Feb 24, 2021
  
  be61c2d0
23 Feb, 2021 13 commits
- added support for fake_quantize_dequantize_abs_max op in quantization… (#30896) · 5d6a8c7b
  Committed by alncat on Feb 23, 2021
```
* added support for fake_quantize_dequantize_abs_max op in quantization inference pass

* remove const_cast to pass ci

* remove compare operator to pass ci-coverage

* added detailed error message for unregistered tensorrt_subgraph_pass
```
  5d6a8c7b
- [CustomOp] Split test and add inference test (#31078) · e60fd1f6
  Committed by Chen Weihang on Feb 23, 2021
```
* split test & add inference test

* add timeout config

* change to setup install

* change to jit compile

* add verbose for test

* fix load setup name repeat

* polish details

* resolve conflict

* fix code format error
```
  e60fd1f6
- Update of onednn to 2.2 (#31067) · d3f09ad7
  Committed by Jacek Czaja on Feb 23, 2021
  
  d3f09ad7
- merge develop conflict (#31122) · 24ba5ee0
  Committed by Guanghua Yu on Feb 23, 2021
  
  24ba5ee0
- Optimization of Transformer API (#30957) · edacb629
  Committed by xiemoyuan on Feb 23, 2021
```
* Support 'bool' and 'int' for attention mask.

* Update docs.

* Add unittest for Transformer.

* fix bugs.
```
  edacb629
- Save load/save pickle protocol (#31044) · ee1801c1
  Committed by WeiXin on Feb 23, 2021
```
* add default argument  for paddle.save/static.save

* edit documentation of

* Add comments for special processing for protocol=2 and protocol=3.

* Update python/paddle/fluid/io.py

Co-authored-by: lanxianghit <47554610+lanxianghit@users.noreply.github.com>
```
  ee1801c1
- [ROCM] update fluid operators for rocm (part1), test=develop (#31077) · cced930b
  Committed by Qi Li on Feb 23, 2021
  
  cced930b
- fix flops api (#31081) · 99fd9815
  Committed by yukavio on Feb 23, 2021
```
* remove PrettyTable dependence from paddle.flops

* fix bug in python2.7

* fix flops

* fix flops

* fix bug

* fix bug
```
  99fd9815
- fix windows for optimization of elementwise_add Op (#31068) · 364cfa26
  Committed by wangchaochaohu on Feb 23, 2021
```
* fix windows for optimization of elementwise_add Op
```
  364cfa26
- Unification of BF16 enablement process (#31034) · 781df300
  Committed by joanna.wozna.intel on Feb 23, 2021
```
* Unification of bfloat16 enablement process and refactor

* Remove unnecessary function

* Standardize the output name search
```
  781df300
- fix softmax cross entropy integer overflow (#30590) · 16fe11d7
  Committed by Zhong Hui on Feb 23, 2021
```
[BUG FIX] Fix softmax cross entropy overflow problem.
```
  16fe11d7
- fix UNIX cmake problem (#31113) · 44ee251f
  Committed by Zhou Wei on Feb 23, 2021
  
  44ee251f
- [ROCM] update fluid framework for rocm (part2), test=develop (#31010) · a60d93fb
  Committed by Qi Li on Feb 23, 2021
  
  a60d93fb
22 Feb, 2021 11 commits

- support save multi sparse table in one path (#31108) · 565354f6
  Committed by Thunderbrook on Feb 22, 2021
```
* save multi table one path

* format
```
565354f6
- [ROCM] update fluid framework for rocm (part3), test=develop (#31011) · 50967135
  Committed by Qi Li on Feb 22, 2021

50967135

- [Dy2stat] Refactoring tensor_shape_transformer.py to Fix Change after Assign Bug (#31082) · cf43a321
  Committed by Huihuang Zheng on Feb 22, 2021

**Problem**
In our old shape transformer logic, if the user writes:
```
s = tensor.shape
...
y = paddle.some_api(s)
```
Dy2stat changes it to:
```
...
y = paddle.some_api(convert_var_shape(tensor))
```
However, this causes a fatal bug if the user changes the shape of `tensor` after the assignment. For example:
```
s = tensor.shape
...
tensor = paddle.some_change_shape_api(tensor)
...
y = paddle.some_api(s)
```
Dy2stat then produces a wrong result, because the code is translated into:
```
tensor = paddle.some_change_shape_api(tensor)
...
y = paddle.some_api(convert_var_shape(tensor)) # tensor's shape has changed; this is no longer the original `s` value
```
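
The failure mode above can be reproduced without Paddle. The following is a minimal sketch in plain Python: `FakeTensor`, `flatten`, and this `convert_var_shape` are hypothetical stand-ins written for illustration (they are not Paddle's implementations). They only model the fact that the old translation looks up the shape at the point of use rather than at the point of assignment.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor; it only carries a shape."""

    def __init__(self, shape):
        self.shape = list(shape)


def flatten(t):
    """Hypothetical shape-changing API: collapses all dims into one."""
    n = 1
    for d in t.shape:
        n *= d
    return FakeTensor([n])


def convert_var_shape(t):
    """Stand-in for the static shape lookup: reads the shape when called."""
    return list(t.shape)


def dygraph_version():
    tensor = FakeTensor([2, 3])
    s = tensor.shape            # shape captured at assignment time
    tensor = flatten(tensor)    # rebinding `tensor` does not affect `s`
    return s


def old_translation():
    tensor = FakeTensor([2, 3])
    # `s = tensor.shape` is gone; the lookup was moved to the use site
    tensor = flatten(tensor)
    return convert_var_shape(tensor)  # evaluated after the shape changed


eager_shape = dygraph_version()
translated_shape = old_translation()
print(eager_shape, translated_shape)
```

The two functions disagree because the lookup in `old_translation` runs after `tensor` has been rebound, which is the change-after-assign problem in miniature.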

**Solution Logic**

This cannot be solved within the old logic, so I refactored the tensor_shape_transformer logic. Now we use `s` to store the shape attribute and generate a variable `s__STATIC_CONVERT_VAR_SHAPE_SUFFIX` to store the result of the static shape API `shape(tensor)`. For the code:
```
s = tensor.shape
...
y = paddle.some_api(s)
```
Dy2stat changes it to:
```
s = tensor.shape
s__STATIC_CONVERT_VAR_SHAPE_SUFFIX = shape(tensor)
...
y = paddle.some_api(choose_shape_attr_or_api(s, s__STATIC_CONVERT_VAR_SHAPE_SUFFIX))
```
In this case, the translated code is consistent with the original dygraph semantics, which fixes the change-after-assign bug.
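
How `choose_shape_attr_or_api` picks between its two arguments is not spelled out here. One plausible rule, shown below as an assumption rather than as Paddle's actual implementation, is to prefer the attribute shape when every dimension is known and to fall back to the API result when a dimension is unknown (modelled here as `-1` or `None`). The name `choose_shape_sketch` is hypothetical.

```python
def choose_shape_sketch(attr_shape, api_shape):
    """Hypothetical selection rule (an assumption, not Paddle's code).

    attr_shape: the value of `tensor.shape` captured at assignment time.
    api_shape:  the value stored in the generated SUFFIX variable, or None
                if no such variable exists.
    """
    if api_shape is None:
        # No static counterpart was generated; only the attribute is available.
        return attr_shape
    if any(d is None or d < 0 for d in attr_shape):
        # Some dimension is unknown, so use the API result instead.
        return api_shape
    return attr_shape


fully_known = choose_shape_sketch([2, 3], [2, 3])
with_unknown_dim = choose_shape_sketch([-1, 3], [8, 3])
no_static_var = choose_shape_sketch([4, 4], None)
print(fully_known, with_unknown_dim, no_static_var)
```

Either way, both candidates are captured at the assignment, so a later change to `tensor` cannot alter what `s` refers to.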

**Code Key Note**

To help reviewers: the key change of this PR is changing `self.name_to_var_shape` from "mapping a name to its shape node" to "mapping a name to its STATIC_CONVERT_VAR_SHAPE_SUFFIX name". Then, if a variable name has the SUFFIX, we can choose whether to use the shape attribute or the shape API. The other changes follow from this key change.

**Consideration**
The concern with this PR is that it stores an extra static `shape` API result. Will this harm the speed of Dy2stat? In some cases it will, but we argue that the benefit outweighs the cost.

1. The extra call to the static `shape` API happens when the programmer assigns one shape variable to another. Take the following dygraph code as an example:
```
s1 = tensor.shape
s2 = s1
s3 = s2
...
```
Here the static `shape` API is called again for each assignment; however, users seldom write code like this.

2. If the shape variable is used a lot, for example:
```
s = tensor.shape
y1 = paddle.some_api1(s)
y2 = paddle.some_api2(s)
y3 = paddle.some_api3(s)
```
Our old logic creates 3 shape API calls, but the new logic creates just 1. This is the more common user code pattern. In fact, reviewers looking at the current unit tests in this PR can see that the op counts decrease after this PR. So we argue that this PR can also improve speed for this code pattern.
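
The two cost cases can be compared with a small counting model. This sketch assumes, following the description above, that the old logic emits one `shape` call per use of `s`, and that the new logic emits one per assignment to a shape variable. `ShapeOpCounter` and the helper functions are hypothetical and only count calls.

```python
class ShapeOpCounter:
    """Hypothetical stand-in for the static `shape` API that counts its calls."""

    def __init__(self):
        self.calls = 0

    def shape(self, tensor_shape):
        self.calls += 1
        return list(tensor_shape)


def old_logic_uses(counter, tensor_shape, n_uses):
    """Old translation (as described): every use of `s` becomes a shape call."""
    return [counter.shape(tensor_shape) for _ in range(n_uses)]


def new_logic_uses(counter, tensor_shape, n_uses):
    """New translation (as described): one shape call at the assignment, reused."""
    s_static = counter.shape(tensor_shape)
    return [s_static for _ in range(n_uses)]


def new_logic_assign_chain(counter, tensor_shape, n_names):
    """Assumed model of case 1: each assigned shape name triggers its own call."""
    return [counter.shape(tensor_shape) for _ in range(n_names)]


def count_calls(fn, n):
    """Run one scenario with a fresh counter and report how many calls it made."""
    counter = ShapeOpCounter()
    fn(counter, [2, 3], n)
    return counter.calls


old_calls = count_calls(old_logic_uses, 3)            # s used by 3 APIs, old logic
new_calls = count_calls(new_logic_uses, 3)            # s used by 3 APIs, new logic
chain_calls = count_calls(new_logic_assign_chain, 3)  # s1, s2, s3 chain, new logic
print(old_calls, new_calls, chain_calls)
```

Under these assumptions the repeated-use pattern drops from three calls to one, while the assignment chain is the only pattern that pays extra.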

cf43a321

- fix dist fleet ctr ut (#31087) · 0e4b1542
  Committed by tangwei12 on Feb 22, 2021

* fix dist fleet ctr ut

Change-Id: I59bf5123c7bd47bd0e8f1ca2a26295257597c0f5

* fix dist fleet ctr ut

Change-Id: Iafcdd172364be47fe67b753774ce09af050bcbce

* Update CMakeLists.txt

0e4b1542

- [ROCM] update fluid framework for rocm (part1), test=develop (#31009) · 8fe09faf
  Committed by Qi Li on Feb 22, 2021

8fe09faf
- [ROCM] update fluid platform for rocm39 (part4), test=develop (#30936) · 33429630
  Committed by Qi Li on Feb 22, 2021

33429630
- update trt int8 calibrator to IEntropyCalibratorV2 (#31060) · a5c56d83
  Committed by Shang Zhizhou on Feb 22, 2021
```
* update trt int8 calibrator to IEntropyCalibratorV2

* add delete opt_cache for trt_split_converter_test
```
a5c56d83

- [2.0Custom OP]Support New Custom OP on Windows (#31063) · adaec007
  Committed by Zhou Wei on Feb 22, 2021

* [2.0.1]Support New Custom OP on windows

* fix CI

* fix code style

* fix CI

* fix CI

* fix coverage

* fix CI

* fix CI

adaec007

- add optional for param attr args, test=document_fix (#31105) · 2168f08a
  Committed by Chen Weihang on Feb 22, 2021

2168f08a

- [ROCM] update fluid imperative for rocm (part1), test=develop (#31017) · 1d996637
  Committed by Qi Li on Feb 22, 2021

* [ROCM] update fluid imperative for rocm (part1), test=develop

* [ROCM] update reducer.cc after merge, test=develop

* update reducer cmake after merge, test=develop

1d996637

- fix the bug in backward OP of index_sample. (#31026) · b95eb38b
  Committed by JamesLim on Feb 22, 2021

b95eb38b

BaiXuePrincess / Paddle is in sync with its fork source project
