Commits · 330b173c38b39db6930bb80956b26d05489de3b3 · PaddlePaddle / Paddle

18 Nov, 2019 · 10 commits

Better TensorRT support (#20858) · 330b173c

Committed by Jeng Bai-Cheng on Nov 18, 2019

* Fix TensorRT detection bug

1. Add a new search path for TensorRT in tensorrt.cmake
2. Add a better debug message
3. Fix the TensorRT version detection bug

In the official NVIDIA Docker image, TensorRT headers are located at
`/usr/include/x86_64-linux-gnu` and TensorRT libraries are located
at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` will
fail to detect TensorRT.

There was no debug/warning message to tell developers that TensorRT
detection had failed.

In later versions of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is
defined in `NvInferVersion.h` instead of `NvInfer.h`, so add a
compatibility fix.

* Fix TensorRT variables in CMake

1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}`
2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}`

A manually typed path may point to the wrong TensorRT location. Use the
paths detected by the system instead.

* Fix TensorRT library path

1. Add new variable - `${TENSORRT_LIBRARY_DIR}`
2. Fix TensorRT library path

inference_lib.cmake and setup.py.in need the directory that contains the
TensorRT library rather than the library file itself, so add a new variable to fix it.

* Add more general search rule for TensorRT

Let the system detect the architecture instead of assigning it manually, so
replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`.

* Add more general search rule for TensorRT

Remove duplicate search rules for TensorRT libraries. Use
`${TENSORRT_LIBRARY_DIR}` to get the full path of libnvinfer.so.

test=develop

330b173c
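The version-detection fix described in the commit above lives in CMake. As a rough illustration of the same fallback logic, here is a minimal sketch in Python rather than CMake. The helper name `detect_tensorrt_major` is hypothetical and not part of Paddle; the only facts taken from the commit message are the macro name `NV_TENSORRT_MAJOR` and the two header names.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical helper for illustration only; the real change is in tensorrt.cmake.
_MAJOR_RE = re.compile(r"^\s*#\s*define\s+NV_TENSORRT_MAJOR\s+(\d+)", re.MULTILINE)


def detect_tensorrt_major(include_dir: str) -> Optional[int]:
    """Return the TensorRT major version found under include_dir, or None."""
    # Newer releases (e.g. v6) define the macro in NvInferVersion.h,
    # older ones in NvInfer.h, so both headers are tried in that order.
    for header in ("NvInferVersion.h", "NvInfer.h"):
        path = Path(include_dir) / header
        if not path.is_file():
            continue
        match = _MAJOR_RE.search(path.read_text(errors="ignore"))
        if match:
            return int(match.group(1))
    # Nothing found: the caller should report this instead of failing silently.
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Fake a v6-style layout: the macro lives in NvInferVersion.h.
        Path(tmp, "NvInfer.h").write_text('#include "NvInferVersion.h"\n')
        Path(tmp, "NvInferVersion.h").write_text("#define NV_TENSORRT_MAJOR 6\n")
        print(detect_tensorrt_major(tmp))
```

Returning `None` rather than raising leaves it to the caller to print the kind of warning the commit adds when TensorRT cannot be detected.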

fix sporadic hang issue on Windows (#21201) · d8b6cf2b

Committed by liuwei1031 on Nov 18, 2019

cudaStreamSynchronize randomly hangs when used in a multi-threaded environment; replace it with the cudaStreamQuery API on Windows

d8b6cf2b

add store_true to use_paddlecloud argument in launch.py (#21168) · 3fe63d67
Committed by danleifeng on Nov 18, 2019

3fe63d67
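For readers unfamiliar with `store_true`, the commit above concerns how a boolean command-line flag is declared with `argparse`. The following is a minimal sketch, not the actual contents of launch.py: only the argument name `use_paddlecloud` comes from the commit title, while `build_parser`, the description, and the help text are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser with a single boolean flag (illustrative only)."""
    parser = argparse.ArgumentParser(description="launcher sketch")
    # With action="store_true" the flag takes no value: it is False when
    # absent and True when present on the command line.
    parser.add_argument(
        "--use_paddlecloud",
        action="store_true",
        help="hypothetical help text: enable PaddleCloud mode",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["--use_paddlecloud"]).use_paddlecloud)
    print(build_parser().parse_args([]).use_paddlecloud)
```

Without `store_true`, `argparse` expects a value after the flag and stores it as a string, so passing `--use_paddlecloud False` would yield the truthy string `"False"`.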

modified error message and API doc for channel_last supported Op (#21002) · 9cbe7bcc

Committed by Zhang Ting on Nov 18, 2019

* modified error message for conv and conv_transpose, test=develop

* modified doc of conv and conv_transpose op, test=develop

* modified the expression for error message, test=develop

* modified error message for group_norm op, test=develop

* modified detail of Attr(data_format) or Attr(data_layout)

* add ValueError in API doc for maxout op, test=develop

9cbe7bcc

Control flow API: switch_case (#21103) · 92475282

Committed by liym27 on Nov 18, 2019

* add API switch_case. test=develop

add Nest

* modify code according to reviews:
1. Attr(branch_index) supports 'uint8' and 'int64' besides 'int32'.
2. remove useless code.
test=develop

* replace fluid.layers.data with fluid.data and polish API document. test=develop

92475282

TRT int8: refine trt int8 for dynamic range set (#21112) · 65f70525
Committed by Zhaolong Xing on Nov 18, 2019
```
* refine trt int8 for dynamic range set
test=develop

* refine trt int8
test=develop
```
65f70525

Fix the error of init variable in StaticRNN when stop_gradient=ON (#21118) · 56b5d147
Committed by guofei on Nov 18, 2019

56b5d147

Fix INF bug of softmax_cross_entropy_op (#21165) · 3c98ec90
Committed by WangXi on Nov 18, 2019

3c98ec90

fix dygraph trace bug, test=develop (#21193) · 0f30d3a2
Committed by Zeng Jinle on Nov 18, 2019

0f30d3a2

Add CI check for error message writing specification (#21107) · 7269ffe3

Committed by Chen Weihang on Nov 18, 2019

* add ci check for error message specification, test=develop, test=document_fix

* replace spec url & refine failed message, test=develop, test=document_fix

7269ffe3

16 Nov, 2019 · 1 commit

Support more ops in post training quantization, test=develop (#21073) · 00b11a4a

Committed by juncaipeng on Nov 16, 2019

* Support more ops in post training quantization, and save the output scale in quantized op.
* Update docs in post training quantization and qat

00b11a4a

15 Nov, 2019 · 5 commits
- fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052) · 23876de5
  Committed by xujiaqi01 on Nov 15, 2019
```
* fix cache table bug
* add save_paddle_inference_model
* fix hdfs util bug
* test=develop
```
  23876de5
- Fix jit tls issue (#21151) · eec9c9cb
  Committed by Yihua Xu on Nov 15, 2019
  
  eec9c9cb
- fix cmake fails on inference_download_and_uncompress (#21185) · a9d4eed3
  Committed by GaoWei8 on Nov 15, 2019
```
* solve cmake fails on inference_download_and_uncompress
test=develop

* solve cmake fails on inference_download_and_uncompress
test=develop
```
  a9d4eed3
- add copy table (#21086) · 9e045170
  Committed by xujiaqi01 on Nov 15, 2019
```
* copy some feasigns and corresponding embeddings from one sparse table to another
* copy all feasigns and corresponding embeddings from one sparse table to another
* copy all dense params from one table to another
* copy some local vars to other local vars
```
  9e045170
- Refine edit distance cn (#21121) · aeb88791
  Committed by ruri on Nov 15, 2019
  
  aeb88791
14 Nov, 2019 · 12 commits
- fix elementwise_mod float point kernel. test=develop (#21183) · 98b59cb8
  Committed by Kaipeng Deng on Nov 14, 2019
  
  98b59cb8
- disable reshape inplace in dygraph model; test=develop (#21157) · 835119c7
  Committed by hong on Nov 14, 2019
  
  835119c7
- Add friendly dygraph trace API (#21091) · 5fdfbe34
  Committed by Zeng Jinle on Nov 14, 2019
```
* friendly trace interface, test=develop

* refine TracedLayer, test=develop

* add some docs, test=develop
```
  5fdfbe34
- add paddle enforce count sh, test=develop, test=document_fix (#21178) · 44a0a4ad
  Committed by Chen Weihang on Nov 14, 2019
  
  44a0a4ad
- fix detail error message error, test=develop (#21170) · 4bd94636
  Committed by Chen Weihang on Nov 14, 2019
  
  4bd94636
- Fix warpctc in padding mode. (#21033) · cfdd1fc2
  Committed by whs on Nov 14, 2019
  
  cfdd1fc2
- Add examples for error message writing specification - NotFound, OutOfRange,... · 8da0cd53
  Committed by Chen Weihang on Nov 14, 2019
```
Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134)

* add examples for error msg spec, test=develop

* change ENFORCE to ENFORCE_**, test=develop

* add more already exists examples, test=develop
```
  8da0cd53
- Improve topk performance. (#21087) · b93870e6
  Committed by zhaoyuchen2018 on Nov 13, 2019
```
* Improve topk performance.

give 200000 data to compute topk,
before opt: cost 1s
after opt: cost 0.0028s.

* Refine return value.
* Add cuda util functions.
* Fix ComputeBlockSize bug & refine comments.
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
```
  b93870e6
- Add relative error measure when (value > 1) (#21144) · d74ea085
  Committed by Adam on Nov 14, 2019
```
* Add relative error measure when value > 1
test=develop

* Move code to CheckError function
test=develop
```
  d74ea085
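The commit above switches to a relative error measure once values exceed 1 and moves the comparison into a `CheckError` function. The log does not show that function, so the following is only a sketch of the general idea: the Python name `check_error`, its signature, and the default tolerance are assumptions, not Paddle's actual code.

```python
def check_error(expected: float, actual: float, tol: float = 1e-3) -> bool:
    """Return True when actual is close enough to expected (illustrative only)."""
    diff = abs(expected - actual)
    # For magnitudes above 1 an absolute threshold becomes too strict, so the
    # difference is scaled by the expected value (relative error). Smaller
    # values keep the plain absolute difference.
    if abs(expected) > 1.0:
        diff /= abs(expected)
    return diff <= tol


if __name__ == "__main__":
    print(check_error(1000.0, 1000.5))  # judged by relative error
    print(check_error(0.5, 0.51))       # judged by absolute error
```

Scaling only above 1 avoids dividing by values near zero, where a relative measure would blow up.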
- add input type and dtype check template, and update some APIs check (#21161) · 3976bbe2
  Committed by Tao Luo on Nov 14, 2019
```
* add input type and dtype check template, and update some APIs check

* refine check template, and update some APIs check in nn.py

* update some APIs check in loss.py

test=develop
```
  3976bbe2
- change cuda enforce & add example (#21142) · b3a3e6f6
  Committed by Chen Weihang on Nov 14, 2019
  
  b3a3e6f6
- QAT int8 accuracy little improvement (#21074) · 37e0e7a9
  Committed by joanna.wozna.intel on Nov 14, 2019
```
test=develop
```
  37e0e7a9
13 Nov, 2019 · 4 commits
- Add examples for error message writing specification - PreconditionNotMet,... · 8414575b
  Committed by Chen Weihang on Nov 13, 2019
```
Add examples for error message writing specification - PreconditionNotMet, Unimplemented, Unavailable (#21137)

* add examples for error spec, test=develop

* change ENFORCE to ENFORCE_**, test=develop
```
  8414575b
- Add examples for error message writing specification - InvalidArgument (#21132) · 7e5f74b8
  Committed by Chen Weihang on Nov 13, 2019
```
* add examples for error msg spec, test=develop

* change ENFORCE to ENFORCE_**, test=develop

* fix error, test=develop
```
  7e5f74b8
- Use 2 cards for hallreduce unit test. (#21085) · a5fc291f
  Committed by gongweibao on Nov 13, 2019
```
use 2 cards test=develop
```
  a5fc291f
- add examples for resource exhausted error, test=develop (#21140) · 27fa9c10
  Committed by Chen Weihang on Nov 13, 2019
  
  27fa9c10
12 Nov, 2019 · 8 commits
- Split some APIs from nn.py to loss.py (#21117) · 8f659d43
  Committed by Tao Luo on Nov 12, 2019
```
* Split some APIs from nn.py to loss.py

test=develop

* fix test_detection unit-test

test=develop
```
  8f659d43
- Add Asypadding for conv fusion. (#21041) · 4a544762
  Committed by zhaoyuchen2018 on Nov 12, 2019
```
* Add Asypadding for conv fusion.

test=develop

reference: pr/20042

* Fix eigen build link error

* Change back file mode

* Use math function & add more checks.
```
  4a544762
- Remove useless code of openblas and fix the previous incorrect message (#21092) · d2573550
  Committed by zhouwei25 on Nov 12, 2019
  
  d2573550
- Fix dgc buffer illegal & reuse velocity (#21012) · de5d3ff6
  Committed by WangXi on Nov 12, 2019
  
  de5d3ff6
- modify the implementation of save_persistables and save_inference_model for... · 53148e06
  Committed by lilong12 on Nov 12, 2019
```
modify the implementation of save_persistables and save_inference_model for fleet collective mode (#20802)

* modify the implementation of save_persistables and save_inference_model functions for fleet collective, test=develop

* add ut, test=develop
```
  53148e06
- fix distiller typo, test=develop (#21070) · bd8b0eba
  Committed by Bai Yifan on Nov 12, 2019
  
  bd8b0eba
- fix instance norm (#21042) · f62a9291
  Committed by ceci3 on Nov 12, 2019
```
* fix instance norm

* update unittest, test=develop
```
  f62a9291
- Update README.md and README_cn.md to latest version 1.6.1, test=develop (#21119) · 7041eb21
  Committed by zhongpu on Nov 12, 2019
  
  7041eb21

PaddlePaddle / Paddle: last synced successfully more than 1 year ago
