Commits · 2e2f92a5b10d1c5cf7b1d5384bc3c7db5e6ed25b · PaddlePaddle / Paddle

Nov 20, 2019 · 6 commits
- fix trt weight bug (#21231) · 2e2f92a5
  Committed by Pei Yang on Nov 20, 2019
```
added splitter "__" between weight name and suffix number to avoid conflicts.
```
  2e2f92a5
- support set model_filename and params_filename in post_training_quantization, test=develop (#21213) · 29b63f0a
  Committed by juncaipeng on Nov 20, 2019
```
* support set model_filename and params_filename in post_training_quantization, test=develop
```
  29b63f0a
- update worker_num for MPISymetricRoleMaker (#20798) · ccbdd7aa
  Committed by Dong Daxiang on Nov 20, 2019
```
test=develop
```
  ccbdd7aa
- optimize assign op to avoid copy data from GPU to GPU (#21181) · 01a96463
  Committed by Zhang Ting on Nov 20, 2019
```
* optimize assign op to avoid copy data from GPU to GPU, test=develop

* modified GetkernelTypeForVar and just avoid device transform, test=develop
```
  01a96463
- fix load checkpoint error in test_reader (#20924) · c91cb6c5
  Committed by Liufang Sang on Nov 20, 2019
  
  c91cb6c5
- Change GCC version to be 8.2 in Dockerfile.GCC8 (#21222) · 925280b9
  Committed by Zeng Jinle on Nov 20, 2019
```
* make Docker to gcc 8.2, test=develop

* add -std=c11 to grpc.cmake, test=develop
```
  925280b9
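The `fix trt weight bug (#21231)` entry above describes adding a `__` splitter between a weight name and its suffix number. The snippet below is a minimal sketch of why such a splitter helps. It is not the Paddle implementation, and the function names and weight names are hypothetical, chosen only for illustration.

```python
def name_without_splitter(weight_name: str, suffix: int) -> str:
    """Join a weight name and a numeric suffix directly (collision-prone)."""
    return f"{weight_name}{suffix}"


def name_with_splitter(weight_name: str, suffix: int) -> str:
    """Join a weight name and a numeric suffix with a "__" splitter."""
    return f"{weight_name}__{suffix}"


# Two different (name, suffix) pairs; the names are made up for this sketch.
first = ("conv1", 12)
second = ("conv11", 2)

# Without a splitter both pairs produce the same string, "conv112".
print(name_without_splitter(*first), name_without_splitter(*second))

# With the splitter the two names stay distinct.
print(name_with_splitter(*first), name_with_splitter(*second))
```

Without a separator, the boundary between the name and the number is ambiguous whenever the name itself ends in a digit, which is the kind of conflict the commit message refers to.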
Nov 19, 2019 · 8 commits
- Determine whether to copy and link inference lib by ON_INFER (#20931) · c0dcb090
  Committed by zhouwei25 on Nov 19, 2019
  
  c0dcb090
- Fix PADDLE_ENFORCE ci check bug (#21233) · 2dfcbb8b
  Committed by Chen Weihang on Nov 19, 2019
```
* fix PADDLE_ENFORCE ci check bug, test=develop, test=document_fix

* fix PADDLE_ENFORCE match error, test=develop, test=document_fix
```
  2dfcbb8b
- add custom_op include: imperative, error_codes.pb.h, mkldnn.h. test=develop (#21227) · 4747940b
  Committed by Kaipeng Deng on Nov 19, 2019
  
  4747940b
- extend elementwise broadcast function (#20957) · 0e7baabe
  Committed by danleifeng on Nov 19, 2019
  
  0e7baabe
- Fix GELU grad error (#21204) · d623e863
  Committed by Adam on Nov 19, 2019
```
test=develop
```
  d623e863
- refine Tensor method, test=develop (#21031) · a152315b
  Committed by Zeng Jinle on Nov 19, 2019
  
  a152315b
- fix data_norm op to avoid impractical normalization result test=develop (#21152) · b5d8ba83
  Committed by yaoxuefeng on Nov 19, 2019
```
* fix auc drop first commit test=develop

* update datanorm op

* update datanorm with enforce test=develop

* update test=develop

* update format test=develop

* update format

* update format test=develop

* add unit test test=develop

* update unit test test=develop

* update format test=develop

* update format test=develop

* update API description test=develop

* update API description test=develop

* update format test=develop

* fix codes as comments test=develop

* fix description as comments test=develop

* fix description as comments test=develop

* update codes.. test=develop
```
  b5d8ba83
- Polish jit trace codes (#21218) · 67e88424
  Committed by Zeng Jinle on Nov 19, 2019
```
* polish jit trace codes, test=develop

* polish codes again by removing var_id, test=develop
```
  67e88424
Nov 18, 2019 · 12 commits

Committed by Zeng Jinle on Nov 18, 2019

* fix warnings of gcc 8 compilation, test=develop

* fix boost::bad_get, test=develop

* refine PADDLE_ENFORCE, test=develop

cdb3d279

fix bug when build openblas with a computer that has installed openblas... · 5d821578
Committed by zhouwei25 on Nov 18, 2019
```
fix bug when build openblas with a computer that has installed openblas before,test=develop (#21160)
```
5d821578

Better TensorRT support (#20858) · 330b173c

Committed by Jeng Bai-Cheng on Nov 18, 2019

* Fix TensorRT detection bug

1. Add new search path for TensorRT at tensorrt.cmake
2. Add better debug message
3. Fix the bug of detection of TensorRT version

In NVIDIA official docker image, TensorRT headers are located at
`/usr/include/x86_64-linux-gnu` and TensorRT libraries are located
at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` will
fail to detect TensorRT.

There is no debug/warning message to tell developer that TensorRT
is failed to be detected.

In later version of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is
defined at `NvInferVersion.h` instead of `NvInfer.h`, so add
compatibility fix.

* Fix TensorRT variables in CMake

1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}`
2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}`

Manually type path may locate incorrect path of TensorRT. Use the
paths detected by system instead.

* Fix TensorRT library path

1. Add new variable - `${TENSORRT_LIBRARY_DIR}`
2. Fix TensorRT library path

inference_lib.cmake and setup.py.in need the path of TensorRT library
instead of the file of TensorRT library, so add new variable to fix it.

* Add more general search rule for TensorRT

Let system detect architecture instead of manually assign it, so
replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`.

* Add more general search rule for TensorRT

Remove duplicate search rules for TensorRT libraries. Use
`${TENSORRT_LIBRARY_DIR}` to get full path of libnvinfer.so

test=develop

330b173c
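The commit message above notes that later TensorRT versions (e.g. v6) define `NV_TENSORRT_MAJOR` in `NvInferVersion.h` instead of `NvInfer.h`. The actual fix lives in CMake and is not shown on this page; the following is only a Python sketch of that compatibility check. The function name `detect_tensorrt_major` and the order in which the two headers are tried are assumptions made for illustration.

```python
import re
from pathlib import Path
from typing import Optional

# Matches a line such as "#define NV_TENSORRT_MAJOR 6".
_MAJOR_RE = re.compile(r"^\s*#\s*define\s+NV_TENSORRT_MAJOR\s+(\d+)", re.MULTILINE)


def detect_tensorrt_major(include_dir: str) -> Optional[int]:
    """Return the NV_TENSORRT_MAJOR value found under include_dir, or None.

    Assumption: NvInferVersion.h (newer layout) is tried first, then
    NvInfer.h (older layout). This mirrors the idea in the commit
    message, not the project's CMake code.
    """
    for header in ("NvInferVersion.h", "NvInfer.h"):
        path = Path(include_dir) / header
        if not path.is_file():
            continue
        match = _MAJOR_RE.search(path.read_text(errors="ignore"))
        if match:
            return int(match.group(1))
    return None


# Returning None makes a failed detection explicit, so the caller can warn.
major = detect_tensorrt_major("/usr/include/x86_64-linux-gnu")
if major is None:
    print("TensorRT headers not found or version macro missing")
else:
    print(f"TensorRT major version: {major}")
```

The include path in the usage lines is the one the commit message gives for the NVIDIA official Docker image; on other systems it would differ.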

fix sporadically hang issue on windows(#21201) · d8b6cf2b

Committed by liuwei1031 on Nov 18, 2019

cudaStreamSynchronize randomly hang when used in multi-thread environment, replace it with cudaStreamQuery API on windows

d8b6cf2b
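The commit above replaces a blocking wait (`cudaStreamSynchronize`) with repeated status queries (`cudaStreamQuery`) on Windows. CUDA cannot be called from a plain Python snippet, so the sketch below only illustrates the general poll-instead-of-block pattern. The `is_done` callable is a hypothetical stand-in for "the query reports that the stream has finished", and the timeout is an addition of this sketch, not something the commit describes.

```python
import threading
import time
from typing import Callable


def wait_by_polling(is_done: Callable[[], bool],
                    poll_interval: float = 0.001,
                    timeout: float = 5.0) -> bool:
    """Poll is_done() until it returns True or the timeout expires.

    Returns True if the work finished, False if the timeout was hit.
    """
    deadline = time.monotonic() + timeout
    while not is_done():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)  # yield instead of spinning at full speed
    return True


# Simulate asynchronous work that completes after a short delay.
done = threading.Event()
worker = threading.Timer(0.05, done.set)
worker.start()

finished = wait_by_polling(done.is_set)
worker.join()
print("finished:", finished)
```

Polling trades a little latency and CPU time for the ability to keep control in the calling thread, which is one plausible reason to prefer it when a blocking call hangs sporadically.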

add store_true to use_paddlecloud argument in launch.py (#21168) · 3fe63d67
Committed by danleifeng on Nov 18, 2019

3fe63d67
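For readers unfamiliar with `store_true`, here is a minimal `argparse` sketch of a boolean flag declared that way. Only the flag name `use_paddlecloud` comes from the commit title; the parser, the help text, and the rest of `launch.py` are not shown on this page and are assumptions here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser with one boolean flag (illustrative only)."""
    parser = argparse.ArgumentParser(description="launcher sketch")
    parser.add_argument(
        "--use_paddlecloud",
        action="store_true",  # flag present -> True, flag absent -> False
        help="hypothetical help text: enable the paddlecloud code path",
    )
    return parser


parser = build_parser()
print(parser.parse_args([]).use_paddlecloud)                     # False
print(parser.parse_args(["--use_paddlecloud"]).use_paddlecloud)  # True
```

With `action="store_true"` the flag takes no value on the command line. A general `argparse` pitfall this style avoids is `type=bool`, where any non-empty string, including `"False"`, is parsed as `True`.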

modified error message and API doc for channel_last supported Op (#21002) · 9cbe7bcc

Committed by Zhang Ting on Nov 18, 2019

* modified error message for conv and conv_transpose, test=develop

* modified doc of conv and conv_transpose op, test=develop

* modified the expression for error message, test=develop

* modified error message for group_norm op, test=develop

* modified detail of Attr(data_format) or Attr(data_layout)

* add ValueError in API doc for maxout op, test=develop

9cbe7bcc

Control flow API: switch_case (#21103) · 92475282

Committed by liym27 on Nov 18, 2019

* add API switch_case. test=develop

add Nest

* modify code according to reviews:
1.Attr(branch_index) support 'uint8' and 'int64' besides 'int32'.
2.remove useless code.
test=develop

* replace fluid.layers.data with fluid.data and polish API document. test=develop

92475282

TRT int8: refine trt int8 for dynamic range set (#21112) · 65f70525
Committed by Zhaolong Xing on Nov 18, 2019
```
* refine trt int8 for dynamic range set
test=develop

* refine trt int8
test=develop
```
65f70525
Fix the error of init variable in StaticRNN when stop_gradient=ON (#21118) · 56b5d147
Committed by guofei on Nov 18, 2019

56b5d147
Fix INF bug of softmax_cross_entropy_op (#21165) · 3c98ec90
Committed by WangXi on Nov 18, 2019

3c98ec90
fix dygraph trace bug, test=develop (#21193) · 0f30d3a2
Committed by Zeng Jinle on Nov 18, 2019

0f30d3a2

Add CI check for error message writing specification (#21107) · 7269ffe3

Committed by Chen Weihang on Nov 18, 2019

* add ci check for error message specification, test=develop, test=document_fix

* replace spec url & refine failed message, test=develop, test=document_fix

7269ffe3

Nov 16, 2019 · 1 commit

Support more ops in post training quantization, test=develop (#21073) · 00b11a4a

Committed by juncaipeng on Nov 16, 2019

* Support  more ops in post training quantization, and save the output scale in quantized op.
* Update docs in post training quantization and qat

00b11a4a

Nov 15, 2019 · 5 commits
- fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052) · 23876de5
  Committed by xujiaqi01 on Nov 15, 2019
```
* fix cache table bug
* add save_paddle_inference_model
* fix hdfs util bug
* test=develop
```
  23876de5
- Fix jit tls issue (#21151) · eec9c9cb
  Committed by Yihua Xu on Nov 15, 2019
  
  eec9c9cb
- fix cmake fails on inference_download_and_uncompress (#21185) · a9d4eed3
  Committed by GaoWei8 on Nov 15, 2019
```
* solve cmake fails on inference_download_and_uncompress
test=develop

* solve cmake fails on inference_download_and_uncompress
test=develop
```
  a9d4eed3
- add copy table (#21086) · 9e045170
  Committed by xujiaqi01 on Nov 15, 2019
```
* copy some feasigns and corresponding embeddings from one sparse table to another
* copy all feasigns and corresponding embeddings from one sparse table to another
* copy all dense params from one table to another
* copy some local vars to other local vars
```
  9e045170
- Refine edit distance cn (#21121) · aeb88791
  Committed by ruri on Nov 15, 2019
  
  aeb88791
Nov 14, 2019 · 8 commits
- fix elementwise_mod float point kernel. test=develop (#21183) · 98b59cb8
  Committed by Kaipeng Deng on Nov 14, 2019
  
  98b59cb8
- disable reshape inplace in dygraph model; test=develop (#21157) · 835119c7
  Committed by hong on Nov 14, 2019
  
  835119c7
- Add friendly dygraph trace API (#21091) · 5fdfbe34
  Committed by Zeng Jinle on Nov 14, 2019
```
* friendly trace interface, test=develop

* refine TracedLayer, test=develop

* add some docs, test=develop
```
  5fdfbe34
- add paddle enforce count sh, test=develop, test=document_fix (#21178) · 44a0a4ad
  Committed by Chen Weihang on Nov 14, 2019
  
  44a0a4ad
- fix detail error message error, test=develop (#21170) · 4bd94636
  Committed by Chen Weihang on Nov 14, 2019
  
  4bd94636
- Fix warpctc in padding mode. (#21033) · cfdd1fc2
  Committed by whs on Nov 14, 2019
  
  cfdd1fc2
- Add examples for error message writing specification - NotFound, OutOfRange,... · 8da0cd53
  Committed by Chen Weihang on Nov 14, 2019
```
Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134)

* add examples for error msg spec, test=develop

* change ENFORCE to ENFORCE_**, test=develop

* add more already exists examples, test=develop
```
  8da0cd53
- Improve topk performance. (#21087) · b93870e6
  Committed by zhaoyuchen2018 on Nov 13, 2019
```
* Improve topk performance.

give 200000 data to compute topk,
before opt: cost 1s
after opt: cost 0.0028s.

* Refine return value.
* Add cuda util functions.
* Fix ComputeBlockSize bug & refine comments.
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
```
  b93870e6
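The `Improve topk performance. (#21087)` entry above reports timings for computing top-k over 200000 values. The optimization itself is in CUDA and is not shown on this page. As a rough CPU-side illustration of what a top-k operation returns, the sketch below compares a full sort with a heap-based selection from Python's standard library. Both function names are hypothetical, and neither is the technique used in the commit.

```python
import heapq
import random
from typing import List, Sequence


def topk_by_sort(values: Sequence[float], k: int) -> List[float]:
    """Sort everything in descending order, then keep the first k items."""
    return sorted(values, reverse=True)[:k]


def topk_by_heap(values: Sequence[float], k: int) -> List[float]:
    """Select the k largest items without fully sorting the input."""
    return heapq.nlargest(k, values)


random.seed(0)
data = [random.random() for _ in range(200000)]  # same input size as the commit note

# Both approaches return the k largest values in descending order.
print(topk_by_sort(data, 5) == topk_by_heap(data, 5))
```

For small `k`, selecting instead of fully sorting does less work, which is the general idea behind most top-k optimizations regardless of the device.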

PaddlePaddle / Paddle · synced successfully more than 1 year ago