Commits · 0a4002f5dcd901cf799952153fa7c0eb418a3a56 · BaiXuePrincess / Paddle

06 Dec, 2019 1 commit

CHERRY_PICK: Better TensorRT support (#20858) (#21578) · 0a4002f5

Committed by Zhaolong Xing on Dec 06, 2019

* Fix TensorRT detection bug

1. Add a new search path for TensorRT in tensorrt.cmake
2. Add a better debug message
3. Fix the TensorRT version detection bug

In the official NVIDIA Docker image, TensorRT headers are located at
`/usr/include/x86_64-linux-gnu` and TensorRT libraries are located
at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` fails
to detect TensorRT.

Previously there was no debug/warning message to tell the developer
that TensorRT detection had failed.

In later versions of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is
defined in `NvInferVersion.h` instead of `NvInfer.h`, so add a
compatibility fix.
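
The version-detection fallback can be illustrated outside CMake. The Python sketch below is not the code in tensorrt.cmake. It assumes the macro appears as a plain `#define NV_TENSORRT_MAJOR <n>` line. The function name `detect_tensorrt_major` and the header search order are hypothetical.

```python
import re
from pathlib import Path

# Header names come from the commit message; trying the newer header first
# is an assumption made for this sketch.
VERSION_HEADERS = ("NvInferVersion.h", "NvInfer.h")
MAJOR_RE = re.compile(
    r"^\s*#\s*define\s+NV_TENSORRT_MAJOR\s+(\d+)", re.MULTILINE
)


def detect_tensorrt_major(include_dir):
    """Return the TensorRT major version found under include_dir, or None."""
    for name in VERSION_HEADERS:
        header = Path(include_dir) / name
        if not header.is_file():
            continue
        match = MAJOR_RE.search(header.read_text(errors="ignore"))
        if match:
            return int(match.group(1))
    # Nothing matched: the caller should report this instead of failing silently.
    return None
```

Returning `None` rather than raising lets the caller print the kind of warning message that this commit adds when detection fails.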

* Fix TensorRT variables in CMake

1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}`
2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}`

A manually typed path may point to the wrong TensorRT location. Use the
paths detected by the build system instead.

* Fix TensorRT library path

1. Add new variable - `${TENSORRT_LIBRARY_DIR}`
2. Fix TensorRT library path

inference_lib.cmake and setup.py.in need the directory that contains the
TensorRT library rather than the library file itself, so add a new variable
for it.

* Add more general search rule for TensorRT

Let the build system detect the architecture instead of assigning it
manually, so replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`.
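
To show why the detected architecture generalizes the rule, here is a small Python sketch of the candidate directories such a search might visit. It does not reproduce the actual search list in tensorrt.cmake. The helper name `tensorrt_search_dirs` and its parameters are hypothetical.

```python
from pathlib import Path


def tensorrt_search_dirs(root, library_arch, kind):
    """List candidate directories for TensorRT headers or libraries.

    kind is "include" or "lib". library_arch is a multiarch triplet such as
    "x86_64-linux-gnu", or "" when the platform defines none.
    """
    base = Path(root) / kind
    candidates = [base]
    if library_arch:
        # Multiarch layouts keep files one level deeper, under the triplet.
        candidates.append(base / library_arch)
    return candidates


# With a detected triplet the same rule covers other platforms, for example:
# tensorrt_search_dirs("/usr", "aarch64-linux-gnu", "lib")
```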

* Add more general search rule for TensorRT

Remove duplicate search rules for TensorRT libraries. Use
`${TENSORRT_LIBRARY_DIR}` to get the full path of libnvinfer.so.

test=release/1.6

05 Dec, 2019 2 commits
- fix xbyak control by -DWITH_XBYAK, test=develop (#21560) · 2afe928a
  Committed by zhouwei25 on Dec 05, 2019
- let WITH_XBYAK be adjustable by -D when running cmake, test=develop (#21538) · e3dd13b1
  Committed by zhouwei25 on Dec 05, 2019
04 Dec, 2019 1 commit
- [cherry-pick] NV JETSON support and auto_growth strategy for inference. (#21500) · 20a09375
  Committed by Zhaolong Xing on Dec 04, 2019
```
* ADD NV JETSON SUPPORT
test=release/1.6

* CHERRY_PICK: specify the auto growth allocator for inference.
test=release/1.6
```
21 Nov, 2019 1 commit

Cherry-pick error type support for release1.6 (#21294) · 974b8a83

Committed by Chen Weihang on Nov 21, 2019

* delete paddle infershape enforce macro (#20832)

* Polish and arrange code in enforce.h (#20901)

* Enrich the type of error and declare the error type interfaces (#21024)

* Enrich the type of error and declare the error type interfaces, test=develop

* adjust tests to adapt new form, test=develop

* add inference deps with error_codes.pb.h, test=develop

* restore stack iter start pos, test=develop

* polish code based review comments, test=develop

* Add dependency for error_codes.proto (#21084)

* fix activation_functions deps, test=develop, test=document_fix

* add error_codes_proto deps, test=develop, test=document_fix

* try delete enforce.h, test=develop, test=document_fix

* change cuda enforce & add example (#21142)
test=release/1.6

30 Oct, 2019 1 commit
- [cherry-pick] Add support to gcc8, add docker env (#20892) · 6fb04e8a
  Committed by liu zhengxi on Oct 30, 2019
```
* add support to gcc8, add docker env
* remove the warning issue
```
21 Oct, 2019 1 commit
- [Cherry-pick 1.6] Fix DGC test and DGC nan bug (#20708) · 2378aa8a
  Committed by WangXi on Oct 21, 2019
14 Oct, 2019 1 commit
- support converting tensor to cudf, depends on dlpack, test=release/1.6 (#20611) · 5da8db61
  Committed by 633WHU on Oct 14, 2019
08 Oct, 2019 1 commit
- trainer from dataset fetch targets (#19760) (#20182) · 546a0d3c
  Committed by tangwei12 on Oct 08, 2019
```
add executor.FetchHandler for train/infer from the dataset
```
03 Oct, 2019 2 commits

[cherry-pick] Add multihead op for ernie opt (#19933) (#20151) · c5fe228c

Committed by zhaoyuchen2018 on Oct 03, 2019

test=release/1.6

* Add multihead op for ernie opt

* Refine softmax

* Refine kernel.

* Refine cuda kernel

* Refine cuda version

* Refine cmake

fix conv2d and conv3d: (#20042) (#20121) · 2faa38cd

Committed by liym27 on Oct 03, 2019

1. support asymmetric padding;
2. support the padding algorithms "SAME" and "VALID";
3. support channel_last: data_format NHWC and NDHWC;
4. update the docs of the Python API and C++;
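
As a rough illustration of items 1 and 2, the Python sketch below computes the output size and the per-side padding for one spatial dimension. It follows the common TensorFlow-style convention for "SAME" (output size `ceil(in / stride)`, any odd leftover padding placed at the end). Whether the Paddle kernels treat dilation identically is an assumption here, and the function name `conv_output_and_padding` is hypothetical.

```python
import math


def conv_output_and_padding(in_size, kernel, stride, padding, dilation=1):
    """Return (out_size, pad_before, pad_after) for one spatial dimension.

    padding may be "SAME", "VALID", an int (symmetric padding), or a
    (pad_before, pad_after) pair (asymmetric padding).
    """
    effective_kernel = (kernel - 1) * dilation + 1
    if padding == "VALID":
        pad_before = pad_after = 0
    elif padding == "SAME":
        # Assumed convention: pad just enough to reach ceil(in_size / stride).
        target = math.ceil(in_size / stride)
        total = max((target - 1) * stride + effective_kernel - in_size, 0)
        pad_before = total // 2
        pad_after = total - pad_before
    elif isinstance(padding, int):
        pad_before = pad_after = padding
    else:
        pad_before, pad_after = padding
    out_size = (in_size + pad_before + pad_after - effective_kernel) // stride + 1
    return out_size, pad_before, pad_after
```

With an input of size 6, kernel 3 and stride 2, "SAME" needs only one padded element in total, so the two sides differ. This is the kind of case where asymmetric padding is required.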

test=release/1.6

27 Sep, 2019 1 commit

update operator compatible info, test=develop (#19978) · 01b9d079

Committed by 石晓伟 on Sep 27, 2019

* update operator compatible info, test=develop

* revert cmake/version.cmake, test=develop

* add unit_tests and fix bugs, test=develop

* update ../paddle/fluid/framework/framework.proto, test=develop

* fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop

* update paddle/fluid/framework/version_test.cc, test=develop

* add comments and rename interfaces, test=develop

20 Sep, 2019 1 commit
- Add dgc source code to bos platform. (#19892) · ae593e57
  Committed by gongweibao on Sep 20, 2019
```
* add dgc.tgz to bos
```
19 Sep, 2019 1 commit

Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6

Committed by Yiqun Liu on Sep 19, 2019

* Add fc_elementwise_layernorm_fuse pass and unittest.

* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop

* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.

* Add the setting of attrs in the definition of binary_op.
test=develop

* Add comment.

* Implement the unittest.
test=develop

* Change the unittest name of layer_norm.
test=develop
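
For reference, the computation that such a fused op has to reproduce can be sketched in plain Python. This is only the unfused math (fc, then elementwise add, then layer normalization over the last dimension), not the GPU kernel from the commit. The function names, the `eps` default and the choice to normalize each row are assumptions.

```python
import math


def fc(x, weight, bias):
    """Fully connected layer: x is [rows][in], weight is [in][out]."""
    return [
        [sum(xi * weight[i][j] for i, xi in enumerate(row)) + bias[j]
         for j in range(len(bias))]
        for row in x
    ]


def layer_norm(row, scale, shift, eps=1e-5):
    """Normalize one row to zero mean and unit variance, then scale and shift."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * s + b for v, s, b in zip(row, scale, shift)]


def fc_elementwise_layernorm(x, weight, bias, residual, scale, shift):
    """Reference result of fc -> elementwise_add -> layer_norm, row by row."""
    out = []
    for fc_row, res_row in zip(fc(x, weight, bias), residual):
        summed = [a + b for a, b in zip(fc_row, res_row)]
        out.append(layer_norm(summed, scale, shift))
    return out
```

A fused kernel can produce the same values while avoiding the two intermediate tensors that the separate ops would write out.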

17 Sep, 2019 1 commit
- add deformable conv v1 op and cpu version of deformable conv v2 (#18500) · 00efd1d8
  Committed by chengjuntao on Sep 17, 2019
```
* add deformable conv v1 op, test=develop
```
16 Sep, 2019 1 commit
- fix the dependencies of third party and inference lib (#19684) · b5a5d93b
  Committed by zhouwei25 on Sep 16, 2019
11 Sep, 2019 2 commits

Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320

Committed by Huihuang Zheng on Sep 11, 2019

TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, removing it can improve memory performance.

We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for a stream, so no singleton is needed.
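
The idea of releasing memory from a stream callback can be sketched with a toy model. The Python code below does not use CUDA and is not the Paddle implementation. `FakeStream` and `StreamScopedAllocator` are hypothetical stand-ins that only show the ordering guarantee: a release queued on the stream runs after all work queued before it.

```python
from collections import deque


class FakeStream:
    """Stand-in for a CUDA stream: work and callbacks run in FIFO order."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, fn):
        self._queue.append(fn)

    def synchronize(self):
        while self._queue:
            self._queue.popleft()()


class StreamScopedAllocator:
    """Hands out buffers and frees them via a callback queued on the stream."""

    def __init__(self, stream):
        self._stream = stream
        self.live = set()  # ids of buffers that have not been released yet
        self._next_id = 0

    def allocate(self, size):
        buf_id = self._next_id
        self._next_id += 1
        self.live.add(buf_id)
        return buf_id, bytearray(size)

    def release_after_stream(self, buf_id):
        # The release is queued behind work already on the stream, so the
        # buffer outlives everything that was enqueued before this call.
        self._stream.enqueue(lambda: self.live.discard(buf_id))
```

Each allocator instance is tied to one stream, so in this sketch no process-wide singleton is needed to track temporary buffers.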

Also added data_feed_proto to operator to fix CI in CPU compilation

Implement the GPU kernel of fc operator (#19687) · a65c728e

Committed by Yiqun Liu on Sep 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Move the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

10 Sep, 2019 1 commit
- upgrade ngraph to support mkldnn v1.0 (#19689) · 87f13f75
  Committed by baojun on Sep 09, 2019
07 Sep, 2019 1 commit

remove -Wmaybe-uninitialized warning (#19653) · bcddbc78

Committed by Tao Luo on Sep 07, 2019

* remove -Wmaybe-uninitialized warning

test=develop

* remove uninitialized op_handle_ in scale_loss_grad_op_handle.cc

test=develop

04 Sep, 2019 3 commits
- fix inference_lib deps error (#19632) · 3aaea4c5
  Committed by Tao Luo on Sep 04, 2019
```
test=develop
```
- fix the warning caused by mismatched arguments of flags.cmake (#19576) · 9c885708
  Committed by liuwei1031 on Sep 04, 2019
- Enable online compilation of openblas on windows (#19602) · e79cf3bc
  Committed by silingtong123 on Sep 04, 2019
```
* test=develop, Support for online compilation of openblas

* test=develop, Modify the prefix of openblas static library
```
31 Aug, 2019 1 commit

Paddlebox Framework (#18982) · c756b5d2

Committed by hutuxian on Aug 31, 2019

* Support looking up embeddings from BoxPS.
* Add a _pull_box_sparse op, for now this op is not exposed to users.
* Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on.
* Add 'BoxPSDataset' in python code.
* Add a compile option WITH_BOX_PS and a macro PADDLE_WITH_BOX_PS.
* Add UT.
* For more details, please refer to: https://github.com/PaddlePaddle/Paddle/pull/18982

30 Aug, 2019 1 commit
- add dynamic C runtime support on windows, test=develop (#19502) · d6cb1a41
  Committed by liuwei1031 on Aug 30, 2019
20 Aug, 2019 1 commit

Use sparse matrix to implement fused emb_seq_pool operator (#19064) · b9203958

Committed by Yihua Xu on Aug 20, 2019

* Implement the operator with sparse matrix multiply

* Update the URL of mklml library.

test=develop

* Disable the MKLML implementation on non-Linux platforms.

test=develop

* Ignore the deprecated status for windows

test=develop

19 Aug, 2019 3 commits
- merge develop to solve conflict, also fix API doc, test=develop (#18823) · 5b6673c4
  Committed by Zeng Jinle on Aug 19, 2019
- fix compilation issue in windows vs2017 (#19183) · 50582071
  Committed by liuwei1031 on Aug 19, 2019
```
* fix compilation issue in windows vs2017, test=develop

* fix gtest lib not found issue, test=develop
```
- fix the bug that PYTHON_EXECUTABLE does not exist (#19225) · 2f0dc846
  Committed by zhouwei25 on Aug 19, 2019
```
* test=develop,fix the inference library compilation bug on windows

* test=develop,Fix the inference library compilation bug on windows

* test=develop,fix the bug that PYTHON_EXECUTABLE not exists
```
14 Aug, 2019 2 commits
- Fix the inference library compilation bug on windows (#19190) · ef46918a
  Committed by zhouwei25 on Aug 14, 2019
```
* test=develop,fix the inference library compilation bug on windows
```
- remove WITH_FAST_MATH option (#19149) · 32a670ba
  Committed by Tao Luo on Aug 14, 2019
```
test=develop
```
12 Aug, 2019 1 commit
- add tensorrt support for windows (#19084) · 80b7ef6f
  Committed by wopeizl on Aug 12, 2019
```
* add tensorrt support for windows
```
01 Aug, 2019 1 commit

[PROPOSAL] Add support for dynamic code analysis (Sanitizers) (#18303) · e1b5833b

Committed by Krzysztof Binias on Aug 01, 2019

* Add support for dynamic code analysis (Sanitizers)
test=develop

* Move options to one option

test=develop

* Missing check

test=develop

31 Jul, 2019 1 commit
- upgrade ngraph version and simplify ngraph engine (#18853) · adcfc53b
  Committed by baojun on Jul 30, 2019
```
* upgrade ngraph to v0.24 test=develop

* simplify io test=develop
```
29 Jul, 2019 1 commit
- Try to modify external gflags to solve CI compilation (#18872) · 0d3f16f5
  Committed by Huihuang Zheng on Jul 29, 2019
24 Jul, 2019 1 commit
- remove package.cmake (#18760) · 8de5aa1b
  Committed by Tao Luo on Jul 24, 2019
```
test=develop
```
23 Jul, 2019 1 commit
- remove unused cmake file (#18744) · 0ae45f0b
  Committed by Tao Luo on Jul 23, 2019
```
test=develop
```
22 Jul, 2019 1 commit
- remove unused gzstream.cmake (#18705) · c457a69d
  Committed by Tao Luo on Jul 22, 2019
```
test=develop
```
19 Jul, 2019 2 commits
- MKL-DNN upgrade to 0.20 (#18370) · 0d8e6c9b
  Committed by Jacek Czaja on Jul 19, 2019
```
test=develop
```
- Change to use brpc rdma branch instead of personal branch. (#18683) · ec1000cc
  Committed by gongweibao on Jul 19, 2019

BaiXuePrincess / Paddle is in sync with the fork source project