提交 · b639a882c3d20131131456f9ccf7292c39bf4fec · 机器未来 / Paddle

26 11月, 2019 2 次提交
- T
  make CUDA_ARCH_NAME default Auto (#21352) · d8e7d252
  由 Tao Luo 提交于 11月 26, 2019
```
* make CUDA_ARCH_NAME default Auto

test=develop

* refine warning

test=develop
```
  d8e7d252
- S
  
  package the CAPI inference library and third_party (#21299) · 4b429c19
  由 silingtong123 提交于 11月 26, 2019
  
  4b429c19
25 11月, 2019 2 次提交
- Z
  
  remove warning LNK4006 and warning LNK4221 (#21226) · 345b67b5
  由 zhouwei25 提交于 11月 25, 2019
  
  345b67b5
- Z
  
  Cache 3rd source code, improve stability, reduce the compilation time (#21190) · 341dee06
  由 zhouwei25 提交于 11月 25, 2019
  
  341dee06
20 11月, 2019 1 次提交
- Z
  Change GCC version to be 8.2 in Dockerfile.GCC8 (#21222) · 925280b9
  由 Zeng Jinle 提交于 11月 20, 2019
```
* make Docker to gcc 8.2, test=develop

* add -std=c11 to grpc.cmake, test=develop
```
  925280b9
19 11月, 2019 1 次提交
- Z
  
  Determine whether to copy and link inference lib by ON_INFER (#20931) · c0dcb090
  由 zhouwei25 提交于 11月 19, 2019
  
  c0dcb090
18 11月, 2019 3 次提交

由 Zeng Jinle 提交于 11月 18, 2019

* fix warnings oof gcc 8 compilation, test=develop

* fix boost::bad_get, test=develop

* refine PADDLE_ENFORCE, test=develop

cdb3d279

Z
fix bug when build openblas with a computer that has installed openblas... · 5d821578
由 zhouwei25 提交于 11月 18, 2019
```
fix bug when build openblas with a computer that has installed openblas before,test=develop (#21160)
```
5d821578

Better TensorRT support (#20858) · 330b173c

由 Jeng Bai-Cheng 提交于 11月 18, 2019

* Fix TensorRT detection bug

1. Add new search path for TensorRT at tensorrt.cmake
2. Add better debug message
3. Fix the bug of detection of TensorRT version

In NVIDIA official docker image, TensorRT headers are located at
`/usr/include/x86_64-linux-gnu` and TensorRT libraries are located
at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` will
fail to detect TensorRT.

There is no debug/warning message to tell developer that TensorRT
is failed to be detected.

In later version of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is
defined at `NvInferVersion.h` instead of `NvInfer.h`, so add
compatibility fix.

* Fix TensorRT variables in CMake

1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}`
2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}`

Manually type path may locate incorrect path of TensorRT. Use the
paths detected by system instead.

* Fix TensorRT library path

1. Add new variable - `${TENSORRT_LIBRARY_DIR}`
2. Fix TensorRT library path

inference_lib.cmake and setup.py.in need the path of TensorRT library
instead of the file of TensorRT library, so add new variable to fix it.

* Add more general search rule for TensoRT

Let system detect architecture instead of manually assign it, so
replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`.

* Add more general search rule for TensorRT

Remove duplicate search rules for TensorRT libraries. Use
`${TENSORRT_LIBRARY_DIR}` to get full path of libnvinfer.so

test=develop

330b173c

12 11月, 2019 1 次提交
- Z
  
  Remove useless code of openblas and fix the previous incorrect message (#21092) · d2573550
  由 zhouwei25 提交于 11月 12, 2019
  
  d2573550
11 11月, 2019 1 次提交
- M
  Add Shallow clone to ExternalProjects (#21060) · 6cc544aa
  由 Michał Gallus 提交于 11月 11, 2019
```
test=develop
```
  6cc544aa
08 11月, 2019 3 次提交

Add transpose2 INT8 for mkl-dnn (#19424) · 77c20835

由 joanna.wozna.intel 提交于 11月 08, 2019

* Add transpose2 INT8 for mkl-dnn

test=develop

* Fix test_transpose_int8_mkldnn

test=develop

* Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"

This reverts commit 34011bdb, reversing
changes made to 2ce6473f.

* Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2""

This reverts commit 23754dd7.

* Add template to TransposeMKLDNNHandler

test=develop

* Resolve conflict

test=develop

* Restore get_size and refactor

test=develop

77c20835

Z

move more third party library related logic to third_party.cmake (#20927) · 89bc18ee
由 zhouwei25 提交于 11月 08, 2019

89bc18ee

Enrich the type of error and declare the error type interfaces (#21024) · 7ee25189

由 Chen Weihang 提交于 11月 08, 2019

* Enrich the type of error and declare the error type interfaces, test=develop

* adjust tests to adapt new form, test=develop

* add inference deps with error_codes.pb.h, test=develop

* restore stack iter start pos, test=develop

* polish code based review comments, test=develop

7ee25189

05 11月, 2019 1 次提交

Support NoNeedBufferVarsInference in dygraph backward (#20868) · 878a40f5

由 Zeng Jinle 提交于 11月 05, 2019

* support no need buffer vars in dygraph, test=develop

* fix inference compilation error, test=develop

* update no_need_buffer_vars_inference, test=develop

* add unittests for no_need_buffer_vars_context, test=develop

* refine no_need_buffer_vars by return ref, test=develop

* polish some codes, test=develop

878a40f5

04 11月, 2019 1 次提交
- Z
  
  fix mklml and cblas bug,test=develop (#20970) · 394edd86
  由 zhouwei25 提交于 11月 04, 2019
  
  394edd86
31 10月, 2019 2 次提交

GradMaker for dygraph (#19706) · 8c4573a3

由 hong 提交于 10月 31, 2019

* refactor dygraph,test=develop

* fix failed unittest,test=develop

* polish code,test=develop

* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop

* polish vlog and profiler, test=develop

* try to fix preceding ops order,test=develop

* test transformer in windows ci, test=develop

* use python c-api to speed up tracer.trace,test=develop

* test=develop, fix docker with paddle nccl problem

* test=develop, add ut for debug string and gradient_accumulator

* test=develop, add tests for layer/gradient_accumulator/prepared_op

* test=develop, fix complie error for test_prepared_op

* test=develop, add more ut for dygraph

* test=develop, create API.spec for dygraph api change

* optimize grad maker; test=develop

* optimize grad maker

* test

* grad make optim; test=develop

* fix unittest bugs; test=develop

* add dygraph grad op maker and split_op

* grad op maker refactor; test=develop

* add dygraph grad maker; test=develop

* fix op deformable_conv_v1_op bug; test=develop

* fix deformable_conv prroi pool bugs;

* fix new op grad op maker bug; test=develop

* fix split by ref bug; test=develop

* fix dygraph auto prune bug; test=develop

* fix test_trace bug; test=develop

* fix fused emb seq pool bug; test=develop

* remove useless code in op_desc file; test=develop

* remove useless code, StrVarBaseNode; test=develop

* fix review issues; test=develop

* fix rank_loss grad maker; test=develop

* remove flag in VarBase; test=develop

* fix distributed_notify_op compile bug ; test=develop

* fix reshape op double grad; test=develop

* fix expand as op; test=develop

* add impertive type_defs.h for demo_train; test=develop

* fix inference lib cmake; test=develop

* fix inference lib; test=develop

* fix infernce_lib; test=develop

* fix inference cmake; test=develop

* fix inference lib; test=develop

* fix inference lib; test=develop

* remove condition dygraph grad maker, modify local name; test=develop

* fix split grad maker bug; test=develop

* fix pyramid_op bug; test=develop

* change travis time out limit; test=develop

* restore travis; test=develop

* change timeout limit; test=develop

8c4573a3

Z

Integration of third_party compilation structure (#20887) · b7417610
由 zhouwei25 提交于 10月 31, 2019

b7417610

28 10月, 2019 1 次提交
- W
  
  remove the warning issue test=develop (#20718) · 3b31b74e
  由 wopeizl 提交于 10月 28, 2019
  
  3b31b74e
22 10月, 2019 1 次提交
- Z
  
  Cmake_generotor support has been added to enable multi-version VS support (#20755) · bcd77e14
  由 zhouwei25 提交于 10月 22, 2019
  
  bcd77e14
18 10月, 2019 2 次提交
- W
  add support to gcc8, add docker env test=develop (#19807) · 9e594823
  由 wopeizl 提交于 10月 18, 2019
```
* add support to gcc8, add docker env test=develop
```
  9e594823
- W
  
  Fix dgc nan by stripping nccl from sparseReduce. (#20630) · 507afa8a
  由 WangXi 提交于 10月 17, 2019
  
  507afa8a
15 10月, 2019 1 次提交
- 石
  
  fix version.cmake, test=develop (#20606) · 48b27229
  由石晓伟提交于 10月 15, 2019
  
  48b27229
14 10月, 2019 1 次提交

Dlpack support (#20039) · 12e4be03

由 633WHU 提交于 10月 14, 2019

* support dlpack to tensor and implement python interface test=develop

* add unittest for _to_dlpack and from_dlpack test=develop

12e4be03

07 10月, 2019 1 次提交
- T
  trainer from dataset fetch targets (#19760) · c9139c3d
  由 tangwei12 提交于 10月 07, 2019
```
add executor.FetchHandler for train/infer from the dataset
```
  c9139c3d
02 10月, 2019 1 次提交

Add multihead op for ernie opt (#19933) · e8673668

由 zhaoyuchen2018 提交于 10月 02, 2019

* Add multihead op for ernie opt

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine softmax

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine kernel.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine cuda kernel

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine cuda version

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine cmake

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

e8673668

29 9月, 2019 1 次提交

fix conv2d and conv3d: (#20042) · 3aa331d9

由 liym27 提交于 9月 29, 2019

1.support asymmetric padding;
    2.support padding algorithm:"SAME" and "VALID";
    3.support channel_last: data_format NHWC and NDHWC;
    4.change doc of python API and c++;

    test=develop, test=document_preview

3aa331d9

27 9月, 2019 1 次提交

石

update operator compatible info, test=develop (#19978) · 01b9d079

由石晓伟提交于 9月 27, 2019

* update operator compatible info, test=develop

* revert cmake/version.cmake, test=develop

* add unit_tests and fix bugs, test=develop

* update ../paddle/fluid/framework/framework.proto, test=develop

* fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop

* update paddle/fluid/framework/version_test.cc, test=develop

* add comments and rename interfaces, test=develop

01b9d079

20 9月, 2019 1 次提交
- G
  Add dgc source code to bos platform. (#19892) · ae593e57
  由 gongweibao 提交于 9月 20, 2019
```
* add dgc.tgz to bos
```
  ae593e57
19 9月, 2019 1 次提交

Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6

由 Yiqun Liu 提交于 9月 19, 2019

* Add fc_elementwise_layernorm_fuse pass and unittest.

* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop

* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.

* Add the setting of attrs in the definition of binary_op.
test=develop

* Add comment.

* Implement the unittest.
test=develop

* Change the unittest name of layer_norm.
test=develop

3cd985a6

17 9月, 2019 1 次提交
- C
  add deformable conv v1 op and cpu version of deformable conv v2 (#18500) · 00efd1d8
  由 chengjuntao 提交于 9月 17, 2019
```
* add deformable conv v1 op, test=develop
```
  00efd1d8
16 9月, 2019 1 次提交
- Z
  
  fix the dependencies of third party and inference lib (#19684) · b5a5d93b
  由 zhouwei25 提交于 9月 16, 2019
  
  b5a5d93b
11 9月, 2019 2 次提交

Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320

由 Huihuang Zheng 提交于 9月 11, 2019

TemporaryAllocator is a singleton used for allocating memory for Cudnn. Since it is a singleton, we can delete it for better performance in memory.

We replace TemporaryAllocator by CUDADeviceContextAllocator and CUDADeviceContextAllocation, which uses stream callback to delete the memory allocated for the stream to avoid singleton.

Also added data_feed_proto to operator to fix CI in CPU compilation

12542320

Implement the GPU kernel of fc operator (#19687) · a65c728e

由 Yiqun Liu 提交于 9月 11, 2019

* Refine the codes related to fc op.

* Add GPU implementation for fc functor.

* Apply fc_fuse_pass in GPU inference.
test=develop

* Change the cmake for fc op.

* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.

* Add an attribute to set the activation type in fc_op.

* Enhance the unittest of fc_op.
test=develop

* Remove the declaration of FCOpGrad back to the header file.
test=develop

* Set default value for newly added arguments in test_fc_op.
test=develop

a65c728e

10 9月, 2019 1 次提交
- B
  
  upgrade ngraph to support mkldnn v1.0 (#19689) · 87f13f75
  由 baojun 提交于 9月 09, 2019
  
  87f13f75
07 9月, 2019 1 次提交

remove -Wmaybe-uninitialized warning (#19653) · bcddbc78

由 Tao Luo 提交于 9月 07, 2019

* remove -Wmaybe-uninitialized warning

test=develop

* remove uninitialized op_handle_ in scale_loss_grad_op_handle.cc

test=develop

bcddbc78

04 9月, 2019 3 次提交
- T
  fix inference_lib deps error (#19632) · 3aaea4c5
  由 Tao Luo 提交于 9月 04, 2019
```
test=develop
```
  3aaea4c5
- L
  
  fix the warning caused by mistach arguments of flags.cmake (#19576) · 9c885708
  由 liuwei1031 提交于 9月 04, 2019
  
  9c885708
- S
  Enable online compilation of openblas on windows (#19602) · e79cf3bc
  由 silingtong123 提交于 9月 04, 2019
```
* test=develop, Support for online compilation of openblas

* test=develop, Modify the prefix of openblas static library
```
  e79cf3bc
31 8月, 2019 1 次提交

Paddlebox Framework (#18982) · c756b5d2

由 hutuxian 提交于 8月 31, 2019

* Support looking up embeddings from BoxPS.
* Add a _pull_box_sparse op, for now this op is not exposed to users.
* Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on.
* Add 'BoxPSDataset' in python code.
* Add a compile options WITH_BOX_PS and a MACRO PADDLE_WITH_BOX_PS.
* Add UT.
* More concrete information pls refer to: https://github.com/PaddlePaddle/Paddle/pull/18982

c756b5d2

机器未来 / Paddle 与 Fork 源项目一致

机器未来 / Paddle
与 Fork 源项目一致