Commits · e81f0228df7a15010230e2193460de0616e55bd3 · Crayon鑫 / Paddle

10 Dec, 2019 · 1 commit

MKL-DNN 1.0 Update (#20162) · e81f0228

Committed by Adam on Dec 10, 2019

* MKLDNN v1.0 rebase to Paddle 1.6
test=develop

* Add hacky paddle::string::to_string() implementation

* vectorize<int64_t>() -> vectorize() cleanup
test=develop

* PADDLE_ENFORCE and void_cast fixes
test=develop

* Rebase changes
test=develop

* Cosmetics
test=develop

* Delete MKL from mkldnn.cmake
test=develop

* CMake debug commands
test=develop

* Delete MKLDNN_VERBOSE and rebase fixes
test=develop

* Rebase fixes
test=develop

* Temporarily disable int8 resnet101 vgg16 and vgg19 tests
test=develop

* Add libmkldnn.so.1 to python setup
test=develop

* Add libmkldnn.so.1 to inference_lib cmake after rebase
test=develop

* Post rebase fixes + FC int8 changes
test=develop

* Fix LRN NHWC
test=develop

* Fix NHWC conv3d
test=develop

* Windows build fix + next conv3d fix
test=develop

* Fix conv2d on AVX2 machines
test=develop

e81f0228

09 Dec, 2019 · 1 commit

dygraph_grad_maker supports varbase without grad_var (#21524) · 84b72671

Committed by Leo Chen on Dec 09, 2019

* dygraph_grad_maker supports varbase without grad_var, test=develop

* fix compile, test=develop

* fix test_tracer, test=develop

* follow comments, test=develop

84b72671

05 Dec, 2019 · 1 commit

Split VarBase from Python Variable for Dygraph (#21359) · cdd46d7e

Committed by Leo Chen on Dec 05, 2019

* test=develop, fix docker with paddle nccl problem

* don't expose numerous Tensor.set(), test=develop

* fix condition, test=develop

* fix float16 bug, test=develop

* feed should be Tensor or np.array, not Variable or number, test=develop

* use forcecast to copy numpy slice to new array, test=develop

* remove float16-uint16 hacking, test=develop

* add variable method to varbase and refactor to_variable to support return varbase

* support kwargs in varbase constructor

* add VarBase constructor to support default python args

* refine varbase initial method

* reset branch

* fix ut for change VarBase error info to PaddleEnforce

* cherry is parameter change before

* overload isinstance to replace too many change of is_variable

* rm useless files

* rm useless code merged by git

* test=develop, fix some ut failed error

* test=develop, fix test_graph_wrapper

* add some tests, test=develop

* refine __getitem__, test=develop

* add tests, test=develop

* fix err_msg, test=develop

cdd46d7e

04 Dec, 2019 · 1 commit

modify the personal repo address of eigen and warpctc (#21445) · 46401786

Committed by silingtong123 on Dec 04, 2019

* modify the repo address of eigen and warpctc

* fix the eigen not work on windows

* fix the eigen and warpctc can't recompile

46401786

03 Dec, 2019 · 2 commits
- NV jetson(nano, tx2, xavier) inference compile support (#21393) · c5f0293c
  Committed by Zhaolong Xing on Dec 03, 2019
```
* add jetson compile support
test=develop

* refine the cmake
test=develop
```
  c5f0293c
- Revert "revert flags.cmake (#21437)" (#21485) · 060bf8d0
  Committed by Tao Luo on Dec 03, 2019
```
This reverts commit c93c9e5b.
test=develop
```
  060bf8d0
02 Dec, 2019 · 2 commits
- revert flags.cmake test=develop (#21437) · c93c9e5b
  Committed by gongweibao on Dec 02, 2019
  
  c93c9e5b
- update openblas version (#21450) · 6aa13f46
  Committed by Zhaolong Xing on Dec 02, 2019
```
test=develop
```
  6aa13f46
30 Nov, 2019 · 1 commit
- fix cub/threadpool include_dir to match setup.py.in,test=develop (#21436) · fce24315
  Committed by zhouwei25 on Nov 30, 2019
  
  fce24315
28 Nov, 2019 · 2 commits
- remove -Wno-error=sign-compare, make warning as error (#21358) · c0656dcb
  Committed by Tao Luo on Nov 28, 2019
```
* remove -Wno-error=sign-compare, make warning as error

test=develop test=document_fix

* fix exist compile warning

test=develop
```
  c0656dcb
- Eliminate the impact on incremental compilation (#21410) · b39f9476
  Committed by zhouwei25 on Nov 28, 2019
  
  b39f9476
27 Nov, 2019 · 1 commit

INT8 Fully-connected (#17641) · 5d7d5482

Committed by Michał Gallus on Nov 27, 2019

* Implement Int8 FC

* Integrate FC into INT8v2

test=develop

* int8 FC: transpose weights before computing scales

test=develop

* Add support for activation_type string in FC

test=develop

* Disable MKL-DNN's FC in VGG16 and 19

test=develop

* Disable FC quantization when mkldnn FC is disabled

test=develop

* Solve PADDLE_ENFORCES in FC int8

* Fix Paddle enforces and remove const cast

test=develop

* Fix style changes

test=develop

* Fix quantizer_tester test and add fc quantization

test=develop

* Fix FC test fail on CUDA

* Remove unnecessary log from quantize placement pass

test=develop

* Add Thread ID to FC hash key

test=develop

* Add comments to MKL-DNN FC Kernel

test=develop

* Refactor quantizer

test=develop

* Fix linter issues

test=develop

* Fix crash in slim googlenet

test=develop

* Fix PADDLE_ENFORCE messages

test=develop

5d7d5482

26 Nov, 2019 · 2 commits
- make CUDA_ARCH_NAME default Auto (#21352) · d8e7d252
  Committed by Tao Luo on Nov 26, 2019
```
* make CUDA_ARCH_NAME default Auto

test=develop

* refine warning

test=develop
```
  d8e7d252
- package the CAPI inference library and third_party (#21299) · 4b429c19
  Committed by silingtong123 on Nov 26, 2019
  
  4b429c19
25 Nov, 2019 · 2 commits
- remove warning LNK4006 and warning LNK4221 (#21226) · 345b67b5
  Committed by zhouwei25 on Nov 25, 2019
  
  345b67b5
- Cache 3rd source code, improve stability, reduce the compilation time (#21190) · 341dee06
  Committed by zhouwei25 on Nov 25, 2019
  
  341dee06
20 Nov, 2019 · 1 commit
- Change GCC version to be 8.2 in Dockerfile.GCC8 (#21222) · 925280b9
  Committed by Zeng Jinle on Nov 20, 2019
```
* make Docker to gcc 8.2, test=develop

* add -std=c11 to grpc.cmake, test=develop
```
  925280b9
19 Nov, 2019 · 1 commit
- Determine whether to copy and link inference lib by ON_INFER (#20931) · c0dcb090
  Committed by zhouwei25 on Nov 19, 2019
  
  c0dcb090
18 Nov, 2019 · 3 commits

Fix warn of gcc8 (#21205) · cdb3d279

Committed by Zeng Jinle on Nov 18, 2019

* fix warnings of gcc 8 compilation, test=develop

* fix boost::bad_get, test=develop

* refine PADDLE_ENFORCE, test=develop

cdb3d279

fix bug when build openblas with a computer that has installed openblas... · 5d821578
Committed by zhouwei25 on Nov 18, 2019
```
fix bug when build openblas with a computer that has installed openblas before,test=develop (#21160)
```
5d821578

Better TensorRT support (#20858) · 330b173c

Committed by Jeng Bai-Cheng on Nov 18, 2019

* Fix TensorRT detection bug

1. Add new search path for TensorRT at tensorrt.cmake
2. Add better debug message
3. Fix the bug of detection of TensorRT version

In NVIDIA official docker image, TensorRT headers are located at
`/usr/include/x86_64-linux-gnu` and TensorRT libraries are located
at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` will
fail to detect TensorRT.

There is no debug/warning message to tell the developer that TensorRT
failed to be detected.

In later version of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is
defined at `NvInferVersion.h` instead of `NvInfer.h`, so add
compatibility fix.

* Fix TensorRT variables in CMake

1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}`
2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}`

A manually typed path may point to the wrong TensorRT location. Use the
paths detected by the system instead.

* Fix TensorRT library path

1. Add new variable - `${TENSORRT_LIBRARY_DIR}`
2. Fix TensorRT library path

inference_lib.cmake and setup.py.in need the path of TensorRT library
instead of the file of TensorRT library, so add new variable to fix it.

* Add more general search rule for TensorRT

Let the system detect the architecture instead of assigning it manually, so
replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`.

* Add more general search rule for TensorRT

Remove duplicate search rules for TensorRT libraries. Use
`${TENSORRT_LIBRARY_DIR}` to get full path of libnvinfer.so

test=develop

330b173c
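
The version-detection fix described in the commit above (probe `NvInferVersion.h` first, then fall back to `NvInfer.h`) can be sketched in Python. This is only an illustration of the lookup order; the real logic lives in CMake (`tensorrt.cmake`), and the helper name `detect_tensorrt_major` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Headers to probe, newest location first: per the commit message, later
# TensorRT releases (e.g. v6) define NV_TENSORRT_MAJOR in NvInferVersion.h.
VERSION_HEADERS = ("NvInferVersion.h", "NvInfer.h")

# Matches a line such as "#define NV_TENSORRT_MAJOR 6".
_MAJOR_RE = re.compile(
    r"^\s*#\s*define\s+NV_TENSORRT_MAJOR\s+(\d+)", re.MULTILINE
)


def detect_tensorrt_major(include_dir: str) -> Optional[int]:
    """Return the TensorRT major version found under include_dir, or None.

    Hypothetical helper: it only mirrors the header fallback order and is
    not part of Paddle or TensorRT.
    """
    for name in VERSION_HEADERS:
        header = Path(include_dir) / name
        if not header.is_file():
            continue
        match = _MAJOR_RE.search(header.read_text(errors="ignore"))
        if match:
            return int(match.group(1))
    # Nothing found: the caller should emit a warning instead of failing silently.
    return None
```

Returning `None` rather than raising leaves room for the caller to print the kind of debug message the commit adds when TensorRT cannot be detected.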

12 Nov, 2019 · 1 commit
- Remove useless code of openblas and fix the previous incorrect message (#21092) · d2573550
  Committed by zhouwei25 on Nov 12, 2019
  
  d2573550
11 Nov, 2019 · 1 commit
- Add Shallow clone to ExternalProjects (#21060) · 6cc544aa
  Committed by Michał Gallus on Nov 11, 2019
```
test=develop
```
  6cc544aa
08 Nov, 2019 · 3 commits

Add transpose2 INT8 for mkl-dnn (#19424) · 77c20835

Committed by joanna.wozna.intel on Nov 08, 2019

* Add transpose2 INT8 for mkl-dnn

test=develop

* Fix test_transpose_int8_mkldnn

test=develop

* Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"

This reverts commit 34011bdb, reversing
changes made to 2ce6473f.

* Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2""

This reverts commit 23754dd7.

* Add template to TransposeMKLDNNHandler

test=develop

* Resolve conflict

test=develop

* Restore get_size and refactor

test=develop

77c20835

move more third party library related logic to third_party.cmake (#20927) · 89bc18ee
Committed by zhouwei25 on Nov 08, 2019

89bc18ee

Enrich the type of error and declare the error type interfaces (#21024) · 7ee25189

Committed by Chen Weihang on Nov 08, 2019

* Enrich the type of error and declare the error type interfaces, test=develop

* adjust tests to adapt new form, test=develop

* add inference deps with error_codes.pb.h, test=develop

* restore stack iter start pos, test=develop

* polish code based review comments, test=develop

7ee25189

05 Nov, 2019 · 1 commit

Support NoNeedBufferVarsInference in dygraph backward (#20868) · 878a40f5

Committed by Zeng Jinle on Nov 05, 2019

* support no need buffer vars in dygraph, test=develop

* fix inference compilation error, test=develop

* update no_need_buffer_vars_inference, test=develop

* add unittests for no_need_buffer_vars_context, test=develop

* refine no_need_buffer_vars by return ref, test=develop

* polish some codes, test=develop

878a40f5

04 Nov, 2019 · 1 commit
- fix mklml and cblas bug,test=develop (#20970) · 394edd86
  Committed by zhouwei25 on Nov 04, 2019
  
  394edd86
31 Oct, 2019 · 2 commits

GradMaker for dygraph (#19706) · 8c4573a3

Committed by hong on Oct 31, 2019

* refactor dygraph,test=develop

* fix failed unittest,test=develop

* polish code,test=develop

* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop

* polish vlog and profiler, test=develop

* try to fix preceding ops order,test=develop

* test transformer in windows ci, test=develop

* use python c-api to speed up tracer.trace,test=develop

* test=develop, fix docker with paddle nccl problem

* test=develop, add ut for debug string and gradient_accumulator

* test=develop, add tests for layer/gradient_accumulator/prepared_op

* test=develop, fix compile error for test_prepared_op

* test=develop, add more ut for dygraph

* test=develop, create API.spec for dygraph api change

* optimize grad maker; test=develop

* optimize grad maker

* test

* grad make optim; test=develop

* fix unittest bugs; test=develop

* add dygraph grad op maker and split_op

* grad op maker refactor; test=develop

* add dygraph grad maker; test=develop

* fix op deformable_conv_v1_op bug; test=develop

* fix deformable_conv prroi pool bugs;

* fix new op grad op maker bug; test=develop

* fix split by ref bug; test=develop

* fix dygraph auto prune bug; test=develop

* fix test_trace bug; test=develop

* fix fused emb seq pool bug; test=develop

* remove useless code in op_desc file; test=develop

* remove useless code, StrVarBaseNode; test=develop

* fix review issues; test=develop

* fix rank_loss grad maker; test=develop

* remove flag in VarBase; test=develop

* fix distributed_notify_op compile bug ; test=develop

* fix reshape op double grad; test=develop

* fix expand as op; test=develop

* add imperative type_defs.h for demo_train; test=develop

* fix inference lib cmake; test=develop

* fix inference lib; test=develop

* fix inference_lib; test=develop

* fix inference cmake; test=develop

* fix inference lib; test=develop

* fix inference lib; test=develop

* remove condition dygraph grad maker, modify local name; test=develop

* fix split grad maker bug; test=develop

* fix pyramid_op bug; test=develop

* change travis time out limit; test=develop

* restore travis; test=develop

* change timeout limit; test=develop

8c4573a3

Integration of third_party compilation structure (#20887) · b7417610
Committed by zhouwei25 on Oct 31, 2019

b7417610

28 Oct, 2019 · 1 commit
- remove the warning issue test=develop (#20718) · 3b31b74e
  Committed by wopeizl on Oct 28, 2019
  
  3b31b74e
22 Oct, 2019 · 1 commit
- Cmake_generotor support has been added to enable multi-version VS support (#20755) · bcd77e14
  Committed by zhouwei25 on Oct 22, 2019
  
  bcd77e14
18 Oct, 2019 · 2 commits
- add support to gcc8, add docker env test=develop (#19807) · 9e594823
  Committed by wopeizl on Oct 18, 2019
```
* add support to gcc8, add docker env test=develop
```
  9e594823
- Fix dgc nan by stripping nccl from sparseReduce. (#20630) · 507afa8a
  Committed by WangXi on Oct 17, 2019
  
  507afa8a
15 Oct, 2019 · 1 commit
- fix version.cmake, test=develop (#20606) · 48b27229
  Committed by 石晓伟 on Oct 15, 2019
  
  48b27229
14 Oct, 2019 · 1 commit

Dlpack support (#20039) · 12e4be03

Committed by 633WHU on Oct 14, 2019

* support dlpack to tensor and implement python interface test=develop

* add unittest for _to_dlpack and from_dlpack test=develop

12e4be03

07 Oct, 2019 · 1 commit
- trainer from dataset fetch targets (#19760) · c9139c3d
  Committed by tangwei12 on Oct 07, 2019
```
add executor.FetchHandler for train/infer from the dataset
```
  c9139c3d
02 Oct, 2019 · 1 commit

Add multihead op for ernie opt (#19933) · e8673668

Committed by zhaoyuchen2018 on Oct 02, 2019

* Add multihead op for ernie opt

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine softmax

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine kernel.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine cuda kernel

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine cuda version

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine cmake

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

e8673668

29 Sep, 2019 · 1 commit

fix conv2d and conv3d: (#20042) · 3aa331d9

Committed by liym27 on Sep 29, 2019

1. support asymmetric padding;
2. support padding algorithm: "SAME" and "VALID";
3. support channel_last: data_format NHWC and NDHWC;
4. change doc of python API and c++;

test=develop, test=document_preview

3aa331d9
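
As a rough illustration of why "SAME" padding can require the asymmetric padding listed above, the sketch below computes the output size and per-side padding along one spatial dimension. It assumes the common TensorFlow-style convention, where any extra padding element goes on the trailing side; Paddle's actual kernels may differ in detail, and the function name `conv_output_and_padding` is hypothetical.

```python
import math
from typing import Tuple


def conv_output_and_padding(
    in_size: int,
    kernel: int,
    stride: int,
    padding_algorithm: str = "VALID",
    dilation: int = 1,
) -> Tuple[int, int, int]:
    """Return (out_size, pad_before, pad_after) for one spatial dimension.

    Hypothetical helper following the TensorFlow-style convention; it is
    not Paddle's implementation.
    """
    # Extent of the input covered by one dilated kernel window.
    effective_kernel = (kernel - 1) * dilation + 1

    if padding_algorithm == "VALID":
        # No padding: only windows that fit entirely inside the input count.
        out_size = max((in_size - effective_kernel) // stride + 1, 0)
        return out_size, 0, 0

    if padding_algorithm == "SAME":
        # Output size depends only on the stride.
        out_size = math.ceil(in_size / stride)
        total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
        pad_before = total // 2
        pad_after = total - pad_before  # receives the extra element when total is odd
        return out_size, pad_before, pad_after

    raise ValueError(f"unknown padding_algorithm: {padding_algorithm!r}")
```

For an input of size 4 with a kernel of 3 and a stride of 2, "SAME" needs one padding element in total, so the two sides cannot be padded equally; that is the asymmetric case.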

27 Sep, 2019 · 1 commit

update operator compatible info, test=develop (#19978) · 01b9d079

Committed by 石晓伟 on Sep 27, 2019

* update operator compatible info, test=develop

* revert cmake/version.cmake, test=develop

* add unit_tests and fix bugs, test=develop

* update ../paddle/fluid/framework/framework.proto, test=develop

* fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop

* update paddle/fluid/framework/version_test.cc, test=develop

* add comments and rename interfaces, test=develop

01b9d079
