Commits · 549e6de7ac2e846fe7c7c7cca78bb8d88bc434d5 · 机器未来 / Paddle

14 Jan, 2020 2 commits
- Z
  faster build by reduce by-product, reduce linking library and fix compile... · 549e6de7
  Committed by zhouwei25 on Jan 14, 2020
```
faster build by reduce by-product, reduce linking library and fix compile warning of std=c++11 (#22164)
```
  549e6de7
- X
  add collective communication library in fleet (#22211) · e3a457d3
  Committed by xujiaqi01 on Jan 14, 2020
```
* add collective communication library in fleet to replace mpi
* test=develop
```
  e3a457d3
11 Jan, 2020 1 commit
- W
  support fluid-lite subgraph run resnet test=develop (#22191) · 5750152e
  Committed by Wilber on Jan 11, 2020
```
- Added a unit test that runs resnet through the fluid-lite subgraph path
- Updated the git commit id of the Lite dependency
```
  5750152e
10 Jan, 2020 2 commits

Add bn and relu fuse pass (#22048) · 46189b16

Committed by Zhen Wang on Jan 10, 2020

* add bn and relu fuse pass

* add op attr assert and dtype assert

* fix some inputs&&outputs bugs for the fused op and pattern.

* add the unittest for fuse_bn_act_pass. test=develop

* use normative enforce statements. test=develop

* add the cpu test. test=develop

* add the support of batch_size=1 for the bn with relu op. test=develop

* add the error type for paddle throws. test=develop

* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop

46189b16
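
As a rough illustration of what a batch-norm plus ReLU fusion computes, the sketch below applies inference-style normalization and the activation in a single pass and compares it with the two-step version. It is a minimal pure-Python sketch, not the Paddle kernel: the function names, the scalar per-channel statistics, and the default `eps` are assumptions made for the example.

```python
import math


def bn_then_relu(x, mean, var, gamma, beta, eps=1e-5):
    """Unfused reference: normalize every value, then apply ReLU in a second pass."""
    normalized = [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]
    return [max(0.0, v) for v in normalized]


def fused_bn_relu(x, mean, var, gamma, beta, eps=1e-5):
    """Fused version: one pass, no intermediate list for the normalized values."""
    inv_std = 1.0 / math.sqrt(var + eps)  # computed once per channel
    return [max(0.0, gamma * (v - mean) * inv_std + beta) for v in x]


if __name__ == "__main__":
    values = [-2.0, 0.0, 3.0]
    print(bn_then_relu(values, mean=0.5, var=4.0, gamma=1.5, beta=0.1))
    print(fused_bn_relu(values, mean=0.5, var=4.0, gamma=1.5, beta=0.1))
```

Both functions should agree up to floating-point rounding; the fused form only avoids materializing the intermediate result.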

B

Upgrade nGraph to use mkldnn v1.1 (#22154) · f8516ccb
Committed by baojun on Jan 09, 2020

f8516ccb

09 Jan, 2020 2 commits
- 石
  
  [Feature] Lite subgraph (#22114) · ad0dfb17
  Committed by 石晓伟 on Jan 09, 2020
  
  ad0dfb17
- Z
  tweak the interface of cache_third_party function - expose the SOURCE_DIR for... · 4f7a2bd0
  Committed by zhouwei25 on Jan 09, 2020
```
tweak the interface of cache_third_party function - expose the SOURCE_DIR for each external library (#21899)
```
  4f7a2bd0
06 Jan, 2020 1 commit
- A
  
  MKL-DNN 1.1 for Windows (#22089) · 700fdb18
  Committed by Adam on Jan 06, 2020
  
  700fdb18
04 Jan, 2020 1 commit
- A
  
  Update MKL-DNN to 1.1 (#21754) · c112b645
  Committed by Adam on Jan 04, 2020
  
  c112b645
03 Jan, 2020 1 commit

Add the first implementation of fusion_group op (#19621) · d4832077

Committed by Yiqun Liu on Jan 03, 2020

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation of fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Refine the calling of PADDLE_ENFORCE.
test=develop

d4832077

26 Dec, 2019 3 commits
- Z
  
  remove patch command and file of warpctc to Improved quality of Paddle Repo (#21929) · 8b15acd7
  Committed by zhouwei25 on Dec 26, 2019
  
  8b15acd7
- Z
  Fix openblas bug to support compile on windows when WITH_MKL=OFF (#21902) · 2df4be5d
  Committed by zhouwei25 on Dec 26, 2019
```
* Fix openblas to support compile on Windows when WITH_MKL=OFF
```
  2df4be5d
- Z
  
  remove patch command and file of grpc to Improved quality of Paddle Repo (#21778) · cad058ce
  Committed by zhouwei25 on Dec 26, 2019
  
  cad058ce
25 Dec, 2019 1 commit
- Z
  
  remove patch command and file of cares to Improved quality of Paddle Repo (#21776) · a01663ca
  Committed by zhouwei25 on Dec 25, 2019
  
  a01663ca
24 Dec, 2019 1 commit
- Z
  
  fix cp bug of warpctc repository,test=develop (#21901) · 3e1404d2
  Committed by zhouwei25 on Dec 24, 2019
  
  3e1404d2
16 Dec, 2019 2 commits
- X
  fix compile error when WITH_PSLIB=ON (#21702) · 37896e90
  Committed by xujiaqi01 on Dec 16, 2019
```
* fix compile error when WITH_PSLIB=ON
* test=develop
```
  37896e90
- Z
  
  fix wrong commitID with patch file of warpctc (#21755) · 34dc7106
  Committed by zhouwei25 on Dec 16, 2019
  
  34dc7106
12 Dec, 2019 1 commit
- Z
  
  fix the bug that cannot patch command for the second time (#21596) · 03133c2c
  Committed by zhouwei25 on Dec 12, 2019
  
  03133c2c
11 Dec, 2019 1 commit
- B
  
  update ngraph to v0.27 test=develop (#21677) · 45d2fa4e
  Committed by baojun on Dec 10, 2019
  
  45d2fa4e
10 Dec, 2019 1 commit

MKL-DNN 1.0 Update (#20162) · e81f0228

Committed by Adam on Dec 10, 2019

* MKLDNN v1.0 rebase to Paddle 1.6
test=develop

* Add hacky paddle::string::to_string() implementation

* vectorize<int64_t>() -> vectorize() cleanup
test=develop

* PADDLE_ENFORCE and void_cast fixes
test=develop

* Rebase changes
test=develop

* Cosmetics
test=develop

* Delete MKL from mkldnn.cmake
test=develop

* CMake debug commands
test=develop

* Delete MKLDNN_VERBOSE and rebase fixes
test=develop

* Rebase fixes
test=develop

* Temporarily disable int8 resnet101 vgg16 and vgg19 tests
test=develop

* Add libmkldnn.so.1 to python setup
test=develop

* Add libmkldnn.so.1 to inference_lib cmake after rebase
test=develop

* Post rebase fixes + FC int8 changes
test=develop

* Fix LRN NHWC
test=develop

* Fix NHWC conv3d
test=develop

* Windows build fix + next conv3d fix
test=develop

* Fix conv2d on AVX2 machines
test=develop

e81f0228

09 Dec, 2019 1 commit

dygraph_grad_maker supports varbase without grad_var (#21524) · 84b72671

Committed by Leo Chen on Dec 09, 2019

* dygraph_grad_maker supports varbase without grad_var, test=develop

* fix compile, test=develop

* fix test_tracer, test=develop

* follow comments, test=develop

84b72671

05 Dec, 2019 1 commit

Split VarBase from Python Variable for Dygraph (#21359) · cdd46d7e

Committed by Leo Chen on Dec 05, 2019

* test=develop, fix docker with paddle nccl problem

* don't expose numerous Tensor.set(), test=develop

* fix condition, test=develop

* fix float16 bug, test=develop

* feed should be Tensor or np.array, not Variable or number, test=develop

* use forcecast to copy numpy slice to new array, test=develop

* remove float16-uint16 hacking, test=develop

* add variable method to varbase and refactor to_variable to support return varbase

* support kwargs in varbase constructor

* add VarBase constructor to support default python args

* refine varbase initial method

* reset branch

* fix ut for change VarBase error info to PaddleEnforce

* cherry is parameter change before

* overload isinstance to replace too many change of is_variable

* rm useless files

* rm useless code merged by git

* test=develop, fix some ut failed error

* test=develop, fix test_graph_wrapper

* add some tests, test=develop

* refine __getitem__, test=develop

* add tests, test=develop

* fix err_msg, test=develop

cdd46d7e

04 Dec, 2019 1 commit

modify the personal repo address of eigen and warpctc (#21445) · 46401786

Committed by silingtong123 on Dec 04, 2019

* modify the repo address of eigen and warpctc

* fix the eigen not work on windows

* fix the eigen and warpctc can't recompile

46401786

03 Dec, 2019 2 commits
- Z
  NV jetson(nano, tx2, xavier) inference compile support (#21393) · c5f0293c
  Committed by Zhaolong Xing on Dec 03, 2019
```
* add jetson compile support
test=develop

* refine the cmake
test=develop
```
  c5f0293c
- T
  Revert "revert flags.cmake (#21437)" (#21485) · 060bf8d0
  Committed by Tao Luo on Dec 03, 2019
```
This reverts commit c93c9e5b.
test=develop
```
  060bf8d0
02 Dec, 2019 2 commits
- G
  
  revert flags.cmake test=develop (#21437) · c93c9e5b
  Committed by gongweibao on Dec 02, 2019
  
  c93c9e5b
- Z
  update openblas version (#21450) · 6aa13f46
  Committed by Zhaolong Xing on Dec 02, 2019
```
test=develop
```
  6aa13f46
30 Nov, 2019 1 commit
- Z
  
  fix cub/threadpool include_dir to match setup.py.in,test=develop (#21436) · fce24315
  Committed by zhouwei25 on Nov 30, 2019
  
  fce24315
28 Nov, 2019 2 commits
- T
  remove -Wno-error=sign-compare, make warning as error (#21358) · c0656dcb
  Committed by Tao Luo on Nov 28, 2019
```
* remove -Wno-error=sign-compare, make warning as error

test=develop test=document_fix

* fix exist compile warning

test=develop
```
  c0656dcb
- Z
  
  Eliminate the impact on incremental compilation (#21410) · b39f9476
  Committed by zhouwei25 on Nov 28, 2019
  
  b39f9476
27 Nov, 2019 1 commit

INT8 Fully-connected (#17641) · 5d7d5482

Committed by Michał Gallus on Nov 27, 2019

* Implement Int8 FC

* Integrate FC into INT8v2

test=develop

* int8 FC: transpose weights before computing scales

test=develop

* Add support for activation_type string in FC

test=develop

* Disable MKL-DNN's FC in VGG16 and 19

test=develop

* Disable FC quantization when mkldnn FC is disabled

test=develop

* Solve PADDLE_ENFORCES in FC int8

* Fix Paddle enforces and remove const cast

test=develop

* Fix style changes

test=develop

* Fix quantizer_tester test and add fc quantization

test=develop

* Fix FC test fail on CUDA

* Remove unnecessary log from quantize placement pass

test=develop

* Add Thread ID to FC hash key

test=develop

* Add comments to MKL-DNN FC Kernel

test=develop

* Refactor quantizer

test=develop

* Fix linter issues

test=develop

* Fix crash in slim googlenet

test=develop

* Fix PADDLE_ENFORCE messages

test=develop

5d7d5482
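
The item "transpose weights before computing scales" can be pictured with a small sketch. It assumes, purely for illustration, that the FC weights are stored as `[in_features][out_features]` and that one symmetric scale of the form `127 / max|w|` is wanted per output channel; neither the layout nor the formula is taken from the commit, and the function name is hypothetical.

```python
def per_output_channel_scales(weights, max_int=127):
    """Return one quantization scale per output channel.

    `weights` is a list of rows shaped [in_features][out_features] (an
    assumed layout). Transposing first makes each row hold all weights of
    one output channel, so the maximum is taken over the right axis.
    """
    transposed = list(zip(*weights))  # now [out_features][in_features]
    scales = []
    for channel in transposed:
        max_abs = max(abs(w) for w in channel)
        # Guard against an all-zero channel to avoid dividing by zero.
        scales.append(max_int / max_abs if max_abs > 0 else 1.0)
    return scales


if __name__ == "__main__":
    print(per_output_channel_scales([[1.0, -4.0], [-2.0, 0.5]]))
```

Without the transpose, the maximum would be taken across output channels for a single input feature, which yields scales for the wrong axis.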

26 Nov, 2019 2 commits
- T
  make CUDA_ARCH_NAME default Auto (#21352) · d8e7d252
  Committed by Tao Luo on Nov 26, 2019
```
* make CUDA_ARCH_NAME default Auto

test=develop

* refine warning

test=develop
```
  d8e7d252
- S
  
  package the CAPI inference library and third_party (#21299) · 4b429c19
  Committed by silingtong123 on Nov 26, 2019
  
  4b429c19
25 Nov, 2019 2 commits
- Z
  
  remove warning LNK4006 and warning LNK4221 (#21226) · 345b67b5
  Committed by zhouwei25 on Nov 25, 2019
  
  345b67b5
- Z
  
  Cache 3rd source code, improve stability, reduce the compilation time (#21190) · 341dee06
  Committed by zhouwei25 on Nov 25, 2019
  
  341dee06
20 Nov, 2019 1 commit
- Z
  Change GCC version to be 8.2 in Dockerfile.GCC8 (#21222) · 925280b9
  Committed by Zeng Jinle on Nov 20, 2019
```
* make Docker to gcc 8.2, test=develop

* add -std=c11 to grpc.cmake, test=develop
```
  925280b9
19 Nov, 2019 1 commit
- Z
  
  Determine whether to copy and link inference lib by ON_INFER (#20931) · c0dcb090
  Committed by zhouwei25 on Nov 19, 2019
  
  c0dcb090
18 Nov, 2019 3 commits

Fix warn of gcc8 (#21205) · cdb3d279

Committed by Zeng Jinle on Nov 18, 2019

* fix warnings of gcc 8 compilation, test=develop

* fix boost::bad_get, test=develop

* refine PADDLE_ENFORCE, test=develop

cdb3d279

Z
fix bug when build openblas with a computer that has installed openblas... · 5d821578
Committed by zhouwei25 on Nov 18, 2019
```
fix bug when build openblas with a computer that has installed openblas before,test=develop (#21160)
```
5d821578

Better TensorRT support (#20858) · 330b173c

Committed by Jeng Bai-Cheng on Nov 18, 2019

* Fix TensorRT detection bug

1. Add new search path for TensorRT at tensorrt.cmake
2. Add better debug message
3. Fix the bug of detection of TensorRT version

In NVIDIA official docker image, TensorRT headers are located at
`/usr/include/x86_64-linux-gnu` and TensorRT libraries are located
at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` will
fail to detect TensorRT.

There is no debug/warning message to tell developer that TensorRT
is failed to be detected.

In later version of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is
defined at `NvInferVersion.h` instead of `NvInfer.h`, so add
compatibility fix.

* Fix TensorRT variables in CMake

1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}`
2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}`

Manually type path may locate incorrect path of TensorRT. Use the
paths detected by system instead.

* Fix TensorRT library path

1. Add new variable - `${TENSORRT_LIBRARY_DIR}`
2. Fix TensorRT library path

inference_lib.cmake and setup.py.in need the path of TensorRT library
instead of the file of TensorRT library, so add new variable to fix it.

* Add more general search rule for TensorRT

Let system detect architecture instead of manually assign it, so
replace `x86_64-linux-gnu` with `${CMAKE_LIBRARY_ARCHITECTURE}`.

* Add more general search rule for TensorRT

Remove duplicate search rules for TensorRT libraries. Use
`${TENSORRT_LIBRARY_DIR}` to get full path of libnvinfer.so

test=develop

330b173c
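
The version-detection point in the message above (newer TensorRT releases define `NV_TENSORRT_MAJOR` in `NvInferVersion.h` rather than `NvInfer.h`) can be sketched outside CMake as follows. This is a hedged Python illustration of the fallback idea only, not the logic in `tensorrt.cmake`; the function name and the regular expression are assumptions.

```python
import os
import re

# Matches a line such as: #define NV_TENSORRT_MAJOR 6 //!< TensorRT major version.
_MAJOR_RE = re.compile(r"^\s*#define\s+NV_TENSORRT_MAJOR\s+(\d+)", re.MULTILINE)


def find_tensorrt_major(include_dir):
    """Return NV_TENSORRT_MAJOR as an int, or None if no header defines it.

    NvInferVersion.h is checked first (layout described for later releases),
    then NvInfer.h (older layout).
    """
    for header in ("NvInferVersion.h", "NvInfer.h"):
        path = os.path.join(include_dir, header)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8", errors="replace") as handle:
            match = _MAJOR_RE.search(handle.read())
        if match:
            return int(match.group(1))
    return None


if __name__ == "__main__":
    # Directory taken from the commit message; it may not exist on this machine.
    print(find_tensorrt_major("/usr/include/x86_64-linux-gnu"))
```

Returning `None` instead of raising makes it easy for a caller to print an explicit "TensorRT not detected" message, which is the kind of diagnostic the commit says was missing.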

机器未来 / Paddle is in sync with the fork source project
