Commits · 5b07ca9cdd602ec56584fa201d7869337fc02601 · BaiXuePrincess / Paddle

24 Sep, 2019 1 commit

- ReImplemented pooling fwd mkldnn (#19911) · 5b07ca9c

Committed by Jacek Czaja on Sep 24, 2019

- First implementation of BWD and FWD of pooling mkl-dnn

- Compilation fix

- Fix

- Fix

- Fix

- Fix to crash

- Compilation fix

- Combined AcquireBackward with Fwd

test=develop

23 Sep, 2019 1 commit
- Delete local execution scopes (#19749) · d7251a8e
  Committed by chengduo on Sep 23, 2019
```
* Add RecordHistoryLocalExecScopes
test=develop
```
22 Sep, 2019 1 commit

Add lock to cudnn handle calls (#19845) · c7f36e7c

Committed by Zeng Jinle on Sep 22, 2019

* refine reallocate of workspace size, test=develop

* add lock to cudnn handle calls, test=develop

20 Sep, 2019 2 commits

remove enforce.h file written, test=develop (#19897) · b25d1e75

Committed by Zeng Jinle on Sep 20, 2019

[MKL-DNN] LRN refactoring (#19798) · 619c797a

Committed by Jacek Czaja on Sep 20, 2019

- LRN mkl-dnn kernel refactor

test=develop

- compilation fix

- Another compilation fix

- Compilation fix

- another compilation fix

- compilation fix

- Crash fix

- optional LRN mkldnn workspace

- Added mid allocation

- Workaround for tests

- Removed gradient from is_test ut

- Removed mid for inference

- Reverted LRN mid removal for is_test

- PADDLE_ENFORCE adjusted

- Rebase to templatization commit

- Compilation fix

- compilation fix

test=develop

- lint

test=develop

- Fix to crash

- Rebase to recent codebase

- lint

- lint

- compilation fix

19 Sep, 2019 2 commits

Refactor conv computeINT8 (#19574) · 2c32c2d6

Committed by lidanqing on Sep 19, 2019

* fix conflicts
test=develop

* change mask_bias_reorder
test=develop

* add ComputeMask function to make code clear
test=develop

* change according to reviews
test=develop

* change according to reviews
test=develop

Add template functions for Acquire primitive/primitive_desc (#19867) · c7e68892

Committed by Adam on Sep 19, 2019

* Add template functions for Acquire primitive/primitive_desc
test=develop

* Move acquire primitive descriptor to protected section
test=develop

18 Sep, 2019 2 commits
- remove some flags and add comments to some flags, test=develop (#19813) · 13ca364c
  Committed by Zeng Jinle on Sep 18, 2019
- refine reallocate of workspace size, test=develop (#19843) · 5eb381a3
  Committed by Zeng Jinle on Sep 18, 2019
17 Sep, 2019 1 commit
- Add MKLDNNhandlerT templatized class (#19801) · dfdd73cb
  Committed by Adam on Sep 17, 2019
```
test=develop
```
16 Sep, 2019 1 commit
- reduce default value of cudnn workspace size, test=develop (#19780) · 32b1151f
  Committed by Zeng Jinle on Sep 16, 2019
14 Sep, 2019 2 commits
- Add common CreateKey for mkldnn handlers (#19767) · d4413a54
  Committed by Adam on Sep 14, 2019
```
test=develop
```
- Fix the definition issue when using the mkl_scsrmm and mkl_dcsrmm functions. (#19774) · 0d6ea529
  Committed by Yihua Xu on Sep 13, 2019
```
test=develop
```
12 Sep, 2019 1 commit
- Refactoring activation mkldnn op (#19748) · 9e4c9585
  Committed by Jacek Czaja on Sep 12, 2019
```
test=develop

- fix to BWD

test=develop
```
11 Sep, 2019 1 commit

Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320

Committed by Huihuang Zheng on Sep 11, 2019

TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, removing it improves memory performance.

We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, so no singleton is needed.

Also added data_feed_proto to operator to fix the CPU compilation in CI.
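
The idea of freeing a buffer from a stream callback, rather than through a global singleton, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this commit: `FakeStream` and `StreamScopedAllocator` are hypothetical names, the "stream" is a plain FIFO queue standing in for a CUDA stream, and `std::malloc`/`std::free` stand in for device allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for a CUDA stream: queued tasks run in FIFO order.
class FakeStream {
 public:
  void Enqueue(std::function<void()> task) { tasks_.push(std::move(task)); }

  // Runs every queued task, similar in spirit to synchronizing a stream.
  void Synchronize() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      task();
    }
  }

 private:
  std::queue<std::function<void()>> tasks_;
};

// Hypothetical allocator bound to one stream (no global singleton).
// Each buffer is released by a callback queued on that stream.
class StreamScopedAllocator {
 public:
  explicit StreamScopedAllocator(FakeStream* stream) : stream_(stream) {
    assert(stream_ != nullptr);
  }

  void* Allocate(std::size_t bytes) {
    void* ptr = std::malloc(bytes);  // placeholder for a device allocation
    if (ptr != nullptr) ++live_;
    return ptr;
  }

  // Defers the free until all work queued before this call has run.
  void ReleaseAfterStreamWork(void* ptr) {
    if (ptr == nullptr) return;
    stream_->Enqueue([this, ptr] {
      std::free(ptr);
      --live_;
    });
  }

  // Number of buffers allocated but not yet freed.
  int live() const { return live_; }

 private:
  FakeStream* stream_;
  int live_ = 0;
};
```

In this sketch the buffer stays alive until the stream reaches the queued callback, so work queued earlier on the same stream can still use it; the allocator must outlive the stream's pending callbacks.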

10 Sep, 2019 2 commits

MKLDNN handler cleanup (#19713) · 428b2b9e
Committed by Adam on Sep 10, 2019
```
* MKLDNN handler cleanup

* MKLDNN handler cleanup
test=develop
```
Add document annotations for FLAGS that need to be open to external developers... · 27235cf2

Committed by XiaoguangHu on Sep 10, 2019

Add document annotations for FLAGS that need to be open to external developers test=develop (#19692)

Add document annotations for FLAGS that need to be open to external developers

09 Sep, 2019 1 commit

paddle::framework::vectorize() templatization [PART3] (#19643) · f05d2c51

Committed by Tao Luo on Sep 09, 2019

* paddle::framework::vectorize() templatization

test=develop

* update pybind/imperative.cc

test=develop

* revert update on unsqueeze_op.cc and warpctc_cudnn_op.cu.cc

test=develop

05 Sep, 2019 2 commits

Integrate NVRTC to support compiling CUDA kernel at runtime (#19422) · 42b5bec6

Committed by Yiqun Liu on Sep 05, 2019

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631) · 3ae939e4

Committed by Tao Luo on Sep 05, 2019

* remove assert.h

* change PADDLE_ASSERT_MSG to PADDLE_ENFORCE

test=develop

* fix tensorrt paddle_enforce

test=develop

03 Sep, 2019 3 commits
- refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603) · 75d15719
  Committed by Tao Luo on Sep 03, 2019
```
test=develop
```
- using MKLDNNMemoryFormat = mkldnn::memory::format changes (#19568) · e94b26da
  Committed by Adam on Sep 03, 2019
```
* using MKLDNNMemoryFormat = mkldnn::memory::format changes
test=develop

* PADDLE_ENFORCE update
test=develop
```
- replace PADDLE_ASSERT with PADDLE_ASSERT_MSG (#19586) · 49523ea1
  Committed by Tao Luo on Sep 03, 2019
```
* remove unused PADDLE_ASSERT(_IS_NOT_ERROR)

* replace PADDLE_ASSERT with PADDLE_ASSERT_MSG

test=develop
```
02 Sep, 2019 1 commit
- fix the compilation issue on windows caused by mkl_CSRMM (#19533) · 84c72801
  Committed by zhouwei25 on Sep 02, 2019

01 Sep, 2019 2 commits

[MKL-DNN] Refactoring Softmax (#19312) · cef95ee3

Committed by Jacek Czaja on Sep 01, 2019

* - First set of modifications

- Compilation fixes

- compilation fix

- Another compilation fix

- Moved AcquireSoftmaxPrimitiveDescriptor call into handler

- MKL-DNN Softmax PD refactor

test=develop

- Compilation fix

test=develop

- another compilation fix

- cosmetics

test=develop

- Compilation fix

- Fix to crash when softmax backward is created

* - Fixes after review of softmax refactoring

test=develop

Add retry_allocator for gpu (#19409) · 0a73f720

Committed by Zeng Jinle on Sep 01, 2019

* add retry_allocator for gpu, test=develop

* follow chengduoZH's comments, test=develop

* follow huihuang's comments,test=develop

* change f,l in enforce.h to be file,line, test=develop

* increase code coverage by adding unittests, test=develop

* fix CMakeLists.txt, test=develop

30 Aug, 2019 3 commits

[MKL-DNN] Fix to face model on AVX512 platforms (#19282) · ecd9f330

Committed by Jacek Czaja on Aug 30, 2019

- Refactor step 1

- Compilation fix

- Yet another compilation fix

- Even more compilation fix

- Lint fixes

test=develop

- Removed deprecated PADDLE_ENFORCE occurrence

test=develop

- Candidate fix to BN forward

- Lint fixes

test=develop

- Refactoring in data_layout_transform

- compilation fix

- Another compilation fix

- Step further into darkness

- Yet another compilation fix

- Yet another compilation fix

- missing header

- compilation fix

- Added MKLDNN -> Paddle conversion in fetch op

test=develop

- Compilation fix

test=develop

- Lint

test=develop

- Mul fix

- Fix to MKLDNN MUL op and Elementwise MUL UT

test=develop

- Workaround for different weights with groups representation Paddle vs
  MKL-DNN.

test=develop

- Candidate fix for 5D convolution with groups

- Refactor of fix for conv3d and conv2d in fetch op

test=develop

- Compilation fix

- Still same compilation fix

- Compilation fix

- Compilation fix

- Reverted refactoring of fixes

- Adapted test_conv2d_int8_mkldnn so it expects data in NCHW format,
  not NHWC

test=develop

- minor fix in UT

test=develop

- Lint fixes

test=develop

add dynamic C runtime support on windows, test=develop (#19502) · d6cb1a41
Committed by liuwei1031 on Aug 30, 2019

remove signal raise msg, test=develop (#19527) · c2c5b1b9
Committed by Zeng Jinle on Aug 30, 2019

28 Aug, 2019 1 commit

Add signal message to stderr (#19421) · caf59d0f

Committed by Zeng Jinle on Aug 28, 2019

* add signal message to stderr, test=develop

* add unittests for ugly SignalHandle, test=develop

27 Aug, 2019 2 commits
- supports multiple NCCL communicators preserved in NCCLCommContext (#19407) · efb05ba2
  Committed by Yi Liu on Aug 27, 2019
```
* supports multiple NCCL communicators preserved in NCCLCommContext
test=develop

* add ut for c_comm_init_all operator and fix cuda resource release problem
test=develop
```
- save the callstack information to file when exception throws test=dev… (#19324) · b8aa37d5
  Committed by wopeizl on Aug 27, 2019
```
* save the callstack information to file when exception throws test=develop
```
20 Aug, 2019 2 commits

replace part of PADDLE_ASSERT to PADDLE_ENFORCE (#19285) · 6527a7df

Committed by Tao Luo on Aug 20, 2019

* replace part of PADDLE_ASSERT to PADDLE_ENFORCE

test=develop

* remove unused fallback_alloc_size_

* add unit-test of CUDAPinnedAllocator

test=develop

Use sparse matrix to implement fused emb_seq_pool operator (#19064) · b9203958

Committed by Yihua Xu on Aug 20, 2019

* Implement the operator with sparse matrix multiply

* Update the URL of mklml library.

test=develop

* Disable the MKLML implementation on non-Linux platforms.

test=develop

* Ignore the deprecated status for windows

test=develop

19 Aug, 2019 1 commit
- Make PADDLE_ENFORCE_EQ support types that cannot be converted to std::string (#19243) · 91a0911c
  Committed by Zeng Jinle on Aug 19, 2019
```
* make PADDLE_ENFORCE_EQ support types that cannot be converted to string, test=develop

* follow huihuang's comments, test=develop
```
16 Aug, 2019 2 commits
- move_flags_to_unified_files_for_management, test=develop (#19224) · 708bd979
  Committed by Zeng Jinle on Aug 16, 2019
- add PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#19211) · 002f325d
  Committed by Zeng Jinle on Aug 16, 2019
15 Aug, 2019 1 commit
- Add generalized Conv+Activation MKLDNN fuse pass creation (#19072) · b837689e
  Committed by Adam on Aug 15, 2019
```
test=develop
```
12 Aug, 2019 2 commits
- Polish fleet API to support cuda collective mode and nccl2 mode. (#18966) · 29d87812
  Committed by gongweibao on Aug 12, 2019
```
Polish fleet API to support cuda collective mode and nccl2 mode
```
- add tensorrt support for windows (#19084) · 80b7ef6f
  Committed by wopeizl on Aug 12, 2019
```
* add tensorrt support for windows
```

BaiXuePrincess / Paddle is in sync with the project it was forked from