Commits · 23e3f790c0601136be1582cad7ad6da8b8e5ddb1 · Crayon鑫 / Paddle

27 Jul, 2018 · 2 commits
- add flags to control num_threads · 9788e5ab
  Authored by tensor-tang on Jul 27, 2018
- control omp num_threads · 10a1c2bb
  Authored by tensor-tang on Jul 27, 2018
26 Jul, 2018 · 1 commit
- fix cudnn enforce · 54e9fd3f
  Authored by typhoonzero on Jul 26, 2018
18 Jul, 2018 · 1 commit
- variants · 7781297c
  Authored by Xin Pan on Jul 10, 2018
12 Jul, 2018 · 1 commit
- Improve Code · a1a1109c
  Authored by minqiyang on Jul 12, 2018
11 Jul, 2018 · 2 commits
- Add framework_proto to device context deps · 2cc6ca43
  Authored by minqiyang on Jul 11, 2018
- MKLDNN: Extending Conv MKLDNN op to reuse MKLDNN primitives (#11750) · fbe25ef5
  Authored by Jacek Czaja on Jul 11, 2018
```
* - Rebase of conv reuse

- clang formatter fixes

- Fix to conv reuse

- Yet another fix

- Fix

- Fix

- clang format

* - comment update
```
05 Jul, 2018 · 3 commits
- fix conflicts · 2e418a52
  Authored by tensor-tang on Jul 05, 2018
- Move fluid::framework::InitDevices into fluid::platform (#11757) · 4ed0b624
  Authored by dzhwinter on Jul 05, 2018
```
* move to platform

* "move init from framework to platform"

* "remove used init"

* "fix ci"

* "fix ci"

* "fix generic"

* "fix ci"

* "fix ci"

* "fix ci"

* "disable fragile test"
```
- "remove lapack" (#11966) · 99a99ec7
  Authored by dzhwinter on Jul 05, 2018
03 Jul, 2018 · 5 commits
- Use std::map for Place <--> DeviceContext · 2d0e5592
  Authored by yuyang18 on Jul 03, 2018
- hide utils to legacy · 94cb59ad
  Authored by Xin Pan on Jul 03, 2018
- add an unittest · ed4b2475
  Authored by fengjiayi on Jul 03, 2018
- fix unittests · 8553ac6a
  Authored by fengjiayi on Jul 03, 2018
- Add EOFException to represent EOF in C++ reader · 3fab4f65
  Authored by fengjiayi on Jul 03, 2018
30 Jun, 2018 · 2 commits
- add debug to replacing enforce with GLOG for debug (#11244) · 28172bbb
  Authored by Yan Chunwei on Jun 30, 2018
- fix code style (#11862) · e2b1c5d9
  Authored by gongweibao on Jun 30, 2018
28 Jun, 2018 · 1 commit
- Duplicated code was moved to common function · b8a04c2f
  Authored by mozga-intel on Jun 26, 2018
27 Jun, 2018 · 1 commit
- move SetNumThreads to platform · e3a96300
  Authored by tensor-tang on Jun 27, 2018
23 Jun, 2018 · 1 commit
- No NCCL on macOS (#11652) · 2625178a
  Authored by Yi Wang on Jun 22, 2018
```
* Make paddle no longer depend on boost

* Update enforce.h
```
22 Jun, 2018 · 1 commit
- enhance ParallelExecutor stable (#11637) · da556ed6
  Authored by chengduo on Jun 22, 2018
21 Jun, 2018 · 5 commits

Authored by Jacek Czaja on May 08, 2018

- Added a hash function inside the MKLDNN softmax op, to be used as a handle for storing primitives in a context

- Style fixes to softmax mkldnn op

- Fixes after review

- Coding style

- Fix to style

- style fixes

- style fix

- style fixes

- Fix to coding style check

- Rephrasing a comment

fix to broken merge

Fixes to rebase

Conflicts:
	benchmark/fluid/models/machine_translation.py
	cmake/external/mkldnn.cmake
	paddle/fluid/operators/softmax_mkldnn_op.cc

- Bumped revision of MKL-DNN up to have softmax backward primitive

- Added choosing MKLDNN softmax grad operator

- First reuse of softmax backward

- Reinvented reusing for softmax

- Fix to crash in reinvented reuse

- Clang format fixes

- Clang format fixes

- Improved softmax mkldnn reuse mechanism

- clang format fixes

- Fix to broken merge

- Fix

98f3ad3b

Revert "Merge pull request #11628 from PaddlePaddle/revert-11102-mozga-intel/Sum_mkldnn_layout" · d5fb8fa7
Authored by tensor-tang on Jun 21, 2018
```
This reverts commit 4d8e8ee2, reversing
changes made to d6a9f005.
```

remove usr local lib when dynamic load lib · 28a0ef95
Authored by tensor-tang on Jun 21, 2018

Revert "MKLDNN layout: Support for sum operator" · 90780e22
Authored by tensor-tang on Jun 21, 2018

Add No Mutex · c99fca5f
Authored by chengduoZH on Jun 21, 2018

20 Jun, 2018 · 2 commits
- add usr local lib to dynamic search path · 3e73a7a9
  Authored by tensor-tang on Jun 20, 2018
- enable dynamic load mklml lib on fluid · f503f129
  Authored by tensor-tang on Jun 20, 2018
19 Jun, 2018 · 2 commits
- MKLDNN layout: the code-review changes · 6512be59
  Authored by mozga-intel on Jun 15, 2018
- update the default cpu memory with MKLDNN · 9a25f289
  Authored by tensor-tang on Jun 19, 2018
16 Jun, 2018 · 1 commit
- refine the initial cpu memory flag for mkldnn · a8c2ff31
  Authored by tensor-tang on Jun 16, 2018
14 Jun, 2018 · 2 commits

Fix NCCLBcast hang up bug in Parallel Executor (#11377) · 046bb5c8

Authored by Qiyang Min on Jun 13, 2018

* 1. Create a buddy allocator in each place before calling NcclBcast on the variables
  2. Check the memory usage of all GPUs rather than only the first one

* 1. Make NCCLGroupGuard guard only the ncclBcast part, which keeps ncclGroupEnd from blocking exception throwing
  2. Note the usage of NCCLGroupGuard

* Remove the memory usage check of GPUs

* Fix code style

Remove cuptiFinalize. · d2afd210

Authored by Xin Pan on Jun 14, 2018

In the CUPTI samples, only cuptiFlush is used.
I can't find any place that calls cuptiFinalize, and
this API can error out as not_implemented in some
CUDA installations.

13 Jun, 2018 · 1 commit
- fix build on mac · 9ebbfa6b
  Authored by qiaolongfei on Jun 13, 2018
12 Jun, 2018 · 1 commit
- add initial memory flag in MB for infer · 056dd404
  Authored by tensor-tang on Jun 12, 2018
11 Jun, 2018 · 1 commit
- Add lock to record_event. · a1254a86
  Authored by yuyang18 on Jun 11, 2018
08 Jun, 2018 · 2 commits
- Update device_tracer.cc · 310598f9
  Authored by guochaorong on Jun 08, 2018
- fix some bugs introduced by unfreed memory · 0fec9469
  Authored by guochaorong on Jun 08, 2018
07 Jun, 2018 · 1 commit

Mkldnn layout (#11040) · 3ff9ba0e

Authored by mozga-intel on Jun 07, 2018

* Add MKLDNN layout support in Paddle

Add MKLDNN layout in Paddle so that an MKLDNN-friendly memory layout
can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW
was hardcoded in all MKLDNN op kernels. As a result, a
non-optimized execution path was selected in the MKLDNN primitives,
which hurt performance.
Besides the framework change, three MKLDNN OP kernels were updated
to use the new MKLDNN layout: conv, pool2d, and batch_norm.
Other MKLDNN OP kernels also need to be updated in a similar way to
achieve the best performance.

* Add MKLDNN layout support in activation OP

* Don't populate layout from input to output when kMKLDNN in

* Refine pool mkldnn op kernel

* MKLDNN layout

* Remove the inheritance from tensor file

* MKLDNN layout: refactoring

* Remove additional #define to register new operator

* Prepare mkldnn tests to work with layout


06 Jun, 2018 · 1 commit
- Fix PADDLE_ASSERT. (#10981) · e0a32074
  Authored by qingqing01 on Jun 06, 2018
```
* Enable assertions in CUDA.

* Fix PADDLE_ASSERT.
```

Crayon鑫 / Paddle is in sync with its fork source project