Commits · acdb57a510d116d0c9f2a0d0b26083f474cb4f8a · BaiXuePrincess / Paddle

14 Jun, 2018 6 commits

initial with only 1 mkl/openblas threads for each pthreads · 3e58df20
Committed by tensor-tang on Jun 14, 2018

Fix NCCLBcast hang up bug in Parallel Executor (#11377) · 046bb5c8

Committed by Qiyang Min on Jun 13, 2018

* 1. Create a buddy allocator in each place before broadcasting the variables with NcclBcast
2. Check the memory usage of ALL GPUs rather than only the first one

* 1. Make NCCLGroupGuard guard only the ncclBcast part, which prevents ncclGroupEnd from blocking exception throwing
2. NOTE the usage of NCCLGroupGuard

* Remove the memory usage check of gpus

* Fix code style

Add mean IOU op. (#10519) · 6fcdb240

Committed by whs on Jun 14, 2018

* Add mean_iou op.

* Add unitest for mean iou op.

* Add optional collections of confusion matrix and mean_iou.

* Fix cuda kernel.

* Refine code.
1. Merge the GPU computation into two kernels.
2. Use a wrong array and a correct array instead of a confusion matrix.

* Add python api and fix cuda kernel.

* Fix comments.

* Small fix.

* Small fix.

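The "wrong array and correct array" idea in the mean IOU commit (6fcdb240) can be illustrated with a minimal pure-Python sketch. It assumes the usual definition IoU = TP / (TP + FP + FN) per class, with `correct[c]` holding TP and `wrong[c]` holding FP + FN. The function name `mean_iou_sketch`, its signature, and the choice to skip classes that never occur are assumptions made for illustration; this is not the operator's actual kernel code.

```python
def mean_iou_sketch(predictions, labels, num_classes):
    """Hypothetical helper: mean IoU from per-class correct/wrong counters."""
    correct = [0] * num_classes  # true positives per class
    wrong = [0] * num_classes    # false positives + false negatives per class
    for pred, label in zip(predictions, labels):
        if pred == label:
            correct[pred] += 1
        else:
            # A mismatch is a false positive for `pred`
            # and a false negative for `label`.
            wrong[pred] += 1
            wrong[label] += 1
    # Assumption: classes that never occur are left out of the mean.
    ious = [
        correct[c] / (correct[c] + wrong[c])
        for c in range(num_classes)
        if correct[c] + wrong[c] > 0
    ]
    mean = sum(ious) / len(ious) if ious else 0.0
    return mean, wrong, correct
```

Keeping two length-`num_classes` counters instead of a full `num_classes x num_classes` confusion matrix is enough here, because IoU only needs the per-class totals.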

add comment that out var of prefetch must be created in local scope · 490a07f5
Committed by qiaolongfei on Jun 14, 2018

Remove cuptiFinalize. · d2afd210

Committed by Xin Pan on Jun 14, 2018

In the CUPTI samples, only cuptiFlush is used.
I can't find any place that calls cuptiFinalize, and
this API can error out as not_implemented in some
CUDA installations.

Dynamic Graph first prototype (#11415) · d827c6e8
Committed by Yang Yang(Tony) on Jun 13, 2018

13 Jun, 2018 7 commits
- fix a bug in prefetch · a49ee22e
  Committed by qiaolongfei on Jun 13, 2018
- update comment · e6f54d5a
  Committed by qiaolongfei on Jun 13, 2018
- add more detailed comment · 2e48ab62
  Committed by qiaolongfei on Jun 13, 2018
- add row_size for selected rows in DebugStringEx · 7ebef493
  Committed by qiaolongfei on Jun 13, 2018
- fix nccl dist train bug · d76ebd78
  Committed by yi.wu on Jun 13, 2018
- fix concurrency_test build error on mac · 82416f18
  Committed by qiaolongfei on Jun 13, 2018
- fix build on mac · 9ebbfa6b
  Committed by qiaolongfei on Jun 13, 2018
12 Jun, 2018 6 commits
- optimize code and comment · d6c8d267
  Committed by qiaolongfei on Jun 12, 2018
- add initial memory flag in MB for infer · 056dd404
  Committed by tensor-tang on Jun 12, 2018
- Trainer send term signal (#11220) · 34865f2d
  Committed by Wu Yi on Jun 12, 2018
```
* wip

* use executor.complete to end trainer

* fix build

* fix build with distribute off

* fix typo

* fix cmake typo

* fix build
```
- update with comments · c4c78733
  Committed by Luo Tao on Jun 12, 2018
- fix the default value prefetch_var_name_to_block_id · 2b9ff39f
  Committed by qiaolongfei on Jun 12, 2018
- Make the normalization operator more general and fix bug in l2_normalize. (#11348) · 19fd0717
  Committed by qingqing01 on Jun 12, 2018
```
* Add normalization operator.
1. Refine the raw norm_op and make it more general so that it can normalize a Tensor along any axis.
2. There was a bug in the l2_normalize API: it lacked sqrt after `reduce_sum`.
3. Use norm_op to refine the l2_normalize API.
4. Fix a bug in test_normalization_wrapper.py.
```
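The l2_normalize fix in commit 19fd0717 comes down to dividing by the square root of the summed squares rather than by the summed squares themselves. The following is a minimal pure-Python sketch for a 2-D list of floats. The helper name `l2_normalize_sketch`, the nested-list input, and the placement of `epsilon` inside the square root are assumptions made for illustration, not the Paddle API or the norm_op kernel.

```python
import math


def l2_normalize_sketch(rows, axis=1, epsilon=1e-12):
    """Hypothetical helper: L2-normalize a 2-D list along `axis` (0 or 1)."""
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1 for a 2-D input")
    if axis == 1:
        # One norm per row: sqrt(sum(x^2) + epsilon).
        norms = [math.sqrt(sum(v * v for v in row) + epsilon) for row in rows]
        return [[v / n for v in row] for row, n in zip(rows, norms)]
    # axis == 0: one norm per column.
    columns = list(zip(*rows))
    norms = [math.sqrt(sum(v * v for v in col) + epsilon) for col in columns]
    return [[v / norms[j] for j, v in enumerate(row)] for row in rows]
```

For the row `[3.0, 4.0]` the sum of squares is 25, so the correct divisor is 5 and the result is approximately `[0.6, 0.8]`. Omitting the square root would divide by 25 and give `[0.12, 0.16]`, which is not unit length.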
11 Jun, 2018 21 commits
- Add slice op. (#11052) · adc09087
  Committed by whs on Jun 11, 2018
```
* Add slice op.

* Remove using from header file and fix doc.

* Fix doc

* Small fix.
```
- fix build problem · 83a577e8
  Committed by qiaolongfei on Jun 11, 2018
- update with comments · 7bdb573d
  Committed by Luo Tao on Jun 11, 2018
- optimize code · 506fc8d9
  Committed by qiaolongfei on Jun 11, 2018
- Add brpc surpport. (#11263) · d9de6b86
  Committed by gongweibao on Jun 11, 2018
- Make status update thread-safe · 1509ae3a
  Committed by Xin Pan on Jun 11, 2018
```
The status is updated in the Process() thread
and can be checked in another HandleRequest() thread.
```
- optimize comment and code · ea106c91
  Committed by qiaolongfei on Jun 11, 2018
- refine docs of elementwise_op etc. · 76941990
  Committed by Luo Tao on Jun 11, 2018
- set status before Finish in prefetch process · 7f4b9656
  Committed by qiaolongfei on Jun 11, 2018
- add inplace attribute to op_proto_maker (#10665) · bfa3fd6f
  Committed by dzhwinter on Jun 11, 2018
```
* "add inplace attribute"

* "register inplace attribute"

* "change se-next model for memory-reuse"

* "fix typo"

* repick

* fix merge conflict

* "fix stupid error"
```
- set the thread pool of prefetch to 1 to fix a bug · 5aba10b5
  Committed by qiaolongfei on Jun 11, 2018
- polish (#11363) · 9087c668
  Committed by gongweibao on Jun 11, 2018
- fix grpc_server_test · 8fb78f6c
  Committed by qiaolongfei on Jun 11, 2018
- replace use_event with use_cuda, because use_event means the program running... · aadaadf7
  Committed by chengduoZH on Jun 11, 2018
```
replace use_event with use_cuda, because use_event means the program is running with CUDA, so use_cuda may be more intuitive.
```
- update prefetch logic in grpc_server · 4e36c0ec
  Committed by qiaolongfei on Jun 11, 2018
- Clean `sendop` `recv` operator. (#11309) · 627d7a64
  Committed by gongweibao on Jun 11, 2018
- follow comments · 961fbce8
  Committed by chengduoZH on Jun 11, 2018
- refine prefetch logic · 0d3d4ae7
  Committed by qiaolongfei on Jun 11, 2018
- Add cpu test for parallel_executor_crf executor_fetch_feed, and enable these tests · 7b723839
  Committed by chengduoZH on Jun 11, 2018
- fix allReduce bug · d24e046c
  Committed by chengduoZH on Jun 11, 2018
- Add lock to record_event. · a1254a86
  Committed by yuyang18 on Jun 11, 2018
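The "Make status update thread-safe" commit (1509ae3a) describes a status that one thread writes and another thread reads. A minimal sketch of that pattern guards both the write and the read with the same lock. The class name `RequestStatus` and the status strings are hypothetical, and the original change is in C++ rather than Python, so this only illustrates the idea.

```python
import threading


class RequestStatus:
    """Hypothetical status holder shared between a writer and a reader thread."""

    def __init__(self, initial="PROCESS"):
        self._lock = threading.Lock()
        self._status = initial

    def set(self, status):
        # Called from the processing thread.
        with self._lock:
            self._status = status

    def get(self):
        # Called from the request-handling thread.
        with self._lock:
            return self._status
```

Routing every access through `set` and `get` means no caller can read or write the field without holding the lock.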

BaiXuePrincess / Paddle is in sync with the fork's source project
