Commits · ed0a564c909353c862bcb1533e41420fdd87eb9e · PaddlePaddle / PaddleDetection

11 Jan, 2018 · 1 commit
- Optimize GemmConvMobileFunction. · ed0a564c
  Committed by hedaoyuan on Jan 11, 2018
10 Jan, 2018 · 7 commits
- Optimize maxPoolForward. · 2b202f75
  Committed by hedaoyuan on Jan 10, 2018
- [WIP] feature/parallel_gpu (#7293) · 4bcc0b64
  Committed by Yang Yang(Tony) on Jan 10, 2018
```
feature/parallel_gpu
```
- Make init device on all gpu by default (#7345) · 5f985000
  Committed by dzhwinter on Jan 10, 2018
```
* "init use all default devices"

* "fix init test"
```
- change data type of beam_search op (#7374) · efe06caa
  Committed by Qiao Longfei on Jan 10, 2018
- Topk share lod (#7373) · 91f80f79
  Committed by Qiao Longfei on Jan 10, 2018
```
* add lod tensor ToAbsOffset test

* add share lod to topk op and softmax op
```
- Calculating gradients for partial graph · 585dec3d
  Committed by xuwei06 on Jan 06, 2018
```
Added backward.calc_gradient to backpropagate gradient from given targets to inputs.
```
- Fix comment for norm_op · 0ef9dc61
  Committed by xuwei06 on Jan 05, 2018
09 Jan, 2018 · 9 commits
- Add grad_op_maker for sequence_pool. · 427e4745
  Committed by yangyaming on Jan 09, 2018
- add general memory usage interface for both CPU/CUDA (#7352) · 45e77154
  Committed by QI JUN on Jan 09, 2018
- refine WhileGradOp code · fbc30215
  Committed by fengjiayi on Jan 09, 2018
- Test dist word2vec (#7334) · e249ad12
  Committed by Yancey on Jan 09, 2018
```
* test dist word2vec

* multiple trainers work
```

- Port WarpCTC Operator (#5107) · b5fda272
  Committed by Yiqun Liu on Jan 09, 2018
  * Add Seq2BatchFunctor, which will be used in WarpCTCOp.
  * Implement WarpCTCFunctor and WarpCTCKernel.
  * Add unittest of warpctc_op.
  * Modify the check_output interface in the python unittest framework to allow checking a subset of outputs.
  * Use absolute offset lod in warpctc_op and related functors.
  * Refine the comments of warpctc_op.
  * The new python unittest supports checking a subset of the outputs, so revoke the previous change.
  * Rename the transform from LoDTensor to Tensor with shape [max_sequence_length, num_sequences, sequence_width] to PaddingSequenceFunctor.
  * Update to the newest codes.
  * Rename the PaddingSequenceFunctor to PaddingLoDTensorFunctor and move the computation of dimensions out of the functors.
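
The WarpCTC commit above mentions a transform from a LoDTensor to a dense Tensor of shape [max_sequence_length, num_sequences, sequence_width], driven by absolute-offset LoD. As a rough illustration only, the idea might be sketched as follows. `PadSequences`, its parameters, the caller-supplied pad value, and the row-major indexing are all assumptions made for this example; this is not the code of PaddingLoDTensorFunctor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the PaddlePaddle functor).
// rows:        packed sequence data, one row of `width` floats per time step
// abs_offsets: absolute row offsets, e.g. {0, 2, 5} describes two sequences
//              of length 2 and 3; assumed non-empty and non-decreasing
// Returns a row-major buffer of shape [max_len, num_seq, width].
std::vector<float> PadSequences(const std::vector<float>& rows,
                                const std::vector<std::size_t>& abs_offsets,
                                std::size_t width, float pad_value,
                                std::size_t* max_len_out) {
  const std::size_t num_seq = abs_offsets.size() - 1;

  // The longest sequence determines the first dimension.
  std::size_t max_len = 0;
  for (std::size_t s = 0; s < num_seq; ++s) {
    max_len = std::max(max_len, abs_offsets[s + 1] - abs_offsets[s]);
  }

  // Start from a buffer full of padding, then copy real steps over it.
  std::vector<float> padded(max_len * num_seq * width, pad_value);
  for (std::size_t s = 0; s < num_seq; ++s) {
    const std::size_t len = abs_offsets[s + 1] - abs_offsets[s];
    for (std::size_t t = 0; t < len; ++t) {
      for (std::size_t w = 0; w < width; ++w) {
        padded[(t * num_seq + s) * width + w] =
            rows[(abs_offsets[s] + t) * width + w];
      }
    }
  }
  *max_len_out = max_len;
  return padded;
}
```

With offsets {0, 2, 5} and width 1, the two sequences are interleaved step by step, and the shorter one is filled with the pad value at its missing third step.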

- Update · 8f962f74
  Committed by fengjiayi on Jan 09, 2018
- Rename CopyFrom to Copy for tensors (#7292) · ce6dad3b
  Committed by Yu Yang on Jan 09, 2018
```
* Rename Tensor::CopyFrom to Tensor::Copy

* Fix CI

* Fix compile
```
- Remove unused LoDTensor methods (#7247) · 1dad4bb2
  Committed by Yu Yang on Jan 09, 2018
```
* Remove unused LoDTensor methods

* Update
```
- fix GetDims bug · 8b1a81a9
  Committed by qiaolongfei on Jan 09, 2018

08 Jan, 2018 · 13 commits
- disable UseAll when init · 5b94948b
  Committed by qiaolongfei on Jan 08, 2018
- fix priority · 0b52cc88
  Committed by qiaolongfei on Jan 08, 2018
- Create tensor in recv op (#7286) · aa75f1e2
  Committed by Yancey on Jan 08, 2018
```
* create tensor in recv op

* static global function to global function
```
- add back priority · ca90356b
  Committed by qiaolongfei on Jan 08, 2018
- fix compile error in profiler.cc · 01ee42b1
  Committed by Luo Tao on Jan 08, 2018
- Add more comments · 3b0afae3
  Committed by Yang Yu on Jan 08, 2018
- Feature/add shared layout (#7233) · e94db381
  Committed by dzhwinter on Jan 08, 2018
```
* "reuse ShareLoD with no regret"

* "removed base class shareLayout"

* "fix CI"
```
- Refine get_places · 63ff0b4b
  Committed by Yang Yu on Jan 08, 2018
- cpu gpu transform function (#7191) · 0f353ab4
  Committed by Qiao Longfei on Jan 08, 2018
```
* add rename guard

* add device_data_transform

* add device_data_transform_test

* modify GetExpectedKernelType

* update operator.run

* support test test_label_semantic_roles

* optimize code

* optimize code

* rename GetActualKernelType to GetExpectedKernelType

* fix chunk_eval_op and device_data_transform_test

* add is_same_place to place

* optimize code, refine rename_guard

* refine rename guard, add GetKernelTypeForVar

* optimize code

* add some log

* rename guard

* use sub scope to create var

* fix compile

* add IsInitialized for Tensor

* add VarIsTensor

* fix op_registry_test

* test

* tmp disable priority

* restore switch_kernel.md

* code clean
```
- Remove unused included header gflags · ea0280b4
  Committed by Yibing Liu on Jan 08, 2018
- Remove the redundant switch case statement · d09503b2
  Committed by Yibing Liu on Jan 08, 2018
- Show argument dimensions with operator::DebugStringEx (#7268) · 8814bec0
  Committed by emailweixu on Jan 07, 2018
```
This can make it easier to locate error.
```
- Modify inference.cc to run example without pickletools (#7262) · 12e35141
  Committed by Siddharth Goyal on Jan 07, 2018
06 Jan, 2018 · 1 commit
- Fix profiler place bug · 7a4f3be9
  Committed by Yibing Liu on Jan 06, 2018
05 Jan, 2018 · 9 commits
- package right mkldnn and mklml libs if enabled in capi · 11ed2f2f
  Committed by tensor-tang on Jan 05, 2018
- Fix bad_alloc bug & refine code in profiler · df3b250c
  Committed by Yibing Liu on Jan 05, 2018
- fix crash when generating c-api package · 5ab27182
  Committed by tensor-tang on Jan 05, 2018
- Enhance reorder_lod_tensor_by_rank_op to support Tensor · e2192354
  Committed by guosheng on Jan 05, 2018
- capi package (#7237) · 643ff03f
  Committed by Yancey on Jan 05, 2018
- Make time calc funcs return ms instead of us · 5a0a4617
  Committed by Yibing Liu on Jan 05, 2018
- fix typos · d7e56847
  Committed by Yibing Liu on Jan 05, 2018
- Enable sorting the profiling result by different keys · 0aa03a82
  Committed by Yibing Liu on Jan 05, 2018
- Add COWPtr and its unittest · 0cfb5465
  Committed by Yang Yu on Jan 05, 2018
```
It will be used for LoD information in LoDTensor since LoD is a copy
on write field.

It is pretty slow for copying LoD information between operators. For
resnet it will cost roughly 10% time of whole time, including reading
data.
```
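
The COWPtr commit message describes LoD as a copy-on-write field whose eager copying between operators is costly. A minimal sketch of the general copy-on-write technique might look like the following. `CowHandle`, `Offsets`, and `DemoSharesUntilWrite` are hypothetical names invented for this illustration; the sketch is single-threaded and is not the COWPtr class added by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, single-threaded sketch of a copy-on-write handle.
// It is NOT the COWPtr class from the commit.
template <typename T>
class CowHandle {
 public:
  explicit CowHandle(T value)
      : data_(std::make_shared<T>(std::move(value))) {}

  // Read-only access never copies the payload.
  const T& Data() const { return *data_; }

  // Mutable access clones the payload first if another handle shares it.
  T* MutableData() {
    if (data_.use_count() > 1) {
      data_ = std::make_shared<T>(*data_);
    }
    return data_.get();
  }

  // True while both handles still point at the same payload.
  bool SharesWith(const CowHandle& other) const {
    return data_ == other.data_;
  }

 private:
  std::shared_ptr<T> data_;
};

// Stand-in for LoD offsets; the real LoD type is not shown in this log.
using Offsets = std::vector<std::vector<std::size_t>>;

bool DemoSharesUntilWrite() {
  Offsets lod;
  lod.push_back({0, 2, 5});
  CowHandle<Offsets> a(lod);
  CowHandle<Offsets> b = a;  // cheap: only the shared pointer is copied
  const bool shared_before = a.SharesWith(b);
  b.MutableData()->at(0).push_back(9);  // first write clones the payload
  return shared_before && !a.SharesWith(b) &&
         a.Data()[0].size() == 3 && b.Data()[0].size() == 4;
}
```

Copying a handle costs one reference-count increment, and the payload is duplicated only when a shared handle is first written through `MutableData()`. That is the general property that makes this pattern attractive for metadata that is passed along often but modified rarely.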

PaddlePaddle / PaddleDetection: synced successfully about 1 year ago