Commits · 5c3cbb58e55d0686894c00b7c0d05d1ee6c0bea2 · Crayon鑫 / Paddle

12 Jun, 2019 1 commit

combine noavx and avx package (#17889) · 5c06bff2

Committed by tensor-tang on Jun 12, 2019

* support avx and noavx core

* add catch and give some log

test=develop

* fix build

test=develop

* add missing package

test=develop

* fix pybind name

test=develop

* fix import error

test=develop

* combine noavx core

test=develop

* add requirements

test=develop

* fix unknown message

test=develop

* fix api spec

test=develop

* refine and clean

test=develop

* update

* pass dist ut

* follow comments

test=develop

* refine scripts

test=develop

5c06bff2

10 Jun, 2019 1 commit
- Feature/refine api for dygraph (#17907) · 4d5f6937
  Committed by Jiabin Yang on Jun 10, 2019
```
* WIP

* WIP

* test=develop, add api doc and example code for dygraph
```
  4d5f6937
06 Jun, 2019 4 commits

Add backward and optimizer operator dependency pass. (#17746) · fbbdc9cc
Committed by gongweibao on Jun 06, 2019

fbbdc9cc
Make ParallelExecutor support Windows GPU (#17787) · 453a49b1
Committed by wopeizl on Jun 06, 2019
```
* fix the ParallelExecutor on Windows
test=develop
* restrict to use one GPU only under windows
```
453a49b1

INT8 MKL-DNN v2 integrate to slim (#17634) · 993c703b

Committed by 翟飞跃 on Jun 06, 2019

* refactor PR 16865

* delete mergetool files

* test=develop

* test=develop

* test=develop

* test=develop

* create dir for int8 model before call SaveOptimModel

* test=develop

* mkldnn int8 only support linux; test=develop

* refine code; test=develop

* remove comment; test=develop

* refine code; test=develop

* fix bug; test=develop

* add exception for mkldnn_post_training_strategy

* reuse int8v2 CAPI dataset; test=develop

* fix accuracy check bug; test=develop

* remove tab

* convert files to unix format

* test=develop

* reduce CI time;test=develop

* reduce CI time and refine code;test=develop

* refine comment; test=develop

* add cmake FLAGS;test=develop

* remove predict_num;test=develop

993c703b

use pyreader to read data in dygraph mode (#17314) · 841553e1

Committed by wopeizl on Jun 06, 2019

* use pyreader to read data

* add return_list to PyReader to support return value represented as list

841553e1

05 Jun, 2019 1 commit

Use Python C-API to speed up dygraph trace (#17837) · 674e0ce2

Committed by Zeng Jinle on Jun 05, 2019

* use python api to reduce python time cost, test=develop

* fix travis ci, test=develop

* fix Py_None error,test=develop

674e0ce2

04 Jun, 2019 1 commit

Using Smart pointer to optimize memory usage of dyGraph (#17768) · 3b70f870

Committed by Jiabin Yang on Jun 04, 2019

* for debug

* test=develop, memory optimize for dygraph using shared_ptr

* test=develop, fix travis ci showed error

* test=develop, fix bug for recurrent usage of varbase

* test=develop, init varbase when it need to be Add

3b70f870

31 May, 2019 1 commit

fix prepare context redundant code problem, optimize executor by cach… (#17743) · d5239109

Committed by guru4elephant on May 31, 2019

* fix prepare context redundant code problem, optimize executor by caching create_varaiables
test=develop

* cache sub_scope, program, var when use_program_cache=True is set

* make fetch_list runnable with variables, add more unittest for use_program_cache

d5239109

27 May, 2019 2 commits
- clean code of py_layer in dygraph mode,test=develop (#17661) · 432ac701
  Committed by Zeng Jinle on May 27, 2019
  
  432ac701
- Add multi-ncclcomm and 2D ncclallreduce support. (#17263) · 65bbf950
  Committed by gongweibao on May 27, 2019
  
  65bbf950
25 May, 2019 1 commit

TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc

Committed by Zhaolong Xing on May 25, 2019

* fluid int8 train and trt int8 predict align.
trt int8 predict init
op converter

* 2. align fluid int8 train and trt int8 inference.
enhance quant dequant fuse pass
enhance op converter, trt engine, trt engine op, trt subgraph pass.

* 3. add delete_quant_dequant_pass for trt

test=develop

* 4. add the missing file
test=develop

* 5. I modified the C++ interface but forgot to modify the pybind code
fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
test=develop

61221ebc

24 May, 2019 4 commits
- add __str__ method for tensor and lodtensor to support print test=dev… (#17588) · 6724a652
  Committed by wopeizl on May 24, 2019
```
* add __str__ method for tensor and lodtensor to support print test=develop
```
  6724a652
- add Run Prepared Ctx (#17616) · 326bf829
  Committed by guru4elephant on May 24, 2019
```
add Run Prepared Ctx, fix pybind problem
```
  326bf829
- BuildStrategy api comment (#17348) · 2280f185
  Committed by flame on May 24, 2019
```
Python examples of fluid.layers.io.double_buffer and some BuildStrategy's methods.
```
  2280f185
- polish_executor_and_add_ctx_cache (#17536) · 7f8bc49d
  Committed by guru4elephant on May 24, 2019
```
* polish_executor_and_add_ctx_cache
```
  7f8bc49d
23 May, 2019 3 commits
- Fix allocator bug (#16712) · c6189637
  Committed by Zeng Jinle on May 23, 2019
```
* Revert "Revert "Fix allocator bug""

This reverts commit 174d0d0b.

* Revert "fix travis ci"

This reverts commit 5656fa9f.

test=develop

* add inlined_vector.h, test=develop

* add inlined_vector_test,test=develop
```
  c6189637
- fix distribute doc test=develop (#17318) · 92e7d5d7
  Committed by Qiao Longfei on May 23, 2019
```
* fix distribute doc
```
  92e7d5d7
- Async exe support communicator (#17386) · 58f7695a
  Committed by Qiao Longfei on May 23, 2019
```
Async exe support communicator
```
  58f7695a
20 May, 2019 1 commit
- remove unused expected_kernel_cache_pass (#17486) · 32da5e9c
  Committed by Tao Luo on May 20, 2019
```
test=develop
```
  32da5e9c
17 May, 2019 2 commits

polish parallel dygraph code (#17164) · 02175555
Committed by Yan Xu on May 17, 2019
```
* add var grad hook test=develop
```
02175555

Fix/Fix memory leak in dygraph (#17394) · d7df4e5e

Committed by Jiabin Yang on May 17, 2019

* test=develop, add gradient sort backward strategy

* test=develop, fix test by add FLAGS_cudnn_deterministic on new tests

* test=develop, fix memory leak in dygraph mode

* test=develop, fix memory leak in dygraph mode

* test=develop, polish code

* test=develop, polish code

* test=develop, polish code

d7df4e5e

16 May, 2019 1 commit

Add setting Scope function for the graph class (#17417) · 4a1b7fec

Committed by Zhen Wang on May 16, 2019

* add set_not_owned function for graph

* add scope set. test=develop

* add scope_ptr enforce not null before setting.test=develop

4a1b7fec

15 May, 2019 1 commit

add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118) · 66d51206

Committed by jiaqi on May 15, 2019

* add save/load model, shrink table, cvm, config file & fix pull dense bug
test=develop

* fix global shuffle bug, fix pull dense bug, fix release memory bug, fix shrink error
add client flush, add get data size
test=develop

* fix global shuffle bug
test=develop

* fix global shuffle bug
test=develop

* fix code style
test=develop

* fix code style & modify pslib cmake
test=develop

* fix error of _role_maker
test=develop

* fix code style
test=develop

* fix code style
test=develop

* fix code style
test=develop

* fix code style
test=develop

* fix code style
test=develop

* fix windows compile error of fleet
test=develop

* fix global shuffle bug

* add comment
test=develop

* update pslib.cmake
test=develop

* fix fill sparse bug
test=develop

* fix push sparse bug
test=develop

66d51206

14 May, 2019 1 commit

make parallel_executor support FLAGS_use_mkldnn (#17341) · 68ec0a6f

Committed by Tao Luo on May 14, 2019

* make parallel_executor support FLAGS_use_mkldnn

test=develop

* add warning when set mkldnn_enabled_op_types_ in non-mkldnn env

test=develop

68ec0a6f

13 May, 2019 1 commit

test=develop, add gradient sort backward strategy (#17125) · 4624d7c6

Committed by Jiabin Yang on May 13, 2019

* test=develop, add gradient sort backward strategy

* test=develop, fix test by add FLAGS_cudnn_deterministic on new tests

4624d7c6

12 May, 2019 1 commit
- Add DropLocalExeScopes in ParallelExecutor (#17297) · bc833945
  Committed by chengduo on May 12, 2019
```
* reset drop local scope counter
test=develop
```
  bc833945
10 May, 2019 1 commit

Double backward of conv2d. (#17211) · e32c9888

Committed by qingqing01 on May 10, 2019

* Add conv2d_grad_grad_op
* Extract the cuDNN conv algo searching code into conv_cudnn_helper.h.
    - Now use it in conv2d_grad_grad.
    - Will simplify the searching code in conv2d and conv2d_grad in next PR.
* Enhance and fix bug in unit testing of gradient_checker.
* Support fetching empty variables, which return None in Python.

e32c9888

08 May, 2019 3 commits

Repair api example (#17221) · e388a1fb

Committed by lujun on May 08, 2019

Fix the following API examples:

paddle.fluid.scope_guard
paddle.fluid.backward.append_backward
paddle.fluid.cpu_places
paddle.fluid.cuda_pinned_places
paddle.fluid.cuda_places
paddle.fluid.in_dygraph_mode
paddle.fluid.CUDAPlace
paddle.fluid.CPUPlace
paddle.fluid.CUDAPinnedPlace

e388a1fb

Code Clean: Move all pass to paddle::framework::ir (#17228) · 04bd413a
Committed by chengduo on May 08, 2019
```
* move pass to ir

* polish code
test=develop

* fix dependency
test=develop
```
04bd413a

fix api doc,test=develop (#17241) · f2fa3f73
Committed by Zeng Jinle on May 07, 2019

f2fa3f73

07 May, 2019 1 commit

Cherry-pick benchmark related changes from release/1.4 (#17156) · a72dbe9a

Committed by 石晓伟 on May 07, 2019

* cherry-pick commit from 88770542

* cherry-pick commit from 3f0b97df

* cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn

(cherry picked from commit 8643dbc2)

* Cherry-Pick from 16662 : Anakin subgraph cpu support

(cherry picked from commit 7ad182e1)

* Cherry-pick from 1662, 16797.. : add anakin int8 support

(cherry picked from commit e14ab180)

* Cherry-pick from 16813 : change singleton to graph RegistBlock
test=release/1.4

(cherry picked from commit 4b9fa423)

* Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2

Support ShuffleNet and MobileNet-v2, test=release/1.4

(cherry picked from commit a6fb066f)

* Cherry-pick : anakin subgraph add opt config layout argument #16846
test=release/1.4

(cherry picked from commit 8121b3ec)

* 1. add shuffle_channel_detect

(cherry picked from commit 6efdea89)

* update shuffle_channel op convert, test=release/1.4

(cherry picked from commit e4726a06)

* Modify symbol export rules

test=develop

a72dbe9a

06 May, 2019 1 commit
- Fix tensor_py.h (#17195) · c5eeecca
  Committed by Zeng Jinle on May 06, 2019
```
* fix tensor_py,test=develop

* change class name,test=develop
```
  c5eeecca
30 Apr, 2019 1 commit

Fix mem leak when converting Tensor to numpy array (#17182) · 5dfe2ab9

Committed by Zeng Jinle on Apr 30, 2019

* fix mem leak when converting Tensor to numpy array
test=develop

* remove unused unittest,test=develop

* follow comments, test=develop

* fix dygraph bug,test=develop

5dfe2ab9

25 Apr, 2019 1 commit
- ParallelDyGraph with GPU collective mode (#16827) · 0b07eef1
  Committed by Yan Xu on Apr 25, 2019
```
implement dygraph.parallel.DataParallel to hook reduce op.
```
  0b07eef1
22 Apr, 2019 3 commits
- add doc for memory_optimize, test=develop (#17010) · a770ce06
  Committed by liuwei1031 on Apr 22, 2019
```
* add doc for memory_optimize, test=develop

* update doc, test=develop

* doc update, test=develop
```
  a770ce06
- Speed unit testing. (#16978) · ea42e431
  Committed by qingqing01 on Apr 22, 2019
```
* Speed affine_channel_op unit testing
* Add check in tensor_py
* Fix ONLY_CPU Compiling
```
  ea42e431
- fix nccl wrapper on windows · 51a0243a
  Committed by wopeizl on Apr 22, 2019
```
test=develop
```
  51a0243a
21 Apr, 2019 1 commit

Refine model gpu memory (#16993) · 1202d3fc

Committed by Zeng Jinle on Apr 21, 2019

* speedup gc and inplace softmax_with_cross_entropy_grad
test=develop

* refine models gpu mem
Merge skip vars and warning messages of mem opt
remove relu mem opt
test=develop

* follow comments
test=develop

1202d3fc

18 Apr, 2019 1 commit
- Polish DGC code (#16818) · cbdb8a17
  Committed by gongweibao on Apr 18, 2019
  
  cbdb8a17

Crayon鑫 / Paddle is in sync with the fork's source project