- 21 Aug, 2019 1 commit
-
-
Committed by chengduo
* add warning info for CPU_NUM test=develop
* update dygraph parallel.py test=develop
* prune the feed op in compiler test=release/1.5
* remove compile from PE test=develop
* test CUDAPinnedPlace in reader test=release/1.5
-
- 20 Aug, 2019 1 commit
-
-
Committed by chengduo
* fix REGISTER_OP_WITHOUT_GRADIENT test=develop
-
- 29 Jul, 2019 1 commit
-
-
Committed by chengduo
* fix backward bug
-
- 08 Jul, 2019 1 commit
-
-
Committed by Zhaolong Xing
fix mask rcnn; add an interface for setting optim_cache_dir (e.g. when in trt int8 mode and loading the model from memory, there should be an interface for setting the trt calibration table data dir) test=release/1.5
-
- 05 Jul, 2019 1 commit
-
-
Committed by gongweibao
-
- 28 Jun, 2019 1 commit
-
-
Committed by 石晓伟
* Update the Anakin interfaces for content-dnn and MLU (#17890)
* update anakin-engine interfaces for content-dnn test=develop
* support only-gpu mode of Anakin modify eltwise parse test=develop
* modification for thread-safe test=develop
* Integrated template instance test=develop
* increase template parameters test=develop
* support MLU predictor test=develop
* update anakin cmake files test=develop
* update TargetWrapper::set_device
* update the initialization of anakin subgraph test=develop
* use the default constructor of base class test=develop
* modify the access level of anakin engine (#18015) test=develop
* fix ci test cmake test=develop
-
- 27 Jun, 2019 1 commit
-
-
Committed by chengduo
* update pe reduce config test=release/1.5
* drop the local_exe_scopes of the previous parallel_executor test=release/1.5
-
- 26 Jun, 2019 1 commit
-
-
Committed by chengduo
test=release/1.5
-
- 24 Jun, 2019 1 commit
-
-
Committed by chengduo
test=release/1.5
-
- 19 Jun, 2019 3 commits
-
-
Committed by chengduo
* update execution_strategy option default value test=release/1.5
* fix doc error test=release/1.5
-
Committed by chengduo
* remove nccl dep when the number of GPU is 1 test=develop
* use multi card run syncBN test=release/1.5
-
Committed by hutuxian
Add trainer_desc proto DEPS to solve CI random fail.
-
- 17 Jun, 2019 1 commit
-
-
Committed by hutuxian
cherry-pick for (https://github.com/PaddlePaddle/Paddle/pull/17402)
Add Pipeline Concurrency Train Mode:
  - Cpp: pipeline_trainer & section_worker
  - Python: PipelineOptimizer
  - Add a new data_feed type: PrivateInstantDataFeed
  - Add a test demo of pipeline trainer; the test model is gnn
  - win32 is not supported for now
-
- 15 Jun, 2019 1 commit
-
-
Committed by chengduo
* update CPU_NUM config test=develop
-
- 14 Jun, 2019 1 commit
-
-
Committed by gongweibao
-
- 13 Jun, 2019 1 commit
-
-
Committed by gongweibao
-
- 10 Jun, 2019 2 commits
-
-
Committed by Zeng Jinle
* remove attribute in Allocator::Allocate, test=develop
* fix travis ci error, test=develop
-
Committed by gongweibao
-
- 08 Jun, 2019 1 commit
-
-
Committed by gongweibao
-
- 06 Jun, 2019 2 commits
-
-
Committed by gongweibao
-
Committed by wopeizl
* fix the ParallelExecutor on Windows test=develop
* restrict to using only one GPU under Windows
-
- 05 Jun, 2019 1 commit
-
-
Committed by baojun
* delay infershape test=develop
* fall back subblock to paddle test=develop
* fix edge cases test=develop
* remove output duplicates test=develop
* handle reshape2_grad infershape test=develop
-
- 04 Jun, 2019 2 commits
- 03 Jun, 2019 1 commit
-
-
Committed by chengduo
test=develop
-
- 31 May, 2019 1 commit
-
-
Committed by guru4elephant
* fix prepare context redundant code problem, optimize executor by caching create_variables test=develop
* cache sub_scope, program, var when use_program_cache=True is set
* make fetch_list runnable with variables, add more unittest for use_program_cache
-
- 30 May, 2019 2 commits
-
-
Committed by chengduo
* add event for fast executor and add threads for scopebuffer executor test=develop
-
Committed by Yiqun Liu
* Enhance fused_elementwise_activation op. test=develop
* Move the api fused_elementwise_activation to contrib. test=develop
* Add including files. test=develop
* Add the support of sigmoid in fused_elementwise_activation op.
* Update API.spec. test=develop
-
- 29 May, 2019 2 commits
-
-
Committed by gongweibao
-
Committed by mozga-intel
-
- 28 May, 2019 1 commit
-
-
Committed by Jacek Czaja
* - changes to graph detector
  - Changes to pass
  - Added ut for new pass
  - use_pass
  - Added pass to mkldnn passes
  - fix to registration
  - improved verbose messaging for conv bias passes
  - Lint fixes test=develop
* - Lint fixes test=develop
-
- 27 May, 2019 3 commits
-
-
Committed by Sylwester Fraczek
* add Concat quantization
  add unit test for quantizing concat
  fix for wrong value when the input is not in map of calculated scales
  add use_quantizer to concat_op.cc
  add scale_algo rules for concat
  test=develop
* missing fix for multiple inputs quantize-squash
* wojtuss review fix: adding comment test=develop
-
Committed by gongweibao
-
Committed by Zeng Jinle
* Revert "Revert "Fix allocator bug"" This reverts commit 174d0d0b.
* Revert "fix travis ci" This reverts commit 5656fa9f. test=develop
* add inlined_vector.h, test=develop
* add inlined_vector_test,test=develop
* clean code of allocator,test=develop
* delete zero_size_allocator.h,test=develop
* fix failed unittest,test=develop
-
- 25 May, 2019 1 commit
-
-
Committed by Zhaolong Xing
* align fluid int8 train and trt int8 predict. trt int8 predict init op converter
* 2. align fluid int8 train and trt int8 inference. enhance quant dequant fuse pass; enhance op converter, trt engine, trt engine op, trt subgraph pass.
* 3. add delete_quant_dequant_pass for trt test=develop
* 4. add the missing file test=develop
* 5. modified the C++ interface but forgot to modify the pybind code; fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter test=develop
-
- 24 May, 2019 5 commits
-
-
Committed by Michał Gallus
* fuse mul and elementwise add to fc
* Reimplement the FC forward operator
* Fix FC MKLDNN integration by transposing weights
* Add FC MKLDNN Pass test=develop
* FC MKLDNN Pass: change memcpy to std::copy
* Fix MKLDNN FC handling of mismatch input and weights dims
* Lower tolerance for MKL-DNN in resnet50 test test=develop
* Adjust FC to support MKLDNN Op placement test=develop
* Adjust Placement Op to set use_mkldnn attribute for graph test=develop
* MKLDNN FC: fix weights format so that gemm version is called test=develop
* FC MKLDNN: Remove tolerance decrease from tester_helper
* FC MKL-DNN: Refactor the code, change input reorder to weight reorder
* MKL-DNN FC: Introduce operator caching test=develop
* FC MKL-DNN: Fix the tensor type in ExpectedKernelType test=develop
* FC MKL-DNN: fix style changes test=develop
* FC MKL-DNN: fallback to native on non-supported dim sizes test=develop
* FC MKLDNN: fix CMake paths test=develop
* FC MKLDNN: Refine placement pass graph mkldnn attribute test=develop
* Fix Transpiler error for fuse_conv_eltwise test=develop
* Fix missing STL includes in files test=develop
* FC MKL-DNN: Enable new output size computation. Also, refine pass to comply with newest interface. test=develop
* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled
* FC MKL-DNN: Allow Weights to use oi or io format
* FC MKL-DNN: Adjust UT to work with correct dims test=develop
* Enable MKL DEBUG for resnet50 analyzer test=develop
* FC MKL-DNN: Improve Hashing function test=develop
* FC MKL-DNN: Fix shape for fc weights in transpiler
* FC MKL-DNN: Update input pointer in re-used fc primitive
* Add log for not handling fc fuse for unsupported dims test=develop
* FC MKL-DNN: Move transpose from pass to Op Kernel test=develop
* FC MKL-DNN: Disable transpose in unit test test=develop
* FC MKL-DNN: Remove fc_mkldnn_pass from default list
* Correct Flag for fake data analyzer tests test=develop
* FC MKL-DNN: Add comment about fc mkldnn pass disablement test=develop
* FC MKL-DNN: Disable fc in int8 tests test=develop
-
Committed by wopeizl
* add __str__ method for tensor and lodtensor to support print test=develop
-
Committed by Sylwester Fraczek
* add conv_concat_relu fuse test=develop
* add test code test=develop
* added missing include with unordered_map test=develop
* review fixes for wojtuss test=develop
* remove 'should (not) be fused' comment statements; one of them was invalid anyway test=develop
-
Committed by Sylwester Fraczek
* fix quantize_squash_pass segfault when there is no tensor linked to Bias input test=develop
* add googlenet test test=develop
* fix concat CreateKey not using input format test=develop
-
Committed by guru4elephant
* polish_executor_and_add_ctx_cache
-