1. 26 Nov, 2019 1 commit
    • Add fc padding to improve mkl GEMM's performance when N and K are multiples of 128. (#20972) · 234060f8
      GaoWei8 committed
      * Add fc padding to solve mkl performance
      test=develop
      
      * fix gpu pass and error information
      test=develop
      
      * fix fc_fuse_pass_test
      test=develop
      
      * fix error information
      test=develop
      
      * fix error information
      test=develop
      
      * fix name and add fc op padding test
      test=develop
      
      * fix attributes
      test=develop
      
      * optimize fc padding
      test=develop
      
      * fix test
      test=develop
  2. 03 Sep, 2019 1 commit
    • Add a pass to enable the use of cudnn (#19346) · c5548178
      Yiqun Liu committed
      * Add an interface to enable cudnn for inference.
      
      * Add cudnn_placement_pass.
      test=develop
      
      * Set the default value of cudnn_enabled_op_types to null.
      test=develop
      
      * Write the common basic class, placement_pass_base, to refine the codes.
      test=develop
      
      * Call EnableCUDNN in unittest.
      test=develop
      
      * Refine cudnn_placement_pass tester.
      
      * Enable the testing of cudnn_placement_pass in inference's unittest.
      test=develop
      
      * Add the check of op kernels.
      test=develop
  3. 31 Jul, 2019 1 commit
    • Trt fp16 support (#18860) · 61238d31
      Zhaolong Xing committed
      * Fix Mask rcnn predictor
          1. refine memory optim algorithm to support the model with the block op.
          2. output diff : modify the affine channel fuse
          3. add condition_block_infer op
      add interface for setting trt calib table dir
      test=develop
      
      * add the missing files.
      test=develop
      
      * 1 add trt fp16 support
      test=develop
  4. 08 Jul, 2019 1 commit
  5. 06 Jun, 2019 1 commit
  6. 29 May, 2019 1 commit
  7. 25 May, 2019 1 commit
    • TRT: Support set dynamic range in int8 mode. (#17524) · 61221ebc
      Zhaolong Xing committed
      * fluid int8 train and trt int8 predict align.
      trt int8 predict init
      op converter
      
      * 2. align fluid int8 train and trt int8 inference.
      enhance quant dequant fuse pass
      enhance op converter, trt engine, trt engine op, trt subgraph pass.
      
      * 3. add delete_quant_dequant_pass for trt
      
      test=develop
      
      * 4. add the missing file
      test=develop
      
      * 5. I modified the C++ interface but forgot to update the pybind code
      fix the IS_TRT_VERSION_GE bug, and fix elementwise op converter
      test=develop
  8. 16 May, 2019 1 commit
  9. 07 May, 2019 1 commit
    • Cherry-pick benchmark related changes from release/1.4 (#17156) · a72dbe9a
      石晓伟 committed
      * cherry-pick commit from 88770542
      
      * cherry-pick commit from 3f0b97df
      
      * cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn
      
      (cherry picked from commit 8643dbc2)
      
      * Cherry-Pick from 16662 : Anakin subgraph cpu support
      
      (cherry picked from commit 7ad182e1)
      
      * Cherry-pick from 1662, 16797.. : add anakin int8 support
      
      (cherry picked from commit e14ab180)
      
      * Cherry-pick from 16813 : change singleton to graph RegistBlock
      test=release/1.4
      
      (cherry picked from commit 4b9fa423)
      
      * Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2
      
      Support ShuffleNet and MobileNet-v2, test=release/1.4
      
      (cherry picked from commit a6fb066f)
      
      * Cherry-pick : anakin subgraph add opt config layout argument #16846
      test=release/1.4
      
      (cherry picked from commit 8121b3ec)
      
      * 1. add shuffle_channel_detect
      
      (cherry picked from commit 6efdea89)
      
      * update shuffle_channel op convert, test=release/1.4
      
      (cherry picked from commit e4726a06)
      
      * Modify symbol export rules
      
      test=develop
  10. 28 Mar, 2019 1 commit
    • Fix the interface of Pass::Apply (#16484) · ed61d67c
      chengduo committed
      * modify the interface of Pass::Apply
      test=develop
      
      * Polish code
      test=develop
      
      * Fix Travis CI
      test=develop
      
      * fix Pass::Apply interface
      test=develop
      
      * Fix Travis CI
      test=develop
  11. 25 Mar, 2019 1 commit
  12. 22 Mar, 2019 1 commit
  13. 21 Mar, 2019 1 commit
  14. 20 Mar, 2019 4 commits
  15. 19 Mar, 2019 1 commit
  16. 18 Mar, 2019 1 commit
    • Add cpu_quantize_pass for C-API quantization (#16127) · 2579ade4
      Wojciech Uss committed
      * Add cpu_quantize_pass for C-API quantization
      
      test=develop
      
      * add cpu_quantize_pass test
      
      * fix lint: add include memory unordered_map and unordered_set
      
      test=develop
      
      * fuse_relu 1
      
      test=develop
      
      * tuned 2 without squash
      
      * fixes
      
      test=develop
      
      * remove unused vars
      
      test=develop
      
      * refactored
      
      test=develop
      
      * fix lint c-style cast -> C++ style cast
      
      test=develop
      
      * remove QuantMax and c style casts
      
      test=develop
      
      * last usage of QuantMax removed
      
      test=develop
      
      * Fix Analysis Predictor UT
      
      Check if memory_optimize_pass has already been added
      to the analysis config before adding a new one, so
      that it is not added multiple times.
      test=develop
      
      * change map to unordered_map
      
      fix the forgotten part of cpu_quantize_pass_tester.cc
      
      test=develop
      
      * removed quantized attribute
      
      * fixed cpu_quantize_pass_tester and op attr comments
      
      test=develop
      
      * removed redundant line
      
      test=debug
      
      * removed gmock
      
      test=develop
      
      * fix after merge
  17. 08 Mar, 2019 4 commits
  18. 07 Mar, 2019 1 commit
    • cannot pass ci · a9ed4277
      nhzlx committed
      add if use static engine for trt
      test=develop
  19. 26 Feb, 2019 1 commit
  20. 22 Feb, 2019 1 commit
    • 5. add static trt load model · 1d5ef7c9
      nhzlx committed
      1). add static trt load model
      2). fix bug: when device_id is not 0, the trt will have a bug
      test=develop
  21. 18 Feb, 2019 1 commit
  22. 13 Feb, 2019 1 commit
    • Clang build fixes (#15628) · da9c94da
      Gabor Buella committed
      * Remove some superfluous std::move calls
      
      The std::move call triggered a build error (with -Werror):
      ```
      [  9%] Building CXX object paddle/fluid/memory/allocation/CMakeFiles/allocator_facade.dir/allocator_facade.cc.o
      /home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
                  [this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
                                  ^
      /home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: note: remove std::move call here
                  [this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
                                  ^~~~~~~~~~                          ~
      1 error generated.
      ```
      
      See: https://reviews.llvm.org/D7633
      
      * Remove a superfluous lambda capture from framework/operator.h
      
      ```
      [ 10%] Building CXX object paddle/fluid/platform/CMakeFiles/device_context.dir/init.cc.o
      In file included from /home/tej/code/gbuella_paddle/paddle/fluid/platform/init.cc:19:
      /home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.h:229:21: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
                         [this](Variable* var) { return var; });
                          ^~~~
      1 error generated.
      ```
      
      Changed it to `return it->second;`, as in the function below.
      
      * Rethrow an exception (instead of copying it)
      
      ```
      [ 11%] Building CXX object paddle/fluid/framework/CMakeFiles/operator.dir/operator.cc.o
      /home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: error: local variable 'exception' will be copied despite being thrown by name [-Werror,-Wreturn-std-move]
            throw exception;
                  ^~~~~~~~~
      /home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: note: call 'std::move' explicitly to avoid copying
            throw exception;
                  ^~~~~~~~~
                  std::move(exception)
      
      ```
      
      See https://reviews.llvm.org/D43322 for an explanation of this diagnostic message.
      
      * Remove an unused variable
      
      ```
      /home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:884:16: error: private field 'scope_' is not used [-Werror,-Wunused-private-field]
        const Scope& scope_;
                     ^
      ```
      
      * struct ComputationOpHandle -> class ComputationOpHandle
      
      ```
      [ 13%] Building CXX object paddle/fluid/framework/details/CMakeFiles/memory_early_delete_pass.dir/memory_early_delete_pass.cc.o
      In file included from /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/memory_early_delete_pass.cc:21:
      /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: error: class 'ComputationOpHandle' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
      class ComputationOpHandle;
      ^
      /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/computation_op_handle.h:29:8: note: previous use is here
      struct ComputationOpHandle : public OpHandleBase {
             ^
      /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: note: did you mean struct here?
      class ComputationOpHandle;
      ^~~~~
      struct
      1 error generated.
      ```
      
      * Fix name() methods under fluid/operators
      
      ```
      In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.cc:15:
      In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.h:19:
      /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/jitcode.h:71:23: error: 'name' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]
        virtual const char* name() const = 0;
                            ^
      /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen_base.h:31:23: note: overridden virtual function is here
        virtual const char* name() const = 0;
                            ^
      ```
      
      test=develop
  23. 31 Jan, 2019 1 commit
  24. 29 Jan, 2019 1 commit
  25. 26 Jan, 2019 1 commit
  26. 25 Jan, 2019 1 commit
  27. 24 Jan, 2019 1 commit
    • fix two bugs: · 0779e355
      nhzlx committed
      1. graph and program_desc alignment
      2. trt stream
      
      test=develop
  28. 21 Jan, 2019 1 commit
  29. 16 Jan, 2019 1 commit
  30. 09 Jan, 2019 1 commit
  31. 07 Jan, 2019 1 commit
  32. 26 Dec, 2018 1 commit
  33. 08 Dec, 2018 1 commit
  34. 14 Nov, 2018 1 commit
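
The "Rethrow an exception (instead of copying it)" item in the Clang build fixes commit (#15628) above turns on a general C++ rule: `throw e;` copy-initializes a new exception object from the static type of `e`, while a bare `throw;` rethrows the exception currently being handled. The sketch below illustrates that difference only; it is not the code from `operator.cc`, and every name in it (`DetailedError`, `RethrowByCopy`, `RethrowInPlace`, `SurvivesAsDetailedError`) is hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical derived exception type, used only for this illustration.
struct DetailedError : std::runtime_error {
  explicit DetailedError(const std::string& msg) : std::runtime_error(msg) {}
};

// Throwing by name creates a new exception object copied from the static
// type of `e` (std::runtime_error), so the DetailedError part is sliced off.
void RethrowByCopy() {
  try {
    throw DetailedError("boom");
  } catch (std::runtime_error& e) {
    throw e;
  }
}

// A bare `throw;` rethrows the original exception object without copying,
// so its dynamic type (DetailedError) is preserved.
void RethrowInPlace() {
  try {
    throw DetailedError("boom");
  } catch (std::runtime_error&) {
    throw;
  }
}

// Returns true if the exception escaping `rethrow` is still a DetailedError.
bool SurvivesAsDetailedError(void (*rethrow)()) {
  try {
    rethrow();
  } catch (const DetailedError&) {
    return true;
  } catch (const std::exception&) {
    return false;
  }
  return false;
}
```

Calling `SurvivesAsDetailedError(RethrowByCopy)` yields `false` and `SurvivesAsDetailedError(RethrowInPlace)` yields `true`, which is one reason a bare `throw;` is usually preferred inside a handler.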