1. 06 Dec, 2019 1 commit
    • CHERRY_PICK: Better TensorRT support (#20858) (#21578) · 0a4002f5
      Zhaolong Xing committed
      * Fix TensorRT detection bug
      
      1. Add a new search path for TensorRT in tensorrt.cmake
      2. Add a clearer debug message
      3. Fix TensorRT version detection
      
      In the official NVIDIA Docker image, TensorRT headers are located at
      `/usr/include/x86_64-linux-gnu` and TensorRT libraries are located
      at `/usr/lib/x86_64-linux-gnu`, so using `-DTENSORRT_ROOT` will
      fail to detect TensorRT.
      
      There is no debug or warning message to tell the developer that
      TensorRT detection failed.
      
      In later versions of TensorRT (e.g. v6), `NV_TENSORRT_MAJOR` is
      defined in `NvInferVersion.h` instead of `NvInfer.h`, so add a
      compatibility fix.
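
The header fallback described above can be sketched in Python. This is only an illustration of the lookup order, not the CMake logic from the commit: the function name `detect_tensorrt_major`, the regular expression, and the probing order are assumptions made for this example.

```python
import re
import tempfile
from pathlib import Path

# Headers to probe, newest layout first (order is an assumption of this sketch).
# Later TensorRT releases (e.g. v6) define the version macros in
# NvInferVersion.h; older releases define them in NvInfer.h.
VERSION_HEADERS = ("NvInferVersion.h", "NvInfer.h")
MAJOR_RE = re.compile(r"^\s*#\s*define\s+NV_TENSORRT_MAJOR\s+(\d+)", re.MULTILINE)


def detect_tensorrt_major(include_dir):
    """Return the TensorRT major version found under include_dir, or None."""
    for name in VERSION_HEADERS:
        header = Path(include_dir) / name
        if not header.is_file():
            continue
        match = MAJOR_RE.search(header.read_text(errors="ignore"))
        if match:
            return int(match.group(1))
    # Nothing matched: the caller should report this instead of failing silently.
    return None


def demo_new_layout():
    """Simulate a newer layout where NvInfer.h only includes the version header."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "NvInfer.h").write_text('#include "NvInferVersion.h"\n')
        Path(tmp, "NvInferVersion.h").write_text("#define NV_TENSORRT_MAJOR 6\n")
        return detect_tensorrt_major(tmp)


def demo_old_layout():
    """Simulate an older layout where NvInfer.h defines the macro itself."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "NvInfer.h").write_text("#define NV_TENSORRT_MAJOR 5\n")
        return detect_tensorrt_major(tmp)
```

Returning `None` rather than a default version keeps a failed detection visible, which is the same concern the commit raises about the missing warning message.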
      
      * Fix TensorRT variables in CMake
      
      1. Replace `${TENSORRT_ROOT}/include` with `${TENSORRT_INCLUDE_DIR}`
      2. Replace `${TENSORRT_ROOT}/lib` with `${TENSORRT_LIBRARY}`
      
      A manually typed path may point to the wrong TensorRT location. Use
      the paths detected by the system instead.
      
      * Fix TensorRT library path
      
      1. Add new variable - `${TENSORRT_LIBRARY_DIR}`
      2. Fix TensorRT library path
      
      inference_lib.cmake and setup.py.in need the directory that contains
      the TensorRT library, not the library file itself, so add a new
      variable for it.
      
      * Add more general search rule for TensorRT
      
      Let the system detect the architecture instead of assigning it
      manually, so replace `x86_64-linux-gnu` with
      `${CMAKE_LIBRARY_ARCHITECTURE}`.
      
      * Add more general search rule for TensorRT
      
      Remove duplicate search rules for TensorRT libraries. Use
      `${TENSORRT_LIBRARY_DIR}` to get the full path of libnvinfer.so.
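
The search rule and the file-versus-directory distinction can be modelled in a few lines of Python. This is a sketch of the idea, not a transcription of tensorrt.cmake: the function names, the candidate order, and the `exists` hook are assumptions of the example. The `library_arch` argument plays the role of `${CMAKE_LIBRARY_ARCHITECTURE}`.

```python
import os
import posixpath


def tensorrt_search_dirs(tensorrt_root, library_arch):
    """Build ordered candidate directories for headers and libraries.

    library_arch stands in for CMAKE_LIBRARY_ARCHITECTURE, for example
    "x86_64-linux-gnu"; it may be empty when the toolchain does not set it.
    """
    include_dirs, library_dirs = [], []
    if tensorrt_root:
        include_dirs.append(posixpath.join(tensorrt_root, "include"))
        library_dirs.append(posixpath.join(tensorrt_root, "lib"))
    if library_arch:
        # Multiarch system locations, as used by the official Docker image.
        include_dirs.append(posixpath.join("/usr/include", library_arch))
        library_dirs.append(posixpath.join("/usr/lib", library_arch))
    return include_dirs, library_dirs


def find_library(library_dirs, name="libnvinfer.so", exists=os.path.isfile):
    """Return (library_file, library_dir) for the first hit, or (None, None).

    The pair mirrors the two variables in the commit: the library file
    itself and the directory that contains it.
    """
    for directory in library_dirs:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate, directory
    return None, None
```

Returning both values from one lookup means the directory is always derived from the file that was actually found, so the two cannot drift apart.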
      
      test=release/1.6
  2. 05 Dec, 2019 2 commits
  3. 04 Dec, 2019 1 commit
  4. 21 Nov, 2019 1 commit
    • Cherry-pick error type support for release1.6 (#21294) · 974b8a83
      Chen Weihang committed
      * delete paddle infershape enforce macro (#20832)
      
      * Polish and arrange code in enforce.h (#20901)
      
      * Enrich the type of error and declare the error type interfaces (#21024)
      
      * Enrich the type of error and declare the error type interfaces, test=develop
      
      * adjust tests to adapt new form, test=develop
      
      * add inference deps with error_codes.pb.h, test=develop
      
      * restore stack iter start pos, test=develop
      
      * polish code based on review comments, test=develop
      
      * Add dependency for error_codes.proto (#21084)
      
      * fix activation_functions deps, test=develop, test=document_fix
      
      * add error_codes_proto deps, test=develop, test=document_fix
      
      * try delete enforce.h, test=develop, test=document_fix
      
      * change cuda enforce & add example (#21142)
      test=release/1.6
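
As a rough illustration of what typed errors add to an enforce-style check, the sketch below attaches an error category to the raised exception. Every name in it (`ErrorCode`, `EnforceError`, `invalid_argument`, `enforce_eq`) is hypothetical and chosen for this example; it does not reproduce the interfaces declared in enforce.h or error_codes.proto.

```python
from enum import Enum


class ErrorCode(Enum):
    # Hypothetical categories, for illustration only.
    INVALID_ARGUMENT = 1
    NOT_FOUND = 2
    OUT_OF_RANGE = 3


class EnforceError(Exception):
    """Exception that carries an error category alongside its message."""

    def __init__(self, code, message):
        super().__init__(f"{code.name}: {message}")
        self.code = code


def invalid_argument(fmt, *args):
    """Build a typed error from a printf-style format string."""
    return EnforceError(ErrorCode.INVALID_ARGUMENT, fmt % args)


def enforce_eq(actual, expected, error):
    """Raise the supplied typed error when the two values differ."""
    if actual != expected:
        raise error


def check_rank(rank, expected_rank):
    enforce_eq(
        rank,
        expected_rank,
        invalid_argument("rank mismatch: got %d, expected %d", rank, expected_rank),
    )
```

A caller can then branch on `err.code` instead of parsing the message text. For simplicity the sketch builds the error object even when the check passes.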
  5. 30 Oct, 2019 1 commit
  6. 21 Oct, 2019 1 commit
  7. 14 Oct, 2019 1 commit
  8. 08 Oct, 2019 1 commit
  9. 03 Oct, 2019 2 commits
  10. 27 Sep, 2019 1 commit
    • update operator compatible info, test=develop (#19978) · 01b9d079
      石晓伟 committed
      * update operator compatible info, test=develop
      
      * revert cmake/version.cmake, test=develop
      
      * add unit_tests and fix bugs, test=develop
      
      * update ../paddle/fluid/framework/framework.proto, test=develop
      
      * fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop
      
      * update paddle/fluid/framework/version_test.cc, test=develop
      
      * add comments and rename interfaces, test=develop
  11. 20 Sep, 2019 1 commit
  12. 19 Sep, 2019 1 commit
    • Add a pass to fuse fc+elementwise_add+layernorm (#19776) · 3cd985a6
      Yiqun Liu committed
      * Add fc_elementwise_layernorm_fuse pass and unittest.
      
      * Add fused_fc_elementwise_layernorm op and its GPU kernel.
      test=develop
      
      * Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
      
      * Add the setting of attrs in the definition of binary_op.
      test=develop
      
      * Add comment.
      
      * Implement the unittest.
      test=develop
      
      * Change the unittest name of layer_norm.
      test=develop
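
To make the fusion concrete, the sketch below computes fc, elementwise_add and layer_norm as three separate steps and then as one per-row pass, so the two results can be compared. It is a pure-Python model under simplifying assumptions (2-D inputs, normalization over the last axis, no activation, a fixed `eps`); the function names are made up for the example and the real op's attributes and GPU kernel are not modelled.

```python
import math


def fc(x, w, b):
    """Fully connected layer: x is [rows][in], w is [in][out], b is [out]."""
    return [
        [sum(xi * w[k][j] for k, xi in enumerate(row)) + b[j] for j in range(len(b))]
        for row in x
    ]


def elementwise_add(a, y):
    """Add two matrices of the same shape element by element."""
    return [[p + q for p, q in zip(ra, ry)] for ra, ry in zip(a, y)]


def layer_norm(a, scale, bias, eps=1e-5):
    """Normalize each row to zero mean and unit variance, then scale and shift."""
    out = []
    for row in a:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.append([(v - mean) * inv_std * s + t for v, s, t in zip(row, scale, bias)])
    return out


def fused_fc_elementwise_layernorm(x, w, b, y, scale, bias, eps=1e-5):
    """Single pass per row: matmul + bias + residual, then normalize."""
    out = []
    for row, y_row in zip(x, y):
        acc = [
            sum(xi * w[k][j] for k, xi in enumerate(row)) + b[j] + y_row[j]
            for j in range(len(b))
        ]
        mean = sum(acc) / len(acc)
        var = sum((v - mean) ** 2 for v in acc) / len(acc)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.append([(v - mean) * inv_std * s + t for v, s, t in zip(acc, scale, bias)])
    return out
```

The fused version never materializes the intermediate fc and add outputs as full matrices, which is the usual motivation for this kind of pass: fewer kernel launches and fewer trips through memory.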
  13. 17 Sep, 2019 1 commit
  14. 16 Sep, 2019 1 commit
  15. 11 Sep, 2019 2 commits
    • Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Huihuang Zheng committed
      TemporaryAllocator is a singleton used for allocating memory for cuDNN. Because it is a singleton, removing it can improve memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for a stream, so no singleton is needed.
      
      Also added data_feed_proto to operator to fix CPU compilation in CI.
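
The stream-callback idea can be shown with a toy model in Python. Nothing here is Paddle or CUDA code: `ToyStream`, `ToyPool` and `StreamScopedAllocation` are hypothetical stand-ins, and the stream is just a FIFO queue. The point of the sketch is only the ordering guarantee: a free that is enqueued on the stream runs after all work queued before it.

```python
from collections import deque


class ToyStream:
    """Stand-in for a CUDA stream: work and callbacks run in FIFO order."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, fn):
        self._queue.append(fn)

    def synchronize(self):
        # Drain everything queued so far, oldest first.
        while self._queue:
            self._queue.popleft()()


class ToyPool:
    """Tracks live allocations so the example can observe when frees happen."""

    def __init__(self):
        self.live = set()
        self._next_id = 0

    def allocate(self, size):
        self._next_id += 1
        self.live.add(self._next_id)
        return self._next_id

    def free(self, handle):
        self.live.discard(handle)


class StreamScopedAllocation:
    """Releases its memory through a stream callback instead of immediately."""

    def __init__(self, pool, stream, size):
        self._pool, self._stream = pool, stream
        self.handle = pool.allocate(size)

    def release(self):
        handle, pool = self.handle, self._pool
        # Deferred free: runs only after all work queued earlier on the stream.
        self._stream.enqueue(lambda: pool.free(handle))
```

Because each allocation is tied to a stream rather than to a process-wide object, there is no global allocator whose state has to be shared and cleaned up, which matches the motivation given in the commit message.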
    • Implement the GPU kernel of fc operator (#19687) · a65c728e
      Yiqun Liu committed
      * Refine the codes related to fc op.
      
      * Add GPU implementation for fc functor.
      
      * Apply fc_fuse_pass in GPU inference.
      test=develop
      
      * Change the cmake for fc op.
      
      * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
      
      * Add an attribute to set the activation type in fc_op.
      
      * Enhance the unittest of fc_op.
      test=develop
      
      * Move the declaration of FCOpGrad back to the header file.
      test=develop
      
      * Set default value for newly added arguments in test_fc_op.
      test=develop
  16. 10 Sep, 2019 1 commit
  17. 07 Sep, 2019 1 commit
  18. 04 Sep, 2019 3 commits
  19. 31 Aug, 2019 1 commit
    • Paddlebox Framework (#18982) · c756b5d2
      hutuxian committed
      * Support looking up embeddings from BoxPS.
      * Add a _pull_box_sparse op; for now this op is not exposed to users.
      * Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on.
      * Add 'BoxPSDataset' in python code.
      * Add a compile option WITH_BOX_PS and a macro PADDLE_WITH_BOX_PS.
      * Add UT.
      * For more details, please refer to: https://github.com/PaddlePaddle/Paddle/pull/18982
  20. 30 Aug, 2019 1 commit
  21. 20 Aug, 2019 1 commit
  22. 19 Aug, 2019 3 commits
  23. 14 Aug, 2019 2 commits
  24. 12 Aug, 2019 1 commit
  25. 01 Aug, 2019 1 commit
  26. 31 Jul, 2019 1 commit
  27. 29 Jul, 2019 1 commit
  28. 24 Jul, 2019 1 commit
  29. 23 Jul, 2019 1 commit
  30. 22 Jul, 2019 1 commit
  31. 19 Jul, 2019 2 commits