1. 06 Nov, 2019 1 commit
  2. 01 Nov, 2019 1 commit
  3. 31 Oct, 2019 1 commit
  4. 28 Oct, 2019 1 commit
  5. 25 Oct, 2019 1 commit
  6. 22 Oct, 2019 1 commit
  7. 20 Oct, 2019 1 commit
  8. 18 Oct, 2019 3 commits
  9. 17 Oct, 2019 1 commit
    • J
      [MKL-DNN] Added mkl-dnn cache clearing when creating Executor instance (#20241) · a1cd27f1
      Jacek Czaja committed
      * - Flushing mkl-dnn cache
      
      test=develop
      
      - Disabled clearing cache for LoadModel
      
      - Added clearing of mkl-dnn cache when Executor is created
      
      test=develop
      
      - Do not clear for GPU places
      
      test=develop
      
      - compilation fix
      
      test=develop
      
      * - Moved clearing of mkl-dnn cache in destructor of executor
      
      test=develop
      
      * - Compilation fix
      
      test=develop
      
      - Reverted conditional clearing of mkl-dnn cache in Executor's
        destructor
      
      test=develop
      
      - compilation fix
  10. 16 Oct, 2019 1 commit
  11. 14 Oct, 2019 1 commit
    • 6
      Dlpack support (#20039) · 12e4be03
      633WHU committed
      * support dlpack to tensor and implement python interface test=develop
      
      * add unittest for _to_dlpack and from_dlpack test=develop
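The commit above exchanges tensors through DLPack. As a rough illustration of what such an exchange carries, the sketch below mirrors the `DLTensor` descriptor from the DLPack C header with `ctypes` and shows that a consumer reading through the descriptor sees the producer's memory without a copy. The helper names `describe` and `view_as_floats` are hypothetical, the field layout follows recent DLPack headers (older headers name the device field `ctx`), and this is not Paddle's `_to_dlpack` / `from_dlpack` implementation.

```python
import ctypes

class DLDevice(ctypes.Structure):
    _fields_ = [("device_type", ctypes.c_int32), ("device_id", ctypes.c_int32)]

class DLDataType(ctypes.Structure):
    _fields_ = [("code", ctypes.c_uint8), ("bits", ctypes.c_uint8),
                ("lanes", ctypes.c_uint16)]

class DLTensor(ctypes.Structure):
    _fields_ = [
        ("data", ctypes.c_void_p),
        ("device", DLDevice),
        ("ndim", ctypes.c_int32),
        ("dtype", DLDataType),
        ("shape", ctypes.POINTER(ctypes.c_int64)),
        ("strides", ctypes.POINTER(ctypes.c_int64)),
        ("byte_offset", ctypes.c_uint64),
    ]

K_DL_CPU = 1    # kDLCPU in the DLPack header
K_DL_FLOAT = 2  # kDLFloat in the DLPack header

def describe(buffer, shape):
    """Producer side (hypothetical helper): describe an existing float32 buffer."""
    shape_arr = (ctypes.c_int64 * len(shape))(*shape)
    tensor = DLTensor(
        data=ctypes.addressof(buffer),
        device=DLDevice(K_DL_CPU, 0),
        ndim=len(shape),
        dtype=DLDataType(K_DL_FLOAT, 32, 1),
        shape=shape_arr,
        strides=None,  # NULL strides mean compact row-major layout
        byte_offset=0,
    )
    tensor._shape_keepalive = shape_arr  # keep the shape array alive
    return tensor

def view_as_floats(tensor):
    """Consumer side (hypothetical helper): view the same memory, no copy."""
    count = 1
    for i in range(tensor.ndim):
        count *= tensor.shape[i]
    address = tensor.data + tensor.byte_offset
    return (ctypes.c_float * count).from_address(address)

storage = (ctypes.c_float * 6)(0, 1, 2, 3, 4, 5)
descriptor = describe(storage, (2, 3))
view = view_as_floats(descriptor)
view[4] = 42.0  # a write through the view is visible in the original storage
```

The real protocol additionally wraps the descriptor in a managed structure with a deleter so the consumer can tell the producer when it is done with the memory; that part is omitted here.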
  12. 12 Oct, 2019 1 commit
  13. 11 Oct, 2019 1 commit
  14. 30 Sep, 2019 1 commit
  15. 28 Sep, 2019 2 commits
    • Q
      Enable users to create custom cpp op outside framework. (#19256) · 1a3eef02
      qingqing01 committed
      * Custom ops must be written according to the framework's OP spec.
      * Package fluid_framework.so and headers into whl.
      * Add paddle.sysconfig.get_include() and paddle.sysconfig.get_lib() to get include dir and lib dir.
      * Export some C-APIs to merge OpInfo between core.so and custom_op.so.
      * Add unit testing.
      * Update API.spec.
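The commit above exposes `paddle.sysconfig.get_include()` and `paddle.sysconfig.get_lib()` so that an out-of-tree op can be compiled against the packaged headers and shared library. A minimal sketch of how those two directories might feed a compiler invocation follows. The helper `build_custom_op_command`, the compiler flags, the file names, and the library name passed as `framework_lib` are all assumptions for illustration; the commit does not specify them.

```python
def build_custom_op_command(sources, output, include_dir, lib_dir, framework_lib):
    """Assemble a g++ command line for a custom-op shared library (hypothetical helper)."""
    return (
        ["g++", "-std=c++11", "-shared", "-fPIC"]
        + list(sources)
        + ["-o", output, "-I" + include_dir, "-L" + lib_dir, "-l" + framework_lib]
    )

# Placeholder paths stand in for the values that paddle.sysconfig.get_include()
# and paddle.sysconfig.get_lib() would return in a real installation.
command = build_custom_op_command(
    sources=["my_op.cc"],              # hypothetical source file
    output="my_op.so",                 # hypothetical output name
    include_dir="/path/to/include",
    lib_dir="/path/to/libs",
    framework_lib="framework_lib_name",  # placeholder, not confirmed by the commit
)
print(" ".join(command))
```

With Paddle installed, the two placeholder paths would be replaced by the return values of the `paddle.sysconfig` calls, and the resulting list could be handed to `subprocess.run`.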
    • L
      fix pool2d pool3d, support asymmetric padding and channel_last (#19739) · 24010472
      liym27 committed
      * fix pool2d pool3d:
      1. support asymmetric padding;
      2. support padding algorithm: "SAME" and "VALID";
      3. support channel_last: data_format NHWC and NDHWC;
      4. support inferring shape when input with negative dims in compile time;
      5. change doc of python API and c++;
      6. fix bug in cuda kernel when Attr(adaptive) is true.
      
      test=develop,test=document_preview
      
      * fix 'tensors' to 'Tensors'. test=develop,test=document_preview
      
      * add test for coverage of ValueError. test=develop,test=document_preview
      
      * resolve conflict in test_pool2d. test=develop
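The pooling changes listed above (asymmetric padding, the "SAME" and "VALID" padding algorithms, and channel-last layouts) can be illustrated with a small shape calculation. The sketch below uses the widely used convention in which "SAME" pads so that the output size equals ceil(input / stride) and puts any odd leftover on the trailing side. The function names are hypothetical and the code is an assumption-based illustration, not the kernel code from this commit.

```python
import math

def spatial_dims(shape, data_format):
    """Return the spatial sizes of an input shape for NCHW/NCDHW or NHWC/NDHWC."""
    if data_format in ("NHWC", "NDHWC"):
        return list(shape[1:-1])   # channels last
    return list(shape[2:])         # channels first

def pool_output_and_padding(in_size, kernel, stride, padding="VALID"):
    """Return (out_size, pad_before, pad_after) for one spatial dimension.

    padding is "VALID", "SAME", or an explicit (pad_before, pad_after) pair,
    which may be asymmetric.
    """
    if padding == "VALID":
        pad_before = pad_after = 0
    elif padding == "SAME":
        target = math.ceil(in_size / stride)
        total = max((target - 1) * stride + kernel - in_size, 0)
        pad_before = total // 2
        pad_after = total - pad_before   # the extra element goes at the end
    else:
        pad_before, pad_after = padding
    out_size = (in_size + pad_before + pad_after - kernel) // stride + 1
    return out_size, pad_before, pad_after

# An NHWC input of 8x7 pixels, pooled with a 3x3 window and stride 2.
height, width = spatial_dims((1, 8, 7, 16), "NHWC")
same_h = pool_output_and_padding(height, kernel=3, stride=2, padding="SAME")
same_w = pool_output_and_padding(width, kernel=3, stride=2, padding="SAME")
valid_w = pool_output_and_padding(width, kernel=3, stride=2, padding="VALID")
```

Note that "SAME" on the height dimension produces an asymmetric pair (0 before, 1 after), which is why supporting the padding algorithm and supporting asymmetric padding go together.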
  16. 27 Sep, 2019 1 commit
    • C
      Paddle error message stack shaping and optimization (#19895) · b9163350
      Chen Weihang committed
      * shape and optimize paddle error message stack, test=develop
      
      * limit exception type & add unittest, test=develop
      
      * fix multi-platform problem, test=develop
      
      * fix related unittest failures, test=develop
      
      * add doc & fix unittest errors, test=develop
      
      * fix function name error, test=develop
      
      * update tensor test exception msg compare, test=develop
      
      * remove unittest on win32, the dir format is different, test=develop
      
      * remove useless package, test=develop
      
      * add paddle enforce handler unittest, test=develop
      
      * add exception checkout, test=develop
      
      * fix coverage failed, test=develop
      
      * fix op registry test failed, test=develop
      
      * refactor whole pr, test=develop
      
      * remove test in CMakelist, test=develop
      
      * fix coverage, test=develop
  17. 26 Sep, 2019 1 commit
  18. 24 Sep, 2019 2 commits
  19. 23 Sep, 2019 1 commit
  20. 22 Sep, 2019 1 commit
  21. 20 Sep, 2019 2 commits
    • Z
      remove enforce.h file written, test=develop (#19897) · b25d1e75
      Zeng Jinle committed
    • J
      [MKL-DNN] LRN refactoring (#19798) · 619c797a
      Jacek Czaja committed
      - LRN mkl-dnn kernel refactor
      
      test=develop
      
      - compilation fix
      
      - Another compilation fix
      
      - Compilation fix
      
      - another compilation fix
      
      - compilation fix
      
      - Crash fix
      
      - optional LRN mkldnn workspace
      
      - Added mid allocation
      
      - Workaround for tests
      
      - Removed gradient from is_test ut
      
      - Removed mid for inference
      
      - Reverted LRN mid removal for is_test
      
      - PADDLE_ENFORCE adjusted
      
      - Rebase to templatization commit
      
      - Compilation fix
      
      - compilation fix
      
      test=develop
      
      - lint
      
      test=develop
      
      - Fix to crash
      
      - Rebase to recent codebase
      
      - lint
      
      - lint
      
      - compilation fix
  22. 19 Sep, 2019 2 commits
  23. 18 Sep, 2019 2 commits
  24. 17 Sep, 2019 1 commit
  25. 16 Sep, 2019 1 commit
  26. 14 Sep, 2019 2 commits
  27. 12 Sep, 2019 1 commit
  28. 11 Sep, 2019 1 commit
    • H
      Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989) · 12542320
      Huihuang Zheng committed
      TemporaryAllocator is a singleton used for allocating memory for cuDNN. Because it is a singleton, removing it improves memory performance.
      
      We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for a stream, so no singleton is needed.
      
      Also added data_feed_proto to operator to fix CI for CPU compilation.
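The idea described in the commit above, releasing temporary memory through a stream callback rather than through a global singleton, can be modeled without a GPU. In the sketch below, `FakeStream` and `StreamScopedAllocator` are hypothetical stand-ins: the stream runs queued work in order, and a release is queued behind any work that may still use the buffer. This is a conceptual model under those assumptions, not the CUDADeviceContextAllocator code.

```python
from collections import deque

class FakeStream:
    """Stand-in for a CUDA stream: runs queued work in FIFO order."""
    def __init__(self):
        self._queue = deque()

    def enqueue(self, fn):
        self._queue.append(fn)

    def synchronize(self):
        while self._queue:
            self._queue.popleft()()

class StreamScopedAllocator:
    """Per-stream allocator whose frees are deferred to a stream callback."""
    def __init__(self, stream):
        self.stream = stream
        self.live = set()
        self._next_handle = 0

    def allocate(self):
        handle = self._next_handle
        self._next_handle += 1
        self.live.add(handle)
        return handle

    def release(self, handle):
        # Not freed immediately: the free runs only after earlier stream work.
        self.stream.enqueue(lambda: self.live.discard(handle))

stream = FakeStream()
allocator = StreamScopedAllocator(stream)
buf = allocator.allocate()

seen_live = []
# A "kernel" that uses the buffer; it records whether the buffer was still valid.
stream.enqueue(lambda: seen_live.append(buf in allocator.live))
allocator.release(buf)

still_live_before_sync = buf in allocator.live
stream.synchronize()
freed_after_sync = buf not in allocator.live
```

Because the allocator belongs to one stream and ordering is provided by the stream itself, no process-wide object has to track temporary buffers.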
  29. 10 Sep, 2019 2 commits
  30. 09 Sep, 2019 1 commit
  31. 05 Sep, 2019 1 commit
    • Y
      Integrate NVRTC to support compiling CUDA kernel at runtime (#19422) · 42b5bec6
      Yiqun Liu committed
      * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
      test=develop
      
      * Call CUDA driver api to launch the kernel compiled by nvrtc.
      test=develop
      
      * Disable for mac and windows.
      test=develop
      
      * Refine the codes to support manually specified num_threads and workload_per_thread.
      test=develop
      
      * Refine the CUDA kernel to support large dims.
      test=develop
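The commit above mentions manually specified `num_threads` and `workload_per_thread` for the runtime-compiled kernel. The commit does not say how the launch dimensions are derived from them; one plausible reading, sketched below as an assumption, is that each block covers `num_threads * workload_per_thread` elements and the number of blocks is the ceiling of the element count over that figure. The function name `launch_config` and the default values are hypothetical.

```python
def launch_config(num_elements, num_threads=256, workload_per_thread=1):
    """Return (num_blocks, num_threads) covering num_elements (hypothetical helper)."""
    if num_elements <= 0 or num_threads <= 0 or workload_per_thread <= 0:
        raise ValueError("all arguments must be positive")
    elements_per_block = num_threads * workload_per_thread
    # Ceiling division so a partially filled last block is still launched.
    num_blocks = (num_elements + elements_per_block - 1) // elements_per_block
    return num_blocks, num_threads

small = launch_config(1000)
large = launch_config(1 << 20, num_threads=512, workload_per_thread=4)
```

Under this reading, raising `workload_per_thread` keeps the grid small for large inputs, at the cost of each thread looping over several elements inside the kernel.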