1. 29 Jul, 2020 1 commit
  2. 13 Jul, 2020 1 commit
    • [Dy2stat] Fix Memory Optimization in run_program_op and Add SimNet as Unit Test (#25383) · f9ac5fb9
      Committed by Huihuang Zheng
      Add Similarity Net as unit test. During the unit test, we found three problems:
      
      1. run_program_op has a memory optimization error when a dy2stat net is run multiple times.
      2. The support for SelectedRows can cause problems in dy2stat.
      3. The `return` grammar has problems.
      
      This PR fixes problem 1 but only modifies code around problems 2 and 3, to keep the PR small. I will fix those two problems in the next PR(s).
  3. 21 Jun, 2020 1 commit
  4. 04 Jun, 2020 1 commit
  5. 09 May, 2020 1 commit
  6. 08 May, 2020 1 commit
    • Add Assert Op (#24280) · 8a1a2af8
      Committed by Huihuang Zheng
      1. To make ProgramTranslator support the `assert` grammar, this PR adds an `assert` Python API and C++ code.
      
      2. Fix a bug: graph_pattern_detector.h includes <gtest/gtest_prod.h> but did not declare the dependency in CMakeLists, which can cause failures when building a single target.
      
      3. Refactor `Formatter` in print_op to make it reusable, and reuse it for printing in the assert op.
  7. 28 Apr, 2020 1 commit
  8. 27 Apr, 2020 1 commit
  9. 09 Apr, 2020 1 commit
    • Remove: NGraph engine from PDPD repository (#23545) · 3baaee9a
      Committed by mozga-intel
      * Remove the NGraph engine from PDPD repository
      1. Each operator was removed from the operator's directory
      2. Each test was removed from the unittest directory
      3. The parallel executor support was removed from PDPD
      4. The CMake file was removed from PDPD
      5. The NG flags were removed from the repository
      test=develop
      
      * Remove ngraph from:
      1. CMake file
      2. Python file
      test=develop
  10. 05 Apr, 2020 1 commit
  11. 02 Apr, 2020 1 commit
  12. 30 Mar, 2020 1 commit
  13. 26 Mar, 2020 1 commit
    • [Paddle-TRT]: Ernie Dynamic shape support. (#23138) · 430b0099
      Committed by Zhaolong Xing
      * add dynamic plugin support.
      test=develop
      
      * change emb eltwise layernorm to math function
      test=develop
      
      * add emb eltwise layernorm
      test=develop
      
      * can run dynamic shape ernie
      test=develop
      
      * fix ci
      test=develop
      
      * add ut for trt ernie dynamic
      
      test=develop
      
      * refine dynamic shape c++ interface.
      test=develop
      
      * fix comments
      test=develop
      
      * fix comments
      test=develop
  14. 05 Feb, 2020 1 commit
  15. 04 Feb, 2020 1 commit
  16. 09 Jan, 2020 1 commit
  17. 28 Nov, 2019 1 commit
  18. 07 Nov, 2019 1 commit
  19. 05 Nov, 2019 1 commit
    • Support NoNeedBufferVarsInference in dygraph backward (#20868) · 878a40f5
      Committed by Zeng Jinle
      * support no need buffer vars in dygraph, test=develop
      
      * fix inference compilation error, test=develop
      
      * update no_need_buffer_vars_inference, test=develop
      
      * add unittests for no_need_buffer_vars_context, test=develop
      
      * refine no_need_buffer_vars by return ref, test=develop
      
      * polish some codes, test=develop
  20. 30 Oct, 2019 1 commit
  21. 28 Oct, 2019 1 commit
  22. 24 Oct, 2019 1 commit
  23. 18 Oct, 2019 1 commit
  24. 02 Oct, 2019 1 commit
  25. 30 Sep, 2019 1 commit
    • fix compile paddle with anakin bug · 276b5e34
      Committed by Wilber
      * fix compile with anakin bug
      
      * remove useless deps test=develop
      
      - Fixed the bugs encountered when building together with anakin.
      - test_anakin_activate failed to compile
      - test_anakin_engine failed to compile
  26. 17 Sep, 2019 1 commit
  27. 11 Sep, 2019 2 commits
    • Make leaky relu inplacable (#19676) · 0daa5c97
      Committed by Zeng Jinle
      * make leaky relu inplacable, test=develop
      
      * force add unittests to pass coverage, test=develop
    • Implement the GPU kernel of fc operator (#19687) · a65c728e
      Committed by Yiqun Liu
      * Refine the codes related to fc op.
      
      * Add GPU implementation for fc functor.
      
      * Apply fc_fuse_pass in GPU inference.
      test=develop
      
      * Change the cmake for fc op.
      
      * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
      
      * Add an attribute to set the activation type in fc_op.
      
      * Enhance the unittest of fc_op.
      test=develop
      
      * Remove the declaration of FCOpGrad back to the header file.
      test=develop
      
      * Set default value for newly added arguments in test_fc_op.
      test=develop
  28. 08 Sep, 2019 1 commit
  29. 19 Aug, 2019 1 commit
    • Add match_matrix_tensor op (#18525) · 78a3d837
      Committed by Aurelius84
      * add match_matrix_tensor op test=develop
      
      * fix ignore unittest if with_mkl=off test=develop
      
      * clean code and rm is_test param test=develop
      
      * modify API.spec test=develop
      
      * rm useless code in search_compute.h test=develop
      
      * modify api.spec test=develop
      
      * modify default_grad.spec test=develop
      
      * Add API test code test=develop
      
      * clean code in search_compute.h
      
      * modify PADDLE_ENFORCE and clean search_compute.h test=develop
      
      * fix code style test=develop
  30. 06 Aug, 2019 1 commit
    • Add var_conv_2d op (#18518) · e681d655
      Committed by Kevin
      * fix overflow by int32 mul test=develop
      
      * fix reference nullptr
      
      * fix codestyle test=develop
      
      * modify to point in ContextProjectFunctor test=develop
      
      * modify to point in ContextProjectFunctor test=develop
      
      * modify . to -> test=develop
      
      * add var_conv_2d op test=develop
      
      * edit api.spec test=develop
      
      * ignore unittest if with_mkl=off test=develop
      
      * fix python3 division test=develop
      
      * fix ignore unittest bug test=develop
      
      * remove useless code test=develop
      
      * modify api.spec test=develop
      
      * modify default_grad.spec test=develop
  31. 23 Jul, 2019 1 commit
  32. 27 Jun, 2019 1 commit
    • supports collective communicated training (#18175) · b7128bac
      Committed by HaoRen
      * fix prepare context redundant code problem, optimize executor by caching create_variables
      test=develop
      
      * supports collective training in executor
      
      * make fetch_list runnable with variables, add more unittests for use_program_cache
      test=develop
      
      * fix comment
      test=develop
      
      * use unique name for nccl_id
      
      * supports output to stream in program_to_code
      
      * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
      
      * set op role in collective training
      
      * add collective op role
      
      * remove orig file
      
      * add build optimizer by strategy
      
      * add collective strategy
      
      * refine collective strategy
      
      * add multi-process role maker
      
      * refine strategy building factory so that we can easily plug in more strategies
      
      * scale loss grad in collective sgd transpiler
      
      * add support for distributed fc
      
      * code format
      
      * revert some features for dist fc
      
      * add support for distributed fc training
      
      * fix prepare context redundant code problem, optimize executor by caching create_variables
      test=develop
      
      * supports collective training in executor
      
      * make fetch_list runnable with variables, add more unittests for use_program_cache
      test=develop
      
      * use unique name for nccl_id
      
      * supports output to stream in program_to_code
      
      * insert sync_comm_stream before regularization; add skip_op_callstack capability in program_to_code
      
      * set op role in collective training
      
      * add collective op role
      
      * fix comment
      test=develop
      
      * remove orig file
      
      * add build optimizer by strategy
      
      * add collective strategy
      
      * refine collective strategy
      
      * add multi-process role maker
      
      * refine strategy building factory so that we can easily plug in more strategies
      
      * scale loss grad in collective sgd transpiler
      
      * add support for distributed fc
      
      * code format
      
      * revert some features for dist fc
      
      * add support for distributed fc training
      
      * test=develop
      add collective op unittest standard
      
      * test=develop
      remove the test_collective directory
      
      * test=develop
      remove the test_collective directory
      
      * remove slicegather test
      
      * code format for reducescatter
      
      * update attr of shard_index_op
      
      * Modify macro nccl_helper
      
      * remove test without distribute
      
      * macro collective_helper
      
      * macro update
      
      * test=develop
      update support python3.5
      
      * test=develop change gpu memory use to 0.1 when test
      
      * test=develop
      update ut equal func
      
      * test=develop
      set flags to 1.5
      
      * test=develop fix pickle dump on py35
      
      * test=develop
      fix divide in slice and add sync_comm_stream
      update atol and rtol to 1e-05
      rm shard_index op and test
      modify read input from file to read from memory
      remove origin_program in framework and add i/o in c_sync_calc_stream
      
      * test=develop update unittest sync operator I/O
  33. 11 Jun, 2019 1 commit
    • Update the Anakin interfaces for content-dnn and MLU (#17890) · bce259e5
      Committed by 石晓伟
      * update anakin-engine interfaces for content-dnn
      
      test=develop
      
      * support only-gpu mode of Anakin
      
      modify eltwise parse
      
      test=develop
      
      * modification for thread-safe
      
      test=develop
      
      * Integrated template instance
      
      test=develop
      
      * increase template parameters
      
      test=develop
      
      * support MLU predictor
      
      test=develop
      
      * update anakin cmake files
      
      test=develop
      
      * update TargetWrapper::set_device
      
      * update the initialization of anakin subgraph
      
      test=develop
      
      * use the default constructor of base class
      
      test=develop
  34. 30 May, 2019 1 commit
  35. 17 May, 2019 1 commit
  36. 18 Apr, 2019 1 commit
  37. 28 Mar, 2019 1 commit
  38. 22 Mar, 2019 1 commit
  39. 20 Mar, 2019 1 commit