1. 16 Apr, 2022 1 commit
  2. 15 Apr, 2022 18 commits
    • Z
      solve brpc compile in arm-ubantu18 (#41649) · 56dafc4f
      ziyoujiyi committed
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * arm_brpc compile
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * only output is ok
      
      * base is ok
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * add switch server bin
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * adapt brpc ssl
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
    • S
      gpu_graph engine optimization+ (#41455) · ce72690c
      seemingwang committed
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
    • R
      Moe ref (#41836) · c37af19c
      Roc committed
      * moe ref
      
      * ref commit; test=document_fix
      
      * update; test=document_fix
      
      * update test=document_fix
    • H
      e25b75b6
    • C
      [Phi]Reduce kernels into multiply files (#41747) · 1927aff9
      chentianyu03 committed
      * split reduce_kernel
      
      * rm reduce_kernel in cmake
      
      * split reduce_grad kernels
      
      * fix cmake build error
      
      * format code
      
      * fix standalone_executor_test error
    • Z
      [DoubleGrad] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode (#41730) · 27f28e82
      Zhanlue Yang committed
      * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad
      
      * Fixed elementwise issue
      
      * Addressed CI failures
      
      * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode
      
      * [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode
      
      * Enabled more test cases
      
      * [DoubleGrad] Enabled test_imperative_star_gan_with_gradient_penalty.py under eager mode
      
      * Adjusted test_imperative_star_gan_with_gradient_penalty.py
    • H
      [Dygraph] Refactor Model Parallel in eager mode (#41761) · e6fb6599
      Haohongxiang committed
      * refactor mp in eager mode
      
      * update
      
      * update
      
      * add uts
    • T
      ff818c77
    • L
      update (#41762) · 482e5b6c
      lilong12 committed
    • D
      【GPUPS】add afsclient and gpupsutil (#41324) · 30a1213b
      danleifeng committed
      * add gpupsutil and afsclient; test=develop
    • F
      [MLU] add mlu softmax kernel (#41816) · 2d6b71a2
      fwenguang committed
    • J
      Add eager string tensor (#41039) · a22b68b8
      Jack Zhou committed
      * Add core.eager.StringTensor __init__ which pyarray args can be passed
      
      * Add the numpy method of core.eager.StringTensor
      
      * revert tensor.to_string modification
      
      * Add ToPyObject for core.eager.StringTensor
      
      * Add debug string for core.eager.StringTensor
      
      * Remove place args of core.eager.StringTensor temporarily
      
      * Fix check string_tensor error
      
      * remove dtype of core.eager.StringTensor
      
      * add core.eager.StringTensor unittest
      
      * remove pstring from VarDesc
      
      * Add InitStringTensorWithStringTensor
      
      * Remove to_string modification
      
      * Remove zero_copy arg from StringTensor creator
    • Z
      [XPUPS]fix hashtable_kernel.kps (#41790) · ef6ff4ef
      zmxdream committed
      * refactor heter comm kernel
      
      * update. test=develop
      
      * update calc_shard_offset. test=develop
      
      * update xpu kernel. test=develop
      
      * update args of calc_shard_offset
      
      * update. test=develop
      
      * remove customGradMerger
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update optimizer kernel
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * add optimizer kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix kunlun not support size_t. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update hashtable. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * update. test=develop
      
      * update. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * template init. test=develop
      
      * hashtable template init. test=develop
      
      * fix. test=develop
      
      * fix. test=devlop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix hashtable_kernel. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      
      * fix. test=develop
      Co-authored-by: WorgenZhang <frank08081993@gmail.com>
    • A
      [IPU] add mixed-precission support for ipu (#41733) · d7224482
      Allen Guo committed
      * add mixed-precission support for ipu
      
      * restore cast_model_to_fp16 api
      
      * update UTs
    • P
      support no_need_buffer in eager_fluid state (#41720) · 840d2eb6
      pangyoki committed
      * support no_need_buffer in eager_fluid state
      
      * change no_need_buffer info from fwd_info to bwd_info
      
      * fix CI failure, gru_unit does not use no_need_buffer
      
      * fix conflict between no_need_buffer and dispensable
      
      * use tensor.define in dispensable
      
      * solve conflict
      
      * solve conflict
    • L
      Change cuDNN Conv kernel for auto tune feature (#41313) · 35acfeda
      limingshu committed
      * change cudnn helper for auto-tune
      
      * Add FLAGS_use_autotune to set the global status of autotune and change the order of choosing algorithm.
      
      * Fix the bug in calculating and printing current step cache hit rate.
      
      * Improve the autotune cache and fix unittest.
      
      * Change the key from AlgorithmType to int64_t.
      
      * Fix unittest for cpu-only env.
      
      * change ChooseAlgoByWorkspace for heuristic mode
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
    • F
      [MLU] add mlu activation kernels (#41751) · 10114859
      fwenguang committed
    • F
      [MLU] add mlu new profiler (#41138) · fc208b7e
      fwenguang committed
      * [MLU] add mlu new profiler
      
      * fix format
  3. 14 Apr, 2022 16 commits
  4. 13 Apr, 2022 5 commits
    • L
      Lml/add prim ops (#41201) · 97dec7ca
      levi131 committed
      * native commit for triple grad of sigmoid
      
      * Updated unittests files
      
      * init functional jacobian api
      
      * Updated trible_test func
      
      * Updated gradient_checker & test_script
      
      * finish test with dtype float32
      
      * add float64 test case
      
      * polish code
      
      * use atol=1e-5 with dtype float64
      
      * fix for ci
      
      * set timeout for test_jacobian
      
      * fix dygraph grad to support high differential
      
      * polish API docstring
      
      * Updated gradient checker and some related files
      
      * fix double grad strip error for high differential
      
      * fix double grad strip error for high differential
      
      * Add Sigmoid triple grad tests
      
      * fix dygraph double grad dtype error when calling for high differential scenario
      
      * Updated triple grad tests func
      
      * Use np.random to initialize ddx
      
      * Updated triple_grad_check func
      
      * add todo for gradient checker and refine some comments
      
      * remove additional code
      
      * add test for warning in backward.py
      
      * format python code
      
      * support multi input in triple gradient checker
      
      * Add matmul triple grad kernel
      
      * Updated comments of TODO
      
      * Supported some special tests
      
      * Change code-format to follow CI std
      
      * Updated gradient_checker.py
      
      * Fix conflicts
      
      * Removed unnecessary printing log
      
      * Change code style to follow CI std
      
      * merge upstream
      
      * add_p
      
      * rm useless files
      
      * add sub_p mul_p div_p
      
      * add sqrt_p and tanh_p
      
      * add reshape_p
      
      * add broadcast_p
      
      * add broadcast_p fill_constant_p matmul_p reduce_p reshape_p transpose_p
      
      * add split_p and concat_p
      
      * add gather_p and scatter_add_p
      
      * add slice_select_p and slice_assign_p
      
      * add multi input check for add_p, sub_p, mul_p, div_p
      
      * update concat_p
      
      * refine gather_p and scatter_add_p
      
      * refine slice_assign_p and slice_select_p
      
      * add 9 test for prim ops
      
      * add more test and fix some bug
      
      * add more test
      
      * register proto
      
      * add shape valid check for broadcast_p op, and add keepdim attr into reduce_p op proto
      
      * support multi input and multi output for split_p and concat_p
      
      * fix slice bug for slice_select_p and slice_assign_p
      
      * dtype for axis attr should be long int
      
      * update dtype for axis attr int64_t
      
      * update for iscan CI
      
      * add more shape and dtype check
      
      * change IndexTensor into int32 dtype
    • W
      the one ps proto (#41659) · b12af9e1
      wangguanqun committed
      * the one ps proto
      
      * the one ps proto
      
      * fix
      
      * fix
      
      * fix
      
      * fix windows ci
      
      * fix windows ci
      
      * add dependency
      
      * add dependency
    • Z
      [XPUPS]add support for kunlun2 (#40985) · c9c03e7b
      zmxdream committed
      
      [XPUPS]add support for kunlun2
      Co-authored-by: WorgenZhang <frank08081993@gmail.com>
    • C
      fix new dygraph record event (#41715) · ca4aea2c
      chenjian committed
      * fix new dygraph record event
      
      * refine name
      
      * fix
      
      * fix
      
      * fix according to review
    • L
      1e56ca8a