  1. 21 Oct, 2021 3 commits
  2. 20 Oct, 2021 10 commits
    • fix bugs of ClipGradByGlobalNorm in HybridParallel (#36555) · 6a3941e3
      Haohongxiang committed
      * fix bugs of ClipGradByGlobalNorm
      
      * add unittests
      
      * add unittests
    • Fix global gather and global scatter operators (#36517) · 17b4dd70
      李季 committed
      * fix global gather and global scatter operators
    • [NPU] Add kldiv_loss_op for npu (#36494) · 6a572a19
      ronnywang committed
    • Add FasterTokenizer Operator (#34491) · 3f2d6a3f
      Steffy-zxf committed
      Add tokenizer-related functionality for Transformer models so that preprocessing is consistent between training and prediction.
      
      * support a text string as an input Tensor
      * support the "VOCAB" unordered_map<wstring, int> as an input Tensor to look up tokens
      * The tokenizer is the one used for BERT. It applies end-to-end tokenization from text string to wordpieces.
      * It first applies basic tokenization, followed by wordpiece tokenization.
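
The two-stage scheme described in this commit message (basic tokenization, then wordpiece tokenization against a vocabulary) can be sketched in plain Python. This is an illustrative sketch only, not the FasterTokenizer operator itself: the function names, the `TOY_VOCAB` table, and the simplified basic tokenizer (lowercasing plus whitespace and punctuation splitting) are assumptions made for the example.

```python
# Hypothetical toy vocabulary (assumption): maps tokens to integer ids.
TOY_VOCAB = {"[UNK]": 0, "un": 1, "##aff": 2, "##able": 3, "hello": 4, "!": 5}


def basic_tokenize(text):
    """Simplified basic tokenization: lowercase, split on whitespace,
    and emit every non-alphanumeric character as its own token."""
    tokens = []
    for word in text.lower().split():
        current = ""
        for ch in word:
            if ch.isalnum():
                current += ch
            else:
                if current:
                    tokens.append(current)
                    current = ""
                tokens.append(ch)
        if current:
            tokens.append(current)
    return tokens


def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first wordpiece split of a single word.
    Pieces after the first carry the '##' continuation prefix."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No prefix of the remainder is in the vocabulary.
            return [unk_token]
        pieces.append(piece)
        start = end
    return pieces


def tokenize(text, vocab):
    """Text string -> list of vocabulary ids (basic, then wordpiece)."""
    ids = []
    for word in basic_tokenize(text):
        for piece in wordpiece_tokenize(word, vocab):
            ids.append(vocab[piece])
    return ids
```

With the toy vocabulary above, `tokenize("Hello unaffable!", TOY_VOCAB)` returns `[4, 1, 2, 3, 5]`, and a word with no matching pieces, such as `"xyz"`, maps to the `[UNK]` id.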
    • fix pow2 decay (#36559) · 605e7f08
      Zeng Jinle committed
    • add unittest (#36371) · 7325c9fb
      Wilber committed
    • update for trt convert ut. (#36549) · 06bd348d
      Wilber committed
    • [FIX] Extend time for test_activation_nn_grad to avoid its timeout issue (#36527) · c285c719
      Jiabin Yang committed
      * native commit for triple grad of sigmoid
      
      * Updated unittests files
      
      * init functional jacobian api
      
      * Updated trible_test func
      
      * Updated gradient_checker & test_script
      
      * finish test with dtype float32
      
      * add float64 test case
      
      * polish code
      
      * use atol=1e-5 with dtype float64
      
      * fix for ci
      
      * set timeout for test_jacobian
      
      * fix dygraph grad to support high differential
      
      * polish API docstring
      
      * Updated gradient checker and some related files
      
      * fix double grad strip error for high differential
      
      * fix double grad strip error for high differential
      
      * Add Sigmoid triple grad tests
      
      * fix dygraph double grad dtype error when calling for high differential scenario
      
      * Updated triple grad tests func
      
      * Use np.random to initialize ddx
      
      * Updated triple_grad_check func
      
      * add todo for gradient checker and refine some comments
      
      * remove additional code
      
      * add test for warning in backward.py
      
      * add tanh triple grad
      
      * format python code
      
      * refine code
      
      * make test_activation_nn_grad test time to 150s
      Co-authored-by: veyron95 <veyron_wu@163.com>
      Co-authored-by: levi131 <limaolin01@baidu.com>
    • [Auto Parallel] Generalization for Partition and Completion (#35735) · 797bd40d
      JZ-LIANG committed
      * default dist op
      
      * add dist_attr for dist op
      
      * add unitest
      
      * update inputname
      
      * update function name
      
      * add unitest
      
      * update CMakeLists.txt for CI
      
      * fix dis_matmul
      
      * fix compile error
      
      * update matmul to matmul_v2
      
      * unify api
      
      * unify api
      
      * todo
      
      * update distop forward func
      
      * update distop forward func
      
      * auto parallel backward
      
      * update dist op
      
      * autoparallel backward
      
      * add backward for embedding
      
      * temp1
      
      * temp2
      
      * temp3
      
      * temp4
      
      * backward done1
      
      * backward done2
      
      * backward done3
      
      * dist embedding remove mp mode
      
      * dist matmul remove mp mode
      
      * update dist embedding
      
      * dist op init1
      
      * dist op init 2
      
      * update unitest
      
      * context remove parallel mode
      
      * partitioner remove parallel mode
      
      * update unitest
      
      * a more general method to support varying mesh in pipeline parallel
      
      * support varying mesh in pipeline parallel
      
      * embedding support varying mesh in pipeline parallel
      
      * matmul support varying mesh in pipeline parallel
      
      * default dist op support varying mesh in pipeline parallel
      
      * dist attribute for startup program
      
      * default dist op support varying mesh in pipeline parallel 2
      
      * partitioner support varying mesh in pipeline parallel
      
      * revise logic for auto completion
      
      * revise framework.py
      
      * revise reshard unitest
      
      * revise unitest for parallelize
      
      * chmod
      
      * fixed bug for dist embedding name mapping
      Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>
    • remove no_value using var.name (#36513) · fe01ba6a
      0x45f committed
      * remove no_value using var.name
      
      * fix unit test for CI
      
      * fix unit test
      
      * add test case
      
      * fix test case
      
      * add more test case
  3. 19 Oct, 2021 13 commits
  4. 18 Oct, 2021 10 commits
    • [HybridParallel]Support fp16 in dygraph hybrid parallel (#36420) · 10f0a0f6
      Haohongxiang committed
      * [HybridParallel]Support fp16 in dygraph hybrid parallel
      
      * update
      
      * update
      
      * update for recompute
      
      * add unittest of pp+fp16
      
      * add unittest of recompute+fp16
      
      * update
      
      * modify ut
    • Added softplus FP32 FWD OneDNN kernel (#36382) · bdac9ff6
      jakpiase committed
      * added softplus
      
      * refactored softplus op
      
      * deleted unnecessary file
      
      * added missing file
      
      * added formatting
      
      * disabled tests if GPU is used
      
      * added reviewer suggestion
      
      * unified softplus kernel
    • Lml/vhp (#36146) · 4c0ad772
      levi131 committed
      * init functional jacobian api
      
      * finish test with dtype float32
      
      * add float64 test case
      
      * polish code
      
      * use atol=1e-5 with dtype float64
      
      * fix for ci
      
      * set timeout for test_jacobian
      
      * init hessian API
      
      * save status
      
      * polish API docstring
      
      * modify docstring
      
      * add utils.py
      
      * save status
      
      * fix dygraph double grad dtype error when calling for high differential scenario
      
      * reinvoke ci
      
      * test_hessian.py is ok
      
      * polish hessian API
      
      * init vhp
      
      * Revert "init vhp"
      
      This reverts commit cbd4d3b66abe82b0ac10721b9eddeb7d82e0a1c8.
      
      * init vhp
      
      * finish vhp API logically
      
      * add test for partial_engine.cc
      
      * modify numerical_delta with dtype float32
      
      * merge fix for dtype float64
      
      * spell fix
      
      * save status
      
      * polish code
      
      * rm _stop_gradient_pre_process
      
      * save status
      
      * add example for vhp interface
      
      * add _compute_numerical_vjp and _compute_numerical_vhp
      
      * test is ok
      
      * vhp is ok
      
      * add testVHPFloat64
      
      * modify for comments
      
      * modify format
      
      * modify format
      
      * save status
      
      * test_vhp is ok
      
      * finish code polish
      
      * small modify for v is None
      Co-authored-by: JiabinYang <360788950@qq.com>
    • [NPU] fix dtype for arg_max, test=develop (#36457) · 8757fc5b
      Qi Li committed
    • Add operators for async read & async write (#36333) · 3845afff
      Siming Dai committed
      * fix async_read bug
      
      * change index place to cpu
      
      * add tensor size judge
      
      * add async_read & async_write test
      
      * fix bug in async_write
      
      * fix mac py3 ci
      
      * fix bug for cpu version paddle
      
      * fix windows ci bug
      
      * change input argument error type
      
      * change const_cast to mutable_data
      
      * add async_write out-of-bound check and improve error hint
      
      * fix a small bug for dst_tensor
      
      * add docs and refine codes
      
      * refine docs
      
      * notest,test=windows_ci
      
      * fix windows ci
      
      * fix require
      
      * fix code-block
      
      * add core.is_compiled_with_cuda()
    • quant support matmul_v2 (#36469) · 051544b6
      ceci3 committed
      * quant support matmul_v2
      
      * fix format
    • [XPU AMP] 1. xpu support gradient acc 2. xpu support create tensor in dygraph... · d19a9b39
      taixiurong committed
      [XPU AMP] 1. xpu support gradient acc 2. xpu support create tensor in dygraph 3. xpu support update weight params in amp (#36439)
    • [autograd.functional] Fix a bug on handling v=None in vjp and jvp (#36445) · 79dbbcce
      Tongxin Bai committed
      * autograd.functional passed pylint checker.
      
      * autograd.functional: fix import errors.
      
      * autograd.functional: fixed unit tests.
      
      * autograd.functional minor format change
      
      * [autograd.functional] Fixed vjp and jvp's v=None bug.
    • modify ut of cond (#36475) · e496d1e9
      Haohongxiang committed
  5. 17 Oct, 2021 1 commit
  6. 16 Oct, 2021 1 commit
  7. 15 Oct, 2021 2 commits