1. 30 Nov, 2021 2 commits
    • Y
      [Auto Parallel] Do the physical mapping between the process graph and the cluster graph (#37094) · b0dff05d
      Yulong Ao committed
      * [Auto Parallel]  Add the unified cluster representation
      
      * [Auto Parallel] Add the graph class for physical mapping
      
      * [Auto Parallel] Add the simple physical mapper
      
      * Set the timeout of the mapper
      
      * Merge the upstream develop unittests cmake files
      
      * Fix a bug of the process group
      
      * Remove mapper unittest from platforms that are not GPU
      
      * Move the instantiation of process group after resharding
      
      * Add the local id for devices
      
      * Update the rank mapping format
      
      * Add some comments
      
      * Remove the related files about mapping
      
      * Update the unittest for auto mapping
      
      * Remove unused rank_mapping unittest
      
      * Improve the unittest coverage
      
      * Improve the unittest coverage
      b0dff05d
    • X
      Fix test calc gradient (#37672) · a0631364
      xiongkun committed
      * add scope_guard
      
      * 1. fix control flow cases 2. fix calc_gradient
      a0631364
  2. 29 Nov, 2021 6 commits
    • T
      add expand_v2/expand_as_v2 for kunlun (#37592) · dae4e7f2
      TTerror committed
      * add expand_v2/expand_as_v2 for kunlun
      
      * update expand_as_v2
      
      * update expand_as_v2
      
      * support float16/bool
      
      * update xpu.cmake
      dae4e7f2
    • Z
      [AMP] For `amp.decorate()` optimizers set to None is ok (#37541) · 2bb3f0b5
      zhangbo9674 committed
      * amp.decorate optimizers set to None is ok
      
      * refine unittest
      
      * add unittest and refine example code
      
      * refine unittest
      2bb3f0b5
    • Y
    • W
      [New features] Support batch_jacobian and batch_hessian (#37547) · 4d24d352
      Weilong Wu committed
      * native commit for triple grad of sigmoid
      
      * Updated unittests files
      
      * init functional jacobian api
      
      * Updated trible_test func
      
      * Updated gradient_checker & test_script
      
      * finish test with dtype float32
      
      * add float64 test case
      
      * polish code
      
      * use atol=1e-5 with dtype float64
      
      * fix for ci
      
      * set timeout for test_jacobian
      
      * fix dygraph grad to support high differential
      
      * polish API docstring
      
      * Updated gradient checker and some related files
      
      * fix double grad strip error for high differential
      
      * fix double grad strip error for high differential
      
      * Add Sigmoid triple grad tests
      
      * fix dygraph double grad dtype error when calling for high differential scenario
      
      * Updated triple grad tests func
      
      * Use np.random to initialize ddx
      
      * Updated triple_grad_check func
      
      * add todo for gradient checker and refine some comments
      
      * remove additional code
      
      * add test for warning in backward.py
      
      * format python code
      
      * support multi input in triple gradient checker
      
      * Add matmul triple grad kernel
      
      * Updated comments of TODO
      
      * Supported some special tests
      
      * Change code-format to follow CI std
      
      * Updated gradient_checker.py
      
      * Fix conflicts
      
      * Removed unnecessary printing log
      
      * Change code style to follow CI std
      
      * support batch in jacobian and hessian
      
      * add batch jacobian and batch hessian
      
      * Add batch_jacobian test, draft version
      
      * [New features] Add elementwise_mul triple grad kernel (#37152)
      
      * Add elementwise_mul triple grad kernel
      
      * Removed InplaceInferer and polished code
      
      * Add numerical_batch_jacobian, numerical_batch_hessian and tests
      
      * Support batch_jacobian and batch_numerical
      
      * Use pre-commit to check code format
      
      * Update doc, polish code, add unit test
      
      * Reset the TIMEOUT properties of test_jacobian to pass CI
      Co-authored-by: levi131 <limaolin01@baidu.com>
      Co-authored-by: Jiabin Yang <360788950@qq.com>
      4d24d352
    • B
      fix_InternalStorage (#37568) · d0a89744
      Baibaifan committed
      d0a89744
    • W
      [ut] Update skip concept to ignore. (#37635) · ae544242
      Wilber committed
      ae544242
  3. 27 Nov, 2021 2 commits
    • Y
      [Auto Parallel] Add the graph class for the process and cluster (#37482) · 48faf638
      Yulong Ao committed
      * [Auto Parallel]  Add the unified cluster representation
      
      * [Auto Parallel] Add the graph class for physical mapping
      
      * [Auto Parallel] Add the simple physical mapper
      
      * Set the timeout of the mapper
      
      * Merge the upstream develop unittests cmake files
      
      * Fix a bug of the process group
      
      * Remove mapper unittest from platforms that are not GPU
      
      * Move the instantiation of process group after resharding
      
      * Add the local id for devices
      
      * Update the rank mapping format
      
      * Add some comments
      
      * Remove the related files about mapping
      
      * Remove unused rank_mapping unittest
      
      * Improve the unittest coverage
      48faf638
    • J
      fix save inference model conditional op (#37579) · fd41456f
      JingZhuangzhuang committed
      fd41456f
  4. 26 Nov, 2021 6 commits
    • S
      fix data parallel when VOCAB var in program (#37543) · e05540f7
      Steffy-zxf committed
      * fix data parallel when VOCAB var in program
      e05540f7
    • Z
      upgrade async distributed training in pscore (#37515) · 74605fc2
      zhaocaibei123 committed
      * test
      
      * test
      
      * rm test
      
      * update
      
      * update
      
      * update
      
      * add unittest
      
      * update
      
      * update save
      74605fc2
    • L
      Fix bugs when bias is None in static graph for fused_attention op. (#37566) · 097e098d
      Li Min committed
      * Fix bugs when bias is none for static graph for fused_attention op.
      097e098d
    • Z
      Added interface reset_grad_inplace_version (#37573) · dcb91fd7
      Zhanlue Yang committed
      reset_inplace_version removes all inplace-related records from VarBase/VariableWrapper. Its purpose is to let you use an inplace operation as if it were its non-inplace version, which will cause unexpected consequences if not used with care.
      
      This is essentially a hack interface to satisfy one specific request.
      dcb91fd7
    • W
      TDM2 (#37044) · 4826167c
      wangzhen38 committed
      * add tdm sample
      
      * add tdm sample in c++
      
      * update tdm sample
      
      * modify sample count
      
      * fix conflict
      
      * add set_date
      
      * fix cmake error
      
      * fix bug of proto
      
      * update index_dataset proto
      
      * update cmake
      
      * fix error cmake
      
      * fix cmake mkldnn
      
      * fix cmake proto
      
      * update cmake proto
      
      * update cmake
      
      * update rec
      
      * update dataset
      
      * update dataset
      
      * update dataset
      
      * update dataset
      
      * update dataset
      
      * update coverage
      
      * update ci
      
      * goback4
      
      * fix npu ci
      
      * add xxhash dep
      4826167c
    • S
      Fix dropout static when axis != None (#37223) · f25fda37
      smallv0221 committed
      * fix dropout static when axis != None
      
      * update dropout test
      
      * add dropout test
      
      * fix test
      
      * Update test_dropout_op.py
      
      * Update test_dropout_op.py
      
      * fix testcase
      
      * fix testcase
      
      * Update test_dropout_op.py
      
      * fix testcase
      
      * fix testcase
      
      * optimize perf
      
      * add new test
      
      * fix testcase
      f25fda37
  5. 25 Nov, 2021 7 commits
  6. 24 Nov, 2021 4 commits
    • T
      [GpuPs]pybind core (#37287) · d69daed1
      Thunderbrook committed
      * pybind core
      
      * set use psgpu
      d69daed1
    • W
      [Paddle-Inference] Matmul_int8_convert: tensor*tensor (#37285) · 16590799
      Wangzheee committed
      * matmul_convert_int8
      
      * matmul_convert_int8
      
      * matmulconvert_int8
      
      * Matmul_int8_convert: tensor*tensor
      
      * Matmul_int8_convert: tensor*tensor
      
      * Matmul_int8_convert: tensor*tensor
      16590799
    • Y
      [Auto Parallel] Add the unified cluster representation (#37091) · db727551
      Yulong Ao committed
      * [Auto Parallel]  Add the unified cluster representation
      
      * Add the local id for devices
      
      * Add some comments
      db727551
    • 0
      [Dy2stat]support pure fp16 for dy2stat (#36944) · 52edad6a
      0x45f committed
      * run dy2stat pure fp16 in Linear model
      
      * no use self._pure_fp16_inputs
      
      * add test and fix Adam error in dy2stat pure fp16 training
      
      * use paddle.optimizer.Adam
      
      * run test in gpu
      
      * change test time for CI
      
      * enlarge atol for test_resnet_pure_fp16
      
      * refine code and enlarge atol
      
      * make custom_white_list and custom_black_list take effect for AMP and pure fp16
      
      * check tracer is not None
      
      * use default atol
      
      * change filter_size
      
      * change atol and add some NOTE
      52edad6a
  7. 23 Nov, 2021 7 commits
  8. 22 Nov, 2021 6 commits