Commits · fd85be80cfc589b75575d48119ddc7490a1ede28 · 机器未来 / Paddle

09 Jul, 2021 2 commits
- C
  [PTQ ] wrap simulated layers and save the quantized model (#33962) · fd85be80
  Committed by cc on Jul 09, 2021
```
* PTQ save quantized model

* Wrap simulated layer

* post process the inference model
```
  fd85be80
- Z
  
  fix double grad hang bug (#34023) · 8768ffb7
  Committed by Zeng Jinle on Jul 09, 2021
  
  8768ffb7
08 Jul, 2021 5 commits
- W
  delete the function of saving layer object. (#33697) · e22701c4
  Committed by WeiXin on Jul 08, 2021
```
* delete the function of saving layer object.

* edit doc of paddle.save/load and polish error message
```
  e22701c4
- H
  opt dygraph python code for 215 unchecked calls (#34024) · 9b611ea2
  Committed by Hao Lin on Jul 08, 2021
```
* opt dygraph python API, test=develop

* Fix unbind bug in manipulation.py
```
  9b611ea2
- L
  
  fix the bug, test=develop (#33996) · 6a36977d
  Committed by lilong12 on Jul 08, 2021
  
  6a36977d
- M
  
  Distributed Automatic SParsity with Fleet (#33558) · 86cb3fb8
  Committed by Ming-Xu Huang on Jul 08, 2021
  
  86cb3fb8
- W
  Fix test_jit_save_load random failure. (#34004) · 1e5437de
  Committed by WeiXin on Jul 08, 2021
```
* Fix test_jit_save_load random failure.

* Since CI is not activated, recommit the code.

* delete temp file.
```
  1e5437de
07 Jul, 2021 4 commits
- W
  
  fix reshape trt condition (#34007) · 0914ff97
  Committed by Wilber on Jul 07, 2021
  
  0914ff97
- P
  
  add Wait after TensorCopy (#34005) · cb73feea
  Committed by pangyoki on Jul 07, 2021
  
  cb73feea
- J
  Added PRelu BF16/FP32 FWD/BWD kernels (#33878) · 375e5618
  Committed by jakpiase on Jul 07, 2021
```
* added prelu bf16/fp32 fwd/bwd kernel
```
  375e5618
- T
  
  [xpu] add dropout & amp ops in xpu place (#33891) · 84e813e3
  Committed by taixiurong on Jul 07, 2021
  
  84e813e3
06 Jul, 2021 7 commits
- T
  add so parser (#33969) · b1c458d0
  Committed by Thunderbrook on Jul 06, 2021
```
* add delta score, scale show

* so parser

* windows

* windows
```
  b1c458d0
- Z
  Add gpu implementation of shuffle_batch_op (#33938) · c6b6ba1f
  Committed by Zeng Jinle on Jul 06, 2021
```
* add gpu implementation of shuffle batch
test=develop

* add thrust cuda patches
test=develop

* fix macro guard

* fix shuffle batch compile on windows/hip

* fix hip compilation error

* refine CMakeLists.txt

* fix windows compile error

* try to fix windows CI compilation error

* fix windows compilation again

* fix shuffle_batch op test on Windows
```
  c6b6ba1f
- K
  
  make DataLoader warning less noisy. test=develop (#33712) · 5085c44b
  Committed by Kaipeng Deng on Jul 06, 2021
  
  5085c44b
- W
  
  [hybrid performance] pipeline add program cache (#33954) · c9ae1362
  Committed by WangXi on Jul 06, 2021
  
  c9ae1362
- Z
  
  public api:add bn\ln\in; add static.xpu_place (#33897) · 6b95e674
  Committed by zhiboniu on Jul 06, 2021
  
  6b95e674
- X
  Enhance error message for interpolate_v2 (#33941) · f2068eec
  Committed by xiaoting on Jul 06, 2021
```
* fix interpolate for shape[i]=0, test=develop

* fix test_trilinear_interp_v2 random failure, test=develop
```
  f2068eec
- D
  【HETERPS】pipeline adaptive for heterps (#33159) · bfef7feb
  Committed by danleifeng on Jul 06, 2021
```
* pipeline adaptive for heterps;test=develop
* fix finalize hang;test=develop
* add is_compiled_with_heterps for dataset;test=develop
* fix hashtable core when pass ins_num=0;test=develop
```
  bfef7feb
05 Jul, 2021 12 commits
- A
  
  [Dy2Stat]Fix unique_name in create_static_variable_gast_node (#33963) · 740f4e30
  Committed by Aurelius84 on Jul 05, 2021
  
  740f4e30
- J
  add `reduce_sum` op into amp black list (#33960) · aa9fdd0d
  Committed by jiangcheng on Jul 05, 2021
```
* reduce sum op default fp32, add into amp black list

* reduce_sum default fp32 can avoid return inf when the sum value large than 65504
```
  aa9fdd0d
- W
  
  [hybrid performance] optimize pipeline performance · 9914dff7
  Committed by WangXi on Jul 05, 2021
  
  9914dff7
- W
  
  Add fused elemwise gelu and optimize performance (#33480) · eae31856
  Committed by WangXi on Jul 05, 2021
  
  eae31856
- P
  [NPU] change Add to AddN in sum npu op (#33957) · fa5ddfd9
  Committed by pangyoki on Jul 05, 2021
```
* change Add to AddN in sum npu op

* add AddInputNames

* change fp16 to fp32 because numpy has accuracy loss in fp16 adding

* delete check

* fix runner error
```
  fa5ddfd9
- Q
  
  [NPU] add abs and uniform_random op and npu dockerfile, test=develop (#33942) · a84e48b9
  Committed by Qi Li on Jul 05, 2021
  
  a84e48b9
- W
  
  optimize grad add device (#33946) · 75d247b7
  Committed by WangXi on Jul 05, 2021
  
  75d247b7
- S
  
  fix bug of sync_parameters (#33955) · bd559a24
  Committed by ShenLiang on Jul 05, 2021
  
  bd559a24
- C
  Refine the dygraph ptq and the module of calculating KL threshold (#33898) · 9254183d
  Committed by cc on Jul 05, 2021
```
* refine ptq according comments
* reuse the module to calculate kl threshold
```
  9254183d
- S
  [HybridParallel] Add amp support for pipeline_parallel (#33951) · 0b911330
  Committed by ShenLiang on Jul 05, 2021
```
* add amp support for pp

* add amp untest
```
  0b911330
- D
  【HeterPS】fix hdfs and fleet_util for supporting save/load/infer (#33903) · 2ef6188b
  Committed by danleifeng on Jul 05, 2021
```
* fix hdfs and fleet_util for supporting save/load infer;test=develop
```
  2ef6188b
- C
  [Dygraph QAT] Save all scales to target ops and Move quant layers to paddle.nn.quant (#33871) · 00c85a74
  Committed by cc on Jul 05, 2021
```
* Save all scales to target ops
* Move quant layers to paddle.nn.quant
```
  00c85a74
04 Jul, 2021 1 commit
- P
  [NPU] delete useless GELU in gelu grad npu op (#33872) · 4d167240
  Committed by pangyoki on Jul 04, 2021
```
* delete useless GELU in gelu npu op

* add description

* fix format

* add check_grad in gelu unittest
```
  4d167240
02 Jul, 2021 3 commits
- W
  
  fix fleet amp get_loss_scaling (#33935) · 17a81df6
  Committed by WangXi on Jul 02, 2021
  
  17a81df6
- C
  Refine QuantizeTranspilerV2 to support distributed training (#33781) · 4032c2e4
  Committed by cc on Jul 02, 2021
```
* refine the old code

* support moving_average_abs_max and per_channel_abs_max

* Add moving_average_abs_max_scale op

* Convert the test program
```
  4032c2e4
- W
  
  fix shared param grad_add op_device is null (#33875) · cf4c6fb4
  Committed by WangXi on Jul 02, 2021
  
  cf4c6fb4
01 Jul, 2021 5 commits
- Y
  
  gradient scale (#33862) · 57aabbab
  Committed by Yuang Liu on Jul 01, 2021
  
  57aabbab
- S
  
  roll optimize (#32880) · 3fc56aa0
  Committed by sunli on Jul 01, 2021
  
  3fc56aa0
- J
  Dygraph/sharding (#33633) · f33f2444
  Committed by JZ-LIANG on Jul 01, 2021
```
* dygraph sharding

* update unitest hybrid_parallel_communicate_group
```
  f33f2444
- T
  
  fix bug DLTP-31078 (#33877) · 3e82a794
  Committed by taixiurong on Jul 01, 2021
  
  3e82a794
- Z
  [AMP] add get() and set() for Grad_scaler (#33835) · 85687348
  Committed by zhangbo9674 on Jul 01, 2021
```
* add get and set for Grad_scaler

* refine some API name and comments

* refine API name and comments

* refine some comments
```
  85687348
30 Jun, 2021 1 commit

Added matmul_v2 BF16/FP32 FWD kernel (#33750) · 24783c84

Committed by jakpiase on Jun 30, 2021

* added matmul_v2 bf16/fp32 FWD kernel

added matmul_v2 bf16/fp32 FWD kernel

* added formatting

* removed some tests due to timeout in CI

* refactored tests

* merged tests classes into one file

* minor change

* removed test guard for CUDA

* remove skipIf

* changes after review

* formated one file

* minor change

* added skipping UT in CUDA place

24783c84
