Commits · 6a3941e3cb9a1752df2374561a4defc7b908fa62 · Crayon鑫 / Paddle

20 Oct, 2021 16 commits

fix bugs of ClipGradByGlobalNorm in HybridParallel (#36555) · 6a3941e3
Committed by Haohongxiang on Oct 20, 2021
```
* fix bugs of ClipGradByGlobalNorm

* add unittests

* add unittests
```
6a3941e3
Fix global gather and global scatter operators (#36517) · 17b4dd70
Committed by 李季 on Oct 20, 2021
```
* fix global gather and global scatter operators
```
17b4dd70
[NPU] Add kldiv_loss_op for npu (#36494) · 6a572a19
Committed by ronnywang on Oct 20, 2021

6a572a19
fix fc fuse proble (#36568) · fc5db55a
Committed by Wilber on Oct 20, 2021

fc5db55a

Add FasterTokenizer Operator (#34491) · 3f2d6a3f

Committed by Steffy-zxf on Oct 20, 2021

Add tokenizer-related functionality for Transformer models so that text preprocessing is consistent between training and prediction.

* support a text string as an input Tensor
* support the "VOCAB" `unordered_map<wstring, int>` as an input Tensor to look up tokens
* Tokenizer used for BERT. It applies end-to-end tokenization, from text string to wordpiece tokens.
* It first applies basic tokenization, followed by wordpiece tokenization.
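
The two-stage flow described above (basic tokenization, then wordpiece tokenization against a vocabulary) can be illustrated with a small pure-Python sketch. This is not the operator's actual implementation: the function names, the toy vocabulary, the `[UNK]` and `##` conventions, and the lowercasing step are all assumptions borrowed from common BERT-style tokenizers.

```python
import unicodedata

# Toy vocabulary invented for this example; it stands in for the
# "VOCAB" map mentioned in the commit message.
TOY_VOCAB = {"[UNK]": 0, "hello": 1, ",": 2, "un": 3, "##aff": 4, "##able": 5}


def basic_tokenize(text):
    """Lowercase (assumption), split on whitespace, isolate punctuation."""
    tokens, current = [], []
    for ch in text.lower():
        if ch.isspace():
            if current:
                tokens.append("".join(current))
                current = []
        elif unicodedata.category(ch).startswith("P"):
            # Punctuation becomes its own token.
            if current:
                tokens.append("".join(current))
                current = []
            tokens.append(ch)
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def wordpiece_tokenize(word, vocab, unk="[UNK]", prefix="##"):
    """Greedy longest-match-first split of one word into vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = prefix + candidate  # mark a continuation piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def tokenize(text, vocab):
    """Text string -> list of token ids (basic, then wordpiece)."""
    return [vocab[piece]
            for word in basic_tokenize(text)
            for piece in wordpiece_tokenize(word, vocab)]
```

With the toy vocabulary, `tokenize("Hello, unaffable", TOY_VOCAB)` splits the input into `hello`, `,`, `un`, `##aff`, `##able` before mapping each piece to its id. The sketch requires the unknown token to be present in the vocabulary.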

3f2d6a3f

adapt to cann5.0.3_alpha3. (#36106) · 873ee4e3
Committed by wuhuachaocoding on Oct 20, 2021

873ee4e3
fix pow2 decay (#36559) · 605e7f08
Committed by Zeng Jinle on Oct 20, 2021

605e7f08
add unittest (#36371) · 7325c9fb
Committed by Wilber on Oct 20, 2021

7325c9fb
update for trt convert ut. (#36549) · 06bd348d
Committed by Wilber on Oct 20, 2021

06bd348d

fix SerializeSelectedRows (#36543) · 8ca5206b

Committed by zmx on Oct 20, 2021

* bug fix for  DeserializeSelectedRows. test=develop

* fix bug for SerializeSelectedRows. test=develop

* update. test=develop

8ca5206b

Add CINN Compile Option (#36292) · 6524fa8d

Committed by Huihuang Zheng on Oct 20, 2021

Add CINN compile option in CMake.

You can now enable CINN in Paddle by passing `-DWITH_CINN=ON` to `cmake`.

To test it, run `make cinn_lib_test -j` and then `ctest -R cinn_lib_test`.

Note:
1. You should set the following environment variable:
```
export runtime_include_dir=${CINN_SOURCE_DIR}/cinn/runtime/cuda 
```
Set this when running the test; `${CINN_SOURCE_DIR}` should point to your CINN source directory.

2. CINN is still under development, so you may have to change `CINN_GIT_TAG` to the git commit you need.

6524fa8d

fix (#36557) · 4bd19770
Committed by wenbin on Oct 20, 2021
```
* fix

* remove const
```
4bd19770

[FIX] Extend time for test_activation_nn_grad to avoid its timeout issue (#36527) · c285c719

Committed by Jiabin Yang on Oct 20, 2021

* native commit for triple grad of sigmod

* Updated unittests files

* init functional jacobian api

* Updated trible_test func

* Updated gradient_checker & test_script

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* fix dygraph grad to support high differential

* polish API docstring

* Updated gradient checker and some related files

* fix double grad strip error for high differential

* fix double grad strip error for high differential

* Add Sigmoid triple grad tests

* fix dygraph double grad dtype error when calling for high differential senario

* Updated triple grad teses func

* Use np.random to initialize ddx

* Updated triple_grad_check func

* add todo for gradient checker and refine some comments

* remove additional code

* add test for warnging in backward.py

* add tanh triple grad

* format python code

* refine code

* make test_activation_nn_grad test time to 150s

Co-authored-by: veyron95 <veyron_wu@163.com>
Co-authored-by: levi131 <limaolin01@baidu.com>

c285c719

[Auto Parallel] Generalization for Partition and Completion (#35735) · 797bd40d

Committed by JZ-LIANG on Oct 20, 2021

* default dist op

* add dist_attr for dist op

* add unitest

* update inputname

* update function name

* add unitest

* update CMakeLists.txt for CI

* fix dis_matmul

* fix compile error

* update matmul to matmul_v2

* unify api

* unify api

* todo

* update distop forward func

* update distop forward func

* auto parallel backward

* update dist op

* autoparallel backward

* add backward for embedding

* temp1

* temp2

* temp3

* temp4

* backward done1

* backward done2

* backward done3

* dist embedding remove mp mode

* dist matmul remove mp mode

* update dist embedding

* dist op init1

* dist op init 2

* update unitest

* context remove parallel mode

* partitioner remove parallel mode

* update unitest

* a more general method to support varying mesh in pipeline parallel

* support varying mesh in pipeline parallel

* embedding support varying mesh in pipeline parallel

* matmul support varying mesh in pipeline parallel

* default dist op support varying mesh in pipeline parallel

* dist attribute for startup program

* default dist op support varying mesh in pipeline parallel 2

* partitoner support varying mesh in pipeline parallel

* revise logic for auto compeletion

* revise framework.py

* revise reshard unitest

* revise unitest for parallelize

* chmod

* fixed bug for dist embedding name mapping

Co-authored-by: zhaoyingli <zhaoyingli@baidu.com>

797bd40d

Add kQueueSync.synchronize_run_ logic (#36546) · 127488ba
Committed by Aurelius84 on Oct 20, 2021

127488ba

remove no_value using var.name (#36513) · fe01ba6a

Committed by 0x45f on Oct 20, 2021

* remove no_value using var.name

* fix unit test for CI

* fix unit test

* add test case

* fix test case

* add more test case

fe01ba6a

19 Oct, 2021 17 commits
- Support elementwise_add triple grad Kernel (#36508) · 51c97d9f
  Committed by Weilong Wu on Oct 19, 2021
```
* Support elementwise_add triple grad Kernel

* Change code-format to follow CI std
```
  51c97d9f
- [NPU] Add iou_similarity op (#36412) · 999242e3
  Committed by zhulei on Oct 19, 2021
```
* [NPU] Add iou_similarity op

* [NPU] Add iou_similarity op

* [NPU] Add iou_similarity op
```
  999242e3
- fix op_flops not define. test=develop (#36489) · f2612462
  Committed by Kaipeng Deng on Oct 19, 2021
  
  f2612462
- [NPU] update inference cmake, test=develop (#36505) · 49d7bd38
  Committed by Qi Li on Oct 19, 2021
```
* [NPU] update inference cmake, test=develop

* address review comments, test=develop

* fix compile error when WITH_ASCEND_CXX11 ON, test=develop
```
  49d7bd38
- [heterps]edit shrink and unseenday logit for pslib (#36194) · 9e494472
  Committed by danleifeng on Oct 19, 2021
  
  9e494472
- Inference add type check in copy_from_cpu (#36429) · be6a8330
  Committed by Wilber on Oct 19, 2021
```
* update

* fix ut error

* update ut
```
  be6a8330
- Optimize the subgraph generated by BuildCinnPass (#36503) · 6cdc5a4b
  Committed by jiangcheng on Oct 19, 2021
```
* add feed op and new var for the generated subgraph

* perfect the test script of build_cinn_pass 

* remove useless clear and perfect some annotation
```
  6cdc5a4b
- add nearest_interp_v2 trt plugin (#34126) · 7b67f398
  Committed by wangxinxin08 on Oct 19, 2021
```
* add nearest_interp_v2 trt plugin
```
  7b67f398
- [hybrid] static model parallel dropout support deterministic RandomSeedGenerator (#36228) · 8cc8e411
  Committed by WangXi on Oct 19, 2021
  
  8cc8e411
- fix replicate pad when input size is 0 (#36510) · d89a759b
  Committed by littletomatodonkey on Oct 19, 2021
```
* fix replicate pad when input size is 0
* add unit test
```
  d89a759b
- catch the generatorfunction and intercept it. (#35369) · 7edcc4fb
  Committed by xiongkun on Oct 19, 2021
```
* catch the generatorfunction and intercept it.

* add test generator

* add test case

* refine the testcase
```
  7edcc4fb
- [paddle.linalg.qr] Add the Qr Operator (#35742) · 34d785c2
  Committed by Yulong Ao on Oct 19, 2021
```
* Add QR decomposition op

* Change codes to adapt to new svd_helper

* Update linalg.py

Restore the deleted comma

* Restore the deleted line

* Update linalg.py

* Update linalg.py

* Improve the qr code by reviews

* Update QR based on CI results

* Update qr doc, test=document_fix

* Change unsafe and ill-formed codes
```
  34d785c2
- Add auto parallel cost model and unittests (#36363) · a573a7ed
  Committed by YipZLF on Oct 19, 2021
```
* Add auto parallel cost model and unittests

* Fixed code styles.

* Fixed bugs and codes style

* fixed typo

* Improved code style: object encapsulation.

* Fixed codes.

* Refractored estimate_cost

* Fixed typo
```
  a573a7ed
- add rocm support for fft api (#36415) · 1d5746bd
  Committed by Xiaoxu Chen on Oct 19, 2021
  
  1d5746bd
- fix out of range for area interp, test=develop (#36466) · 77f4597f
  Committed by xiaoting on Oct 19, 2021
  
  77f4597f
- bug fix for DeserializeSelectedRows. test=develop (#36520) · a7830a29
  Committed by zmx on Oct 19, 2021
  
  a7830a29
- Add pow2_decay_with_linear_warmup op (#36421) · 305b99a0
  Committed by Zeng Jinle on Oct 19, 2021
```
* add pow2_warmup op

* remove contrib __all__

* add AttrT

* rename

* follow comments

* fix duplicate PADDLE_RESTRICT
```
  305b99a0
18 Oct, 2021 7 commits

[HybridParallel]Support fp16 in dygraph hybrid parallel (#36420) · 10f0a0f6

Committed by Haohongxiang on Oct 18, 2021

* [HybridParallel]Support fp16 in dygraph hybrid parallel

* update

* update

* update for recompute

* add unittest of pp+fp16

* add unittest of recompute+fp16

* update

* modify ut

10f0a0f6

Added softplus FP32 FWD OneDNN kernel (#36382) · bdac9ff6

Committed by jakpiase on Oct 18, 2021

* added softplus

* refactored softplus op

* deleted unnecessary file

* added missing file

* added formatting

* disabled tests if GPU is used

* added reviewer suggestion

* unified softplus kernel

bdac9ff6

Lml/vhp (#36146) · 4c0ad772

Committed by levi131 on Oct 18, 2021

* init functional jacobian api

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* init hessian API

* save status

* polish API docstring

* modify docstring

* add utils.py

* save status

* fix dygraph double grad dtype error when calling for high differential senario

* reinvoke ci

* test_hessian.py is ok

* polish hessian API

* init vhp

* Revert "init vhp"

This reverts commit cbd4d3b66abe82b0ac10721b9eddeb7d82e0a1c8.

* init vhp

* finish vhp API logically

* add test for partial_engine.cc

* modify numerical_delta with dtype float32

* merge fix for dtype float64

* spell fix

* save status

* polish code

* rm _stop_gradient_pre_process

* save status

* add example for vhp interface

* add _compute_numerical_vjp and _compute_numerical_vhp

* test is ok

* vhp is ok

* add testVHPFloat64

* modify for comments

* modify format

* modify format

* save status

* test_vhp is ok

* finish code polish

* small modify for v is None

Co-authored-by: JiabinYang <360788950@qq.com>

4c0ad772

Add quant axis (#36467) · b7f76647

Committed by xiaoxiaohehe001 on Oct 18, 2021

* add_quant_axis

* add_quant_axis

* --amend

* Update quant_conv2d_dequant_fuse_pass.cc

b7f76647

[NPU] add kernels for elementwise_add gather_nd tile, test=develop (#36464) · cbd15f7d
Committed by Qi Li on Oct 18, 2021

cbd15f7d
[NPU] fix dtype for arg_max, test=develop (#36457) · 8757fc5b
Committed by Qi Li on Oct 18, 2021

8757fc5b

Add operators for async read & async write (#36333) · 3845afff

Committed by Siming Dai on Oct 18, 2021

* fix async_read bug

* change index place to cpu

* add tensor size judge

* add async_read & async_write test

* fix bug in async_write

* fix mac py3 ci

* fix bug for cpu version paddle

* fix windows ci bug

* change input argument error type

* change const_cast to mutable_data

* add async_write out-of-bound check and consumate error hint

* fix a small bug for dst_tensor

* add docs and refine codes

* refine docs

* notest,test=windows_ci

* fix windows ci

* fix require

* fix code-block

* add core.is_compiled_with_cuda()

3845afff

Crayon鑫 / Paddle is in sync with the fork source project
