Commits · 2b4977f20cbe962599c55ab57c99f0c2043bf478 · PaddlePaddle / Paddle

23 May, 2022 · 3 commits
- fix final_state_linear (#42820) · 2b4977f2
  Committed by pangyoki on May 23, 2022
  
  2b4977f2
- improve error info when no sample code found (#42742) · 9827c8b5
  Committed by Sing_chan on May 23, 2022
```
* test=document_fix

* exit 1 if no sample code found since api must have sample code;test=document_fix

* test normal input;test=document_fix

* delete test code;test=document_fix
```
  9827c8b5
- Fix a bug in BroadcastConfig for KP XPU2 rec model (#42866) · 106083aa
  Committed by shixingbo on May 23, 2022
  
  106083aa
22 May, 2022 · 1 commit

Quantize elementwise sub (#42854) · 2ffb3371

Committed by Zuza Gawrysiak on May 22, 2022

* Add elementwise_sub quantization

* Remove unnecessary comments

* Specify names for tests

* Remove comments

* Remove comments leftovers

2ffb3371

21 May, 2022 · 1 commit

delete PADDLE_WITH_TESTING in memory_block_desc (#41817) · 7b6bf281

Committed by pangyoki on May 21, 2022

* delete PADDLE_WITH_TESTING in memory_block_desc

* test FLAGS_allocator_strategy=naive_best_fit

* delete flag naive_best_fit

7b6bf281

20 May, 2022 · 16 commits
- Delete ElementwiseKernel in BroadcastKernel (#42779) · 0d878f1a
  Committed by niuliling123 on May 20, 2022
  
  0d878f1a
- support heterogeneous tensor for kernel in yaml (#42898) · c5d3bc0e
  Committed by zyfncg on May 20, 2022
  
  c5d3bc0e
- fix fused_attention_op cacheKV InferShape (#42900) · 7306d1fb
  Committed by WangXi on May 20, 2022
  
  7306d1fb
- use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output (#42851) · f36a9464
  Committed by Leo Chen on May 20, 2022
```
* use fp32 compute type for cublasGemmStridedBatchedEx with fp16 input/output

* add flags to control compute type

* default to false

* add unit test

* default to true
```
  f36a9464
- 【doc CI】simplify doc check log info (#42879) · 4a48e3d1
  Committed by Sing_chan on May 20, 2022
```
* simplify doc check log info;test=document_fix

* test sample code error;test=document_fix

* delete test code;test=document_fix
```
  4a48e3d1
- move activation kernel (#42880) · 191c441a
  Committed by YuanRisheng on May 20, 2022
  
  191c441a
- [Eager] Make CreateInferMeta more robust (#42871) · d8b69124
  Committed by Weilong Wu on May 20, 2022
  
  d8b69124
- fix hook mem leak (#42857) · 723c4ae7
  Committed by Jiabin Yang on May 20, 2022
  
  723c4ae7
- [Hackathon No.5] tril_indices OP (#41639) · 75db5b86
  Committed by xiaoguoguo626807 on May 20, 2022
```
* add tril_indices cpu kernel

* modify tril_indice cpu op

* modify bug

* modify bug

* add tril_indices python api

* add tril_indices python api

* resolve conflict

* add tril_indices test

* modify details

* add tril_indices.cu

* pythonapi pass

* save tril_indices

* CPU tril_indices pass

* delete vlog

* modify test_tril_indices_op.py

* delete tril_indices_kernel.cc.swp

* delete tril_indice.cu

* modify code style

* add newline in creation.py

* modify creation.py linux newline

* delete annotation

* check code style

* check .py style add final_state??

* modify code style

* add gpu_tril_indices

* modify gpu_compiled_juage

* modify gpu judge

* code style

* add test example

* modify english document

modify english document

modify english document

modify document

modify document

* modify param name

* modify param name

* modify param

* reduce test ex
```
  75db5b86
- fix Wtype-limits (#42676) · 1f76eabf
  Committed by zhaocaibei123 on May 20, 2022
```
* fix Wtype-limits

* fix

* remove -Wno-error=type-limits
```
  1f76eabf
- add approval for changing warning flag (#42875) · 11ce7eb1
  Committed by Leo Chen on May 20, 2022
```
* add approval for changing warning flag

* test for approval

* revert changes
```
  11ce7eb1
- add dymf accessor support (#42881) · 56a8b3e3
  Committed by yaoxuefeng on May 20, 2022
  
  56a8b3e3
- add arg_max tensorrt converter, fix identity_scale_op_clean_pass (#42850) · 5efc4146
  Committed by zhupengyang on May 20, 2022
  
  5efc4146
- [MLU]support to spawn processes on mlu (#41787) · 5d1bbecb
  Committed by zn on May 20, 2022
  
  5d1bbecb
- add files and directories generated during codegen for operators into gitignore (#42874) · 2caee61f
  Committed by Feiyu Chan on May 20, 2022
  
  2caee61f
- merge dymf branch (#42714) · 3f619290
  Committed by yaoxuefeng on May 20, 2022
```
merge dymf branch
```
  3f619290
19 May, 2022 · 15 commits

[MLU] add lookup_table_v2 and unstack op (#42847) · e726960a
Committed by qipengh on May 19, 2022

e726960a
Fix PD_INFER_DECL redefine (#42731) · 313f5d01
Committed by Rui Li on May 19, 2022
```
Signed-off-by: KernelErr <me@lirui.tech>
```
313f5d01

OneDNN md-in-tensor refactoring part 3: Changes in quantize and dequantize (#42766) · b522ca52

Committed by jakpiase on May 19, 2022

* added md support inside (de)quantizes

* added missing file

* changed paddle enforce text

* another paddle enforce change

* same as before

* removed broken tests

b522ca52

【CI】run all demo ci before exit in windows (#42700) · 6d0e4e4a

Committed by Sing_chan on May 19, 2022

* run all demo ci before exit;test=document_fix;test=windows_ci_inference

* fix bug;test=document_fix;test=windows_ci_inference

* improve log

* comment test code

* modify according to zhouwei's comments

6d0e4e4a

[Phi] Change the output format of C++ backward api (Part2) (#42545) · 4427f1b1

Committed by zyfncg on May 19, 2022

* change the output format of C++ backward api

* fix merge conflict

* fix sparse api code auto-gen

* fix eager_gen bug

* fix bug of output is null

* fix bug of conv2d_grad_impl

* fix optional grad

* fix bug of eager-gen double_grad

* fix bug

* fix multiply_double_grad bug

* fix bug of higher order derivative

* fix bug of FillZeroForEmptyGradInput

* remove redundant vector in grad_node

* fix bug of test_deformable_conv_v1_op

* fix bug of test_deformable_conv_v1_op

* some refactor

4427f1b1

[NPU] minor changes for version control to support version without suffix (#42856) · 892f6850
Committed by Aganlengzi on May 19, 2022

892f6850
【GPUPS】add ctr_dymf_accessor for pscore (#42827) · 148582fe
Committed by danleifeng on May 19, 2022

148582fe
[Phi] Remove shared_storage (#42821) · 7a171e3c
Committed by zyfncg on May 19, 2022
```
* remove shared_storage

* fix bug

* fix rnn bug
```
7a171e3c
Fix typos in the comment doc of SimpleRNN, LSTM, GRU: hidden_size -> input_size. (#42770) · 155fe05b
Committed by Zhengyang Song on May 19, 2022
```
test=document_fix
```
155fe05b
[CompileOpt] Refine enforce code and remove boost/variant include (#41093) · ca359fec
Committed by Chen Weihang on May 19, 2022
```
* refine enforce code

* refine enforce code

* fix compile failed

* fix infrt failed
```
ca359fec

distribute label evenly among partitions in graph engine (#42846) · 68babef1

Committed by seemingwang on May 19, 2022

* enable graph-engine to return all id

* change vector's dimension

* change vector's dimension

* enlarge returned ids dimensions

* add actual_val

* change vlog

* fix bug

* bug fix

* bug fix

* fix display test

* singleton of gpu_graph_wrapper

* change sample result's structure to fit training

* recover sample code

* fix

* secondary sample

* add graph partition

* fix pybind

* optimize buffer allocation

* fix node transfer problem

* remove log

* support 32G+ graph on single gpu

* remove logs

* fix

* fix

* fix cpu query

* display info

* remove log

* remove empty file

* distribute labeled data evenly in graph engine

Co-authored-by: DesmonDay <908660116@qq.com>

68babef1

[Auto Parallel] Support Primitive operators with Data Parallel (#42709) · 6b8efc45

Committed by JZ-LIANG on May 19, 2022

* auto parallel support primitive op with data parallel

* add primitive change

* 5 loss 3D cylinder acc aligned

* add unitest

6b8efc45

[TensorRT] Support yolov5s (#42688) · a7778930

Committed by shentanyue on May 19, 2022

* support yolov5s static/int8

* fix eltwise_sub and div weight compute

* fix delete_fill_constant_pass

a7778930

Fix API Docs bug (#42816) · 9f4d342c

Committed by Chen Long on May 19, 2022

* update readme test=document_fix

* fix api docs;test=document_fix

* update logic.py;test=document_fix

* update docs;test=document_fix

9f4d342c

[AutoParallel] split data in dataloader (#42838) · df470954
Committed by zhaoyingli on May 19, 2022
```
* slice data in dist_loader & flag to scale grad

* bug fix

* update unittest

* enable static
```
df470954

18 May, 2022 · 4 commits

[Dy2Stat]Modify all jit.save path into tempfile under dygraph_to_static directory (#42842) · 16ce33b0
Committed by Aurelius84 on May 18, 2022
```
* [Dy2Stat]Modify all jit.save path into tempfile

* [Dy2Stat]Modify all jit.save path into tempfile
```
16ce33b0
fix tensorrt dla int8 problem (#42826) · a51817d7
Committed by csy0225 on May 18, 2022

a51817d7
Add Code Generation for operators, op makers and argument mapping functions (#41772) · e339d3c1
Committed by Feiyu Chan on May 18, 2022
```
Add Code Generation for operators,  op makers and argument mapping functions (#41772)
```
e339d3c1

Add support for forward and reverse high-order automatic differentiation mechanism (#41919) · f6ee202f

Committed by WangZhen on May 18, 2022

* Updated triple_grad_check func

* add todo for gradient checker and refine some comments

* remove additional code

* add test for warning in backward.py

* format python code

* support multi input in triple gradient checker

* Add matmul triple grad kernel

* Updated comments of TODO

* Supported some special tests

* Change code-format to follow CI std

* Updated gradient_checker.py

* Fix conflicts

* Removed unnecessary printing log

* Change code style to follow CI std

* merge upstream

* add priops.py

* add_p

* rm useless files

* add sub_p mul_p div_p

* add sqrt_p and tanh_p

* add reshape_p

* add broadcast_p

* Add python primitive wrappers.

* Jvp rules updated.

* JVP rules done for all the 17 primops.

* quick check and fixes.

* add jvp(op, *args)

* add broadcast_p fill_constant_p matmul_p reduce_p reshape_p transpose_p

* add split_p and concat_p

* add gather_p and scatter_add_p

* add slice_select_p and slice_assign_p

* Add transpose rules.

* add multi input check for add_p, sub_p, mul_p, div_p

* update concat_p

* Linearize and transpose in progress..

* refine gather_p and scatter_add_p

* updated.

* update transpose.

* refine slice_assign_p and slice_select_p

* init commit for lower

* Merged with primitive ops.

* small update

* add rules for orig2prim and prim2orig

* add 9 test for prim ops

* add more test and fix some bug

* add more test

* register proto

* Adding primops test.

* add shape valid check for broadcast_p op, and add keepdim attr into reduce_p op proto

* support multi input and multi output for split_p and concat_p

* Test updated.

* update

* fix slice bug for slice_select_p and slice_assign_p

* updated.

* Ops updated.

* Refactor and bug fixes.

* updated.

* finish orig2prim and prim2orig rules

* dtype for axis attr should be long int

* update dtype for axis attr int64_t

* update for iscan CI

* Update primx.

* Refactor vars in primx.

* update for lower transform

* add more shape and dtype check

* update primx.py

* change IndexTensor into int32 dtype

* update

* Fix linearize and transpose.

* Update is_dot

* Update is_dot

* Update is_dot

* add gradient aggregation, fix add_transpose.

* pass first linearize+transpose test.

* update test

* refactor op registration and primx.

* update rule for slice_assign

* try test lower

* update orig2prim and prim2orig

* pass simple lower pass

* update

* Update input types in the unit test.

* orig2prim segfault.

* 50% for adam.minimize

* test updated.

* temp fix errors in removing vars.

* primx updated.

* update for matmul_v2 and reshape2 orig2prim

* update for minimize

* Refine primrules

* Remove some code

* supporting unused and unreachable vars.

* update for use prim2orig in minimize

* fix gather and scatter_add transpose.

* Add rules UT

* update scatter_add

* Refine UT code

* fix nonetype check in topo

* Update gather_p pywrapper.

* remove useless print

* Merge tongxin PR and refine code

* readd some test

* rm useless print

* polish code.

* fix bug in minimize

* add get_input_var_list and get_output_var_list and use it in lower

* Fix scatter_add_p prim2orig

* Update code and fix orig2prim/prim2orig UT

* delete vars after block.desc._remove

* Improve ops and vars clean up logics.

* fix some bug in linearize and lower

* update tanh transpose.

* use set instead of list for var2remove

* test updated.

* polish code.

* fix dot2bar delete.

* merge tx/ad

* add indextensor_dot for gather and scatter_add

* add sorted for set

* Fix scale_orig2prim params

* fix some syntax bug

* add golbal_lower_update list

* Better handling of unused vars.

* update tests.

* Fix elementwise_sub orig2prim

* support none for transpose rule

* Merge and add transform UT

* fix a bug in transpose

* Fix transpose and UT

* a hacky fix for concat op

* Fix executor place

* Refine variable name

* Add elementwise_mul orig2prim and support p_norm when p=1

* Add sqrt orig2prim rule and UT

* merge wz test

* rename files, add enable_prim, disable_prim, prim_enabled, delete global_lower_update

* fix a bug in test_ad_transform_trans

* revert modify in framework.py

* add paddle.fluid.incubate.ad_transform to  python/setup.py.in

* Fix remove vars error

* Fix p_norm_orig2prim

* merge wz

* Modify the code directory

* Add utils.py and remove get_input/output_vars functions

* Update maolin code

* Rename UT and refine test_ad_transform_primops

* Fix div_p jvp rule

* Add higher derivatives UT

* Remove UT to autograd dir

* Fix comments

* import paddle in primops.py

* Add some error message for assert

* Refine UT class name and refine some comments in primreg.py

* update minimize of paddle/optimizer for supporting new autograd

* resolve circular importing between backward.py and optimizer.py

* fill gradients and minimize unittest

* Replace `assert isinstance` with `raise TypeError`

* Add some assert message for primx.py

* Polish variable name

* Add some assert message

* add some docstring

* refine some name

* update the format of english documents

* Split test_transform.py to two files to avoid ci error

* fix the document format of enable_prim/disable_prim/prim2orig/prim_enabled

* polish test_gradients_and_minimize

* add default value for prim_enabled api doc

* Remove some UT to avoid windows ci error

* Enlarge test_gradients_and_minimize limit time

* Fix ut limit time

Co-authored-by: veyron95 <veyron_wu@163.com>
Co-authored-by: Jiabin Yang <360788950@qq.com>
Co-authored-by: levi131 <limaolin01@baidu.com>
Co-authored-by: Tongxin Bai <waffle.bai@gmail.com>
Co-authored-by: Xiaoxu Chen <chenxx_id@163.com>
Co-authored-by: levi131 <83750468+levi131@users.noreply.github.com>

f6ee202f
