Commits · 8bc1c82d88d705e436291b1450e1ba3ead5ae867 · Crayon鑫 / Paddle

Jun 16, 2022 · 8 commits
- Z
  [Inference] add squeeze2/unsqueeze2 trt layer (#42782) · 8bc1c82d
  Committed by zhoutianzi666 on Jun 16, 2022
```
* add squeeze2

* add squeeze

* add squeeze2,unsqueeze2

* merge develop

* fix format

* add conditions for squeeze2 and unsqueeze in op_teller

* merge develop

* add squeeze unsqueeze

* add squeeze unsqueeze

* add squeeze unsqueeze

* remove unsqueeze2_eltwise_fuse_pass

* add squeeze/unsqueeze
```
  8bc1c82d
- 津
  [inference]add unary trt convert (#43509) · 890c7315
  Committed by 津 on Jun 16, 2022
```
* add unary
```
  890c7315
- J
  
  Revert md-in-tensor refactoring (#43564) · 1ec626b1
  Committed by joanna.wozna.intel on Jun 16, 2022
  
  1ec626b1
- J
  
  fix for quant model (#43567) · 13ad8bde
  Committed by jakpiase on Jun 16, 2022
  
  13ad8bde
- Z
  
  remove fp16 support of depthwise_conv2d and add unittest for depthwise_conv2d, test=kunlun (#43483) · 6be3ee26
  Committed by zhangyikun02 on Jun 16, 2022
  
  6be3ee26
- R
  Support disable GC for some vars in interpretercore (#43546) · 4002f320
  Committed by Ruibiao Chen on Jun 16, 2022
```
* Support disable GC for some vars in standalone executor

* Setting skip_gc_vars in interprecore construction
```
  4002f320
- L
  fix xpu kp compilation (#43496) · 767efaca
  Committed by Leo Chen on Jun 16, 2022
```
* fix xpu kp compilation

* add depends
```
  767efaca
- L
  [new-exec] lazy creating work queue (#43551) · 238f82e6
  Committed by Leo Chen on Jun 16, 2022
```
* lazy creating work queue

* fix dry_run
```
  238f82e6
Jun 15, 2022 · 10 commits
- H
  
  op cache supports un-persistable attributes (#43221) · 2b5771c4
  Committed by huzhiqiang on Jun 15, 2022
  
  2b5771c4
- Y
  Optimize prod's python implementation for dygraph. (#43309) · 9b7126d0
  Committed by Yiqun Liu on Jun 15, 2022
```
* Optimize prod's python implementation for dygraph.

* Change key_dim to head_dim.

* Add comment in unittest.

* Disable TF32 in unittest.
```
  9b7126d0
- F
  
  [MLU] add size kernel for mlu (#43450) · 4d0ca02b
  Committed by fwenguang on Jun 15, 2022
  
  4d0ca02b
- F
  
  [MLU] add bce kernel for mlu (#43467) · 1dfa2d49
  Committed by fwenguang on Jun 15, 2022
  
  1dfa2d49
- F
  
  [MLU] add bce kernel (#43435) · 669d8689
  Committed by fwenguang on Jun 15, 2022
  
  669d8689
- G
  
  modify index dtype from int to int64_t of concat_and_split_functor (#43479) · 81abaaf5
  Committed by Guoxia Wang on Jun 15, 2022
  
  81abaaf5
- Z
  Rename yaml (#43470) · fcd32950
  Committed by zyfncg on Jun 15, 2022
```
* rename yaml file

* fix merge conflict

* fix infrt
```
  fcd32950
- add some kernels(csr*dense->csr, dense*dense->csr) of SparseTensor matmul (#42935) · 346efe96
  Committed by zhouweiwei2014 on Jun 15, 2022
```
* add some kernel(csr*dense->csr, dense*dense->csr) of SparseTensor matmul

* fix CI

* fix CI

* fix comment

* fix comment
```
  346efe96
- Y
  Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to... · 15577630
  Committed by Yiqun Liu on Jun 15, 2022
```
Use int64_t in GetGpuLaunchConfig1D and ElementwiseKernel as index type to support large tensor. (#43506)

* Change some data type from int to int64_t in GetGpuLaunchConfig1D to support large tensor.

* Use int64_t in ElementwiseKernel as index type to support large tensor.
```
  15577630
- R
  Refactor dynload/port.h (#43431) · 332fdd1e
  Committed by Ruibiao Chen on Jun 15, 2022
```
* Refactor port.h

* Remove some unnecessary code

* Fix CI errors
```
  332fdd1e
Jun 14, 2022 · 16 commits
- J
  [Eager] Fix edvr starganv2 (#43471) · c62a7e25
  Committed by Jiabin Yang on Jun 14, 2022
```
* fix starganv2

* fix starganv2 stop_gradient end error

* fix edvr_starganv2

* fix mul kernel to fix optional ddx

* fix typo
```
  c62a7e25
- R
  Support sequential run GPU OPs for standalone executor (#43243) · 8cec1271
  Committed by Ruibiao Chen on Jun 14, 2022
```
* Support sequential run for standalone executor

* Add UTs

* Fix test_standalone_multiply_write

* Remove unnecessary UTs
```
  8cec1271
- C
  
  [MLU] add mlu kernel for depthwise conv2d op (#43359) · 077f3788
  Committed by cambriconhsq on Jun 14, 2022
  
  077f3788
- Z
  [MLU]: add elementwise_max mlu kernel (#43365) · ceb6b3f1
  Committed by zhaoying9105 on Jun 14, 2022
```
* [MLU]: add elementwise_max mlu kernel

* [MLU]: add int32 support for elementwise maxk MLU kernel
```
  ceb6b3f1
- Z
  
  [MLU]: add log log10 log2 MLU kernel (#43360) · 4642e8c4
  Committed by zhaoying9105 on Jun 14, 2022
  
  4642e8c4
- T
  
  fix whl check (#43415) · b1f77b4d
  Committed by tianshuo78520a on Jun 14, 2022
  
  b1f77b4d
- S
  
  fix update loss scaling (#43487) · 0e6462d6
  Committed by sneaxiy on Jun 14, 2022
  
  0e6462d6
- Y
  
  [cuda graph] partial program with cuda graph under static mode (#43440) · d83d59dd
  Committed by Yuang Liu on Jun 14, 2022
  
  d83d59dd
- S
  [windows CI]open inference_ut in windows-inference pipeline (#43446) · 058c52b6
  Committed by Sing_chan on Jun 14, 2022
```
* open inference_ut;test=windows_ci_inference

* inference_ut need onnx;test=windows_ci_inference

* disable trt_split_converter_test; use higher parallel level

* too high parallel will cause ut timeout
```
  058c52b6
- Z
  
  fix compiling werror (#43337) · c6421019
  Committed by Zhang Jun on Jun 14, 2022
  
  c6421019
- X
  [ Make FLAGS_einsum_opt as default ] Einsum memory optimization (#43397) · 83abec60
  Committed by xiongkun on Jun 14, 2022
```
* change logic for optimize

* modifty

* optimize the backward speed of EinsumOp

* add cache optimizer for einsum op

* EinsumOp: fix new dygraph mode error

* fix bug

* change Cache->InnerCache

* fix code

* fix

* add nan inf utils for einsum op

* add as_extra

* memory optimizer for einsum

* update code
```
  83abec60
- S
  
  【code format check upgrade】 step3：enable clang-format sort these infrt files's headers (#43333) · 403b127b
  Committed by Sing_chan on Jun 14, 2022
  
  403b127b
- S
  
  【code format check upgrade】 step3：enable clang-format sort these cinn files's headers(#43329) · d14e3698
  Committed by Sing_chan on Jun 14, 2022
  
  d14e3698
- W
  fix cmake-lint problems. (#43406) · 59f89236
  Committed by Wilber on Jun 14, 2022
```
* cmake-lint

* update
```
  59f89236
- Z
  
  fix bug of infer shape for slice (#43443) · e0a01461
  Committed by zyfncg on Jun 14, 2022
  
  e0a01461
- J
  [Eager] Fix custom op error (#43463) · 42754088
  Committed by Jiabin Yang on Jun 14, 2022
```
* fix custom op error

* fix code error
```
  42754088
Jun 13, 2022 · 6 commits
- Q
  
  [MLU]add lookup_table_v2 op and fix amp feature of bert with mlu device (#43366) · 67bd5d9c
  Committed by qipengh on Jun 13, 2022
  
  67bd5d9c
- C
  
  add mlu interp_v2(nearest&bilinear). (#43383) · affe25b7
  Committed by Chenxiao Niu on Jun 13, 2022
  
  affe25b7
- C
  add serialization for new field in event node (#43405) · 360b8383
  Committed by chenjian on Jun 13, 2022
```
* add serialization for new field in event node

* fix a bug
```
  360b8383
- Z
  
  add only split (#43424) · 30b10630
  Committed by zhoutianzi666 on Jun 13, 2022
  
  30b10630
- 津
  
  [inference]add topk/topk_v2 trt convertor (#43368) · 65e86580
  Committed by 津 on Jun 13, 2022
  
  65e86580
- P
  
  Disable oneDNN adaptive pooling exhaustive check (#43236) · 4af7ebf4
  Committed by piotrekobi on Jun 13, 2022
  
  4af7ebf4

Crayon鑫 / Paddle is in sync with the fork's source project
