Commits · f0a6c1eb609b03404e25bdd0326b74e7a388cd37 · PaddlePaddle / Paddle-Lite

22 Oct, 2019 1 commit

Committed by TianXiaogang on Oct 22, 2019

* feat: add beam_search_special function to support nlp model

* fix: add beam_search_compute kernel input and output

* feat: add assign op & copy_compute kernel

* feat: add fill_const_batch_size_like op & kernel

* feat: add layer_norm op and kernel and ut

* fix: fix some bugs
    fix mul_op infer_shape bug when x_dim_idx = 2, x_dims.size()=3 & y_dim_idx = 1, y_dims.size()=2
    fix elementwise_compute bug when y axis is all 1
    fix beam_search choosing the wrong math_func
    fix layer_norm get attr bug
    fix fill_constant_batch_size_like shape_set bug

* feat: add gather op and kernel, and transform ut

* feats: add ops and fix bugs to support transformer op
       fix type_cast passes to skip `while`
       fix elementwise infer_shape bug when x.dims=3 and y.dims={1} & axis=0
       fix lookup_table compute bug
       fix read_from_array/beam_search/increment/compare/gather ops data_type problems

* fix:
    transformer ut add word read interface
    fix copy/gather/norm/layer_norm include path problem

* fix: debug info

* fix: fix input reshape bug

* fix: fix norm bug

* style: style fix & test=develop

* style: fix operators cmakelist

* style: fix operators cmakelist; test=develop

* fix and test=develop

* fix and test=develop

* style: style fix; test=develop

f0a6c1eb

16 Oct, 2019 1 commit

[framework][place] remove prefered_place and kHost in valid_places (#2192) · 3012088b

Committed by sangoly on Oct 16, 2019

* [framework][place] remove prefered_place, use place order in valid_place array instead test=develop

* remove kHost from valid_places test=develop

3012088b

15 Oct, 2019 1 commit

Fix quant dequant fuse pass (#2190) · 9cc7dfa8

Committed by juncaipeng on Oct 15, 2019

* fix bug for accessing the removed node, test=develop

9cc7dfa8

14 Oct, 2019 1 commit

Optimize quant_dequant_fuse_pass (#2169) · 253acb80

Committed by juncaipeng on Oct 14, 2019

* optimize quant_dequant_fuse_pass, test=develop

253acb80

27 Sep, 2019 1 commit

can run yolov3 fp32 on cuda devices (#2092) · 3d6d744f

Committed by Zhaolong Xing on Sep 27, 2019

* add conv int8 support (for the case where the input or output channel count is not a multiple of 4)
add add_kernel for cuda.

* can run yolov3 fp32
test=develop

* 1. fix bug with yolov3 run
test=develop

3d6d744f

26 Sep, 2019 1 commit

[Fc Fusion] fix fc fusion duplicative arguments bug test=develop (#2135) · 1e6bb8d5

Committed by sangoly on Sep 26, 2019

1e6bb8d5

23 Sep, 2019 1 commit

model_test add host place (#2109) · 8bee4e29

Committed by Wilber on Sep 23, 2019

8bee4e29

18 Sep, 2019 1 commit

modify the device binding logic of the pass, test=develop (#2060) · 3569483a

Committed by 石晓伟 on Sep 18, 2019

3569483a

13 Sep, 2019 1 commit

checkout if passes match targets and kernels, test=develop (#2035) · e27e1b08

Committed by 石晓伟 on Sep 13, 2019

* checkout if passes match targets and kernels, test=develop

* add pass_utils, test=develop

* fix lite/core/mir/pass_registry.h, test=develop

* improve code styles, test=develop

* fix spell error, test=develop

e27e1b08

12 Sep, 2019 1 commit

enable native compiling on raspberry pi and rk3399 (#2021) · e0b4b5c9

Committed by guofei on Sep 12, 2019

e0b4b5c9

11 Sep, 2019 2 commits

make passes related to the device type, test=develop (#2012) · 8ca10db8

Committed by 石晓伟 on Sep 11, 2019

* make passes related to the device type, test=develop

* improve tips, test=develop

8ca10db8

fix conv-act-fuse-pass when there is no "bias" (#2003) · 2dcff5ca

Committed by zhupengyang on Sep 11, 2019

test=develop

2dcff5ca

06 Sep, 2019 1 commit

add interpolate fuse pass (#1980) · c49958a2

Committed by zhupengyang on Sep 06, 2019

test=develop

c49958a2

30 Aug, 2019 1 commit

add precision and persistable attrs for the tensor. (#1899) · e2e07fa4

Committed by Zhen Wang on Aug 30, 2019

* Add precision and persistable attrs for the tensor. And fix cxx light and full api demo.

* update precision2string methods. test=develop

* move the save logic to the front of the run in mobilenetv1_full_api.cc, test=develop.

* add comments for UpdateVarsOfProgram. test=develop

e2e07fa4

29 Aug, 2019 2 commits

add conv2d transpose fuse (#1909) · 57ee8714

Committed by tensor-tang on Aug 29, 2019

57ee8714

Add yolo_box_cuda multiclass_nms_host kernel. (#1908) · de43e479

Committed by Wilber on Aug 29, 2019

* add yolo_box_compute cuda

* move multiclass_nms(arm) to host

* add lod in scale op

* add yolo_box_cuda cmake config

* modify shuffle_channel_fuse and transpose_softmax_transpose_fuse to support run ssd model. test=develop

* reshape and transpose op don't have xshape output.

* modify yolo_box_compute_cuda, use tensor to manage cuda memory test=develop

* add yolo_box use kernel test=develop

de43e479

28 Aug, 2019 2 commits

add transpose-softmax-transpose fuse pass (#1863) · 5e8b15f5

Committed by zhupengyang on Aug 28, 2019

* add transpose-softmax-transpose fuse pass

test=develop

* enable supported lite-npu ops

test=develop

5e8b15f5

[Protobuf] add combined-param model save/load supported test=develop (#1876) · 93950441

Committed by sangoly on Aug 27, 2019

93950441

23 Aug, 2019 1 commit

enable shuffle channel fuse (#1834) · 1ec18e53

Committed by tensor-tang on Aug 23, 2019

test=develop

1ec18e53

16 Aug, 2019 1 commit

publish lite (#1800) · 699d6cd0

Committed by Yan Chunwei on Aug 16, 2019

699d6cd0