Commits · f9077aa4847c5e5308280167c33046e7b51c0ec5 · PaddlePaddle / Paddle-Lite

23 Jul, 2020 1 commit
- update for subgraph. test=develop · f9077aa4
  Committed by jiweibo on Jul 23, 2020
22 Jul, 2020 2 commits

[Core] Add the graph optimization of subblocks for transformer model (#3947) · 7af1a258

Committed by hong19860320 on Jul 22, 2020

* [Core][ARM] Fix beam_search, eltwise_mul supports broadcast and int64_t data type, add print op and kernel, add exception
test=develop

* Fix the dims of parent idx of the arm kernel of beam_search op

* elementwise_mul supports int64_t data type with broadcasting

* Add print op and kernel for debugging

* Support throwing the exception when the internal error occurs

* Refine while and conditional_block op kernel

* Support the graph optimization on subblocks

* Pass program_desc and block_idx into the kernel of the control flow ops (while/conditional_block/subgraph), and create the RuntimeProgram online, which makes it possible to call the control flow ops recursively

* Add unit test for masked transformer model
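
The idea in the last design bullet above (hand a control flow op the program description plus a block index, and build the sub-program only when the op runs) can be sketched as follows. This is a toy model under stated assumptions, not Paddle-Lite code: `ToyRuntimeProgram`, `ProgramDesc`, `BlockDesc`, `Op` and `MakeWhileOp` are hypothetical names, and the whole runtime state is reduced to a single `int`.

```cpp
#include <functional>
#include <vector>

// Hypothetical toy model; none of these types are Paddle-Lite's real classes.
struct ProgramDesc;

// An op receives the whole program description so that a control flow op
// can reach its sub-block by index.
using Op = std::function<void(const ProgramDesc&, int& state)>;

struct BlockDesc {
  std::vector<Op> ops;
};

struct ProgramDesc {
  std::vector<BlockDesc> blocks;
};

// Runs the ops of one block, in order.
class ToyRuntimeProgram {
 public:
  ToyRuntimeProgram(const ProgramDesc& desc, int block_idx)
      : desc_(desc), block_idx_(block_idx) {}

  void Run(int& state) const {
    for (const auto& op : desc_.blocks[block_idx_].ops) {
      op(desc_, state);
    }
  }

 private:
  const ProgramDesc& desc_;
  int block_idx_;
};

// A "while" op that stores only the sub-block index. The sub-program is
// created when the op executes, so the sub-block may itself contain
// further control flow ops that do the same thing.
Op MakeWhileOp(int block_idx, std::function<bool(int)> cond) {
  return [block_idx, cond](const ProgramDesc& desc, int& state) {
    ToyRuntimeProgram sub(desc, block_idx);  // built at run time
    while (cond(state)) {
      sub.Run(state);
    }
  };
}
```

Because each control flow op builds its sub-program from `(desc, block_idx)` on demand, nesting works without any special casing: a sub-block that contains another `MakeWhileOp` simply repeats the same step one level down.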


update name format. test=develop · 4b54c594
Committed by jiweibo on Jul 22, 2020

17 Jul, 2020 1 commit
- update · dcde5c7c
  Committed by jiweibo on Jul 17, 2020
15 Jul, 2020 1 commit

update desc interfaces, test=develop (#3926) · 42ab4d55

Committed by 石晓伟 on Jul 15, 2020

* update desc interfaces, test=develop

* update desc interfaces, test=develop

* update compatible_pb.cc, test=develop

* fix build errors, test=develop

* remove the fstream to shrink the size of library, test=develop


07 Jul, 2020 1 commit
- cpp namespace alias, test=develop (#3894) · 4776f8f4
  Committed by 石晓伟 on Jul 07, 2020
17 Jun, 2020 1 commit
- reorganize stream. test=develop · 84bdddb2
  Committed by jiweibo on Jun 17, 2020
12 Jun, 2020 1 commit
- [LITE][PASS] Remove reshape2 / squeeze2 for tf_mobilenetv1/v2 (#3773) · 29771f27
  Committed by Yuan Shuai on Jun 12, 2020
```
* [LITE][PASS] Add pass for removing useless reshape2 / squeeze2. test=develop
```
11 Jun, 2020 1 commit
- [CUDA] [NVTX] Lite add nvtx to support performance debug. (#3764) · 6299a90a
  Committed by Wilber on Jun 11, 2020
09 Jun, 2020 1 commit
- [Parl] Add CxxPredictor->Clone() method (#3759) · 24d37695
  Committed by huzhiqiang on Jun 09, 2020
28 May, 2020 1 commit

[Libsize] Reduce size of dynamic library ".so" (#3717) · ec8ef528

Committed by T8T9 on May 28, 2020

* reduce .so size. test=develop

* compile all targets when LITE_ON_TINY_PUBLISH=OFF

* unordered_map is more convenient when key is customized class

* test=develop


18 May, 2020 1 commit

[LITE][OPENCL] Enhance Profiler for OpenCL with in/out/filter shape,... · 53a6f3bc

Committed by Yuan Shuai on May 18, 2020

[LITE][OPENCL] Enhance Profiler for OpenCL with in/out/filter shape, macs/macs_ps, real backend kernel etc. (#3641)

* [LITE][OPENCL] Enhance Precision Profiler for OpenCL. test=develop


13 Apr, 2020 1 commit
- lite cuda support exec multi-stream. (#2949) · 4a7284f9
  Committed by Wilber on Apr 13, 2020
```
lite cuda support exec multi-stream
```
30 Dec, 2019 1 commit
- Optimize the execution of RuntimeProgram by saving the bool whether the op is... · bb1cf7ff
  Committed by Yiqun Liu on Dec 30, 2019
```
Optimize the execution of RuntimeProgram by saving the bool whether the op is feed/fetch op. (#2703)

test=develop
```
27 Dec, 2019 1 commit
- update profiler, test=develop (#2644) · 9171b70e
  Committed by 石晓伟 on Dec 27, 2019
16 Dec, 2019 1 commit
- update profiler, test=develop (#2607) · af37a14f
  Committed by 石晓伟 on Dec 16, 2019
```
* update profiler, test=develop

* warm up times of profiler, test=develop
```
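10 Dec, 2019 1 commit

modify static_kernel_pass to support select the kernel according to input type (#2488) · 7ef0e7fe

Committed by Wilber on Dec 10, 2019

Modified the kernel selection logic: by default, the data type of each lod_tensor is read from the model file, and in the static_kernel_pick pass, a kernel whose input and output types exactly match the data types read becomes more likely to be selected.

- Read the lod_tensor data type from the model file __model__ into cpp::vardesc

- Add an unordered_map<string, type> field to program and assign it in Program::PrepareWorkspace

- Modify node.h, changing const Type* to Type*, and assign the qualifying type* values during SSAGraph::Build

- Add a new rule to static_kernel_pick_pass: if the kernel's input and output types match the types stored in __model__, then score *= 2.

- Support models that use the sequence_reverse_float kernel (float input and output) and the sequence_reverse_int64 kernel (int64 input and output), so the kernel can be selected according to the input/output type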
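
The scoring rule described above (double a kernel's score when its declared input/output types match the types recorded in the model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `DataType`, `KernelCandidate`, `TypesMatchModel`, `ScoreKernel` and `PickKernel` are hypothetical names, not the actual static_kernel_pick_pass code, and the real pass weighs other factors that are reduced to a single `base_score` here.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; these are not Paddle-Lite's real types.
enum class DataType { kFloat, kInt64 };

using TypeMap = std::unordered_map<std::string, DataType>;

struct KernelCandidate {
  std::string name;
  TypeMap io_types;   // declared type of each input/output variable
  double base_score;  // score from all other (unmodeled) rules
};

// True when every type the kernel declares equals the type recorded in the
// model for the same variable. A kernel that declares nothing matches
// trivially in this sketch.
bool TypesMatchModel(const KernelCandidate& k, const TypeMap& model_types) {
  for (const auto& kv : k.io_types) {
    auto it = model_types.find(kv.first);
    if (it == model_types.end() || it->second != kv.second) {
      return false;
    }
  }
  return true;
}

// Applies the "score *= 2" rule on top of the base score.
double ScoreKernel(const KernelCandidate& k, const TypeMap& model_types) {
  double score = k.base_score;
  if (TypesMatchModel(k, model_types)) {
    score *= 2;
  }
  return score;
}

// Returns the highest-scoring candidate, or nullptr if there are none.
const KernelCandidate* PickKernel(const std::vector<KernelCandidate>& candidates,
                                  const TypeMap& model_types) {
  const KernelCandidate* best = nullptr;
  double best_score = 0.0;
  for (const auto& c : candidates) {
    const double s = ScoreKernel(c, model_types);
    if (best == nullptr || s > best_score) {
      best = &c;
      best_score = s;
    }
  }
  return best;
}
```

With two otherwise equal candidates for sequence_reverse, one declaring float and one declaring int64, a model that records int64 for the variables makes the int64 candidate win, which is the behavior the last bullet describes.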


04 Dec, 2019 1 commit
- refactor profile tools, test=develop (#2536) · 8a634b71
  Committed by 石晓伟 on Dec 04, 2019
24 Oct, 2019 1 commit

Make inceptionv4, resnet50, googlenet can run on x86 platform (#2250) · edb4ea9a

Committed by liu zhengxi on Oct 24, 2019

* make inceptionv4, resnet50, googlenet can run on x86 platform and fix the compare part in x86 unittests, test=develop

* fix googlenet tests for benchmark record, test=develop

* [framework][profile] fix profile dump bug when op is feed and fetch test=develop (sangoly)


27 Sep, 2019 1 commit
- [Profile] add kernel runtime profile && add op runtime summary test=develop (#2136) · aa6623b8
  Committed by sangoly on Sep 27, 2019
30 Aug, 2019 1 commit

add precision and persistable attrs for the tensor. (#1899) · e2e07fa4

Committed by Zhen Wang on Aug 30, 2019

* Add precision and persistable attrs for the tensor. And fix cxx light and full api demo.

* update precision2string methods. test=develop

* move the save logic to the front of the run in mobilenetv1_full_api.cc, test=develop.

* add comments for UpdateVarsOfProgram. test=develop


16 Aug, 2019 1 commit
- publish lite (#1800) · 699d6cd0
  Committed by Yan Chunwei on Aug 16, 2019