Created by: wangchaochaohu
develop result:
-------------------------> Profiling Report <-------------------------
Place: All
Time unit: ms
Sorted by total time in descending order in the same thread
Total time: 1.59668
Computation time Total: 1.446 Ratio: 90.563%
Framework overhead Total: 0.150679 Ratio: 9.437%
------------------------- GpuMemCpy Summary -------------------------
GpuMemcpy Calls: 0 Total: 0 Ratio: 0%
------------------------- Event Summary -------------------------
Event Calls Total CPU Time (Ratio) GPU Time (Ratio) Min. Max. Ave. Ratio.
thread0::fill_constant 4 1.28716 1.287156 (1.000000) 0.000000 (0.000000) 0.177514 0.731956 0.321789 0.806144
thread0::fill_constant/fill_constant0 1 0.728399 0.728399 (1.000000) 0.000000 (0.000000) 0.728399 0.728399 0.728399 0.456195
thread0::fill_constant/fill_constant0/prepare_data 1 0.003698 0.003698 (1.000000) 0.000000 (0.000000) 0.003698 0.003698 0.003698 0.00231605
thread0::fill_constant/fill_constant0/infer_shape 1 0.010601 0.010601 (1.000000) 0.000000 (0.000000) 0.010601 0.010601 0.010601 0.00663939
thread0::fill_constant/fill_constant0/compute 1 0.687765 0.687765 (1.000000) 0.000000 (0.000000) 0.687765 0.687765 0.687765 0.430746
thread0::fill_constant/fill_constant1 1 0.195331 0.195331 (1.000000) 0.000000 (0.000000) 0.195331 0.195331 0.195331 0.122335
thread0::fill_constant/fill_constant1/prepare_data 1 0.001688 0.001688 (1.000000) 0.000000 (0.000000) 0.001688 0.001688 0.001688 0.00105719
thread0::fill_constant/fill_constant1/infer_shape 1 0.009121 0.009121 (1.000000) 0.000000 (0.000000) 0.009121 0.009121 0.009121 0.00571247
thread0::fill_constant/fill_constant1/compute 1 0.175273 0.175273 (1.000000) 0.000000 (0.000000) 0.175273 0.175273 0.175273 0.109773
thread0::fill_constant/fill_constant2 1 0.178678 0.178678 (1.000000) 0.000000 (0.000000) 0.178678 0.178678 0.178678 0.111906
thread0::fill_constant/fill_constant2/prepare_data 1 0.001057 0.001057 (1.000000) 0.000000 (0.000000) 0.001057 0.001057 0.001057 0.000661997
thread0::fill_constant/fill_constant2/infer_shape 1 0.001598 0.001598 (1.000000) 0.000000 (0.000000) 0.001598 0.001598 0.001598 0.00100082
thread0::fill_constant/fill_constant2/compute 1 0.17115 0.171150 (1.000000) 0.000000 (0.000000) 0.17115 0.17115 0.17115 0.107191
thread0::fill_constant/fill_constant3 1 0.17624 0.176240 (1.000000) 0.000000 (0.000000) 0.17624 0.17624 0.17624 0.110379
thread0::fill_constant/fill_constant3/prepare_data 1 0.001181 0.001181 (1.000000) 0.000000 (0.000000) 0.001181 0.001181 0.001181 0.000739658
thread0::fill_constant/fill_constant3/infer_shape 1 0.00151 0.001510 (1.000000) 0.000000 (0.000000) 0.00151 0.00151 0.00151 0.000945711
thread0::fill_constant/fill_constant3/compute 1 0.169149 0.169149 (1.000000) 0.000000 (0.000000) 0.169149 0.169149 0.169149 0.105938
thread0::sum 1 0.276922 0.276922 (1.000000) 0.000000 (0.000000) 0.276922 0.276922 0.276922 0.173436
thread0::sum/sum0 1 0.27089 0.270890 (1.000000) 0.000000 (0.000000) 0.27089 0.27089 0.27089 0.169658
thread0::sum/sum0/prepare_data 1 0.001829 0.001829 (1.000000) 0.000000 (0.000000) 0.001829 0.001829 0.001829 0.0011455
thread0::sum/sum0/infer_shape 1 0.010718 0.010718 (1.000000) 0.000000 (0.000000) 0.010718 0.010718 0.010718 0.00671267
thread0::sum/sum0/compute 1 0.242667 0.242667 (1.000000) 0.000000 (0.000000) 0.242667 0.242667 0.242667 0.151982
thread0::fetch 1 0.032605 0.032605 (1.000000) 0.000000 (0.000000) 0.032605 0.032605 0.032605 0.0204205
thread0::fetch/fetch0 1 0.031157 0.031157 (1.000000) 0.000000 (0.000000) 0.031157 0.031157 0.031157 0.0195136
PR result:
-------------------------> Profiling Report <-------------------------
Place: All
Time unit: ms
Sorted by total time in descending order in the same thread
Total time: 1.45377
Computation time Total: 1.28217 Ratio: 88.1963%
Framework overhead Total: 0.171599 Ratio: 11.8037%
------------------------- GpuMemCpy Summary -------------------------
GpuMemcpy Calls: 0 Total: 0 Ratio: 0%
------------------------- Event Summary -------------------------
Event Calls Total CPU Time (Ratio) GPU Time (Ratio) Min. Max. Ave. Ratio.
thread0::fill_constant 4 1.13864 1.138643 (1.000000) 0.000000 (0.000000) 0.181421 0.588494 0.284661 0.783233
fill_constant0 1 0.573295 0.573295 (1.000000) 0.000000 (0.000000) 0.573295 0.573295 0.573295 0.39435
fill_constant0/prepare_data 1 0.004323 0.004323 (1.000000) 0.000000 (0.000000) 0.004323 0.004323 0.004323 0.00297364
fill_constant0/infer_shape 1 0.010975 0.010975 (1.000000) 0.000000 (0.000000) 0.010975 0.010975 0.010975 0.00754932
fill_constant0/compute 1 0.528973 0.528973 (1.000000) 0.000000 (0.000000) 0.528973 0.528973 0.528973 0.363862
fill_constant1 1 0.180106 0.180106 (1.000000) 0.000000 (0.000000) 0.180106 0.180106 0.180106 0.123889
fill_constant1/prepare_data 1 0.001569 0.001569 (1.000000) 0.000000 (0.000000) 0.001569 0.001569 0.001569 0.00107926
fill_constant1/infer_shape 1 0.002525 0.002525 (1.000000) 0.000000 (0.000000) 0.002525 0.002525 0.002525 0.00173686
fill_constant1/compute 1 0.168136 0.168136 (1.000000) 0.000000 (0.000000) 0.168136 0.168136 0.168136 0.115655
fill_constant2 1 0.179999 0.179999 (1.000000) 0.000000 (0.000000) 0.179999 0.179999 0.179999 0.123815
fill_constant2/prepare_data 1 0.001182 0.001182 (1.000000) 0.000000 (0.000000) 0.001182 0.001182 0.001182 0.000813057
fill_constant2/infer_shape 1 0.001566 0.001566 (1.000000) 0.000000 (0.000000) 0.001566 0.001566 0.001566 0.0010772
fill_constant2/compute 1 0.172084 0.172084 (1.000000) 0.000000 (0.000000) 0.172084 0.172084 0.172084 0.118371
fill_constant3 1 0.185536 0.185536 (1.000000) 0.000000 (0.000000) 0.185536 0.185536 0.185536 0.127624
fill_constant3/prepare_data 1 0.001194 0.001194 (1.000000) 0.000000 (0.000000) 0.001194 0.001194 0.001194 0.000821311
fill_constant3/infer_shape 1 0.001528 0.001528 (1.000000) 0.000000 (0.000000) 0.001528 0.001528 0.001528 0.00105106
fill_constant3/compute 1 0.179033 0.179033 (1.000000) 0.000000 (0.000000) 0.179033 0.179033 0.179033 0.123151
thread0::sum 1 0.267111 0.267111 (1.000000) 0.000000 (0.000000) 0.267111 0.267111 0.267111 0.183736
sum0 1 0.261984 0.261984 (1.000000) 0.000000 (0.000000) 0.261984 0.261984 0.261984 0.18021
sum0/prepare_data 1 0.001607 0.001607 (1.000000) 0.000000 (0.000000) 0.001607 0.001607 0.001607 0.0011054
sum0/infer_shape 1 0.010654 0.010654 (1.000000) 0.000000 (0.000000) 0.010654 0.010654 0.010654 0.00732852
sum0/compute 1 0.233948 0.233948 (1.000000) 0.000000 (0.000000) 0.233948 0.233948 0.233948 0.160925
thread0::fetch 1 0.048019 0.048019 (1.000000) 0.000000 (0.000000) 0.048019 0.048019 0.048019 0.0330306
fetch0 1 0.034219 0.034219 (1.000000) 0.000000 (0.000000) 0.034219 0.034219 0.034219 0.0235381
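
To make the two reports easier to compare, the top-level totals can be turned into relative changes. The following is a minimal sketch, not part of the profiler: the dictionaries and the `pct_change` helper are hypothetical names introduced for illustration, and the values are copied by hand from the "Total time" lines and the `thread0::` rows above.

```python
# Top-level totals in ms, copied by hand from the two reports above.
# The dictionary names and pct_change() are illustrative, not a profiler API.
develop = {
    "total": 1.59668,
    "fill_constant": 1.28716,
    "sum": 0.276922,
    "fetch": 0.032605,
}
pr = {
    "total": 1.45377,
    "fill_constant": 1.13864,
    "sum": 0.267111,
    "fetch": 0.048019,
}


def pct_change(before, after):
    """Relative change from before to after, in percent (negative = faster)."""
    return (after - before) / before * 100.0


# Relative change per event, keyed by the same names as the inputs.
changes = {name: pct_change(develop[name], pr[name]) for name in develop}

for name, change in changes.items():
    print(f"{name:>14}: {develop[name]:.6f} ms -> {pr[name]:.6f} ms ({change:+.2f}%)")
```

Each report appears to come from a single run, so small differences (for example in `sum` or `fetch`) may be within run-to-run noise rather than an effect of the PR.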