Created by: wangchaochaohu
Profiler using example code This PR:
-------------------------> Profiling Report <-------------------------
Note! This Report merge all thread info into one.
Place: All
Time unit: ms
Sorted by total time in descending order in the same thread
Total time: 33.4768
Computation time Total: 9.54767 Ratio: 28.5203%
Framework overhead Total: 23.9291 Ratio: 71.4797%
------------------------- GpuMemCpy Summary -------------------------
GpuMemcpy Calls: 12 Total: 1.06891 Ratio: 3.193%
GpuMemcpyAsync Calls: 12 Total: 1.06891 Ratio: 3.193%
------------------------- Event Summary -------------------------
Event Calls Total CPU Time (Ratio) GPU Time (Ratio) Min. Max. Ave. Ratio.
ParallelExecutor::Run 1 33.3599 33.267285 (0.997224) 0.092608 (0.002776) 33.3599 33.3599 33.3599 0.996509
FastThreadedSSAGraphExecutorPrepare 1 26.0902 25.997574 (0.996450) 0.092608 (0.003550) 26.0902 26.0902 26.0902 0.779352
fused_all_reduce 1 8.76217 2.480211 (0.283059) 6.281961 (0.716941) 8.76217 8.76217 8.76217 0.261739
mul 4 3.47151 3.442485 (0.991639) 0.029024 (0.008361) 0.256854 1.33485 0.867877 0.103699
mean 4 2.82855 2.817609 (0.996131) 0.010944 (0.003869) 0.289439 1.0633 0.707138 0.084493
elementwise_add_grad 4 2.07373 2.061025 (0.993874) 0.012704 (0.006126) 0.050706 1.15218 0.518432 0.0619453
mean_grad 4 1.83154 1.823063 (0.995370) 0.008480 (0.004630) 0.201255 0.818809 0.457886 0.0547109
Scale LossGrad 4 1.7562 1.748869 (0.995827) 0.007328 (0.004173) 0.281598 0.888646 0.439049 0.0524602
GpuMemcpyAsync:CPU->GPU 4 0.709247 0.701919 (0.989668) 0.007328 (0.010332) 0.042646 0.26316 0.177312 0.0211862
sgd 8 1.55178 1.538021 (0.991133) 0.013760 (0.008867) 0.052253 0.349815 0.193973 0.046354
mul_grad 4 0.889237 0.861365 (0.968656) 0.027872 (0.031344) 0.174724 0.343964 0.222309 0.0265628
InitLocalVars 1 0.57186 0.571860 (1.000000) 0.000000 (0.000000) 0.57186 0.57186 0.57186 0.0170823
coalesce_tensor 4 0.473217 0.473217 (1.000000) 0.000000 (0.000000) 0.084212 0.199022 0.118304 0.0141357
Fetch 1 0.381305 0.367929 (0.964920) 0.013376 (0.035080) 0.381305 0.381305 0.381305 0.0113901
GpuMemcpyAsync:GPU->CPU 4 0.24279 0.229414 (0.944907) 0.013376 (0.055093) 0.035801 0.131014 0.0606975 0.00725249
elementwise_add 4 0.363059 0.352307 (0.970385) 0.010752 (0.029615) 0.05774 0.113593 0.0907648 0.0108451
eager_deletion 12 0.027635 0.027635 (1.000000) 0.000000 (0.000000) 0.001541 0.005127 0.00230292 0.000825498
ScopeBufferedMonitor::post_local_exec_scopes_process 1 0.012179 0.012179 (1.000000) 0.000000 (0.000000) 0.012179 0.012179 0.012179 0.000363805
ScopeBufferedMonitor::pre_local_exec_scopes_process 1 0.004394 0.004394 (1.000000) 0.000000 (0.000000) 0.004394 0.004394 0.004394 0.000131255
GpuMemcpyAsync:CPU->GPU 4 0.116875 0.102316 (0.875431) 0.014559 (0.124569) 0.020035 0.053887 0.0292188 0.00349123
W0518 07:13:48.128401 80971 profiler_helper.h:384] Cannot find the push marker of event 'ParallelExecutor::Run', which will be ignored in profiling report.
-------------------------> Profiling Report <-------------------------
Place: All
Time unit: ms
Sorted by total time in descending order in the same thread
------------------------- Event Summary -------------------------
Event Calls Total CPU Time (Ratio) GPU Time (Ratio) Min. Max. Ave. Ratio.
thread16::sgd 1 0.163545 0.161881 (0.989825) 0.001664 (0.010175) 0.163545 0.163545 0.163545 1
thread15::sgd 1 0.268643 0.266531 (0.992138) 0.002112 (0.007862) 0.268643 0.268643 0.268643 1
thread14::sgd 1 0.316286 0.314942 (0.995751) 0.001344 (0.004249) 0.316286 0.316286 0.316286 1
thread13::sgd 1 0.197217 0.195873 (0.993185) 0.001344 (0.006815) 0.197217 0.197217 0.197217 1
thread12::sgd 1 0.349815 0.348439 (0.996066) 0.001376 (0.003934) 0.349815 0.349815 0.349815 1
thread11::mean 1 0.570155 0.567435 (0.995229) 0.002720 (0.004771) 0.570155 0.570155 0.570155 0.598273
thread11::Fetch 1 0.381305 0.367929 (0.964920) 0.013376 (0.035080) 0.381305 0.381305 0.381305 0.40011
GpuMemcpyAsync:GPU->CPU 4 0.24279 0.229414 (0.944907) 0.013376 (0.055093) 0.035801 0.131014 0.0606975 0.254764
thread11::eager_deletion 1 0.001541 0.001541 (1.000000) 0.000000 (0.000000) 0.001541 0.001541 0.001541 0.001617
thread10::mean 1 0.905661 0.902813 (0.996855) 0.002848 (0.003145) 0.905661 0.905661 0.905661 0.997617
thread10::eager_deletion 1 0.002163 0.002163 (1.000000) 0.000000 (0.000000) 0.002163 0.002163 0.002163 0.00238262
thread9::mean 1 1.0633 1.060802 (0.997653) 0.002496 (0.002347) 1.0633 1.0633 1.0633 0.997736
thread9::eager_deletion 1 0.002413 0.002413 (1.000000) 0.000000 (0.000000) 0.002413 0.002413 0.002413 0.00226422
thread8::Scale LossGrad 1 0.281598 0.279774 (0.993523) 0.001824 (0.006477) 0.281598 0.281598 0.281598 1
GpuMemcpyAsync:CPU->GPU 1 0.26316 0.261336 (0.993069) 0.001824 (0.006931) 0.26316 0.26316 0.26316 0.934524
thread7::Scale LossGrad 1 0.291794 0.289906 (0.993530) 0.001888 (0.006470) 0.291794 0.291794 0.291794 0.669787
GpuMemcpyAsync:CPU->GPU 1 0.190436 0.188548 (0.990086) 0.001888 (0.009914) 0.190436 0.190436 0.190436 0.437129
thread7::sgd 1 0.143858 0.142162 (0.988211) 0.001696 (0.011789) 0.143858 0.143858 0.143858 0.330213
thread6::elementwise_add_grad 1 1.15218 1.148913 (0.997167) 0.003264 (0.002833) 1.15218 1.15218 1.15218 0.52329
thread6::mean_grad 1 0.574346 0.572202 (0.996267) 0.002144 (0.003733) 0.574346 0.574346 0.574346 0.260853
thread6::Scale LossGrad 1 0.294159 0.292335 (0.993799) 0.001824 (0.006201) 0.294159 0.294159 0.294159 0.1336
GpuMemcpyAsync:CPU->GPU 1 0.213005 0.211181 (0.991437) 0.001824 (0.008563) 0.213005 0.213005 0.213005 0.0967415
thread6::mul_grad 1 0.176503 0.169527 (0.960477) 0.006976 (0.039523) 0.176503 0.176503 0.176503 0.0801632
thread6::eager_deletion 2 0.004611 0.004611 (1.000000) 0.000000 (0.000000) 0.001926 0.002685 0.0023055 0.0020942
thread5::mean_grad 1 0.818809 0.816825 (0.997577) 0.001984 (0.002423) 0.818809 0.818809 0.818809 0.341401
thread5::mul 1 0.658406 0.651334 (0.989259) 0.007072 (0.010741) 0.658406 0.658406 0.658406 0.274521
thread5::elementwise_add_grad 1 0.629194 0.626154 (0.995168) 0.003040 (0.004832) 0.629194 0.629194 0.629194 0.262341
thread5::mul_grad 1 0.174724 0.168196 (0.962638) 0.006528 (0.037362) 0.174724 0.174724 0.174724 0.0728508
thread5::elementwise_add 1 0.113593 0.111129 (0.978309) 0.002464 (0.021691) 0.113593 0.113593 0.113593 0.0473624
thread5::eager_deletion 2 0.003654 0.003654 (1.000000) 0.000000 (0.000000) 0.001822 0.001832 0.001827 0.00152353
thread4::fused_all_reduce 1 8.76217 2.480211 (0.283059) 6.281961 (0.716941) 8.76217 8.76217 8.76217 0.808448
thread4::mul 1 1.33485 1.327523 (0.994510) 0.007328 (0.005490) 1.33485 1.33485 1.33485 0.123161
thread4::mul_grad 1 0.343964 0.336732 (0.978975) 0.007232 (0.021025) 0.343964 0.343964 0.343964 0.0317361
thread4::mean_grad 1 0.201255 0.199079 (0.989188) 0.002176 (0.010812) 0.201255 0.201255 0.201255 0.0185689
thread4::elementwise_add 1 0.089355 0.086731 (0.970634) 0.002624 (0.029366) 0.089355 0.089355 0.089355 0.0082444
thread4::sgd 1 0.052253 0.050109 (0.958969) 0.002144 (0.041031) 0.052253 0.052253 0.052253 0.00482116
thread4::elementwise_add_grad 1 0.050706 0.047506 (0.936891) 0.003200 (0.063109) 0.050706 0.050706 0.050706 0.00467843
thread4::eager_deletion 2 0.003706 0.003706 (1.000000) 0.000000 (0.000000) 0.001745 0.001961 0.001853 0.000341937
thread3::mul 1 1.2214 1.213974 (0.993922) 0.007424 (0.006078) 1.2214 1.2214 1.2214 0.624312
thread3::elementwise_add_grad 1 0.241652 0.238452 (0.986758) 0.003200 (0.013242) 0.241652 0.241652 0.241652 0.123519
thread3::mean_grad 1 0.237133 0.234957 (0.990824) 0.002176 (0.009176) 0.237133 0.237133 0.237133 0.12121
thread3::mul_grad 1 0.194046 0.186910 (0.963225) 0.007136 (0.036775) 0.194046 0.194046 0.194046 0.0991858
thread3::elementwise_add 1 0.05774 0.054828 (0.949567) 0.002912 (0.050433) 0.05774 0.05774 0.05774 0.0295136
thread3::eager_deletion 2 0.00442 0.004420 (1.000000) 0.000000 (0.000000) 0.002201 0.002219 0.00221 0.00225926
thread2::Scale LossGrad 1 0.888646 0.886854 (0.997983) 0.001792 (0.002017) 0.888646 0.888646 0.888646 1
GpuMemcpyAsync:CPU->GPU 1 0.042646 0.040854 (0.957980) 0.001792 (0.042020) 0.042646 0.042646 0.042646 0.0479899
thread1::mean 1 0.289439 0.286559 (0.990050) 0.002880 (0.009950) 0.289439 0.289439 0.289439 0.405402
thread1::mul 1 0.256854 0.249654 (0.971969) 0.007200 (0.028031) 0.256854 0.256854 0.256854 0.359762
thread1::elementwise_add 1 0.102371 0.099619 (0.973117) 0.002752 (0.026883) 0.102371 0.102371 0.102371 0.143386
thread1::sgd 1 0.060164 0.058084 (0.965428) 0.002080 (0.034572) 0.060164 0.060164 0.060164 0.0842686
thread1::eager_deletion 1 0.005127 0.005127 (1.000000) 0.000000 (0.000000) 0.005127 0.005127 0.005127 0.00718112
thread0::FastThreadedSSAGraphExecutorPrepare 1 26.0902 25.997574 (0.996450) 0.092608 (0.003550) 26.0902 26.0902 26.0902 0.973678
thread0::InitLocalVars 1 0.57186 0.571860 (1.000000) 0.000000 (0.000000) 0.57186 0.57186 0.57186 0.0213417
coalesce_tensor 4 0.473217 0.473217 (1.000000) 0.000000 (0.000000) 0.084212 0.199022 0.118304 0.0176603
thread0::GpuMemcpyAsync:CPU->GPU 4 0.116875 0.102316 (0.875431) 0.014559 (0.124569) 0.020035 0.053887 0.0292188 0.00436174
thread0::ScopeBufferedMonitor::post_local_exec_scopes_process 1 0.012179 0.012179 (1.000000) 0.000000 (0.000000) 0.012179 0.012179 0.012179 0.000454517
thread0::ScopeBufferedMonitor::pre_local_exec_scopes_process 1 0.004394 0.004394 (1.000000) 0.000000 (0.000000) 0.004394 0.004394 0.004394 0.000163983
develop
-------------------------> Profiling Report <-------------------------
Note! This Report merge all thread info into one.
Place: All
Time unit: ms
Sorted by total time in descending order in the same thread
Total time: 57.2235
Computation time Total: 15.9601 Ratio: 27.8908%
Framework overhead Total: 41.2634 Ratio: 72.1092%
------------------------- GpuMemCpy Summary -------------------------
GpuMemcpy Calls: 12 Total: 1.21895 Ratio: 2.13016%
GpuMemcpyAsync Calls: 12 Total: 1.21895 Ratio: 2.13016%
------------------------- Event Summary -------------------------
Event Calls Total CPU Time (Ratio) GPU Time (Ratio) Min. Max. Ave. Ratio.
FastThreadedSSAGraphExecutorPrepare 1 27.1881 27.091693 (0.996455) 0.096383 (0.003545) 27.1881 27.1881 27.1881 0.475121
mean 4 10.4217 10.399238 (0.997845) 0.022463 (0.002155) 1.09743 3.72905 2.60543 0.182123
fused_all_reduce 1 8.69537 2.343391 (0.269499) 6.351975 (0.730501) 8.69537 8.69537 8.69537 0.151955
mul 4 4.26694 4.237721 (0.993153) 0.029216 (0.006847) 0.622723 1.72138 1.06673 0.0745662
Scale LossGrad 4 3.11722 3.107936 (0.997023) 0.009280 (0.002977) 0.505603 0.953103 0.779304 0.0544744
GpuMemcpyAsync:CPU->GPU 4 0.866906 0.857626 (0.989295) 0.009280 (0.010705) 0.029891 0.529066 0.216727 0.0151495
mul_grad 4 1.27391 1.245369 (0.977593) 0.028544 (0.022407) 0.063961 1.06414 0.318478 0.0222621
InitLocalVars 1 0.580126 0.580126 (1.000000) 0.000000 (0.000000) 0.580126 0.580126 0.580126 0.0101379
coalesce_tensor 4 0.476745 0.476745 (1.000000) 0.000000 (0.000000) 0.073257 0.221552 0.119186 0.00833129
sgd 8 0.427709 0.412989 (0.965584) 0.014720 (0.034416) 0.0257 0.131092 0.0534636 0.00747436
Fetch 1 0.366278 0.345606 (0.943562) 0.020672 (0.056438) 0.366278 0.366278 0.366278 0.00640084
GpuMemcpyAsync:GPU->CPU 4 0.215761 0.195089 (0.904190) 0.020672 (0.095810) 0.034806 0.096327 0.0539403 0.0037705
elementwise_add 4 0.321055 0.310335 (0.966610) 0.010720 (0.033390) 0.064491 0.097086 0.0802637 0.00561055
mean_grad 4 0.199697 0.191122 (0.957060) 0.008575 (0.042940) 0.039939 0.061745 0.0499243 0.00348977
elementwise_add_grad 4 0.17598 0.163564 (0.929447) 0.012416 (0.070553) 0.037632 0.053681 0.043995 0.00307531
GpuMemcpyAsync:CPU->GPU 4 0.136287 0.122559 (0.899271) 0.013728 (0.100729) 0.023804 0.060851 0.0340717 0.00238166
eager_deletion 12 0.039494 0.039494 (1.000000) 0.000000 (0.000000) 0.00112 0.007255 0.00329117 0.000690171
ScopeBufferedMonitor::post_local_exec_scopes_process 1 0.010934 0.010934 (1.000000) 0.000000 (0.000000) 0.010934 0.010934 0.010934 0.000191075
ScopeBufferedMonitor::pre_local_exec_scopes_process 1 0.002692 0.002692 (1.000000) 0.000000 (0.000000) 0.002692 0.002692 0.002692 4.70436e-05
-------------------------> Profiling Report <-------------------------
Place: All
Time unit: ms
Sorted by total time in descending order in the same thread
------------------------- Event Summary -------------------------
Event Calls Total CPU Time (Ratio) GPU Time (Ratio) Min. Max. Ave. Ratio.
thread12::sgd 1 0.131092 0.129524 (0.988039) 0.001568 (0.011961) 0.131092 0.131092 0.131092 1
thread11::sgd 2 0.099652 0.096196 (0.965319) 0.003456 (0.034681) 0.033393 0.066259 0.049826 1
thread10::mean 1 2.6048 2.598558 (0.997604) 0.006240 (0.002396) 2.6048 2.6048 2.6048 0.875974
thread10::Fetch 1 0.366278 0.345606 (0.943562) 0.020672 (0.056438) 0.366278 0.366278 0.366278 0.123177
GpuMemcpyAsync:GPU->CPU 4 0.215761 0.195089 (0.904190) 0.020672 (0.095810) 0.034806 0.096327 0.0539403 0.0725588
thread10::eager_deletion 1 0.002525 0.002525 (1.000000) 0.000000 (0.000000) 0.002525 0.002525 0.002525 0.000849139
thread9::mean 1 2.99042 2.985007 (0.998192) 0.005408 (0.001808) 2.99042 2.99042 2.99042 0.999377
thread9::eager_deletion 1 0.001864 0.001864 (1.000000) 0.000000 (0.000000) 0.001864 0.001864 0.001864 0.000622937
thread8::mul 1 1.25035 1.243022 (0.994139) 0.007328 (0.005861) 1.25035 1.25035 1.25035 0.850321
thread8::mul_grad 1 0.071167 0.063903 (0.897930) 0.007264 (0.102070) 0.071167 0.071167 0.071167 0.0483983
thread8::elementwise_add 1 0.064491 0.061771 (0.957824) 0.002720 (0.042176) 0.064491 0.064491 0.064491 0.0438582
thread8::mean_grad 1 0.039939 0.037731 (0.944716) 0.002208 (0.055284) 0.039939 0.039939 0.039939 0.0271612
thread8::elementwise_add_grad 1 0.037632 0.034528 (0.917517) 0.003104 (0.082483) 0.037632 0.037632 0.037632 0.0255923
thread8::eager_deletion 2 0.006866 0.006866 (1.000000) 0.000000 (0.000000) 0.00112 0.005746 0.003433 0.00466933
thread7::mean 1 3.72905 3.721408 (0.997949) 0.007647 (0.002051) 3.72905 3.72905 3.72905 0.83005
thread7::mul 1 0.672486 0.665222 (0.989198) 0.007264 (0.010802) 0.672486 0.672486 0.672486 0.149689
thread7::elementwise_add 1 0.084597 0.081717 (0.965956) 0.002880 (0.034044) 0.084597 0.084597 0.084597 0.0188304
thread7::eager_deletion 1 0.006427 0.006427 (1.000000) 0.000000 (0.000000) 0.006427 0.006427 0.006427 0.00143059
thread6::fused_all_reduce 1 8.69537 2.343391 (0.269499) 6.351975 (0.730501) 8.69537 8.69537 8.69537 0.733848
thread6::mul 1 1.72138 1.713794 (0.995594) 0.007584 (0.004406) 1.72138 1.72138 1.72138 0.145276
thread6::mul_grad 1 1.06414 1.056752 (0.993054) 0.007392 (0.006946) 1.06414 1.06414 1.06414 0.0898088
thread6::sgd 5 0.196965 0.187269 (0.950773) 0.009696 (0.049227) 0.0257 0.064178 0.039393 0.0166229
thread6::elementwise_add 1 0.074881 0.072321 (0.965812) 0.002560 (0.034188) 0.074881 0.074881 0.074881 0.00631961
thread6::mean_grad 1 0.044631 0.042423 (0.950528) 0.002208 (0.049472) 0.044631 0.044631 0.044631 0.00376665
thread6::elementwise_add_grad 1 0.041424 0.038000 (0.917343) 0.003424 (0.082657) 0.041424 0.041424 0.041424 0.00349599
thread6::eager_deletion 2 0.010209 0.010209 (1.000000) 0.000000 (0.000000) 0.002954 0.007255 0.0051045 0.000861592
thread5::Scale LossGrad 1 0.953103 0.951247 (0.998053) 0.001856 (0.001947) 0.953103 0.953103 0.953103 0.852902
GpuMemcpyAsync:CPU->GPU 1 0.029891 0.028035 (0.937908) 0.001856 (0.062092) 0.029891 0.029891 0.029891 0.0267485
thread5::mul_grad 1 0.063961 0.057369 (0.896937) 0.006592 (0.103063) 0.063961 0.063961 0.063961 0.0572367
thread5::mean_grad 1 0.053382 0.051398 (0.962834) 0.001984 (0.037166) 0.053382 0.053382 0.053382 0.0477699
thread5::elementwise_add_grad 1 0.043243 0.040427 (0.934880) 0.002816 (0.065120) 0.043243 0.043243 0.043243 0.0386968
thread5::eager_deletion 2 0.003794 0.003794 (1.000000) 0.000000 (0.000000) 0.001456 0.002338 0.001897 0.00339513
thread4::Scale LossGrad 1 0.505603 0.503811 (0.996456) 0.001792 (0.003544) 0.505603 0.505603 0.505603 1
GpuMemcpyAsync:CPU->GPU 1 0.272054 0.270262 (0.993413) 0.001792 (0.006587) 0.272054 0.272054 0.272054 0.538078
thread3::mean 1 1.09743 1.094265 (0.997113) 0.003168 (0.002887) 1.09743 1.09743 1.09743 0.603191
thread3::mul 1 0.622723 0.615683 (0.988695) 0.007040 (0.011305) 0.622723 0.622723 0.622723 0.342272
thread3::elementwise_add 1 0.097086 0.094526 (0.973632) 0.002560 (0.026368) 0.097086 0.097086 0.097086 0.0533622
thread3::eager_deletion 1 0.002137 0.002137 (1.000000) 0.000000 (0.000000) 0.002137 0.002137 0.002137 0.00117458
thread2::Scale LossGrad 1 0.749638 0.747526 (0.997183) 0.002112 (0.002817) 0.749638 0.749638 0.749638 1
GpuMemcpyAsync:CPU->GPU 1 0.529066 0.526954 (0.996008) 0.002112 (0.003992) 0.529066 0.529066 0.529066 0.705762
thread1::Scale LossGrad 1 0.908872 0.905352 (0.996127) 0.003520 (0.003873) 0.908872 0.908872 0.908872 0.822798
GpuMemcpyAsync:CPU->GPU 1 0.035895 0.032375 (0.901936) 0.003520 (0.098064) 0.035895 0.035895 0.035895 0.0324956
thread1::mul_grad 1 0.074641 0.067345 (0.902252) 0.007296 (0.097748) 0.074641 0.074641 0.074641 0.0675722
thread1::mean_grad 1 0.061745 0.059570 (0.964774) 0.002175 (0.035226) 0.061745 0.061745 0.061745 0.0558975
thread1::elementwise_add_grad 1 0.053681 0.050609 (0.942773) 0.003072 (0.057227) 0.053681 0.053681 0.053681 0.0485972
thread1::eager_deletion 2 0.005672 0.005672 (1.000000) 0.000000 (0.000000) 0.001551 0.004121 0.002836 0.00513484
thread0::FastThreadedSSAGraphExecutorPrepare 1 27.1881 27.091693 (0.996455) 0.096383 (0.003545) 27.1881 27.1881 27.1881 0.973851
thread0::InitLocalVars 1 0.580126 0.580126 (1.000000) 0.000000 (0.000000) 0.580126 0.580126 0.580126 0.0207796
coalesce_tensor 4 0.476745 0.476745 (1.000000) 0.000000 (0.000000) 0.073257 0.221552 0.119186 0.0170765
thread0::GpuMemcpyAsync:CPU->GPU 4 0.136287 0.122559 (0.899271) 0.013728 (0.100729) 0.023804 0.060851 0.0340717 0.00488167
thread0::ScopeBufferedMonitor::post_local_exec_scopes_process 1 0.010934 0.010934 (1.000000) 0.000000 (0.000000) 0.010934 0.010934 0.010934 0.000391645
thread0::ScopeBufferedMonitor::pre_local_exec_scopes_process 1 0.002692 0.002692 (1.000000) 0.000000 (0.000000) 0.002692 0.002692 0.002692 9.64248e-05
share_vars_from is set, scope is ignored.