Profiling results for sync data reader of DeepASR
Created by: pkuyym
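The test used real audio data and ran 10 batches with batch size = 32, with asynchronous reading and buffering removed. Conclusions:
- The main hotspot is trans add delta, which accounts for more than 80% of the time.
- The feeding operation takes very little time, so optimizing it can be deferred for now.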
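The split between reader time and feeding time can also be checked with plain timers around a synchronous loop. The sketch below is only an illustration written for Python 3: `time_sync_loop`, `fake_batches`, and the `feed` callback are hypothetical stand-ins, not part of the DeepASR code.

```python
import time
from collections import defaultdict


def time_sync_loop(batch_iter, feed, max_batches=10):
    """Accumulate wall-clock seconds spent fetching batches vs. feeding them."""
    totals = defaultdict(float)
    it = iter(batch_iter)
    for _ in range(max_batches):
        start = time.perf_counter()
        try:
            batch = next(it)      # all data processing happens inside the reader
        except StopIteration:
            break
        totals["read"] += time.perf_counter() - start

        start = time.perf_counter()
        feed(batch)               # stand-in for the tensor set / set_lod calls
        totals["feed"] += time.perf_counter() - start
        totals["batches"] += 1
    return dict(totals)


def fake_batches(n, delay=0.01):
    """Hypothetical reader: sleeps to imitate per-batch preprocessing cost."""
    for i in range(n):
        time.sleep(delay)
        yield [i] * 32


stats = time_sync_loop(fake_batches(5), feed=lambda batch: sum(batch))
print(stats)
```

With a real reader in place of `fake_batches`, a large gap between `stats["read"]` and `stats["feed"]` would point at data processing rather than feeding.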
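Timing analysis of the main function:
- The hotspot is data processing: data_reader.batch_iterator accounts for 86.8% of the total time.
- Feeding (the set and set_lod calls) takes about 0.1% of the time, so it can be ruled out as a risk.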
Total time: 109.838 s
File: profile_sync_reader.py
Function: profile_reader at line 33
Line #      Hits         Time    Per Hit   % Time  Line Contents
==============================================================
    33                                             @profile
    34                                             def profile_reader(epoch_num, batch_size):
    35         2         10.0        5.0      0.0      for epoch_id in xrange(epoch_num):
    36         1          4.0        4.0      0.0          for batch_id, one_batch in enumerate(
    37        11   95392189.0  8672017.2     86.8                  data_reader.batch_iterator(batch_size, batch_size)):
    38        11      46440.0     4221.8      0.0              (bat_feature, bat_label, lod) = one_batch
    39        11      83802.0     7618.4      0.1              res_feature.set(bat_feature, place)
    40        11        452.0       41.1      0.0              res_feature.set_lod([lod])
    41        11       1048.0       95.3      0.0              res_label.set(bat_label, place)
    42        11         90.0        8.2      0.0              res_label.set_lod([lod])
    43        11   14309543.0  1300867.5     13.0              time.sleep(1.3)
    44        11         61.0        5.5      0.0              if batch_id > 9:
    45         1       4296.0     4296.0      0.0                  break
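
Line 43 is a fixed time.sleep(1.3), so about 13% of the measured total is not real work. As a quick check, the shares can be recomputed with the sleep excluded. The snippet assumes the Time column is in microseconds, which is consistent with the reported total of 109.838 s but is not stated in the output itself.

```python
# Time column values copied from the profile above; the unit is assumed to be
# microseconds, which matches the reported total of 109.838 s.
US = 1e-6
total_s = 109.838
reader_s = 95392189.0 * US                       # line 37: data_reader.batch_iterator(...)
sleep_s = 14309543.0 * US                        # line 43: time.sleep(1.3)
feed_s = (83802.0 + 452.0 + 1048.0 + 90.0) * US  # lines 39-42: set / set_lod calls

work_s = total_s - sleep_s       # time left once the artificial sleep is removed
reader_share = reader_s / work_s
feed_share = feed_s / work_s
per_batch_s = reader_s / 11      # line 37 was hit 11 times

print(f"reader: {reader_share:.1%}, feed: {feed_share:.2%}, per batch: {per_batch_s:.2f} s")
```

Under that assumption the reader accounts for nearly all of the non-sleep time, at roughly 8.7 s per batch, while feeding stays below 0.1%.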