Data reading performance problem in the new Paddle API
Created by: reyoung
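I just ran a benchmark on the MNIST example, api_train_v2.py. The whole run took about 19 seconds, and about 16 of those seconds were spent in data conversion. The profile below shows roughly 12.5 of those seconds inside numpy's concatenate, reached through numpy.append.
The performance of that part is worth optimizing.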
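For anyone who wants to reproduce a listing in the format below, it can be generated with Python's cProfile and pstats modules. This is a minimal sketch, written for Python 3. It profiles a stand-in function (`convert_batches` is hypothetical, not Paddle code), dumps the raw stats to profile.bin, and prints them ordered by cumulative time. How the profile in this issue was actually collected is not stated, so treat the details as assumptions.

```python
import cProfile
import io
import pstats


def convert_batches(n_batches=100, batch_size=128):
    """Hypothetical stand-in workload; swap in the real training entry point."""
    total = 0
    for _ in range(n_batches):
        total += sum(range(batch_size))
    return total


def profile_to_text(func, stats_path="profile.bin", top=10):
    """Profile func(), dump raw stats to stats_path, and return a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    profiler.dump_stats(stats_path)

    # Load the dumped stats and format them sorted by cumulative time.
    out = io.StringIO()
    stats = pstats.Stats(stats_path, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_to_text(convert_batches))
```

A whole script can also be profiled without modifying it, using `python -m cProfile -o profile.bin api_train_v2.py`, and the resulting file loaded with `pstats.Stats("profile.bin")`.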
Tue Mar 14 16:47:32 2017 profile.bin
3191874 function calls (3182864 primitive calls) in 19.872 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 19.890 19.890 api_train_v2.py:1(<module>)
1 0.002 0.002 18.307 18.307 api_train_v2.py:58(main)
1 0.065 0.065 18.222 18.222 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/trainer.py:64(train)
1097 0.003 0.000 16.342 0.015 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/py_paddle/dataprovider_converter.py:174(__call__)
1097 0.077 0.000 16.339 0.015 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/data_feeder.py:91(convert)
1097 0.297 0.000 16.077 0.015 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/py_paddle/dataprovider_converter.py:154(convert)
140100 2.678 0.000 15.458 0.000 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/py_paddle/dataprovider_converter.py:55(scan)
139003 0.154 0.000 12.766 0.000 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/numpy/lib/function_base.py:4951(append)
139006 12.488 0.000 12.488 0.000 {numpy.core.multiarray.concatenate}
1 0.005 0.005 1.582 1.582 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/__init__.py:14(<module>)
1103 0.073 0.000 1.136 0.001 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/minibatch.py:30(batch_reader)
1 0.004 0.004 1.032 1.032 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/dataset/__init__.py:16(<module>)
1 0.001 0.001 0.941 0.941 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/dataset/sentiment.py:21(<module>)
1 0.015 0.015 0.940 0.940 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/nltk/__init__.py:17(<module>)
120002 0.122 0.000 0.922 0.000 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/reader/decorator.py:65(data_reader)
140105 0.545 0.000 0.834 0.000 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/dataset/mnist.py:38(reader)
1880 0.004 0.000 0.765 0.000 api_train_v2.py:92(event_handler)
2 0.009 0.005 0.739 0.370 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/trainer.py:125(test)
1 0.002 0.002 0.517 0.517 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/paddle/v2/optimizer.py:1(<module>)
1 0.001 0.001 0.481 0.481 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/py_paddle/__init__.py:15(<module>)
1 0.003 0.003 0.478 0.478 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/py_paddle/util.py:16(<module>)
309/30 0.022 0.000 0.452 0.015 {__import__}
1 0.003 0.003 0.445 0.445 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/nltk/collocations.py:25(<module>)
1 0.004 0.004 0.434 0.434 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/nltk/metrics/__init__.py:14(<module>)
1 0.002 0.002 0.428 0.428 /Users/baidu/tmp/.py_env/lib/python2.7/site-packages/nltk/metrics/scores.py:8(<module>)
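The `numpy.append` and `concatenate` rows above point at a well-known cost pattern: `numpy.append` returns a new array on every call, so appending once per sample copies everything accumulated so far. The sketch below contrasts that pattern with collecting rows in a Python list and building the array once. It is an illustration under assumptions, not the actual code in dataprovider_converter.py. The function names and the 784-wide float32 rows (MNIST-sized) are hypothetical.

```python
import numpy as np


def scan_with_append(samples, dim):
    """Grow the batch with np.append; each call copies the whole array so far."""
    batch = np.empty((0, dim), dtype=np.float32)
    for sample in samples:
        row = np.asarray(sample, dtype=np.float32).reshape(1, dim)
        batch = np.append(batch, row, axis=0)
    return batch


def scan_with_list(samples, dim):
    """Collect rows in a list and build the array once at the end."""
    rows = [np.asarray(sample, dtype=np.float32).reshape(dim) for sample in samples]
    if not rows:
        return np.empty((0, dim), dtype=np.float32)
    return np.stack(rows)


if __name__ == "__main__":
    import timeit

    # Hypothetical workload: 2000 MNIST-sized samples of random data.
    rng = np.random.default_rng(0)
    samples = [rng.random(784, dtype=np.float32) for _ in range(2000)]

    # Both strategies must produce the same batch.
    assert np.array_equal(scan_with_append(samples, 784),
                          scan_with_list(samples, 784))

    for fn in (scan_with_append, scan_with_list):
        seconds = timeit.timeit(lambda: fn(samples, 784), number=3)
        print(fn.__name__, round(seconds, 3))
```

If the batch size is known up front, preallocating the output array and writing each row into it by index is another option that avoids the repeated copies.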