Benchmark of the refactored Paddle against TensorFlow/Paddle V2
Created by: dzhwinter
Here we run some experiments to benchmark our implementation against TensorFlow and Paddle V2. The experiments are based on Paddle commit d883547b.
TensorFlow uses its own random number generation algorithm, Philox, so we fetch the random values from Paddle and feed them into the TensorFlow variables. I compared the results op by op with TensorFlow and found that our conv2d implementation differs at the 6th to 8th decimal place, while all other parts are identical down to the last decimal place.
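An op-by-op comparison like the one described above can be sketched as follows. This is only an illustration, not the script used for this benchmark: the function names and the sample values in `paddle_out` and `tf_out` are hypothetical, and the outputs are assumed to have been flattened into plain lists of floats.

```python
import math


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same number of elements")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def matching_decimals(a, b, max_places=12):
    """Number of leading decimal places on which the two outputs agree.

    Derived from the largest absolute difference; identical outputs
    report max_places.
    """
    diff = max_abs_diff(a, b)
    if diff == 0.0:
        return max_places
    places = math.floor(-math.log10(diff))
    return max(0, min(max_places, places))


# Hypothetical outputs of one op from each framework (not real data).
paddle_out = [0.1234567, 0.5]
tf_out = [0.1234570, 0.5]

print("max abs diff:", max_abs_diff(paddle_out, tf_out))
print("matching decimals:", matching_decimals(paddle_out, tf_out))
```

Running such a check after every op makes it easy to spot which op first introduces a difference, as conv2d does here.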
TensorFlow is about 15x faster than our current refactored version.
The benchmark results are shown below. Both implementations reach the same accuracy:
pass=0, batch=0, loss=2.305631, error=0.882812
pass=0, batch=1, loss=2.244341, error=0.789062
pass=0, batch=2, loss=2.081296, error=0.617188
pass=0, batch=3, loss=1.913030, error=0.320312
pass=0, batch=4, loss=1.817348, error=0.398438
pass=0, batch=5, loss=1.599624, error=0.242188
pass=0, batch=6, loss=1.427543, error=0.273438
pass=0, batch=7, loss=1.283092, error=0.226562
pass=0, batch=8, loss=1.280328, error=0.296875
pass=0, batch=9, loss=1.048430, error=0.210938
pass=0, batch=10, loss=0.975700, error=0.257812
pass=0, batch=11, loss=0.861773, error=0.289062
pass=0, batch=12, loss=0.692513, error=0.171875
pass=0, batch=13, loss=0.541580, error=0.125000
pass=0, batch=14, loss=0.581401, error=0.132812
pass=0, batch=15, loss=0.626544, error=0.148438
pass=0, batch=16, loss=0.452217, error=0.101562
pass=0, batch=17, loss=0.463440, error=0.164062
Time consumption comparison:
Paddle
pass=2, batch=453, loss=0.024159, error=0.007812, elapse=0.031905
pass=2, batch=454, loss=0.006635, error=0.007812, elapse=0.032114
pass=2, batch=455, loss=0.009840, error=0.000000, elapse=0.031895
pass=2, batch=456, loss=0.006372, error=0.000000, elapse=0.031872
pass=2, batch=457, loss=0.004728, error=0.000000, elapse=0.031861
pass=2, batch=458, loss=0.014771, error=0.007812, elapse=0.031856
pass=2, batch=459, loss=0.007937, error=0.007812, elapse=0.031929
pass=2, batch=460, loss=0.000169, error=0.000000, elapse=0.031851
pass=2, batch=461, loss=0.000213, error=0.000000, elapse=0.031932
pass=2, batch=462, loss=0.002135, error=0.000000, elapse=0.031914
pass=2, batch=463, loss=0.017388, error=0.000000, elapse=0.032012
pass=2, batch=464, loss=0.030636, error=0.007812, elapse=0.031753
pass=2, batch=465, loss=0.000952, error=0.000000, elapse=0.031770
pass=2, batch=466, loss=0.148781, error=0.039062, elapse=0.031697
pass=2, batch=467, loss=0.001351, error=0.000000, elapse=0.031742
pass=2, batch=468, loss=0.254510, error=0.010417, elapse=0.025343
TensorFlow
pass=2, batch=453, loss=0.025126, error=0.007812, elapse=0.000567
pass=2, batch=454, loss=0.006477, error=0.007812, elapse=0.000425
pass=2, batch=455, loss=0.008151, error=0.000000, elapse=0.000505
pass=2, batch=456, loss=0.007689, error=0.000000, elapse=0.000589
pass=2, batch=457, loss=0.004428, error=0.000000, elapse=0.000453
pass=2, batch=458, loss=0.015366, error=0.007812, elapse=0.000412
pass=2, batch=459, loss=0.006853, error=0.000000, elapse=0.000644
pass=2, batch=460, loss=0.000175, error=0.000000, elapse=0.000405
pass=2, batch=461, loss=0.000173, error=0.000000, elapse=0.000764
pass=2, batch=462, loss=0.002133, error=0.000000, elapse=0.000473
pass=2, batch=463, loss=0.021018, error=0.000000, elapse=0.000493
pass=2, batch=464, loss=0.033293, error=0.007812, elapse=0.000556
pass=2, batch=465, loss=0.000927, error=0.000000, elapse=0.000469
pass=2, batch=466, loss=0.145350, error=0.046875, elapse=0.000645
pass=2, batch=467, loss=0.001364, error=0.000000, elapse=0.000793
pass=2, batch=468, loss=0.246977, error=0.010417, elapse=0.000297
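The per-batch logs above can be summarized with a small script. The sketch below is an assumption about how one might do it, not part of the original benchmark: `parse_line` and `mean_elapse` are hypothetical helpers, and only the first two lines of each log are pasted in, so the result describes that sample rather than the whole run.

```python
import re
from statistics import mean

# Matches the log format used above; the elapse field is optional.
LINE_RE = re.compile(
    r"pass=(\d+), batch=(\d+), loss=([\d.]+), error=([\d.]+)"
    r"(?:, elapse=([\d.]+))?"
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    pass_id, batch_id, loss, error, elapse = m.groups()
    return {
        "pass": int(pass_id),
        "batch": int(batch_id),
        "loss": float(loss),
        "error": float(error),
        "elapse": float(elapse) if elapse is not None else None,
    }


def mean_elapse(lines):
    """Average the elapse field over all lines that carry one."""
    records = (parse_line(line) for line in lines)
    values = [r["elapse"] for r in records if r and r["elapse"] is not None]
    if not values:
        raise ValueError("no elapse values found")
    return mean(values)


# A two-line sample from each log above.
paddle_log = [
    "pass=2, batch=453, loss=0.024159, error=0.007812, elapse=0.031905",
    "pass=2, batch=454, loss=0.006635, error=0.007812, elapse=0.032114",
]
tf_log = [
    "pass=2, batch=453, loss=0.025126, error=0.007812, elapse=0.000567",
    "pass=2, batch=454, loss=0.006477, error=0.007812, elapse=0.000425",
]

speedup = mean_elapse(paddle_log) / mean_elapse(tf_log)
print("Paddle mean elapse:", mean_elapse(paddle_log))
print("TensorFlow mean elapse:", mean_elapse(tf_log))
print("ratio on this sample:", speedup)
```

Feeding the full log files through `mean_elapse` instead of the pasted sample gives a figure for the entire run.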