diff --git a/benchmark/IntelOptimizedPaddle.md b/benchmark/IntelOptimizedPaddle.md index 8ee7fd28c58f2a2bcb82040eb824a37062bd4e9c..94a79c3c8ea3d5a8c50ba2b7782abec21e2cff13 100644 --- a/benchmark/IntelOptimizedPaddle.md +++ b/benchmark/IntelOptimizedPaddle.md @@ -22,6 +22,7 @@ On each machine, we will test and compare the performance of training on single #### Training Test on batch size 64, 128, 256 on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz +Pay attetion that the speed below includes forward, backward and parameter update time. So we can not directly compare the data with the benchmark of caffe `time` [command](https://github.com/PaddlePaddle/Paddle/blob/develop/benchmark/caffe/image/run.sh#L9), which only contain forward and backward. The updating time of parameter would become very heavy when the weight size are large, especially on alexnet. Input image size - 3 * 224 * 224, Time: images/second @@ -55,6 +56,16 @@ Input image size - 3 * 224 * 224, Time: images/second +- Alexnet + +| BatchSize | 64 | 128 | 256 | +|--------------|--------| ------ | -------| +| OpenBLAS | 0.85 | 1.03 | 1.17 | +| MKLML | 71.26 | 106.94 | 155.18 | +| MKL-DNN     | 362.66 | 497.66 | 610.73 | + +chart TBD + #### Inference Test on batch size 1, 2, 4, 8, 16 on Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz - VGG-19