Created by: lidanqing-intel
This PR improves the performance of density_prior_box_op, making it about 3 times faster.
- test machine: SKX 8180.
- test script: ./test_analyzer_resnet50 --infer_model=models/detect --gtest_filter=Analyzer_resnet50.profile_mkldnn --paddle_num_threads=4 --repeat=1 --batch_size=1 --profile
test=develop