Task for v1.5: Optimize BERT L12 (dim=768)
Created by: jianhang-liu
Optimize the BERT base model (12 layers, hidden dim=768).
Target HW: Intel Xeon E5-2650 v4
What was done in Q1:
- ERF optimization with MKL - merged
- Framework-level optimization (RunTime Context cache, kernel selection cache, etc.) - merged
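
As a rough illustration of the kernel selection cache idea from the Q1 list, the sketch below lets an operator remember its chosen kernel after the first run so that later inference runs skip the lookup. It does not reproduce the framework's actual code: `KERNEL_REGISTRY`, `select_kernel`, and `Op` are hypothetical names, and the registry contents are placeholders.

```python
import math

# Hypothetical kernel registry keyed by (op type, dtype, device).
KERNEL_REGISTRY = {
    ("erf", "float32", "cpu"): math.erf,
    ("tanh", "float32", "cpu"): math.tanh,
}

lookup_count = 0  # counts how often the slow selection path runs


def select_kernel(op_type, dtype, device):
    """Slow path: stands in for a framework's full kernel selection."""
    global lookup_count
    lookup_count += 1
    return KERNEL_REGISTRY[(op_type, dtype, device)]


class Op:
    """Hypothetical operator that caches its selected kernel across runs."""

    def __init__(self, op_type, dtype="float32", device="cpu"):
        self.key = (op_type, dtype, device)
        self._kernel = None  # filled in on the first run

    def run(self, x):
        # Select the kernel only once; reuse it on every later run.
        if self._kernel is None:
            self._kernel = select_kernel(*self.key)
        return self._kernel(x)


op = Op("erf")
results = [op.run(0.5) for _ in range(3)]
```

After three runs the slow path has executed once, which is the saving such a cache targets when the same model is run repeatedly for inference.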
What will be done in Q2:
- Remove transpose/reshape - PR16342
- Optimize infer shape for a few OPs - patch ready. A PR will be raised after PR16342 is merged.