Created by: wozna
PR types
New features
PR changes
Others
Describe
This PR is the next step in adding bfloat16 support to PaddlePaddle.
This PR presents:
- two passes for enabling mkldnn bfloat16
- tests for those passes
- insertion of the passes into `CpuPassStrategy`
- NumPy still has no support for bfloat16, so a conversion from Python `uint16` to C++ `bfloat16` was added in `framework.py` (a minimal sketch of this conversion follows after this list)
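Since NumPy lacks a bfloat16 dtype, the values travel as `uint16` holding the upper 16 bits of the `float32` bit pattern. Below is a minimal, runnable sketch of that truncation; the function name is illustrative, and the actual helper in `framework.py` may differ in details such as rounding:

```python
import numpy as np

def float32_to_bf16_uint16(data):  # illustrative name, not Paddle's helper
    """Truncate float32 values to bfloat16 bit patterns carried in uint16."""
    # bfloat16 is the upper half of an IEEE-754 float32 (sign, exponent,
    # top 7 mantissa bits), so the conversion is a reinterpret-cast to
    # uint32 followed by a 16-bit right shift.
    data = np.ascontiguousarray(data, dtype=np.float32)
    return (data.view(np.uint32) >> 16).astype(np.uint16)

x = np.array([1.0, -2.5, 3.14159], dtype=np.float32)
print(float32_to_bf16_uint16(x))  # [16256 49184 16457]
```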
More about the passes: after enabling bfloat16, two passes are applied to the graph.
The first pass, `cpu_bfloat16_placement_pass.cc`, sets the attribute `mkldnn_data_type = bfloat16` on every operator whose type is listed in the vector `bfloat16_enabled_op_types`; if the user does not set any op types, all operators that support bfloat16 are marked.
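A hypothetical, simplified model of that placement logic, assuming a toy `Op` class and an illustrative set of bfloat16-capable ops (neither is Paddle's real API or operator list):

```python
SUPPORTED_BF16_OPS = {"conv2d", "sum", "reorder"}  # illustrative subset

class Op:
    def __init__(self, op_type):
        self.type = op_type
        self.attrs = {"mkldnn_data_type": "float32"}

def place_bfloat16(ops, bfloat16_enabled_op_types=None):
    # An unset/empty user list means: mark every op that supports bfloat16.
    targets = set(bfloat16_enabled_op_types or SUPPORTED_BF16_OPS)
    for op in ops:
        if op.type in targets and op.type in SUPPORTED_BF16_OPS:
            op.attrs["mkldnn_data_type"] = "bfloat16"

ops = [Op("conv2d"), Op("dropout"), Op("sum")]
place_bfloat16(ops)  # no user list given, so all supported ops are marked
print([(o.type, o.attrs["mkldnn_data_type"]) for o in ops])
# [('conv2d', 'bfloat16'), ('dropout', 'float32'), ('sum', 'bfloat16')]
```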
The second pass, `cpu_bfloat16_pass.cc`, first adds a reorder between `float32` and `bfloat16` operators, unless the `bfloat16` operator is `conv2d` (conv2d can reorder its input to bfloat16 inside the kernel, so no separate reorder is needed, which decreases the number of reorders). It then adds a reorder between `bfloat16` and `float32` operators, or uses the `force_fp32_output` attribute if the operator has it.
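A hypothetical sketch of that second pass's rewrite rules, again with an illustrative toy `Op` class rather than Paddle's C++ `ir::Graph` API; it only mirrors the decision logic described above:

```python
class Op:
    def __init__(self, op_type, dtype="float32", inputs=None):
        self.type = op_type
        self.attrs = {"mkldnn_data_type": dtype}
        self.inputs = inputs or []

def insert_reorders(op):
    for i, producer in enumerate(op.inputs):
        prod_dt = producer.attrs["mkldnn_data_type"]
        op_dt = op.attrs["mkldnn_data_type"]
        if prod_dt == "float32" and op_dt == "bfloat16" and op.type != "conv2d":
            # fp32 -> bf16 boundary: conv2d reorders its input inside the
            # kernel, so a standalone reorder is only added for other ops.
            op.inputs[i] = Op("reorder", "bfloat16", [producer])
        elif prod_dt == "bfloat16" and op_dt == "float32":
            # bf16 -> fp32 boundary: reuse the producer's force_fp32_output
            # attribute when available, otherwise fall back to a reorder.
            if producer.type == "conv2d":  # illustrative: conv2d has the attr
                producer.attrs["force_fp32_output"] = True
            else:
                op.inputs[i] = Op("reorder", "float32", [producer])

# A bf16 matmul fed by an fp32 op gets a reorder in front; a bf16 conv2d
# feeding an fp32 op gets force_fp32_output set instead of a reorder.
matmul = Op("matmul", "bfloat16", [Op("feed")])
insert_reorders(matmul)
print([p.type for p in matmul.inputs])  # ['reorder']

fetch = Op("fetch", "float32", [Op("conv2d", "bfloat16")])
insert_reorders(fetch)
print(fetch.inputs[0].attrs.get("force_fp32_output"))  # True
```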