Created by: baojun-nervana
This PR includes a number of changes to enable BERT through ngraph:
- Restrict ngraph to block 0 only; sub-blocks created by the conditional_block op run through Paddle.
- Delay InferShape until just before the output dimensions are read.
  - Reason: some variables are reused by reshape ops, which changes the shape an input sees when it pulls its dimensions.
  - mul_op and pool2d were updated accordingly.
- fix some corner cases
- handle shape {0}
- partial support of lookup_table_grad
- remove duplicated output (this is an optimization)
- Handle reshape2_grad InferShape: Paddle's reshape2 adds a leading dimension of 0, and the grad op's InferShape chops it off again. ngraph would interpret that 0 as zero elements, so it needs special handling in the ngraph engine.
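
To illustrate the reshape2_grad point, here is a minimal Python sketch of why the leading 0 needs special handling. It is not code from this PR or from Paddle/ngraph; the helper names (`reshape2_xshape`, `grad_input_dims`, `element_count`) and the example shape are hypothetical and only model the shape bookkeeping described above.

```python
from math import prod
from typing import List, Sequence


def reshape2_xshape(x_dims: Sequence[int]) -> List[int]:
    """Hypothetical model: record the input dims behind a leading 0 placeholder."""
    return [0, *x_dims]


def grad_input_dims(xshape_dims: Sequence[int]) -> List[int]:
    """Hypothetical model: recover the input dims by dropping the leading 0."""
    if not xshape_dims or xshape_dims[0] != 0:
        raise ValueError("expected a leading 0 placeholder dimension")
    return list(xshape_dims[1:])


def element_count(dims: Sequence[int]) -> int:
    """Number of elements a tensor of this shape would hold."""
    return prod(dims)


# Example shape chosen only for illustration.
xshape = reshape2_xshape([8, 128, 768])

# Read literally, the placeholder shape describes an empty tensor.
naive_count = element_count(xshape)

# Stripping the placeholder first gives the real input shape.
real_dims = grad_input_dims(xshape)
real_count = element_count(real_dims)
```

A backend that takes the recorded shape at face value sees a zero-element tensor, which is why the placeholder has to be stripped (or otherwise special-cased) before the shape is handed over.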