[QAT] BERT Model
Created by: Sand3r-
Hello, as previously agreed, Intel is going to prepare a QAT pass for BERT INT8. Hence, we'd like to find out the following:
- What flavour (type) of BERT model shall we optimize? There are several that I know of, and they differ in the ops they are built from.
- Could you please provide the data reader for us? Two of the QAT BERT models we have received accepted only 2 inputs, while the other BERT models we have seen (the one from the benchmark repository or the one from the BERT unit test) had 4 inputs, named placeholder[0-3].
- How can we find out how to compute the accuracy?
- What is the performance measure that we use for the model in question? Is it words per second (wps) or something else?