Inference with the V2 API is much slower than with the V1 API
Created by: lcy-seso
A user reported that inference with the V2 APIs is slower than it was before. As far as I know, there are some design choices there that are not ideal. Is it possible to optimize them?
Besides, we do not provide official documentation on how to do batch inference. I would like to know whether there are rules to follow or things I should pay attention to; a rough sketch of what I mean by batch inference is included below.
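For reference, this is roughly how I understand batch inference with the V2 API today: pass all samples to paddle.infer in a single call so they go through one forward pass. This is only a minimal sketch with a toy network and randomly initialised parameters standing in for a real model; please correct me if this is not the intended usage.

```python
import paddle.v2 as paddle

paddle.init(use_gpu=False, trainer_count=1)

# A toy network: 4-dim dense input -> 2-class softmax.
x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(4))
y = paddle.layer.fc(input=x, size=2, act=paddle.activation.Softmax())

# Randomly initialised parameters stand in for trained ones here.
parameters = paddle.parameters.create(y)

# Batch inference: pass all samples in one call, so they are processed
# in a single forward pass instead of calling paddle.infer per sample.
batch = [([0.1, 0.2, 0.3, 0.4],),
         ([0.5, 0.6, 0.7, 0.8],),
         ([0.9, 1.0, 1.1, 1.2],)]

probs = paddle.infer(output_layer=y, parameters=parameters, input=batch)
print(probs)  # one row of class probabilities per input sample
```

Is this the recommended way, and are there constraints (e.g. on batch size or input layout) that affect speed?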