Refine TensorRT integration
Created by: NHZlX
-
The TensorRT engine op currently supports only CPU. All inputs of the TensorRT engine op are copied from GPU to CPU on every inference run.
-
TRT needs a pointer to CPU memory for each weight. Before TRT runs, fluid loads the weights into GPU memory, so the op converter must copy them from GPU to CPU. If the copy goes into a temporary tensor, the weight memory is released too early, which breaks the construction of the TRT op.
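
The lifetime problem described above can be sketched in plain C++. This is only an illustration, not the code in this PR: `WeightsView` and `HostWeightStore` are hypothetical names, and a plain memory copy stands in for the real GPU-to-CPU copy. The idea is that the host-side copy is owned by an object that outlives the converter call, so the pointer handed to TRT stays valid while the engine is being built.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a TRT weight descriptor: a non-owning view
// (raw pointer plus element count) into host memory.
struct WeightsView {
  const float* values;
  std::size_t count;
};

// Hypothetical owner of the host-side weight copies. It must live at least
// as long as the TRT network is under construction.
class HostWeightStore {
 public:
  // Copies `count` floats from `src` into a buffer owned by this store and
  // returns a view of that buffer. In a real converter the copy would be a
  // device-to-host transfer; here it is an ordinary host copy (assumption).
  WeightsView CopyToHost(const float* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    auto buf = std::make_unique<std::vector<float>>(src, src + count);
    WeightsView view{buf->data(), buf->size()};
    // Ownership moves into the store, so the buffer is not freed when this
    // function returns, unlike a temporary tensor local to the converter.
    buffers_.push_back(std::move(buf));
    return view;
  }

  std::size_t size() const { return buffers_.size(); }

 private:
  std::vector<std::unique_ptr<std::vector<float>>> buffers_;
};
```

A converter would call `store.CopyToHost(...)` once per weight and pass the returned view to the layer it is building. The store can be released only after the engine has been constructed.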