Created by: tensor-tang
This is an issue with the NLP online service.
When running inference, memory usage stays at about 6 GB, which is definitely larger than what is actually needed.
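To make the report reproducible, the figure can be confirmed with a small check of the process's peak resident set size. This is a minimal sketch using only the Python standard library, assuming a Linux host (where `ru_maxrss` is reported in kilobytes); the inference call itself is not shown:

```python
import resource

def peak_memory_mb() -> float:
    """Peak resident set size of this process in MB.

    On Linux, ru_maxrss is in kilobytes; on macOS it is in bytes,
    so this conversion assumes Linux.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

# Run the inference workload here, then print the peak usage observed.
print(f"peak RSS: {peak_memory_mb():.1f} MB")
```

Printing this before and after a single inference call would show how much of the ~6 GB is allocated up front versus grown during inference.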