[new feature request]half-precision support (#243) · Issue · PaddlePaddle / models

[new feature request]half-precision support

Created by: chenjiasheng

As the DS2 paper mentions:

Our deployment system evaluates RNNs in half-precision arithmetic, which has no measurable accuracy impact, but significantly improves efficiency. We wrote our own 16-bit matrix-matrix multiply routines for this task, substantially improving throughput for our relatively small batches.

So, have you implemented this half-precision arithmetic? If so, does it take advantage of CUDA's half-precision capabilities (https://devblogs.nvidia.com/parallelforall/new-features-cuda-7-5/ https://devblogs.nvidia.com/parallelforall/mixed-precision-programming-cuda-8/)?
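
To make clear what I mean by half-precision evaluation, here is a minimal CPU-only sketch in plain Python. It is not the DS2 routine and not PaddlePaddle code: `to_half` and `matmul_half_inputs` are hypothetical names I made up for illustration. It assumes the inputs are rounded to 16-bit floats while the products are accumulated at higher precision, which is one common mixed-precision arrangement and may not be what the paper's kernels do.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value.

    The standard-library struct format 'e' stores a float in 16 bits;
    unpacking it again gives the rounded value as an ordinary Python float.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def matmul_half_inputs(a, b):
    """Multiply two matrices (lists of rows) with half-precision inputs.

    Assumption for this sketch: only the inputs are rounded to binary16.
    The running sum is kept in a Python float (double precision), so the
    only error introduced here comes from rounding the inputs.
    """
    a16 = [[to_half(v) for v in row] for row in a]
    b16 = [[to_half(v) for v in row] for row in b]
    rows, inner, cols = len(a16), len(b16), len(b16[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0  # higher-precision accumulator
            for p in range(inner):
                acc += a16[i][p] * b16[p][j]
            out[i][j] = acc
    return out


if __name__ == "__main__":
    # 0.1 is not exactly representable in 16 bits, so it gets rounded.
    print(to_half(0.1))
    print(matmul_half_inputs([[0.1, 0.2]], [[1.0], [1.0]]))
```

A GPU version would do the same thing with CUDA's 16-bit types instead of `struct`, which is what the two links above describe.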

