Created by: GaoWei8
PR types
Function optimization
PR changes
OPs
Describe
Add a softmax CUDA kernel.
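For reference, the forward and backward computations that a softmax kernel has to produce can be sketched on the CPU as below. This is a plain-Python illustration only, not the CUDA kernel from this PR; the function names `softmax_forward` and `softmax_backward` are hypothetical and were chosen for this example.

```python
import math


def softmax_forward(row):
    """Numerically stable softmax over one row (a list of floats)."""
    m = max(row)  # subtract the row max so exp() cannot overflow
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_backward(out, grad_out):
    """Gradient w.r.t. the input, given the softmax output and upstream gradient.

    Uses dx_i = y_i * (dy_i - sum_j(y_j * dy_j)).
    """
    dot = sum(o * g for o, g in zip(out, grad_out))
    return [o * (g - dot) for o, g in zip(out, grad_out)]


# Hypothetical inputs, for illustration only.
y = softmax_forward([1.0, 2.0, 3.0])
dx = softmax_backward(y, [1.0, 0.0, 0.0])
```

A GPU kernel typically maps the two reductions per row (the max and the sum in the forward pass, the dot product in the backward pass) onto parallel threads; the sketch above only fixes the expected values for comparison.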
softmax forward
api | Paddle (kernel) before optimization | Paddle (kernel) after optimization | TensorFlow (kernel) |
---|---|---|---|
softmax_0 | 0.09460 | 0.01171 | 0.00910 |
softmax_2 | 0.02370 | 0.01235 | 0.00919 |
softmax_4 | 0.23351 | 0.12017 | 0.60117 |
softmax backward
api | Paddle (kernel) before optimization | Paddle (kernel) after optimization | TensorFlow (kernel) |
---|---|---|---|
softmax_0 | 0.12974 | 0.04748 | 0.01532 |
softmax_2 | 0.04280 | 0.01235 | 0.01498 |
softmax_4 | 0.37190 | 0.2531 | 0.82553 |