Created by: zhangting2020
PR types: Performance optimization
PR changes: APIs
Describe: Return the input `x` directly when `x.dtype` is the same as the `dtype` attribute, skipping the cast.
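
The early return described above can be sketched in plain Python. This is only an illustration of the idea, not the code changed in this PR: the `Tensor` class, the `_CONVERTERS` table, and this `cast` function are hypothetical stand-ins, and the real operator works on framework tensors rather than Python lists.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tensor:
    """Hypothetical stand-in for a framework tensor: a dtype tag plus flat data."""
    dtype: str
    data: List


# Hypothetical element converters, keyed by dtype name (assumption for this sketch).
_CONVERTERS = {"bool": bool, "int32": int, "float32": float}


def cast(x: Tensor, dtype: str) -> Tensor:
    """Cast x to dtype, returning x itself when no conversion is needed."""
    # Fast path: the dtypes already match, so skip the conversion and
    # hand back the same object without copying any data.
    if x.dtype == dtype:
        return x
    # Slow path: convert every element and build a new tensor.
    convert = _CONVERTERS[dtype]
    return Tensor(dtype=dtype, data=[convert(v) for v in x.data])


mask = Tensor(dtype="bool", data=[True, False, True])
same = cast(mask, "bool")         # fast path, returns mask itself
as_float = cast(mask, "float32")  # slow path, allocates a new tensor
```

In this sketch the fast path returns the same object, so a caller that later mutates the result in place would also mutate the input. Whether that aliasing matters depends on how the real API treats its outputs.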
GPU performance (V100, CUDA 10):
op | shape | input dtype | output dtype | before | after | speedup |
---|---|---|---|---|---|---|
cast | [16, 1785] | bool | bool | 0.00150 ms | 0.000026 ms | 57x |