-
由 chentianyu03 提交于
* combine reduce_cuda codes * support float16 in pten redcue_mean * replace ReduceCudaKernel impl with pten reduce impl * mv reduce funcs into reduce_cuda_impl * rm unsed codes and headers * mv GetReduceDim into reduce_cuda_impl * recover GetReduceDim in reduce_op.h * add new dispatch macro * fix pool op output not inited and cause transform to pten::denseTensor error * fix output tensor not initialized error * rename new dispatch macro and format code style * rm reduce_functor_op.h file
08941eda