* migrate truncated_gaussian_random kernel to phi, test=kunlun
* reuse CPU kernel, test=kunlun
* debug kernel, test=kunlun
* migrate truncated_gaussian_random kernel to phi, test=kunlun
* split truncated_normal, test=kunlun
* try fix error from CI, test=kunlun