* add split selected rows op
* update comment
* add grad check
* register cuda kernel
* fix ci failure