* init add * add topk op * someupdate * fix style check * add test py file * update top k cuda kernel * follow comments * remove debug print * accuracy_op * fix casting error * fix casting error * fix casting error * fix rename bug... * make it smaller * update cast