- modify aligned_matmul kernel for dynamically malloc memory - fix top_k_avg_pooling kernel to support data whose size is more than cuda shared memory.
拖放文件到此处或点击上传