Forked from PaddlePaddle / Paddle
Optimize the concat and split CUDA implementations for cases where the number of inputs/outputs is less than 5. (#17979) test=develop