Created by: chengduoZH
fix : https://github.com/PaddlePaddle/Paddle/issues/8567
concat operation analysis
The input is a list of tensors and an axis, which indicates the concatenation axis. The input tensors can have any shape, but their shapes may differ only in the dimension along that axis.
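The shape rule above can be sketched as a small check. This is an illustrative sketch in plain Python, not code from this PR; the function name `concat_output_shape` is hypothetical.

```python
def concat_output_shape(shapes, axis):
    """Return the shape produced by concatenating tensors of `shapes` along `axis`.

    Hypothetical helper for illustration only: every dimension except `axis`
    must match across the inputs, and the `axis` dimensions are summed.
    """
    if not shapes:
        raise ValueError("need at least one input shape")
    first = list(shapes[0])
    if not 0 <= axis < len(first):
        raise ValueError("axis out of range")
    out = list(first)
    out[axis] = 0
    for shape in shapes:
        if len(shape) != len(first):
            raise ValueError("all inputs must have the same rank")
        for dim, (expected, actual) in enumerate(zip(first, shape)):
            if dim != axis and expected != actual:
                raise ValueError(f"dimension {dim} differs: {expected} vs {actual}")
        out[axis] += shape[axis]
    return out
```

For instance, `concat_output_shape([[9, 2, 3, 4], [3, 2, 3, 4]], 0)` gives `[12, 2, 3, 4]`, while shapes that differ in a dimension other than `axis` raise a `ValueError`.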
For example, the input is two tensors.
- case 1:
- t_a's shape: [9,2,3,4]
- t_b's shape: [3,2,3,4]
- axis = 0
Obviously, the output's shape is [12,2,3,4]. To solve this case simply, we can reshape t_a to [9, 24] and t_b to [3, 24], then concatenate the two tensors vertically (along the rows). The output's shape is [12, 24]. In this case, only two copies are needed, one per input tensor.
- case 2:
- t_a's shape: [9,2,3,4]
- t_b's shape: [9,3,3,4]
- axis = 1
To solve this case simply, we can reshape t_a to [9, 2, 12] and t_b to [9, 3, 12], then concatenate the two tensors along the second axis. The output's shape is [9, 5, 12]. In this case, 18 copies are needed: one per input tensor for each of the 9 slices along the first axis.
- case 3:
- t_a's shape: [9,2,3,4]
- t_b's shape: [9,2,3,3]
- axis = 3
First, we reshape t_a to [54, 4] and t_b to [54, 3], then concatenate the two tensors horizontally (along the columns). The output's shape is [54, 7]. This is the worst case: 108 copies are needed, one per input tensor for each of the 54 rows.
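The copy counts in the three cases follow one pattern, which can be sketched as below. This is an illustration rather than code from this PR, and the helper names `flatten_for_concat` and `count_copies` are hypothetical. It assumes row-major storage and collapses each tensor to 2-D as [outer, inner], where outer is the product of the dimensions before the axis. For case 1 this yields [1, 216] instead of the [9, 24] used above, which describes the same contiguous block.

```python
from math import prod


def flatten_for_concat(shape, axis):
    """Collapse `shape` to (outer, inner) around `axis` (hypothetical helper).

    outer: product of the dimensions before `axis`
    inner: product of the dimension at `axis` and all later dimensions
    """
    return prod(shape[:axis]), prod(shape[axis:])


def count_copies(shapes, axis):
    """Number of contiguous block copies when each input row is copied separately."""
    outer = prod(shapes[0][:axis])  # identical for all inputs by the shape rule
    return outer * len(shapes)


cases = [
    ([[9, 2, 3, 4], [3, 2, 3, 4]], 0),  # case 1
    ([[9, 2, 3, 4], [9, 3, 3, 4]], 1),  # case 2
    ([[9, 2, 3, 4], [9, 2, 3, 3]], 3),  # case 3
]
copy_counts = [count_copies(shapes, axis) for shapes, axis in cases]
```

Under these assumptions `copy_counts` matches the 2, 18, and 108 copies stated for the three cases: the closer the axis is to the last dimension, the larger outer becomes and the smaller each copied block is.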
TODO
- Use one CUDA kernel to complete those copies. All of these cases can be handled by one strategy.
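One possible reading of the "one strategy" idea is sketched below in plain Python. It is not the kernel from this PR, and the function name `concat_flat` is hypothetical. It assumes row-major flat buffers, views every input as [outer, inner_k], and lets each output element locate its own source, so each loop iteration is independent in the way one GPU thread per output element would be.

```python
from bisect import bisect_right
from itertools import accumulate
from math import prod


def concat_flat(inputs, shapes, axis):
    """Concatenate row-major flat buffers `inputs` with `shapes` along `axis`.

    Illustrative sketch only: every input is treated as a 2-D [outer, width]
    matrix, and the output as [outer, sum of widths].
    """
    outer = prod(shapes[0][:axis])
    widths = [prod(shape[axis:]) for shape in shapes]  # columns per input row
    offsets = [0] + list(accumulate(widths))           # first output column of each input
    total = offsets[-1]
    out = [None] * (outer * total)
    # Each iteration depends only on idx, mimicking one thread per output element.
    for idx in range(len(out)):
        row, col = divmod(idx, total)
        k = bisect_right(offsets, col) - 1             # input that owns this column
        out[idx] = inputs[k][row * widths[k] + (col - offsets[k])]
    return out
```

For example, concatenating a [2, 2] buffer `[0, 1, 2, 3]` with a [2, 3] buffer `[10, 11, 12, 13, 14, 15]` along axis 1 interleaves the rows, while axis 0 reduces to appending one buffer after the other.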