Commit 11a55ae0, author: liuqi

Update memory layout doc.

Parent e5ada494
...@@ -21,10 +21,10 @@ The CPU tensor buffer is organized in the following order: ...@@ -21,10 +21,10 @@ The CPU tensor buffer is organized in the following order:
* - 1-D Argument, length = W * - 1-D Argument, length = W
- W - W
OpenCL runtime memory layout GPU runtime memory layout
----------------------------- -----------------------------
OpenCL runtime uses 2D image with CL_RGBA channel order as the tensor storage. GPU runtime implementation base on OpenCL, which uses 2D image with CL_RGBA
This requires OpenCL 1.2 and above. channel order as the tensor storage. This requires OpenCL 1.2 and above.
The way of mapping the Tensor data to OpenCL 2D image (RGBA) is critical for The way of mapping the Tensor data to OpenCL 2D image (RGBA) is critical for
kernel performance. kernel performance.
...@@ -53,7 +53,7 @@ The Input/Output Tensor is stored in NHWC format: ...@@ -53,7 +53,7 @@ The Input/Output Tensor is stored in NHWC format:
- Default Input/Output format - Default Input/Output format
* - Height-Major Input/Output * - Height-Major Input/Output
- NHWC - NHWC
- [W * C, N * (H+3)/4 - [W * C, N * (H+3)/4]
- Winograd Convolution format - Winograd Convolution format
* - Width-Major Input/Output * - Width-Major Input/Output
- NHWC - NHWC
...@@ -94,11 +94,11 @@ Filter Tensor ...@@ -94,11 +94,11 @@ Filter Tensor
- Image size [width, height] - Image size [width, height]
- Explanation - Explanation
* - Convolution Filter * - Convolution Filter
- HWOI - OIHW
- [RoundUp<4>(I), H * W * (O+3)/4] - [I, (O+3)/4 * W * H]
- Convolution filter format,There is no difference compared to [H*w*I, (O+3)/4] - Convolution filter format,There is no difference compared to [H*W*I, (O+3)/4]
* - Depthwise Convlution Filter * - Depthwise Convlution Filter
- HWIM - MIHW
- [H * W * M, (I+3)/4] - [H * W * M, (I+3)/4]
- Depthwise-Convolution filter format - Depthwise-Convolution filter format
...@@ -114,10 +114,10 @@ coordination relation between **Image** and **Buffer**. ...@@ -114,10 +114,10 @@ coordination relation between **Image** and **Buffer**.
- Pixel coordinate relationship - Pixel coordinate relationship
- Explanation - Explanation
* - Convolution Filter * - Convolution Filter
- P[m, n] = {E[h, w, o, i] | (h=T/W, w=T%W, o=[n/HW*4+k], i=m)} - P[m, n] = {E[o, i, h, w] | (o=[n/HW*4+k], i=m, h=T/W, w=T%W)}
- HW= H * W, T=n%HW, k=[0, 4) - HW= H * W, T=n%HW, k=[0, 4)
* - Depthwise Convlution Filter * - Depthwise Convlution Filter
- P[m, n] = {E[h, w, i, 0] | (h=m/W, w=m%W, i=[n*4+k])} - P[m, n] = {E[0, i, h, w] | (i=[n*4+k], h=m/W, w=m%W)}
- only support multiplier == 1, k=[0, 4) - only support multiplier == 1, k=[0, 4)
1-D Argument Tensor 1-D Argument Tensor
......
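The pixel coordinate relationships on the new side of the last hunk can be checked with a small sketch. This is an illustration only, not code from the repository: the function names are hypothetical, `/` in the formulas is assumed to be integer division, and channels past the end of the buffer (when O or I is not a multiple of 4) are assumed to be padding and are simply skipped.

```python
def conv_filter_image_shape(O, I, H, W):
    """Image size [width, height] for an OIHW convolution filter."""
    return (I, ((O + 3) // 4) * W * H)


def conv_filter_pixel(m, n, O, I, H, W):
    """Buffer elements E[o, i, h, w] packed into RGBA pixel P[m, n].

    Follows o = n/HW*4 + k, i = m, h = T/W, w = T%W with T = n % HW.
    """
    hw = H * W
    t = n % hw
    h, w = divmod(t, W)
    o_base = (n // hw) * 4
    # Assumption: indices with o >= O are padding and are dropped here.
    return [(o_base + k, m, h, w) for k in range(4) if o_base + k < O]


def depthwise_filter_pixel(m, n, I, H, W):
    """Buffer elements E[0, i, h, w] packed into pixel P[m, n] (multiplier == 1)."""
    h, w = divmod(m, W)
    # Assumption: indices with i >= I are padding and are dropped here.
    return [(0, n * 4 + k, h, w) for k in range(4) if n * 4 + k < I]


if __name__ == "__main__":
    O, I, H, W = 6, 3, 2, 2
    print(conv_filter_image_shape(O, I, H, W))
    print(conv_filter_pixel(1, 5, O, I, H, W))
    print(depthwise_filter_pixel(3, 1, 6, H, W))
```

Each pixel holds up to four consecutive output channels (or input channels, in the depthwise case) for one spatial position, which is why the image height is scaled by (O+3)/4.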