From af3b46301599dfdbf7dd91cb3995c4fa660531fc Mon Sep 17 00:00:00 2001 From: Liangliang He Date: Fri, 11 May 2018 14:29:04 +0800 Subject: [PATCH] Update rst table format --- docs/development/adding_a_new_op.md | 4 + docs/development/memory_layout.rst | 163 ++++++++++++------ .../create_a_model_deployment.rst | 80 +++++---- docs/getting_started/op_lists.rst | 122 +++++-------- 4 files changed, 198 insertions(+), 171 deletions(-) diff --git a/docs/development/adding_a_new_op.md b/docs/development/adding_a_new_op.md index 08e3efca..be0c82b3 100644 --- a/docs/development/adding_a_new_op.md +++ b/docs/development/adding_a_new_op.md @@ -95,3 +95,7 @@ Add test and benchmark ---------------------- It's strongly recommended to add unit test and micro benchmark for your new Op. If you wish to contribute back, it's required. + +Document the new Op +--------------------- +Finally, add an entry to the operator table in the documentation. diff --git a/docs/development/memory_layout.rst b/docs/development/memory_layout.rst index f8e428ca..2065e66c 100644 --- a/docs/development/memory_layout.rst +++ b/docs/development/memory_layout.rst @@ -5,17 +5,21 @@ CPU runtime memory layout ------------------------- The CPU tensor buffer is organized in the following order: -+-----------------------------+--------------+ -| Tensor type | Buffer | -+=============================+==============+ -| Intermediate input/output | NCHW | -+-----------------------------+--------------+ -| Convolution Filter | OIHW | -+-----------------------------+--------------+ -| Depthwise Convolution Filter| MIHW | -+-----------------------------+--------------+ -| 1-D Argument, length = W | W | -+-----------------------------+--------------+ +.. 
list-table:: + :widths: auto + :header-rows: 1 + :align: left + + * - Tensor type + - Buffer + * - Intermediate input/output + - NCHW + * - Convolution Filter + - OIHW + * - Depthwise Convolution Filter + - MIHW + * - 1-D Argument, length = W + - W OpenCL runtime memory layout ----------------------------- @@ -34,66 +38,117 @@ Input/Output Tensor The Input/Output Tensor is stored in NHWC format: -+---------------------------+--------+----------------------------+-----------------------------+ -|Tensor type | Buffer | Image size [width, height] | Explanation | -+===========================+========+============================+=============================+ -|Channel-Major Input/Output | NHWC | [W * (C+3)/4, N * H] | Default Input/Output format | -+---------------------------+--------+----------------------------+-----------------------------+ -|Height-Major Input/Output | NHWC | [W * C, N * (H+3)/4] | Winograd Convolution format | -+---------------------------+--------+----------------------------+-----------------------------+ -|Width-Major Input/Output | NHWC | [(W+3)/4 * C, N * H] | Winograd Convolution format | -+---------------------------+--------+----------------------------+-----------------------------+ +.. list-table:: + :widths: auto + :header-rows: 1 + :align: left + + * - Tensor type + - Buffer + - Image size [width, height] + - Explanation + * - Channel-Major Input/Output + - NHWC + - [W * (C+3)/4, N * H] + - Default Input/Output format + * - Height-Major Input/Output + - NHWC + - [W * C, N * (H+3)/4] + - Winograd Convolution format + * - Width-Major Input/Output + - NHWC + - [(W+3)/4 * C, N * H] + - Winograd Convolution format Each Pixel of **Image** contains 4 elements. The below table list the coordination relation between **Image** and **Buffer**. 
-+---------------------------+-------------------------------------------------------------------------+-------------+ -|Tensor type | Pixel coordinate relationship | Explanation | -+===========================+=========================================================================+=============+ -|Channel-Major Input/Output | P[i, j] = {E[n, h, w, c] | (n=j/H, h=j%H, w=i%W, c=[i/W * 4 + k])} | k=[0, 4) | -+---------------------------+-------------------------------------------------------------------------+-------------+ -|Height-Major Input/Output | P[i, j] = {E[n, h, w, c] | (n=j%N, h=[j/H*4 + k], w=i%W, c=i/W)} | k=[0, 4) | -+---------------------------+-------------------------------------------------------------------------+-------------+ -|Width-Major Input/Output | P[i, j] = {E[n, h, w, c] | (n=j/H, h=j%H, w=[i%W*4 + k], c=i/W)} | k=[0, 4) | -+---------------------------+-------------------------------------------------------------------------+-------------+ - +.. list-table:: + :widths: auto + :header-rows: 1 + :align: left + + * - Tensor type + - Pixel coordinate relationship + - Explanation + * - Channel-Major Input/Output + - P[i, j] = {E[n, h, w, c] | (n=j/H, h=j%H, w=i%W, c=[i/W * 4 + k])} + - k=[0, 4) + * - Height-Major Input/Output + - P[i, j] = {E[n, h, w, c] | (n=j%N, h=[j/H*4 + k], w=i%W, c=i/W)} + - k=[0, 4) + * - Width-Major Input/Output + - P[i, j] = {E[n, h, w, c] | (n=j/H, h=j%H, w=[i%W*4 + k], c=i/W)} + - k=[0, 4) Filter Tensor ~~~~~~~~~~~~~ -+----------------------------+------+---------------------------------+------------------------------------------------------------------------------+ -| Tensor |Buffer| Image size [width, height] | Explanation | -+============================+======+=================================+==============================================================================+ -|Convolution Filter | HWOI | [RoundUp<4>(I), H * W * (O+3)/4]|Convolution filter format,There is no difference compared to [H*w*I, 
(O+3)/4]| -+----------------------------+------+---------------------------------+------------------------------------------------------------------------------+ -|Depthwise Convlution Filter | HWIM | [H * W * M, (I+3)/4] |Depthwise-Convolution filter format | -+----------------------------+------+---------------------------------+------------------------------------------------------------------------------+ +.. list-table:: + :widths: auto + :header-rows: 1 + :align: left + + * - Tensor + - Buffer + - Image size [width, height] + - Explanation + * - Convolution Filter + - HWOI + - [RoundUp<4>(I), H * W * (O+3)/4] + - Convolution filter format. There is no difference compared to [H*W*I, (O+3)/4] + * - Depthwise Convolution Filter + - HWIM + - [H * W * M, (I+3)/4] + - Depthwise-Convolution filter format Each Pixel of **Image** contains 4 elements. The below table list the coordination relation between **Image** and **Buffer**. -+----------------------------+-------------------------------------------------------------------+---------------------------------------+ -|Tensor type | Pixel coordinate relationship | Explanation | -+============================+===================================================================+=======================================+ -|Convolution Filter | P[m, n] = {E[h, w, o, i] | (h=T/W, w=T%W, o=[n/HW*4+k], i=m)}| HW= H * W, T=n%HW, k=[0, 4) | -+----------------------------+-------------------------------------------------------------------+---------------------------------------+ -|Depthwise Convlution Filter | P[m, n] = {E[h, w, i, 0] | (h=m/W, w=m%W, i=[n*4+k])} | only support multiplier == 1, k=[0, 4)| -+----------------------------+-------------------------------------------------------------------+---------------------------------------+ +.. 
list-table:: + :widths: auto + :header-rows: 1 + :align: left + + * - Tensor type + - Pixel coordinate relationship + - Explanation + * - Convolution Filter + - P[m, n] = {E[h, w, o, i] | (h=T/W, w=T%W, o=[n/HW*4+k], i=m)} + - HW = H * W, T=n%HW, k=[0, 4) + * - Depthwise Convolution Filter + - P[m, n] = {E[h, w, i, 0] | (h=m/W, w=m%W, i=[n*4+k])} + - only multiplier == 1 is supported, k=[0, 4) 1-D Argument Tensor ~~~~~~~~~~~~~~~~~~~ -+----------------+----------+------------------------------+---------------------------------+ -| Tensor type | Buffer | Image size [width, height] | Explanation | -+================+==========+==============================+=================================+ -| 1-D Argument | W | [(W+3)/4, 1] | 1D argument format, e.g. Bias | -+----------------+----------+------------------------------+---------------------------------+ +.. list-table:: + :widths: auto + :header-rows: 1 + :align: left + + * - Tensor type + - Buffer + - Image size [width, height] + - Explanation + * - 1-D Argument + - W + - [(W+3)/4, 1] + - 1D argument format, e.g. Bias Each Pixel of **Image** contains 4 elements. The below table list the coordination relation between **Image** and **Buffer**. -+--------------+---------------------------------+-------------+ -| Tensor type | Pixel coordinate relationship | Explanation | -+==============+=================================+=============+ -|1-D Argument | P[i, 0] = {E[w] | w=i*4+k} | k=[0, 4) | -+--------------+---------------------------------+-------------+ +.. 
list-table:: + :widths: auto + :header-rows: 1 + :align: left + + * - Tensor type + - Pixel coordinate relationship + - Explanation + * - 1-D Argument + - P[i, 0] = {E[w] | w=i*4+k} + - k=[0, 4) + diff --git a/docs/getting_started/create_a_model_deployment.rst b/docs/getting_started/create_a_model_deployment.rst index f7851aec..bf47aed5 100644 --- a/docs/getting_started/create_a_model_deployment.rst +++ b/docs/getting_started/create_a_model_deployment.rst @@ -19,46 +19,50 @@ Here is an deployment file example used by Android demo application. TODO: change this example file to the demo deployment file (reuse the same file) and rename to a reasonable name. -.. literalinclude :: models/demo_app_models.yaml +.. literalinclude:: models/demo_app_models.yaml :language: yaml Configurations -------------------- -+--------------------------+----------------------------------------------------------------------------------------+ -| Configuration key | Description | -+==========================+========================================================================================+ -| target_abis | The target ABI to build, can be one or more of 'host', 'armeabi-v7a' or 'arm64-v8a' | -+--------------------------+----------------------------------------------------------------------------------------+ -| embed_model_data | Whether embedding model weights as the code, default to 1 | -+--------------------------+----------------------------------------------------------------------------------------+ -| platform | The source framework, tensorflow or caffe | -+--------------------------+----------------------------------------------------------------------------------------+ -| model_file_path | The path of the model file, can be local or remote | -+--------------------------+----------------------------------------------------------------------------------------+ -| weight_file_path | The path of the model weights file, used by Caffe model | 
-+--------------------------+----------------------------------------------------------------------------------------+ -| model_sha256_checksum | The SHA256 checksum of the model file | -+--------------------------+----------------------------------------------------------------------------------------+ -| weight_sha256_checksum | The SHA256 checksum of the weight file, used by Caffe model | -+--------------------------+----------------------------------------------------------------------------------------+ -| input_nodes | The input node names, one or more strings | -+--------------------------+----------------------------------------------------------------------------------------+ -| output_nodes | The output node names, one or more strings | -+--------------------------+----------------------------------------------------------------------------------------+ -| input_shapes | The shapes of the input nodes, in NHWC order | -+--------------------------+----------------------------------------------------------------------------------------+ -| output_shapes | The shapes of the output nodes, in NHWC order | -+--------------------------+----------------------------------------------------------------------------------------+ -| runtime | The running device, one of CPU, GPU or DSP | -+--------------------------+----------------------------------------------------------------------------------------+ -| limit_opencl_kernel_time | Whether splitting the OpenCL kernel within 1 ms to keep UI responsiveness, default to 0| -+--------------------------+----------------------------------------------------------------------------------------+ -| dsp_mode | Control the DSP precision and performance, default to 0 usually works for most cases | -+--------------------------+----------------------------------------------------------------------------------------+ -| obfuscate | Whether to obfuscate the model operator name, default to 0 | 
-+--------------------------+----------------------------------------------------------------------------------------+ -| fast_conv | Whether to enable Winograd convolution, **will increase memory consumption** | -+--------------------------+----------------------------------------------------------------------------------------+ -| input_files | Specify Numpy validation inputs. When not provided, [-1, 1] random values will be used | -+--------------------------+----------------------------------------------------------------------------------------+ +.. list-table:: + :widths: auto + :header-rows: 1 + :align: left + + * - Configuration key + - Description + * - target_abis + - The target ABIs to build; one or more of 'host', 'armeabi-v7a' or 'arm64-v8a' + * - embed_model_data + - Whether to embed the model weights in the code; defaults to 1 + * - platform + - The source framework, tensorflow or caffe + * - model_file_path + - The path of the model file, can be local or remote + * - weight_file_path + - The path of the model weights file, used by Caffe models + * - model_sha256_checksum + - The SHA256 checksum of the model file + * - weight_sha256_checksum + - The SHA256 checksum of the weight file, used by Caffe models + * - input_nodes + - The input node names, one or more strings + * - output_nodes + - The output node names, one or more strings + * - input_shapes + - The shapes of the input nodes, in NHWC order + * - output_shapes + - The shapes of the output nodes, in NHWC order + * - runtime + - The running device, one of CPU, GPU or DSP + * - limit_opencl_kernel_time + - Whether to split OpenCL kernels so that each runs within 1 ms to keep the UI responsive; defaults to 0 + * - dsp_mode + - Controls the DSP precision and performance; the default of 0 works for most cases + * - obfuscate + - Whether to obfuscate the model operator names; defaults to 0 + * - fast_conv + - Whether to enable Winograd convolution, **will increase memory consumption** + * - input_files + - 
Specify Numpy validation inputs. When not provided, [-1, 1] random values will be used diff --git a/docs/getting_started/op_lists.rst b/docs/getting_started/op_lists.rst index 6afa5d06..803d3217 100644 --- a/docs/getting_started/op_lists.rst +++ b/docs/getting_started/op_lists.rst @@ -1,82 +1,46 @@ Operator lists ============== -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| Operator | Android NN | Status | Remark | -+==================================+==============+========+=======================================================+ -| ADD | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| AVERAGE\_POOL\_2D | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| BATCH\_NORM | | Y | Fusion with activation is supported | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| BIAS\_ADD | | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| CHANNEL\_SHUFFLE | | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| CONCATENATION | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| CONV\_2D | Y | Y | Fusion with BN and activation layer is supported | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| DEPTHWISE\_CONV\_2D | Y | Y | Only multiplier = 1 is supported; Fusion is supported | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| DEPTH\_TO\_SPACE | Y | Y | | 
-+----------------------------------+--------------+--------+-------------------------------------------------------+ -| DEQUANTIZE | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| EMBEDDING\_LOOKUP | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| FLOOR | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| FULLY\_CONNECTED | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| GROUP\_CONV\_2D | | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| HASHTABLE\_LOOKUP | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| L2\_NORMALIZATION | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| L2\_POOL\_2D | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| LOCAL\_RESPONSE\_NORMALIZATION | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| LOGISTIC | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| LSH\_PROJECTION | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| LSTM | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| MATMUL | | Y | | 
-+----------------------------------+--------------+--------+-------------------------------------------------------+ -| MAX\_POOL\_2D | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| MUL | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| PSROI\_ALIGN | | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| PRELU | | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| RELU | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| RELU1 | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| RELU6 | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| RELUX | | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| RESHAPE | Y | Y | Limited support | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| RESIZE\_BILINEAR | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| RNN | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| RPN\_PROPOSAL\_LAYER | | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| SOFTMAX | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| SPACE\_TO\_DEPTH | Y | Y | 
| -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| SVDF | Y | | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ -| TANH | Y | Y | | -+----------------------------------+--------------+--------+-------------------------------------------------------+ +.. Please keep in alphabetical order when editing +.. csv-table:: + :widths: auto + :header: "Operator","Android NN","Supported","Remark" + + "ADD","Y","Y","" + "AVERAGE_POOL_2D","Y","Y","" + "BATCH_NORM","","Y","Fusion with activation is supported" + "BIAS_ADD","","Y","" + "CHANNEL_SHUFFLE","","Y","" + "CONCATENATION","Y","Y","" + "CONV_2D","Y","Y","Fusion with BN and activation layer is supported" + "DEPTHWISE_CONV_2D","Y","Y","Only multiplier = 1 is supported; Fusion is supported" + "DEPTH_TO_SPACE","Y","Y","" + "DEQUANTIZE","Y","","" + "EMBEDDING_LOOKUP","Y","","" + "FLOOR","Y","","" + "FULLY_CONNECTED","Y","Y","" + "GROUP_CONV_2D","","","" + "HASHTABLE_LOOKUP","Y","","" + "L2_NORMALIZATION","Y","","" + "L2_POOL_2D","Y","","" + "LOCAL_RESPONSE_NORMALIZATION","Y","Y","" + "LOGISTIC","Y","Y","" + "LSH_PROJECTION","Y","","" + "LSTM","Y","","" + "MATMUL","","Y","" + "MAX_POOL_2D","Y","Y","" + "MUL","Y","","" + "PRELU","","Y","" + "PSROI_ALIGN","","Y","" + "RELU","Y","Y","" + "RELU1","Y","Y","" + "RELU6","Y","Y","" + "RELUX","","Y","" + "RESHAPE","Y","Y","Limited support" + "RESIZE_BILINEAR","Y","Y","" + "RNN","Y","","" + "RPN_PROPOSAL_LAYER","","Y","" + "SOFTMAX","Y","Y","" + "SPACE_TO_DEPTH","Y","Y","" + "SVDF","Y","","" + "TANH","Y","Y","" -- GitLab
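
The image-size formulas in the OpenCL tables of docs/development/memory_layout.rst can be sanity-checked with a small sketch. The Python below is illustrative only and is not part of this patch or of the library: the function names are hypothetical, `/` in the tables is assumed to mean integer division, and `RoundUp<4>` is assumed to round up to the next multiple of 4.

```python
# Hypothetical helpers that evaluate the "Image size [width, height]" columns
# from memory_layout.rst. Assumption: "(x+3)/4" is integer (floor) division,
# i.e. the number of 4-element pixels needed to hold x values.


def ceil_div(x: int, d: int = 4) -> int:
    """Number of d-element groups needed to hold x elements."""
    return (x + d - 1) // d


def round_up(x: int, d: int = 4) -> int:
    """Assumed meaning of RoundUp<4>: round x up to a multiple of d."""
    return ceil_div(x, d) * d


def io_image_shape(n: int, h: int, w: int, c: int, major: str = "channel"):
    """[width, height] of the image backing an NHWC input/output tensor."""
    if major == "channel":  # default input/output format
        return (w * ceil_div(c), n * h)
    if major == "height":   # Winograd convolution format
        return (w * c, n * ceil_div(h))
    if major == "width":    # Winograd convolution format
        return (ceil_div(w) * c, n * h)
    raise ValueError(f"unknown layout: {major}")


def conv_filter_image_shape(h: int, w: int, o: int, i: int):
    """[width, height] for a convolution filter stored as HWOI."""
    return (round_up(i), h * w * ceil_div(o))


def depthwise_filter_image_shape(h: int, w: int, i: int, m: int):
    """[width, height] for a depthwise convolution filter stored as HWIM."""
    return (h * w * m, ceil_div(i))


def argument_image_shape(w: int):
    """[width, height] for a 1-D argument such as a bias vector."""
    return (ceil_div(w), 1)


if __name__ == "__main__":
    # A 1x224x224x3 input in each of the three input/output layouts.
    for major in ("channel", "height", "width"):
        print(major, io_image_shape(1, 224, 224, 3, major))
    print("conv filter", conv_filter_image_shape(3, 3, 32, 3))
    print("depthwise filter", depthwise_filter_image_shape(3, 3, 32, 1))
    print("bias", argument_image_shape(10))
```

Because each pixel holds 4 elements, the dimension that is divided by 4 is the one packed into pixels; the other dimensions are multiplied out unchanged.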