Add lite documents (#961)

* add lite * lite clean * update * up * update * refine lite mobile doc (#962) * update * update

0009c8e3 · Yan Chunwei · xsrobin · e0051d50 · 0009c8e3 · 0009c8e3
13 changed files
--- a/doc/fluid/advanced_usage/deploy/mobile/for_developer.md
+++ b/doc/fluid/advanced_usage/deploy/mobile/for_developer.md
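+# Developer Documentation
+
+## Basic Concepts
+
+### Place
+
+The `Place` class describes the context in which a kernel runs: the platform the kernel runs on, the precision of the data it computes on, and the layout of that data. This makes MIR analysis clearer and more accurate. Its main member variables are:
+
+* `TargetType target`: the platform the kernel runs on, such as X86, CUDA, or ARM;
+* `PrecisionType precision`: the precision of the data the kernel computes on, such as Float, Int8, or Fp16;
+* `DataLayoutType layout`: the layout of the data the kernel computes on, such as NCHW or NHWC.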
+
+### OpLite
+
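+The `OpLite` class assists kernel computation and performs no computation itself. Its main interfaces are:
+
+* `CheckShape`: checks whether the dimensions and types of the op's input/output parameters are valid and whether the attributes match the design;
+* `InferShape`: sets the shape of the output Tensor;
+* `CreateKernels`: creates the related kernels;
+* `Attach`: fetches parameter pointers from the `Scope` and `OpDesc` and passes them to the kernel.
+
+The important methods and declarations are: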
+
+```c++
+class OpLite : public Registry {
+ public:
+  OpLite() = default;
+  explicit OpLite(const std::string &type) : op_type_(type) {}
+  explicit OpLite(const std::vector<Place> &valid_places)
+      : valid_places_(valid_places) {}
+
+  // Set the supported places.
+  void SetValidPlaces(const std::vector<Place> &places) {
+    valid_places_ = places;
+    VLOG(3) << "valid places " << valid_places_.size();
+  }
+  // Get the supported places.
+  const std::vector<Place> &valid_places() const { return valid_places_; }
+  // Check the shape.
+  virtual bool CheckShape() const { return true; }
+  // Infer the outputs' shape.
+  virtual bool InferShape() const { return true; }
+  // Run this operator.
+  virtual bool Run();
+
+  // Link the external execution environ to internal context.
+  bool Attach(const cpp::OpDesc &opdesc, lite::Scope *scope);
+
+  // Create all the kernels for the valid targets.
+  std::vector<std::unique_ptr<KernelBase>> CreateKernels(
+      const std::vector<Place> &places, const std::string &kernel_type = "");
+
+  // Assign op param to kernel.
+  virtual void AttachKernel(KernelBase *kernel) = 0;
+};
+```
+
+### KernelLite
+
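+`KernelLite` was introduced to improve kernel support for multiple execution modes across `Target`, `Precision`, and `DataLayout`. Its main characteristics are:
+
+* Kernel implementations can be specialized per `Place` through templates, which strengthens support for different execution modes;
+* It is lightweight: a `KernelLite` is similar to a functor and is only responsible for execution, so it runs more efficiently;
+* Each kernel has an explicit execution mode and can take part in analysis at analysis time;
+* It has few dependencies, which makes it easy to deploy to mobile;
+* `context` information such as hardware scheduling is bound to the specific kernel, so the behavior of each kernel is easy to customize.
+
+The important methods and declarations are: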
+
+```c++
+template <TargetType Target, PrecisionType Precision,
+          DataLayoutType DataLayout = DataLayoutType::kNCHW>
+class KernelLite : public KernelBase {
+ public:
+  // Run the kernel.
+  virtual void Run() { CHECK(false) << "Not Implemented"; }
+  // Get the target.
+  TargetType target() const override { return Target; }
+  // Get the precision.
+  PrecisionType precision() const override { return Precision; }
+  // Get the data layout.
+  DataLayoutType layout() const override { return DataLayout; }
+  Place place() const override { return Place{Target, Precision, DataLayout}; }
+  void Touch() {}
+
+  KernelLite() = default;
+  virtual ~KernelLite() = default;
+};
+```
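
To make the template mechanism concrete, here is a minimal, self-contained sketch of how a `KernelLite`-style template can fix a kernel's place at compile time. The enums, `KernelBase`, `KernelSketch`, `ScaleArmFloat`, and `RunOnce` are simplified stand-ins invented for illustration; they are not the actual Paddle definitions.

```cpp
#include <cassert>

// Hypothetical, reduced versions of the enums used in this document.
enum class TargetType { kUnk, kHost, kX86, kCUDA, kARM };
enum class PrecisionType { kUnk, kFloat, kInt8 };
enum class DataLayoutType { kUnk, kNCHW };

// Minimal kernel interface: run, and report the place the kernel was built for.
struct KernelBase {
  virtual ~KernelBase() = default;
  virtual void Run() = 0;
  virtual TargetType target() const = 0;
  virtual PrecisionType precision() const = 0;
  virtual DataLayoutType layout() const = 0;
};

// The place is carried by the template arguments, so it is known at compile
// time and can be inspected without running the kernel.
template <TargetType Target, PrecisionType Precision,
          DataLayoutType Layout = DataLayoutType::kNCHW>
struct KernelSketch : KernelBase {
  TargetType target() const override { return Target; }
  PrecisionType precision() const override { return Precision; }
  DataLayoutType layout() const override { return Layout; }
};

// A toy ARM/float kernel that doubles a single value in place.
struct ScaleArmFloat : KernelSketch<TargetType::kARM, PrecisionType::kFloat> {
  float value = 1.0f;
  void Run() override { value *= 2.0f; }
};

// Example use: check the reported target, run once, return the result.
inline float RunOnce() {
  ScaleArmFloat kernel;
  assert(kernel.target() == TargetType::kARM);
  kernel.Run();
  return kernel.value;
}
```

Because the place is part of the type, an analysis pass can query `target()`, `precision()`, and `layout()` on any kernel through the base interface before deciding which one to schedule.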
+
+
+
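+## Architecture Overview
+
+In this release, Mobile is upgraded to the lite architecture, which focuses on multi-hardware and high-performance support. Its main design ideas are:
+
+- Introduce a type system to strengthen mixed scheduling across multiple hardware targets, quantization methods, and data layouts
+- Isolate hardware details: any supported hardware can be plugged in or removed through compile-time switches
+- Introduce the concept of MIR (Machine IR) to strengthen optimization in the presence of an execution environment
+- Strictly separate the optimization phase from the execution phase, so that inference stays lightweight and efficient
+
+The architecture diagram is shown below: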
+
+![Paddle Inference Refactor1.0](https://github.com/Superjomn/_tmp_images/raw/master/images/lite.jpg)
+
+
+
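+## How to Add a New Kernel
+
+This section describes how to write a new kernel for an op. In short, a new kernel consists of the following parts:
+
+- Kernel implementation: the definition and implementation of the op's Compute class, which inherits from `KernelLite`. Different kernels are implemented depending on the input data type, the data layout, the device holding the data, and the third-party library called at run time. Server-side CPU kernels are implemented in a .h file.
+- Kernel registration: server-side CPU kernels are registered in a .cc file.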
+
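+## Implementing the C++ Class
+
+Take the CPU kernel of the mul op as an example. The mul kernel performs the matrix multiplication `Out = X * Y`, so the computation has two inputs and one output. The input and output parameters are obtained from the op's param. The param of the mul op is defined as follows: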
+
+```c++
+struct MulParam {
+  const lite::Tensor* x{};
+  const lite::Tensor* y{};
+  lite::Tensor* output{};
+  int x_num_col_dims{1};
+  int y_num_col_dims{1};
+};
+```
+
+The implementation of the `MulCompute` class is defined as follows:
+
+```c++
+template <typename T>
+class MulCompute : public KernelLite<TARGET(kX86), PRECISION(kFloat)> {
+ public:
+  using param_t = operators::MulParam;
+
+  void Run() override {
+    auto& context = ctx_->As<X86Context>();
+    auto& param = *param_.get_mutable<operators::MulParam>();
+    CHECK(context.x86_device_context());
+    
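+    // 1. Allocate memory for the output.
+    param.output->template mutable_data<T>();
+
+    // 2. Get the inputs and output used in the computation.
+    auto* x = &param.x->raw_tensor();
+    auto* y = &param.y->raw_tensor();
+
+    auto* z = &param.output->raw_tensor();
+
+    // 3. Preprocess the input and output data as needed...
+    Tensor x_matrix, y_matrix;
+    if (x->dims().size() > 2) {
+      x_matrix = framework::ReshapeToMatrix(*x, param.x_num_col_dims);
+    } else {
+      x_matrix = *x;
+    }
+    // y is flattened the same way as x, using its own y_num_col_dims.
+    if (y->dims().size() > 2) {
+      y_matrix = framework::ReshapeToMatrix(*y, param.y_num_col_dims);
+    } else {
+      y_matrix = *y;
+    }
+
+    // 4. Call the math library to perform the matrix multiplication...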
+    auto blas = paddle::operators::math::GetBlas<platform::CPUDeviceContext, T>(
+        *context.x86_device_context());
+
+    blas.MatMul(x_matrix, y_matrix, z);
+  }
+
+  virtual ~MulCompute() = default;
+};
+```
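
The kernel above reduces inputs with more than two dimensions to matrices before multiplying. The sketch below illustrates that idea on plain vectors. It assumes the usual mul-op convention that the first `num_col_dims` dimensions are flattened into rows and the remaining ones into columns; `FlattenTo2D` and `MatMul` are hypothetical helpers written for this example, not Paddle functions, and the naive triple loop stands in for the BLAS call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns {rows, cols} after flattening `dims` to 2-D: the first
// `num_col_dims` dimensions form the rows, the rest form the columns.
inline std::pair<int64_t, int64_t> FlattenTo2D(const std::vector<int64_t>& dims,
                                               int num_col_dims) {
  assert(num_col_dims > 0 && num_col_dims < static_cast<int>(dims.size()));
  int64_t rows = 1, cols = 1;
  for (int i = 0; i < static_cast<int>(dims.size()); ++i) {
    (i < num_col_dims ? rows : cols) *= dims[i];
  }
  return {rows, cols};
}

// Naive row-major product: out[m x n] = x[m x k] * y[k x n].
inline std::vector<float> MatMul(const std::vector<float>& x,
                                 const std::vector<float>& y,
                                 int64_t m, int64_t k, int64_t n) {
  assert(static_cast<int64_t>(x.size()) == m * k);
  assert(static_cast<int64_t>(y.size()) == k * n);
  std::vector<float> out(static_cast<std::size_t>(m * n), 0.0f);
  for (int64_t i = 0; i < m; ++i) {
    for (int64_t p = 0; p < k; ++p) {
      for (int64_t j = 0; j < n; ++j) {
        out[i * n + j] += x[i * k + p] * y[p * n + j];
      }
    }
  }
  return out;
}
```

For instance, under this convention a tensor of shape `{2, 3, 4}` with `num_col_dims = 1` is treated as a 2 x 12 matrix, and with `num_col_dims = 2` as a 6 x 4 matrix.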
+
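+`MulCompute` inherits from `KernelLite` and takes the following two template arguments:
+
+- `TARGET(kX86)`: `Target` describes the hardware, such as CUDA/X86/ARM/…, that is, the platform the kernel runs on. This example uses kX86, meaning the mul kernel runs on the X86 platform;
+
+- `PRECISION(kFloat)`: `Precision` describes the data precision the kernel supports. This example uses `kFloat`, meaning the mul kernel supports Float data.
+
+`MulCompute` must override the `Run` interface. The kernel's inputs and output are obtained through `MulParam`, and their variable type is `lite::Tensor`.
+
+This completes the forward mul kernel. Next, the kernel must be registered in a .cc file.
+
+## Registering the Kernel
+
+Register the implemented kernel in the .cc file: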
+
+```c++
+REGISTER_LITE_KERNEL(mul, kX86, kFloat, kNCHW,
+                     paddle::lite::kernels::x86::MulCompute<float>, def)
+    .BindInput("X", {LiteType::GetTensorTy(TARGET(kX86))})
+    .BindInput("Y", {LiteType::GetTensorTy(TARGET(kX86))})
+    .BindOutput("Out", {LiteType::GetTensorTy(TARGET(kX86))})
+    .Finalize();
+```
+
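+In the code above:
+
+- `REGISTER_LITE_KERNEL` registers the `MulCompute` class with its template parameter specialized to float. The op type is mul, the platform is X86, the data precision is float, and the data layout is NCHW;
+- At run time, the framework statically selects a suitable kernel based on the device holding the input data, the input data type, the data layout, and similar information.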
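
As a rough mental model of registration and lookup, the following sketch keeps kernels in a map keyed by op type, target, precision, and layout. Everything here (`KernelRegistry`, `KernelFn`, `MakeDemoRegistry`, and the reduced enums) is hypothetical and only illustrates the idea; the real `REGISTER_LITE_KERNEL` macro and the framework's selection logic are more involved.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical, reduced enums for the example.
enum class Target { kX86, kARM };
enum class Precision { kFloat, kInt8 };
enum class Layout { kNCHW };

// A toy "kernel": a binary function on floats.
using KernelFn = std::function<float(float, float)>;

class KernelRegistry {
 public:
  // Stores a kernel under (op type, target, precision, layout).
  void Register(const std::string& op, Target t, Precision p, Layout l,
                KernelFn fn) {
    assert(fn != nullptr);  // a registered kernel must be callable
    kernels_[Key(op, t, p, l)] = std::move(fn);
  }

  // Returns the kernel registered for exactly this key, or nullptr.
  const KernelFn* Find(const std::string& op, Target t, Precision p,
                       Layout l) const {
    auto it = kernels_.find(Key(op, t, p, l));
    return it == kernels_.end() ? nullptr : &it->second;
  }

 private:
  using Key = std::tuple<std::string, Target, Precision, Layout>;
  std::map<Key, KernelFn> kernels_;
};

// Registers a scalar "mul" for X86/float/NCHW, mirroring the keys used in
// the registration call shown above.
inline KernelRegistry MakeDemoRegistry() {
  KernelRegistry registry;
  registry.Register("mul", Target::kX86, Precision::kFloat, Layout::kNCHW,
                    [](float a, float b) { return a * b; });
  return registry;
}
```

A lookup with the same four keys returns the registered kernel, while a lookup for a place that was never registered (for example ARM/Int8) returns `nullptr`, which is the point at which a framework would fall back to another valid place.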
+
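+## Development Environment
+
+### Development and Testing on Mobile
+
+We provide a Docker image environment for mobile development in `paddle/fluid/lite/tools/Dockerfile.mobile`. The image can be built directly with
+`docker build --file paddle/fluid/lite/tools/Dockerfile.mobile --tag paddle-lite-mobile:latest .`
+
+The image provides:
+
+ - A cross-compilation environment for Android
+ - A cross-compilation environment for ARM Linux
+ - An Android emulator environment
+ - The format-checking tools needed for development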
+
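+#### Related cmake Options
+
+The following build configurations are currently supported for generating programs for different targets.
+
+- `ARM_TARGET_OS`: the target operating system. "android" and "armlinux" are currently supported; the default is "android".
+- `ARM_TARGET_ARCH_ABI`: the target architecture. Accepted values are "armv8", "armv7", and (on ARM Linux) "armv7hf"; the meaning depends on the OS.
+    - With `-DARM_TARGET_OS="android"`:
+        - "armv8" is equivalent to "arm64-v8a". This is the default.
+        - "armv7" is equivalent to "armeabi-v7a".
+    - With `-DARM_TARGET_OS="armlinux"`:
+        - "armv8" is equivalent to "arm64". This is the default.
+        - "armv7hf" is equivalent to using `eabihf` with `-march=armv7-a -mfloat-abi=hard -mfpu=neon-vfpv4`.
+        - "armv7" is equivalent to using `eabi` with `-march=armv7-a -mfloat-abi=softfp -mfpu=neon-vfpv4`.
+- `ARM_TARGET_LANG`: the compiler used for the target. gcc and clang are supported; the default is gcc.
+
+Note: on ARM Linux, only armv8 is currently supported for building and testing.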
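
The option table above can be read as a simple lookup from `(ARM_TARGET_OS, ARM_TARGET_ARCH_ABI)` to the equivalent ABI name. The sketch below encodes exactly the combinations listed; `ResolveAbi` is a hypothetical helper for illustration and is not part of the build system.

```cpp
#include <cassert>
#include <string>

// Maps (ARM_TARGET_OS, ARM_TARGET_ARCH_ABI) to the equivalent ABI name from
// the list above. Returns an empty string for combinations not listed there.
inline std::string ResolveAbi(const std::string& os, const std::string& abi) {
  if (os == "android") {
    if (abi == "armv8") return "arm64-v8a";
    if (abi == "armv7") return "armeabi-v7a";
  } else if (os == "armlinux") {
    if (abi == "armv8") return "arm64";
    if (abi == "armv7hf") return "eabihf";  // hard-float ABI
    if (abi == "armv7") return "eabi";      // softfp ABI
  }
  return "";
}
```

Returning an empty string for unlisted pairs (such as "armv7hf" on Android) makes unsupported combinations easy to detect before invoking cmake.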
+
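+#### Development
+
+Adding a new ARM kernel involves three parts:
+
+1. Add the math computation: add the corresponding math functions under `paddle/fluid/lite/arm/math`. The focus is on optimizing the code itself and making full use of NEON instructions.
+2. Add the kernel declaration and invocation: add the kernel's framework declaration and invocation under `paddle/fluid/lite/kernels/arm`. The focus is on making each kernel correspond strictly to its input and output types.
+3. Add unit tests: add the corresponding unit tests under `paddle/fluid/lite/kernels/arm` and make sure they pass on the emulator or on a real device.
+
+#### Testing
+
+The development image includes Android emulator environments for `arm64-v8a` and `armeabi-v7a`, so unit tests for those platforms can be run conveniently without a real device.
+
+The usual steps are: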
+
+```shell
+# Create an Android AVD (armv8)
+$ echo n | avdmanager create avd -f -n paddle-armv8 -k "system-images;android-24;google_apis;arm64-v8a"
+
+# Start the Android armv8 emulator
+$ ${ANDROID_HOME}/emulator/emulator -avd paddle-armv8 -noaudio -no-window -gpu off -verbose &
+
+# Run the usual test steps here
+
+# Shut down all emulators
+$ adb devices | grep emulator | cut -f1 | while read line; do adb -s $line emu kill; done
+```
+
--- a/doc/fluid/advanced_usage/deploy/mobile/images/Paddle Inference Refactor1.0.jpg
+++ b/doc/fluid/advanced_usage/deploy/mobile/images/Paddle Inference Refactor1.0.jpg
--- a/doc/fluid/advanced_usage/deploy/mobile/images/lite-process.png
+++ b/doc/fluid/advanced_usage/deploy/mobile/images/lite-process.png
--- a/doc/fluid/advanced_usage/deploy/mobile/images/lite_train_process.png
+++ b/doc/fluid/advanced_usage/deploy/mobile/images/lite_train_process.png
--- a/doc/fluid/advanced_usage/deploy/mobile/images/op-kernel-relation.png
+++ b/doc/fluid/advanced_usage/deploy/mobile/images/op-kernel-relation.png
--- a/doc/fluid/advanced_usage/deploy/mobile/index_cn.rst
+++ b/doc/fluid/advanced_usage/deploy/mobile/index_cn.rst
@@ -4,12 +4,12 @@

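 This module introduces Paddle-Mobile, a deep learning framework for embedded platforms under the PaddlePaddle organization. It includes:

-* `Project Overview <mobile_readme.html>`_: a brief introduction to the results, features, and usage of Paddle-Mobile
+* `Project Overview <mobile_index.html>`_: a brief introduction to the features and usage of Paddle-Mobile

-* `Environment Setup <mobile_build.html>`_: how to set up the environment with and without Docker
+* `Developer Documentation <for_developer.html>`_: how to develop extensions and build the mobile inference library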

 .. toctree::
   :hidden:

-   mobile_readme.md
-   mobile_build.md
+   mobile_index.md
+   for_developer.md
--- a/doc/fluid/advanced_usage/deploy/mobile/index_en.rst
+++ b/doc/fluid/advanced_usage/deploy/mobile/index_en.rst
@@ -2,14 +2,3 @@
 Mobile Deployment
 #################

-This section is for a deep learning framework in PaddlePaddle organization —— Paddle-Mobile：
-
-* `Brief Introduction to the Project <mobile_readme_en.html>`_：Brief introduction to effects, features, and user guides of Paddle-Mobile 
-
-* `Build Environment <mobile_build_en.html>`_：How to build environment for Mobile with Docker or without it.
-
-.. toctree::
-   :hidden:
-
-   mobile_readme_en.md
-   mobile_build_en.md
--- a/doc/fluid/advanced_usage/deploy/mobile/mobile_build.md
+++ b/doc/fluid/advanced_usage/deploy/mobile/mobile_build.md
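-# Environment Setup
-## Using Docker
-### 1. Install Docker
-To install Docker, refer to the [official documentation](https://docs.docker.com/install/)
-### 2. Set up the build environment with Docker
-First enter the paddle-mobile directory and run `docker build`.
-Taking Linux/Mac as an example (on Windows, running in 'Docker Quickstart Terminal' is recommended):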
-```
-$ docker build -t paddle-mobile:dev - < Dockerfile
-```
-Use `docker images` to see the newly created image
-```
-$ docker images
-REPOSITORY      TAG     IMAGE ID       CREATED         SIZE
-paddle-mobile   dev     33b146787711   45 hours ago    372MB
-```
-### 3. Build with Docker
-Enter the paddle-mobile directory and run `docker run`
-```
-$ docker run -it --mount type=bind,source=$PWD,target=/paddle-mobile paddle-mobile:dev
-root@5affd29d4fc5:/ # cd /paddle-mobile
-# Generate the Makefile for the Android build
-root@5affd29d4fc5:/ # rm CMakeCache.txt
-root@5affd29d4fc5:/ # cmake -DCMAKE_TOOLCHAIN_FILE=tools/toolchains/arm-android-neon.cmake
-# Generate the Makefile for the Linux build
-root@5affd29d4fc5:/ # rm CMakeCache.txt
-root@5affd29d4fc5:/ # cmake -DCMAKE_TOOLCHAIN_FILE=tools/toolchains/arm-linux-gnueabi.cmake
-```
-### 4. Set build options
-Build options can be set with ccmake
-```
-root@5affd29d4fc5:/ # ccmake .
-                                                     Page 1 of 1
- CMAKE_ASM_FLAGS
- CMAKE_ASM_FLAGS_DEBUG
- CMAKE_ASM_FLAGS_RELEASE
- CMAKE_BUILD_TYPE
- CMAKE_INSTALL_PREFIX             /usr/local
- CMAKE_TOOLCHAIN_FILE             /paddle-mobile/tools/toolchains/arm-android-neon.cmake
- CPU                              ON
- DEBUGING                         ON
- FPGA                             OFF
- LOG_PROFILE                      ON
- MALI_GPU                         OFF
- NET                              googlenet
- USE_EXCEPTION                    ON
- USE_OPENMP                       OFF
-```
-After changing options, press `c` and then `g` to update the Makefile
-### 5. Build
-Use the make command to build
-```
-root@5affd29d4fc5:/ # make
-```
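-### 6. Check the build output
-The build output can be inspected on the host machine, under build and test/build in the paddle-mobile directory. Use adb or scp to transfer it to the device for execution.
-
-## Without Docker
-Without Docker, you can generate the Makefile directly with cmake and then build. Building Android applications with the NDK requires NDK_ROOT to be set correctly. Building Linux applications requires arm-linux-gnueabi-gcc or a similar cross-compilation toolchain; you may need to set the CC and CXX environment variables, modify arm-linux-gnueabi.cmake in tools/toolchains/, or add your own toolchain file.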
--- a/doc/fluid/advanced_usage/deploy/mobile/mobile_build_en.md
+++ b/doc/fluid/advanced_usage/deploy/mobile/mobile_build_en.md
-# Build Environment
-## Use docker 
-### 1. Install docker
-For the installation of docker, please refer to [official document](https://docs.docker.com/install/)
-### 2. Use docker to build environment
-First we enter into the directory of paddle-mobile and run `docker build` .
-Take Linux/Mac as an example (On windows, it is recommended to run in 'Docker Quickstart Terminal' )
-```
-$ docker build -t paddle-mobile:dev - < Dockerfile
-```
-Use `docker images` to show image we created
-```
-$ docker images
-REPOSITORY      TAG     IMAGE ID       CREATED         SIZE
-paddle-mobile   dev     33b146787711   45 hours ago    372MB
-```
-### 3. Use docker to build
-Enter into the directory of paddle-mobile and perform *docker run*
-```
-$ docker run -it --mount type=bind,source=$PWD,target=/paddle-mobile paddle-mobile:dev
-root@5affd29d4fc5:/ # cd /paddle-mobile
-# Generate Makefile in the construction of android
-root@5affd29d4fc5:/ # rm CMakeCache.txt
-root@5affd29d4fc5:/ # cmake -DCMAKE_TOOLCHAIN_FILE=tools/toolchains/arm-android-neon.cmake
-# Generate Makefile in the construction of linux
-root@5affd29d4fc5:/ # rm CMakeCache.txt
-root@5affd29d4fc5:/ # cmake -DCMAKE_TOOLCHAIN_FILE=tools/toolchains/arm-linux-gnueabi.cmake
-```
-### 4. Configure compiling options
-We can configure compiling options with ccmake.
-```
-root@5affd29d4fc5:/ # ccmake .
-                                                     Page 1 of 1
- CMAKE_ASM_FLAGS
- CMAKE_ASM_FLAGS_DEBUG
- CMAKE_ASM_FLAGS_RELEASE
- CMAKE_BUILD_TYPE
- CMAKE_INSTALL_PREFIX             /usr/local
- CMAKE_TOOLCHAIN_FILE             /paddle-mobile/tools/toolchains/arm-android-neon.cmake
- CPU                              ON
- DEBUGING                         ON
- FPGA                             OFF
- LOG_PROFILE                      ON
- MALI_GPU                         OFF
- NET                              googlenet
- USE_EXCEPTION                    ON
- USE_OPENMP                       OFF
-```
-After updating options, we can update Makefile pressing `c`, `g` .
-### 5. Build
-Use command *make* to build
-```
-root@5affd29d4fc5:/ # make
-```
-### 6. Check Output of Building
-Build output can be checked on the host machine, under build and test/build in the paddle-mobile directory. Use adb or scp to transfer it to the device and run it there.
-
-## Without docker
-Without docker, you can use cmake directly to generate the Makefile and then build. Building Android applications with the NDK requires NDK_ROOT to be set correctly. Building Linux applications requires arm-linux-gnueabi-gcc or a similar cross-compilation toolchain; you may also need to set environment variables such as CC and CXX, update arm-linux-gnueabi.cmake in tools/toolchains/, or add your own toolchain file.
--- a/doc/fluid/advanced_usage/deploy/mobile/mobile_index.md
+++ b/doc/fluid/advanced_usage/deploy/mobile/mobile_index.md
+# Paddle-Mobile
+
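+## Introduction
+
+## Usage
+
+Two C++ interfaces are currently available for mobile inference:
+
+- CxxConfig: the full-featured inference interface
+- MobileConfig: a lightweight interface dedicated to mobile
+
+There are also two corresponding Java interfaces:
+
+- loadCxxModel: the full-featured inference interface
+- loadMobileModel: a lightweight interface dedicated to mobile
+
+In each pair, the former takes the original inference model as input, runs the corresponding computation-graph optimizations, and then performs high-performance inference. The latter takes a model that has already been graph-optimized and executes the computation directly.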
+
+### Java Basics
+
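+#### Build
+
+The Java interface requires the cmake options `WITH_LITE`, `LITE_WITH_JAVA`, and `LITE_WITH_ARM` to be enabled together (passed as `-DWITH_LITE=ON` and so on). For example: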
+
+```shell
+# ARM_TARGET_OS in "android" , "armlinux"
+# ARM_TARGET_ARCH_ABI in "armv8", "armv7" ,"armv7hf"
+# ARM_TARGET_LANG in "gcc" "clang"
+mkdir -p build.lite.android.arm8.gcc
+cd build.lite.android.arm8.gcc
+
+cmake .. \
+  -DWITH_GPU=OFF \
+  -DWITH_MKL=OFF \
+  -DWITH_LITE=ON \
+  -DLITE_WITH_JAVA=ON \
+  -DLITE_WITH_CUDA=OFF \
+  -DLITE_WITH_X86=OFF \
+  -DLITE_WITH_ARM=ON \
+  -DLITE_WITH_LIGHT_WEIGHT_FRAMEWORK=ON \
+  -DWITH_TESTING=ON \
+  -DARM_TARGET_OS=android -DARM_TARGET_ARCH_ABI=armv8 -DARM_TARGET_LANG=gcc
+
+make -j4
+```
+
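+After make succeeds, the dynamic library paddle/fluid/lite/api/android/jni/libpaddle_lite_jni.so is generated on Linux (libpaddle_lite_jni.jnilib on Mac, libpaddle_lite_jni.dll on Windows). This is the C++ dynamic link library required by Java JNI (Java Native Interface). The examples below use libpaddle_lite_jni.so on Linux. PaddlePredictor.jar is also generated in the same folder.
+
+#### Building the Android App
+
+In our repository, the Java code lives in paddle/fluid/lite/api/android/jni/src and consists of two classes:
+
+- com.baidu.paddle.lite.PaddlePredictor
+- com.baidu.paddle.lite.Place
+
+You can package them into a .jar or use the Java source directly. If you want a .jar, the one generated by the build in the previous section can be used as-is.
+
+Place the JNI dynamic link library in the architecture-specific subfolder of the jniLibs folder in your Android Studio project. For example, for a phone with the arm8 architecture, put libpaddle_lite_jni.so under src/main/jniLibs/arm8, creating the path if it does not exist.
+
+Next, we describe PaddlePredictor.java and Place.java in detail.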
+
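+#### Code Interface: Place
+
+To make it easier to manage different hardware and other kernel implementation details, Paddle inference defines the following four pieces of information:
+
+- Target: the specific hardware space, for example `ARM` for ARM CPU and `OPEN_CL` for OpenCL
+- DataLayout: the data layout in a Tensor, currently `NCHW`
+- Precision: the kernel's computation precision, or the storage type of a Tensor, currently `FLOAT`, `INT8`, and so on
+- `Device`: the hardware device id, an integer starting from 0
+
+The first three are Java enums and the last is an integer. The relevant definitions are: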
+
+```java
+public enum TargetType {
+  UNKNOWN(0), HOST(1), X86(2), CUDA(3), ARM(4), OPEN_CL(5), ANY(6);
+}
+public enum PrecisionType {
+  UNKNOWN(0), FLOAT(1), INT8(2), INT32(3), ANY(4);
+}
+public enum DataLayoutType {
+  UNKNOWN(0), NCHW(1), ANY(2);
+}
+```
+
+`Place` combines these four pieces of information. Its data structure is:
+
+```java
+public class Place {
+  public TargetType target;
+  public PrecisionType precision;
+  public DataLayoutType layout;
+  public int device;
+};
+```
+
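+`Place` marks the main computation mode of a kernel. For example, a kernel with `place.precision=INT8` is an Int8-quantized kernel. `Place` is exposed to users so that they can specify the hardware, quantization, and other modes for the model.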
+
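+#### Code Interface: PaddlePredictor
+
+All methods provided by PaddlePredictor are native static methods. The overall flow is:
+load the model -> set the inputs -> run the model -> get the outputs / save the optimized model -> clear the loaded model
+
+We describe the main purpose of each step below. For the parameters and return values of each interface, see the Javadoc:
+
+1. Load the model:
+
+	```java
+	// Load an unoptimized original model; the user can set the preferred Place and the valid Places
+	public static native boolean loadCxxModel(String modelPath, Place preferredPlace, Place[] validPlaces); 
+	
+	// Load a model that has already been optimized; its Places are determined by the model itself
+	public static native boolean loadMobileModel(String modelPath);
+	```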
+
+2. Set the inputs:
+
+	```java
+	// Set the dimensions and float data of the input at index offset (starting from 0)
+	public static native boolean setInput(int offset, int[] dims, float[] buf);
+	
+	// Set the dimensions and byte data (int8 on the C++ side) of the input at index offset (starting from 0)
+	public static native boolean setInput(int offset, int[] dims, byte[] buf);
+	```
+
+3. Run the model:
+	
+	```java
+	// Run the model
+	public static native boolean run();
+	```
+
+4. Get the outputs:
+	
+	```java
+	// Get the float output at index offset (starting from 0)
+	public static native float[] getFloatOutput(int offset);
+	// Get the byte output at index offset (starting from 0)
+	public static native byte[] getByteOutput(int offset);
+	// Get the float output of a Var by name
+	public static native float[] fetchFloat(String name);
+	// Get the byte output of a Var by name
+	public static native byte[] fetchByte(String name);
+	```
+
+5. Save the optimized model:
+
+	```java
+	public static native boolean saveOptimizedModel(String modelPath);
+	```
+
+6. Clear the loaded model:
+	
+	```java
+	public static native boolean clear();
+	```
+
+A usage example:
+
+```java
+String modelPath = "lite_naive_model"; // user-defined model path
+
+// User-defined input; in this example, 100 * 100 floats
+float[] inputBuffer = new float[10000];
+for (int i = 0; i < 10000; ++i) {
+  inputBuffer[i] = i;
+}
+int[] dims = {100, 100};
+
+// Cxx model: set the Places
+Place preferredPlace = new Place(Place.TargetType.X86, Place.PrecisionType.FLOAT);
+Place[] validPlaces = new Place[2];
+validPlaces[0] = preferredPlace;
+validPlaces[1] = new Place(Place.TargetType.ARM, Place.PrecisionType.FLOAT);
+
+// Load the model
+PaddlePredictor.loadCxxModel(modelPath, preferredPlace, validPlaces);
+// Set the input
+PaddlePredictor.setInput(0, dims, inputBuffer);
+// Run the predictor
+PaddlePredictor.run();
+// Get the output
+float[] cxxOutput = PaddlePredictor.getFloatOutput(0);
+// Save the optimized model to a new path
+String optimizedModelPath = modelPath + ".opt";
+PaddlePredictor.saveOptimizedModel(optimizedModelPath);
+// Clear the loaded model
+PaddlePredictor.clear();
+
+// Mobile model: load the optimized model
+PaddlePredictor.loadMobileModel(optimizedModelPath);
+// Set the input
+PaddlePredictor.setInput(0, dims, inputBuffer);
+// Run
+PaddlePredictor.run();
+// Get the output
+float[] mobileOutput = PaddlePredictor.getFloatOutput(0);
+```
+
+
+### C++ Basics
+
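+A few basic concepts need to be understood before use:
+
+#### Place
+
+`Place` in C++ is the same concept as in Java. To make it easier to manage different hardware and other kernel implementation details, the following four pieces of information are defined:
+
+- Target: the specific hardware space, for example `kARM` for ARM CPU and `kOpenCL` for OpenCL
+- DataLayout: the data layout in a Tensor, currently `kNCHW`
+- Precision: the kernel's computation precision, or the storage type of a Tensor, currently `kFloat`, `kInt8`, and so on
+- `Device`: the hardware device id, an integer starting from 0
+
+The first three are enum classes and the last is an integer. The relevant definitions are: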
+
+```c++
+enum class TargetType : int {
+  kUnk = 0,
+  kHost,
+  kX86,
+  kCUDA,
+  kARM,
+  kOpenCL,
+  kAny,  // any target
+  NUM,   // number of fields.
+};
+enum class PrecisionType : int {
+  kUnk = 0,
+  kFloat,
+  kInt8,
+  kInt32,
+  kAny,  // any precision
+  NUM,   // number of fields.
+};
+enum class DataLayoutType : int {
+  kUnk = 0,
+  kNCHW,
+  kAny,  // any data layout
+  NUM,   // number of fields.
+};
+```
+
+`Place` combines these four pieces of information. Its data structure is:
+
+```c++
+struct Place {
+  TargetType target{TARGET(kUnk)};
+  PrecisionType precision{PRECISION(kUnk)};
+  DataLayoutType layout{DATALAYOUT(kUnk)};
+  int16_t device{0};  // device ID
+};
+```
+
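+`Place` marks the main computation mode of a kernel. For example, a kernel with `place.precision=kInt8` is an Int8-quantized kernel. `Place` is exposed to the user layer so that users can specify the hardware, quantization, and other execution modes for the model.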
+
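+#### Config
+
+The first step in using the inference interface is to call `CreatePaddlePredictor(config)` to create a predictor. Several config types are currently available, and the template is specialized for each of them to produce a different predictor suited to a different scenario.
+
+The template interface is: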
+
+```c++
+template <typename ConfigT>
+std::shared_ptr<PaddlePredictor> CreatePaddlePredictor(const ConfigT&);
+```
+
+The two configs, `CxxConfig` and `MobileConfig`, are described in detail next.
+
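+### CxxConfig and Its Predictor
+
+The interfaces are:
+
+- `set_model_dir(const std::string& x)`: sets the model path (currently only the two-file model format, `__model__` + `params`, is supported)
+- `set_preferred_place(const Place& x)`: sets the preferred execution Place
+- `set_valid_places(const std::vector<Place>& x)`: sets the valid Places
+
+`valid_places` sets the range of Places on which the model may run; the lower layer uses the place information to pick the concrete hardware kernels. `preferred_place` specifies the Place in `valid_places` with the highest priority, so that kernels for that place are selected first.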
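
The selection rule described above can be sketched as follows: among the kernels whose place is in `valid_places`, prefer one that matches `preferred_place`, otherwise fall back to the first valid one. `PickKernel` and the reduced `Place` struct are hypothetical simplifications written for this example; the framework's real scoring is not shown in this document and may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, reduced enums and Place for the example.
enum class Target { kX86, kARM };
enum class Precision { kFloat, kInt8 };

struct Place {
  Target target;
  Precision precision;
  bool operator==(const Place& other) const {
    return target == other.target && precision == other.precision;
  }
};

// Returns the index in `kernel_places` of the chosen kernel, or -1 if no
// kernel runs on a valid place. A kernel on `preferred` wins; otherwise the
// first kernel on any valid place is used as a fallback.
inline int PickKernel(const std::vector<Place>& kernel_places,
                      const std::vector<Place>& valid_places,
                      const Place& preferred) {
  int fallback = -1;
  for (int i = 0; i < static_cast<int>(kernel_places.size()); ++i) {
    const Place& place = kernel_places[i];
    const bool valid = std::find(valid_places.begin(), valid_places.end(),
                                 place) != valid_places.end();
    if (!valid) continue;
    if (place == preferred) return i;
    if (fallback < 0) fallback = i;
  }
  return fallback;
}
```

With this rule, an op that has both a Float and an Int8 ARM kernel picks the Int8 one when Int8 is valid and preferred, and quietly falls back to Float when Int8 is not among the valid places.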
+
+For example, to run FP32 inference on ARM, you can set:
+
+```c++
+CxxConfig config;
+config.set_model_dir("xxx");  // model_dir is required
+// Set the valid Place information
+config.set_valid_places({Place{TARGET(kARM), PRECISION(kFloat)}});
+// When an op has several candidate kernels, kernels that can run on preferred_place are chosen first.
+config.set_preferred_place(Place{TARGET(kARM), PRECISION(kFloat)});
+```
+
+After creating the config, obtain a predictor to run inference:
+
+```c++
+auto predictor = CreatePaddlePredictor(config);
+```
+
+Get the model's input and output tensors to fill in or read data.
+
+The Tensors here are handles, and it is best to reuse them.
+
+```c++
+auto x_tensor = predictor->GetInput(0/*index*/);
+// The 0 here is the offset in the input sequence; the order is determined by what save_inference_model stored during training
+// Note that x_tensor is a unique_ptr, that is, a handle to the input; users can reuse
+// this handle for every batch.
+auto out_tensor = predictor->GetOutput(0/*index*/);
+// out_tensor is read-only
+```
+
+The Tensor here provides the detailed information users need. It is defined as follows, and users are free to use its other interfaces:
+
+```c++
+struct Tensor {
+  void Resize(const shape_t& shape);
+
+  /// Readonly data.
+  template <typename T>
+  const T* data() const;
+
+  template <typename T>
+  T* mutable_data() const;
+
+  /// Shape of the tensor.
+  shape_t shape() const;
+};
+```
+
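+Continuing the example above, `x_tensor` is the Tensor of input `0` and is writable. The input can be prepared like this:
+
+```c++
+// Set batch_size=10, with the remaining dimensions 200 and 30
+// Note that the dimensions must be adjusted to match the actual model
+x_tensor->Resize({10, 200, 30});
+// After Resize updates the shape, call mutable_data to actually allocate the memory
+auto x_data = x_tensor->mutable_data<float>();
+// x_data can then be filled freely, for example with memcpy(x_data, some_data, some_size);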
+```
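
When filling `x_data` with `memcpy`, the byte count must match the shape passed to `Resize`. The sketch below shows one way to derive it. It assumes `shape_t` is a vector of 64-bit integers, which this document does not state; `NumElements` and `FillInput` are hypothetical helpers, and a plain `float*` stands in for the pointer returned by `mutable_data<float>()`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed shape type for the example.
using shape_t = std::vector<int64_t>;

// Number of elements described by a shape: the product of its dimensions.
inline int64_t NumElements(const shape_t& shape) {
  int64_t count = 1;
  for (int64_t dim : shape) {
    assert(dim >= 0);
    count *= dim;
  }
  return count;
}

// Copies `src` into `dst`, which must point to a buffer already sized for
// `shape` (as mutable_data would provide after Resize).
inline void FillInput(float* dst, const shape_t& shape,
                      const std::vector<float>& src) {
  assert(dst != nullptr);
  assert(static_cast<int64_t>(src.size()) == NumElements(shape));
  std::memcpy(dst, src.data(), src.size() * sizeof(float));
}
```

For the shape `{10, 200, 30}` used above, `NumElements` gives 60000, so the copy size is `60000 * sizeof(float)` bytes.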
+
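+A model may have several inputs. As with `x_tensor` above, call `GetInput(i)` to get the other tensors and fill them.
+
+Once the inputs are ready, run inference:
+
+```c++
+// Run inference: the model computes from the input tensors set above and fills the output tensors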
+predictor->Run();
+```
+
+After the run finishes, read the data of the output tensor:
+
+```c++
+// Get the shape of the output tensor
+auto out_shape = out_tensor->shape();
+
+// Get the data itself, which is a contiguous block of memory
+const auto* out_data = out_tensor->data<float>();
+```
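
Since the output is one contiguous block of floats, post-processing is ordinary pointer arithmetic. As an illustration only, and assuming the model is a classifier whose output holds one score per class, the index of the highest score could be found as below; `ArgMax` is a hypothetical helper, not part of the predictor API.

```cpp
#include <cassert>
#include <cstddef>

// Returns the index of the largest value in data[0..count).
// On ties, the first occurrence wins.
inline std::size_t ArgMax(const float* data, std::size_t count) {
  assert(data != nullptr && count > 0);
  std::size_t best = 0;
  for (std::size_t i = 1; i < count; ++i) {
    if (data[i] > data[best]) best = i;
  }
  return best;
}
```

In practice `count` would be the product of the dimensions in `out_shape`, and `data` the pointer returned by `out_tensor->data<float>()`.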
+
+### MobileConfig
+
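+The basic usage of `MobileConfig` is similar to that of `CxxConfig`. The differences are:
+
+- CxxConfig runs the complete inference pipeline, including heavier logic such as graph analysis
+  - Its input is the original inference model, with no offline processing needed
+  - The model produced by graph analysis and optimization can be saved (via the SaveOptimizedModel interface) and used with `MobileConfig`
+- MobileConfig takes into account the storage and initialization-time limits of mobile apps, so it drops graph analysis and only runs inference itself
+  - It is more lightweight
+  - Its input model must already be optimized by graph analysis (processed offline with CxxConfig)
+
+Because the input model of MobileConfig must already be optimized, the Place of each kernel is determined by the input model. MobileConfig therefore has no interface for specifying Places as CxxConfig does; currently it only has an interface for setting the model path:
+
+-  `void set_model_dir(const std::string& x)`
+
+The remaining steps for using MobileConfig are exactly the same as for CxxConfig.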
+
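+### Introduction to GenCode
+
+Mobile supports combining the model with the inference library and converting them into C++ code, which is then merged into a single library. The conversion is done by running `paddle_code_generator` with the appropriate arguments on the device.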
+
+### INT8 Quantized Inference
+
+Paddle-Mobile supports inference on models obtained through quantization training in [PaddleSlim](https://github.com/PaddlePaddle/models/tree/develop/PaddleSlim).
+
+It is used as follows:
+
+```c++
+CxxConfig config;
+config.set_model_dir("xxx");  // model_dir is required
+// The ARM Int8 mode only includes a few quantized kernels such as Conv and MUL, so Float kernels must be selected as well
+config.set_valid_places({Place{TARGET(kARM), PRECISION(kInt8)},  // Int8 compute kernels
+                         Place{TARGET(kARM), PRECISION(kFloat)}  // Float is also selected as a fallback
+                        });
+// Both kInt8 and kFloat kernels are selected above; below, kInt8 kernels are set as the preferred choice
+config.set_preferred_place(Place{TARGET(kARM), PRECISION(kInt8)});
+```
+
+This feature has been verified on Mobilenetv1 and is still under active development.
+
+
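+## Building from Source
+
+### ARM CPU
+
+Cross-compilation for ARM v8 and v7 is currently supported. The environment can be created directly as a Docker image from `paddle/fluid/lite/tools/Dockerfile.mobile`.
+
+- Main cmake options
+
+    - `ARM_MATH_LIB_DIR`: the path to the ARM math libraries, which can be downloaded from the path given on the official website.
+    - `ARM_TARGET_OS`: the target operating system. "android" and "armlinux" are currently supported; the default is "android".
+    - `ARM_TARGET_ARCH_ABI`: the target architecture. "armv8" and "armv7" are accepted; the meaning depends on the OS.
+        - With `-DARM_TARGET_OS="android"`:
+            - "armv8" is equivalent to "arm64-v8a". This is the default.
+            - "armv7" is equivalent to "armeabi-v7a".
+        - With `-DARM_TARGET_OS="armlinux"`:
+            - "armv8" is equivalent to "arm64". This is the default and currently the only supported value.
+    - `ARM_TARGET_LANG`: the compiler used for the target. gcc and clang are supported; the default is gcc.
+
+- Example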
+	
+	```shell
+	# ARM_TARGET_OS in "android" , "armlinux"
+	# ARM_TARGET_ARCH_ABI in "armv8", "armv7" ,"armv7hf"
+	# ARM_TARGET_LANG in "gcc" "clang"
+	cmake .. \
+	    -DWITH_GPU=OFF \
+	    -DWITH_MKL=OFF \
+	    -DWITH_LITE=ON \
+	    -DLITE_WITH_CUDA=OFF \
+	    -DLITE_WITH_X86=OFF \
+	    -DLITE_WITH_ARM=ON \
+	    -DLITE_WITH_LIGHT_WEIGHT_FRAMEWORK=ON \
+        -DARM_MATH_LIB_DIR="<to_arm_math_libs_path>" \
+	    -DWITH_TESTING=ON \
+	    -DARM_TARGET_OS="android" -DARM_TARGET_ARCH_ABI="armv8" -DARM_TARGET_LANG="gcc"
+	make -j4
+	```
+
+### OpenCL
+
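+Paddle-Mobile supports running OpenCL-based programs on Android. Cross-compilation for armv8 and armv7 is currently provided.
+
+#### Build
+
+- Build environment: create a Docker image from `paddle/fluid/lite/tools/Dockerfile.mobile`.
+- cmake build options
+    * `ARM_TARGET_OS`: the target operating system. Only "android" is currently supported, and it is also the default.
+    * `ARM_TARGET_ARCH_ABI`: the target architecture. "armv8" and "armv7" are accepted. "armv8" is equivalent to "arm64-v8a" and is the default; "armv7" is equivalent to "armeabi-v7a".
+    * `ARM_TARGET_LANG`: the compiler used for the target. gcc and clang are supported; the default is gcc.
+- Example
+
+	```shell
+	# ARM_TARGET_OS in "android"
+	# ARM_TARGET_ARCH_ABI in "armv8", "armv7"
+	# ARM_TARGET_LANG in "gcc" "clang"
+	# Assume we are in the source root directory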
+	mkdir build_opencl && cd build_opencl
+	cmake .. \
+	    -DLITE_WITH_OPENCL=ON \
+	    -DWITH_GPU=OFF \
+	    -DWITH_MKL=OFF \
+	    -DWITH_LITE=ON \
+	    -DLITE_WITH_CUDA=OFF \
+	    -DLITE_WITH_X86=OFF \
+	    -DLITE_WITH_ARM=ON \
+	    -DLITE_WITH_LIGHT_WEIGHT_FRAMEWORK=ON \
+	    -DWITH_TESTING=ON \
+	    -DARM_TARGET_OS="android" -DARM_TARGET_ARCH_ABI="armv8" -DARM_TARGET_LANG="gcc"
+	# Full build
+	make -j4
+	# Or build a single target
+	make test_mobilenetv1_lite -j4
+	make test_cl_runtime -j4
+	make test_elementwise_add_opencl -j4
+	make test_pool_opencl -j4
+	```
+
+#### Run
+
+- Prepare the files needed at run time
+
+Use the following commands to push the files that the OpenCL program loads at run time to the phone (assuming we are in the source root directory):
+
+```
+# Push all files to the /data/local/tmp/opencl directory
+adb shell mkdir -p /data/local/tmp/opencl
+# Push the OpenCL kernel files to /data/local/tmp/opencl
+adb push paddle/fluid/lite/opencl/cl_kernel /data/local/tmp/opencl
+# Push the mobilenet_v1 model files to /data/local/tmp/opencl
+adb push build_opencl/third_party/install/mobilenet_v1 /data/local/tmp/opencl
+# Push the OpenCL test program (e.g. test_mobilenetv1_lite) to /data/local/tmp/opencl
+adb push paddle/fluid/lite/api/test_mobilenetv1_lite /data/local/tmp/opencl
+```
+
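+- Run the OpenCL program
+
+Use the following command to run the OpenCL program. `--cl_path` specifies the directory containing the OpenCL kernel files (cl\_kernel), and `--model_dir` specifies the directory containing the model files.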
+
+```shell
+adb shell
+cd /data/local/tmp/opencl && ./test_mobilenetv1_lite --cl_path=. --model_dir=mobilenet_v1
+```
+
--- a/doc/fluid/advanced_usage/deploy/mobile/mobile_readme.md
+++ b/doc/fluid/advanced_usage/deploy/mobile/mobile_readme.md
-# Project Overview
-
-<!--[![Release](https://img.shields.io/github/release/PaddlePaddle/Paddle-Mobile.svg)](https://github.com/PaddlePaddle/Paddle-Mobile/releases)
-[![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)-->
-
-
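-Welcome to the Paddle-Mobile GitHub project. Paddle-Mobile is a project under the PaddlePaddle organization and a deep learning framework dedicated to embedded platforms.
-
-## Features
-
- High-performance support for ARM CPU
- Support for Mali GPU
- Support for Adreno GPU
- Support for GPU Metal on Apple devices
- Support for FPGA development boards such as ZU5 and ZU9
- Support for arm-linux development boards such as Raspberry Pi
-
-## Demo
-[ANDROID](https://github.com/xiebaiyuan/paddle-mobile-demo)
-
-### Original Demo Directory
-
-See [here](https://github.com/PaddlePaddle/paddle-mobile/tree/develop/demo)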
-
-## Documentation
-
-### Design Documentation
-
-For the paddle-mobile design documentation, see [here](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/design_doc.md). If you want to learn more, the [Issues](https://github.com/PaddlePaddle/paddle-mobile/issues) contain many early designs and discussions.
-
-
-### Development Documentation
-
-The development documentation mainly covers building, running, and related topics. As a developer, you can use it together with the contribution documentation.
-
-* [iOS](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_ios.md)
-* [Android_CPU](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_android.md)
-* [Android_GPU](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_android_GPU.md)
-* [FPGA](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_fpga.md)
-* [ARM_LINUX](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_arm_linux.md)
-
-### Contributing Code
-
- [Contributing code](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/CONTRIBUTING.md)
-
- The document above covers the main contribution workflow. If you run into other problems in practice, you can open an [Issue](https://github.com/PaddlePaddle/paddle-mobile/issues). We will handle it as soon as possible.
-
-
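-## Obtaining Models
-Paddle-Mobile currently only supports models trained with Paddle Fluid. Models of other kinds must be converted before they can be run.
-
-### 1. Train directly with Paddle Fluid
-
-This is the most reliable and the recommended approach.
-
-### 2. Convert a Caffe model to a Paddle Fluid model
-
-See [here](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/caffe2fluid)
-
-### 3. ONNX
-
-ONNX stands for "Open Neural Network Exchange". The goal of the project is to make different neural network frameworks interoperable.
-
-Besides training Fluid models directly with PaddlePaddle, some Paddle Fluid models can also be obtained through ONNX conversion.
-
-Baidu is currently also working on ONNX support. The conversion project is [here](https://github.com/PaddlePaddle/paddle-onnx)
-
-### 4. Download some test models and test images
-
-[Download link](http://mms-graph.bj.bcebos.com/paddle-mobile%2FmodelsAndImages.zip)
-
- Test input data can be generated with the scripts under `tools/python/imagetools` in this repository.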
-
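-## Communication and Feedback
- You are welcome to submit questions, reports, and suggestions through [Github Issues](https://github.com/PaddlePaddle/Paddle/issues)
- QQ group: 696965088 (Paddle-Mobile)
- [Forum](http://ai.baidu.com/forum/topic/list/168): you are welcome to share problems and experience with PaddlePaddle on the PaddlePaddle forum and help build a good community atmosphere
-
-## Copyright and License
-Paddle-Mobile is released under the relatively permissive Apache-2.0 open-source license [Apache-2.0 license](LICENSE)
-
-
-## Old Version: Mobile-Deep-Learning
-The original MDL (Mobile-Deep-Learning) project has been moved to [Mobile-Deep-Learning](https://github.com/allonli/mobile-deep-learning)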
-
--- a/doc/fluid/advanced_usage/deploy/mobile/mobile_readme_en.md
+++ b/doc/fluid/advanced_usage/deploy/mobile/mobile_readme_en.md
-# Brief Introduction to the Project
-
-<!--[![Release](https://img.shields.io/github/release/PaddlePaddle/Paddle-Mobile.svg)](https://github.com/PaddlePaddle/Paddle-Mobile/releases)
-[![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)-->
-
-
-Welcome to Paddle-Mobile GitHub project. Paddle-Mobile is a project of PaddlePaddle as well as a deep learning framework for embedded platforms.
-
-## Features
-
- high performance in support of ARM CPU
- support Mali GPU
- support Adreno GPU
- support the realization of GPU Metal on Apple devices
- support implementation on ZU5、ZU9 and other FPGA-based development boards
- support implementation on Raspberry Pi and other arm-linux development boards
-
-## Demo
- [ANDROID](https://github.com/xiebaiyuan/paddle-mobile-demo)
-
-### Catalog of original Demo
-
-[https://github.com/PaddlePaddle/paddle-mobile/tree/develop/demo](https://github.com/PaddlePaddle/paddle-mobile/tree/develop/demo)
-
-## Documentation
-
-### Documentation of design
-
-If you want to know more details about the documentation of paddle-mobile design, please refer to [documentation of design](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/design_doc.md) . There are many previous designs and discussions: [issue](https://github.com/PaddlePaddle/paddle-mobile/issues).
-
-
-
-### Documentation of development
-
-Documentation of development is mainly about building, running and other tasks. As a developer, you can use it with the help of contributed documents.
-
- [iOS](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_ios.md)
- [Android_CPU](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_android.md)
- [Android_GPU](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_android_GPU.md)
- [FPGA](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_fpga.md)
- [ARM_LINUX](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/doc/development_arm_linux.md)
-
-### How to contribute your documents
- [tutorial link to contribute documents](https://github.com/PaddlePaddle/paddle-mobile/blob/develop/CONTRIBUTING.md)
- Main procedure of contributing code is covered in the document above. If you have other problems during the procedure, please send them as [issue](https://github.com/PaddlePaddle/paddle-mobile/issues). We will deal with it as quickly as possible.
-
-
-## Acquisition of Models
-At present Paddle-Mobile only supports models trained with Paddle Fluid. Models trained with other frameworks must be converted before they can be run.
-### 1. Use Paddle Fluid directly to train
-It is the most reliable method to be recommended
-### 2. Transform Caffe to Paddle Fluid model
-[https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/caffe2fluid](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/caffe2fluid)
-### 3. ONNX
-ONNX is the acronym of Open Neural Network Exchange. The project is aimed to make a full communication and usage among different neural network development frameworks.
-
-Except for directly using fluid models trained by PaddlePaddle, you can also get certain Paddle fluid models through onnx transformation.
-
-At present，work in support of onnx is also under operation in Baidu. Related transformation project can be referred to here：
-[https://github.com/PaddlePaddle/paddle-onnx](https://github.com/PaddlePaddle/paddle-onnx)
-
-### 4. Download parts of testing models and testing pictures
-[http://mms-graph.bj.bcebos.com/paddle-mobile%2FmodelsAndImages.zip](http://mms-graph.bj.bcebos.com/paddle-mobile%2FmodelsAndImages.zip)
-
-<!--## Online output of simple search
-
-Gif as following is the application output of online main part detection of simple search app
-![ezgif-1-050a733dfb](http://otkwwi4x8.bkt.clouddn.com/2018-07-05-ezgif-1-050a733dfb.gif)-->
-
- input data generated by tools from `tools/python/imagetools`.
-
-## Communication
- [Github Issues](https://github.com/PaddlePaddle/Paddle/issues): bug reports, feature requests, install issues, usage issues, etc.
- QQ discussion group: 696965088 (Paddle-Mobile).
- [Forums](http://ai.baidu.com/forum/topic/list/168?pageNo=1): discuss implementations, research, etc.
-
-## Copyright and License
-Paddle-Mobile is released under the relatively permissive Apache-2.0 open-source license [Apache-2.0 license](LICENSE).
-
-
-## Old version Mobile-Deep-Learning
-Original MDL(Mobile-Deep-Learning) project has been transferred to [Mobile-Deep-Learning](https://github.com/allonli/mobile-deep-learning)
--- a/Anakin @ beec126e
+++ b/Anakin @ beec126e
-Subproject commit 65178d41c3a61ba846f1e94909e3cb50a8c19c92
+Subproject commit beec126e4cfe762e4b6b542496069323dca35ee7