diff --git a/caffe2fluid/doc/Accuracy.md b/caffe2fluid/doc/Accuracy.md
new file mode 100644
index 0000000000000000000000000000000000000000..8cf0396736d32d7212cd8799158f10c3678ff58b
--- /dev/null
+++ b/caffe2fluid/doc/Accuracy.md
@@ -0,0 +1,39 @@
+## Accuracy
+
+
+### [Accuracy](http://caffe.berkeleyvision.org/tutorial/layers/accuracy.html)
+```
+layer {
+    name: "accuracy"
+    type: "Accuracy"
+    bottom: "pred"
+    bottom: "label"
+    top: "accuracy"
+    include{
+	phase: TEST
+    }
+}
+```
+
+
+### [paddle.fluid.layers.accuracy](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-253-accuracy)
+```python
+paddle.fluid.layers.accuracy(
+    input,
+    label,
+    k = 1,
+    correct = None,
+    total = None
+)
+```  
+
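+### Functional differences
+#### Difference in computation
+Caffe: counts top-1 correct predictions by default; the layer shown above sets no `top_k` in `accuracy_param`, so only the highest-scoring class is compared with the label.  
+PaddlePaddle: set `k` to count a prediction as correct when the label is among the k highest-scoring classes.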
+
+
+
+
+
+
diff --git a/caffe2fluid/doc/ArgMax.md b/caffe2fluid/doc/ArgMax.md
new file mode 100644
index 0000000000000000000000000000000000000000..a2144ff43eaaf97c0bba7f21418d1320b3d92895
--- /dev/null
+++ b/caffe2fluid/doc/ArgMax.md
@@ -0,0 +1,31 @@
+## ArgMax
+
+
+### [ArgMax](http://caffe.berkeleyvision.org/tutorial/layers/argmax.html)
+```
+layer {
+    name: "argmax"
+    type: "ArgMax"
+    bottom: "data"
+    top: "argmax"	
+    argmax_param{
+	out_max_val: false
+	top_k: 1
+	axis: 0
+    }
+}
+```
+
+
+### [paddle.fluid.layers.argmax](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-204-argmax)
+```python
+paddle.fluid.layers.argmax(
+    x,
+    axis = 0
+)
+```  
+
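+### Functional differences
+#### Difference in output
+Caffe: set `top_k` to output the indices of the k largest values, and set `out_max_val` to true to output the k largest values themselves.  
+PaddlePaddle: outputs only the index of the maximum value.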
diff --git a/caffe2fluid/doc/BatchNorm.md b/caffe2fluid/doc/BatchNorm.md
new file mode 100644
index 0000000000000000000000000000000000000000..b245da99685422a987a04bdca40f9d09c427f874
--- /dev/null
+++ b/caffe2fluid/doc/BatchNorm.md
@@ -0,0 +1,57 @@
+## BatchNorm
+
+
+### [BatchNorm](http://caffe.berkeleyvision.org/tutorial/layers/batchnorm.html)
+```
+layer {
+    name: "bn"
+    type: "BatchNorm"
+    bottom: "data"
+    top: "bn"
+    batch_norm_param{
+        use_global_stats: true
+    	moving_average_fraction: 0.999
+    	eps: 0.00001
+    }
+}
+```
+
+
+### [paddle.fluid.layers.batch_norm](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-36-batch_norm)
+```python
+paddle.fluid.layers.batch_norm(
+    input, 
+    act=None, 
+    is_test=False, 
+    momentum=0.9, 
+    epsilon=1e-05, 
+    param_attr=None, 
+    bias_attr=None, 
+    data_layout='NCHW', 
+    in_place=False, 
+    name=None, 
+    moving_mean_name=None, 
+    moving_variance_name=None, 
+    do_model_average_for_mean_and_var=False, 
+    fuse_with_relu=False, 
+    use_global_stats=False
+)
+```  
+
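+### Functional differences
+#### Difference in output
+Caffe: the layer only normalizes with the mean and variance; it applies no scale-and-shift transform, so it must be followed by a Scale layer to complete batch normalization.  
+PaddlePaddle: performs both normalization and the scale-and-shift transform, i.e. the complete Batch Normalization.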
+
+
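+#### Difference in input parameters
+Caffe: takes 3 parameters: the mean vector, the variance vector, and the moving-average scale factor.  
+PaddlePaddle: takes 4 parameters: the mean vector, the variance vector, the scale parameter `param_attr`, and the shift parameter `bias_attr`.
+#### Difference in computation
+Caffe: the stored mean and variance must first be combined with the moving-average scale factor to obtain the actual mean and variance, which are then used for normalization.  
+PaddlePaddle: normalizes directly with the given mean and variance.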
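+
+The Caffe computation above can be written out as follows. This is a sketch that assumes the three BatchNorm blobs hold an accumulated mean, an accumulated variance, and a scale factor; the symbols $$\mu_{blob}$$, $$\sigma^2_{blob}$$ and $$s$$ are introduced here for illustration only and are not names from either API:
+$$
+\mu=\frac{\mu_{blob}}{s},\quad \sigma^2=\frac{\sigma^2_{blob}}{s},\quad y=\frac{x-\mu}{\sqrt{\sigma^2+eps}}
+$$
+Under that assumption, the values to carry over to PaddlePaddle's moving mean and moving variance would be the divided $$\mu$$ and $$\sigma^2$$ rather than the raw blob contents.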
+
+
+#### Other differences
+Caffe: the activation function must be applied by a separate layer.  
+PaddlePaddle: `act` and `fuse_with_relu` control whether an activation is applied after Batch Normalization.
diff --git a/caffe2fluid/doc/Convolution.md b/caffe2fluid/doc/Convolution.md
new file mode 100644
index 0000000000000000000000000000000000000000..b835e0b74b2778b5f3cda9115fbc81a5b71d6d75
--- /dev/null
+++ b/caffe2fluid/doc/Convolution.md
@@ -0,0 +1,65 @@
+## Convolution
+
+
+### [Convolution](http://caffe.berkeleyvision.org/tutorial/layers/convolution.html)
+```
+layer{
+    name: "conv"
+    type: "Convolution"
+    bottom: "data"
+    top: "conv"
+    # local learning-rate and weight-decay multipliers for the kernel
+    param{
+        lr_mult: 1
+        decay_mult: 1
+    }
+    # local learning-rate and weight-decay multipliers for the bias
+    param{
+        lr_mult: 2
+        decay_mult: 0
+    }
+    convolution_param{
+        num_output: 20    # required
+        kernel_size: 5    # required
+        stride: 1
+        pad: 0
+        group: 1
+        bias_term: true
+        weight_filler {
+            type: "gaussian"
+            std: 0.01
+        }
+        bias_filler {
+            type: "constant"
+            value: 0
+        }
+    }
+}
+```
+
+
+### [paddle.fluid.layers.conv2d](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-45-conv2d)
+```python
+paddle.fluid.layers.conv2d(
+    input,
+    num_filters,
+    filter_size,
+    stride = 1,
+    padding = 0,
+    dilation = 1,
+    groups = None,
+    param_attr=None,
+    bias_attr=None,
+    use_cudnn=True,
+    act=None,
+    name=None
+)
+```  
+
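+### Functional differences
+#### Difference in parameter initialization
+Caffe: the first `param` sets the local learning-rate and weight-decay multipliers for the kernel, and the second `param` sets them for the bias; the initializers for the kernel and the bias are set in `convolution_param` (`weight_filler` and `bias_filler`); `bias_term` controls whether a bias is used.  
+PaddlePaddle: all of these kernel and bias settings are collected in one argument each, `param_attr` and `bias_attr`. Both default to None; otherwise each takes a ParamAttr object, which is created with `paddle.fluid.ParamAttr(name=None, initializer=None, learning_rate=1.0, regularizer=None, trainable=True, gradient_clip=None, do_model_average=False)`. `bias_attr` may also be a boolean that indicates whether a bias is used.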
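+#### Use of dilated convolution
+Caffe: the definition above uses no dilation; where available, it is set through the `dilation` field of `convolution_param` (default 1), which older Caffe releases lack.  
+PaddlePaddle: set `dilation` to use dilated convolution.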
diff --git a/caffe2fluid/doc/Crop.md b/caffe2fluid/doc/Crop.md
new file mode 100644
index 0000000000000000000000000000000000000000..b56fe8c795490ddc510a52f3eb315e513b682cf6
--- /dev/null
+++ b/caffe2fluid/doc/Crop.md
@@ -0,0 +1,66 @@
+## Crop
+
+
+### [Crop](http://caffe.berkeleyvision.org/tutorial/layers/crop.html)
+```
+layer {
+    name: "crop"
+    type: "Crop"
+    bottom: "data1"
+    bottom: "data2"
+    top: "crop"
+    crop_param{
+        axis: 1
+        offset: 0
+        offset: 2
+    }
+}
+```
+
+
+### [paddle.fluid.layers.crop](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-51-crop)
+```python
+paddle.fluid.layers.crop(
+    x, 
+    shape=None, 
+    offsets=None, 
+    name=None
+)
+```  
+
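+### Functional differences
+#### Difference in the crop reference input
+Caffe: the crop reference must be another input blob.  
+PaddlePaddle: the crop reference can be a Variable, or a list or tuple giving the size of every dimension.
+#### Difference in the crop offsets
+Caffe: offsets are set only for the dimensions being cropped.  
+PaddlePaddle: an offset must be set for every dimension.
+### Code example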
+```  
+# Caffe example:
+# data1 input shape: (20, 3, 128, 128)
+# data2 input shape: (20, 2, 64, 64)
+layer {
+    name: "crop"
+    type: "Crop"
+    bottom: "data1"
+    bottom: "data2"
+    top: "crop"
+    crop_param{
+        axis: 1
+        offset: 0
+        offset: 25
+        offset: 25
+    }
+}
+# output shape: (20, 2, 64, 64)
+```  
+```python
+# PaddlePaddle example:
+# inputs1 input shape: (20, 3, 128, 128)
+# inputs2 input shape: (20, 2, 64, 64)
+output1 = fluid.layers.crop(x = inputs1, shape=inputs2, offsets=[0,0,25,25])
+# output shape: (20, 2, 64, 64)
+output2 = fluid.layers.crop(x = inputs1, shape=[20,2,64,64], offsets=[0,0,25,25])
+# output shape: (20, 2, 64, 64), identical to output1
+```
diff --git a/caffe2fluid/doc/Deconvolution.md b/caffe2fluid/doc/Deconvolution.md
new file mode 100644
index 0000000000000000000000000000000000000000..c1505fcc18b30a409b879355c3028f50df59e05d
--- /dev/null
+++ b/caffe2fluid/doc/Deconvolution.md
@@ -0,0 +1,66 @@
+## Deconvolution
+
+
+### [Deconvolution](http://caffe.berkeleyvision.org/tutorial/layers/deconvolution.html)
+```
+layer{
+    name: "deconv"
+    type: "Deconvolution"
+    bottom: "data"
+    top: "conv"
+    # local learning-rate and weight-decay multipliers for the kernel
+    param{
+        lr_mult: 1
+        decay_mult: 1
+    }
+    # local learning-rate and weight-decay multipliers for the bias
+    param{
+        lr_mult: 2
+        decay_mult: 0
+    }
+    convolution_param{
+        num_output: 20    # required
+        kernel_size: 3    # required
+        stride: 1
+        pad: 0
+        group: 1
+        bias_term: true
+        weight_filler {
+            type: "gaussian"
+            std: 0.01
+        }
+        bias_filler {
+            type: "constant"
+            value: 0
+        }
+    }
+}
+```
+
+
+### [paddle.fluid.layers.conv2d_transpose](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-46-conv2d_transpose)
+```python
+paddle.fluid.layers.conv2d_transpose(
+    input,
+    num_filters,
+    output_size = None,
+    filter_size = None,
+    stride = 1,
+    padding = 0,
+    dilation = 1,
+    groups = None,
+    param_attr=None,
+    bias_attr=None,
+    use_cudnn=True,
+    act=None,
+    name=None
+)
+```  
+
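+### Functional differences
+#### Difference in parameter initialization
+
+Caffe: the first `param` sets the local learning-rate and weight-decay multipliers for the kernel, and the second `param` sets them for the bias; the initializers for the kernel and the bias are set in `convolution_param` (`weight_filler` and `bias_filler`); `bias_term` controls whether a bias is used.  
+PaddlePaddle: all of these kernel and bias settings are collected in one argument each, `param_attr` and `bias_attr`. Both default to None; otherwise each takes a ParamAttr object, which is created with `paddle.fluid.ParamAttr(name=None, initializer=None, learning_rate=1.0, regularizer=None, trainable=True, gradient_clip=None, do_model_average=False)`. `bias_attr` may also be a boolean that indicates whether a bias is used.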
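+#### Use of dilated convolution
+Caffe: the definition above uses no dilation; where available, it is set through the `dilation` field of `convolution_param` (default 1), which older Caffe releases lack.  
+PaddlePaddle: set `dilation` to use dilated convolution.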
diff --git a/caffe2fluid/doc/Dropout.md b/caffe2fluid/doc/Dropout.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1be734bb4c010938eb234679cf1ecd8bb81614a
--- /dev/null
+++ b/caffe2fluid/doc/Dropout.md
@@ -0,0 +1,33 @@
+## Dropout
+
+
+### [Dropout](http://caffe.berkeleyvision.org/tutorial/layers/dropout.html)
+```
+layer {
+    name: "dropout"
+    type: "Dropout"
+    bottom: "data"
+    top: "dropout"
+    dropout_param{
+	dropout_ratio: 0.5
+    }
+}
+```
+
+
+### [paddle.fluid.layers.dropout](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-56-dropout)
+```python
+paddle.fluid.layers.dropout(
+    x, 
+    dropout_prob, 
+    is_test=False, 
+    seed=None, 
+    name=None, 
+    dropout_implementation='downgrade_in_infer'
+)
+```  
+
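+### Functional differences
+#### Difference in input parameters
+Caffe: the output matches PaddlePaddle's result with `dropout_implementation` set to `upscale_in_train`.  
+PaddlePaddle: compared with Caffe, it takes the additional parameters `seed`, `dropout_implementation`, and `is_test`.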
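+
+The two `dropout_implementation` modes can be compared as worked equations. In this sketch, $$p$$ stands for the drop probability (`dropout_ratio` / `dropout_prob`) and $$m$$ for a random 0/1 mask that is 0 with probability $$p$$; both symbols are introduced here for illustration.
+
+With `upscale_in_train` (the Caffe behavior):
+$$
+y_{train}=\frac{m \times x}{1-p},\quad y_{test}=x
+$$
+With `downgrade_in_infer` (the PaddlePaddle default):
+$$
+y_{train}=m \times x,\quad y_{test}=(1-p) \times x
+$$
+For example, with $$p=0.5$$ and $$x=2$$, a kept element is 4 during training under `upscale_in_train` and 2 under `downgrade_in_infer`, while at test time the outputs are 2 and 1 respectively.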
diff --git a/caffe2fluid/doc/Eltwise.md b/caffe2fluid/doc/Eltwise.md
new file mode 100644
index 0000000000000000000000000000000000000000..5191579abe84841ab7763efe8d555846841e8191
--- /dev/null
+++ b/caffe2fluid/doc/Eltwise.md
@@ -0,0 +1,89 @@
+## Eltwise
+
+
+### [Eltwise](http://caffe.berkeleyvision.org/tutorial/layers/eltwise.html)
+```
+layer {
+    name: "eltwise"
+    type: "Eltwise"
+    bottom: "num1"
+    bottom: "num2"
+    top: "prod"
+    eltwise_param{
+        operation: PROD    # MAX and SUM are also available
+        stable_prod_grad: false
+        # coeff: 1
+        # coeff: -1
+    }
+}
+```
+
+
+### [paddle.fluid.layers.elementwise_add](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-61-elementwise_add), [paddle.fluid.layers.elementwise_max](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-63-elementwise_max), [paddle.fluid.layers.elementwise_mul](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-65-elementwise_mul)
+```python
+paddle.fluid.layers.elementwise_add(
+    x, 
+    y, 
+    axis = -1, 
+    act = None,
+    name = None
+)
+# and
+paddle.fluid.layers.elementwise_max(
+    x, 
+    y, 
+    axis = -1, 
+    act = None,
+    name = None
+)
+# and
+paddle.fluid.layers.elementwise_mul(
+    x, 
+    y, 
+    axis = -1, 
+    act = None,
+    name = None
+)
+```  
+
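+### Functional differences
+#### Difference in input data
+Assume the two inputs of the element-wise operation are `X` and `Y`.  
+Caffe: `X` and `Y` must have the same `shape`; otherwise an error is raised.  
+PaddlePaddle: the `shape` of `Y` may be a contiguous subsequence of the `shape` of `X`, with `axis` giving the dimension of `X` at which the match starts.
+
+#### Difference in addition
+Caffe: `coeff` assigns a weight to each input of the addition, so the layer can in effect also perform subtraction.  
+PaddlePaddle: no weights can be set.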
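+
+Written out for two inputs, Caffe's weighted sum is (a sketch; $$c_1$$ and $$c_2$$ denote the two `coeff` values and are named here only for illustration):
+$$
+y=c_1 \times x_1+c_2 \times x_2
+$$
+With $$c_1=1$$ and $$c_2=-1$$ this becomes $$y=x_1-x_2$$, which is how the layer expresses subtraction. One possible way to reproduce other weights in PaddlePaddle, given as an assumption rather than a documented recipe, is to multiply each input by its coefficient (for instance with `paddle.fluid.layers.scale`) before the element-wise addition.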
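+#### Difference in multiplication
+Caffe: `stable_prod_grad` selects whether to use the asymptotically slower but more stable gradient computation.  
+PaddlePaddle: has no counterpart to `stable_prod_grad`.
+#### Other differences
+Caffe: the activation function must be applied by a separate layer.  
+PaddlePaddle: set `act` to apply a nonlinear activation to the result of the element-wise operation.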
+
+### Code example
+``` 
+# Caffe example:
+# shape of input num1: (2,3,4,5)
+# shape of input num2: (2,3,4,5)
+layer {
+	name: "eltwise"
+	type: "Eltwise"
+	bottom: "num1"
+	bottom: "num2"
+	top: "sum"
+	eltwise_param{
+		operation: SUM
+		coeff: 1
+		coeff: 1
+	}
+}
+# output shape: (2,3,4,5)
+```  
+```python
+# PaddlePaddle example:
+# shape of input num1: (2,3,4,5)
+# shape of input num2: (3,4)
+output = paddle.fluid.layers.elementwise_add(x = num1, y = num2, axis = 1)
+# output shape: (2,3,4,5)
+```  
diff --git a/caffe2fluid/doc/EuclideanLoss.md b/caffe2fluid/doc/EuclideanLoss.md
new file mode 100644
index 0000000000000000000000000000000000000000..41530d815b6da81abd526ee841a0d97e3c22bec5
--- /dev/null
+++ b/caffe2fluid/doc/EuclideanLoss.md
@@ -0,0 +1,36 @@
+## EuclideanLoss
+
+
+### [EuclideanLoss](http://caffe.berkeleyvision.org/tutorial/layers/euclideanloss.html)
+```
+layer {
+    name: "loss"
+    type: "EuclideanLoss"
+    bottom: "pred"
+    bottom: "label"
+    top: "loss"
+}
+```
+
+
+### [paddle.fluid.layers.square_error_cost](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-167-square_error_cost)
+```python
+paddle.fluid.layers.square_error_cost(
+    input,
+    label
+)
+```  
+
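+### Functional differences
+#### Difference in computation
+Caffe: computes the sum of squared differences over the whole input, divided by twice the number of samples; the result is a single value.  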
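+
+As a formula, with $$N$$ the number of samples and $$\hat{y}_n$$, $$y_n$$ the prediction and label of sample $$n$$ (notation chosen here for illustration):
+$$
+E=\frac{1}{2N}\sum_{n=1}^{N}\lVert\hat{y}_n-y_n\rVert_2^2
+$$
+A small worked case: for $$N=2$$ with predictions $$(1,2)$$, $$(3,4)$$ and labels $$(0,2)$$, $$(3,2)$$, the squared differences sum to $$1+0+0+4=5$$, so $$E=5/(2 \times 2)=1.25$$.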
+
+
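+PaddlePaddle: computes the squared error between each pair of corresponding values in `input` and `label`; the output has the same size as the input. Caffe's behavior can be reproduced in PaddlePaddle as in the following example:  
+```python
+import paddle.fluid
+# two inputs with identical shapes; each data layer needs its own name
+inputs = paddle.fluid.layers.data(name = 'data1', shape = [2,3,227,227], append_batch_size = False, dtype = 'float32')
+labels = paddle.fluid.layers.data(name = 'label1', shape = [2,3,227,227], append_batch_size = False, dtype = 'float32')
+loss = paddle.fluid.layers.square_error_cost(input = inputs, label = labels)
+# reduce the element-wise squared errors to a single value
+total = paddle.fluid.layers.reduce_sum(input = loss)
+# divide by twice the number of samples (the first dimension)
+res = total/(2*inputs.shape[0])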
+```
diff --git a/caffe2fluid/doc/Exp.md b/caffe2fluid/doc/Exp.md
new file mode 100644
index 0000000000000000000000000000000000000000..ae5373dd0b5b56929e40ef5f46d2fe9b9654c1aa
--- /dev/null
+++ b/caffe2fluid/doc/Exp.md
@@ -0,0 +1,40 @@
+## Exp
+
+
+### [Exp](http://caffe.berkeleyvision.org/tutorial/layers/exp.html)
+```
+layer {
+    name: "exp"
+    type: "Exp"
+    bottom: "data"
+    top: "exp"	
+    exp_param{
+	base: -1
+	scale: 1
+	shift: 0
+    }
+}
+```
+
+
+### [paddle.fluid.layers.exp](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-186-exp)
+```python
+paddle.fluid.layers.exp(
+    x,
+    name = None
+)
+```  
+
+### Functional differences
+#### Difference in computation
+Caffe: three parameters take part in the computation; the formula is:  
+$$
+y=\begin{cases}
+e^{shift+scale \times x},\quad base=-1 \\\\
+base^{shift+scale \times x},\quad base>0
+\end{cases}
+$$
+         
+
+PaddlePaddle: the formula is: $$y=e^x$$
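+
+A worked case for the Caffe formula, with parameter values chosen purely for illustration: for $$base=2$$, $$scale=1$$, $$shift=0$$ and $$x=3$$,
+$$
+y=2^{0+1 \times 3}=8
+$$
+Since $$base^{u}=e^{u \times \ln(base)}$$ for $$base>0$$, a Caffe Exp layer could in principle be reproduced in PaddlePaddle by first computing $$u=(shift+scale \times x) \times \ln(base)$$ and then applying `exp`; this is a sketch of the identity, not a converter's documented behavior.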
+
diff --git a/caffe2fluid/doc/Flatten.md b/caffe2fluid/doc/Flatten.md
new file mode 100644
index 0000000000000000000000000000000000000000..f78d2c1fd80d4c8f9d7a45ebb95d24f2fe4e4b7b
--- /dev/null
+++ b/caffe2fluid/doc/Flatten.md
@@ -0,0 +1,67 @@
+## Flatten
+
+
+### [Flatten](http://caffe.berkeleyvision.org/tutorial/layers/flatten.html)
+```
+layer {
+    name: "flatten"
+    type: "Flatten"
+    bottom: "data"
+    top: "flatten"
+    flatten_param{
+	axis: 1
+	end_axis: -1
+    }
+}
+```
+
+
+### [paddle.fluid.layers.flatten](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-72-flatten)
+```python
+paddle.fluid.layers.flatten(
+    x,
+    axis = 1,
+    name = None
+)
+```  
+
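+### Functional differences
+#### Difference in the flattening mechanism
+Caffe: takes two parameters: `axis` is where flattening starts and `end_axis` is where it ends. For an n-dimensional input, both take values in [-n,n-1] (a negative value i, with i >= -n, is equivalent to i+n). There are two usages: when `axis<=end_axis`, the (axis+1)-th through (end_axis+1)-th dimensions are flattened into a single dimension; when `axis` is negative (and greater than -n) and `end_axis=axis+n-1`, i.e. `end_axis` is one less than the normalized `axis`, a new dimension of size 1 is inserted at index `end_axis+1` and the following dimensions shift back.  
+PaddlePaddle: takes only one parameter, `axis`, in the range [0,n]; the first `axis` dimensions are flattened into one dimension and the remaining ones into another, and when either side has no dimensions, a dimension of size 1 is used instead.  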
+### Code example
+```  
+# Caffe example:
+# input shape: (10,3,5,5)
+layer {
+    name: "flatten"
+    type: "Flatten"
+    bottom: "data"
+    top: "flatten"
+    flatten_param{
+	axis: 1
+	end_axis: -2
+    }
+}
+# output shape: (10,15,5)
+layer {
+    name: "flatten"
+    type: "Flatten"
+    bottom: "data"
+    top: "flatten"
+    flatten_param{
+        axis: -1
+        end_axis: 2
+    }
+}
+# output shape: (10,3,5,1,5)
+
+```  
+```python
+# PaddlePaddle example:
+# input shape: (10,3,5,5)
+output1 = paddle.fluid.layers.flatten(x = inputs , axis = 2)
+# output shape: (30,25)
+output2 = paddle.fluid.layers.flatten(x = inputs , axis = 4)
+# output shape: (750,1)
+```  
diff --git a/caffe2fluid/doc/InnerProduct.md b/caffe2fluid/doc/InnerProduct.md
new file mode 100644
index 0000000000000000000000000000000000000000..ca4c093461db8656a64c30d205d460a361f09dbb
--- /dev/null
+++ b/caffe2fluid/doc/InnerProduct.md
@@ -0,0 +1,64 @@
+## InnerProduct
+### [InnerProduct](http://caffe.berkeleyvision.org/tutorial/layers/innerproduct.html)
+```
+layer{
+    name: "fc"
+    type: "InnerProduct"
+    bottom: "data"
+    top: "fc"
+    # local learning-rate and weight-decay multipliers for the weights
+    param{
+        lr_mult: 1
+        decay_mult: 1
+    }
+    # local learning-rate and weight-decay multipliers for the bias
+    param{
+        lr_mult: 2
+        decay_mult: 0
+    }
+    inner_product_param{
+        num_output: 20    # required
+        bias_term: true
+        weight_filler {
+            type: "gaussian"
+            std: 0.01
+        }
+        bias_filler {
+            type: "constant"
+            value: 0
+        }
+    }
+}
+```
+
+
+### [paddle.fluid.layers.fc](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-71-fc)
+```python
+paddle.fluid.layers.fc(
+    input,
+    size,
+    num_flatten_dims=1,
+    param_attr=None,
+    bias_attr=None,
+    act=None,
+    is_test=False,
+    name=None
+)
+```  
+
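+### Functional differences
+#### Difference in parameter initialization
+
+Caffe: the first `param` sets the local learning-rate and weight-decay multipliers for the weights, and the second `param` sets them for the bias; the initializers for the weights and the bias are set in `inner_product_param` (`weight_filler` and `bias_filler`); `bias_term` controls whether a bias is used.  
+PaddlePaddle: the weight and bias settings that Caffe spreads over several places are collected in one argument each, `param_attr` and `bias_attr`. Both default to None; otherwise each takes a ParamAttr object, which is created with `paddle.fluid.ParamAttr(name=None, initializer=None, learning_rate=1.0, regularizer=None, trainable=True, gradient_clip=None, do_model_average=False)`. `bias_attr` may also be a boolean that indicates whether a bias is used.
+#### Difference in parameter layout
+Caffe: the weight parameter is laid out as `(filter_num, channel*height*width)`.  
+PaddlePaddle: with `num_flatten_dims=1` and a 4-dimensional input, the weight parameter is laid out as `(channel*height*width, filter_num)`; in every other case as well, filter_num is always the second dimension.
+#### Difference in flattening the input data
+Caffe: treats the first dimension of the input as the batch size and flattens all remaining dimensions into a vector for the fully connected computation.  
+PaddlePaddle: `num_flatten_dims` determines that the last `rank(input)-num_flatten_dims` dimensions are flattened into a vector for the fully connected computation.
+
+
+#### Other differences
+Caffe: the activation function must be defined in a separate layer.  
+PaddlePaddle: the `act` parameter sets the activation function applied to the output.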
diff --git a/caffe2fluid/doc/Input.md b/caffe2fluid/doc/Input.md
new file mode 100644
index 0000000000000000000000000000000000000000..32094a8dc65c08591149d65513b77dc32061efff
--- /dev/null
+++ b/caffe2fluid/doc/Input.md
@@ -0,0 +1,70 @@
+## Input
+### [Input](http://caffe.berkeleyvision.org/tutorial/layers/input.html)
+```
+layer {
+    name: "input"
+    type: "Input"
+    top: "input"	
+    input_param{
+        shape{
+	    dim: 10
+	    dim: 3
+	    dim: 227
+	    dim: 227
+	}
+    }
+}
+```
+
+
+### [paddle.fluid.layers.data](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-20-data)
+```python
+paddle.fluid.layers.data(
+    name, 
+    shape, 
+    append_batch_size=True, 
+    dtype='float32', 
+    lod_level=0, 
+    type=VarType.LOD_TENSOR, 
+    stop_gradient=True
+)
+```  
+
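+### Functional differences
+#### Difference in the input shape
+Caffe: the size of every dimension of the input shape must be given explicitly.  
+PaddlePaddle: `append_batch_size` determines whether the first (batch) dimension is part of `shape`. If it is True, `shape` omits the first dimension, whose size is determined by the data fed in; if it is False, the first entry of `shape` is the size of the first dimension of the input data.   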
+
+
+
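+#### Other differences
+Caffe: the data type of the input does not have to be declared.  
+PaddlePaddle: the data type of the input must be declared; in addition, `lod_level` indicates whether the input is a sequence, and `stop_gradient` indicates whether gradient computation should stop at this variable.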
+
+
+### Code example
+``` 
+# Caffe example:
+layer{
+    name: "input"
+    type: "Input"
+    top: "input"	
+    input_param{
+    	shape{
+	    dim: 10
+	    dim: 3
+	    dim: 227
+	    dim: 227
+	}
+    }
+}
+# data shape: [10,3,227,227]
+```
+
+``` python
+# PaddlePaddle example:
+inputs1 = paddle.fluid.layers.data(name = 'data1', shape = [10,3,227,227], dtype = 'float32', append_batch_size = False)
+# data shape: [10,3,227,227]
+inputs2 = paddle.fluid.layers.data(name = 'data2', shape = [3,227,227], dtype = 'float32')
+# data shape: [-1,3,227,227]
+```  
diff --git a/caffe2fluid/doc/LRN.md b/caffe2fluid/doc/LRN.md
new file mode 100644
index 0000000000000000000000000000000000000000..d5e84c98ec4f3fda4523cc758133a23d9d342301
--- /dev/null
+++ b/caffe2fluid/doc/LRN.md
@@ -0,0 +1,50 @@
+## LRN
+
+
+### [LRN](http://caffe.berkeleyvision.org/tutorial/layers/lrn.html)
+```
+layer {
+    name: "lrn"
+    type: "LRN"
+    bottom: "data"
+    top: "lrn"	
+    lrn_param{
+        local_size: 5
+        alpha: 1
+        beta: 5
+        norm_region: ACROSS_CHANNELS
+    }
+}
+```
+
+
+### [paddle.fluid.layers.lrn](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-99-lrn)
+```python
+paddle.fluid.layers.lrn(
+    input, 
+    n=5, 
+    k=1.0, 
+    alpha=0.0001, 
+    beta=0.75, 
+    name=None
+)
+```  
+
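+### Functional differences
+#### Difference in computation
+Caffe:  
+Computation:  
+$$output(i,x,y)=input(i,x,y)/(1+\frac{\alpha}{n}\sum_{j=\max(0,i-\frac{n}{2})}^{\min(C,i+\frac{n}{2})}{input(j,x,y)^2})^\beta$$  
+The bias term is fixed at 1 in this formula, and the scaling coefficient is additionally divided by the number of channels summed over.  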
+  
+
+
+PaddlePaddle:  
+Computation:  
+$$output(i,x,y)=input(i,x,y)/(k+\alpha\sum_{j=\max(0,i-\frac{n}{2})}^{\min(C,i+\frac{n}{2})}{input(j,x,y)^2})^\beta$$  
+The bias term can be set through `k`.
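+
+Comparing the two formulas term by term gives a parameter mapping (a sketch derived only from the formulas as written above; the subscripts are labels introduced here):
+$$
+k=1,\quad \alpha_{paddle}=\frac{\alpha_{caffe}}{n}
+$$
+For example, a Caffe layer with `local_size: 5` and `alpha: 0.0001` would correspond to `n=5`, `k=1.0` and `alpha=0.00002` in PaddlePaddle, with `beta` unchanged.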
+
+
+#### Other differences
+Caffe: `norm_region` selects the normalization mode, either across channels (`ACROSS_CHANNELS`) or within a channel (`WITHIN_CHANNEL`).     
+PaddlePaddle: only normalization across channels is available.
diff --git a/caffe2fluid/doc/Log.md b/caffe2fluid/doc/Log.md
new file mode 100644
index 0000000000000000000000000000000000000000..d26139c829f606d6662919f9b168b12cde128fa2
--- /dev/null
+++ b/caffe2fluid/doc/Log.md
@@ -0,0 +1,39 @@
+## Log
+
+
+### [Log](http://caffe.berkeleyvision.org/tutorial/layers/log.html)
+```
+layer {
+    name: "log"
+    type: "Log"
+    bottom: "data"
+    top: "log"
+    log_param{
+        base: -1
+        scale: 1
+	shift: 0
+    }
+}
+```
+
+
+### [paddle.fluid.layers.log](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-93-log)
+```python
+paddle.fluid.layers.log(
+    x,
+    name=None
+)
+```  
+
+### Functional differences
+#### Difference in computation
+
+Caffe: three parameters take part in the computation; the formula is:  
+$$
+y=\begin{cases}
+\ln(shift+scale \times x),\quad base=-1 \\\\
+\log_{base}(shift+scale \times x),\quad base>0
+\end{cases}
+$$              
+             
+PaddlePaddle: the formula is: $$y=\ln(x)$$
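+
+For $$base>0$$ the Caffe result can be expressed through the natural logarithm with the change-of-base identity (a sketch; $$u$$ is shorthand introduced here for $$shift+scale \times x$$):
+$$
+\log_{base}(u)=\frac{\ln(u)}{\ln(base)}
+$$
+For instance, with $$base=2$$, $$scale=1$$, $$shift=0$$ and $$x=8$$, $$y=\ln(8)/\ln(2)=3$$. In PaddlePaddle this would amount to scaling and shifting the input, applying `log`, and dividing by the constant $$\ln(base)$$.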
diff --git a/caffe2fluid/doc/Pooling.md b/caffe2fluid/doc/Pooling.md
new file mode 100644
index 0000000000000000000000000000000000000000..773964b96927807243c2689662e56e27f0f86159
--- /dev/null
+++ b/caffe2fluid/doc/Pooling.md
@@ -0,0 +1,91 @@
+## Pooling
+
+### [Pooling](http://caffe.berkeleyvision.org/tutorial/layers/pooling.html)
+```
+layer{
+    name: "pool"
+    type: "Pooling"
+    bottom: "conv"
+    top: "pool"
+    pooling_param{
+	pool: MAX
+	kernel_size: 3	# required
+	stride: 1
+	pad: 0
+    }
+}
+```
+### [paddle.fluid.layers.pool2d](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-115-pool2d)
+```python
+paddle.fluid.layers.pool2d(
+    input,
+    pool_size,
+    pool_type = 'max',
+    pool_stride = 1,
+    pool_padding = 0,
+    global_pooling = False,
+    use_cudnn = True,
+    ceil_mode = False,
+    name = None,
+    exclusive = True
+)
+```  
+  
+### Functional differences
+#### Difference in computing the output height and width
+The output height and width of pooling can be computed in two ways, rounding down (floor) or rounding up (ceil), as shown below, where `/` denotes integer division:
+
+**Floor:**  
+	`H_out = (H_in-ksize[0]+2*padding[0])/strides[0]+1`  
+	`W_out = (W_in-ksize[1]+2*padding[1])/strides[1]+1`  
+
+**Ceil:**  
+	`H_out = (H_in-ksize[0]+2*padding[0]+strides[0]-1)/strides[0]+1`  
+	`W_out = (W_in-ksize[1]+2*padding[1]+strides[1]-1)/strides[1]+1`    
+
+Caffe: always uses the ceil mode to compute the output size.  
+PaddlePaddle: the `ceil_mode` parameter selects the mode; `ceil_mode=False` (the default) uses floor, and `True` uses ceil.  
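+
+A worked comparison of the two modes, using illustrative values `H_in=228`, `ksize=3`, `padding=0` and `strides=2`:
+$$
+floor:\quad H_{out}=\lfloor\frac{228-3+2 \times 0}{2}\rfloor+1=\lfloor 112.5\rfloor+1=113
+$$
+$$
+ceil:\quad H_{out}=\lceil\frac{228-3+2 \times 0}{2}\rceil+1=\lceil 112.5\rceil+1=114
+$$
+The two modes differ only when `H_in-ksize+2*padding` is not a multiple of the stride, so a model converted from Caffe may need `ceil_mode=True` to keep the same output shape.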
+
+
+
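+#### Difference in pooling types
+Caffe: provides three pooling types: max pooling, average pooling, and stochastic pooling (which assigns each pixel a probability according to its value and then subsamples according to those probabilities).  
+PaddlePaddle: provides two pooling types: max pooling and average pooling.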
+ 
+
+
+#### Other differences  
+Caffe: has no `exclusive` parameter.  
+PaddlePaddle: has an `exclusive` parameter, which indicates whether padding values are ignored in average pooling.  
+
+
+### Code example
+
+```  
+# Caffe example:
+# input shape: (1,3,228,228)
+# output shape: (1,3,114,114)
+layer{
+    name: "pool"
+    type: "Pooling"
+    bottom: "conv"
+    top: "pool"
+    pooling_param{
+	pool: MAX
+	kernel_size: 3	
+	stride: 2
+    }
+}
+```  
+``` python
+# PaddlePaddle example:
+# input shape: (1,3,228,228)
+# output shape: (1,3,113,113)
+pool1 = paddle.fluid.layers.pool2d(input = inputs , pool_size = 3, pool_type = 'max', pool_stride = 2, ceil_mode=False)
+```  
+
+
+
+
+
+
diff --git a/caffe2fluid/doc/Power.md b/caffe2fluid/doc/Power.md
new file mode 100644
index 0000000000000000000000000000000000000000..7f195b81c57c918bb5984ac0189df167c8bfc327
--- /dev/null
+++ b/caffe2fluid/doc/Power.md
@@ -0,0 +1,32 @@
+## Power
+
+
+### [Power](http://caffe.berkeleyvision.org/tutorial/layers/power.html)
+```
+layer {
+    name: "power"
+    type: "Power"
+    bottom: "data"
+    top: "power"	
+    power_param{
+	power: 1
+	scale: 1
+	shift: 0
+    }
+}
+```
+
+
+### [paddle.fluid.layers.pow](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-117-pow)
+```python
+paddle.fluid.layers.pow(
+    x,
+    factor = 1.0,
+    name = None
+)
+```  
+
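+### Functional differences
+#### Difference in computation
+Caffe: three parameters take part in the computation; the formula is: $$y=(shift+scale \times x)^{power}$$            
+PaddlePaddle: only one parameter, `factor`, takes part in the computation; the formula is: $$y=x^{factor}$$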
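+
+With Caffe's formula read as $$y=(shift+scale \times x)^{power}$$, the two layers coincide when $$shift=0$$ and $$scale=1$$, with `factor` equal to `power`. A worked case with illustrative values $$power=2$$, $$scale=3$$, $$shift=1$$ and $$x=1$$:
+$$
+y=(1+3 \times 1)^2=16
+$$
+For other values of `scale` and `shift`, one possible approach (an assumption, not a documented recipe) is to apply the affine step first, for example with `paddle.fluid.layers.scale`, and then raise the result to the power.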
diff --git a/caffe2fluid/doc/ReLU.md b/caffe2fluid/doc/ReLU.md
new file mode 100644
index 0000000000000000000000000000000000000000..e995a67d6857b5fe3adb7131ae1819723621c14c
--- /dev/null
+++ b/caffe2fluid/doc/ReLU.md
@@ -0,0 +1,48 @@
+## ReLU
+
+
+### [ReLU](http://caffe.berkeleyvision.org/tutorial/layers/relu.html)
+```
+layer {
+    name: "relu"
+    type: "ReLU"
+    bottom: "data"
+    top: "relu"
+    relu_param{
+	negative_slope: 0
+    }	
+}
+```
+
+
+### [paddle.fluid.layers.relu](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-128-relu)
+```python
+paddle.fluid.layers.relu(
+    x, 
+    name=None
+)
+```
+and  
+```python
+paddle.fluid.layers.leaky_relu(
+    x, 
+    alpha = 0.02,
+    name=None
+)
+```
+
+
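+### Functional differences
+#### Difference in implementation
+Caffe: this single layer implements both ReLU and Leaky ReLU, with $$\alpha$$ given by `negative_slope`.     
+$$
+y=\begin{cases}
+x,\quad x\geq 0 \\\\
+\alpha \times x,\quad x<0
+\end{cases}
+$$       
+PaddlePaddle: ReLU and Leaky ReLU are implemented by two separate functions.         
+$$
+y=max(x,\alpha \times x)
+$$
+When alpha is set to 0, Leaky ReLU can also be used directly in place of Caffe's ReLU layer.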
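+
+The two formulas agree whenever $$0\leq\alpha\leq 1$$, which can be checked case by case (a short derivation added for illustration):
+$$
+x\geq 0:\ \alpha \times x\leq x\ \Rightarrow\ max(x,\alpha \times x)=x
+$$
+$$
+x<0:\ \alpha \times x\geq x\ \Rightarrow\ max(x,\alpha \times x)=\alpha \times x
+$$
+For example, with $$\alpha=0.1$$ and $$x=-2$$ both give $$y=-0.2$$. For $$\alpha>1$$ the `max` form would no longer match the piecewise definition.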
diff --git a/caffe2fluid/doc/Reduction.md b/caffe2fluid/doc/Reduction.md
new file mode 100644
index 0000000000000000000000000000000000000000..83455f7bec0bbdd423acd5bd1054cee723f49b81
--- /dev/null
+++ b/caffe2fluid/doc/Reduction.md
@@ -0,0 +1,69 @@
+## Reduction
+
+
+### [Reduction](http://caffe.berkeleyvision.org/tutorial/layers/reduction.html)
+```
+layer {
+    name: "reduce"
+    type: "Reduction"
+    bottom: "reduce"
+    top: "reduce"
+    reduction_param{
+        operation: SUM
+	axis: 1
+	coeff: 2
+    }
+}
+```
+
+
+### [paddle.fluid.layers.reduce_sum](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-127-reduce_sum), [paddle.fluid.layers.reduce_mean](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-124-reduce_mean)
+```python
+paddle.fluid.layers.reduce_sum(
+    input, 
+    dim=None, 
+    keep_dim=False, 
+    name=None
+)
+# and
+paddle.fluid.layers.reduce_mean(
+    input, 
+    dim=None, 
+    keep_dim=False, 
+    name=None
+)
+```  
+
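+### Functional differences
+#### Difference in input parameters
+Caffe: a single layer can perform any of the four operations `SUM`, `ASUM`, `SUMSQ`, or `MEAN`.  
+PaddlePaddle: only two of these operations (sum and mean) are available. In addition, Caffe can set `coeff` to multiply each output value by a coefficient.
+
+#### Difference in output
+Caffe: every dimension from `axis` onward is reduced.  
+PaddlePaddle: only the dimensions listed in `dim` are reduced, and `keep_dim` determines whether the reduced dimensions are kept in the output Tensor.
+### Code example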
+```  
+# Caffe example:
+# input shape: (30,3,6,8)
+layer {
+    name: "reduce"
+    type: "Reduction"
+    bottom: "reduce"
+    top: "reduce"
+    reduction_param{
+	operation: SUM
+	axis: 2
+	coeff: 2
+    }
+}
+# output shape: (30,3)
+```  
+```python 
+# PaddlePaddle example:
+# input shape: (30,3,6,8)
+output1 = fluid.layers.reduce_mean(input = inputs, dim=[1])
+# output shape: (30,6,8)
+output2 = fluid.layers.reduce_mean(input = inputs, dim=[1], keep_dim=True, name=None)
+# output shape: (30,1,6,8)
+```  
diff --git a/caffe2fluid/doc/Reshape.md b/caffe2fluid/doc/Reshape.md
new file mode 100644
index 0000000000000000000000000000000000000000..abc846a76734dcb7a4e83b901d4721a7ebccc86e
--- /dev/null
+++ b/caffe2fluid/doc/Reshape.md
@@ -0,0 +1,93 @@
+## Reshape
+
+
+### [Reshape](http://caffe.berkeleyvision.org/tutorial/layers/reshape.html)
+```
+layer {
+    name: "reshape"
+    type: "Reshape"
+    bottom: "data"
+    top: "reshape"
+    reshape_param{
+	shape{
+	    dim: 1
+	    ...
+	}
+	axis: 0
+	num_axes: -1
+    }
+}
+```
+
+
+### [paddle.fluid.layers.reshape](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-130-reshape)
+```python
+paddle.fluid.layers.reshape(
+    x, 
+    shape, 
+    actual_shape=None, 
+    act=None, 
+    inplace=False, 
+    name=None
+)
+```  
+
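+### Functional differences
+#### Difference in the reshape mechanism
+Caffe: uses 0 and -1 to denote a copied dimension and an inferred dimension respectively, and defines further usages through `axis` and `num_axes`. When only `axis` is used, the first `axis` dimensions of the output are copied from the first `axis` dimensions of the input, and the dimensions given in `shape` are appended after them; when both `axis` and `num_axes` are used, the dimensions in `shape` replace the `num_axes` input dimensions starting at `axis`, and the remaining dimensions are taken from the input, so that the input and the output have the same size when flattened to one dimension.   
+PaddlePaddle: uses 0 and -1 to denote a copied dimension and an inferred dimension respectively.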
+
+
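+#### Difference in output
+Caffe: the Reshape layer changes the dimensions of the input blob without changing its data; it operates on the input blob only and makes no copy of the data.  
+PaddlePaddle: `inplace` controls whether the data is copied.
+#### Other differences
+Caffe: the activation function must be applied by a separate layer.  
+PaddlePaddle: set `act` to apply a nonlinear activation to the reshaped tensor.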
+
+
+
+### Code example
+```  
+# Caffe example:
+# input shape: (2,4,6)
+layer {
+    name: "reshape"
+    type: "Reshape"
+    bottom: "data"
+    top: "reshape"
+    reshape_param{
+	shape{
+	    dim: 3
+	    dim: 2
+	}
+	axis: 2
+	num_axes: 1
+    }
+}
+# output shape: (2,4,3,2)
+layer {
+    name: "reshape"
+    type: "Reshape"
+    bottom: "data"
+    top: "reshape"
+    reshape_param{
+	shape{
+	    dim: 3
+	    dim: 2
+	    dim: 4
+	}
+	axis: 1
+    }
+}
+# output shape: (2,3,2,4)
+
+```  
+```python
+# PaddlePaddle example:
+# input shape: (2,4,6)
+output1 = paddle.fluid.layers.reshape(x = inputs , shape = [2,4,-1,3])
+# output shape: (2,4,2,3)
+output2 = paddle.fluid.layers.reshape(x = inputs , shape = [0,2,2,6])
+# output shape: (2,2,2,6)
+```  
diff --git a/caffe2fluid/doc/Scale.md b/caffe2fluid/doc/Scale.md
new file mode 100644
index 0000000000000000000000000000000000000000..1f12f603fe31d3a8793a86f932e726acb13a2b05
--- /dev/null
+++ b/caffe2fluid/doc/Scale.md
@@ -0,0 +1,61 @@
+## Scale
+
+
+### [Scale](http://caffe.berkeleyvision.org/tutorial/layers/scale.html)
+```
+layer {
+    name: "scale"
+    type: "Scale"
+    bottom: "data"
+    top: "scale"
+    scale_param{
+        axis: 1
+        num_axes: 1
+        filler{
+            type: "constant"
+            value: 2
+        }
+        bias_term: true
+        bias_filler{
+            type: "constant"
+            value: 0.5
+        }
+    }
+}
+```
+
+
+### [paddle.fluid.layers.scale](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-137-scale)
+```python
+paddle.fluid.layers.scale(
+    x, 
+    scale=1.0,  
+    bias=0.0, 
+    bias_after_scale=True, 
+    act=None, 
+    name=None
+)
+```  
+
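+### Functional differences
+#### Difference in input parameters
+Caffe: defines `filler` and `bias_filler`, which initialize the parameters during training; at test time their values are ignored unless `bottom` has only one input, and are replaced by the trained parameters, which may span several dimensions. The shape of the parameters is determined by `axis`; taking an input of size `100*3*40*60` as an example, the possible parameter shapes are:  
+
+| axis value  | Possible shape 1 | Possible shape 2 | Possible shape 3 | Possible shape 4 |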
+| :---------: | :-------: | :-------: | :-------: | :---------: |
+| axis==0==-4 |    $$100$$    |   $$100\times3$$   | $$100\times3\times40$$  | $$100\times3\times40\times60$$ |
+| axis==1==-3 |     $$3$$     |   $$3\times40$$    |  $$3\times40\times60$$  |             |
+| axis==2==-2 |    $$40$$     |   $$40\times60$$   |           |             |
+| axis==3==-1 |    $$60$$     |           |           |             |
+
+  
+PaddlePaddle: has no such input parameters; its `scale` and `bias` are set directly in the call.  
+
+#### Difference in computation
+Caffe: the bias can only be added after scaling.  
+PaddlePaddle: `bias_after_scale` selects whether the bias is added after or before scaling.
+
+
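+#### Other differences
+Caffe: the activation function must be applied by a separate layer.  
+PaddlePaddle: `act` controls whether an activation is applied after the Scale operation.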
diff --git a/caffe2fluid/doc/SigmoidCrossEntropyLoss.md b/caffe2fluid/doc/SigmoidCrossEntropyLoss.md
new file mode 100644
index 0000000000000000000000000000000000000000..165f574f83b915be0b04185e1cf2cb0d89b4f9b7
--- /dev/null
+++ b/caffe2fluid/doc/SigmoidCrossEntropyLoss.md
@@ -0,0 +1,36 @@
+## SigmoidCrossEntropyLoss
+
+
+### [SigmoidCrossEntropyLoss](http://caffe.berkeleyvision.org/tutorial/layers/sigmoidcrossentropyloss.html)
+```
+layer {
+    name: "loss"
+    type: "SigmoidCrossEntropyLoss"
+    bottom: "pred"
+    bottom: "label"
+    top: "loss"
+}
+```
+
+
+### [paddle.fluid.layers.sigmoid_cross_entropy_with_logits](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-158-sigmoid_cross_entropy_with_logits)
+```python
+paddle.fluid.layers.sigmoid_cross_entropy_with_logits(
+    x, 
+    label, 
+    ignore_index=-100, 
+    name=None, 
+    normalize=False
+)
+```  
+
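+### Functional differences
+#### Differences in input
+Caffe: the input may have up to 4 dimensions (`N*C*H*W`).  
+PaddlePaddle: the input must be 2-D (`N*H`).
+#### Differences in output
+Caffe: the output has size `1*1*1*1`; that is, the loss is averaged over all positions.  
+PaddlePaddle: the output has the same size as the input, i.e. `N*H`.
+#### Other differences
+PaddlePaddle: `ignore_index` specifies a target value to ignore. In addition, when `normalize` is set, the output is divided by the number of targets whose value is not `ignore_index`.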
+
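Illustrative note on SigmoidCrossEntropyLoss.md: a minimal pure-Python sketch of the output-shape difference, not either framework's implementation. The helper names are hypothetical, the per-element formula is the standard numerically stable form of sigmoid cross entropy, and treating ignored positions as zero loss is an assumption.

```python
import math


def sigmoid_ce(x, z):
    """Stable sigmoid cross entropy for one logit x and one target z."""
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))


def elementwise_loss(logits, labels, ignore_index=-100):
    """PaddlePaddle-style result: one loss per position, same N*H shape as the input."""
    return [
        [0.0 if z == ignore_index else sigmoid_ce(x, z) for x, z in zip(x_row, z_row)]
        for x_row, z_row in zip(logits, labels)
    ]


def mean_loss(loss):
    """Caffe-style result as described in this document: a single averaged value."""
    flat = [v for row in loss for v in row]
    return sum(flat) / len(flat)


logits = [[0.0, 2.0], [-1.0, 0.5]]
labels = [[1, 0], [-100, 1]]
loss = elementwise_loss(logits, labels)
print(len(loss), len(loss[0]))  # 2 2
print(mean_loss(loss))          # one scalar
```

To reproduce the Caffe output from the PaddlePaddle layer, a reduction step along these lines would have to be added after the per-element loss.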
diff --git a/caffe2fluid/doc/Slice.md b/caffe2fluid/doc/Slice.md
new file mode 100644
index 0000000000000000000000000000000000000000..2bae73ce6128b00cbcdbe6503436bca4a166aa01
--- /dev/null
+++ b/caffe2fluid/doc/Slice.md
@@ -0,0 +1,69 @@
+## Slice
+
+
+### [Slice](http://caffe.berkeleyvision.org/tutorial/layers/slice.html)
+```
+layer {
+    name: "slice"
+    type: "Slice"
+    bottom: "data"
+    top: "out1"
+    top: "out2"
+    top: "out3"
+    slice_param {
+        axis: 1
+        slice_point: 1
+        slice_point: 2
+        # slice_dim: 1
+    }
+}
+```
+
+
+### [paddle.fluid.layers.slice](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-160-slice)
+```python
+paddle.fluid.layers.slice(
+    input, 
+    axes, 
+    starts, 
+    ends
+)
+```  
+
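+### Functional differences
+#### Differences in input parameters
+Caffe: parameters such as `axis` and `slice_point` are scalar values.  
+PaddlePaddle: the parameters `axes`, `starts` and `ends` are all lists.
+#### Differences in slicing behavior
+Caffe: slices along a single dimension only, but can produce several slices.  
+PaddlePaddle: can slice along several dimensions, but produces only one slice.
+#### Other differences
+Caffe: `slice_dim` can be used instead of `axis`, but it only accepts non-negative values.  
+PaddlePaddle: if a value passed to `starts` or `ends` is greater than n (the number of elements in that dimension), it is treated as n.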
+### Code examples
+```  
+# Caffe example:
+# input shape: (2,6)
+layer {
+    name: "slice"
+    type: "Slice"
+    bottom: "data"
+    top: "out1"
+    top: "out2"
+    top: "out3"
+    slice_param {
+        axis: 1  # -1 has the same effect
+        slice_point: 1
+        slice_point: 2
+    }
+}
+# Produces 3 arrays with shapes (2,1), (2,1) and (2,4)
+```  
+```python
+# PaddlePaddle example:
+# input shape: (2,6)
+output1 = paddle.fluid.layers.slice(input=inputs, axes=[1], starts=[1], ends=[3])
+# output shape: (2,2)
+output2 = paddle.fluid.layers.slice(input=inputs, axes=[0, 1], starts=[0, 1], ends=[1, 3])
+# output shape: (1,2)
+```  
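Illustrative note on Slice.md: because Caffe emits several slices along one axis while `paddle.fluid.layers.slice` returns a single slice, one Caffe Slice layer would map to several PaddlePaddle calls. The sketch below uses plain Python lists and hypothetical helper names to show how `slice_point` values turn into `(start, end)` pairs; it does not call either framework.

```python
def slice_points_to_ranges(dim_size, slice_points):
    """Turn Caffe-style slice points into (start, end) pairs along one axis."""
    bounds = [0] + list(slice_points) + [dim_size]
    return list(zip(bounds[:-1], bounds[1:]))


def slice_columns(rows, start, end):
    """Slice a 2-D nested list along axis 1."""
    return [row[start:end] for row in rows]


data = [list(range(6)), list(range(6, 12))]  # shape (2, 6)
ranges = slice_points_to_ranges(6, [1, 2])
pieces = [slice_columns(data, s, e) for s, e in ranges]

print(ranges)                                   # [(0, 1), (1, 2), (2, 6)]
print([(len(p), len(p[0])) for p in pieces])    # [(2, 1), (2, 1), (2, 4)]
```

Each `(s, e)` pair would correspond to one call of the form `paddle.fluid.layers.slice(input=inputs, axes=[1], starts=[s], ends=[e])`.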
diff --git a/caffe2fluid/doc/Sofmax.md b/caffe2fluid/doc/Sofmax.md
new file mode 100644
index 0000000000000000000000000000000000000000..22477ffbcda0bd5d93831d3e03932f04eae8c158
--- /dev/null
+++ b/caffe2fluid/doc/Sofmax.md
@@ -0,0 +1,27 @@
+## Softmax
+
+
+### [Softmax](http://caffe.berkeleyvision.org/tutorial/layers/softmax.html)
+```
+layer {
+    name: "softmax"
+    type: "Softmax"
+    bottom: "fc"
+    top: "softmax"	
+}
+```
+
+
+### [paddle.fluid.layers.softmax](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-163-softmax)
+```python
+paddle.fluid.layers.softmax(
+    input, 
+    use_cudnn=True, 
+    name=None
+)
+```  
+
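+### Functional differences
+#### Differences in computation
+Caffe: before computing softmax, subtracts the maximum value of each sample from every value in that sample.  
+PaddlePaddle: skips this step and computes softmax directly.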
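Illustrative note on Softmax.md: subtracting the per-sample maximum does not change the softmax result mathematically; it only keeps the exponentials from overflowing. The pure-Python sketch below, with hypothetical function names, compares the two variants and is not taken from either framework.

```python
import math


def softmax_direct(xs):
    """Softmax computed directly from the raw values."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_shifted(xs):
    """Softmax after subtracting the sample maximum, as described for Caffe."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


small = [1.0, 2.0, 3.0]
# The two variants agree up to floating-point rounding on moderate inputs.
print(max(abs(a - b) for a, b in zip(softmax_direct(small), softmax_shifted(small))))

large = [1000.0, 1001.0]
# softmax_direct(large) would raise OverflowError in math.exp; the shifted form does not.
print(softmax_shifted(large))
```

For typical activations the outputs of the two frameworks should therefore agree; differences would only be expected for very large input values.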
diff --git a/caffe2fluid/doc/SofmaxWithLoss.md b/caffe2fluid/doc/SofmaxWithLoss.md
new file mode 100644
index 0000000000000000000000000000000000000000..d0251e3758695393ba5117fa8358ca83cddf92f5
--- /dev/null
+++ b/caffe2fluid/doc/SofmaxWithLoss.md
@@ -0,0 +1,77 @@
+## SoftmaxWithLoss
+
+
+### [SoftmaxWithLoss](http://caffe.berkeleyvision.org/tutorial/layers/softmaxwithloss.html)
+```
+layer {
+    name: "loss"
+    type: "SoftmaxWithLoss"
+    bottom: "pred"
+    bottom: "label"
+    top: "loss"
+    loss_param{
+	ignore_label: -1
+	normalize: 0
+	normalization: FULL
+    }
+}
+```
+
+
+### [paddle.fluid.layers.softmax_with_cross_entropy](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-164-softmax_with_cross_entropy)
+```python
+paddle.fluid.layers.softmax_with_cross_entropy(
+    logits,
+    label,
+    soft_label = False,
+    ignore_index = -100,
+    numeric_stable_mode = False, 
+    return_softmax = False
+)
+```  
+
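+### Functional differences
+#### Differences in computation
+When computing the softmax loss, labels fall into two kinds, hard and soft, depending on whether a sample may be assigned to more than one class:  
+  
+**Hard labels:** one-hot labels; each sample belongs to exactly one class. With hard labels there are two variants, depending on whether the unnormalized log-probabilities are preprocessed; the preprocessing subtracts the sample's largest log-probability from every log-probability of that sample.  
+ 
+**Soft labels:** each sample is assigned to at least one class.  
+  
+
+Caffe: only hard labels are accepted, and the preprocessing is always applied.  
+PaddlePaddle: `soft_label` selects soft labels (True) or hard labels (False). Setting `numeric_stable_mode` to True while running on a GPU applies the preprocessing before the hard-label loss is computed. The label input also differs between the two modes. When the log-probability input has size `N*K` (`N` is the batch size and `K` the number of classes), a soft label has size `N*K` with dtype `float` or `double`, and each value is 0 or 1 (1 means the sample belongs to that class, 0 means it does not); a hard label has size `N*1` with dtype `int`, and each value is at least 0 and less than `K` (the index of the class).
+ 
+#### Differences in output
+Caffe: the output is the loss normalized over all samples, and the form of normalization depends on `normalize` and `normalization`. When `normalization` is FULL (0), the summed loss is divided by the batch size; when it is VALID (1), the summed loss is divided by the number of samples whose label is not `ignore_label`; when it is NONE, the loss is only summed. If `normalization` is not set, `normalize` decides: `normalize==1` gives VALID and `normalize==0` gives FULL.  
+PaddlePaddle: the output is a vector holding the loss of each sample; if `return_softmax` is True, the output is a tuple of the loss vector and the softmax values.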
+
+### Code examples
+```  
+# Caffe example:
+# pred input shape: (100,10)
+# label input shape: (100,1)
+# output shape: () (a scalar)
+layer {
+    name: "loss"
+    type: "SoftmaxWithLoss"
+    bottom: "pred"
+    bottom: "label"
+    top: "loss"
+    loss_param {
+        ignore_label: -1
+        normalize: 0
+        normalization: FULL
+
+    }
+}
+```
+
+  
+```python  
+# PaddlePaddle example:
+# logs (logits) input shape: (100,10)
+# labels input shape: (100,1)
+# output shape: (100,1), one loss per sample
+softmaxwithloss = paddle.fluid.layers.softmax_with_cross_entropy(logits=logs, label=labels, soft_label=False, ignore_index=-100, numeric_stable_mode=False, return_softmax=False)
+```
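Illustrative note on SofmaxWithLoss.md: the sketch below shows how a per-sample loss vector (the PaddlePaddle-style output) could be reduced to the single value Caffe reports, following the FULL / VALID / NONE rules as this document describes them. It is pure Python with hypothetical helper names, and giving ignored samples a loss of zero is an assumption.

```python
import math


def per_sample_loss(logits, labels, ignore_label=None):
    """Hard-label softmax cross entropy, one value per sample."""
    losses = []
    for row, y in zip(logits, labels):
        if y == ignore_label:
            losses.append(0.0)  # assumption: ignored samples contribute nothing
            continue
        m = max(row)  # the max-subtraction preprocessing
        log_sum = m + math.log(sum(math.exp(v - m) for v in row))
        losses.append(log_sum - row[y])
    return losses


def reduce_loss(losses, labels, normalization="VALID", ignore_label=None):
    """Reduce the loss vector to a scalar, as described for Caffe in this document."""
    total = sum(losses)
    if normalization == "NONE":
        return total
    if normalization == "FULL":
        return total / len(losses)
    if normalization == "VALID":
        valid = sum(1 for y in labels if y != ignore_label)
        return total / max(valid, 1)
    raise ValueError("unknown normalization: %r" % normalization)


logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
labels = [0, 1, -1]
losses = per_sample_loss(logits, labels, ignore_label=-1)
for mode in ("FULL", "VALID", "NONE"):
    print(mode, reduce_loss(losses, labels, mode, ignore_label=-1))
```

With one ignored sample out of three, FULL divides by 3 and VALID by 2, which is why the two modes give different values on the same loss vector.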
diff --git a/caffe2fluid/doc/Tile.md b/caffe2fluid/doc/Tile.md
new file mode 100644
index 0000000000000000000000000000000000000000..a4a8214747a9eedff81b2545e728096034b9d370
--- /dev/null
+++ b/caffe2fluid/doc/Tile.md
@@ -0,0 +1,31 @@
+## Tile
+
+
+### [Tile](http://caffe.berkeleyvision.org/tutorial/layers/tile.html)
+```
+layer {
+    name: "tile"
+    type: "Tile"
+    bottom: "data"
+    top: "concat"
+    tile_param{
+        axis: 1
+        tiles: 2
+    }
+}
+```
+
+
+### [paddle.fluid.layers.expand](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-70-expand)
+```python
+paddle.fluid.layers.expand(
+    x, 
+    expand_times, 
+    name=None
+)
+```  
+
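+### Functional differences
+#### Differences in input parameters
+Caffe: can replicate along one dimension only.  
+PaddlePaddle: `expand_times` is a list or tuple holding the replication factor for each dimension.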
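Illustrative note on Tile.md: a Caffe `tile_param` with a single `axis` and `tiles` value can be expressed as an `expand_times` list that is 1 everywhere except at that axis. The helper names below are hypothetical and the sketch only computes the list and the resulting shape; it does not call PaddlePaddle.

```python
def tile_to_expand_times(rank, axis, tiles):
    """Build an expand_times list equivalent to tiling one axis `tiles` times."""
    if not -rank <= axis < rank:
        raise ValueError("axis %d is out of range for rank %d" % (axis, rank))
    times = [1] * rank
    times[axis % rank] = tiles  # handles negative axes as well
    return times


def expanded_shape(shape, expand_times):
    """Shape after replicating each dimension by its factor."""
    return [d * t for d, t in zip(shape, expand_times)]


times = tile_to_expand_times(rank=4, axis=1, tiles=2)
print(times)                                # [1, 2, 1, 1]
print(expanded_shape([2, 3, 4, 5], times))  # [2, 6, 4, 5]
```

The resulting list is what would be passed as `expand_times` when porting the Caffe example with `axis: 1` and `tiles: 2`.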
diff --git "a/caffe2fluid/doc/\346\216\245\345\217\243\351\200\237\346\237\245\350\241\250.md" "b/caffe2fluid/doc/\346\216\245\345\217\243\351\200\237\346\237\245\350\241\250.md"
new file mode 100644
index 0000000000000000000000000000000000000000..aff8d032359b4f5445cc70e5e236a26fabc3f2fa
--- /dev/null
+++ "b/caffe2fluid/doc/\346\216\245\345\217\243\351\200\237\346\237\245\350\241\250.md"
@@ -0,0 +1,42 @@
+# Quick reference for common Caffe layers
+
+**Notes**  
+1. Interface match: the PaddlePaddle interface is essentially the same as the Caffe layer, but users should watch for differences in parameter names and parameter types  
+2. PaddlePaddle implementation: for some layers, sample code implementing them in PaddlePaddle is provided for reference  
+3. Difference comparison: the interfaces differ; see the linked comparison document for details  
+
+| No. | Caffe layer | PaddlePaddle interface | Remarks |
+| ---- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
+| 1    | [AbsVal](http://caffe.berkeleyvision.org/tutorial/layers/absval.html) | [paddle.fluid.layers.abs](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-182-abs) | Interface match |
+| 2    | [Accuracy](http://caffe.berkeleyvision.org/tutorial/layers/accuracy.html) | [paddle.fluid.layers.accuracy](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-253-accuracy) | [Difference comparison](Accuracy.md) |
+| 3    | [ArgMax](http://caffe.berkeleyvision.org/tutorial/layers/argmax.html) | [paddle.fluid.layers.argmax](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-204-argmax) | [Difference comparison](ArgMax.md) |
+| 4    | [BatchNorm](http://caffe.berkeleyvision.org/tutorial/layers/batchnorm.html) | [paddle.fluid.layers.batch_norm](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-36-batch_norm) | [Difference comparison](BatchNorm.md) |
+| 5    | [BNLL](http://caffe.berkeleyvision.org/tutorial/layers/bnll.html) | [paddle.fluid.layers.softplus](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-194-softplus) | Interface match |
+| 6    | [Concat](http://caffe.berkeleyvision.org/tutorial/layers/concat.html) | [paddle.fluid.layers.concat](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-209-concat) | Interface match |
+| 7    | [Convolution](http://caffe.berkeleyvision.org/tutorial/layers/convolution.html) | [paddle.fluid.layers.conv2d](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-45-conv2d) | [Difference comparison](Convolution.md) |
+| 8    | [Crop](http://caffe.berkeleyvision.org/tutorial/layers/crop.html) | [paddle.fluid.layers.crop](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-51-crop) | [Difference comparison](Crop.md) |
+| 9    | [Deconvolution](http://caffe.berkeleyvision.org/tutorial/layers/deconvolution.html) | [paddle.fluid.layers.conv2d_transpose](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-46-conv2d_transpose) | [Difference comparison](Deconvolution.md) |
+| 10   | [Dropout](http://caffe.berkeleyvision.org/tutorial/layers/dropout.html) | [paddle.fluid.layers.dropout](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-56-dropout) | [Difference comparison](Dropout.md) |
+| 11   | [Eltwise](http://caffe.berkeleyvision.org/tutorial/layers/eltwise.html) | - | [PaddlePaddle implementation](Eltwise.md) |
+| 12   | [ELU](http://caffe.berkeleyvision.org/tutorial/layers/elu.html) | [paddle.fluid.layers.elu](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-68-elu) | Interface match |
+| 13   | [EuclideanLoss](http://caffe.berkeleyvision.org/tutorial/layers/euclideanloss.html) | [paddle.fluid.layers.square_error_cost](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-167-square_error_cost) | [Difference comparison](EuclideanLoss.md) |
+| 14   | [Exp](http://caffe.berkeleyvision.org/tutorial/layers/exp.html) | [paddle.fluid.layers.exp](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-186-exp) | [Difference comparison](Exp.md) |
+| 15   | [Flatten](http://caffe.berkeleyvision.org/tutorial/layers/flatten.html) | [paddle.fluid.layers.flatten](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-72-flatten) | [Difference comparison](Flatten.md) |
+| 16   | [InnerProduct](http://caffe.berkeleyvision.org/tutorial/layers/innerproduct.html) | [paddle.fluid.layers.fc](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-71-fc) | [Difference comparison](InnerProduct.md) |
+| 17   | [Input](http://caffe.berkeleyvision.org/tutorial/layers/input.html) | [paddle.fluid.layers.data](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-20-data) | [Difference comparison](Input.md) |
+| 18   | [Log](http://caffe.berkeleyvision.org/tutorial/layers/log.html) | [paddle.fluid.layers.log](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-93-log) | [Difference comparison](Log.md) |
+| 19   | [LRN](http://caffe.berkeleyvision.org/tutorial/layers/lrn.html) | [paddle.fluid.layers.lrn](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-99-lrn) | [Difference comparison](LRN.md) |
+| 20   | [Pooling](http://caffe.berkeleyvision.org/tutorial/layers/pooling.html) | [paddle.fluid.layers.pool2d](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-115-pool2d) | [Difference comparison](Pooling.md) |
+| 21   | [Power](http://caffe.berkeleyvision.org/tutorial/layers/power.html) | [paddle.fluid.layers.pow](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-117-pow) | [Difference comparison](Power.md) |
+| 22   | [PReLU](http://caffe.berkeleyvision.org/tutorial/layers/prelu.html) | [paddle.fluid.layers.prelu](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-118-prelu) | Interface match |
+| 23   | [Reduction](http://caffe.berkeleyvision.org/tutorial/layers/reduction.html) | - | [PaddlePaddle implementation](Reduction.md) |
+| 24   | [ReLU](http://caffe.berkeleyvision.org/tutorial/layers/relu.html) | - | [PaddlePaddle implementation](ReLU.md) |
+| 25   | [Reshape](http://caffe.berkeleyvision.org/tutorial/layers/reshape.html) | [paddle.fluid.layers.reshape](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-130-reshape) | [Difference comparison](Reshape.md) |
+| 26   | [Scale](http://caffe.berkeleyvision.org/tutorial/layers/scale.html) | [paddle.fluid.layers.scale](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-137-scale) | [Difference comparison](Scale.md) |
+| 27   | [SigmoidCrossEntropyLoss](http://caffe.berkeleyvision.org/tutorial/layers/sigmoidcrossentropyloss.html) | [paddle.fluid.layers.sigmoid_cross_entropy_with_logits](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-158-sigmoid_cross_entropy_with_logits) | [Difference comparison](SigmoidCrossEntropyLoss.md) |
+| 28   | [Sigmoid](http://caffe.berkeleyvision.org/tutorial/layers/sigmoid.html) | [paddle.fluid.layers.sigmoid](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-192-sigmoid) | Interface match |
+| 29   | [Slice](http://caffe.berkeleyvision.org/tutorial/layers/slice.html) | [paddle.fluid.layers.slice](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-160-slice) | [Difference comparison](Slice.md) |
+| 30   | [SoftmaxWithLoss](http://caffe.berkeleyvision.org/tutorial/layers/softmaxwithloss.html) | [paddle.fluid.layers.softmax_with_cross_entropy](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-164-softmax_with_cross_entropy) | [Difference comparison](SofmaxWithLoss.md) |
+| 31   | [Softmax](http://caffe.berkeleyvision.org/tutorial/layers/softmax.html) | [paddle.fluid.layers.softmax](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-163-softmax) | [Difference comparison](Sofmax.md) |
+| 32   | [TanH](http://caffe.berkeleyvision.org/tutorial/layers/tanh.html) | [paddle.fluid.layers.tanh](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-199-tanh) | Interface match |
+| 33   | [Tile](http://caffe.berkeleyvision.org/tutorial/layers/tile.html) | [paddle.fluid.layers.expand](http://paddlepaddle.org/documentation/docs/zh/1.3/api_cn/layers_cn.html#permalink-70-expand) | [Difference comparison](Tile.md) |