-

-Figure 1. Architecture of the Deep Speech 2 network.
-
-
-We don't have to stick to this 2-3-7-1-1-1 depth \[[2](#references)\]. Similar networks with different depths may also work well. In \[[1](#references)\], the authors use a different depth (e.g. 2-2-3-1-1-1) for the final experiments.
-
-Key points about the layers:
-
-- **Data Layers**:
- - Frame sequences data of audio **spectrogram** (with FFT).
- - Token sequences data of **transcription** text (labels).
- - These two types of sequences do not have the same lengths; thus a CTC-loss layer is required.
-- **2D Convolution Layers**:
- - Not only temporal convolution, but also **frequency convolution**. Like a 2D image convolution, but with one variable-length dimension (the temporal dimension).
- - With striding only in the first convolution layer.
- - No pooling in any convolution layer.
-- **Uni-directional RNNs**:
- - Uni-directional + row convolution: for low-latency inference.
- - Bi-directional + without row convolution: if we don't care about inference latency.
-- **Row convolution**:
- - Looks only a few steps ahead into the features, instead of at the whole sequence as bi-directional RNNs do.
- - Not necessary with bi-directional RNNs.
- - "**Row**" means the convolutions are done within each frequency dimension (row), with no convolution kernels shared across rows.
-- **Batch Normalization Layers**:
- - Added to all the above layers (except the data and loss layers).
- - Sequence-wise normalization for RNNs: BatchNorm is performed only on the input-state projection and not on the state-state projection, for efficiency considerations.
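To make the spectrogram data layer (Task 3) concrete, here is a minimal NumPy sketch of computing a log-spectrogram from raw samples with a short-time FFT. The frame length, stride, window, and log compression below are illustrative assumptions, not the final design:

```python
import numpy as np

def spectrogram(samples, frame_len=320, stride=160):
    """Compute a log-power spectrogram via a short-time FFT.

    frame_len/stride are assumed values (20 ms / 10 ms at 16 kHz);
    the actual data layer (Task 3) may choose differently.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(samples) - frame_len) // stride
    frames = np.stack([samples[i * stride:i * stride + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps the non-redundant half of the spectrum: frame_len // 2 + 1 bins
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(spec + 1e-10).T  # (frequency, time); time dim is variable

audio = np.random.randn(16000)       # 1 second of fake 16 kHz audio
print(spectrogram(audio).shape)      # (161, 99)
```

Note the output has a fixed frequency dimension and a variable temporal dimension, which is why the 2D convolution layers above must handle one variable-length axis.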
-
-
-Required Components | PaddlePaddle Support | Need to Develop
-:------------------------------------- | :-------------------------------------- | :-----------------------
-Data Layer I (Spectrogram) | Not supported yet. | TBD (Task 3)
-Data Layer II (Transcription) | `paddle.data_type.integer_value_sequence` | -
-2D Convolution Layer | `paddle.layer.image_conv_layer` | -
-DataType Converter (vec2seq) | `paddle.layer.block_expand` | -
-Bi-/Uni-directional RNNs | `paddle.layer.recurrent_group` | -
-Row Convolution Layer | Not supported yet. | TBD (Task 4)
-CTC-loss Layer | `paddle.layer.warp_ctc` | -
-Batch Normalization Layer | `paddle.layer.batch_norm` | -
-CTC-Beam search | Not supported yet. | TBD (Task 6)
-
-### Row Convolution
-
-TODO by Assignees
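While the layer design is still TBD (Task 4), a minimal NumPy sketch of the lookahead behavior described above may serve as a reference. The `(features, context)` weight shape and zero padding of future steps are assumptions for illustration:

```python
import numpy as np

def row_convolution(x, w):
    """x: (time, features) activations; w: (features, context) weights.

    Each feature row i is convolved with its own kernel w[i] over the
    current step and `context - 1` future steps; no kernel is shared
    across rows, hence "row" convolution.
    """
    T, D = x.shape
    context = w.shape[1]
    # zero-pad the future so the last steps still see `context` inputs
    x_pad = np.vstack([x, np.zeros((context - 1, D))])
    out = np.zeros((T, D))
    for t in range(T):
        out[t] = np.sum(w.T * x_pad[t:t + context], axis=0)
    return out
```

With `context = 1` and unit weights this reduces to the identity, which is a convenient sanity check for any future implementation.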
-
-### Beam Search with CTC and LM
-
-TODO by Assignees
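The decoder design is still TBD (Task 6). As a reference point, here is a minimal NumPy prefix beam search over per-frame CTC posteriors, without a language model; an LM score would be folded in at the points where a prefix is extended. All names and the vocabulary layout (blank = 0) are assumptions:

```python
import numpy as np
from collections import defaultdict

def ctc_beam_search(probs, beam_size=10, blank=0):
    """probs: (time, vocab) per-frame posteriors; returns the best label tuple.

    Each beam entry maps prefix -> (p_blank, p_non_blank): the probability of
    the prefix ending in a blank vs. a non-blank at the current frame.
    """
    beams = {(): (1.0, 0.0)}
    for t in range(probs.shape[0]):
        next_beams = defaultdict(lambda: (0.0, 0.0))
        for prefix, (p_b, p_nb) in beams.items():
            for s in range(probs.shape[1]):
                p = probs[t, s]
                if s == blank:
                    nb_b, nb_nb = next_beams[prefix]
                    next_beams[prefix] = (nb_b + p * (p_b + p_nb), nb_nb)
                elif prefix and s == prefix[-1]:
                    # a repeated symbol extends the prefix only across a blank
                    nb_b, nb_nb = next_beams[prefix]
                    next_beams[prefix] = (nb_b, nb_nb + p * p_nb)
                    eb_b, eb_nb = next_beams[prefix + (s,)]
                    next_beams[prefix + (s,)] = (eb_b, eb_nb + p * p_b)
                else:
                    eb_b, eb_nb = next_beams[prefix + (s,)]
                    next_beams[prefix + (s,)] = (eb_b, eb_nb + p * (p_b + p_nb))
        beams = dict(sorted(next_beams.items(),
                            key=lambda kv: sum(kv[1]), reverse=True)[:beam_size])
    return max(beams.items(), key=lambda kv: sum(kv[1]))[0]
```

This sketch keeps the standard CTC collapsing rules (repeats merge unless separated by a blank); a production decoder would work in log space and prune more aggressively.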
-
-## Future Work
-
-- Efficiency Improvement
-- Accuracy Improvement
-- Low-latency Inference Library
-- Large-scale benchmarking
-
-## References
-
1. Dario Amodei et al., [Deep Speech 2: End-to-End Speech Recognition in English and Mandarin](http://proceedings.mlr.press/v48/amodei16.pdf). ICML 2016.
2. Dario Amodei et al., [Deep Speech 2: End-to-End Speech Recognition in English and Mandarin](https://arxiv.org/abs/1512.02595). arXiv:1512.02595.
diff --git a/doc/design/var_desc.md b/doc/design/var_desc.md
deleted file mode 100644
index 0b2958c1b10ef6a6ce51aa75f61e15a7f2d94b3f..0000000000000000000000000000000000000000
--- a/doc/design/var_desc.md
+++ /dev/null
@@ -1,69 +0,0 @@
-## Background
-PaddlePaddle divides the description of the neural network computation graph into two stages: compile time and runtime.
-
-PaddlePaddle uses proto messages to describe the compile-time graph because:
-
-1. The computation graph should be able to be saved to a file.
-1. In distributed training, the graph will be serialized and sent to multiple workers.
-
-The computation graph is constructed from data nodes and operation nodes. The concepts representing them are listed in the table below.
-
-| |compile time|runtime|
-|---|---|---|
-|Data|VarDesc(proto)|Variable(cpp)|
-|Operation|OpDesc(proto)|Operator(cpp)|
-
-
-## Definition of VarDesc
-
-A VarDesc has a name and a type. There are two kinds of variable types at compile time: `LoDTensor` and `SelectedRows`.
-
-```proto
-message VarDesc {
- required string name = 1;
- enum VarType {
- LOD_TENSOR = 0;
- SELECTED_ROWS = 1;
- }
- required VarType type = 2;
- optional LoDTensorDesc lod_desc = 3;
- optional TensorDesc selected_rows_desc = 4;
- optional bool persistable = 5 [ default = false ];
-}
-```
-
-## Definition of TensorDesc
-
-```proto
-enum DataType {
- BOOL = 0;
- INT16 = 1;
- INT32 = 2;
- INT64 = 3;
- FP16 = 4;
- FP32 = 5;
- FP64 = 6;
-}
-
-message TensorDesc {
- required DataType data_type = 1;
- repeated int64 dims = 2; // [UNK, 640, 480] is saved as [-1, 640, 480]
-}
-```
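The `dims` comment above encodes an unknown (batch) dimension as `-1`. A tiny sketch of resolving such a shape at runtime; the helper name is hypothetical:

```python
def resolve_dims(desc_dims, batch_size):
    """Replace the -1 placeholder (unknown batch dimension) with the actual
    batch size at runtime; the other entries are fixed at compile time."""
    return [batch_size if d == -1 else d for d in desc_dims]

print(resolve_dims([-1, 640, 480], 32))  # [32, 640, 480]
```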
-
-A TensorDesc describes both `SelectedRows` and `LoDTensor`. For details of `SelectedRows`, please refer to [`SelectedRows`](./selected_rows.md).
-
-## Definition of LodTensorDesc
-
-```proto
-message LoDTensorDesc {
- required TensorDesc tensor = 1;
- optional int32 lod_level = 2;
-}
-```
-
-A LoDTensorDesc contains a tensor and a lod_level.
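To illustrate what the `lod_level` buys at runtime, here is a small Python sketch of a level-1 LoD partitioning the rows of a tensor into variable-length sequences; the offsets used are an assumed example:

```python
import numpy as np

data = np.arange(10).reshape(5, 2)  # 5 time steps, feature size 2
lod = [[0, 2, 5]]                   # level-1 LoD: row offsets of each sequence

# two sequences: rows [0, 2) and [2, 5)
sequences = [data[start:end] for start, end in zip(lod[0], lod[0][1:])]
print([len(s) for s in sequences])  # [2, 3]
```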
-
-## Definition of Variable in Python
-
-For Variable in Python, please refer to the [`Python API`](./python_api.md).
diff --git a/doc/faq/build_and_install/index_cn.rst b/doc/faq/build_and_install/index_cn.rst
deleted file mode 100644
index f1677e216f31d79b53ac29a0afbf6fbb886a0dcd..0000000000000000000000000000000000000000
--- a/doc/faq/build_and_install/index_cn.rst
+++ /dev/null
@@ -1,111 +0,0 @@
-################################
-Compile, Install, and Unit Tests
-################################
-
-.. contents::
-
-1. Running the Docker GPU image reports "CUDA driver version is insufficient"
------------------------------------------------------------------------------
-
-When using PaddlePaddle's GPU Docker image, users often encounter ``Cuda Error: CUDA driver version is insufficient for CUDA runtime version``. The cause is that the CUDA driver and libraries on the host machine are not mapped into the container.
-The solution is:
-
-.. code-block:: bash
-
- $ export CUDA_SO="$(\ls /usr/lib64/libcuda* | xargs -I{} echo '-v {}:{}') $(\ls /usr/lib64/libnvidia* | xargs -I{} echo '-v {}:{}')"
- $ export DEVICES=$(\ls /dev/nvidia* | xargs -I{} echo '--device {}:{}')
- $ docker run ${CUDA_SO} ${DEVICES} -it paddledev/paddlepaddle:latest-gpu
-
-For more information on installing and using Docker, please refer to the `PaddlePaddle Docker documentation