This is an implementation of CRNN (CNN+LSTM+CTC) for Chinese text recognition.
## Building MXNet with warp-ctc
1. To use the `mxnet.symbol.WarpCTC` layer, first build Baidu's [warp-ctc](https://github.com/baidu-research/warp-ctc) library from source.
2. Then build MXNet from source with the warp-ctc config flags enabled.
## Data Preparation
1. Download the [Synthetic Chinese Dataset](https://pan.baidu.com/s/1dFda6R3) (contributed by https://github.com/senlinuc/caffe_ocr).
This dataset contains almost 3.6 million synthetic Chinese text images with 5,990 different categories. Each image contains a sequence of 10 characters.
2. Create train.txt and test.txt in the following format, one image per line:
image_name1 label1_1 label1_2 label1_3...
image_name2 label2_1 label2_2 label2_3...
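
The line format above can be read back with a few lines of Python. This is only a minimal sketch, not the loader used by train.py: the function names are hypothetical, and it assumes that fields are separated by whitespace, that image names contain no spaces, and that each label is an integer category index.

```python
from typing import Iterable, List, Tuple


def parse_label_line(line: str) -> Tuple[str, List[int]]:
    """Split one line into (image_name, [label, ...]).

    Assumption: labels are integer category indices.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(
            f"expected an image name and at least one label: {line!r}"
        )
    return fields[0], [int(field) for field in fields[1:]]


def parse_label_lines(lines: Iterable[str]) -> List[Tuple[str, List[int]]]:
    """Parse every non-blank line of a train.txt / test.txt style listing."""
    return [parse_label_line(line) for line in lines if line.strip()]
```

For example, `parse_label_lines(open("train.txt", encoding="utf-8"))` would return one `(image_name, labels)` pair per line of the file.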
## Training
1. Revise the paths to the images and txt files in train.py.
2. Run
```
$ python train.py 2>&1 | tee log.txt
```
3. After almost 19 epochs, you can get 99.0502% validation accuracy.
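
This README does not say how the validation accuracy is computed. As an illustration only, the sketch below shows one common way to score a CTC model: greedy-decode the per-frame predictions (collapse repeats, then drop blanks) and count a sample as correct only when the whole sequence matches. The function names are hypothetical, the blank index of 0 is an assumption, and whole-sequence matching may differ from the metric used in train.py.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is the CTC blank label


def ctc_greedy_decode(frame_ids: Sequence[int], blank: int = BLANK) -> List[int]:
    """Collapse repeated per-frame labels, then remove blanks."""
    decoded: List[int] = []
    previous = blank
    for idx in frame_ids:
        # Keep a label only when it starts a new run and is not the blank.
        if idx != previous and idx != blank:
            decoded.append(idx)
        previous = idx
    return decoded


def sequence_accuracy(
    predictions: Sequence[Sequence[int]],
    targets: Sequence[Sequence[int]],
) -> float:
    """Fraction of samples whose decoded sequence equals the target exactly."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        return 0.0
    correct = sum(
        1 for pred, target in zip(predictions, targets)
        if list(pred) == list(target)
    )
    return correct / len(predictions)
```

Here `frame_ids` stands for the argmax label of each output time step; a blank between two identical labels keeps both of them, which is how CTC represents doubled characters.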