使用rec_chinese_lite_train.yml配置文件训练中文数据集效果不好
Created by: NextGuido
我更改了rec_chinese_lite_train.yml
配置文件的内容,如下:
Global:
algorithm: CRNN
use_gpu: true
epoch_num: 300
log_smooth_window: 20
print_batch_step: 10
save_model_dir: ./output/rec_CRNN
save_epoch_step: 3
eval_batch_step: 2000
train_batch_size_per_card: 160
test_batch_size_per_card: 160
image_shape: [3, 32, 320]
max_text_length: 25
character_type: ch
character_dict_path: ./ppocr/utils/18548.txt
loss_type: ctc
reader_yml: ./configs/rec/rec_chinese_reader_csv.yml
pretrain_weights: ./pretrain_models/rec_mv3_none_bilstm_ctc/best_accuracy
checkpoints:
save_inference_dir:
Architecture:
function: ppocr.modeling.architectures.rec_model,RecModel
Backbone:
function: ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3
scale: 0.5
model_name: small
Head:
function: ppocr.modeling.heads.rec_ctc_head,CTCPredict
encoder_type: rnn
SeqRNN:
hidden_size: 48
Loss:
function: ppocr.modeling.losses.rec_ctc_loss,CTCLoss
Optimizer:
function: ppocr.optimizer,AdamDecay
base_lr: 0.0005
beta1: 0.9
beta2: 0.999
其实只更改了epoch_num,batch_size,character_dict_path,reader_yml,pretrain_weights
这几处,但是迭代了100轮左右,现在的train acc只有0.2左右。我的字符集有18000多个,请问这个轻量级的网络目前来看是不是欠拟合了?因为我发现如果我换成resnet50作为backbone,网络可以训练的不错