PaddlePaddle / Paddle
Opened Sep 10, 2020 by saxon_zh (@saxon_zh, Guest)

[operator < read > error] raised when training a model

Created by: lagelanhai

  • Version and environment information:
    1) PaddlePaddle version: PaddlePaddle 1.8.4
    2) System environment: CentOS 7

2020-09-10 16:26:35,460-INFO: {'Global': {'debug': False, 'algorithm': 'CRNN', 'use_gpu': False, 'epoch_num': 1000, 'log_smooth_window': 20, 'print_batch_step': 10, 'save_model_dir': './output/rec_CRNN', 'save_epoch_step': 300, 'eval_batch_step': 500, 'train_batch_size_per_card': 256, 'test_batch_size_per_card': 256, 'image_shape': [3, 32, 100], 'max_text_length': 25, 'character_type': 'ch', 'use_space_char': True, 'loss_type': 'ctc', 'distort': True, 'character_dict_path': './ppocr/utils/ic15_dict.txt', 'reader_yml': './configs/rec/rec_icdar15_reader.yml', 'pretrain_weights': './pretrain_models/rec_mv3_none_bilstm_ctc/best_accuracy', 'checkpoints': None, 'save_inference_dir': None, 'infer_img': None}, 'Architecture': {'function': 'ppocr.modeling.architectures.rec_model,RecModel'}, 'Backbone': {'function': 'ppocr.modeling.backbones.rec_mobilenet_v3,MobileNetV3', 'scale': 0.5, 'model_name': 'large'}, 'Head': {'function': 'ppocr.modeling.heads.rec_ctc_head,CTCPredict', 'encoder_type': 'rnn', 'SeqRNN': {'hidden_size': 96}}, 'Loss': {'function': 'ppocr.modeling.losses.rec_ctc_loss,CTCLoss'}, 'Optimizer': {'function': 'ppocr.optimizer,AdamDecay', 'base_lr': 0.0005, 'beta1': 0.9, 'beta2': 0.999, 'decay': {'function': 'cosine_decay', 'step_each_epoch': 20, 'total_epoch': 1000}}, 'TrainReader': {'reader_function': 'ppocr.data.rec.dataset_traversal,SimpleReader', 'num_workers': 1, 'img_set_dir': './train_data/zhengTest', 'label_file_path': './train_data/zhengTest/rec_gt_train.txt'}, 'EvalReader': {'reader_function': 'ppocr.data.rec.dataset_traversal,SimpleReader', 'img_set_dir': './train_data/zhengTest', 'label_file_path': './train_data/zhengTest/rec_gt_test.txt'}, 'TestReader': {'reader_function': 'ppocr.data.rec.dataset_traversal,SimpleReader'}}
2020-09-10 16:26:35,980-INFO: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000000] in Optimizer will not take effect, and it will only be applied to other Parameters!
2020-09-10 16:26:37,497-INFO: Distort operation can only support in GPU.Distort will be set to False.
2020-09-10 16:26:37,498-INFO: places would be ommited when DataLoader is not iterable
2020-09-10 16:26:37,498-INFO: Distort operation can only support in GPU.Distort will be set to False.
2020-09-10 16:26:37,728-INFO: Loading parameters from ./pretrain_models/rec_mv3_none_bilstm_ctc/best_accuracy...
2020-09-10 16:26:37,782-WARNING: variable ctc_fc_b_attr not used
2020-09-10 16:26:37,782-WARNING: variable ctc_fc_w_attr not used
2020-09-10 16:26:37,818-INFO: Finish initing model from ./pretrain_models/rec_mv3_none_bilstm_ctc/best_accuracy
!!! The CPU_NUM is not specified, you should set CPU_NUM in the environment variable list.
CPU_NUM indicates that how many CPUPlace are used in the current task.
And if this parameter are set as N (equal to the number of physical CPU core) the program may be faster.

export CPU_NUM=8 # for example, set CPU_NUM as number of physical CPU core which is 8.

!!! The default number of CPU_NUM=1.
W0910 16:26:37.854447 23655 build_strategy.cc:170] fusion_group is not enabled for Windows/MacOS now, and only effective when running with CUDA GPU.
Process Process-1:
2020-09-10 16:26:38,041-WARNING: Your reader has raised an exception!
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/python3/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/reader/decorator.py", line 556, in _read_into_queue
    six.reraise(*sys.exc_info())
  File "/usr/local/python3/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/reader/decorator.py", line 549, in _read_into_queue
    for sample in reader():
  File "/root/PaddleORC/PaddleOCR/ppocr/data/rec/dataset_traversal.py", line 324, in batch_iter_reader
    for outs in sample_iter_reader():
  File "/root/PaddleORC/PaddleOCR/ppocr/data/rec/dataset_traversal.py", line 286, in sample_iter_reader
    self.num_workers))
Exception: The number of the whole data (8) is smaller than the batch_size * devices_num * num_workers (256)
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/python3/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1145, in thread_main
    six.reraise(*sys.exc_info())
  File "/usr/local/python3/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1125, in thread_main
    for tensors in self._tensor_reader():
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1195, in tensor_reader_impl
    for slots in paddle_reader():
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/data_feeder.py", line 506, in reader_creator
    for item in reader():
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/reader/decorator.py", line 572, in queue_reader
    raise ValueError("multiprocess reader raises an exception")
ValueError: multiprocess reader raises an exception

/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "tools/train.py", line 123, in <module>
    main()
  File "tools/train.py", line 100, in main
    program.train_eval_rec_run(config, exe, train_info_dict, eval_info_dict)
  File "/root/PaddleORC/PaddleOCR/tools/program.py", line 345, in train_eval_rec_run
    return_numpy=False)
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1071, in run
    six.reraise(*sys.exc_info())
  File "/usr/local/python3/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1066, in run
    return_merged=return_merged)
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 1167, in _run_impl
    return_merged=return_merged)
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/executor.py", line 879, in _run_parallel
    tensors = exe.run(fetch_var_names, return_merged)._move_to_list()
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::reader::BlockingQueue<std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> > >::Receive(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
3 paddle::operators::reader::PyReader::ReadNext(std::vector<paddle::framework::LoDTensor, std::allocator<paddle::framework::LoDTensor> >)
4 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, unsigned long> >::_M_invoke(std::_Any_data const&)
5 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
6 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const


Python Call Stacks (More useful to users):

  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2610, in append_op
    attrs=kwargs.get("attrs", None))
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/reader.py", line 1080, in _init_non_iterable
    attrs={'drop_last': self._drop_last})
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/reader.py", line 978, in __init__
    self._init_non_iterable()
  File "/usr/local/python3/lib/python3.6/site-packages/paddle/fluid/reader.py", line 620, in from_generator
    iterable, return_list, drop_last)
  File "/root/PaddleORC/PaddleOCR/ppocr/modeling/architectures/rec_model.py", line 135, in create_feed
    iterable=False)
  File "/root/PaddleORC/PaddleOCR/ppocr/modeling/architectures/rec_model.py", line 185, in __call__
    image, labels, loader = self.create_feed(mode)
  File "/root/PaddleORC/PaddleOCR/tools/program.py", line 170, in build
    dataloader, outputs = model(mode=mode)
  File "tools/train.py", line 50, in main
    config, train_program, startup_program, mode='train')
  File "tools/train.py", line 123, in <module>
    main()


Error Message Summary:

Error: Blocking queue is killed because the data reader raises an exception
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] at (/paddle/paddle/fluid/operators/reader/blocking_queue.h:141)
  [operator < read > error]

I'm not sure what is causing this.
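
For reference, the first exception in the log comes from the data reader, not from the `read` operator itself: "The number of the whole data (8) is smaller than the batch_size * devices_num * num_workers (256)". With `train_batch_size_per_card` set to 256, `num_workers` set to 1, and CPU_NUM defaulting to 1, the product is 256, while the training label file apparently yields only 8 samples. The following is a minimal sketch of that size check, not PaddleOCR's actual code; the function names are hypothetical and `devices_num=1` is an assumption taken from the "default number of CPU_NUM=1" message.

```python
def min_samples_required(batch_size, devices_num=1, num_workers=1):
    """Smallest dataset size that can fill one batch per device and worker."""
    return batch_size * devices_num * num_workers


def check_dataset_size(num_samples, batch_size, devices_num=1, num_workers=1):
    """Raise an error shaped like the one in the log if the dataset is too small.

    Hypothetical helper: it mirrors the message printed by the reader,
    but it is not the reader's real implementation.
    """
    required = min_samples_required(batch_size, devices_num, num_workers)
    if num_samples < required:
        raise ValueError(
            "The number of the whole data ({}) is smaller than the "
            "batch_size * devices_num * num_workers ({})".format(
                num_samples, required
            )
        )
    return required


def largest_usable_batch_size(num_samples, devices_num=1, num_workers=1):
    """Largest per-card batch size that the check above would accept."""
    return num_samples // (devices_num * num_workers)


if __name__ == "__main__":
    # Values read from the log: 8 samples, batch size 256, 1 device, 1 worker.
    try:
        check_dataset_size(8, 256, devices_num=1, num_workers=1)
    except ValueError as err:
        print(err)
    print(largest_usable_batch_size(8, devices_num=1, num_workers=1))
```

Under these assumptions, either lowering `train_batch_size_per_card` to 8 or less, or adding more samples to `rec_gt_train.txt`, should let the reader pass this check. Whether training then proceeds without further errors is not something the log can confirm.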

Reference: paddlepaddle/Paddle#27238