word2vec test issue summary
Created by: ccmeteorljh
1. Data preprocessing:
- Please build paths with os.path.join(); otherwise the output path contains an extra '/'. See https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleRec/word2vec/preprocess.py#L110 and https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleRec/word2vec/preprocess.py#L150. Current output:
build dict : data/text//text8
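A minimal sketch of the difference, assuming the directory argument already ends with '/' and is then concatenated with another '/' (the variable names are hypothetical and not taken from preprocess.py):

```python
import os

data_dir = "data/text/"   # hypothetical directory argument with a trailing '/'
filename = "text8"        # hypothetical file name

# Plain string concatenation keeps both slashes.
concatenated = data_dir + "/" + filename

# os.path.join only inserts a separator when one is missing.
joined = os.path.join(data_dir, filename)

print(concatenated)
print(joined)
```

With os.path.join the result is the same whether or not the caller passes a trailing separator.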
2. Training:
- Error under Python 3.5:
Traceback (most recent call last):
File "train.py", line 228, in <module>
train(args)
File "train.py", line 198, in train
filelist, 0, 1)
File "/home/crim/models/fluid/PaddleRec/word2vec/reader.py", line 79, in __init__
self.dict_size)) + " word_all_count = " + str(word_all_count)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
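The traceback looks consistent with a Python 2 style print whose string concatenation sits outside the print(...) call: under Python 3, print() returns None, so `None + " word_all_count = "` raises this TypeError. This is an inference from the traceback only, since the full line in reader.py is not shown here. A sketch of a form that behaves the same on Python 2 and 3, using stand-in values:

```python
from __future__ import print_function

dict_size = 63642          # stand-in for self.dict_size
word_all_count = 17005207  # stand-in value

# Build the whole message first, then pass it to print() as one argument.
message = ("dict_size = " + str(dict_size) +
           " word_all_count = " + str(word_all_count))
print(message)
```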
- Training on a small dataset fails with a dimension mismatch:
write word2id file to : data/test_build_dict_word_to_id_
('corpus_size:', 17005207)
dict_size = 63642 word_all_count = 17005207
CPU_NUM:5
Traceback (most recent call last):
File "train.py", line 228, in <module>
train(args)
File "train.py", line 223, in train
id_frequencys_pow)
File "train.py", line 151, in train_loop
loss_val = train_exe.run(fetch_list=[loss.name])
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/parallel_executor.py", line 303, in run
self.executor.run(fetch_list, fetch_var_name)
paddle.fluid.core.EnforceNotMet: Invoke operator elementwise_add error.
Python Callstacks:
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/framework.py", line 1317, in append_op
attrs=kwargs.get("attrs", None))
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/layer_helper.py", line 56, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/layers/nn.py", line 8843, in _elementwise_op
'use_mkldnn': use_mkldnn})
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/layers/nn.py", line 8884, in elementwise_add
return _elementwise_op(LayerHelper('elementwise_add', **locals()))
File "/home/crim/models/fluid/PaddleRec/word2vec/net.py", line 106, in skip_gram_word2vec
neg_xent, dim=1))
File "train.py", line 208, in train
neg_num=args.nce_num)
File "train.py", line 228, in <module>
train(args)
C++ Callstacks:
Enforce failed. Expected x_dims[i + axis] == y_dims[i], but received x_dims[i + axis]:1 != y_dims[i]:100.
Broadcast dimension mismatch. at [/paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:63]
PaddlePaddle Call Stacks:
0 0x7fcec43777bdp void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 365
1 0x7fcec4377b07p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 87
2 0x7fcec46a0fb0p paddle::operators::get_mid_dims(paddle::framework::DDim const&, paddle::framework::DDim const&, int, int*, int*, int*) + 400
3 0x7fcec4e6d908p void paddle::operators::ElementwiseComputeEx<paddle::operators::AddFunctor<float>, paddle::platform::CPUDeviceContext, float, float>(paddle::framework::ExecutionContext const&, paddle::framework::Tensor const*, paddle::framework::Tensor const*, int, paddle::operators::AddFunctor<float>, paddle::framework::Tensor*) + 792
4 0x7fcec4e713d9p paddle::operators::ElementwiseAddKernel<paddle::platform::CPUDeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const + 473
5 0x7fcec4e718c3p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::ElementwiseAddKernel<paddle::platform::CPUDeviceContext, float>, paddle::operators::ElementwiseAddKernel<paddle::platform::CPUDeviceContext, double>, paddle::operators::ElementwiseAddKernel<paddle::platform::CPUDeviceContext, int>, paddle::operators::ElementwiseAddKernel<paddle::platform::CPUDeviceContext, long> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 35
6 0x7fcec5390953p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 659
7 0x7fcec538f47bp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 267
8 0x7fcec5203353p
9 0x7fcec5202fecp paddle::framework::details::ComputationOpHandle::RunImpl() + 124
10 0x7fcec51fd16cp paddle::framework::details::OpHandleBase::Run(bool) + 28
11 0x7fcec51e7d9ap
12 0x7fcec4ba19a3p std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&) + 35
13 0x7fcec4b6a697p std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&) + 39
14 0x7fcedcf7ba99p
15 0x7fcec51e6b92p
16 0x7fcec4b6bac4p ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const + 404
17 0x7fced53d17e0p
18 0x7fcedcf746bap
19 0x7fcedccaa41dp clone + 109
- Please document the configuration required to reproduce the paper's results, e.g. embedding_size=300.
- Inference issues:
- Suggest wrapping the fluid.io.load_params call in try ... except ...
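One way the suggestion could look, as a sketch only. `load_params_safely` and `load_fn` are hypothetical names; `load_fn` stands in for the real call, for example `lambda d: fluid.io.load_params(executor=exe, dirname=d)`, so the snippet runs without PaddlePaddle installed:

```python
import os


def load_params_safely(load_fn, model_dir):
    """Call load_fn(model_dir) and report a failure instead of crashing.

    load_fn is a hypothetical stand-in for the actual parameter loader.
    Returns True on success and False otherwise.
    """
    if not os.path.isdir(model_dir):
        print("model directory not found: %s" % model_dir)
        return False
    try:
        load_fn(model_dir)
    except Exception as err:  # broad on purpose: the loader may raise several types
        print("failed to load parameters from %s: %s" % (model_dir, err))
        return False
    return True
```

The caller can then skip a missing or broken checkpoint and continue with the next one rather than aborting the whole inference run.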
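- Documentation usability: describe the directory layout of the full dataset, and make it match the commands given in the documentation.
- Multi-machine training fails: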
Traceback (most recent call last):
File "cluster_train.py", line 250, in <module>
train(args)
File "cluster_train.py", line 202, in train
if not os.path.isdir(args.model_output_dir) and args.train_id == 0:
AttributeError: 'Namespace' object has no attribute 'train_id'
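The AttributeError means that no argument named train_id was declared on the parser, so the attribute the code reads and the argument the parser defines are out of sync. A sketch of the mismatch with argparse; the name `--trainer_id` is hypothetical and may not be what cluster_train.py actually declares:

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical argument name; the real script may use a different one.
parser.add_argument("--trainer_id", type=int, default=0)
args = parser.parse_args([])

# Reading args.train_id here would raise AttributeError, because the
# Namespace only carries attributes for declared arguments.
has_train_id = hasattr(args, "train_id")

# Reading the attribute under the declared name works.
trainer_id = args.trainer_id
print(has_train_id, trainer_id)
```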