SpanBERT PyTorch model conversion fails (#267) · Issue · PaddlePaddle / X2Paddle

Created by: Akeepers

The goal is to convert SpanBERT into a Paddle model.
X2Paddle was installed using the first installation method.

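The PyTorch-to-ONNX export completed without errors, but the ONNX-to-Paddle conversion fails. The conversion command and the error output are as follows: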

x2paddle --framework=onnx --model=spanBert.onnx --save_dir=pd_model

paddle.__version__ = 1.8.1
Now translating model from onnx to paddle.
model ir_version: 6, op version: 9
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1334456625
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1334456625
2020-06-05 11:29:45,087-WARNING: During conversion of your  model, some operators will be assignd node.out_shape==None, refer to https://github.com/onnx/onnx/blob/master/docs/ShapeInference.md
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1334456625
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1334561477
Total nodes: 2372
Traceback (most recent call last):
  File "/home/yangpan/anaconda3/envs/tf/bin/x2paddle", line 11, in <module>
    load_entry_point('x2paddle==0.7.1', 'console_scripts', 'x2paddle')()
  File "/home/yangpan/anaconda3/envs/tf/lib/python3.6/site-packages/x2paddle-0.7.1-py3.6.egg/x2paddle/convert.py", line 251, in main
    onnx2paddle(args.model, args.save_dir, params_merge)
  File "/home/yangpan/anaconda3/envs/tf/lib/python3.6/site-packages/x2paddle-0.7.1-py3.6.egg/x2paddle/convert.py", line 173, in onnx2paddle
    mapper = ONNXOpMapper(model, save_dir)
  File "/home/yangpan/anaconda3/envs/tf/lib/python3.6/site-packages/x2paddle-0.7.1-py3.6.egg/x2paddle/op_mapper/onnx_op_mapper.py", line 97, in __init__
    self.elementwise_map(node)
  File "/home/yangpan/anaconda3/envs/tf/lib/python3.6/site-packages/x2paddle-0.7.1-py3.6.egg/x2paddle/op_mapper/onnx_op_mapper.py", line 282, in elementwise_map
    if len(val_x_shape) < len(val_y_shape):
TypeError: object of type 'NoneType' has no len()
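
The last frame of the traceback shows `len()` being called on a shape that is `None`, which is consistent with the earlier warning that some operators are assigned `node.out_shape==None`. The following is a minimal sketch of that failure mode and of a guard that would report it more clearly. `needs_operand_swap` is a hypothetical helper written for illustration, not X2Paddle's actual `elementwise_map`; only the `len(val_x_shape) < len(val_y_shape)` comparison is taken from the traceback, and the example shapes are made up.

```python
from typing import Optional, Sequence


def needs_operand_swap(val_x_shape: Optional[Sequence[int]],
                       val_y_shape: Optional[Sequence[int]]) -> bool:
    """Return True when x has lower rank than y (hypothetical helper)."""
    if val_x_shape is None or val_y_shape is None:
        # Without this guard, len(None) raises the TypeError seen above.
        raise ValueError("operand shape is unknown; shape inference did not produce it")
    return len(val_x_shape) < len(val_y_shape)


# Unguarded comparison, as in the traceback: fails when a shape is None.
try:
    len(None) < len([1, 256, 1024])
except TypeError as exc:
    unguarded_error = str(exc)

# With both shapes known (illustrative values), the comparison works.
known_case = needs_operand_swap([1024], [1, 256, 1024])
```

A guard like this does not fix the conversion. It only turns the opaque `TypeError` into a message that points at the missing shape.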

The code used for the PyTorch-to-ONNX export is as follows:

import torch
import torchvision
from pytorch_pretrained_bert import BertModel
import numpy as np

config_file="/home/yangpan/projects/coref/models/config.json"
model_file="/home/yangpan/projects/coref/models/spanbert_hf.tar.gz"
vocab_file="/home/yangpan/projects/coref/models/bert-large-cased-vocab.txt"

Batch_size=1
seg_length=256

# Specify the input shape
# dummy_input0 = torch.LongTensor(Batch_size, seg_length).to(torch.device("cuda"))
# dummy_input1 = torch.LongTensor(Batch_size, seg_length).to(torch.device("cuda"))
# dummy_input2 = torch.LongTensor(Batch_size, seg_length).to(torch.device("cuda"))
# dummy_input3 = torch.LongTensor(Batch_size, seg_length).to(torch.device("cuda"))
# dummy_input0 = torch.LongTensor(Batch_size, seg_length)
# dummy_input1 = torch.LongTensor(Batch_size, seg_length)
# dummy_input2 = torch.LongTensor(Batch_size, seg_length)
# dummy_input3 = torch.LongTensor(Batch_size, seg_length)


a0=np.random.randint(0,10,size=[Batch_size,seg_length])
a1=np.random.randint(0,10,size=[Batch_size,seg_length])
a2=np.random.randint(0,10,size=[Batch_size,seg_length])
a3=np.random.randint(0,10,size=[Batch_size,seg_length])
dummy_input0=torch.from_numpy(a0)
dummy_input1=torch.from_numpy(a1)
dummy_input2=torch.from_numpy(a2)
dummy_input3=torch.from_numpy(a3)

model = BertModel.from_pretrained(pretrained_model_name_or_path='spanbert-large-cased')
# model = BertModel.from_pretrained(config_file=config_file,
#                                   pretrained_model_name_or_path=model_file,
#                                   vocab_file=vocab_file)

torch.onnx.export(model, dummy_input0,"spanBert.onnx",verbose=True)
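
One way to investigate the `node.out_shape==None` warning is to run ONNX shape inference on the exported file and list the node outputs that still have no type or shape record. The sketch below assumes the `onnx` package is installed and that missing `value_info` entries are a plausible source of the `None` shapes; that link to X2Paddle's behavior is an assumption, not something the log confirms. `outputs_without_value_info` and `check_onnx_file` are hypothetical helper names, and the toy graph is a stand-in for a real model.

```python
from types import SimpleNamespace


def outputs_without_value_info(graph):
    """List node outputs that have no type/shape record in the graph."""
    described = {vi.name for vi in graph.value_info}
    described.update(vi.name for vi in graph.output)
    return [name
            for node in graph.node
            for name in node.output
            if name and name not in described]


def check_onnx_file(path):
    """Run ONNX shape inference on a file and report undescribed outputs."""
    import onnx  # imported lazily so the helper above works without onnx
    from onnx import shape_inference

    model = onnx.load(path)
    inferred = shape_inference.infer_shapes(model)
    return outputs_without_value_info(inferred.graph)


# Tiny stand-in graph to show the helper's behavior without a real model.
toy_graph = SimpleNamespace(
    node=[SimpleNamespace(output=["a"]), SimpleNamespace(output=["b"])],
    value_info=[SimpleNamespace(name="a")],
    output=[],
)
missing = outputs_without_value_info(toy_graph)
```

Calling `check_onnx_file("spanBert.onnx")` would list the tensors for which inference produced no record. An empty list would not guarantee that X2Paddle finds every shape it needs.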
