[Rant] Error messages are not user-friendly enough when adding a Python OP in static graph mode
Created by: GT-ZhangAcer
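When adding a Python OP in static graph mode, the error reporting has a few problems.
1) PaddlePaddle version: 1.7 2) CPU/GPU: AMD ZEN 3) System environment: Windows 10 Home
The error-reporting improvements in V1.7 are really impressive, so I tried adding a Python OP in static graph mode. There, the error-reporting experience is not very friendly.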
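1. If an input argument is accidentally left out, the error message gives too little information, so it is hard to spot your own mistake.
2. If a return value is accidentally left out, no error is raised and memory blows up.
3. If a variable is not converted to numpy, no error is raised and memory blows up.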
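As a stopgap for problems 2 and 3, one option is to wrap the backward function in a small guard before passing it to `fluid.layers.py_func`. The sketch below is plain NumPy and is not part of Paddle: `checked_backward` and `expected_grads` are hypothetical names, and the wrapper has not been tested against Paddle 1.7 itself. It converts every argument with `numpy.array` and raises a readable error when the number of returned gradients is wrong.

```python
import functools

import numpy


def checked_backward(expected_grads):
    """Hypothetical guard for a py_func backward function (not a Paddle API).

    Converts every incoming argument to a numpy array and verifies that the
    wrapped function returns exactly `expected_grads` gradients.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            # Convert each argument up front, so a forgotten numpy.array()
            # call inside the function body no longer matters.
            arrays = [numpy.array(arg) for arg in args]
            result = func(*arrays)
            # Treat a single return value as a one-element tuple for counting.
            grads = result if isinstance(result, (tuple, list)) else (result,)
            if len(grads) != expected_grads:
                raise ValueError(
                    "%s returned %d gradient(s), expected %d"
                    % (func.__name__, len(grads), expected_grads))
            return result
        return wrapper
    return decorator


@checked_backward(expected_grads=2)
def backward_ok(out, target, ab_mean, d_ab_mean):
    d = 2 * (out - target) * d_ab_mean
    return d, -d


@checked_backward(expected_grads=2)
def backward_missing_grad(out, target, ab_mean, d_ab_mean):
    d = 2 * (out - target) * d_ab_mean
    return d  # only one gradient: the guard raises ValueError when called
```

Calling `backward_missing_grad` with any four array-like arguments raises `ValueError` immediately instead of failing silently later.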
The code is below. I experimented with a few small data points; copy it directly to reproduce. Code:
# Author: Acer Zhang
# Datetime:2020/3/10 21:43
import paddle.fluid as fluid
import numpy
# Run on CPU
place = fluid.CPUPlace()
exe = fluid.Executor(place)
train_data = [[1], [2], [3], [4], [5]]
y_true = [[3], [6], [9], [12], [15]]
# Define the network
x = fluid.data(name="x", shape=[-1, 1], dtype="float32")
y = fluid.data(name="y", shape=[-1, 1], dtype="float32")
y_predict = fluid.layers.fc(input=x, size=1)
loss = fluid.default_main_program().current_block().create_var(name="loss_tmp", shape=[-1, 1], dtype="float32")
def square_error_cost(ipt_a, ipt_b):
    a = numpy.array(ipt_a)
    b = numpy.array(ipt_b)
    ab_cost = numpy.square(a - b)
    return ab_cost
# Working version
def backward_square_error_cost(out, target, ab_mean, d_ab_mean):
    a = numpy.array(out)
    b = numpy.array(target)
    d_ab_mean = numpy.array(d_ab_mean)
    d = numpy.array(2 * (a - b)) * d_ab_mean
    return d, -d
# Version 1: memory blows up. Forgot to convert the inputs to numpy array objects
def backward_square_error_cost1(out, target, ab_mean, d_ab_mean):
    # a = numpy.array(out)
    # b = numpy.array(target)
    # d_ab_mean = numpy.array(d_ab_mean)
    d = numpy.array(2 * (out - target)) * d_ab_mean
    return d, -d
# Version 2: memory blows up. One return value is missing
def backward_square_error_cost2(out, target, ab_mean, d_ab_mean):
    a = numpy.array(out)
    b = numpy.array(target)
    d_ab_mean = numpy.array(d_ab_mean)
    d = numpy.array(2 * (out - target)) * d_ab_mean
    return d
# Working version
fluid.layers.py_func(func=square_error_cost, x=[y_predict, y], backward_func=backward_square_error_cost,
                     out=loss)
# Version 3: error message is not specific. One input variable is missing
# fluid.layers.py_func(func=square_error_cost, x=y_predict, backward_func=backward_square_error_cost,
#                      out=loss)
# Define the optimization method
loss = fluid.layers.mean(loss)
sgd_optimizer = fluid.optimizer.Adam(learning_rate=0.1)
sgd_optimizer.minimize(loss)
# Start training, iterate 100 times
exe.run(fluid.default_startup_program())
for i in range(100):
    for data_id in range(len(y_true)):
        data_x = numpy.array(train_data[data_id]).astype("float32").reshape((1, 1))
        data_y = numpy.array(y_true[data_id]).astype("float32").reshape((1, 1))
        outs = exe.run(
            feed={'x': data_x, 'y': data_y},
            fetch_list=[loss])
        print("loss:", outs[0])
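Screenshot of the version 3 error: I can no longer reproduce it... Now it also runs out of memory, whereas previously the error was that some variable could not be found.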
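For the version 3 case (passing a single variable as `x` to a forward function that takes two arguments), a pre-flight check in user code can turn the mistake into an immediate, readable error. This is only a sketch under assumptions: `check_py_func_inputs` is a hypothetical helper, not a Paddle API, and the strings below stand in for the fluid Variables of the script above.

```python
import inspect


def check_py_func_inputs(func, x):
    """Hypothetical pre-flight check, not a Paddle API.

    Raises TypeError when the number of entries in `x` cannot be bound to
    the positional parameters of `func`.
    """
    inputs = list(x) if isinstance(x, (list, tuple)) else [x]
    params = list(inspect.signature(func).parameters.values())
    positional = [p for p in params
                  if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)]
    required = [p for p in positional if p.default is p.empty]
    accepts_varargs = any(p.kind == p.VAR_POSITIONAL for p in params)

    too_few = len(inputs) < len(required)
    too_many = not accepts_varargs and len(inputs) > len(positional)
    if too_few or too_many:
        raise TypeError(
            "%s takes %d positional input(s) but x contains %d"
            % (func.__name__, len(positional), len(inputs)))
    return inputs


def square_error_cost(ipt_a, ipt_b):
    # Body omitted; only the signature matters for this check.
    return None


# Strings stand in for the fluid Variables of the original script.
checked = check_py_func_inputs(square_error_cost, ["y_predict", "y"])
```

Calling `check_py_func_inputs(square_error_cost, "y_predict")` raises `TypeError`, because one input cannot fill two required parameters.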
It would be great if these failures produced proper error messages. Even with the code right in front of me, it is far too hard to debug.
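Until then, one way to make debugging easier is to test the forward and backward functions in plain NumPy before registering them as an OP. The squared error (a - b)^2 has gradient 2(a - b) with respect to a and -2(a - b) with respect to b. The sketch below compares that against a central finite difference. `numeric_grad` is a hypothetical helper, and nothing here touches Paddle.

```python
import numpy


def square_error_cost(ipt_a, ipt_b):
    a = numpy.array(ipt_a, dtype="float64")
    b = numpy.array(ipt_b, dtype="float64")
    return numpy.square(a - b)


def backward_square_error_cost(out, target, ab_mean, d_ab_mean):
    a = numpy.array(out, dtype="float64")
    b = numpy.array(target, dtype="float64")
    d = 2 * (a - b) * numpy.array(d_ab_mean, dtype="float64")
    return d, -d


def numeric_grad(func, a, b, upstream, wrt, eps=1e-6):
    """Central-difference gradient of sum(func(a, b) * upstream).

    `wrt` selects the input to differentiate: "a" or "b".
    """
    a = numpy.array(a, dtype="float64")
    b = numpy.array(b, dtype="float64")
    target = a if wrt == "a" else b
    grad = numpy.zeros_like(target)
    for idx in numpy.ndindex(*target.shape):
        original = target[idx]
        # Perturb one element at a time, in place, then restore it.
        target[idx] = original + eps
        plus = numpy.sum(func(a, b) * upstream)
        target[idx] = original - eps
        minus = numpy.sum(func(a, b) * upstream)
        target[idx] = original
        grad[idx] = (plus - minus) / (2 * eps)
    return grad


a = [[2.0], [5.0]]
b = [[3.0], [1.0]]
upstream = numpy.ones((2, 1))

grad_a, grad_b = backward_square_error_cost(
    a, b, square_error_cost(a, b), upstream)
ok_a = numpy.allclose(
    grad_a, numeric_grad(square_error_cost, a, b, upstream, "a"), atol=1e-4)
ok_b = numpy.allclose(
    grad_b, numeric_grad(square_error_cost, a, b, upstream, "b"), atol=1e-4)
print("gradient w.r.t. a matches:", ok_a)
print("gradient w.r.t. b matches:", ok_b)
```

A backward function that returns the wrong number of values fails here at the tuple unpacking, which is much easier to trace than a silent memory blow-up inside the executor.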