icnet model: loss becomes nan after 100 batches in both single-GPU and multi-GPU training
Closed
Created by: ccmeteorljh
paddle_version: build compiled on September 13; run output:
Iter[90]; train loss: 2.035; sub4_loss: 1.796; sub24_loss: 1.392; sub124_loss: 1.191
kpis train_cost 2.035351
Iter[100]; train loss: 2.213; sub4_loss: 1.871; sub24_loss: 1.518; sub124_loss: 1.307
kpis train_cost 2.213448
Saved checkpoint: ./chkpnt//100
Iter[110]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[120]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[130]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[140]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[150]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[160]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[170]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[180]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[190]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
kpis train_cost nan
Iter[200]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan
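
To narrow down where the divergence starts, a small log scanner can help. The following is a minimal sketch, not part of the original report: it assumes the log keeps the `Iter[N]; name: value; ...` format shown above, and the helper names `parse_iter_line` and `first_non_finite_iter` are hypothetical.

```python
import math
import re

# Matches lines such as:
# "Iter[110]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan"
# (format assumed from the log above)
ITER_RE = re.compile(r"^Iter\[(\d+)\];\s*(.*)$")


def parse_iter_line(line):
    """Return (iteration, {name: value}) for an Iter[...] line, else None."""
    match = ITER_RE.match(line.strip())
    if match is None:
        return None  # e.g. "kpis train_cost ..." or "Saved checkpoint: ..."
    iteration = int(match.group(1))
    losses = {}
    for field in match.group(2).split(";"):
        name, _, value = field.partition(":")
        if value.strip():
            # float() accepts "nan" and "inf" as well as ordinary numbers
            losses[name.strip()] = float(value)
    return iteration, losses


def first_non_finite_iter(lines):
    """Return the first logged iteration with a nan/inf loss, or None."""
    for line in lines:
        parsed = parse_iter_line(line)
        if parsed is None:
            continue
        iteration, losses = parsed
        if any(not math.isfinite(v) for v in losses.values()):
            return iteration
    return None


if __name__ == "__main__":
    sample = [
        "Iter[100]; train loss: 2.213; sub4_loss: 1.871; sub24_loss: 1.518; sub124_loss: 1.307",
        "kpis train_cost 2.213448",
        "Iter[110]; train loss: nan; sub4_loss: nan; sub24_loss: nan; sub124_loss: nan",
    ]
    print(first_non_finite_iter(sample))
```

Run against the log above, this would report iteration 110. Since losses are printed every 10 iterations and iteration 100 is still finite, the first nan falls somewhere in iterations 101 to 110, so the checkpoint saved at iteration 100 may be a useful starting point for reproducing the failure.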