paddle with a large batch size: CPU works normally, but GPU has a bug, performance is more than 480x worse than tf
Created by: anpark
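- Title: paddle with a large batch size: CPU works normally, but GPU has a bug, performance is more than 480x worse than tf
- Version and environment info: 1) PaddlePaddle version: 1.5.1 2) CPU: Intel 3) GPU: K40 4) System environment: Linux CentOS 6.3, CUDA 9.0, cuDNN 7.0
- Training info: 1) Single machine, single GPU 2) GPU memory: 11G 3) Operator info
- Reproduction info: if an error is reported, please provide the reproduction environment and steps
- Problem description: the CPU version runs normally, but the GPU version is much slower. An initial investigation suggests it is stalling on a sync operation.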
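
To make the slowdown figure easier to check, the per-batch time could be measured with something like the minimal sketch below. It is only an illustration: `step_fn` is a hypothetical placeholder for "run one batch and return once the results are available on the host" and is not a PaddlePaddle or tf API. The warmup and iteration counts are arbitrary assumptions.

```python
import time


def mean_step_seconds(step_fn, warmup=2, iters=10):
    """Average wall-clock seconds per call of step_fn, excluding warmup calls.

    step_fn is a hypothetical zero-argument callable that runs one batch and
    only returns after the results are ready (assumption, not a library API).
    """
    for _ in range(warmup):
        step_fn()  # warmup calls are not timed
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    return (time.perf_counter() - start) / iters


def slowdown(slow_seconds, fast_seconds):
    """How many times slower the first measurement is than the second."""
    return slow_seconds / fast_seconds
```

Running `mean_step_seconds` once per framework or device with the same batch size, and passing the two results to `slowdown`, gives a ratio that can be compared with the 480x reported here.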