Paddle: large batch size works on CPU but hits a bug on GPU, more than 480x slower than TF (#19584) · Issue · PaddlePaddle / Paddle


Created by: anpark


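If you did not find a similar issue, please provide the following details when creating the issue so it can be resolved quickly:

Title: Paddle: large batch size works on CPU but hits a bug on GPU, more than 480x slower than TF
Version and environment: 1) PaddlePaddle version: 1.5.1 2) CPU: Intel 3) GPU: K40 4) System environment: Linux CentOS 6.3, CUDA 9.0, cuDNN 7.0
Training information: 1) Single machine, single GPU 2) GPU memory: 11 GB 3) Operator information
Reproduction information: for an error, please give the reproduction environment and steps
Problem description: The CPU version runs normally, but the GPU version is much slower. Initial investigation suggests that a sync operation is stalling.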
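Since the issue includes no reproduction script, the following is only a minimal sketch of how the CPU and GPU runs could be timed to quantify the slowdown. `step_fn` is a hypothetical callable standing in for one training step (for example, a wrapper around an executor call that fetches the loss back to the host); it is not code from the reporter. Because GPU work is typically launched asynchronously, the sketch assumes `step_fn` returns only after its results have been copied to the host, so that any sync stall is included in the measurement.

```python
import statistics
import time


def time_steps(step_fn, batch, warmup=2, iters=5):
    """Return wall-clock times in seconds for `iters` calls of step_fn(batch).

    step_fn is a hypothetical stand-in for one training step. It is assumed
    to block until results are available on the host, so that any device
    synchronization cost is counted.
    """
    # Warm-up calls are not timed (first calls often include one-off setup).
    for _ in range(warmup):
        step_fn(batch)

    times = []
    for _ in range(iters):
        start = time.perf_counter()
        step_fn(batch)
        times.append(time.perf_counter() - start)
    return times


def slowdown(gpu_times, cpu_times):
    """Ratio of median GPU step time to median CPU step time."""
    return statistics.median(gpu_times) / statistics.median(cpu_times)
```

Running `time_steps` once with a CPU-backed step and once with a GPU-backed step at the same batch size, then passing both lists to `slowdown`, gives a single ratio that can be compared across batch sizes to see where the GPU run starts to stall.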

