Opened Jul 16, 2018 by saxon_zh (@saxon_zh, Guest)

Some profiling results on Transformer

Created by: kuke

See the profiling script in #1051.

Profiling on a single GPU:

Namespace(batch_size=3200, device='GPU', num_iters=10, opts=[], pool_size=200000, special_token=['<s>', '<e>', '<unk>'], src_vocab_fpath='data/vocab.bpe.32000', train_file_pattern='data/train.tok.clean.bpe.32000.en-de', trg_vocab_fpath='data/vocab.bpe.32000', use_token_batch=True)
Warming up ...
batch: 0, sum loss: 12268.457031, avg loss: 10.885942, ppl: 53420.105469
batch: 1, sum loss: 12178.395508, avg loss: 10.863868, ppl: 52253.792969
batch: 2, sum loss: 8879.164062, avg loss: 10.815060, ppl: 49764.625000

Profiling ...
batch: 0, sum loss: 12244.498047, avg loss: 10.864683, ppl: 52296.417969
batch: 1, sum loss: 12141.288086, avg loss: 10.830766, ppl: 50552.402344
batch: 2, sum loss: 8841.737305, avg loss: 10.769473, ppl: 47546.957031
batch: 3, sum loss: 9886.705078, avg loss: 10.781576, ppl: 48125.917969
batch: 4, sum loss: 14117.681641, avg loss: 10.818147, ppl: 49918.488281
batch: 5, sum loss: 12966.675781, avg loss: 10.760727, ppl: 47132.917969
batch: 6, sum loss: 13092.765625, avg loss: 10.749397, ppl: 46601.933594
batch: 7, sum loss: 11771.145508, avg loss: 10.681621, ppl: 43548.066406
batch: 8, sum loss: 13585.663086, avg loss: 10.680552, ppl: 43501.578125
batch: 9, sum loss: 14588.139648, avg loss: 10.632755, ppl: 41471.234375
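The per-batch lines above can be cross-checked with a little arithmetic. The sketch below assumes that `ppl` is the exponential of `avg loss`, and that `avg loss` is `sum loss` divided by the number of target tokens in the batch. Neither relation is stated in the log; both are inferred from the printed numbers. The helper name `implied_tokens` is hypothetical.

```python
import math

# (sum loss, avg loss, ppl) copied from the first three profiled batches above.
batches = [
    (12244.498047, 10.864683, 52296.417969),
    (12141.288086, 10.830766, 50552.402344),
    (8841.737305, 10.769473, 47546.957031),
]


def implied_tokens(sum_loss, avg_loss):
    """Token count implied by the log, ASSUMING avg loss = sum loss / tokens."""
    return sum_loss / avg_loss


for sum_loss, avg_loss, ppl in batches:
    # ASSUMPTION: perplexity is exp(average per-token loss).
    recomputed_ppl = math.exp(avg_loss)
    print(
        f"logged ppl={ppl:.1f}  exp(avg loss)={recomputed_ppl:.1f}  "
        f"implied tokens={implied_tokens(sum_loss, avg_loss):.1f}"
    )
```

If the assumptions hold, the recomputed perplexity agrees with the logged value up to rounding, and the implied token count varies from batch to batch, as expected with `use_token_batch=True`.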

------------------------->     Profiling Report     <-------------------------

Place: All
Time unit: ms
Sorted by total time in descending order in the same thread

Event                                       Calls       Total       Min.        Max.        Ave.        
thread0::mul_grad                           960         650.981     0.339968    1.57696     0.678106    
thread0::matmul_grad                        370         565.571     0.264192    53.889      1.52857     
thread0::softmax_with_cross_entropy         10          469.709     45.5281     48.127      46.9709     
thread0::dropout                            500         374.273     0.579584    1.0496      0.748546    
thread0::matmul                             370         372.095     0.136192    34.4699     1.00566     
thread0::mul                                960         314.561     0.162816    0.748544    0.327668    
thread0::layer_norm_grad                    300         263.247     0.756736    0.930816    0.877489    
thread0::elementwise_add_grad               740         153.038     0.050176    0.909312    0.206808    
thread0::sum                                320         127.341     0.17408     3.48467     0.397939    
thread0::softmax_grad                       180         127.295     0.503808    0.910336    0.707197    
thread0::layer_norm                         300         123.037     0.352256    0.436224    0.410122    
thread0::adam                               1810        88.9652     0.008192    2.20774     0.049152    
thread0::elementwise_add                    740         83.4826     0.054272    0.25088     0.112814    
thread0::softmax                            180         79.7266     0.326656    0.587776    0.442926    
thread0::transpose                          720         61.1564     0.070656    2.22003     0.0849395   
thread0::transpose_grad                     720         58.6763     0.069632    0.088064    0.0814948   
thread0::softmax_with_cross_entropy_grad    10          53.2357     4.92954     5.63917     5.32357     
thread0::label_smooth                       10          50.7187     4.66534     5.3975      5.07187     
thread0::dropout_grad                       500         47.9016     0.077824    0.126976    0.0958033   
thread0::relu_grad                          120         41.0081     0.288768    0.364544    0.341734    
thread0::relu                               120         29.1768     0.205824    0.259072    0.24314     
thread0::scale                              420         24.8033     0.008192    0.06656     0.0590555   
thread0::fill_zeros_like                    1100        21.2132     0.008192    0.041984    0.0192847   
thread0::one_hot                            10          19.67       1.79712     2.08896     1.967       
thread0::reshape                            1110        15.6437     0.002048    0.063488    0.0140934   
thread0::lookup_table_grad                  40          13.7851     0.164864    0.53248     0.344627    
thread0::lookup_table                       40          4.0704      0.064512    0.181248    0.10176     
thread0::reshape_grad                       1110        3.08541     0.002048    0.004096    0.00277965  
thread0::feed                               190         1.96202     0.007168    0.034816    0.0103264   
thread0::fetch                              20          0.835584    0.033792    0.053248    0.0417792   
thread0::reduce_sum                         20          0.196608    0.009216    0.011264    0.0098304   
thread0::elementwise_mul                    10          0.094208    0.009216    0.01024     0.0094208   
thread0::reduce_sum_grad                    10          0.093184    0.009216    0.01024     0.0093184   
thread0::elementwise_mul_grad               10          0.089088    0.008192    0.009216    0.0089088   
thread0::elementwise_div                    10          0.089088    0.008192    0.009216    0.0089088   

Elapsed time: total 6.818273 s, in executor 5.527260 s
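
To rank operators or compare forward and backward cost, the report can be parsed into records. The following is a minimal sketch rather than part of the profiling script in #1051: the names `parse_report`, `time_share` and `grad_total` are hypothetical, and `SAMPLE` holds only four rows copied from the table above, so the full report would need to be pasted in to get real shares.

```python
import re

# Matches one data row of the report: event name, call count, then four timings in ms.
ROW = re.compile(
    r"^thread\d+::(?P<name>\S+)\s+(?P<calls>\d+)\s+(?P<total>[\d.eE+-]+)\s+"
    r"(?P<min>[\d.eE+-]+)\s+(?P<max>[\d.eE+-]+)\s+(?P<ave>[\d.eE+-]+)\s*$"
)

# Four rows copied from the report above (not the whole table).
SAMPLE = """\
Event                                       Calls       Total       Min.        Max.        Ave.
thread0::mul_grad                           960         650.981     0.339968    1.57696     0.678106
thread0::matmul_grad                        370         565.571     0.264192    53.889      1.52857
thread0::matmul                             370         372.095     0.136192    34.4699     1.00566
thread0::mul                                960         314.561     0.162816    0.748544    0.327668
"""


def parse_report(text):
    """Return one dict per event row; header and non-matching lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue
        rows.append({
            "name": match["name"],
            "calls": int(match["calls"]),
            "total": float(match["total"]),
            "min": float(match["min"]),
            "max": float(match["max"]),
            "ave": float(match["ave"]),
        })
    return rows


def time_share(rows):
    """Fraction of the summed 'Total' column spent in each event."""
    grand_total = sum(row["total"] for row in rows)
    return {row["name"]: row["total"] / grand_total for row in rows}


def grad_total(rows):
    """Summed total time (ms) of events whose name ends in '_grad'."""
    return sum(row["total"] for row in rows if row["name"].endswith("_grad"))


if __name__ == "__main__":
    rows = parse_report(SAMPLE)
    for name, share in sorted(time_share(rows).items(), key=lambda kv: -kv[1]):
        print(f"{name:<16s} {share:6.1%}")
    print(f"backward (*_grad) total: {grad_total(rows):.3f} ms")
```

The `Ave.` column is `Total` divided by `Calls`, which is a quick sanity check on the parse. Shares computed this way are relative to the rows supplied, not to the executor time in the `Elapsed time` line.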
Reference: paddlepaddle/models#1052