BaiXuePrincess / Paddle (forked from PaddlePaddle / Paddle)
Commit ca9c8b41
Authored Mar 07, 2020 by Zhang Ting
Committed by GitHub, Mar 07, 2020
fix compute ratio of profile, test=develop (#22872)
Parent: dbb0b9b3
Showing 1 changed file with 24 additions and 12 deletions

paddle/fluid/platform/profiler_helper.h
@@ -355,25 +355,37 @@ void SetEvent(bool merge_thread, const Event &analyze_event,
   }
 }

-void ComputeOverhead(const std::multimap<std::string, EventItem> &sub_child_map,
+void UpdateGpuMemcpy(const EventItem &item, EventItem *memcpy_async,
+                     EventItem *memcpy_sync) {
+  if (item.name.find("GpuMemcpyAsync") != std::string::npos) {
+    memcpy_async->calls += item.calls;
+    memcpy_async->total_time += item.total_time;
+    memcpy_async->ratio += item.ratio;
+  } else if (item.name.find("GpuMemcpySync") != std::string::npos) {
+    memcpy_sync->calls += item.calls;
+    memcpy_sync->total_time += item.total_time;
+    memcpy_sync->ratio += item.ratio;
+  }
+}
+
+void ComputeOverhead(const std::vector<EventItem> &main_event_items,
+                     const std::multimap<std::string, EventItem> &sub_child_map,
                      OverHead *overhead) {
   EventItem memcpy_async = {
       "GpuMemcpyAsync", 0, 0., 0., 0., 0., 0., 0., 0.0f, EventRole::kOrdinary};
   EventItem memcpy_sync = {
       "GpuMemcpySync", 0, 0., 0., 0., 0., 0., 0., 0.0f, EventRole::kOrdinary};
+  // GpuMemcpy may be in main_event_items
+  for (auto &item : main_event_items) {
+    UpdateGpuMemcpy(item, &memcpy_async, &memcpy_sync);
+  }
   for (auto it = sub_child_map.begin(); it != sub_child_map.end(); it++) {
-    if (it->second.name.find("compute") != std::string::npos) {
+    if (it->second.name.find("compute") != std::string::npos &&
+        it->second.name.find("compute/") == std::string::npos) {
       overhead->compute_ratio += it->second.ratio;
     }
-    if (it->second.name.find("GpuMemcpyAsync") != std::string::npos) {
-      memcpy_async.calls += it->second.calls;
-      memcpy_async.total_time += it->second.total_time;
-      memcpy_async.ratio += it->second.ratio;
-    } else if (it->second.name.find("GpuMemcpySync") != std::string::npos) {
-      memcpy_sync.calls += it->second.calls;
-      memcpy_sync.total_time += it->second.total_time;
-      memcpy_sync.ratio += it->second.ratio;
-    }
+    UpdateGpuMemcpy(it->second, &memcpy_async, &memcpy_sync);
   }
   overhead->framework_ratio = 1.0f - overhead->compute_ratio;
   overhead->memcpy_item.calls = memcpy_async.calls + memcpy_sync.calls;
@@ -637,7 +649,7 @@ void AnalyzeEvent(
     if ((*analyze_events).size() == 1) {
       overhead->total_time = total;
       overhead->print = true;
-      ComputeOverhead(sub_child_map, overhead);
+      ComputeOverhead(main_event_items, sub_child_map, overhead);
     }
     // sort
     if (sorted_by != EventSortingKey::kDefault) {
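
The two behavioral changes in this commit can be tried out in isolation with the rough sketch below. It is not the Paddle code: `Item` is a hypothetical, trimmed-down stand-in for `EventItem`, `IsTopLevelCompute` and `ComputeRatio` are illustrative helper names that do not appear in the diff, and the reading that a name containing "compute/" denotes an event nested under a compute event is an assumption drawn from the new condition.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for Paddle's EventItem, keeping only the fields
// that the diff reads or updates.
struct Item {
  std::string name;
  int calls;
  double total_time;
  float ratio;
};

// Mirrors the new condition: the name must contain "compute" but must not
// contain "compute/" (assumed here to mark an event nested under compute).
bool IsTopLevelCompute(const std::string &name) {
  return name.find("compute") != std::string::npos &&
         name.find("compute/") == std::string::npos;
}

// Same shape as the helper introduced by the commit: add the item's
// counters to whichever memcpy accumulator its name matches, if any.
void UpdateGpuMemcpy(const Item &item, Item *memcpy_async, Item *memcpy_sync) {
  assert(memcpy_async != nullptr && memcpy_sync != nullptr);
  Item *target = nullptr;
  if (item.name.find("GpuMemcpyAsync") != std::string::npos) {
    target = memcpy_async;
  } else if (item.name.find("GpuMemcpySync") != std::string::npos) {
    target = memcpy_sync;
  }
  if (target == nullptr) return;
  target->calls += item.calls;
  target->total_time += item.total_time;
  target->ratio += item.ratio;
}

// Illustrative helper: sums the ratios of items accepted by the filter.
float ComputeRatio(const std::vector<Item> &items) {
  float sum = 0.0f;
  for (const auto &item : items) {
    if (IsTopLevelCompute(item.name)) sum += item.ratio;
  }
  return sum;
}
```

With made-up names such as "op/compute" and "op/compute/GpuMemcpyAsync", only the first contributes to the compute ratio, while the second is still picked up by `UpdateGpuMemcpy`. Under the old single `find("compute")` test both would have been added to the compute ratio.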