机器未来 / Paddle (forked from PaddlePaddle / Paddle)
Commit d3ed070e
Authored Oct 15, 2018 by sneaxiy

test=develop

Parent: fb6201e9
Changes: 2 changed files with 5 additions and 40 deletions (+5 −40)

paddle/fluid/framework/parallel_executor.cc  +5 −27
paddle/fluid/framework/parallel_executor.h   +0 −13
paddle/fluid/framework/parallel_executor.cc

@@ -64,8 +64,6 @@ ParallelExecutor::ParallelExecutor(
     const ExecutionStrategy &exec_strategy, const BuildStrategy &build_strategy,
     size_t num_trainers, size_t trainer_id)
     : member_(new ParallelExecutorPrivate(places)) {
-  is_alive_.test_and_set();
-
   member_->global_scope_ = scope;
   member_->use_cuda_ = exec_strategy.use_cuda_;
   member_->use_all_reduce_ =
@@ -248,15 +246,6 @@ void ParallelExecutor::BCastParamsToDevices(
 void ParallelExecutor::Run(const std::vector<std::string> &fetch_tensors,
                            const std::string &fetched_var_name) {
-  // If ParallelExecutor has been destructed
-  // just return
-  if (!is_alive_.test_and_set()) return;
-
-  // If ParallelExecutor is running
-  if (is_running_.test_and_set()) {
-    PADDLE_THROW("The previous ParallelExecutor::Run() has not stopped");
-  }
-
   platform::RecordBlock b(0);
 #ifdef PADDLE_WITH_CUDA
   if (!gcs_.empty()) {
@@ -270,17 +259,9 @@ void ParallelExecutor::Run(const std::vector<std::string> &fetch_tensors,
     }
   }
 #endif
-  try {
-    auto fetch_data = member_->executor_->Run(fetch_tensors);
-    *member_->global_scope_->Var(fetched_var_name)
-        ->GetMutable<FeedFetchList>() = fetch_data;
-    is_running_.clear();
-  } catch (...) {
-    is_running_.clear();
-    if (is_alive_.test_and_set()) {
-      std::rethrow_exception(std::current_exception());
-    }
-  }
+  auto fetch_data = member_->executor_->Run(fetch_tensors);
+  *member_->global_scope_->Var(fetched_var_name)->GetMutable<FeedFetchList>() =
+      fetch_data;
 }
 
 void ParallelExecutor::FeedTensorsIntoLocalScopes(
@@ -318,7 +299,6 @@ void ParallelExecutor::FeedAndSplitTensorIntoLocalScopes(
 }
 
 ParallelExecutor::~ParallelExecutor() {
-  is_alive_.clear();
   if (member_->own_local_scope_) {
     for (size_t i = 1; i < member_->local_scopes_.size(); ++i) {
       Scope *local_scope = member_->local_scopes_[i];

@@ -328,10 +308,8 @@ ParallelExecutor::~ParallelExecutor() {
     }
   }
-  // wait unitl all threads have been stopped
-  while (is_running_.test_and_set()) {
-  }
+  // member_ must be destructed before gcs_ since the destructor of
+  // ReferenceCountOpHandle use raw pointers of gcs_ inside.
   member_.reset();
 }
paddle/fluid/framework/parallel_executor.h

@@ -77,19 +77,6 @@ class ParallelExecutor {
   std::unique_ptr<ParallelExecutorPrivate> member_;
 
-  // FIXME(zjl): HOT-FIX
-  // A flag to indicate whether ParallelExecutor is destructed.
-  // In Python side, when users interrupt the process manually, such as
-  // keyboard interrupt, ParallelExecutor may be destructed before Run() ends.
-  // Thus, disturbing exception messages would occur when interrupted.
-  // If is_alive_ is false, we would discard the last exception thrown by Run().
-  // Since std::atomic_flag is always lock-free and faster than
-  // std::atomic<bool>, we choose std::atomic_flag to be the flag here.
-  std::atomic_flag is_alive_ = ATOMIC_FLAG_INIT;
-
-  // A flag to indicate whether ParallelExecutor is running.
-  std::atomic_flag is_running_ = ATOMIC_FLAG_INIT;
-
 #ifdef PADDLE_WITH_CUDA
   // ref_cnts_ is only initialized when ParallelExecutor constructs, and then
   // keeps unchanged