Unverified commit 70bc4889, authored by guofei, committed by GitHub

Fix the recurrent op error under multithreading in the eval process (#24357)

CreateStepScopes in the recurrent op also clears scopes, which can cause a segmentation fault under multithreading. This PR adds a lock around the call, which may slow down computation. A different fix will follow in the next PR.
Parent 01e45a06
......@@ -197,7 +197,6 @@ void RecurrentOp::RunImpl(const framework::Scope &scope,
auto &dev_ctx = *pool.Get(place);
VLOG(3) << "Static RNN input sequence length = " << seq_len;
- StepScopes scopes = CreateStepScopes(dev_ctx, scope, seq_len);
auto reverse = Attr<bool>(kReverse);
framework::Executor executor(place);
......@@ -208,6 +207,13 @@ void RecurrentOp::RunImpl(const framework::Scope &scope,
*program, block->ID(), Attr<std::vector<std::string>>(
kSkipEagerDeletionVars) /*skip_ref_cnt_vars*/);
+ static std::mutex mutex;
+ std::lock_guard<std::mutex> lock(mutex);
+ StepScopes scopes = CreateStepScopes(dev_ctx, scope, seq_len);
+ // TODO(gfwm2013) CreateStepScopes can cause a segmentation fault under
+ // multithreading in the eval process, so we take a mutex before calling
+ // CreateStepScopes to keep the computation correct. This problem will be
+ // fixed in the next pull request.
for (size_t i = 0; i < seq_len; ++i) {
size_t seq_offset = reverse ? seq_len - i - 1 : i;
VLOG(3) << "Recurrent operate at the time step " << seq_offset;
......
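The locking pattern in this change can be illustrated outside of Paddle with a minimal sketch. Everything here is hypothetical and only mirrors the shape of the fix: `g_shared_steps` stands in for state that is cleared and rebuilt on each call, and `RebuildAndSum` and `RunConcurrently` are illustrative names, not Paddle functions. The sketch shows a function-local `static std::mutex` guarded by `std::lock_guard`, so that concurrent callers take turns instead of clearing the shared state underneath each other.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state that every call clears and rebuilds.
// It is NOT Paddle's scope structure, only a stand-in for illustration.
static std::vector<int> g_shared_steps;

// Clears and refills the shared vector, then returns the sum of its elements.
// The function-local static mutex is a single object shared by all callers.
long RebuildAndSum(int seq_len) {
  static std::mutex mutex;
  // The guard is released when this function returns, so the whole body
  // below runs with the mutex held.
  std::lock_guard<std::mutex> lock(mutex);

  g_shared_steps.clear();
  for (int i = 0; i < seq_len; ++i) {
    g_shared_steps.push_back(i);
  }
  long sum = 0;
  for (int v : g_shared_steps) {
    sum += v;
  }
  return sum;
}

// Calls RebuildAndSum from several threads and reports whether every
// thread observed the expected sum 0 + 1 + ... + (seq_len - 1).
bool RunConcurrently(int num_threads, int seq_len) {
  std::vector<long> results(num_threads, -1);
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back(
        [&results, t, seq_len] { results[t] = RebuildAndSum(seq_len); });
  }
  for (auto &w : workers) {
    w.join();
  }
  const long expected = static_cast<long>(seq_len) * (seq_len - 1) / 2;
  for (long r : results) {
    if (r != expected) {
      return false;
    }
  }
  return true;
}
```

Calling `RunConcurrently(8, 1000)` exercises the guarded function from eight threads. Because a `std::lock_guard` stays held until the end of its enclosing scope, everything after it in that scope is serialized as well, which is consistent with the slowdown the commit message warns about. Wrapping only the call that needs protection in its own `{ ... }` block would shorten the critical section, though whether that is safe for the recurrent op depends on details not shown in this diff.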