diff --git a/doc/fluid/design/concepts/parallel_executor.md b/doc/fluid/design/concepts/parallel_executor.md index 9aed3b059a1595ba3971d7d5acfc0d16a731584b..4f88e27bed722e9f2f535e368926fe49b4e72e56 100644 --- a/doc/fluid/design/concepts/parallel_executor.md +++ b/doc/fluid/design/concepts/parallel_executor.md @@ -84,7 +84,7 @@ Running an operator can be asynchronized. There is a thread pool to execute an ` ## Synchronize GPU Kernels -The GPU is a non-blocking device. The different streams need be synchronized when switing streams. In current implementation, the synchronization based on the following algorithm: +The GPU is a non-blocking device. The different streams need be synchronized when switching streams. In current implementation, the synchronization based on the following algorithm: 1. `OpHandle` will record `DeviceContext` that it is used. 2. In `OpHandle::Run`, if the `DeviceContext` of current operator is different from `DeviceContext` of any input variable, just wait the generate operator of this input variable.