diff --git a/doc/design/parallel_executor.md b/doc/design/parallel_executor.md
index 076c55d281f3d747d4c9e7dd2795af50a93d9ced..9aed3b059a1595ba3971d7d5acfc0d16a731584b 100644
--- a/doc/design/parallel_executor.md
+++ b/doc/design/parallel_executor.md
@@ -8,7 +8,7 @@ The executor is a very naive interpreter. It runs operators one by one. We can u
 
 We want a `ProgramDesc` can be run on different nodes. It is better not to contain device information in `ProgramDesc`. However, we can write a high-performance interpreter, which can hold an alternative intermediate representation of `ProgramDesc`, to take full usage of Multi-GPUs.
 
-ParallelExecutor is an interpreter of `ProgramDesc` which will [out-of-order execute](Out-of-order execution) `Program` in data parallelism mode and maximise the utility of Multi-GPUs.
+ParallelExecutor is an interpreter of `ProgramDesc` which will [out-of-order execute](https://en.wikipedia.org/wiki/Out-of-order_execution) `Program` in data parallelism mode and maximise the utility of Multi-GPUs.
 
 ## Overview of MultiGPUs logic
 