diff --git a/paddle/fluid/operators/similarity_focus_op.cc b/paddle/fluid/operators/similarity_focus_op.cc
index 768b6903b741c58d565f5b98da3349996f9ae76a..9612f82b6d45dc4e08bfe288ddd1c7790875ee4d 100644
--- a/paddle/fluid/operators/similarity_focus_op.cc
+++ b/paddle/fluid/operators/similarity_focus_op.cc
@@ -42,8 +42,9 @@ Generate a similarity focus mask with the same shape of input using the followin
 2. For each index, find the largest numbers in the tensor T, so that the same 
    row and same column has at most one number(what it means is that if the 
    largest number has been found in the i-th row and the j-th column, then 
-   the numbers in the i-th or j-th column will be skipped. Obviously there 
-   will be min(B, C) numbers), and mark the corresponding position of the 
+   the numbers in the i-th row or j-th column will be skipped, and the
+   next largest number will be selected from the remaining numbers. In total
+   there will be min(B, C) numbers), and mark the corresponding position of the
    3-D similarity focus mask as 1, otherwise as 0. Do elementwise-or for 
    each index.
 3. Broadcast the 3-D similarity focus mask to the same shape of input X.
diff --git a/python/paddle/fluid/layers/nn.py b/python/paddle/fluid/layers/nn.py
index be0e75161bbff10dd351c24f7a56e022f20ee036..e3737bf6fe0fb9ba26b133006cb19bf8927b55ef 100644
--- a/python/paddle/fluid/layers/nn.py
+++ b/python/paddle/fluid/layers/nn.py
@@ -7567,8 +7567,9 @@ def similarity_focus(input, axis, indexes, name=None):
     2. For each index, find the largest numbers in the tensor T, so that the same 
        row and same column has at most one number(what it means is that if the 
        largest number has been found in the i-th row and the j-th column, then 
-       the numbers in the i-th or j-th column will be skipped. Obviously there 
-       will be min(B, C) numbers), and mark the corresponding position of the 
+       the numbers in the i-th row or j-th column will be skipped, and the
+       next largest number will be selected from the remaining numbers. In total
+       there will be min(B, C) numbers), and mark the corresponding position of the
        3-D similarity focus mask as 1, otherwise as 0. Do elementwise-or for 
        each index.
     3. Broadcast the 3-D similarity focus mask to the same shape of input X.
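Reviewer note: the greedy selection described in step 2 can be sketched in NumPy for a single 2-D slice of shape (B, C). This is only a minimal illustration of the docstring wording, not the operator's actual kernel. The helper name `greedy_focus_mask` is hypothetical, and the sketch assumes all values are finite and ignores the batch dimension, the elementwise-or across indexes, and the final broadcast.

```python
import numpy as np


def greedy_focus_mask(t):
    """Hypothetical helper: greedy row/column-exclusive selection on one 2-D slice."""
    t = np.asarray(t, dtype=float)
    rows, cols = t.shape
    mask = np.zeros((rows, cols), dtype=int)
    row_free = np.ones(rows, dtype=bool)  # rows not yet used by a selection
    col_free = np.ones(cols, dtype=bool)  # columns not yet used by a selection

    for _ in range(min(rows, cols)):
        # Hide every entry whose row or column has already been used.
        candidates = np.where(np.outer(row_free, col_free), t, -np.inf)
        # Pick the largest remaining number and mark its position.
        i, j = np.unravel_index(np.argmax(candidates), candidates.shape)
        mask[i, j] = 1
        row_free[i] = False
        col_free[j] = False
    return mask


example = greedy_focus_mask([[0.8, 0.1, 0.5],
                             [0.4, 0.9, 0.3]])
print(example)
```

For the 2 x 3 example, 0.9 is selected first, which excludes row 1 and column 1. The next largest remaining number is 0.8, so min(2, 3) = 2 positions are marked in total.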