From b1e1f21f5b8f8a25e53b9f7c59bca9a1d749547e Mon Sep 17 00:00:00 2001
From: "Paul E. McKenney"
Date: Tue, 1 May 2018 15:51:32 -0700
Subject: [PATCH] doc: Update data-structure documentation for ->gp_seq

Signed-off-by: Paul E. McKenney
---
 .../Data-Structures/Data-Structures.html | 118 ++++++++++--------
 1 file changed, 63 insertions(+), 55 deletions(-)

diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
index 6c06e10bd04b..f5120a00f511 100644
--- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html
+++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
@@ -380,31 +380,26 @@ and therefore need no protection.
 as follows:
-  1   unsigned long gpnum;
-  2   unsigned long completed;
+  1   unsigned long gp_seq;
 

RCU grace periods are numbered, and -the ->gpnum field contains the number of the grace -period that started most recently. -The ->completed field contains the number of the -grace period that completed most recently. -If the two fields are equal, the RCU grace period that most recently -started has already completed, and therefore the corresponding -flavor of RCU is idle. -If ->gpnum is one greater than ->completed, -then ->gpnum gives the number of the current RCU -grace period, which has not yet completed. -Any other combination of values indicates that something is broken. -These two fields are protected by the root rcu_node's +the ->gp_seq field contains the current grace-period +sequence number. +The bottom two bits are the state of the current grace period, +which can be zero for not yet started or one for in progress. +In other words, if the bottom two bits of ->gp_seq are +zero, the corresponding flavor of RCU is idle. +Any other value in the bottom two bits indicates that something is broken. +This field is protected by the root rcu_node structure's ->lock field. -

There are ->gpnum and ->completed fields +

There are ->gp_seq fields in the rcu_node and rcu_data structures as well. The fields in the rcu_state structure represent the -most current values, and those of the other structures are compared -in order to detect the start of a new grace period in a distributed +most current value, and those of the other structures are compared +in order to detect the beginnings and ends of grace periods in a distributed fashion. The values flow from rcu_state to rcu_node (down the tree from the root to the leaves) to rcu_data. @@ -512,27 +507,47 @@ than to be heisenbugged out of existence. as follows:

-  1   unsigned long gpnum;
-  2   unsigned long completed;
+  1   unsigned long gp_seq;
+  2   unsigned long gp_seq_needed;
 
-

These fields are the counterparts of the fields of the same name in -the rcu_state structure. -They each may lag up to one behind their rcu_state -counterparts. -If a given rcu_node structure's ->gpnum and -->complete fields are equal, then this rcu_node +

The rcu_node structures' ->gp_seq fields are +the counterparts of the field of the same name in the rcu_state +structure. +They each may lag up to one step behind their rcu_state +counterpart. +If the bottom two bits of a given rcu_node structure's +->gp_seq field are zero, then this rcu_node structure believes that RCU is idle. -Otherwise, as with the rcu_state structure, -the ->gpnum field will be one greater than the -->complete fields, with ->gpnum -indicating which grace period this rcu_node believes -is still being waited for. +

The ->gp_seq field of each rcu_node +structure is updated at the beginning and the end +of each grace period. + +

The ->gp_seq_needed fields record the +furthest-in-the-future grace period request seen by the corresponding +rcu_node structure. The request is considered fulfilled when +the value of the ->gp_seq field equals or exceeds that of +the ->gp_seq_needed field. -

The >gpnum field of each rcu_node -structure is updated at the beginning -of each grace period, and the ->completed fields are -updated at the end of each grace period. + + + + + + + +
 
Quick Quiz:
+ Suppose that this rcu_node structure doesn't see + a request for a very long time. + Won't wrapping of the ->gp_seq field cause + problems? +
Answer:
+ No, because if the ->gp_seq_needed field lags behind the + ->gp_seq field, the ->gp_seq_needed field + will be updated at the end of the grace period. + Modulo-arithmetic comparisons therefore will always get the + correct answer, even with wrapping. +
 

Quiescent-State Tracking
@@ -626,9 +641,8 @@ normal and expedited grace periods, respectively.

So the locking is absolutely required in - order to coordinate - clearing of the bits with the grace-period numbers in - ->gpnum and ->completed. + order to coordinate clearing of the bits with updating of the + grace-period sequence number in ->gp_seq.   @@ -1038,15 +1052,15 @@ out any rcu_data structure for which this flag is not set. as follows:

-  1   unsigned long completed;
-  2   unsigned long gpnum;
+  1   unsigned long gp_seq;
+  2   unsigned long gp_seq_needed;
   3   bool cpu_no_qs;
   4   bool core_needs_qs;
   5   bool gpwrap;
   6   unsigned long rcu_qs_ctr_snap;
 
-

The completed and gpnum +

The ->gp_seq and ->gp_seq_needed fields are the counterparts of the fields of the same name in the rcu_state and rcu_node structures. They may each lag up to one behind their rcu_node @@ -1054,15 +1068,9 @@ counterparts, but in CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_FULL kernels can lag arbitrarily far behind for CPUs in dyntick-idle mode (but these counters will catch up upon exit from dyntick-idle mode). -If a given rcu_data structure's ->gpnum and -->complete fields are equal, then this rcu_data +If the lower two bits of a given rcu_data structure's +->gp_seq are zero, then this rcu_data structure believes that RCU is idle. -Otherwise, as with the rcu_state and rcu_node -structure, -the ->gpnum field will be one greater than the -->complete fields, with ->gpnum -indicating which grace period this rcu_data believes -is still being waited for. @@ -1070,13 +1078,13 @@ is still being waited for.
 
All this replication of the grace period numbers can only cause massive confusion. - Why not just keep a global pair of counters and be done with it??? + Why not just keep a global sequence number and be done with it???
Answer:
- Because if there was only a single global pair of grace-period + Because if there was only a single global sequence numbers, there would need to be a single global lock to allow - safely accessing and updating them. + safely accessing and updating it. And if we are not going to have a single global lock, we need to carefully manage the numbers on a per-node basis. Recall from the answer to a previous Quick Quiz that the consequences @@ -1091,8 +1099,8 @@ CPU has not yet passed through a quiescent state, while the ->core_needs_qs flag indicates that the RCU core needs a quiescent state from the corresponding CPU. The ->gpwrap field indicates that the corresponding -CPU has remained idle for so long that the completed -and gpnum counters are in danger of overflow, which +CPU has remained idle for so long that the +gp_seq counter is in danger of overflow, which will cause the CPU to disregard the values of its counters on its next exit from idle. Finally, the rcu_qs_ctr_snap field is used to detect @@ -1130,10 +1138,10 @@ The CPU advances the callbacks in its rcu_data structure whenever it notices that another RCU grace period has completed. The CPU detects the completion of an RCU grace period by noticing that the value of its rcu_data structure's -->completed field differs from that of its leaf +->gp_seq field differs from that of its leaf rcu_node structure. Recall that each rcu_node structure's -->completed field is updated at the end of each +->gp_seq field is updated at the beginnings and ends of each grace period.
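
For readers of this patch, the ->gp_seq conventions described in the added text can be sketched in user-space C. This is only an illustration, not the kernel's implementation: every helper and macro name below is hypothetical, and the two-bit state width is taken from the documentation text above. The sketch shows the state held in the bottom two bits, the "differs from the leaf rcu_node" test used to notice grace-period boundaries, and a modulo-arithmetic comparison of ->gp_seq against ->gp_seq_needed that tolerates counter wrap.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical names for this sketch; the bottom two bits hold the state. */
#define GP_SEQ_STATE_BITS 2
#define GP_SEQ_STATE_MASK ((1UL << GP_SEQ_STATE_BITS) - 1)

/* State of the grace period: 0 = not yet started, 1 = in progress. */
static unsigned long gp_seq_state(unsigned long gp_seq)
{
	return gp_seq & GP_SEQ_STATE_MASK;
}

/* Counter portion of the sequence number, above the state bits. */
static unsigned long gp_seq_ctr(unsigned long gp_seq)
{
	return gp_seq >> GP_SEQ_STATE_BITS;
}

/* A structure believes RCU is idle when the bottom two bits are zero. */
static bool gp_seq_idle(unsigned long gp_seq)
{
	return gp_seq_state(gp_seq) == 0;
}

/* A grace period began or ended if the local copy differs from the leaf's. */
static bool gp_seq_changed(unsigned long rdp_gp_seq, unsigned long rnp_gp_seq)
{
	return rdp_gp_seq != rnp_gp_seq;
}

/*
 * Wrap-tolerant "a >= b": unsigned subtraction is modulo 2^N, so the
 * result is true when a is at most half the number space ahead of b.
 */
static bool gp_seq_ge(unsigned long a, unsigned long b)
{
	return a - b <= ULONG_MAX / 2;
}

/* A request is fulfilled once gp_seq equals or exceeds gp_seq_needed. */
static bool gp_request_fulfilled(unsigned long gp_seq,
				 unsigned long gp_seq_needed)
{
	return gp_seq_ge(gp_seq, gp_seq_needed);
}

/* Small self-check exercising the helpers, including a wrapped counter. */
static void gp_seq_selftest(void)
{
	unsigned long in_progress = (5UL << GP_SEQ_STATE_BITS) | 1;
	unsigned long before_wrap = ULONG_MAX - 3;	/* state bits are zero */
	unsigned long after_wrap = 4;			/* counter has wrapped */

	assert(gp_seq_state(in_progress) == 1);
	assert(gp_seq_ctr(in_progress) == 5);
	assert(!gp_seq_idle(in_progress));
	assert(gp_seq_idle(before_wrap));
	assert(gp_seq_changed(in_progress, in_progress + 3));
	assert(gp_request_fulfilled(after_wrap, before_wrap));
	assert(!gp_request_fulfilled(before_wrap, after_wrap));
}
```

The comparison only gives the right answer while the two values stay within half the counter space of each other, which is consistent with the Quick Quiz answer's point that ->gp_seq_needed is brought forward at the end of a grace period when it lags.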

-- GitLab