1. 11 8月, 2008 4 次提交
    • P
      lockdep: map_acquire · 4f3e7524
      Peter Zijlstra 提交于
      Most the free-standing lock_acquire() usages look remarkably similar, sweep
      them into a new helper.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4f3e7524
    • D
      lockdep: shrink held_lock structure · f82b217e
      Dave Jones 提交于
      struct held_lock {
              u64                        prev_chain_key;       /*     0     8 */
              struct lock_class *        class;                /*     8     8 */
              long unsigned int          acquire_ip;           /*    16     8 */
              struct lockdep_map *       instance;             /*    24     8 */
              int                        irq_context;          /*    32     4 */
              int                        trylock;              /*    36     4 */
              int                        read;                 /*    40     4 */
              int                        check;                /*    44     4 */
              int                        hardirqs_off;         /*    48     4 */
      
              /* size: 56, cachelines: 1 */
              /* padding: 4 */
              /* last cacheline: 56 bytes */
      };
      
      struct held_lock {
              u64                        prev_chain_key;       /*     0     8 */
              long unsigned int          acquire_ip;           /*     8     8 */
              struct lockdep_map *       instance;             /*    16     8 */
              unsigned int               class_idx:11;         /*    24:21  4 */
              unsigned int               irq_context:2;        /*    24:19  4 */
              unsigned int               trylock:1;            /*    24:18  4 */
              unsigned int               read:2;               /*    24:16  4 */
              unsigned int               check:2;              /*    24:14  4 */
              unsigned int               hardirqs_off:1;       /*    24:13  4 */
      
              /* size: 32, cachelines: 1 */
              /* padding: 4 */
              /* bit_padding: 13 bits */
              /* last cacheline: 32 bytes */
      };
      
      [mingo@elte.hu: shrunk hlock->class too]
      [peterz@infradead.org: fixup bit sizes]
      Signed-off-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      f82b217e
    • P
      lockdep: re-annotate scheduler runqueues · 1b12bbc7
      Peter Zijlstra 提交于
      Instead of using a per-rq lock class, use the regular nesting operations.
      
      However, take extra care with double_lock_balance() as it can release the
      already held rq->lock (and therefore change its nesting class).
      
      So what can happen is:
      
       spin_lock(rq->lock);	// this rq subclass 0
      
       double_lock_balance(rq, other_rq);
         // release rq
         // acquire other_rq->lock subclass 0
         // acquire rq->lock subclass 1
      
       spin_unlock(other_rq->lock);
      
      leaving you with rq->lock in subclass 1
      
      So a subsequent double_lock_balance() call can try to nest a subclass 1
      lock while already holding a subclass 1 lock.
      
      Fix this by introducing double_unlock_balance() which releases the other
      rq's lock, but also re-sets the subclass for this rq's lock to 0.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1b12bbc7
    • P
      lockdep: lock_set_subclass - reset a held lock's subclass · 64aa348e
      Peter Zijlstra 提交于
      this can be used to reset a held lock's subclass, for arbitrary-depth
      iterated data structures such as trees or lists which have per-node
      locks.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      64aa348e
  2. 01 8月, 2008 3 次提交
    • P
      lockdep: change scheduler annotation · 5e710e37
      Peter Zijlstra 提交于
      While thinking about David's graph walk lockdep patch it _finally_
      dawned on me that there is no reason we have a lock class per cpu ...
      
      Sorry for being dense :-/
      
      The below changes the annotation from a lock class per cpu, to a single
      nested lock, as the scheduler never holds more that 2 rq locks at a time
      anyway.
      
      If there was code requiring holding all rq locks this would not work and
      the original annotation would be the only option, but that not being the
      case, this is a much lighter one.
      
      Compiles and boots on a 2-way x86_64.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5e710e37
    • D
      debug_locks: set oops_in_progress if we will log messages. · e0fdace1
      David Miller 提交于
      Otherwise lock debugging messages on runqueue locks can deadlock the
      system due to the wakeups performed by printk().
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e0fdace1
    • D
      lockdep: fix combinatorial explosion in lock subgraph traversal · 419ca3f1
      David Miller 提交于
      When we traverse the graph, either forwards or backwards, we
      are interested in whether a certain property exists somewhere
      in a node reachable in the graph.
      
      Therefore it is never necessary to traverse through a node more
      than once to get a correct answer to the given query.
      
      Take advantage of this property using a global ID counter so that we
      need not clear all the markers in all the lock_class entries before
      doing a traversal.  A new ID is choosen when we start to traverse, and
      we continue through a lock_class only if it's ID hasn't been marked
      with the new value yet.
      
      This short-circuiting is essential especially for high CPU count
      systems.  The scheduler has a runqueue per cpu, and needs to take
      two runqueue locks at a time, which leads to long chains of
      backwards and forwards subgraphs from these runqueue lock nodes.
      Without the short-circuit implemented here, a graph traversal on
      a runqueue lock can take up to (1 << (N - 1)) checks on a system
      with N cpus.
      
      For anything more than 16 cpus or so, lockdep will eventually bring
      the machine to a complete standstill.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      419ca3f1
  3. 29 7月, 2008 33 次提交