1. 30 Oct, 2010 1 commit
  2. 23 Oct, 2010 5 commits
    • kdb,debug_core: adjust master cpu switch logic against new debug_core locking · 495363d3
      Jason Wessel authored
      The kdb shell needs to enforce switching back to the original CPU that
      took the exception before restoring normal kernel execution.  Resuming
      from a different CPU than the one that took the original exception will
      cause problems with spin locks that are freed from a different processor
      than the one that had taken the lock.
      
      The special logic in dbg_cpu_switch() can go away entirely
      because the state of what cpus want to be masters or slaves will
      remain unchanged between entry and exit of the debug_core exception
      context.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • debug_core: refactor locking for master/slave cpus · dfee3a7b
      Jason Wessel authored
      For quite some time there have been problems with memory barriers and
      various races with NMI on multi processor systems using the kernel
      debugger.  The algorithm for entering the kernel debug core and
      resuming kernel execution was racy and had several known edge case
      problems with attempting to debug something on a heavily loaded system
      using breakpoints that are hit repeatedly and quickly.
      
      The prior "locking" design entry worked as follows:
      
        * The atomic counter kgdb_active was used with atomic exchange in
          order to elect a master cpu out of all the cpus that may have
          taken a debug exception.
        * The master cpu increments all elements of passive_cpu_wait[].
        * The master cpu issues the round up cpus message.
        * Each "slave cpu" that enters the debug core increments its own
          element in cpu_in_kgdb[].
        * Each "slave cpu" spins on passive_cpu_wait[] until it becomes 0.
        * The master cpu debugs the system.
      
      The new scheme removes the two arrays of atomic counters and replaces
      them with 2 single counters.  One counter is used to count the number
      of cpus waiting to become a master cpu (because one or more hit an
      exception). The second counter is used to indicate how many cpus have
      entered as slave cpus.
      
      The new entry logic works as follows:
      
        * One or more cpus enters via kgdb_handle_exception() and increments
          the masters_in_kgdb. Each cpu attempts to get the spin lock called
          dbg_master_lock.
        * The master cpu sets kgdb_active to the current cpu.
        * The master cpu takes the spinlock dbg_slave_lock.
        * The master cpu asks to round up all the other cpus.
        * Each slave cpu that is not already in kgdb_handle_exception()
          will enter and increment slaves_in_kgdb.  Each slave will now spin
          try_locking on dbg_slave_lock.
        * The master cpu waits for the sum of masters_in_kgdb and slaves_in_kgdb
          to be equal to the number of online cpus.
        * The master cpu debugs the system.
      
      In the new design the kgdb_active can only be changed while holding
      dbg_master_lock.  Stress testing has not turned up any further
      entry/exit races that existed in the prior locking design.  The prior
      locking design suffered from atomic variables not being truly atomic
      (in the capacity as used by kgdb) along with memory barrier races.
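
The counter arithmetic of the new entry logic can be sketched in user space. This is only a minimal model, assuming C11 atomics in place of the kernel primitives: the struct and the dbg_model_* helpers are hypothetical names invented for illustration, an atomic_flag stands in for the dbg_master_lock spin lock, and a losing cpu makes a single lock attempt instead of spinning.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model of the entry counters; not kernel code. */
struct dbg_model {
    atomic_int masters_in_kgdb;   /* cpus that entered via an exception */
    atomic_int slaves_in_kgdb;    /* cpus that entered via the round up */
    atomic_int kgdb_active;       /* cpu number of the master, or -1 */
    atomic_flag dbg_master_lock;  /* stands in for the master spin lock */
};

void dbg_model_init(struct dbg_model *m)
{
    atomic_init(&m->masters_in_kgdb, 0);
    atomic_init(&m->slaves_in_kgdb, 0);
    atomic_init(&m->kgdb_active, -1);
    atomic_flag_clear(&m->dbg_master_lock);
}

/*
 * A cpu that took an exception counts itself, then makes one attempt
 * at the master lock. Returns true if it became the master. In the
 * design described above, a cpu that loses would keep retrying.
 */
bool dbg_model_try_master(struct dbg_model *m, int cpu)
{
    atomic_fetch_add(&m->masters_in_kgdb, 1);
    if (atomic_flag_test_and_set(&m->dbg_master_lock))
        return false;
    /* kgdb_active is only written while the master lock is held. */
    atomic_store(&m->kgdb_active, cpu);
    return true;
}

/* A cpu that was rounded up by the master counts itself as a slave. */
void dbg_model_enter_slave(struct dbg_model *m)
{
    atomic_fetch_add(&m->slaves_in_kgdb, 1);
}

/* The master's wait condition: every online cpu is accounted for. */
bool dbg_model_all_rounded_up(struct dbg_model *m, int online_cpus)
{
    return atomic_load(&m->masters_in_kgdb) +
           atomic_load(&m->slaves_in_kgdb) == online_cpus;
}
```

With four online cpus, two of which hit an exception, the condition holds once the other two have entered as slaves (2 + 2 == 4).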
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Acked-by: Dongdong Deng <dongdong.deng@windriver.com>
    • debug_core: disable hw_breakpoints on all cores in kgdb_cpu_enter() · c1bb9a9c
      Dongdong Deng authored
      The slave cpus do not have the hw breakpoints disabled upon entry to
      the debug_core and as a result could cause unrecoverable recursive
      faults on badly placed breakpoints, or get out of sync with the arch
      specific hw breakpoint operations.
      
      This patch addresses the problem by invoking kgdb_disable_hw_debug()
      earlier in kgdb_cpu_enter() for each cpu that enters the debug core.
      
      The hw breakpoint dis/enable flow should be:
      
      master_debug_cpu   slave_debug_cpu
               \              /
                kgdb_cpu_enter
                      |
              kgdb_disable_hw_debug --> uninstall pre-enabled hw_breakpoint
                      |
       add/remove and disable/enable hw_breakpoints on master_debug_cpu
                      |
              correct_hw_break --> correct/install the enabled hw_breakpoint
                      |
                 leave_kgdb
      Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • debug_core: stop rcu warnings on kernel resume · fb70b588
      Jason Wessel authored
      When returning from the kernel debugger, reset the rcu jiffies_stall
      value to prevent the rcu stall detector from sending NMI events which
      invoke a stack dump for each cpu in the system.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • debug_core: move all watch dog syncs to a single function · 16cdc628
      Jason Wessel authored
      Move the various clock and watch dog syncs to a single function in
      advance of adding another sync for the rcu stall detector.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
  3. 20 Aug, 2010 1 commit
  4. 05 Aug, 2010 1 commit
  5. 22 Jul, 2010 1 commit
  6. 19 Jul, 2010 1 commit
  7. 21 May, 2010 11 commits
    • x86, kgdb, init: Add early and late debug states · 0b4b3827
      Jason Wessel authored
      The kernel debugger can operate well before mm_init(), but the x86
      hardware breakpoint code which uses the perf api requires that the
      kernel allocators are initialized.
      
      This means the kernel debug core needs to provide an optional arch
      specific call back to allow the initialization functions to run after
      the kernel has been further initialized.
      
      The kdb shell already had a similar restriction with an early
      initialization and late initialization.  The kdb_init() was moved into
      the debug core's version of the late init, which is called
      dbg_late_init().
      
      CC: kgdb-bugreport@lists.sourceforge.net
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb,debug_core: Allow the debug core to receive a panic notification · 4402c153
      Jason Wessel authored
      It is highly desirable to trap into kdb on panic.  The debug core will
      attempt to register as the first in line for the panic notifier.
      
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • debug_core,kdb: Allow the debug core to process a recursive debug entry · 6d906340
      Jason Wessel authored
      This allows kdb to debug a crash within the kms code with a
      single level recursive re-entry.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: Add the ability to schedule a breakpoint via a tasklet · 1cee5e35
      Jason Wessel authored
      Some kgdb I/O modules require the ability to create a breakpoint
      tasklet, such as kgdboc and external modules such as kgdboe.  The
      breakpoint tasklet is used as an asynchronous entry point into the
      debugger which will have a different function scope than the current
      execution path where it might not be safe to have an inline
      breakpoint.  This is true of some of the kgdb I/O drivers which share
      code with kgdb and the rest of the kernel users.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • x86,kgdb: Add low level debug hook · f503b5ae
      Jason Wessel authored
      The only way the debugger can handle a trap inside rcu_lock,
      notify_die, or atomic_notifier_call_chain without a triple fault is
      to have a low level "first opportunity handler" in the int3 exception
      handler.
      
      Generally this will be something the vast majority of folks will not
      need, but for those who need it, it is added as a kernel .config
      option called KGDB_LOW_LEVEL_TRAP.
      
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: H. Peter Anvin <hpa@zytor.com>
      CC: x86@kernel.org
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: remove post_primary_code references · 98ec1878
      Jason Wessel authored
      Remove all the references to the kgdb_post_primary_code.  This
      function serves no useful purpose because you can obtain the same
      information from the "struct kgdb_state *ks" from within the
      debugger, if for some reason you want the data.
      
      Also remove the unintentional duplicate assignment for ks->ex_vector.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: gdb "monitor" -> kdb passthrough · a0de055c
      Jason Wessel authored
      One of the driving forces behind integrating another front end (kdb)
      to the debug core is to allow front end commands to be accessible via
      gdb's monitor command.  It is true that you could write gdb macros to
      get certain data, but you may want to just use gdb to access the
      commands that are available in the kdb front end.
      
      This patch implements the Rcmd gdb stub packet.  In gdb you access
      this with the "monitor" command.  For instance you could type "monitor
      help", "monitor lsmod" or "monitor ps A" etc...
      
      There is no error checking or command restrictions on what you can and
      cannot access at this point.  Doing something like trying to set
      breakpoints with the monitor command is going to cause nothing but
      problems.  Perhaps in the future only the commands that are actually
      known to work with the gdb monitor command will be available.
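
As a rough illustration of what gdb sends for a "monitor" command: in the gdb remote serial protocol the Rcmd request is the text "qRcmd," followed by the command encoded as two hex digits per byte. The sketch below only builds that payload. The function name build_rcmd_payload is hypothetical, and nothing here is taken from the kernel's gdb stub.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the payload of an Rcmd request: "qRcmd," followed by the
 * command text as two lowercase hex digits per byte.
 * Returns 0 on success, -1 if the output buffer is too small.
 * (Hypothetical helper for illustration only.)
 */
int build_rcmd_payload(const char *cmd, char *out, size_t out_len)
{
    static const char prefix[] = "qRcmd,";
    size_t plen = sizeof(prefix) - 1;
    size_t clen = strlen(cmd);
    size_t i;

    if (out_len < plen + 2 * clen + 1)
        return -1;

    memcpy(out, prefix, plen);
    for (i = 0; i < clen; i++)
        snprintf(out + plen + 2 * i, 3, "%02x",
                 (unsigned)(unsigned char)cmd[i]);
    out[plen + 2 * clen] = '\0';
    return 0;
}
```

Under this encoding "monitor lsmod" would carry the payload "qRcmd,6c736d6f64", which the stub decodes back to "lsmod" before handing it to kdb.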
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb,8250,pl011: Return immediately from console poll · f5316b4a
      Jason Wessel authored
      The design of the kdb shell requires that every device that can
      provide input to kdb have a polling routine that exits immediately if
      there is no character available.  This is required in order to get the
      page scrolling mechanism working.
      
      Changing the kernel debugger I/O API to require all polling character
      routines to exit immediately if there is no data allows the kernel
      debugger to process multiple input channels.
      
      NO_POLL_CHAR will be the return code from the polling routine whenever
      there is no character available.
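
The benefit of a poll routine that returns immediately can be sketched as one pass over several input sources. This is an illustrative user-space model, not the kernel API: the numeric value given to NO_POLL_CHAR, the poll_fn type, struct input_source, poll_any() and string_poll() are all assumptions made for this example.

```c
#include <stddef.h>

/* Sentinel meaning "no character available"; value assumed for this sketch. */
#define NO_POLL_CHAR 0x00ff0000

/* Hypothetical polling hook: returns a character or NO_POLL_CHAR, never blocks. */
typedef int (*poll_fn)(void *ctx);

struct input_source {
    poll_fn poll;
    void *ctx;
};

/*
 * Make one pass over all sources and return the first character found.
 * Because no source blocks, an idle serial port cannot starve a keyboard.
 */
int poll_any(const struct input_source *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int c = src[i].poll(src[i].ctx);

        if (c != NO_POLL_CHAR)
            return c;
    }
    return NO_POLL_CHAR;
}

/* Example source: a string that yields one character per poll. */
int string_poll(void *ctx)
{
    const char **p = ctx;

    if (**p == '\0')
        return NO_POLL_CHAR;
    return (unsigned char)*(*p)++;
}
```

A caller would invoke poll_any() in a loop and do other work (such as page scrolling) whenever it returns NO_POLL_CHAR.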
      
      CC: linux-serial@vger.kernel.org
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: core changes to support kdb · dcc78711
      Jason Wessel authored
      These are the minimum changes to the kgdb core in order to enable an
      API to connect a new front end (kdb) to the debug core.
      
      This patch introduces the dbg_kdb_mode variable, which controls where the
      user level I/O is routed.  It will be routed to the gdbstub (kgdb) or
      to the kdb front end which is a simple shell available over the kgdboc
      connection.
      
      You can switch back and forth between kdb and the gdb stub mode of
      operation dynamically.  From gdb stub mode you can blindly type
      "$3#33" to switch to kdb, or from the kdb mode you can enter "kgdb"
      to switch to the gdb stub.
      
      The logic in the debug core depends on kdb to look for the typical gdb
      connection sequences and return immediately with KGDB_PASS_EVENT if a
      gdb serial command sequence is detected.  That should allow a
      reasonably seamless transition between kdb -> gdb without leaving the
      kernel exception state.  The two gdb serial queries that kdb is
      responsible for detecting are the "?" and "qSupported" packets.
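
The "$3#33" sequence has the shape of a gdb remote serial packet: "$", the payload, "#", then the modulo-256 sum of the payload bytes as two hex digits. A minimal sketch of that framing follows. The function names are hypothetical and this is not the kernel's implementation.

```c
#include <stddef.h>
#include <stdio.h>

/* Modulo-256 sum of the payload bytes, as used by the gdb serial protocol. */
unsigned char gdb_checksum(const char *payload)
{
    unsigned char sum = 0;

    while (*payload)
        sum = (unsigned char)(sum + (unsigned char)*payload++);
    return sum;
}

/*
 * Frame a payload as "$<payload>#<two hex digits>".
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int gdb_frame_packet(const char *payload, char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "$%s#%02x", payload,
                     (unsigned)gdb_checksum(payload));

    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

The payload "3" is the single byte 0x33, which gives "$3#33". The two queries named above frame as "$?#3f" and "$qSupported#37".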
      
      CC: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Acked-by: Martin Hicks <mort@sgi.com>
    • Separate the gdbstub from the debug core · 53197fc4
      Jason Wessel authored
      Split the former kernel/kgdb.c into debug_core.c which contains the
      kernel debugger exception logic, and gdbstub.c which contains
      the logic for allowing gdb to talk to the debug core.
      
      This also created a private include file called debug_core.h which
      contains all the definitions to glue the debug_core to any other
      debugger connections.
      
      CC: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • Move kernel/kgdb.c to kernel/debug/debug_core.c · c4338209
      Jason Wessel authored
      Move kgdb.c in preparation to separate the gdbstub from the debug
      core and exception handling.
      
      CC: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
  8. 03 Apr, 2010 4 commits
  9. 01 Feb, 2010 1 commit
    • softlockup: Add sched_clock_tick() to avoid kernel warning on kgdb resume · d6ad3e28
      Jason Wessel authored
      When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, sched_clock() gets
      the time from hardware such as the TSC on x86. In this
      configuration kgdb will report a softlockup warning message on
      resuming or detaching from a debug session.
      
      Sequence of events in the problem case:
      
       1) "cpu sched clock" and "hardware time" are at 100 sec prior
          to a call to kgdb_handle_exception()
      
       2) Debugger waits in kgdb_handle_exception() for 80 sec and on
          exit the following is called ...  touch_softlockup_watchdog() -->
          __raw_get_cpu_var(touch_timestamp) = 0;
      
       3) "cpu sched clock" = 100s (it was not updated, because the
          interrupt was disabled in kgdb) but the "hardware time" = 180 sec
      
       4) The first timer interrupt after resuming from
          kgdb_handle_exception updates the watchdog from the "cpu sched clock"
      
      update_process_times() {
          ...
          run_local_timers() --> softlockup_tick()
              --> check (touch_timestamp == 0)
                  (it is "YES" here, we have set "touch_timestamp = 0" at kgdb)
              --> __touch_softlockup_watchdog()
                  ***(A)--> reset "touch_timestamp" to "get_timestamp()"
                  (Here, the "touch_timestamp" will still be set to 100s.)
          ...
          scheduler_tick()
              ***(B)--> sched_clock_tick()
                  (update "cpu sched clock" to "hardware time" = 180s)
          ...
      }
      
       5) The second timer interrupt handler appears to have a large
          jump and trips the softlockup warning.
      
      update_process_times() {
          ...
          run_local_timers() --> softlockup_tick()
              --> "cpu sched clock" - "touch_timestamp" = 180s-100s > 60s
              --> printk "soft lockup error messages"
          ...
      }
      
      note: ***(A) reset "touch_timestamp" to
      "get_timestamp(this_cpu)"
      
      Why is "touch_timestamp" 100 sec, instead of 180 sec?
      
      When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, the call trace of
      get_timestamp() is:
      
      get_timestamp(this_cpu)
       -->cpu_clock(this_cpu)
       -->sched_clock_cpu(this_cpu)
       -->__update_sched_clock(sched_clock_data, now)
      
      The __update_sched_clock() function uses the GTOD tick value to
      create a window to normalize the "now" values.  So if the "now"
      value is too big for sched_clock_data, it will be ignored.
      
      The fix is to invoke sched_clock_tick() to update "cpu sched
      clock" in order to recover from this state.  This is done by
      introducing the function touch_softlockup_watchdog_sync(). This
      allows kgdb to request that the sched clock is updated when the
      watchdog thread runs the first time after a resume from kgdb.
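
The arithmetic of the sequence above can be replayed with a small model. This is a sketch under simplifying assumptions: times are whole seconds, the time that passes between two timer ticks is ignored, the 60 second threshold is taken from the example, and struct clock_model, model_touch() and model_timer_tick() are hypothetical names rather than kernel functions.

```c
#define SOFTLOCKUP_THRESH 60UL /* seconds, as in the example above */

/* Hypothetical model of the three values discussed above. */
struct clock_model {
    unsigned long hw_time;         /* "hardware time" */
    unsigned long cpu_sched_clock; /* "cpu sched clock" */
    unsigned long touch_timestamp; /* 0 means "reset on the next tick" */
    int sync_requested;            /* set when the _sync variant is used */
};

/* Models touching the watchdog on debugger exit, with or without sync. */
void model_touch(struct clock_model *m, int sync)
{
    m->touch_timestamp = 0;
    m->sync_requested = sync;
}

/* One timer interrupt. Returns 1 if a soft lockup would be reported. */
int model_timer_tick(struct clock_model *m)
{
    int warn = 0;

    if (m->touch_timestamp == 0) {
        if (m->sync_requested) {
            /* The fix: update the sched clock before it is sampled. */
            m->cpu_sched_clock = m->hw_time;
            m->sync_requested = 0;
        }
        m->touch_timestamp = m->cpu_sched_clock;      /* step (A) */
    } else if (m->cpu_sched_clock - m->touch_timestamp > SOFTLOCKUP_THRESH) {
        warn = 1;
    }

    m->cpu_sched_clock = m->hw_time;                  /* step (B) */
    return warn;
}
```

Starting from hardware time 180 and cpu sched clock 100, a touch without sync makes the second tick see 180 - 100 = 80 > 60 and warn. With sync requested, the timestamp is reset to 180 and no warning is produced.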
      
      [yong.zhang0@gmail.com: Use per cpu instead of an array]
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Dongdong Deng <Dongdong.Deng@windriver.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: peterz@infradead.org
      LKML-Reference: <1264631124-4837-2-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. 30 Jan, 2010 1 commit
    • x86, hw_breakpoints, kgdb: Fix kgdb to use hw_breakpoint API · cc096749
      Jason Wessel authored
      In the 2.6.33 kernel, the hw_breakpoint API is now used for the
      performance event counters.  The hw_breakpoint_handler() now
      consumes the hw breakpoints that were previously set by kgdb
      arch specific code.  In order for kgdb to work in conjunction
      with this core API change, kgdb must use some of the low level
      functions of the hw_breakpoint API to install, uninstall, and
      deal with hw breakpoint reservations.
      
      The kgdb core required a change to call kgdb_disable_hw_debug
      anytime a slave cpu enters kgdb_wait() in order to keep all the
      hw breakpoints in sync as well as to prevent hitting a hw
      breakpoint while kgdb is active.
      
      During the architecture specific initialization of kgdb, it will
      pre-allocate 4 disabled (struct perf_event **) structures.  Kgdb
      will use these to manage the capabilities for the 4 hw
      breakpoint registers, per cpu.  Right now the hw_breakpoint API
      does not have a way to ask how many breakpoints are available
      on each CPU, so it is possible that the install of a breakpoint
      might fail when kgdb restores the system to the run state.  The
      intent of this patch is to first get the basic functionality of
      hw breakpoints working and leave it to the person debugging the
      kernel to understand what hw breakpoints are in use and what
      restrictions have been imposed as a result.  Breakpoint
      constraints will be dealt with in a future patch.
      
      While atomic, the x86 specific kgdb code will call
      arch_uninstall_hw_breakpoint() and arch_install_hw_breakpoint()
      to manage the cpu specific hw breakpoints.
      
      The net result of these changes is that kgdb can use the same pool
      of hw_breakpoints that are used by the perf event API, but
      neither knows about future reservations for the available hw
      breakpoint slots.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: torvalds@linux-foundation.org
      LKML-Reference: <1264719883-7285-2-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 11 Dec, 2009 4 commits
    • kgdb: Always process the whole breakpoint list on activate or deactivate · 7f8b7ed6
      Jason Wessel authored
      This patch fixes 2 edge cases in using kgdb in conjunction with gdb.
      
      1) kgdb_deactivate_sw_breakpoints() should process the entire array of
         breakpoints.  The failure to do so results in breakpoints that you
         cannot remove, because a break point can only be removed if its
         state flag is set to BP_SET.
      
         The easy way to duplicate this problem is to plant a break point in
         a kernel module and then unload the kernel module.
      
      2) kgdb_activate_sw_breakpoints() should process the entire array of
         breakpoints.  The failure to do so results in missed breakpoints
         when a breakpoint cannot be activated.
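
The "process the entire array" rule can be sketched as a loop that records an error but never returns early. This is an illustrative model only: apart from BP_SET, which the text above names, the state names, the struct layout and the helper functions are assumptions made for this example.

```c
#include <stddef.h>

/* Only BP_SET is named in the text above; the other states are assumed. */
enum bp_state { BP_UNDEFINED, BP_REMOVED, BP_SET, BP_ACTIVE };

struct bp {
    unsigned long addr;
    enum bp_state state;
};

/* Hypothetical arch hook: returns 0 on success, nonzero on failure. */
typedef int (*remove_fn)(unsigned long addr);

/*
 * Deactivate every active breakpoint. A failure on one entry (for
 * example, the module holding it was unloaded) is remembered, but the
 * loop continues so the remaining entries still return to BP_SET and
 * can be removed later.
 */
int deactivate_all(struct bp *bps, size_t n, remove_fn remove_bp)
{
    int ret = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        int err;

        if (bps[i].state != BP_ACTIVE)
            continue;
        err = remove_bp(bps[i].addr);
        if (err && !ret)
            ret = err;          /* keep the first error, keep going */
        bps[i].state = BP_SET;
    }
    return ret;
}

/* Example stub: pretend that odd addresses can no longer be written. */
int fail_on_odd(unsigned long addr)
{
    return (addr & 1UL) ? -1 : 0;
}
```

An early return on the first failing entry would leave every later breakpoint stuck in the active state, which is the symptom described in case 1.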
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: continue and warn on signal passing from gdb · d625e9c0
      Jason Wessel authored
      On some architectures for the segv trap, gdb wants to pass the signal
      back on continue.  For kgdb this is not the default behavior, because
      it can cause the kernel to crash if you arbitrarily pass back an
      exception outside of kgdb.
      
      Instead of causing instability, pass a message back to gdb about the
      supported kgdb signal passing and execute a standard kgdb continue
      operation.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: allow for cpu switch when single stepping · 028e7b17
      Jason Wessel authored
      The kgdb core should not assume that a single step operation of a
      kernel thread will complete on the same CPU.  The single step flag is
      set at the "thread" level and it is possible in a multi cpu system
      that a kernel thread can get scheduled on another cpu the next time it
      is run.
      
      As a further safety net in case a slave cpu is hung, the debug master
      cpu will try 100 times before giving up and assuming control of the
      slave cpus is no longer possible.  It is more useful to be able to get
      some information out of kgdb instead of spinning forever.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: Read buffer overflow · 84667d48
      Jason Wessel authored
      Roel Kluin reported an error found with Parfait.  We want to
      ensure that kgdb_info[-1] never gets accessed.
      
      Also check to ensure any negative tid does not exceed the size of the
      shadow CPU array, else report critical debug context because it is an
      internal kgdb failure.
      Reported-by: Roel Kluin <roel.kluin@gmail.com>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
  12. 04 Nov, 2009 1 commit
  13. 15 May, 2009 1 commit
  14. 07 Oct, 2008 1 commit
  15. 26 Sep, 2008 2 commits
  16. 01 Aug, 2008 2 commits
    • kgdb: fix gdb serial thread queries · 25fc9999
      Jason Wessel authored
      The command "info threads" did not work correctly with kgdb.  It would
      result in a silent kernel hang if used.
      
      This patch addresses several problems.
       - Fix use of deprecated NR_CPUS
       - Fix kgdb to not walk linearly through the pid space
       - Correctly implement shadow pids
       - Change the threads per query to a #define
       - Fix kgdb_hex2long to work with negated values
      
      The threads 0 and -1 are reserved to represent the current task.  That
      means that CPU 0 will start with a shadow thread id of -2, and CPU 1
      will have a shadow thread id of -3, etc...
      
      From the debugger you can switch to a shadow thread to see what one of
      the other cpus was doing, however it is not possible to execute run
      control operations on any other cpu except the cpu executing
      kgdb_handle_exception().
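
The shadow thread id numbering described above (CPU 0 is -2, CPU 1 is -3, and so on) amounts to a simple mapping. The helpers below are a hypothetical sketch of that mapping with a range check; the names are not taken from the kernel source.

```c
/*
 * Thread ids 0 and -1 are reserved for the current task, so cpu N is
 * given the shadow thread id -(N + 2).
 */
int cpu_to_shadow_tid(int cpu)
{
    return -cpu - 2;
}

/*
 * Map a shadow thread id back to a cpu number. Returns -1 when the id
 * is not a shadow id or names a cpu outside [0, nr_cpus).
 */
int shadow_tid_to_cpu(int tid, int nr_cpus)
{
    if (tid > -2 || tid < -nr_cpus - 1)
        return -1;
    return -tid - 2;
}
```

For example, on a 4 cpu system the valid shadow ids run from -2 (cpu 0) to -5 (cpu 3), and -6 is rejected.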
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: fix kgdb_validate_break_address to perform a mem write · a9b60bf4
      Jason Wessel authored
      A regression to the kgdb core was found in the case of using the
      CONFIG_DEBUG_RODATA kernel option.  When this option is on, a breakpoint
      cannot be written into any readonly memory page.  When an external
      debugger requests a breakpoint to get set, the
      kgdb_validate_break_address() was only checking to see if the address
      to place the breakpoint was readable and lacked a write check.
      
      This patch changes the validate routine to try reading (via the
      breakpoint set request) and also to try immediately writing the break
      point.  If either fails, an error is correctly returned and the
      debugger behaves correctly.  Then an end user can make the
      decision to use hardware breakpoints.
      
      Also update the documentation to reflect that using
      CONFIG_DEBUG_RODATA will inhibit the use of software breakpoints.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
  17. 24 Jun, 2008 1 commit
  18. 29 May, 2008 1 commit