1. 05 2月, 2008 1 次提交
  2. 20 12月, 2007 1 次提交
  3. 07 11月, 2007 1 次提交
    • R
      [IA64] Disable/re-enable CPE interrupts on Altix · 1f3b6045
      Russ Anderson 提交于
      When the CPE handler encounters too many CPEs (such as a solid single
      bit memory error), it sets up a polling timer and disables the CPE
      interrupt (to avoid excessive overhead logging the stream of single
      bit errors).  disable_irq_nosync() calls chip->disable() to provide
      a chipset specifiec interface for disabling the interrupt.  This patch
      adds the Altix specific support to disable and re-enable the CPE interrupt.
      
      Signed-off-by: Russ Anderson (rja@sgi.com)
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      1f3b6045
  4. 13 10月, 2007 2 次提交
    • R
      [IA64] Fix race when multiple cpus go through MCA · e1b1eb01
      Russ Anderson 提交于
      Additional testing uncovered a situation where the MCA recovery code could
      hang due to a race condition.
      
      According to the SAL spec, SAL sends a rendezvous interrupt to all but the first
      CPU that goes into MCA.  This includes other CPUs that go into MCA at the same
      time.  Those other CPUs will go into the linux MCA handler (rather than the
      slave loop) with the rendezvous interrupt pending.  When all the CPUs have
      completed MCA processing and the last monarch completes, freeing all the CPUs,
      the CPUs with the pended rendezvous interrupt then go into the
      ia64_mca_rendez_int_handler().  In ia64_mca_rendez_int_handler() the CPUs
      get marked as rendezvoused, but then leave the handler (due to no MCA).
      That leaves the CPUs marked as rendezvoused _before_ the next MCA event.
      
      When the next MCA hits, the monarch will mistakenly believe that all the CPUs
      are rendezvoused when they are not, opening up a window where a CPU can get
      stuck in the slave loop.
      
      This patch avoids leaving CPUs marked as rendezvoused when they are not.
      Signed-off-by: NRuss Anderson <rja@sgi.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      e1b1eb01
    • R
      [IA64] Remove needless delay in MCA rendezvous · 2bc5c282
      Russ Anderson 提交于
      While testing the MCA recovery code, noticed that some machines would have a
      five second delay rendezvousing cpus.  What was happening is that
      ia64_wait_for_slaves() would check to see if all the slave CPUs had
      rendezvoused.  If any had not, it would wait 1 millisecond then check again.
      If any CPUs had still not rendezvoused, it would wait 5 seconds before
      checking again.
      
      On some configs the rendezvous takes more than 1 millisecond, causing the code
      to wait the full 5 seconds, even though the last CPU rendezvoused after only
      a few milliseconds.
      
      The fix is to check every 1 millisecond to see if all the cpus have
      rendezvoused.  After 5 seconds the code concludes the CPUs will never
      rendezvous (same as before).
      
      The MCA code is, by definition, not performance critical, but a needless
      delay of 5 seconds is senseless.  The 5 seconds also adds up quickly
      when running the error injection code in a loop.
      
      This patch both simplifies the code and removes the needless delay.
      Signed-off-by: NRuss Anderson <rja@sgi.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      2bc5c282
  5. 14 8月, 2007 1 次提交
  6. 31 7月, 2007 1 次提交
    • S
      [IA64] fix a few section mismatch warnings · 056e6d89
      Sam Ravnborg 提交于
      Fix the following section mismatch warnings:
      
      WARNING: vmlinux.o(.text+0x41902): Section mismatch: reference to .init.text:__alloc_bootmem (between 'ia64_mca_cpu_init' and 'ia64_do_tlb_purge')
      WARNING: vmlinux.o(.text+0x49222): Section mismatch: reference to .init.text:__alloc_bootmem (between 'register_intr' and 'iosapic_register_intr')
      WARNING: vmlinux.o(.text+0x62beb2): Section mismatch: reference to .init.text:__alloc_bootmem_node (between 'hubdev_init_node' and 'cnodeid_get_geoid')
      Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      056e6d89
  7. 12 7月, 2007 1 次提交
    • R
      [IA64] Support multiple CPUs going through OS_MCA · 1612b18c
      Russ Anderson 提交于
      Linux does not gracefully deal with multiple processors going
      through OS_MCA aa part of the same MCA event.  The first cpu
      into OS_MCA grabs the ia64_mca_serialize lock.  Subsequent
      cpus wait for that lock, preventing them from reporting in as
      rendezvoused.  The first cpu waits 5 seconds then complains
      that all the cpus have not rendezvoused.  The first cpu then
      handles its MCA and frees up all the rendezvoused cpus and
      releases the ia64_mca_serialize lock.  One of the subsequent
      cpus going thought OS_MCA then gets the ia64_mca_serialize
      lock, waits another 5 seconds and then complains that none of
      the other cpus have rendezvoused.
      
      This patch allows multiple CPUs to gracefully go through OS_MCA.
      
      The first CPU into ia64_mca_handler() grabs a mca_count lock.
      Subsequent CPUs into ia64_mca_handler() are added to a list of cpus
      that need to go through OS_MCA (a bit set in mca_cpu), and report
      in as rendezvoused, and but spin waiting their turn.
      
      The first CPU sees everyone rendezvous, handles his MCA, wakes up
      one of the other CPUs waiting to process their MCA (by clearing
      one mca_cpu bit), and then waits for the other cpus to complete
      their MCA handling.  The next CPU handles his MCA and the process
      repeats until all the CPUs have handled their MCA.  When the last
      CPU has handled it's MCA, it sets monarch_cpu to -1, releasing all
      the CPUs.
      
      In testing this works more reliably and faster.
      
      Thanks to Keith Owens for suggesting numerous improvements
      to this code.
      Signed-off-by: NRuss Anderson <rja@sgi.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      1612b18c
  8. 16 5月, 2007 1 次提交
  9. 15 5月, 2007 1 次提交
    • J
      [IA64] kdump on INIT needs multi-nodes sync-up (v.2) · 311f594d
      Jay Lan 提交于
      The current implementation of kdump on INIT events would enter
      kdump processing on DIE_INIT_MONARCH_ENTER and DIE_INIT_SLAVE_ENTER
      events. Thus, the monarch cpu would go ahead and boot up the kdump
      
      On SN shub2 systems, this out-of-sync situation causes some slave
      cpus on different nodes to enter POD.
      
      This patch moves kdump entry points to DIE_INIT_MONARCH_LEAVE and
      DIE_INIT_SLAVE_LEAVE. It also sets kdump_in_progress variable in
      the DIE_INIT_MONARCH_PROCESS event to not dump all active stack
      traces to the console in the case of kdump.
      
      I have tested this patch on an SN machine and a HP RX2600.
      Signed-off-by: NJay Lan <jlan@sgi.com>
      Acked-by: NZou Nan hai <nanhai.zou@intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      311f594d
  10. 11 5月, 2007 1 次提交
  11. 10 5月, 2007 1 次提交
    • R
      rename thread_info to stack · f7e4217b
      Roman Zippel 提交于
      This finally renames the thread_info field in task structure to stack, so that
      the assumptions about this field are gone and archs have more freedom about
      placing the thread_info structure.
      
      Nonbroken archs which have a proper thread pointer can do the access to both
      current thread and task structure via a single pointer.
      
      It'll allow for a few more cleanups of the fork code, from which e.g.  ia64
      could benefit.
      Signed-off-by: NRoman Zippel <zippel@linux-m68k.org>
      [akpm@linux-foundation.org: build fix]
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Chris Zankel <chris@zankel.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f7e4217b
  12. 09 5月, 2007 2 次提交
  13. 09 3月, 2007 1 次提交
  14. 13 12月, 2006 1 次提交
    • H
      [IA64] CONFIG_KEXEC/CONFIG_CRASH_DUMP permutations · 45a98fc6
      Horms 提交于
      Actually, on reflection I think that there is a good case for
      keeping the options separate. I am thinking particularly of people
      who want a very small crashdump kernel and thus don't want to compile
      in kexec.
      
      The patch below should fix things up so that all valid combinations of
      KEXEC, CRASH_DUMP and VMCORE compile cleanly - VMCORE depends on
      CRASH_DUMP which is why I said valid combinations. In a nutshell
      it just untangles unrelated code and switches around a few defines.
      
      Please note that it creats a new file, arch/ia64/kernel/crash_dump.c
      This is in keeping with the i386 implementation.
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      45a98fc6
  15. 08 12月, 2006 1 次提交
    • Z
      [IA64] IA64 Kexec/kdump · a7956113
      Zou Nan hai 提交于
      Changes and updates.
      
      1. Remove fake rendz path and related code according to discuss with Khalid Aziz.
      2. fc.i offset fix in relocate_kernel.S.
      3. iospic shutdown code eoi and mask race fix from Fujitsu.
      4. Warm boot hook in machine_kexec to SN SAL code from Jack Steiner.
      5. Send slave to SAL slave loop patch from Jay Lan.
      6. Kdump on non-recoverable MCA event patch from Jay Lan
      7. Use CTL_UNNUMBERED in kdump_on_init sysctl.
      Signed-off-by: NZou Nan hai <nanhai.zou@intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      a7956113
  16. 06 12月, 2006 1 次提交
  17. 05 10月, 2006 1 次提交
    • D
      IRQ: Maintain regs pointer globally rather than passing to IRQ handlers · 7d12e780
      David Howells 提交于
      Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
      of passing regs around manually through all ~1800 interrupt handlers in the
      Linux kernel.
      
      The regs pointer is used in few places, but it potentially costs both stack
      space and code to pass it around.  On the FRV arch, removing the regs parameter
      from all the genirq function results in a 20% speed up of the IRQ exit path
      (ie: from leaving timer_interrupt() to leaving do_IRQ()).
      
      Where appropriate, an arch may override the generic storage facility and do
      something different with the variable.  On FRV, for instance, the address is
      maintained in GR28 at all times inside the kernel as part of general exception
      handling.
      
      Having looked over the code, it appears that the parameter may be handed down
      through up to twenty or so layers of functions.  Consider a USB character
      device attached to a USB hub, attached to a USB controller that posts its
      interrupts through a cascaded auxiliary interrupt controller.  A character
      device driver may want to pass regs to the sysrq handler through the input
      layer which adds another few layers of parameter passing.
      
      I've build this code with allyesconfig for x86_64 and i386.  I've runtested the
      main part of the code on FRV and i386, though I can't test most of the drivers.
      I've also done partial conversion for powerpc and MIPS - these at least compile
      with minimal configurations.
      
      This will affect all archs.  Mostly the changes should be relatively easy.
      Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
      
      	struct pt_regs *old_regs = set_irq_regs(regs);
      
      And put the old one back at the end:
      
      	set_irq_regs(old_regs);
      
      Don't pass regs through to generic_handle_irq() or __do_IRQ().
      
      In timer_interrupt(), this sort of change will be necessary:
      
      	-	update_process_times(user_mode(regs));
      	-	profile_tick(CPU_PROFILING, regs);
      	+	update_process_times(user_mode(get_irq_regs()));
      	+	profile_tick(CPU_PROFILING);
      
      I'd like to move update_process_times()'s use of get_irq_regs() into itself,
      except that i386, alone of the archs, uses something other than user_mode().
      
      Some notes on the interrupt handling in the drivers:
      
       (*) input_dev() is now gone entirely.  The regs pointer is no longer stored in
           the input_dev struct.
      
       (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking.  It does
           something different depending on whether it's been supplied with a regs
           pointer or not.
      
       (*) Various IRQ handler function pointers have been moved to type
           irq_handler_t.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
      7d12e780
  18. 01 10月, 2006 1 次提交
  19. 27 9月, 2006 2 次提交
    • H
      [IA64] CMC/CPE: Reverse the order of fetching log and checking poll threshold · ddb4f0df
      Hidetoshi Seto 提交于
      This patch reverses the order of fetching log from SAL and
      checking poll threshold. This will fix following trivial issues:
      
      - If SAL_GET_SATE_INFO is unbelievably slow (due to huge system
         or just its silly implementation) and if it takes more than
         1/5 sec, CMCI/CPEI will never switch to CMCP/CPEP.
      - Assuming terrible flood of interrupt (continuous corrected
         errors let all CPUs enter to handler at once and bind them
         in it), CPUs will be serialized by IA64_LOG_LOCK(*).
         Now we check the poll threshold after the lock and log fetch,
         so we need to call SAL_GET_STATE_INFO (num_online_cpus() + 4)
         times in the worst case.
         if we can check the threshold before the lock, we can shut up
         interrupts quickly without waiting preceding log fetches, and
         the number of times will be reduced to (num_online_cpus()) in
         the same situation.
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      ddb4f0df
    • H
      [IA64] printing support for MCA/INIT · 43ed3baf
      Hidetoshi Seto 提交于
      Printing message to console from MCA/INIT handler is useful,
      however doing oops_in_progress = 1 in them exactly makes
      something in kernel wrong. Especially it sounds ugly if
      system goes wrong after returning from recoverable MCA.
      
      This patch adds ia64_mca_printk() function that collects
      messages into temporary-not-so-large message buffer during
      in MCA/INIT environment and print them out later, after
      returning to normal context or when handlers determine to
      down the system.
      
      Also this print function is exported for use in extensional
      MCA handler. It would be useful to describe detail about
      recovery.
      
      NOTE:
      I don't think it is sane thing if temporary message buffer
      is enlarged enough to hold whole stack dumps from INIT, so
      buffering is disabled during stack dump from INIT-monarch
      (= default_monarch_init_process). please fix it in future.
      Signed-off-by: NHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Acked-by: NRuss Anderson <rja@sgi.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      43ed3baf
  20. 04 7月, 2006 1 次提交
  21. 03 7月, 2006 1 次提交
  22. 01 7月, 2006 1 次提交
  23. 30 6月, 2006 1 次提交
  24. 14 4月, 2006 1 次提交
  25. 08 4月, 2006 1 次提交
  26. 27 3月, 2006 1 次提交
  27. 25 3月, 2006 1 次提交
    • R
      [IA64] MCA recovery: kernel context recovery table · d2a28ad9
      Russ Anderson 提交于
      Memory errors encountered by user applications may surface
      when the CPU is running in kernel context.  The current code
      will not attempt recovery if the MCA surfaces in kernel
      context (privilage mode 0).  This patch adds a check for cases
      where the user initiated the load that surfaces in kernel
      interrupt code.
      
      An example is a user process lauching a load from memory
      and the data in memory had bad ECC.  Before the bad data
      gets to the CPU register, and interrupt comes in.  The
      code jumps to the IVT interrupt entry point and begins
      execution in kernel context.  The process of saving the
      user registers (SAVE_REST) causes the bad data to be loaded
      into a CPU register, triggering the MCA.  The MCA surfaces in
      kernel context, even though the load was initiated from
      user context.
      
      As suggested by David and Tony, this patch uses an exception
      table like approach, puting the tagged recovery addresses in
      a searchable table.  One difference from the exception table
      is that MCAs do not surface in precise places (such as with
      a TLB miss), so instead of tagging specific instructions,
      address ranges are registers.  A single macro is used to do
      the tagging, with the input parameter being the label
      of the starting address and the macro being the ending
      address.  This limits clutter in the code.
      
      This patch only tags one spot, the interrupt ivt entry.
      Testing showed that spot to be a "heavy hitter" with
      MCAs surfacing while saving user registers.  Other spots
      can be added as needed by adding a single macro.
      
      Signed-off-by: Russ Anderson (rja@sgi.com)
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      d2a28ad9
  28. 23 3月, 2006 1 次提交
  29. 09 2月, 2006 3 次提交
  30. 13 1月, 2006 1 次提交
  31. 06 1月, 2006 1 次提交
    • A
      [IA64] support for cpu0 removal · ff741906
      Ashok Raj 提交于
      here is the BSP removal support for IA64. Its pretty much the same thing that
      was released a while back, but has your feedback incorporated.
      
      - Removed CONFIG_BSP_REMOVE_WORKAROUND and associated cmdline param
      - Fixed compile issue with sn2/zx1 due to a undefined fix_b0_for_bsp
      - some formatting nits (whitespace etc)
      
      This has been tested on tiger and long back by alex on hp systems as well.
      Signed-off-by: NAshok Raj <ashok.raj@intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      ff741906
  32. 08 11月, 2005 1 次提交
    • K
      [IA64] Extend notify_die() hooks for IA64 · 9138d581
      Keith Owens 提交于
      notify_die() added for MCA_{MONARCH,SLAVE,RENDEZVOUS}_{ENTER,PROCESS,LEAVE} and
      INIT_{MONARCH,SLAVE}_{ENTER,PROCESS,LEAVE}.  We need multiple
      notification points for these events because they can take many seconds
      to run which has nasty effects on the behaviour of the rest of the
      system.
      
      DIE_SS replaced by a generic DIE_FAULT which checks the vector number,
      to allow interception of faults other than SS.
      
      DIE_MACHINE_{HALT,RESTART} added to allow last minute close down
      processing, especially when the halt/restart routines are called from
      error handlers.
      
      DIE_OOPS added.
      
      The check for kprobe's break numbers has been moved from traps.c to
      kprobes.c, allowing DIE_BREAK to be used for any additional break
      numbers, i.e. it is no longer kprobes specific.
      
      Hooks for kernel debuggers and kernel dumpers added, ENTER and LEAVE.
      Both of these disable the system for long periods which impact on
      watchdogs and heartbeat systems in general.  More patches to come that
      use these events to reset watchdogs and heartbeats.
      
      unregister_die_notifier() added and both routines exported.  Requested
      by Dean Nelson.
      
      Lock removed from {un,}register_die_notifier.  notifier_chain_register()
      already takes a lock.  Also the generic notifier chain locking is being
      reworked to distinguish between callbacks that can block and those that
      cannot, the lock in {un,}register_die_notifier would interfere with
      that change.  http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2
      
      Leading white space removed from arch/ia64/kernel/kprobes.c.
      
      Typo in mca.c in original version of this patch found & fixed by Dean
      Nelson.
      Signed-off-by: NKeith Owens <kaos@sgi.com>
      Acked-by: NDean Nelson <dcn@sgi.com>
      Acked-by: NAnil Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      9138d581
  33. 26 10月, 2005 1 次提交
  34. 07 10月, 2005 1 次提交
    • B
      [IA64] Avoid kernel hang during CMC interrupt storm · 76e677e2
      Bryan Sutula 提交于
      I've noticed a kernel hang during a storm of CMC interrupts, which was
      tracked down to the continual execution of the interrupt handler.
      
      There's code in the CMC handler that's supposed to disable CMC
      interrupts and switch to polling mode when it sees a bunch of CMCs.
      Because disabling CMCs across all CPUs isn't safe in interrupt context,
      the disable is done with a schedule_work().  But with continual CMC
      interrupts, the schedule_work() never gets executed.
      
      The following patch immediately disables CMC interrupts for the current
      CPU.  This then allows (at least) one CPU to ignore CMC interrupts,
      execute the schedule_work() code, and disable CMC interrupts on the rest
      of the CPUs.
      Acked-by: NKeith Owens <kaos@sgi.com>
      Signed-off-by: NBryan Sutula <Bryan.Sutula@hp.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      76e677e2
  35. 12 9月, 2005 1 次提交
    • K
      [PATCH] MCA/INIT: use per cpu stacks · 7f613c7d
      Keith Owens 提交于
      The bulk of the change.  Use per cpu MCA/INIT stacks.  Change the SAL
      to OS state (sos) to be per process.  Do all the assembler work on the
      MCA/INIT stacks, leaving the original stack alone.  Pass per cpu state
      data to the C handlers for MCA and INIT, which also means changing the
      mca_drv interfaces slightly.  Lots of verification on whether the
      original stack is usable before converting it to a sleeping process.
      Signed-off-by: NKeith Owens <kaos@sgi.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      7f613c7d