  1. 13 Dec, 2005 2 commits
  2. 07 Dec, 2005 4 commits
    • [IA64-SGI] Fix SN PTC deadlock recovery · 590711b7
      Jack Steiner committed
      The patch that added support for a new platform chipset (shub2) broke
      PTC deadlock recovery on older versions of the chipset. (PTCs are the
      SN platform-specific method for doing a global TLB purge). This
      patch fixes deadlock recovery so that it works on both the old & new
      chipsets.
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] Change SET_PERSONALITY to comply with comment in binfmt_elf.c. · bd1d6e24
      Robin Holt committed
      We have a customer application which trips a bug.  The problem arises
      when a driver attempts to call do_munmap on an area which is mapped, but
      because current->thread.task_size has been set to 0xC0000000, the call
      to do_munmap fails thinking it is an unmap beyond the user's address
      space.
      
      The comment in fs/binfmt_elf.c in load_elf_library() before the call
      to SET_PERSONALITY() indicates that task_size must not be changed for
      the running application until flush_thread(), but on ia64 it is changed
      earlier when executing ia32 binaries.
      
      This patch moves the setting of task_size from SET_PERSONALITY() to
      flush_thread() as indicated.  The customer application can no longer
      trip the bug.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
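The failure mode in this commit message can be sketched as a simple bounds check. This is a minimal illustration, assuming the unmap path rejects any range that does not fit below task_size; the function name `munmap_range_ok` and the example addresses are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an unmap bounds check: the range
 * [start, start + len) must lie below task_size. */
static bool munmap_range_ok(uint64_t start, uint64_t len, uint64_t task_size)
{
    assert(len > 0);              /* the sketch only handles non-empty ranges */
    if (start > task_size)
        return false;             /* begins beyond the user address space */
    if (len > task_size - start)
        return false;             /* runs past the end of the address space */
    return true;
}
```

With task_size lowered to 0xC0000000 too early, a range that starts at 0x100000000 is rejected even though it is legitimately mapped. With a larger native limit the same call passes.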
    • [IA64] Limit the maximum NODEDATA_ALIGN() offset · acb7f672
      Jack Steiner committed
      The per-node data structures are allocated with strided offsets that are a
      function of the node number. This prevents excessive cache-aliasing from
      occurring.
      
      On systems with a large number of nodes, the strided offset becomes
      too large. This patch restricts the maximum offset to 32MB. This is far larger
      than the size of any current L3 cache.
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
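A strided per-node offset with a 32MB ceiling can be sketched as follows. This is an illustration only: the 64KB stride and the names `NODE_STRIDE`, `MAX_NODE_OFFSET`, and `node_offset` are assumptions, and masking (which wraps the offset around) is just one way to enforce the limit.

```c
#include <assert.h>
#include <stdint.h>

#define NODE_STRIDE     (64UL * 1024)         /* assumed per-node stride */
#define MAX_NODE_OFFSET (32UL * 1024 * 1024)  /* 32MB limit from the commit message */

/* Offset applied to a node's data so that different nodes do not
 * all land on the same cache lines. */
static uint64_t node_offset(unsigned int node)
{
    /* Masking with a power-of-two limit keeps the result in [0, 32MB). */
    uint64_t off = ((uint64_t)node * NODE_STRIDE) & (MAX_NODE_OFFSET - 1);
    assert(off < MAX_NODE_OFFSET);
    return off;
}
```

Under these assumptions the offset grows by 64KB per node and wraps back to zero at node 512, so it never exceeds the limit no matter how many nodes the system has.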
    • [IA64-SGI] altix: pci_window fixup · 3ec829b6
      John Keller committed
      Altix-only patch to add fixup code that sets up
      pci_controller->window. This code is a temporary
      fix until ACPI support on Altix is added.
      
      Also corrects the usage of pci_dev->sysdata,
      which had previously been used to reference
      platform-specific device info, so that it now points to
      a pci_controller struct.
      Signed-off-by: John Keller <jpk@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  3. 06 Dec, 2005 1 commit
  4. 03 Dec, 2005 2 commits
  5. 30 Nov, 2005 2 commits
  6. 24 Nov, 2005 1 commit
    • [PATCH] kprobes: Fix return probes on sys_execve · 8bf1101b
      Jim Keniston committed
      Fix a bug in kprobes that can cause an Oops or even a crash when a return
      probe is installed on one of the following functions: sys_execve,
      do_execve, load_*_binary, flush_old_exec, or flush_thread.  The fix is to
      remove the call to kprobe_flush_task() in flush_thread().  This fix has
      been tested on all architectures for which the return-probes feature has
      been implemented (i386, x86_64, ppc64, ia64).  Please apply.
      
      BACKGROUND
      
      Up to now, we have called kprobe_flush_task() under two situations: when a
      task exits, and when it execs.  Flushing kretprobe_instances on exit is
      correct because (a) do_exit() doesn't return, and (b) one or more
      return-probed functions may be active when a task calls do_exit().  Neither
      is the case for sys_execve() and its callees.
      
      Initially, the mistaken call to kprobe_flush_task() on exec was harmless
      because we put the "real" return address of each active probed function
      back in the stack, just to be safe, when we recycled its
      kretprobe_instance.  When support for ppc64 and ia64 was added, this safety
      measure couldn't be employed, and was eventually dropped even for i386 and
      x86_64.  sys_execve() and its callees were informally blacklisted for
      return probes until this fix was developed.
      Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
      Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  7. 22 Nov, 2005 3 commits
  8. 18 Nov, 2005 2 commits
    • [IA64] polish comments for tlb fault handler in ivt.S · e8aabc47
      Chen, Kenneth W committed
      Polish the comments, specifically in the vhpt_miss and nested_dtlb_miss
      handlers.  I think it's better to refer to each page table level
      explicitly by name instead of by number, i.e., use pgd, pud, pmd,
      and pte instead of L1, L2, L3, etc.  Along the way, remove some
      magic numbers from the comments, such as
      "PTA + (((IFA(61,63) << 7) | IFA(33,39))*8)".  No code change at
      all, pure comment update.  Feel free to shoot anything you have,
      darts or tomahawk cruise missile.  I will duck behind a bunker ;-)
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Acked-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
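The expression quoted in this commit message, "PTA + (((IFA(61,63) << 7) | IFA(33,39))*8)", can be read as plain bit-field arithmetic. The sketch below assumes that IFA(a,b) denotes bits a through b of the faulting address, inclusive, and that each table entry is 8 bytes; the helper names `bits` and `entry_addr` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits lo..hi (inclusive) of v, shifted down to bit 0. */
static uint64_t bits(uint64_t v, unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi < 64);
    unsigned width = hi - lo + 1;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    return (v >> lo) & mask;
}

/* PTA + (((IFA(61,63) << 7) | IFA(33,39)) * 8) */
static uint64_t entry_addr(uint64_t pta, uint64_t ifa)
{
    /* 3 high bits and 7 lower bits are combined into a 10-bit index. */
    uint64_t index = (bits(ifa, 61, 63) << 7) | bits(ifa, 33, 39);
    return pta + index * 8;       /* 8-byte entries (assumption) */
}
```

Naming the intermediate value (here `index`) is the same kind of readability gain the commit aims for by replacing numeric expressions in comments with page-table level names.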
    • [IA64] 4 level page table bug fix in vhpt_miss · fedb25fa
      Chen, Kenneth W committed
      From source code inspection, I think there is a bug in the vhpt_miss
      handler with 4-level page tables.  In the code path that rechecks the
      page table entry against the previously read value after tlb insertion,
      the *pte value in register r18 was overwritten with the value newly read
      from the pud pointer, rendering the check of the new *pte against the
      previous *pte completely wrong.  The bug is non-fatal, and the penalty
      is to purge the entry and retry, but for functional correctness it
      should be fixed.  The fix is to use a different register so the new
      *pud doesn't trash *pte.  (btw, the comment on the cmp statement is
      wrong as well, which I will address in the next patch).
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  9. 16 Nov, 2005 1 commit
    • [PATCH] ia64: cpu_idle performance bug fix · 1e185b97
      Chen, Kenneth W committed
      Our performance validation on 2.6.15-rc1 caught a disastrous performance
      regression on ia64 with netperf (-98%) and volanomark (-58%) compared to
      the previous kernel version, 2.6.14-git7.  See the following chart (result
      groups 1 & 2).
      
        http://kernel-perf.sourceforge.net/results.machine_id=26.html
      
      We have root-caused it to commit 64c7c8f8.
      
      This changeset broke the ia64 task resched notification.  In
      sched.c:resched_task(), a reschedule IPI is conditioned upon
      TIF_POLLING_NRFLAG.  However, the above changeset unconditionally sets
      the polling thread flag for idle tasks regardless of whether
      pal_halt_light is in use.  As a result, the resched IPI is not sent from
      resched_task().  And since the default behavior on ia64 is to use
      pal_halt_light, we end up delaying the reschedule until the next
      timer tick, which causes the performance regression.
      
      This fixes the performance bug.  I'm glad our performance suite is
      turning up bad performance bugs like this in time.
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  10. 15 Nov, 2005 1 commit
  11. 12 Nov, 2005 2 commits
    • [IA64-SGI] set altix preferred console · ff51224c
      Mark Maule committed
      Fix default VGA console on SN platforms.  Since SN firmware does not pass
      enough ACPI information to identify VGA cards and the associated legacy IO/MEM
      addresses, we rely on the EFI PCDP table.  Since the linux pcdp driver is
      optional (and overridden if console= directives are used) SN duplicates a
      portion of the pcdp scan code to identify if there is a usable console VGA
      adapter.  Additionally, duplicate the necessary pcdp-related structs to
      avoid dragging drivers/pcdp.h into a more public location.
      Signed-off-by: Mark Maule <maule@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] 4-level page tables · 837cd0bd
      Robin Holt committed
      This patch introduces 4-level page tables to ia64.  I have run
      some benchmarks and found nothing interesting.  Performance has
      consistently fallen within the noise range.
      
      It also introduces a config option (setting the default to 3
      levels).  The config option prevents having 4 level page
      tables with 64k base page size.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  12. 11 Nov, 2005 2 commits
  13. 09 Nov, 2005 10 commits
    • [PATCH] sched: resched and cpu_idle rework · 64c7c8f8
      Nick Piggin committed
      Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce
      confusion, and make their semantics rigid.  Improves efficiency of
      resched_task and some cpu_idle routines.
      
      * In resched_task:
      - TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
        and as we hold it during resched_task, then there is no need for an
        atomic test and set there. The only other time this should be set is
        when the task's quantum expires, in the timer interrupt - this is
        protected against because the rq lock is irq-safe.
      
      - If TIF_NEED_RESCHED is set, then we don't need to do anything. It
        won't get unset until the task gets schedule()d off.
      
      - If we are running on the same CPU as the task we resched, then set
        TIF_NEED_RESCHED and no further action is required.
      
      - If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
        after TIF_NEED_RESCHED has been set, then we need to send an IPI.
      
      Using these rules, we are able to remove the test and set operation in
      resched_task, and make clear the previously vague semantics of
      POLLING_NRFLAG.
      
      * In idle routines:
      - Enter cpu_idle with preempt disabled. When the need_resched() condition
        becomes true, explicitly call schedule(). This makes things a bit clearer
        (IMO), but I haven't updated all architectures yet.
      
      - Many do a test and clear of TIF_NEED_RESCHED for some reason. According
        to the resched_task rules, this isn't needed (and actually breaks the
        assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
        held). So remove that. Generally one less locked memory op when switching
        to the idle thread.
      
      - Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the
        innermost polling idle loops. The above resched_task semantics allow it
        to stay set until just before the final need_resched() check that
        precedes a halt requiring interrupt wakeup.
      
        Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
        can be always left set, completely eliminating resched IPIs when rescheduling
        the idle task.
      
        POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Con Kolivas <kernel@kolivas.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
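The resched_task rules listed in this commit message can be sketched in user-space C. This is a simplified model, not the kernel code: the `struct task` layout, the flag bit positions, and the function name `resched_task_sketch` are assumptions, C11 atomics stand in for the kernel's flag helpers, and the runqueue lock that the caller is said to hold is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TIF_NEED_RESCHED   (1u << 0)   /* bit positions are assumptions */
#define TIF_POLLING_NRFLAG (1u << 1)

struct task {
    atomic_uint flags;   /* thread flags of the target task */
    int cpu;             /* CPU the target task runs on */
};

/* Returns true when the caller would have to send a resched IPI. */
static bool resched_task_sketch(struct task *t, int this_cpu)
{
    assert(t != 0);

    /* Rule: flag already set, nothing to do. */
    if (atomic_load(&t->flags) & TIF_NEED_RESCHED)
        return false;

    /* Set the flag; the previous value is ignored, so this is
     * not used as a test-and-set. */
    atomic_fetch_or(&t->flags, TIF_NEED_RESCHED);

    /* Rule: same CPU, no further action required. */
    if (t->cpu == this_cpu)
        return false;

    /* Rule: other CPU, IPI only if the target is not polling after
     * NEED_RESCHED has become visible. */
    atomic_thread_fence(memory_order_seq_cst);
    return !(atomic_load(&t->flags) & TIF_POLLING_NRFLAG);
}
```

The last rule is why an idle loop must only advertise TIF_POLLING_NRFLAG while it really polls need_resched(): if the flag stays set while the CPU is halted, no IPI is sent and the wakeup waits for the next interrupt.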
    • [PATCH] sched: disable preempt in idle tasks · 5bfb5d69
      Nick Piggin committed
      Run idle threads with preempt disabled.
      
      Also corrected a bug in arm26's cpu_idle (make it actually call schedule()).
      How did it ever work before?
      
      Might fix the CPU hotplugging hang which Nigel Cunningham noted.
      
      We think the bug hits if the idle thread is preempted after checking
      need_resched() and before going to sleep, and the CPU is then offlined.
      
      After calling stop_machine_run, the CPU eventually returns from preemption and
      into the idle thread and goes to sleep.  The CPU will continue executing
      previous idle and have no chance to call play_dead.
      
      By disabling preemption until we are ready to explicitly schedule, this bug is
      fixed and the idle threads generally become more robust.
      
      From: alexs <ashepard@u.washington.edu>
      
        PPC build fix
      
      From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      
        MIPS build fix
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] remove ioctl32_handler_t · 7e4c54a2
      Christoph Hellwig committed
      Some architectures define and use this type in their compat_ioctl code, but
      all of them can easily use the identical ioctl_trans_handler_t type that is
      defined in common code.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [IA64] add the MMIO regions that are translated to I/O port space to /proc/iomem · 4f41d5a4
      Bjorn Helgaas committed
      ia64 translates normal loads and stores to special MMIO regions into I/O port
      accesses.  Reserve these special MMIO regions in /proc/iomem.
      
      Sample /proc/iomem:
          f8100000000-f81003fffff : PCI Bus 0000:80 I/O Ports 00000000-00000fff
          f8100400000-f81007fffff : PCI Bus 0000:8e I/O Ports 00001000-00001fff
          f8100800000-f8100ffffff : PCI Bus 0000:9c I/O Ports 00002000-00003fff
          f8101000000-f81017fffff : PCI Bus 0000:aa I/O Ports 00004000-00005fff
      
      and corresponding /proc/ioports:
          00000000-00000fff : PCI Bus 0000:80
          00001000-00001fff : PCI Bus 0000:8e
          00002000-00003fff : PCI Bus 0000:9c
          00004000-00005fff : PCI Bus 0000:aa
      Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] altix: misc pci interrupt related fixes · 6fb93a92
      Mark Maule committed
      Fix a couple of altix interrupt related bugs.
      Signed-off-by: Mark Maule <maule@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] MCA recovery: Bump reference count on bad pages · cbb92144
      Russ Anderson committed
      When a page has an uncorrectable memory ECC error, the recovery
      code wants to prevent the page from being reused.  This change
      bumps the reference count to prevent the page from getting back
      on the free list.
      
      Signed-off-by: Russ Anderson (rja@sgi.com)
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] MCA recovery: pfn_valid() needs a pfn · 56f87b82
      Russ Anderson committed
      paddr needs to be shifted by PAGE_SHIFT to be valid
      input for pfn_valid().
      Signed-off-by: Russ Anderson <rja@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
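The difference between a physical address and a page frame number can be shown with a short sketch. It assumes 16KB pages (a PAGE_SHIFT of 14, which is only one of several configurable page sizes) and uses a hypothetical stand-in, `pfn_valid_sketch`, that simply compares against a maximum frame number rather than consulting the real memory map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 14   /* assumed 16KB pages */

/* Convert a physical address into a page frame number. */
static uint64_t paddr_to_pfn(uint64_t paddr)
{
    return paddr >> PAGE_SHIFT;
}

/* Hypothetical stand-in for pfn_valid(): a frame is valid when it
 * lies below max_pfn. */
static bool pfn_valid_sketch(uint64_t pfn, uint64_t max_pfn)
{
    assert(max_pfn > 0);
    return pfn < max_pfn;
}
```

With 1024 frames of memory, the physical address 0x8000 belongs to frame 2 and is valid, but passing the raw address 0x8000 as if it were a frame number makes the same check fail.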
    • [IA64] MCA recovery based on PSP bits · a14f25a0
      Russ Anderson committed
      The determination of whether an MCA is recoverable or not must
      be based on the bits set in the PSP (Processor State Parameter).
      The specific bits are shown in the Intel IA-64 Architecture Software
      Developer's Manual, Vol 2, Table 11-6 Software Recovery Bits in
      Processor State Parameter.  Those bits should be consistent
      across the entire IA-64 family of processors.
      Signed-off-by: Russ Anderson <rja@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] align signal-frame even when not using alternate signal-stack · cf20d1ea
      David Mosberger-Tang committed
      At the moment, attempting to invoke a signal-handler on the normal
      stack is guaranteed to fail if the stack-pointer happens not to be
      16-byte aligned.  This is because the signal-trampoline will attempt
      to store fp-regs with stf.spill instructions, which will trap for
      misaligned addresses.  This isn't terribly useful behavior.  It's
      better to just always align the signal frame to the next lower 16-byte
      boundary.
      Signed-off-by: David Mosberger-Tang <David.Mosberger@acm.org>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
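Aligning an address down to the next lower 16-byte boundary is a single mask operation. The sketch below illustrates that arithmetic only; the name `align_sigframe` is hypothetical and the real signal-frame setup involves more than this step.

```c
#include <assert.h>
#include <stdint.h>

#define SIGFRAME_ALIGN 16UL

/* Round sp down to the nearest multiple of 16. */
static uint64_t align_sigframe(uint64_t sp)
{
    uint64_t aligned = sp & ~(SIGFRAME_ALIGN - 1);
    assert(aligned <= sp && (aligned % SIGFRAME_ALIGN) == 0);
    return aligned;
}
```

Rounding down rather than up matters because the stack grows toward lower addresses, so the aligned frame never overlaps data already stored above the original stack pointer.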
    • [IA64] fix memory less node allocation · 97835245
      Bob Picco committed
      The original memory-less node allocation attempted to use NODEDATA_ALIGN for
      alignment.  The bootmem allocator only allows power-of-two alignments, so
      this causes a BUG_ON for some nodes. For CPU-only nodes, just allocate with
      a PERCPU_PAGE_SIZE alignment.
      
      Some older firmware reports SLIT distances of 0xff and results in bestnode
      not being computed. This is now treated correctly.
      
      The failed allocation check was removed because it's redundant.  The
      bootmem allocator already makes this check.
      
      This fix has been boot tested on a 4 node machine which has 4 cpu only nodes
      and 1 memory node.  Thanks to Pete Keilty for reporting this and helping me
      test it.
      Signed-off-by: Bob Picco <bob.picco@hp.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
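The power-of-two requirement mentioned in this commit message can be sketched with two small helpers. The names `is_power_of_two` and `align_up` are hypothetical, and the 64KB value used in the note below is an assumed stand-in for PERCPU_PAGE_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit of x is set. */
static bool is_power_of_two(uint64_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Round addr up to a multiple of align; the mask trick only works
 * for power-of-two alignments, hence the assertion. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(is_power_of_two(align));
    return (addr + align - 1) & ~(align - 1);
}
```

An alignment of 64KB passes the check, while a node-dependent value such as 3 * 64KB does not, which is the kind of input that would trip an assertion in an allocator built on this mask arithmetic.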
  14. 08 Nov, 2005 1 commit
    • [IA64] Extend notify_die() hooks for IA64 · 9138d581
      Keith Owens committed
      notify_die() added for MCA_{MONARCH,SLAVE,RENDEZVOUS}_{ENTER,PROCESS,LEAVE} and
      INIT_{MONARCH,SLAVE}_{ENTER,PROCESS,LEAVE}.  We need multiple
      notification points for these events because they can take many seconds
      to run which has nasty effects on the behaviour of the rest of the
      system.
      
      DIE_SS replaced by a generic DIE_FAULT which checks the vector number,
      to allow interception of faults other than SS.
      
      DIE_MACHINE_{HALT,RESTART} added to allow last minute close down
      processing, especially when the halt/restart routines are called from
      error handlers.
      
      DIE_OOPS added.
      
      The check for kprobe's break numbers has been moved from traps.c to
      kprobes.c, allowing DIE_BREAK to be used for any additional break
      numbers, i.e. it is no longer kprobes specific.
      
      Hooks for kernel debuggers and kernel dumpers added, ENTER and LEAVE.
      Both of these disable the system for long periods which impact on
      watchdogs and heartbeat systems in general.  More patches to come that
      use these events to reset watchdogs and heartbeats.
      
      unregister_die_notifier() added and both routines exported.  Requested
      by Dean Nelson.
      
      Lock removed from {un,}register_die_notifier.  notifier_chain_register()
      already takes a lock.  Also, the generic notifier chain locking is being
      reworked to distinguish between callbacks that can block and those that
      cannot; the lock in {un,}register_die_notifier would interfere with
      that change.  http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2
      
      Leading white space removed from arch/ia64/kernel/kprobes.c.
      
      Typo in mca.c in original version of this patch found & fixed by Dean
      Nelson.
      Signed-off-by: Keith Owens <kaos@sgi.com>
      Acked-by: Dean Nelson <dcn@sgi.com>
      Acked-by: Anil Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  15. 07 Nov, 2005 6 commits