1. 07 Feb, 2007 2 commits
    • C
      [IA64] relax per-cpu TLB requirement to DTC · 00b65985
      Chen, Kenneth W authored
      Instead of pinning per-cpu TLB into a DTR, use DTC.  This will free up
      one TLB entry for application, or even kernel if access pattern to
      per-cpu data area has high temporal locality.
      
      Since per-cpu is mapped at the top of region 7 address, we just need to
      add special case in alt_dtlb_miss.  The physical address of per-cpu data
      is already conveniently stored in IA64_KR(PER_CPU_DATA).  Latency for
      alt_dtlb_miss is not affected as we can hide all the latency.  It was
      measured that alt_dtlb_miss handler has 23 cycles latency before and
      after the patch.
      
      The performance effect is massive for applications that put heavy TLB
      pressure on the CPU.  Workloads such as database online transaction
      processing, or applications that use terabytes of memory, would benefit
      the most.  Measurement with an industry-standard database benchmark showed
      a gain of upward of 1.6%.  Smaller workloads like cpu and java also show
      a small improvement.
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
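
The special case described above can be modelled outside the kernel. The following is a minimal sketch, not the actual assembly handler: it assumes region 7 starts at 0xE000000000000000 and is identity mapped, and that the per-cpu page is 64 KB and sits at the very top of the address space. The function name and all constants are illustrative assumptions.

```python
# Hypothetical constants for illustration; real values depend on kernel configuration.
MASK64 = (1 << 64) - 1
REGION7_BASE = 0xE000000000000000           # start of region 7 (top three address bits set)
PERCPU_PAGE_SIZE = 1 << 16                  # assumed 64 KB per-cpu page
PERCPU_ADDR = (-PERCPU_PAGE_SIZE) & MASK64  # per-cpu area at the very top of region 7


def alt_dtlb_miss_translate(vaddr, percpu_phys_base):
    """Return the physical address a region 7 data TLB miss would map to."""
    if vaddr < REGION7_BASE or vaddr > MASK64:
        raise ValueError("not a region 7 address")
    if vaddr >= PERCPU_ADDR:
        # Special case: use the per-cpu physical base that the kernel keeps in a
        # kernel register (IA64_KR(PER_CPU_DATA) in the commit message).
        return percpu_phys_base + (vaddr - PERCPU_ADDR)
    # Ordinary case: region 7 is assumed identity mapped, so strip the region bits.
    return vaddr - REGION7_BASE
```

Because the per-cpu check is a single address comparison, it is plausible that it adds no visible latency to the handler, which matches the 23-cycle measurement quoted in the message.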
    • C
      [IA64] remove per-cpu ia64_phys_stacked_size_p8 · a0776ec8
      Chen, Kenneth W authored
      It's not efficient to use a per-cpu variable just to store
      how many physical stacked registers a cpu has.  Ever since the
      incarnation of ia64 up to the upcoming Montecito processor, that
      variable has been "glued" to 96. Having a variable in memory means
      that the kernel is burning an extra cacheline access on every
      syscall and kernel exit path.  Such a "static" value is better
      served by the instruction patching utility that exists today.
      Convert ia64_phys_stacked_size_p8 into dynamic insn patching.
      
      This also has a pleasant side effect of eliminating access to
      per-cpu area while psr.ic=0 in the kernel exit path. (fixable
      for per-cpu DTC work, but why bother?)
      
      There are some concerns about the default value encoded in the
      instruction in the kernel image.  They are unfounded.
      The reasons are:
      
      (1) cpu_init() is called at CPU initialization.  In there, we
          find out physical stack register size from PAL and patch
          two instructions in kernel exit code.  The code in question
          can not be executed before the patching is done.
      
      (2) current implementation stores zero in ia64_phys_stacked_size_p8,
          and that's what the current kernel exit path loads the value with.
          With the new code, it is equivalent that we store reg size 96
          in ia64_phys_stacked_size_p8, thus creating a better safety net.
          Given (1) above can never fail, having (2) is just a bonus.
      
      All in all, this patch allows one less memory reference in the kernel
      exit path, thus reducing syscall and interrupt return latency, and
      avoids evicting potentially useful data from the CPU cache.
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
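
The idea of replacing a memory load with a boot-time patched immediate can be illustrated with a toy model. This is only a sketch under stated assumptions: instructions are modelled as (opcode, immediate) tuples, the opcode names and patch-site list are hypothetical, and the "_p8" suffix is assumed to mean the register count times 8 bytes, plus 8.

```python
DEFAULT_PHYS_STACKED = 96  # value the commit message says has held on every ia64 CPU so far


def phys_stacked_size_p8(num_phys_stacked):
    # Assumption: "_p8" means the size in bytes (8 per register) plus 8.
    return num_phys_stacked * 8 + 8


def patch_immediates(code, patch_sites, value):
    """Return a copy of `code` with the immediate at each patch site replaced."""
    patched = list(code)
    for index in patch_sites:
        opcode, _old_immediate = patched[index]
        patched[index] = (opcode, value)
    return patched
```

In this model, CPU initialization would call `patch_immediates` once with the value reported by firmware, after which the exit path reads the constant straight from the instruction stream instead of from per-cpu memory.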
  2. 05 Feb, 2007 1 commit
  3. 04 Feb, 2007 2 commits
  4. 03 Feb, 2007 1 commit
  5. 02 Feb, 2007 3 commits
  6. 31 Jan, 2007 4 commits
  7. 30 Jan, 2007 3 commits
  8. 29 Jan, 2007 2 commits
  9. 28 Jan, 2007 2 commits
  10. 27 Jan, 2007 8 commits
    • D
      [SPARC64]: Set g4/g5 properly in sun4v dtlb-prot handling. · 86d43258
      David S. Miller authored
      Mirror the logic in the sun4u handler: we have to update
      both registers even when we branch out to window fault
      fixup handling.
      
      The way it works is that if we are in etrap processing a
      fault already, g4/g5 holds the original fault information.
      If we take a window spill fault while doing etrap, then
      we put the window spill fault info into g4/g5 and this is
      what the top-level fault handler ends up processing first.
      
      Then we retry the originally faulting instruction, and
      process the original fault at that time.
      
      This is all necessary because of how constrained the trap
      registers are in these code paths.  These cases trigger
      very rarely, so even if there is some performance implication
      it doesn't happen very often.  In fact the rarity is why
      it took so long to trigger and find this particular bug.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Fix Maple PATA IRQ assignment. · 8cdf92a9
      David Woodhouse authored
      On the Maple board, the AMD8111 IDE is in legacy mode... except that it
      appears on IRQ 20 instead of IRQ 15. For drivers/ide this was handled by
      the architecture's "pci_get_legacy_ide_irq()" function, but in libata we
      just hard-code the numbers 14 and 15.
      
      This patch provides asm-powerpc/libata-portmap.h which maps the IRQ as
      appropriate, having added a pci_dev argument to the
      ATA_{PRIM,SECOND}ARY_IRQ macros.
      
      There's probably a better way to do this -- especially if we observe
      that the _only_ case in which this seemingly-generic
      "pci_get_legacy_ide_irq()" function returns anything other than 14 and
      15 for primary and secondary respectively is the case of the AMD8111 on
      the Maple board -- couldn't we handle that with a special case in the
      pata_amd driver, or perhaps with a PCI quirk for Maple to switch it into
      native mode during early boot and assign resources properly?
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
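
The mapping this commit describes amounts to "use 14 and 15 unless the platform says otherwise". The sketch below models that lookup only; it is not the contents of asm-powerpc/libata-portmap.h. The function name and the override dictionary are hypothetical, and treating the Maple case as "secondary channel on IRQ 20" is one reading of the message above.

```python
DEFAULT_LEGACY_IRQS = {0: 14, 1: 15}  # primary, secondary


def ata_legacy_irq(channel, platform_overrides=None):
    """Pick the legacy IDE IRQ for a channel, honouring a platform override."""
    if channel not in DEFAULT_LEGACY_IRQS:
        raise ValueError("channel must be 0 (primary) or 1 (secondary)")
    overrides = platform_overrides or {}
    return overrides.get(channel, DEFAULT_LEGACY_IRQS[channel])
```

Calling `ata_legacy_irq(1, {1: 20})` models a board with a relocated secondary IRQ, while every other platform passes no overrides and keeps the hard-coded pair.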
    • J
      [PATCH] Fix UML on non-standard VM split hosts · fe33f6f1
      Jeff Dike authored
      This fixes UML on hosts with non-standard VM splits.  We had changed the
      config variable that controls UML behavior on such hosts, but not
      propagated the change everywhere.  In particular, the values of STUB_CODE
      and STUB_DATA relied on the old variable.
      
      I also reformatted the HOST_VMSPLIT_3G help to make it more standard.
      
      Spotted by uml@flonatel.org.
      Signed-off-by: Jeff Dike <jdike@addtoit.com>
      Cc: Blaisorblade <blaisorblade@yahoo.it>
      Cc: Pravin <shindepravin@gmail.com>
      Cc: <uml@flonatel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      [PATCH] x86_64 ia32 vDSO: define arch_vma_name · c633090e
      Roland McGrath authored
      This patch makes x86_64 define arch_vma_name for CONFIG_IA32_EMULATION.  This
      makes the ia32 vDSO mapping appear in /proc/PID/maps with "[vdso]" for ia32
      processes, as it does on native i386.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
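
One way to observe the effect of this change from user space is to look for the "[vdso]" name in a process's maps file. The helper below is a small sketch for that purpose; the function name is made up here, and it assumes the usual six-field layout of /proc/PID/maps lines (address range, permissions, offset, device, inode, name).

```python
def vdso_mappings(maps_text):
    """Return (start, end) address pairs for lines named "[vdso]"."""
    found = []
    for line in maps_text.splitlines():
        fields = line.split()
        # A named mapping has six fields: range, perms, offset, dev, inode, name.
        if len(fields) >= 6 and fields[5] == "[vdso]":
            start, end = (int(part, 16) for part in fields[0].split("-"))
            found.append((start, end))
    return found
```

On a Linux host it could be used as `vdso_mappings(open("/proc/self/maps").read())`; an empty result would mean no mapping carries that name.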
    • R
      [PATCH] powerpc vDSO: use VM_ALWAYSDUMP · 3a0cfadb
      Roland McGrath authored
      This patch fixes core dumps to include the vDSO vma, which is left out now.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
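
The role of VM_ALWAYSDUMP in these vDSO commits can be pictured as a flag that short-circuits the core dump filter. The following sketch is a model of that idea rather than the kernel's code: the flag bit values are hypothetical, and the other filters (and their order) are assumed purely for illustration.

```python
# Hypothetical flag bits; the kernel's real values differ.
VM_IO = 0x1
VM_SHARED = 0x2
VM_ALWAYSDUMP = 0x4


def should_dump(vm_flags):
    """Decide whether a mapping's contents go into the core file."""
    if vm_flags & VM_ALWAYSDUMP:
        # The flag wins over every other filter, so the vDSO is always included.
        return True
    if vm_flags & (VM_IO | VM_SHARED):
        # Assumed example filters that would otherwise exclude a mapping.
        return False
    return True
```

Setting one flag on the vDSO vma is what lets the architecture code drop its special-case core writing macros, since the generic dump path then makes the decision.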
    • R
      [PATCH] x86_64 ia32 vDSO: use VM_ALWAYSDUMP · e03f0ca1
      Roland McGrath authored
      This patch fixes ia32 core dumps on x86_64 to include just one phdr for the
      vDSO vma.  Currently it writes a confused format with two phdrs for the
      address, one without contents and one with.  This patch removes the
      special-case core writing macros for the ia32 vDSO.  Instead, it uses
      VM_ALWAYSDUMP in the vma.  This changes core dumps so they no longer include
      the non-PT_LOAD phdrs from the vDSO, consistent with fixed native i386 core
      dumps.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      [PATCH] i386 vDSO: use VM_ALWAYSDUMP · f47aef55
      Roland McGrath authored
      This patch fixes core dumps to include the vDSO vma, which is left out now.
      It removes the special-case core writing macros, which were not doing the
      right thing for the vDSO vma anyway.  Instead, it uses VM_ALWAYSDUMP in the
      vma; there is no need for the fixmap page to be installed.  It handles the
      CONFIG_COMPAT_VDSO case by making elf_core_dump use the fake vma from
      get_gate_vma after real vmas in the same way the /proc/PID/maps code does.
      
      This changes core dumps so they no longer include the non-PT_LOAD phdrs from
      the vDSO.  I made the change to add them in the first place, but it turned out
      that nothing ever wanted them there since the advent of NT_AUXV.  It's cleaner
      to leave them out, and just let the phdrs inside the vDSO image speak for
      themselves.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      [PATCH] Fix CONFIG_COMPAT_VDSO · a1f3bb9a
      Roland McGrath authored
      I wouldn't mind if CONFIG_COMPAT_VDSO went away entirely.  But if it's there,
      it should work properly.  Currently it's quite haphazard: both real vma and
      fixmap are mapped, both are put in the two different AT_* slots, sysenter
      returns to the vma address rather than the fixmap address, and core dumps are
      yet another story.
      
      This patch makes CONFIG_COMPAT_VDSO disable the real vma and use the fixmap
      area consistently.  This makes it actually compatible with what the old vdso
      implementation did.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 26 Jan, 2007 4 commits
  12. 25 Jan, 2007 5 commits
  13. 24 Jan, 2007 3 commits