1. 05 12月, 2009 1 次提交
    • C
      PCI: add pci_request_acs · 5d990b62
      Chris Wright 提交于
      Commit ae21ee65 "PCI: acs p2p upsteram
      forwarding enabling" doesn't actually enable ACS.
      
      Add a function to pci core to allow an IOMMU to request that ACS
      be enabled.  The existing mechanism of using iommu_found() in the pci
      core to know when ACS should be enabled doesn't actually work due to
      initialization order;  iommu has only been detected not initialized.
      
      Have Intel and AMD IOMMUs request ACS, and Xen does as well during early
      init of dom0.
      
      Cc: Allen Kay <allen.m.kay@intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: NChris Wright <chrisw@sous-sol.org>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      5d990b62
  2. 04 12月, 2009 8 次提交
  3. 17 11月, 2009 1 次提交
    • H
      x86, mm: Clean up and simplify NX enablement · 4763ed4d
      H. Peter Anvin 提交于
      The 32- and 64-bit code used very different mechanisms for enabling
      NX, but even the 32-bit code was enabling NX in head_32.S if it is
      available.  Furthermore, we had a bewildering collection of tests for
      the available of NX.
      
      This patch:
      
      a) merges the 32-bit set_nx() and the 64-bit check_efer() function
         into a single x86_configure_nx() function.  EFER control is left
         to the head code.
      
      b) eliminates the nx_enabled variable entirely.  Things that need to
         test for NX enablement can verify __supported_pte_mask directly,
         and cpu_has_nx gives the supported status of NX.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Vegard Nossum <vegardno@ifi.uio.no>
      Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      LKML-Reference: <1258154897-6770-5-git-send-email-hpa@zytor.com>
      Acked-by: NKees Cook <kees.cook@canonical.com>
      4763ed4d
  4. 05 11月, 2009 1 次提交
  5. 04 11月, 2009 2 次提交
  6. 29 10月, 2009 1 次提交
    • T
      percpu: make percpu symbols in xen unique · c6e22f9e
      Tejun Heo 提交于
      This patch updates percpu related symbols in xen such that percpu
      symbols are unique and don't clash with local symbols.  This serves
      two purposes of decreasing the possibility of global percpu symbol
      collision and allowing dropping per_cpu__ prefix from percpu symbols.
      
      * arch/x86/xen/smp.c, arch/x86/xen/time.c, arch/ia64/xen/irq_xen.c:
        add xen_ prefix to percpu variables
      
      * arch/ia64/xen/time.c: add xen_ prefix to percpu variables, drop
        processed_ prefix and make them static
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      c6e22f9e
  7. 28 10月, 2009 1 次提交
  8. 02 10月, 2009 1 次提交
  9. 24 9月, 2009 1 次提交
  10. 22 9月, 2009 1 次提交
  11. 16 9月, 2009 1 次提交
  12. 10 9月, 2009 3 次提交
    • Y
      xen: use stronger barrier after unlocking lock · 2496afbf
      Yang Xiaowei 提交于
      We need to have a stronger barrier between releasing the lock and
      checking for any waiting spinners.  A compiler barrier is not sufficient
      because the CPU's ordering rules do not prevent the read xl->spinners
      from happening before the unlock assignment, as they are different
      memory locations.
      
      We need to have an explicit barrier to enforce the write-read ordering
      to different memory locations.
      
      Because of it, I can't bring up > 4 HVM guests on one SMP machine.
      
      [ Code and commit comments expanded -J ]
      
      [ Impact: avoid deadlock when using Xen PV spinlocks ]
      Signed-off-by: NYang Xiaowei <xiaowei.yang@intel.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      2496afbf
    • J
      xen: only enable interrupts while actually blocking for spinlock · 4d576b57
      Jeremy Fitzhardinge 提交于
      Where possible we enable interrupts while waiting for a spinlock to
      become free, in order to reduce big latency spikes in interrupt handling.
      
      However, at present if we manage to pick up the spinlock just before
      blocking, we'll end up holding the lock with interrupts enabled for a
      while.  This will cause a deadlock if we recieve an interrupt in that
      window, and the interrupt handler tries to take the lock too.
      
      Solve this by shrinking the interrupt-enabled region to just around the
      blocking call.
      
      [ Impact: avoid race/deadlock when using Xen PV spinlocks ]
      Reported-by: N"Yang, Xiaowei" <xiaowei.yang@intel.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      4d576b57
    • J
      xen: make -fstack-protector work under Xen · 577eebea
      Jeremy Fitzhardinge 提交于
      -fstack-protector uses a special per-cpu "stack canary" value.
      gcc generates special code in each function to test the canary to make
      sure that the function's stack hasn't been overrun.
      
      On x86-64, this is simply an offset of %gs, which is the usual per-cpu
      base segment register, so setting it up simply requires loading %gs's
      base as normal.
      
      On i386, the stack protector segment is %gs (rather than the usual kernel
      percpu %fs segment register).  This requires setting up the full kernel
      GDT and then loading %gs accordingly.  We also need to make sure %gs is
      initialized when bringing up secondary cpus too.
      
      To keep things consistent, we do the full GDT/segment register setup on
      both architectures.
      
      Because we need to avoid -fstack-protected code before setting up the GDT
      and because there's no way to disable it on a per-function basis, several
      files need to have stack-protector inhibited.
      
      [ Impact: allow Xen booting with stack-protector enabled ]
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      577eebea
  13. 01 9月, 2009 1 次提交
    • H
      x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT · 0cc0213e
      H. Peter Anvin 提交于
      For some reason, the _safe MSR functions returned -EFAULT, not -EIO.
      However, the only user which cares about the return code as anything
      other than a boolean is the MSR driver, which wants -EIO.  Change it
      to -EIO across the board.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      0cc0213e
  14. 31 8月, 2009 8 次提交
  15. 27 8月, 2009 1 次提交
  16. 26 8月, 2009 2 次提交
  17. 20 8月, 2009 1 次提交
  18. 16 5月, 2009 1 次提交
    • J
      x86: Fix performance regression caused by paravirt_ops on native kernels · b4ecc126
      Jeremy Fitzhardinge 提交于
      Xiaohui Xin and some other folks at Intel have been looking into what's
      behind the performance hit of paravirt_ops when running native.
      
      It appears that the hit is entirely due to the paravirtualized
      spinlocks introduced by:
      
       | commit 8efcbab6
       | Date:   Mon Jul 7 12:07:51 2008 -0700
       |
       |     paravirt: introduce a "lock-byte" spinlock implementation
      
      The extra call/return in the spinlock path is somehow
      causing an increase in the cycles/instruction of somewhere around 2-7%
      (seems to vary quite a lot from test to test).  The working theory is
      that the CPU's pipeline is getting upset about the
      call->call->locked-op->return->return, and seems to be failing to
      speculate (though I haven't seen anything definitive about the precise
      reasons).  This doesn't entirely make sense, because the performance
      hit is also visible on unlock and other operations which don't involve
      locked instructions.  But spinlock operations clearly swamp all the
      other pvops operations, even though I can't imagine that they're
      nearly as common (there's only a .05% increase in instructions
      executed).
      
      If I disable just the pv-spinlock calls, my tests show that pvops is
      identical to non-pvops performance on native (my measurements show that
      it is actually about .1% faster, but Xiaohui shows a .05% slowdown).
      
      Summary of results, averaging 10 runs of the "mmperf" test, using a
      no-pvops build as baseline:
      
      		nopv		Pv-nospin	Pv-spin
      CPU cycles	100.00%		99.89%		102.18%
      instructions	100.00%		100.10%		100.15%
      CPI		100.00%		99.79%		102.03%
      cache ref	100.00%		100.84%		100.28%
      cache miss	100.00%		90.47%		88.56%
      cache miss rate	100.00%		89.72%		88.31%
      branches	100.00%		99.93%		100.04%
      branch miss	100.00%		103.66%		107.72%
      branch miss rt	100.00%		103.73%		107.67%
      wallclock	100.00%		99.90%		102.20%
      
      The clear effect here is that the 2% increase in CPI is
      directly reflected in the final wallclock time.
      
      (The other interesting effect is that the more ops are
      out of line calls via pvops, the lower the cache access
      and miss rates.  Not too surprising, but it suggests that
      the non-pvops kernel is over-inlined.  On the flipside,
      the branch misses go up correspondingly...)
      
      So, what's the fix?
      
      Paravirt patching turns all the pvops calls into direct calls, so
      _spin_lock etc do end up having direct calls.  For example, the compiler
      generated code for paravirtualized _spin_lock is:
      
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq  *0xffffffff805a5b30
      <_spin_lock+22>:	retq
      
      The indirect call will get patched to:
      <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
      <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
      <_spin_lock+15>:	callq <__ticket_spin_lock>
      <_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
      <_spin_lock+22>:	retq
      
      One possibility is to inline _spin_lock, etc, when building an
      optimised kernel (ie, when there's no spinlock/preempt
      instrumentation/debugging enabled).  That will remove the outer
      call/return pair, returning the instruction stream to a single
      call/return, which will presumably execute the same as the non-pvops
      case.  The downsides arel 1) it will replicate the
      preempt_disable/enable code at eack lock/unlock callsite; this code is
      fairly small, but not nothing; and 2) the spinlock definitions are
      already a very heavily tangled mass of #ifdefs and other preprocessor
      magic, and making any changes will be non-trivial.
      
      The other obvious answer is to disable pv-spinlocks.  Making them a
      separate config option is fairly easy, and it would be trivial to
      enable them only when Xen is enabled (as the only non-default user).
      But it doesn't really address the common case of a distro build which
      is going to have Xen support enabled, and leaves the open question of
      whether the native performance cost of pv-spinlocks is worth the
      performance improvement on a loaded Xen system (10% saving of overall
      system CPU when guests block rather than spin).  Still it is a
      reasonable short-term workaround.
      
      [ Impact: fix pvops performance regression when running native ]
      Analysed-by: N"Xin Xiaohui" <xiaohui.xin@intel.com>
      Analysed-by: N"Li Xin" <xin.li@intel.com>
      Analysed-by: N"Nakajima Jun" <jun.nakajima@intel.com>
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      LKML-Reference: <4A0B62F7.5030802@goop.org>
      [ fixed the help text ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b4ecc126
  19. 13 5月, 2009 1 次提交
  20. 09 5月, 2009 3 次提交