1. 06 5月, 2012 1 次提交
    • G
      KVM: ensure async PF event wakes up vcpu from halt · a4fa1635
      Gleb Natapov 提交于
      If vcpu executes hlt instruction while async PF is waiting to be delivered
      vcpu can block and deliver async PF only after another even wakes it
      up. This happens because kvm_check_async_pf_completion() will remove
      completion event from vcpu->async_pf.done before entering kvm_vcpu_block()
      and this will make kvm_arch_vcpu_runnable() return false. The solution
      is to make vcpu runnable when processing completion.
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      a4fa1635
  2. 21 4月, 2012 4 次提交
  3. 20 4月, 2012 1 次提交
  4. 19 4月, 2012 1 次提交
    • A
      KVM: VMX: Fix kvm_set_shared_msr() called in preemptible context · 2225fd56
      Avi Kivity 提交于
      kvm_set_shared_msr() may not be called in preemptible context,
      but vmx_set_msr() does so:
      
        BUG: using smp_processor_id() in preemptible [00000000] code: qemu-kvm/22713
        caller is kvm_set_shared_msr+0x32/0xa0 [kvm]
        Pid: 22713, comm: qemu-kvm Not tainted 3.4.0-rc3+ #39
        Call Trace:
         [<ffffffff8131fa82>] debug_smp_processor_id+0xe2/0x100
         [<ffffffffa0328ae2>] kvm_set_shared_msr+0x32/0xa0 [kvm]
         [<ffffffffa03a103b>] vmx_set_msr+0x28b/0x2d0 [kvm_intel]
         ...
      
      Making kvm_set_shared_msr() work in preemptible is cleaner, but
      it's used in the fast path.  Making two variants is overkill, so
      this patch just disables preemption around the call.
      Reported-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      2225fd56
  5. 18 4月, 2012 1 次提交
  6. 16 4月, 2012 2 次提交
    • M
      x86: Handle failures of parsing immediate operands in the instruction decoder · 6c7b8e82
      Masami Hiramatsu 提交于
      This can happen if the instruction is much longer than the maximum length,
      or if insn->opnd_bytes is manually changed.
      
      This patch also fixes warnings from -Wswitch-default flag.
      Reported-by: NPrashanth Nageshappa <prashanth@linux.vnet.ibm.com>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120413032427.32577.42602.stgit@localhost.localdomainSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6c7b8e82
    • L
      x86-32: fix up strncpy_from_user() sign error · 12e993b8
      Linus Torvalds 提交于
      The 'max' range needs to be unsigned, since the size of the user address
      space is bigger than 2GB.
      
      We know that 'count' is positive in 'long' (that is checked in the
      caller), so we will truncate 'max' down to something that fits in a
      signed long, but before we actually do that, that comparison needs to be
      done in unsigned.
      
      Bug introduced in commit 92ae03f2 ("x86: merge 32/64-bit versions of
      'strncpy_from_user()' and speed it up").  On x86-64 you can't trigger
      this, since the user address space is much smaller than 63 bits, and on
      x86-32 it works in practice, since you would seldom hit the strncpy
      limits anyway.
      
      I had actually tested the corner-cases, I had only tested them on
      x86-64.  Besides, I had only worried about the case of a pointer *close*
      to the end of the address space, rather than really far away from it ;)
      
      This also changes the "we hit the user-specified maximum" to return
      'res', for the trivial reason that gcc seems to generate better code
      that way.  'res' and 'count' are the same in that case, so it really
      doesn't matter which one we return.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      12e993b8
  7. 12 4月, 2012 1 次提交
    • L
      x86: merge 32/64-bit versions of 'strncpy_from_user()' and speed it up · 92ae03f2
      Linus Torvalds 提交于
      This merges the 32- and 64-bit versions of the x86 strncpy_from_user()
      by just rewriting it in C rather than the ancient inline asm versions
      that used lodsb/stosb and had been duplicated for (trivial) differences
      between the 32-bit and 64-bit versions.
      
      While doing that, it also speeds them up by doing the accesses a word at
      a time.  Finally, the new routines also properly handle the case of
      hitting the end of the address space, which we have never done correctly
      before (fs/namei.c has a hack around it for that reason).
      
      Despite all these improvements, it actually removes more lines than it
      adds, due to the de-duplication.  Also, we no longer export (or define)
      the legacy __strncpy_from_user() function (that was defined to not do
      the user permission checks), since it's not actually used anywhere, and
      the user address space checks are built in to the new code.
      
      Other architecture maintainers have been notified that the old hack in
      fs/namei.c will be going away in the 3.5 merge window, in case they
      copied the x86 approach of being a bit cavalier about the end of the
      address space.
      
      Cc: linux-arch@vger.kernel.org
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      92ae03f2
  8. 10 4月, 2012 4 次提交
  9. 07 4月, 2012 5 次提交
    • L
      Make the "word-at-a-time" helper functions more commonly usable · f68e556e
      Linus Torvalds 提交于
      I have a new optimized x86 "strncpy_from_user()" that will use these
      same helper functions for all the same reasons the name lookup code uses
      them.  This is preparation for that.
      
      This moves them into an architecture-specific header file.  It's
      architecture-specific for two reasons:
      
       - some of the functions are likely to want architecture-specific
         implementations.  Even if the current code happens to be "generic" in
         the sense that it should work on any little-endian machine, it's
         likely that the "multiply by a big constant and shift" implementation
         is less than optimal for an architecture that has a guaranteed fast
         bit count instruction, for example.
      
       - I expect that if architectures like sparc want to start playing
         around with this, we'll need to abstract out a few more details (in
         particular the actual unaligned accesses).  So we're likely to have
         more architecture-specific stuff if non-x86 architectures start using
         this.
      
         (and if it turns out that non-x86 architectures don't start using
         this, then having it in an architecture-specific header is still the
         right thing to do, of course)
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f68e556e
    • H
      x86: Use correct byte-sized register constraint in __add() · 8c91c532
      H. Peter Anvin 提交于
      Similar to:
      
       2ca052a3 x86: Use correct byte-sized register constraint in __xchg_op()
      
      ... the __add() macro also needs to use a "q" constraint in the
      byte-sized case, lest we try to generate an illegal register.
      
      Link: http://lkml.kernel.org/r/4F7A3315.501@goop.orgSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Leigh Scott <leigh123linux@googlemail.com>
      Cc: Thomas Reitmayr <treitmayr@devbase.at>
      Cc: <stable@vger.kernel.org> v3.3
      8c91c532
    • J
      x86: Use correct byte-sized register constraint in __xchg_op() · 2ca052a3
      Jeremy Fitzhardinge 提交于
      x86-64 can access the low half of any register, but i386 can only do
      it with a subset of registers.  'r' causes compilation failures on i386,
      but 'q' expresses the constraint properly.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@goop.org>
      Link: http://lkml.kernel.org/r/4F7A3315.501@goop.orgReported-by: NLeigh Scott <leigh123linux@googlemail.com>
      Tested-by: NThomas Reitmayr <treitmayr@devbase.at>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: <stable@vger.kernel.org> v3.3
      2ca052a3
    • S
      xen/smp: Remove unnecessary call to smp_processor_id() · e8c9e788
      Srivatsa S. Bhat 提交于
      There is an extra and unnecessary call to smp_processor_id()
      in cpu_bringup(). Remove it.
      Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      e8c9e788
    • K
      xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries' · 2531d64b
      Konrad Rzeszutek Wilk 提交于
      The above mentioned patch checks the IOAPIC and if it contains
      -1, then it unmaps said IOAPIC. But under Xen we get this:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
      IP: [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0
      PGD 0
      Oops: 0002 [#1] SMP
      CPU 0
      Modules linked in:
      
      Pid: 1, comm: swapper/0 Not tainted 3.2.10-3.fc16.x86_64 #1 Dell Inc. Inspiron
      1525                  /0U990C
      RIP: e030:[<ffffffff8134e51f>]  [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0
      RSP: e02b: ffff8800d42cbb70  EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 00000000ffffffef RCX: 0000000000000001
      RDX: 0000000000000040 RSI: 00000000ffffffef RDI: 0000000000000001
      RBP: ffff8800d42cbb80 R08: ffff8800d6400000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffef
      R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000010
      FS:  0000000000000000(0000) GS:ffff8800df5fe000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0:000000008005003b
      CR2: 0000000000000040 CR3: 0000000001a05000 CR4: 0000000000002660
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper/0 (pid: 1, threadinfo ffff8800d42ca000, task ffff8800d42d0000)
      Stack:
       00000000ffffffef 0000000000000010 ffff8800d42cbbe0 ffffffff8134f157
       ffffffff8100a9b2 ffffffff8182ffd1 00000000000000a0 00000000829e7384
       0000000000000002 0000000000000010 00000000ffffffff 0000000000000000
      Call Trace:
       [<ffffffff8134f157>] xen_bind_pirq_gsi_to_irq+0x87/0x230
       [<ffffffff8100a9b2>] ? check_events+0x12+0x20
       [<ffffffff814bab42>] xen_register_pirq+0x82/0xe0
       [<ffffffff814bac1a>] xen_register_gsi.part.2+0x4a/0xd0
       [<ffffffff814bacc0>] acpi_register_gsi_xen+0x20/0x30
       [<ffffffff8103036f>] acpi_register_gsi+0xf/0x20
       [<ffffffff8131abdb>] acpi_pci_irq_enable+0x12e/0x202
       [<ffffffff814bc849>] pcibios_enable_device+0x39/0x40
       [<ffffffff812dc7ab>] do_pci_enable_device+0x4b/0x70
       [<ffffffff812dc878>] __pci_enable_device_flags+0xa8/0xf0
       [<ffffffff812dc8d3>] pci_enable_device+0x13/0x20
      
      The reason we are dying is b/c the call acpi_get_override_irq() is used,
      which returns the polarity and trigger for the IRQs. That function calls
      mp_find_ioapics to get the 'struct ioapic' structure - which along with the
      mp_irq[x] is used to figure out the default values and the polarity/trigger
      overrides. Since the mp_find_ioapics now returns -1 [b/c the IOAPIC is filled
      with 0xffffffff], the acpi_get_override_irq() stops trying to lookup in the
      mp_irq[x] the proper INT_SRV_OVR and we can't install the SCI interrupt.
      
      The proper fix for this is going in v3.5 and adds an x86_io_apic_ops
      struct so that platforms can override it. But for v3.4 lets carry this
      work-around. This patch does that by providing a slightly different variant
      of the fake IOAPIC entries.
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      2531d64b
  10. 06 4月, 2012 5 次提交
  11. 04 4月, 2012 1 次提交
  12. 03 4月, 2012 1 次提交
  13. 02 4月, 2012 1 次提交
  14. 31 3月, 2012 1 次提交
  15. 30 3月, 2012 5 次提交
  16. 29 3月, 2012 6 次提交
    • L
      x86: Preserve lazy irq disable semantics in fixup_irqs() · 99dd5497
      Liu, Chuansheng 提交于
      The default irq_disable() sematics are to mark the interrupt disabled,
      but keep it unmasked. If the interrupt is delivered while marked
      disabled, the low level interrupt handler masks it and marks it
      pending. This is important for detecting wakeup interrupts during
      suspend and for edge type interrupts to avoid losing interrupts.
      
      fixup_irqs() moves the interrupts away from an offlined cpu. For
      certain interrupt types it needs to mask the interrupt line before
      changing the affinity. After affinity has changed the interrupt line
      is unmasked again, but only if it is not marked disabled.
      
      This breaks the lazy irq disable semantics and causes problems in
      suspend as the interrupt can be lost or wakeup functionality is
      broken.
      
      Check irqd_irq_masked() instead of irqd_irq_disabled() because
      irqd_irq_masked() is only set, when the core code actually masked the
      interrupt line. If it's not set, we unmask the interrupt and let the
      lazy irq disable logic deal with an eventually incoming interrupt.
      
      [ tglx: Massaged changelog and added a comment ]
      Signed-off-by: Nliu chuansheng <chuansheng.liu@intel.com>
      Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
      Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A05DFB3@SHSMSX101.ccr.corp.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      99dd5497
    • R
      documentation: remove references to cpu_*_map. · 5f054e31
      Rusty Russell 提交于
      This has been obsolescent for a while, fix documentation and
      misc comments.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      5f054e31
    • D
      kdump x86: fix total mem size calculation for reservation · 09c71bfd
      Dave Young 提交于
      crashkernel reservation need know the total memory size.  Current
      get_total_mem simply use max_pfn - min_low_pfn.  It is wrong because it
      will including memory holes in the middle.
      
      Especially for kvm guest with memory > 0xe0000000, there's below in qemu
      code: qemu split memory as below:
      
          if (ram_size >= 0xe0000000 ) {
              above_4g_mem_size = ram_size - 0xe0000000;
              below_4g_mem_size = 0xe0000000;
          } else {
              below_4g_mem_size = ram_size;
          }
      
      So for 4G mem guest, seabios will insert a 512M usable region beyond of
      4G.  Thus in above case max_pfn - min_low_pfn will be more than original
      memsize.
      
      Fixing this issue by using memblock_phys_mem_size() to get the total
      memsize.
      Signed-off-by: NDave Young <dyoung@redhat.com>
      Reviewed-by: NWANG Cong <xiyou.wangcong@gmail.com>
      Reviewed-by: NSimon Horman <horms@verge.net.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09c71bfd
    • R
      x86/apic/amd: Be more verbose about LVT offset assignments · 8abc3122
      Robert Richter 提交于
      Add information about LVT offset assignments to better debug firmware
      bugs related to this. See following examples.
      
       # dmesg | grep -i 'offset\|ibs'
       LVT offset 0 assigned for vector 0xf9
       [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
       [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0000000000000100)
       Failed to setup IBS, -22
      
      In this case the BIOS assigns both offsets for MCE (0xf9) and IBS
      (0x400) vectors to offset 0, which is why the second APIC setup (IBS)
      failed.
      
      With correct setup you get:
      
       # dmesg | grep -i 'offset\|ibs'
       LVT offset 0 assigned for vector 0xf9
       LVT offset 1 assigned for vector 0x400
       IBS: LVT offset 1 assigned
       perf: AMD IBS detected (0x00000007)
       oprofile: AMD IBS detected (0x00000007)
      
      Note: The vector includes also the message type to handle also NMIs
      (0x400). In the firmware bug message the format is the same as of the
      APIC500 register and includes the mask bit (bit 16) in addition.
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      8abc3122
    • D
      Delete all instances of asm/system.h · 141124c0
      David Howells 提交于
      Delete all instances of asm/system.h as they should be redundant by this
      point.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      141124c0
    • D
      Move all declarations of free_initmem() to linux/mm.h · 49a7f04a
      David Howells 提交于
      Move all declarations of free_initmem() to linux/mm.h so that there's only one
      and it's used by everything.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cc: linux-c6x-dev@linux-c6x.org
      cc: microblaze-uclinux@itee.uq.edu.au
      cc: linux-sh@vger.kernel.org
      cc: sparclinux@vger.kernel.org
      cc: x86@kernel.org
      cc: linux-mm@kvack.org
      49a7f04a