1. 06 May 2014, 2 commits
  2. 03 May 2014, 1 commit
    • x86/efi: earlyprintk=efi,keep fix · 5f35eb0e
      Authored by Dave Young
      earlyprintk=efi,keep causes the kernel to hang while freeing initmem,
      like below:
      
        VFS: Mounted root (ext4 filesystem) readonly on device 254:2.
        devtmpfs: mounted
        Freeing unused kernel memory: 880K (ffffffff817d4000 - ffffffff818b0000)
      
      This is caused by the EFI earlyprintk code using __init functions
      that are freed later. For example, early_efi_write is marked __init,
      and it also uses early_ioremap, which is an __init function as well.
      
      To fix this issue, add an early initcall, early_efi_map_fb, which
      maps the whole EFI framebuffer for later use, and add a wrapper
      function, early_efi_map, which calls early_ioremap before ioremap
      is available.
      
      With this patch applied, EFI boots OK with earlyprintk=efi,keep console=efi.
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      5f35eb0e
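
      A minimal sketch of the approach described above. The function names
      follow the commit text; using boot_params.screen_info as the
      framebuffer source and the exact bodies are assumptions, not the
      literal kernel code:

        static void __iomem *efi_fb;    /* set once ioremap() works */

        /* Early initcall: map the whole EFI framebuffer with ioremap() so
         * the console no longer depends on __init early_ioremap() code. */
        static int __init early_efi_map_fb(void)
        {
                unsigned long base = boot_params.screen_info.lfb_base;
                unsigned long size = boot_params.screen_info.lfb_size;

                efi_fb = ioremap(base, size);
                return efi_fb ? 0 : -ENOMEM;
        }
        early_initcall(early_efi_map_fb);

        /* Wrapper: fall back to early_ioremap() until ioremap() is usable. */
        static void __iomem *early_efi_map(unsigned long start, unsigned long len)
        {
                unsigned long base = boot_params.screen_info.lfb_base;

                if (efi_fb)
                        return efi_fb + start;
                return early_ioremap(base + start, len);
        }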
  3. 28 April 2014, 3 commits
    • genirq: x86: Ensure that dynamic irq allocation does not conflict · 62a08ae2
      Authored by Thomas Gleixner
      On x86 the allocation of irq descriptors may allocate interrupts
      which are in the range of the GSI interrupts. That's wrong because
      those interrupts are hardwired and we don't have the irq domain
      translation that PPC has. So one of these interrupts can later be
      hooked up to one of the devices which are hard wired to it, and the
      io_apic init code for that particular interrupt line happily reuses
      that descriptor with a completely different configuration, so hell
      breaks loose.
      
      Inside x86 we allocate dynamic interrupts from above nr_gsi_irqs,
      except for a few usage sites which have not yet blown up in our face
      for whatever reason. But for drivers which need an irq range, like the
      GPIO drivers, we have no limit in place and we don't want to expose
      such a detail to a driver.
      
      To cure this, introduce a function which an architecture can
      implement to impose a lower bound on dynamic interrupt allocations.
      
      Implement it for x86 and set the lower bound to nr_gsi_irqs, which is
      the end of the hardwired interrupt space, so all dynamic allocations
      happen above.
      
      That not only allows the GPIO driver to work sanely, it also protects
      the bogus callsites of create_irq_nr() in hpet, uv, irq_remapping and
      htirq code. They need to be cleaned up as well, but that's a separate
      issue.
      Reported-by: Jin Yao <yao.jin@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Krogerus Heikki <heikki.krogerus@intel.com>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1404241617360.28206@ionos.tec.linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      62a08ae2
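
      A sketch of the hook described above: a weak default in the generic
      irq code plus an x86 override. The nr_gsi_irqs bound is taken from
      the commit text; the exact symbol in the tree may differ:

        /* kernel/irq: weak default, lets an architecture raise the floor
         * for dynamically allocated interrupt numbers. */
        unsigned int __weak arch_dynirq_lower_bound(unsigned int from)
        {
                return from;
        }

        /* arch/x86: keep dynamic allocations above the hardwired GSI space. */
        unsigned int arch_dynirq_lower_bound(unsigned int from)
        {
                return from < nr_gsi_irqs ? nr_gsi_irqs : from;
        }

      The descriptor allocator would then clamp its starting point with
      from = arch_dynirq_lower_bound(from) before searching the bitmap.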
    • KVM: x86: Check for host supported fields in shadow vmcs · fe2b201b
      Authored by Bandan Das
      We track shadow vmcs fields through two static lists, one for
      read-only and another for read/write fields. However, with the
      addition of new vmcs fields, not all fields may be supported on all
      hosts. If so, copy_vmcs12_to_shadow(), trying to vmwrite such a
      field on an unsupported host, will result in a vmwrite error. For
      example, commit 36be0b9d introduced GUEST_BNDCFGS, which is not
      supported by all processors. Filter out host-unsupported fields
      before letting guests use the shadow vmcs.
      Signed-off-by: Bandan Das <bsd@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      fe2b201b
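
      A sketch of the filtering idea: compact the static read/write list
      in place at setup time, dropping fields the host lacks. cpu_has_mpx()
      is a hypothetical feature check standing in for whatever the real
      code uses to gate GUEST_BNDCFGS:

        static void init_vmcs_shadow_fields(void)
        {
                int i, j;

                for (i = j = 0; i < max_shadow_read_write_fields; i++) {
                        switch (shadow_read_write_fields[i]) {
                        case GUEST_BNDCFGS:
                                if (!cpu_has_mpx())     /* hypothetical helper */
                                        continue;       /* host can't vmwrite it */
                                break;
                        default:
                                break;
                        }
                        if (j < i)      /* compact the surviving fields */
                                shadow_read_write_fields[j] = shadow_read_write_fields[i];
                        j++;
                }
                max_shadow_read_write_fields = j;
        }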
    • x86/vsmp: Fix irq routing · 39025ba3
      Authored by Oren Twaig
      Correct the IRQ routing in the case where a vSMP box is detected but
      the Interrupt Routing Comply (IRC) value is set to "comply", a
      combination that previously led to incorrect IRQ routing.
      
      Before the patch:
      
      When a vSMP box was detected and IRC was set to "comply", users (and
      the kernel) couldn't effectively set the destination of the IRQs.
      This is because the hook inside vsmp_64.c always set up all CPUs as
      the IRQ destination, using cpumask_setall() as the return value for
      the IRQ allocation mask. Later, this overridden mask caused the
      kernel to set the IRQ destination to the lowest online CPU in the
      mask (usually CPU0).
      
      After the patch:
      
      When the IRC is set to "comply", users (and the kernel) can control
      the destination of the IRQs as we will not be changing the
      default "apic->vector_allocation_domain".
      Signed-off-by: Oren Twaig <oren@scalemp.com>
      Acked-by: Shai Fultheim <shai@scalemp.com>
      Link: http://lkml.kernel.org/r/1398669697-2123-1-git-send-email-oren@scalemp.com
      [ Minor readability edits. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      39025ba3
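
      A heavily hedged sketch of the before/after behavior. The hook
      signature follows apic->vector_allocation_domain of that era;
      vsmp_irc_comply() is a hypothetical predicate for the IRC value:

        /* Before: always installed, forcing "all CPUs" as the allocation
         * domain and defeating user-set IRQ affinity. */
        static void fill_vector_allocation_domain(int cpu, struct cpumask *mask,
                                                  const struct cpumask *apic_mask)
        {
                cpumask_setall(mask);
        }

        static void __init vsmp_setup_apic(void)
        {
                /* After: when IRC is "comply", leave the default
                 * apic->vector_allocation_domain untouched. */
                if (!vsmp_irc_comply())         /* hypothetical predicate */
                        apic->vector_allocation_domain = fill_vector_allocation_domain;
        }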
  4. 24 April 2014, 1 commit
  5. 22 April 2014, 1 commit
  6. 18 April 2014, 1 commit
  7. 17 April 2014, 2 commits
    • kprobes/x86: Fix page-fault handling logic · 6381c24c
      Authored by Masami Hiramatsu
      The current kprobes in-kernel page fault handler doesn't expect that
      its single-stepping can be interrupted by an NMI handler which may
      itself cause a page fault (e.g. perf with callback tracing).
      
      In that case, the page fault is handled by kprobes, which
      misunderstands it as a fault caused by the single-stepping code and
      tries to recover the IP address to the probed address.
      
      But in truth the page fault was caused by the NMI handler, and
      do_page_fault then fails to handle the real page fault because the
      IP address has been modified, causing kernel BUGs like below:
      
       ----
       [ 2264.726905] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
       [ 2264.727190] IP: [<ffffffff813c46e0>] copy_user_generic_string+0x0/0x40
      
      To handle this correctly, fix the kprobes fault handler to check
      that the faulting IP address lies within its own single-step buffer,
      instead of checking the current kprobe state.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Sandeepa Prabhu <sandeepa.prabhu@linaro.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: fche@redhat.com
      Cc: systemtap@sourceware.org
      Link: http://lkml.kernel.org/r/20140417081644.26341.52351.stgit@ltc230.yrl.intra.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6381c24c
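
      A sketch of the check the message describes. The field names
      (ainsn.insn, MAX_INSN_SIZE) follow the x86 kprobes code, but the
      handler body is illustrative:

        int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
        {
                struct kprobe *cur = kprobe_running();
                unsigned long slot;

                if (!cur)
                        return 0;
                slot = (unsigned long)cur->ainsn.insn;

                /* Only treat this as a single-step fault if the faulting
                 * IP really lies inside our single-step buffer; an NMI
                 * handler's page fault has an unrelated IP and must be
                 * left to do_page_fault(). */
                if (regs->ip < slot || regs->ip >= slot + MAX_INSN_SIZE)
                        return 0;

                regs->ip = (unsigned long)cur->addr;    /* recover to probe address */
                return 1;
        }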
    • x86/mce: Fix CMCI preemption bugs · ea431643
      Authored by Ingo Molnar
      The following commit:
      
        27f6c573 ("x86, CMCI: Add proper detection of end of CMCI storms")
      
      added two preemption bugs:
      
       - machine_check_poll() does a get_cpu_var() without a matching
         put_cpu_var(), which causes preemption imbalance and crashes upon
         bootup.
      
       - it does percpu ops without disabling preemption. Preemption is not
         disabled due to the mistaken use of a raw spinlock.
      
      To fix these bugs, fix the imbalance and change cmci_discover_lock
      to a regular spinlock.
      Reported-by: Owen Kibel <qmewlo@gmail.com>
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Chen, Gong <gong.chen@linux.intel.com>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Todorov <atodorov@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/n/tip-jtjptvgigpfkpvtQxpEk1at2@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      --
       arch/x86/kernel/cpu/mcheck/mce.c       |    4 +---
       arch/x86/kernel/cpu/mcheck/mce_intel.c |   18 +++++++++---------
       2 files changed, 10 insertions(+), 12 deletions(-)
      ea431643
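
      A sketch of the two fixes; the per-cpu variable here is illustrative,
      not the actual one in mce.c:

        static DEFINE_SPINLOCK(cmci_discover_lock);  /* was DEFINE_RAW_SPINLOCK */
        static DEFINE_PER_CPU(unsigned long, cmci_poll_count);  /* illustrative */

        static void cmci_poll_example(void)
        {
                /* get_cpu_var() disables preemption and must be paired
                 * with put_cpu_var(); the broken commit left that out. */
                unsigned long *count = &get_cpu_var(cmci_poll_count);

                spin_lock(&cmci_discover_lock);
                (*count)++;
                spin_unlock(&cmci_discover_lock);

                put_cpu_var(cmci_poll_count);   /* the missing counterpart */
        }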
  8. 16 April 2014, 2 commits
    • x86: Remove the PCI reboot method from the default chain · 5be44a6f
      Authored by Ingo Molnar
      Steve reported a reboot hang and bisected it back to this commit:
      
        a4f1987e x86, reboot: Add EFI and CF9 reboot methods into the default list
      
      He heroically tested all reboot methods and found the following:
      
        reboot=t       # triple fault                  ok
        reboot=k       # keyboard ctrl                 FAIL
        reboot=b       # BIOS                          ok
        reboot=a       # ACPI                          FAIL
        reboot=e       # EFI                           FAIL   [system has no EFI]
        reboot=p       # PCI 0xcf9                     FAIL
      
      And I think it's pretty obvious that we should only try PCI 0xcf9 as a
      last resort - if at all.
      
      The other observation is that (on this box) we should never try
      the PCI reboot method, but close with either the 'triple fault'
      or the 'BIOS' (terminal!) reboot methods.
      
      Thirdly, CF9_COND is a total misnomer - it should be something like
      CF9_SAFE or CF9_CAREFUL, and 'CF9' should be 'CF9_FORCE' ...
      
      So this patch fixes the worst problems:
      
       - it orders the actual reboot logic to follow the reboot ordering
         pattern - it was in a pretty random order before for no good
         reason.
      
       - it fixes the CF9 misnomers and uses BOOT_CF9_FORCE and
         BOOT_CF9_SAFE flags to make the code more obvious.
      
       - it tries the BIOS reboot method before the PCI reboot method.
         (Since 'BIOS' is a terminal reboot method resulting in a hang
          if it does not work, this is essentially equivalent to removing
          the PCI reboot method from the default reboot chain.)
      
       - just for the miraculous possibility of the terminal
         (resulting-in-hang) reboot methods, triple fault or BIOS,
         returning without having done their job, there's an ordering
         between them as well.
      Reported-and-bisected-and-tested-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Aubrey <aubrey.li@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Link: http://lkml.kernel.org/r/20140404064120.GB11877@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5be44a6f
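
      An illustrative sketch of the resulting fallback chain. The helper
      names are made up; the point is the ordering, with BIOS (a terminal
      method) tried before the PCI 0xcf9 method:

        static void emergency_reboot_chain(enum reboot_type type)
        {
                for (;;) {
                        switch (type) {
                        case BOOT_ACPI:
                                acpi_reboot_attempt();     /* made-up helper */
                                type = BOOT_KBD;
                                break;
                        case BOOT_KBD:
                                kbd_reboot_attempt();      /* made-up helper */
                                type = BOOT_TRIPLE;
                                break;
                        case BOOT_TRIPLE:
                                triple_fault();            /* terminal: should not return */
                                type = BOOT_BIOS;
                                break;
                        case BOOT_BIOS:
                                bios_reboot();             /* terminal: should not return */
                                type = BOOT_CF9_SAFE;      /* PCI strictly last resort */
                                break;
                        case BOOT_CF9_SAFE:
                                pci_cf9_reboot_attempt();  /* made-up helper */
                                type = BOOT_TRIPLE;
                                break;
                        default:
                                type = BOOT_ACPI;
                                break;
                        }
                }
        }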
    • xen/spinlock: Don't enable them unconditionally. · e0fc17a9
      Authored by Konrad Rzeszutek Wilk
      The git commit a945928e
      ('xen: Do not enable spinlocks before jump_label_init() has executed')
      was added to deal with the jump-label machinery. Earlier, the code
      that turned on the jump label was only called by Xen-specific
      functions. But now that it has been moved to the initcall machinery,
      it gets called on Xen, KVM, and bare metal - ouch! And the check to
      only call it on Xen was forgotten in the heat of merge-window
      excitement.
      
      This means that the slowpath is enabled on bare metal while it
      should not be.
      Reported-by: Waiman Long <waiman.long@hp.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      CC: stable@vger.kernel.org
      CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      e0fc17a9
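
      A sketch of the fix as described, assuming the xen_pvspin flag and
      the paravirt_ticketlocks_enabled static key from that era's tree:

        static __init int xen_init_spinlocks_jump(void)
        {
                if (!xen_pvspin)
                        return 0;

                if (!xen_domain())      /* the check the merge had lost */
                        return 0;

                static_key_slow_inc(&paravirt_ticketlocks_enabled);
                return 0;
        }
        early_initcall(xen_init_spinlocks_jump);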
  9. 15 April 2014, 7 commits
  10. 14 April 2014, 3 commits
  11. 12 April 2014, 2 commits
  12. 11 April 2014, 4 commits
  13. 10 April 2014, 1 commit
  14. 09 April 2014, 1 commit
  15. 08 April 2014, 4 commits
    • x86: use generic early_ioremap · 5b7c73e0
      Authored by Mark Salter
      Move x86 over to the generic early ioremap implementation.
      Signed-off-by: Mark Salter <msalter@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5b7c73e0
    • x86/mm: sparse warning fix for early_memremap · 6b550f6f
      Authored by Dave Young
      This patch series takes the common bits from the x86 early ioremap
      implementation and creates a generic implementation which may be used by
      other architectures.  The early ioremap interfaces are intended for
      situations where boot code needs to make temporary virtual mappings
      before the normal ioremap interfaces are available.  Typically, this
      means before paging_init() has run.
      
      This patch (of 6):
      
      There are a lot of sparse warnings for code like below:

        void *a = early_memremap(phys_addr, size);

      early_memremap is intended to map kernel memory with the ioremap
      facility; the returned pointer should be a kernel RAM pointer
      instead of an iomem one.

      To make the function clearer and suppress the sparse warnings, this
      patch does two things:
      1. casts the return value of early_memremap to (__force void *)
      2. adds an early_memunmap function which passes (__force void __iomem *) to iounmap
      
      From Boris:
        "Ingo told me yesterday, it makes sense too.  I'd guess we can try it.
         FWIW, all callers of early_memremap use the memory they get remapped
         as normal memory so we should be safe"
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Mark Salter <msalter@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6b550f6f
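
      A sketch of the two changes, mirroring the casts described above:

        static inline void *early_memremap(resource_size_t phys_addr,
                                           unsigned long size)
        {
                /* RAM mapping: force away the __iomem address space so
                 * sparse sees a plain kernel pointer. */
                return (__force void *)early_ioremap(phys_addr, size);
        }

        static inline void early_memunmap(void *addr, unsigned long size)
        {
                early_iounmap((__force void __iomem *)addr, size);
        }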
    • percpu: add raw_cpu_ops · b3ca1c10
      Authored by Christoph Lameter
      The kernel has never been audited to ensure that this_cpu operations are
      consistently used throughout the kernel.  The code generated in many
      places can be improved through the use of this_cpu operations (which
      uses a segment register for relocation of per cpu offsets instead of
      performing address calculations).
      
      The patch set also addresses various consistency issues in general with
      the per cpu macros.
      
      A. The semantics of __this_cpu_ptr() differ from this_cpu_ptr() only
         in that checks are skipped. Check-skipping is conventionally
         marked with a raw_ prefix, so this patch set changes the places
         where __this_cpu_ptr() is used to raw_cpu_ptr().
      
      B. There has been the long term wish by some that __this_cpu operations
         would check for preemption. However, there are cases where preemption
         checks need to be skipped. This patch set adds raw_cpu operations that
         do not check for preemption and then adds preemption checks to the
         __this_cpu operations.
      
      C. The use of __get_cpu_var is always a reference to a percpu variable
         that can also be handled via a this_cpu operation. This patch set
         replaces all uses of __get_cpu_var with this_cpu operations.
      
      D. We can then use this_cpu RMW operations in various places replacing
         sequences of instructions by a single one.
      
      E. The use of this_cpu operations throughout will allow other arches
         than x86 to implement optimized references and RMW operations to
         work with per cpu local data.
      
      F. The use of this_cpu operations opens up the possibility to
         further optimize code that relies on synchronization through
         per cpu data.
      
      The patch set works in a couple of stages:
      
      I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
          Also converts the existing __this_cpu_xx_# primitive in the x86
          code to raw_cpu_xx_#.
      
      II. Patch 2-4 use the raw_cpu operations in places that would give
           us false positives once they are enabled.
      
      III. Patch 5 adds preemption checks to __this_cpu operations to allow
          checking if preemption is properly disabled when these functions
          are used.
      
      IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var
         with this_cpu_ptr. They do not depend on any changes to the percpu
         code. No preemption tests are skipped if they are applied.
      
      V. Patches 21-46 are conversion patches that use this_cpu operations
         in various kernel subsystems/drivers or arch code.
      
      VI.  Patches 47/48 (not included in this series) remove no longer used
          functions (__this_cpu_ptr and __get_cpu_var).  These should only be
          applied after all the conversion patches have made it and after we
          have done additional passes through the kernel to ensure that none of
          the uses of these functions remain.
      
      This patch (of 46):
      
      The patches following this one will add preemption checks to __this_cpu
      ops so we need to have an alternative way to use this_cpu operations
      without preemption checks.
      
      raw_cpu_ops will be the basis for all other ops since these will be the
      operations that do not implement any checks.
      
      Primitive operations are renamed by this patch from __this_cpu_xxx
      to raw_cpu_xxx.
      
      Also change the uses of the x86 percpu primitives in preempt.h.
      These depend directly on asm/percpu.h (header #include nesting issue).
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Alex Shi <alex.shi@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bryan Wu <cooloney@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: David Daney <david.daney@cavium.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Hedi Berriche <hedi@sgi.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b3ca1c10
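
      A sketch of the three flavors once the series is applied; the
      per-cpu variable and function are illustrative:

        DEFINE_PER_CPU(int, my_counter);

        static void percpu_flavors(void)
        {
                this_cpu_inc(my_counter);    /* preemption-safe: single insn on x86 */

                preempt_disable();
                __this_cpu_inc(my_counter);  /* now verifies preemption is off */
                raw_cpu_inc(my_counter);     /* same op, all checks skipped */
                preempt_enable();
        }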
    • x86: always define BUG() and HAVE_ARCH_BUG, even with !CONFIG_BUG · b06dd879
      Authored by Josh Triplett
      This ensures that BUG() always has a definition that causes a trap (via
      an undefined instruction), and that the compiler still recognizes the
      code following BUG() as unreachable, avoiding warnings that would
      otherwise appear (such as on non-void functions that don't return a
      value after BUG()).
      
      In addition to saving a few bytes over the generic infinite-loop
      implementation, this implementation traps rather than looping, which
      potentially allows for better error-recovery behavior (such as by
      rebooting).
      Signed-off-by: Josh Triplett <josh@joshtriplett.org>
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b06dd879
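
      A sketch of the x86 definition this describes: trap via ud2 and mark
      the spot unreachable so the compiler's flow analysis stays correct:

        #define HAVE_ARCH_BUG
        #define BUG()                                           \
        do {                                                    \
                asm volatile("ud2");                            \
                unreachable();                                  \
        } while (0)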
  16. 04 April 2014, 1 commit
  17. 03 April 2014, 2 commits
  18. 02 April 2014, 2 commits