1. 21 11月, 2012 1 次提交
    • R
      x86-32: Fix invalid stack address while in softirq · 10226238
      Robert Richter 提交于
      In 32 bit the stack address provided by kernel_stack_pointer() may
      point to an invalid range causing NULL pointer access or page faults
      while in NMI (see trace below). This happens if called in softirq
      context and if the stack is empty. The address at &regs->sp is then
      out of range.
      
      Fixing this by checking if regs and &regs->sp are in the same stack
      context. Otherwise return the previous stack pointer stored in struct
      thread_info. If that address is invalid too, return address of regs.
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000a
       IP: [<c1004237>] print_context_stack+0x6e/0x8d
       *pde = 00000000
       Oops: 0000 [#1] SMP
       Modules linked in:
       Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 Hewlett-Packard HP xw9400 Workstation/0A1Ch
       EIP: 0060:[<c1004237>] EFLAGS: 00010093 CPU: 0
       EIP is at print_context_stack+0x6e/0x8d
       EAX: ffffe000 EBX: 0000000a ECX: f4435f94 EDX: 0000000a
       ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0
        DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
       CR0: 8005003b CR2: 0000000a CR3: 34ac9000 CR4: 000007d0
       DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
       DR6: ffff0ff0 DR7: 00000400
       Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000)
       Stack:
        000003e8 ffffe000 00001ffc f4e39b00 00000000 0000000a f4435f94 c155198c
        f5409ef0 c1003723 c155198c f5409f04 00000000 f5409edc 00000000 00000000
        f5409ee8 f4435f94 f5409fc4 00000001 f5409f1c c12dce1c 00000000 c155198c
       Call Trace:
        [<c1003723>] dump_trace+0x7b/0xa1
        [<c12dce1c>] x86_backtrace+0x40/0x88
        [<c12db712>] ? oprofile_add_sample+0x56/0x84
        [<c12db731>] oprofile_add_sample+0x75/0x84
        [<c12ddb5b>] op_amd_check_ctrs+0x46/0x260
        [<c12dd40d>] profile_exceptions_notify+0x23/0x4c
        [<c1395034>] nmi_handle+0x31/0x4a
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c13950ed>] do_nmi+0xa0/0x2ff
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c13949e5>] nmi_stack_correct+0x28/0x2d
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c1003603>] ? do_softirq+0x4b/0x7f
        <IRQ>
        [<c102a06f>] irq_exit+0x35/0x5b
        [<c1018f56>] smp_apic_timer_interrupt+0x6c/0x7a
        [<c1394746>] apic_timer_interrupt+0x2a/0x30
       Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 <8b> 13 89 d0 89 55 e0 e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc
       EIP: [<c1004237>] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0
       CR2: 000000000000000a
       ---[ end trace 62afee3481b00012 ]---
       Kernel panic - not syncing: Fatal exception in interrupt
      
      V2:
      * add comments to kernel_stack_pointer()
      * always return a valid stack address by falling back to the address
        of regs
      Reported-by: NYang Wei <wei.yang@windriver.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Jun Zhang <jun.zhang@intel.com>
      10226238
  2. 26 10月, 2012 1 次提交
  3. 24 10月, 2012 1 次提交
    • M
      x86/efi: Fix oops caused by incorrect set_memory_uc() usage · 3e8fa263
      Matt Fleming 提交于
      Calling __pa() with an ioremap'd address is invalid. If we
      encounter an efi_memory_desc_t without EFI_MEMORY_WB set in
      ->attribute we currently call set_memory_uc(), which in turn
      calls __pa() on a potentially ioremap'd address.
      
      On CONFIG_X86_32 this results in the following oops:
      
        BUG: unable to handle kernel paging request at f7f22280
        IP: [<c10257b9>] reserve_ram_pages_type+0x89/0x210
        *pdpt = 0000000001978001 *pde = 0000000001ffb067 *pte = 0000000000000000
        Oops: 0000 [#1] PREEMPT SMP
        Modules linked in:
      
        Pid: 0, comm: swapper Not tainted 3.0.0-acpi-efi-0805 #3
         EIP: 0060:[<c10257b9>] EFLAGS: 00010202 CPU: 0
         EIP is at reserve_ram_pages_type+0x89/0x210
         EAX: 0070e280 EBX: 38714000 ECX: f7814000 EDX: 00000000
         ESI: 00000000 EDI: 38715000 EBP: c189fef0 ESP: c189fea8
         DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
        Process swapper (pid: 0, ti=c189e000 task=c18bbe60 task.ti=c189e000)
        Stack:
         80000200 ff108000 00000000 c189ff00 00038714 00000000 00000000 c189fed0
         c104f8ca 00038714 00000000 00038715 00000000 00000000 00038715 00000000
         00000010 38715000 c189ff48 c1025aff 38715000 00000000 00000010 00000000
        Call Trace:
         [<c104f8ca>] ? page_is_ram+0x1a/0x40
         [<c1025aff>] reserve_memtype+0xdf/0x2f0
         [<c1024dc9>] set_memory_uc+0x49/0xa0
         [<c19334d0>] efi_enter_virtual_mode+0x1c2/0x3aa
         [<c19216d4>] start_kernel+0x291/0x2f2
         [<c19211c7>] ? loglevel+0x1b/0x1b
         [<c19210bf>] i386_start_kernel+0xbf/0xc8
      
      The only time we can call set_memory_uc() for a memory region is
      when it is part of the direct kernel mapping. For the case where
      we ioremap a memory region we must leave it alone.
      
      This patch reimplements the fix from e8c71062 ("x86, efi:
      Calling __pa() with an ioremap()ed address is invalid") which
      was reverted in e1ad783b because it caused a regression on
      some MacBooks (they hung at boot). The regression was caused
      because the commit only marked EFI_RUNTIME_SERVICES_DATA as
      E820_RESERVED_EFI, when it should have marked all regions that
      have the EFI_MEMORY_RUNTIME attribute.
      
      Despite first impressions, it's not possible to use
      ioremap_cache() to map all cached memory regions on
      CONFIG_X86_64 because of the way that the memory map might be
      configured as detailed in the following bug report,
      
      	https://bugzilla.redhat.com/show_bug.cgi?id=748516
      
      e.g. some of the EFI memory regions *need* to be mapped as part
      of the direct kernel mapping.
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Cc: Matthew Garrett <mjg@redhat.com>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: Huang Ying <huang.ying.caritas@gmail.com>
      Cc: Keith Packard <keithp@keithp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1350649546-23541-1-git-send-email-matt@console-pimps.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3e8fa263
  4. 20 10月, 2012 2 次提交
  5. 13 10月, 2012 1 次提交
  6. 09 10月, 2012 4 次提交
    • D
      mm: Add and use update_mmu_cache_pmd() in transparent huge page code. · b113da65
      David Miller 提交于
      The transparent huge page code passes a PMD pointer in as the third
      argument of update_mmu_cache(), which expects a PTE pointer.
      
      This never got noticed because X86 implements update_mmu_cache() as a
      macro and thus we don't get any type checking, and X86 is the only
      architecture which supports transparent huge pages currently.
      
      Before other architectures can support transparent huge pages properly we
      need to add a new interface which will take a PMD pointer as the third
      argument rather than a PTE pointer.
      
      [akpm@linux-foundation.org: implement update_mm_cache_pmd() for s390]
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b113da65
    • A
      mm: thp: fix pmd_present for split_huge_page and PROT_NONE with THP · 027ef6c8
      Andrea Arcangeli 提交于
      In many places !pmd_present has been converted to pmd_none.  For pmds
      that's equivalent and pmd_none is quicker so using pmd_none is better.
      
      However (unless we delete pmd_present) we should provide an accurate
      pmd_present too.  This will avoid the risk of code thinking the pmd is non
      present because it's under __split_huge_page_map, see the pmd_mknotpresent
      there and the comment above it.
      
      If the page has been mprotected as PROT_NONE, it would also lead to a
      pmd_present false negative in the same way as the race with
      split_huge_page.
      
      Because the PSE bit stays on at all times (both during split_huge_page and
      when the _PAGE_PROTNONE bit get set), we could only check for the PSE bit,
      but checking the PROTNONE bit too is still good to remember pmd_present
      must always keep PROT_NONE into account.
      
      This explains a not reproducible BUG_ON that was seldom reported on the
      lists.
      
      The same issue is in pmd_large, it would go wrong with both PROT_NONE and
      if it races with split_huge_page.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <jweiner@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      027ef6c8
    • S
      atomic: implement generic atomic_dec_if_positive() · e79bee24
      Shaohua Li 提交于
      The x86 implementation of atomic_dec_if_positive is quite generic, so make
      it available to all architectures.
      
      This is needed for "swap: add a simple detector for inappropriate swapin
      readahead".
      
      [akpm@linux-foundation.org: do the "#define foo foo" trick in the conventional manner]
      Signed-off-by: NShaohua Li <shli@fusionio.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e79bee24
    • W
      mm: hugetlb: add arch hook for clearing page flags before entering pool · 5d3a551c
      Will Deacon 提交于
      The core page allocator ensures that page flags are zeroed when freeing
      pages via free_pages_check.  A number of architectures (ARM, PPC, MIPS)
      rely on this property to treat new pages as dirty with respect to the data
      cache and perform the appropriate flushing before mapping the pages into
      userspace.
      
      This can lead to cache synchronisation problems when using hugepages,
      since the allocator keeps its own pool of pages above the usual page
      allocator and does not reset the page flags when freeing a page into the
      pool.
      
      This patch adds a new architecture hook, arch_clear_hugepage_flags, so
      that architectures which rely on the page flags being in a particular
      state for fresh allocations can adjust the flags accordingly when a page
      is freed into the pool.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: NMichal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5d3a551c
  7. 06 10月, 2012 2 次提交
  8. 04 10月, 2012 2 次提交
  9. 03 10月, 2012 4 次提交
  10. 01 10月, 2012 2 次提交
  11. 28 9月, 2012 2 次提交
  12. 26 9月, 2012 5 次提交
    • F
      x86: Use the new schedule_user API on userspace preemption · 0430499c
      Frederic Weisbecker 提交于
      This way we can exit the RCU extended quiescent state before
      we schedule a new task from irq/exception exit.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      0430499c
    • F
      x86: Exception hooks for userspace RCU extended QS · 6ba3c97a
      Frederic Weisbecker 提交于
      Add necessary hooks to x86 exception for userspace
      RCU extended quiescent state support.
      
      This includes traps, page fault, debug exceptions, etc...
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      6ba3c97a
    • F
      x86: Syscall hooks for userspace RCU extended QS · bf5a3c13
      Frederic Weisbecker 提交于
      Add syscall slow path hooks to notify syscall entry
      and exit on CPUs that want to support userspace RCU
      extended quiescent state.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      bf5a3c13
    • T
      x86_64: Work around old GAS bug · 1b2b23d8
      Tao Guo 提交于
      GAS in binutils(2.16.91) could not parse parentheses within
      macro parameters unless fully parenthesized, and this is a
      workaround to make old gas work without generating below errors:
      
       arch/x86/kernel/entry_64.S: Assembler messages:
       arch/x86/kernel/entry_64.S:387: Error: too many positional arguments
       arch/x86/kernel/entry_64.S:389: Error: too many positional arguments
       [...]
      Signed-off-by: NTao Guo <glorioustao@gmail.com>
      Reluctantly-Acked-by: NJan Beulich <jbeulich@novell.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1348648102-12653-1-git-send-email-glorioustao@gmail.com
      [ Jan argues that these old GAS versions are fragile - which is so, but lets give them a chance. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      1b2b23d8
    • H
      x86, smap: Do not abuse the [f][x]rstor_checking() functions for user space · e139e955
      H. Peter Anvin 提交于
      With SMAP, the [f][x]rstor_checking() functions are no longer usable
      for user-space pointers by applying a simple __force cast.  Instead,
      create new [f][x]rstor_user() functions which do the proper SMAP
      magic.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1343171129-2747-3-git-send-email-suresh.b.siddha@intel.com
      e139e955
  13. 25 9月, 2012 1 次提交
    • J
      time: Convert x86_64 to using new update_vsyscall · 650ea024
      John Stultz 提交于
      Switch x86_64 to using sub-ns precise vsyscall
      
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
      650ea024
  14. 23 9月, 2012 1 次提交
    • J
      KVM: x86: Fix guest debug across vcpu INIT reset · c8639010
      Jan Kiszka 提交于
      If we reset a vcpu on INIT, we so far overwrote dr7 as provided by
      KVM_SET_GUEST_DEBUG, and we also cleared switch_db_regs unconditionally.
      
      Fix this by saving the dr7 used for guest debugging and calculating the
      effective register value as well as switch_db_regs on any potential
      change. This will change to focus of the set_guest_debug vendor op to
      update_dp_bp_intercept.
      
      Found while trying to stop on start_secondary.
      Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      c8639010
  15. 22 9月, 2012 8 次提交
  16. 21 9月, 2012 1 次提交
  17. 20 9月, 2012 2 次提交