1. 04 9月, 2009 1 次提交
    • J
      x86/i386: Make sure stack-protector segment base is cache aligned · 1ea0d14e
      Jeremy Fitzhardinge 提交于
      The Intel Optimization Reference Guide says:
      
      	In Intel Atom microarchitecture, the address generation unit
      	assumes that the segment base will be 0 by default. Non-zero
      	segment base will cause load and store operations to experience
      	a delay.
      		- If the segment base isn't aligned to a cache line
      		  boundary, the max throughput of memory operations is
      		  reduced to one [e]very 9 cycles.
      	[...]
      	Assembly/Compiler Coding Rule 15. (H impact, ML generality)
      	For Intel Atom processors, use segments with base set to 0
      	whenever possible; avoid non-zero segment base address that is
      	not aligned to cache line boundary at all cost.
      
      We can't avoid having a non-zero base for the stack-protector
      segment, but we can make it cache-aligned.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <4AA01893.6000507@goop.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1ea0d14e
  2. 26 8月, 2009 1 次提交
    • H
      x86: allow "=rm" in native_save_fl() · ab94fcf5
      H. Peter Anvin 提交于
      This is a partial revert of f1f029c7.
      
      "=rm" is allowed in this context, because "pop" is explicitly defined
      to adjust the stack pointer *before* it evaluates its effective
      address, if it has one.  Thus, we do end up writing to the correct
      address even if we use an on-stack memory argument.
      
      The original reporter for f1f029c7 was
      apparently using a broken x86 simulator.
      
      [ Impact: performance ]
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Gabe Black <spamforgabe@umich.edu>
      ab94fcf5
  3. 22 8月, 2009 1 次提交
  4. 18 8月, 2009 1 次提交
    • S
      x86, pat: Allow ISA memory range uncacheable mapping requests · 1adcaafe
      Suresh Siddha 提交于
      Max Vozeler reported:
      >  Bug 13877 -  bogl-term broken with CONFIG_X86_PAT=y, works with =n
      >
      >  strace of bogl-term:
      >  814   mmap2(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0)
      >				 = -1 EAGAIN (Resource temporarily unavailable)
      >  814   write(2, "bogl: mmaping /dev/fb0: Resource temporarily unavailable\n",
      >	       57) = 57
      
      PAT code maps the ISA memory range as WB in the PAT attribute, so that
      fixed range MTRR registers define the actual memory type (UC/WC/WT etc).
      
      But the upper level is_new_memtype_allowed() API checks are failing,
      as the request here is for UC and the return tracked type is WB (Tracked type is
      WB as MTRR type for this legacy range potentially will be different for each
      4k page).
      
      Fix is_new_memtype_allowed() by always succeeding the ISA address range
      checks, as the null PAT (WB) and def MTRR fixed range register settings
      satisfy the memory type needs of the applications that map the ISA address
      range.
      Reported-and-Tested-by: NMax Vozeler <xam@debian.org>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      1adcaafe
  5. 15 8月, 2009 1 次提交
  6. 08 8月, 2009 1 次提交
  7. 04 8月, 2009 4 次提交
    • J
      x86, UV: Fix macros for accessing large node numbers · 67e83f30
      Jack Steiner 提交于
      The UV chipset automatically supplies the upper bits on nodes
      being referenced by MMR accesses. These bit can be deleted from
      the hub addressing macros.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090727143808.GA8076@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      67e83f30
    • J
      x86, UV: Handle missing blade-local memory correctly · 6c7184b7
      Jack Steiner 提交于
      UV blades may not have any blade-local memory. Add a field
      (nid) to the UV blade structure to indicates whether the node
      has local memory. This is needed by the GRU driver (pushed
      separately).
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      Cc: linux-mm@kvack.org
      LKML-Reference: <20090727143507.GA7006@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6c7184b7
    • H
      x86: fix assembly constraints in native_save_fl() · f1f029c7
      H. Peter Anvin 提交于
      From Gabe Black in bugzilla 13888:
      
      native_save_fl is implemented as follows:
      
        11static inline unsigned long native_save_fl(void)
        12{
        13        unsigned long flags;
        14
        15        asm volatile("# __raw_save_flags\n\t"
        16                     "pushf ; pop %0"
        17                     : "=g" (flags)
        18                     : /* no input */
        19                     : "memory");
        20
        21        return flags;
        22}
      
      If gcc chooses to put flags on the stack, for instance because this is
      inlined into a larger function with more register pressure, the offset
      of the flags variable from the stack pointer will change when the
      pushf is performed. gcc doesn't attempt to understand that fact, and
      address used for pop will still be the same. It will write to
      somewhere near flags on the stack but not actually into it and
      overwrite some other value.
      
      I saw this happen in the ide_device_add_all function when running in a
      simulator I work on. I'm assuming that some quirk of how the simulated
      hardware is set up caused the code path this is on to be executed when
      it normally wouldn't.
      
      A simple fix might be to change "=g" to "=r".
      Reported-by: NGabe Black <spamforgabe@umich.edu>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Stable Team <stable@kernel.org>
      f1f029c7
    • P
      x86: Make 64-bit efi_ioremap use ioremap on MMIO regions · 6a7bbd57
      Paul Mackerras 提交于
      Booting current 64-bit x86 kernels on the latest Apple MacBook
      (MacBook5,2) via EFI gives the following warning:
      
      [    0.182209] ------------[ cut here ]------------
      [    0.182222] WARNING: at arch/x86/mm/pageattr.c:581 __cpa_process_fault+0x44/0xa0()
      [    0.182227] Hardware name: MacBook5,2
      [    0.182231] CPA: called for zero pte. vaddr = ffff8800ffe00000 cpa->vaddr = ffff8800ffe00000
      [    0.182236] Modules linked in:
      [    0.182242] Pid: 0, comm: swapper Not tainted 2.6.31-rc4 #6
      [    0.182246] Call Trace:
      [    0.182254]  [<ffffffff8102c754>] ? __cpa_process_fault+0x44/0xa0
      [    0.182261]  [<ffffffff81048668>] warn_slowpath_common+0x78/0xd0
      [    0.182266]  [<ffffffff81048744>] warn_slowpath_fmt+0x64/0x70
      [    0.182272]  [<ffffffff8102c7ec>] ? update_page_count+0x3c/0x50
      [    0.182280]  [<ffffffff818d25c5>] ? phys_pmd_init+0x140/0x22e
      [    0.182286]  [<ffffffff8102c754>] __cpa_process_fault+0x44/0xa0
      [    0.182292]  [<ffffffff8102ce60>] __change_page_attr_set_clr+0x5f0/0xb40
      [    0.182301]  [<ffffffff810d1035>] ? vm_unmap_aliases+0x175/0x190
      [    0.182307]  [<ffffffff8102d4ae>] change_page_attr_set_clr+0xfe/0x3d0
      [    0.182314]  [<ffffffff8102dcca>] _set_memory_uc+0x2a/0x30
      [    0.182319]  [<ffffffff8102dd4b>] set_memory_uc+0x7b/0xb0
      [    0.182327]  [<ffffffff818afe31>] efi_enter_virtual_mode+0x2ad/0x2c9
      [    0.182334]  [<ffffffff818a1c66>] start_kernel+0x2db/0x3f4
      [    0.182340]  [<ffffffff818a1289>] x86_64_start_reservations+0x99/0xb9
      [    0.182345]  [<ffffffff818a1389>] x86_64_start_kernel+0xe0/0xf2
      [    0.182357] ---[ end trace 4eaa2a86a8e2da22 ]---
      [    0.182982] init_memory_mapping: 00000000ffffc000-0000000100000000
      [    0.182993]  00ffffc000 - 0100000000 page 4k
      
      This happens because the 64-bit version of efi_ioremap calls
      init_memory_mapping for all addresses, regardless of whether they are
      RAM or MMIO.  The EFI tables on this machine ask for runtime access to
      some MMIO regions:
      
      [    0.000000] EFI: mem195: type=11, attr=0x8000000000000000, range=[0x0000000093400000-0x0000000093401000) (0MB)
      [    0.000000] EFI: mem196: type=11, attr=0x8000000000000000, range=[0x00000000ffc00000-0x00000000ffc40000) (0MB)
      [    0.000000] EFI: mem197: type=11, attr=0x8000000000000000, range=[0x00000000ffc40000-0x00000000ffc80000) (0MB)
      [    0.000000] EFI: mem198: type=11, attr=0x8000000000000000, range=[0x00000000ffc80000-0x00000000ffca4000) (0MB)
      [    0.000000] EFI: mem199: type=11, attr=0x8000000000000000, range=[0x00000000ffca4000-0x00000000ffcb4000) (0MB)
      [    0.000000] EFI: mem200: type=11, attr=0x8000000000000000, range=[0x00000000ffcb4000-0x00000000ffffc000) (3MB)
      [    0.000000] EFI: mem201: type=11, attr=0x8000000000000000, range=[0x00000000ffffc000-0x0000000100000000) (0MB)
      
      This arranges to pass the EFI memory type through to efi_ioremap, and
      makes efi_ioremap use ioremap rather than init_memory_mapping if the
      type is EFI_MEMORY_MAPPED_IO.  With this, the above warning goes away.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      LKML-Reference: <19062.55858.533494.471153@cargo.ozlabs.ibm.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      6a7bbd57
  8. 30 7月, 2009 2 次提交
    • R
      lguest: update commentry · a91d74a3
      Rusty Russell 提交于
      Every so often, after code shuffles, I need to go through and unbitrot
      the Lguest Journey (see drivers/lguest/README).  Since we now use RCU in
      a simple form in one place I took the opportunity to expand that explanation.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      a91d74a3
    • R
      lguest: fix comment style · 2e04ef76
      Rusty Russell 提交于
      I don't really notice it (except to begrudge the extra vertical
      space), but Ingo does.  And he pointed out that one excuse of lguest
      is as a teaching tool, it should set a good example.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
      2e04ef76
  9. 28 7月, 2009 1 次提交
    • B
      mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() · 9e1b32ca
      Benjamin Herrenschmidt 提交于
      mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()
      
      Upcoming paches to support the new 64-bit "BookE" powerpc architecture
      will need to have the virtual address corresponding to PTE page when
      freeing it, due to the way the HW table walker works.
      
      Basically, the TLB can be loaded with "large" pages that cover the whole
      virtual space (well, sort-of, half of it actually) represented by a PTE
      page, and which contain an "indirect" bit indicating that this TLB entry
      RPN points to an array of PTEs from which the TLB can then create direct
      entries. Thus, in order to invalidate those when PTE pages are deleted,
      we need the virtual address to pass to tlbilx or tlbivax instructions.
      
      The old trick of sticking it somewhere in the PTE page struct page sucks
      too much, the address is almost readily available in all call sites and
      almost everybody implemets these as macros, so we may as well add the
      argument everywhere. I added it to the pmd and pud variants for consistency.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
      Acked-by: NNick Piggin <npiggin@suse.de>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9e1b32ca
  10. 21 7月, 2009 2 次提交
  11. 20 7月, 2009 2 次提交
  12. 17 7月, 2009 1 次提交
  13. 11 7月, 2009 2 次提交
  14. 10 7月, 2009 1 次提交
  15. 04 7月, 2009 2 次提交
    • E
      x86: atomic64: Inline atomic64_read() again · a79f0da8
      Eric Dumazet 提交于
      Now atomic64_read() is light weight (no register pressure and
      small icache), we can inline it again.
      
      Also use "=&A" constraint instead of "+A" to avoid warning
      about unitialized 'res' variable. (gcc had to force 0 in eax/edx)
      
        $ size vmlinux.prev vmlinux.after
           text    data     bss     dec     hex filename
        4908667  451676 1684868 7045211  6b805b vmlinux.prev
        4908651  451676 1684868 7045195  6b804b vmlinux.after
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <4A4E1AA2.30002@gmail.com>
      [ Also fix typo in atomic64_set() export ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a79f0da8
    • I
      x86: atomic64: Improve atomic64_xchg() · 3a8d1788
      Ingo Molnar 提交于
      Remove the read-first logic from atomic64_xchg() and simplify
      the loop.
      
      This function was the last user of __atomic64_read() - remove it.
      
      Also, change the 'real_val' assumption from the somewhat quirky
      1ULL << 32 value to the (just as arbitrary, but simpler) value
      of 0.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3a8d1788
  16. 03 7月, 2009 7 次提交
    • P
      x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPP · 8e049ef0
      Paul Mackerras 提交于
      Occasionally we get bugs where atomic_read or atomic_set are
      used on atomic64_t variables or vice versa.  These bugs don't
      generate warnings on x86 because atomic_read and atomic_set are
      coded as macros rather than C functions, so we don't get any
      type-checking on their arguments; similarly for atomic64_read
      and atomic64_set in 64-bit kernels.
      
      This converts them to C functions so that the arguments are
      type-checked and bugs like this will get caught more easily. It
      also converts atomic_cmpxchg and atomic_xchg, and
      atomic64_cmpxchg and atomic64_xchg on 64-bit, so we get
      type-checking on their arguments too.
      
      Compiling a typical 64-bit x86 config, this generates no new
      warnings, and the vmlinux text is 86 bytes smaller.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8e049ef0
    • J
      x86: Remove unused function lapic_watchdog_ok() · c7210e1f
      Jaswinder Singh Rajput 提交于
      lapic_watchdog_ok() is a global function but no one is using it.
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <1246554335.2242.29.camel@jaswinder.satnam>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c7210e1f
    • M
      x86: Fix fixmap page order for FIX_TEXT_POKE0,1 · 12b9d7cc
      Mathieu Desnoyers 提交于
      Masami reported:
      
      > Since the fixmap pages are assigned higher address to lower,
      > text_poke() has to use it with inverted order (FIX_TEXT_POKE1
      > to FIX_TEXT_POKE0).
      
      I prefer to just invert the order of the fixmap declaration.
      It's simpler and more straightforward.
      
      Backward fixmaps seems to be used by both x86 32 and 64.
      
      It's really rare but a nasty bug, because it only hurts when
      instructions to patch are crossing a page boundary. If this
      happens, the fixmap write accesses will spill on the following
      fixmap, which may very well crash the system. And this does not
      crash the system, it could leave illegal instructions in place.
      Thanks Masami for finding this.
      
      It seems to have crept into the 2.6.30-rc series, so this calls
      for a -stable inclusion.
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <20090701213722.GH19926@Krystal>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      12b9d7cc
    • I
      x86: atomic64: Make atomic_read() type-safe · 32171208
      Ingo Molnar 提交于
      Linus noticed that atomic64_xchg() uses atomic_read(), which
      happens to work because atomic_read() is a macro so the
      .counter value gets u64-read on 32-bit too - but this is really
      bogus and serious bugs are waiting to happen.
      
      Change atomic_read() to be a type-safe inline, and this exposes
      the atomic64 bogosity as well:
      
        arch/x86/lib/atomic64_32.c: In function ‘atomic64_xchg’:
        arch/x86/lib/atomic64_32.c:39: warning: passing argument 1 of ‘atomic_read’ from incompatible pointer type
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      32171208
    • I
      x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file · b7882b7c
      Ingo Molnar 提交于
      Linus noted that the atomic64_t primitives are all inlines
      currently which is crazy because these functions have a large
      register footprint anyway.
      
      Move them to a separate file: arch/x86/lib/atomic64_32.c
      
      Also, while at it, rename all uses of 'unsigned long long' to
      the much shorter u64.
      
      This makes the appearance of the prototypes a lot nicer - and
      it also uncovered a few bugs where (yet unused) API variants
      had 'long' as their return type instead of u64.
      
      [ More intrusive changes are not yet done in this patch. ]
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b7882b7c
    • E
      x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too · bbf2a330
      Eric Dumazet 提交于
      Locked instructions on two cache lines at once are painful. If
      atomic64_t uses two cache lines, my test program is 10x slower.
      
      The chance for that is significant: 4/32 or 12.5%.
      
      Make sure an atomic64_t is 8 bytes aligned.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      [ changed it to __aligned(8) as per Andrew's suggestion ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bbf2a330
    • L
      x86: fix power-of-2 round_up/round_down macros · 43644679
      Linus Torvalds 提交于
      These macros had two bugs:
       - the type of the mask was not correctly expanded to the full size of
         the argument being expanded, resulting in possible loss of high bits
         when mixing types.
       - the alignment argument was evaluated twice, despite the macro looking
         like a fancy function (but it really does need to be a macro, since
         it works on arbitrary integer types)
      
      Noticed by Peter Anvin, and with a fix that is a modification of his
      suggestion (bug noticed by Yinghai Lu).
      
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      43644679
  17. 02 7月, 2009 2 次提交
    • F
      perf_counter: Ignore the nmi call frames in the x86-64 backtraces · 0406ca6d
      Frederic Weisbecker 提交于
      About every callchains recorded with perf record are filled up
      including the internal perfcounter nmi frame:
      
       perf_callchain
       perf_counter_overflow
       intel_pmu_handle_irq
       perf_counter_nmi_handler
       notifier_call_chain
       atomic_notifier_call_chain
       notify_die
       do_nmi
       nmi
      
      We want ignore this frame as it's not interesting for
      instrumentation. To solve this, we simply ignore every frames
      from nmi context.
      
      New example of "perf report -s sym -c" after this patch:
      
      9.59%  [k] search_by_key
                   4.88%
                      search_by_key
                      reiserfs_read_locked_inode
                      reiserfs_iget
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      
                   3.19%
                      search_by_key
                      search_by_entry_key
                      reiserfs_find_entry
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      [...]
      
      For now this patch only solves the problem in x86-64.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0406ca6d
    • D
      Fix pci_unmap_addr() et al on i386. · 788d84bb
      David Woodhouse 提交于
      We can run a 32-bit kernel on boxes with an IOMMU, so we need
      pci_unmap_addr() etc. to work -- without it, drivers will leak mappings.
      
      To be honest, this whole thing looks like it's more pain than it's
      worth; I'm half inclined to remove the no-op #else case altogether.
      
      But this is the minimal fix, which just does the right thing if
      CONFIG_DMAR is set.
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      Cc: stable@kernel.org  [ for 2.6.30 ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      788d84bb
  18. 01 7月, 2009 2 次提交
  19. 26 6月, 2009 3 次提交
  20. 25 6月, 2009 1 次提交
  21. 22 6月, 2009 1 次提交
    • T
      x86: fix pageattr handling for lpage percpu allocator and re-enable it · e59a1bb2
      Tejun Heo 提交于
      lpage allocator aliases a PMD page for each cpu and returns whatever
      is unused to the page allocator.  When the pageattr of the recycled
      pages are changed, this makes the two aliases point to the overlapping
      regions with different attributes which isn't allowed and known to
      cause subtle data corruption in certain cases.
      
      This can be handled in simliar manner to the x86_64 highmap alias.
      pageattr code should detect if the target pages have PMD alias and
      split the PMD alias and synchronize the attributes.
      
      pcpur allocator is updated to keep the allocated PMD pages map sorted
      in ascending address order and provide pcpu_lpage_remapped() function
      which binary searches the array to determine whether the given address
      is aliased and if so to which address.  pageattr is updated to use
      pcpu_lpage_remapped() to detect the PMD alias and split it up as
      necessary from cpa_process_alias().
      
      Jan Beulich spotted the original problem and incorrect usage of vaddr
      instead of laddr for lookup.
      
      With this, lpage percpu allocator should work correctly.  Re-enable
      it.
      
      [ Impact: fix subtle lpage pageattr bug and re-enable lpage ]
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NJan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      e59a1bb2
  22. 21 6月, 2009 1 次提交
    • L
      x86, 64-bit: Clean up user address masking · 9063c61f
      Linus Torvalds 提交于
      The discussion about using "access_ok()" in get_user_pages_fast() (see
      commit 7f818906: "x86: don't use
      'access_ok()' as a range check in get_user_pages_fast()" for details and
      end result), made us notice that x86-64 was really being very sloppy
      about virtual address checking.
      
      So be way more careful and straightforward about masking x86-64 virtual
      addresses:
      
       - All the VIRTUAL_MASK* variants now cover half of the address
         space, it's not like we can use the full mask on a signed
         integer, and the larger mask just invites mistakes when
         applying it to either half of the 48-bit address space.
      
       - /proc/kcore's kc_offset_to_vaddr() becomes a lot more
         obvious when it transforms a file offset into a
         (kernel-half) virtual address.
      
       - Unify/simplify the 32-bit and 64-bit USER_DS definition to
         be based on TASK_SIZE_MAX.
      
      This cleanup and more careful/obvious user virtual address checking also
      uncovered a buglet in the x86-64 implementation of strnlen_user(): it
      would do an "access_ok()" check on the whole potential area, even if the
      string itself was much shorter, and thus return an error even for valid
      strings. Our sloppy checking had hidden this.
      
      So this fixes 'strnlen_user()' to do this properly, the same way we
      already handled user strings in 'strncpy_from_user()'.  Namely by just
      checking the first byte, and then relying on fault handling for the
      rest.  That always works, since we impose a guard page that cannot be
      mapped at the end of the user space address space (and even if we
      didn't, we'd have the address space hole).
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9063c61f