1. 04 Aug 2009, 4 commits
    • x86, UV: Fix macros for accessing large node numbers · 67e83f30
      Jack Steiner authored
      The UV chipset automatically supplies the upper bits on nodes
      being referenced by MMR accesses. These bits can be deleted from
      the hub addressing macros.
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <20090727143808.GA8076@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Handle missing blade-local memory correctly · 6c7184b7
      Jack Steiner authored
      UV blades may not have any blade-local memory. Add a field
      (nid) to the UV blade structure to indicate whether the node
      has local memory. This is needed by the GRU driver (pushed
      separately).
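      A minimal sketch of the idea; the field layout and the helper below
      are illustrative, not the exact kernel structure:

        /* sketch only: per-blade bookkeeping with a node id that is
         * negative when the blade has no local memory */
        struct uv_blade_info {
                unsigned short  nr_possible_cpus;
                unsigned short  nr_online_cpus;
                unsigned short  pnode;
                short           nid;    /* node of blade-local memory, or -1 */
        };

        /* hypothetical helper a client such as the GRU driver could use */
        static inline int uv_blade_has_memory(const struct uv_blade_info *b)
        {
                return b->nid >= 0;
        }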
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Cc: linux-mm@kvack.org
      LKML-Reference: <20090727143507.GA7006@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: fix assembly constraints in native_save_fl() · f1f029c7
      H. Peter Anvin authored
      From Gabe Black in bugzilla 13888:
      
      native_save_fl is implemented as follows:
      
        static inline unsigned long native_save_fl(void)
        {
                unsigned long flags;

                asm volatile("# __raw_save_flags\n\t"
                             "pushf ; pop %0"
                             : "=g" (flags)
                             : /* no input */
                             : "memory");

                return flags;
        }
      
      If gcc chooses to put flags on the stack, for instance because this is
      inlined into a larger function with more register pressure, the offset
      of the flags variable from the stack pointer will change when the
      pushf is performed. gcc doesn't attempt to understand that fact,
      and the address used for the pop will still be the same. It will
      write somewhere near flags on the stack, but not actually into it,
      and will overwrite some other value.
      
      I saw this happen in the ide_device_add_all function when running in a
      simulator I work on. I'm assuming that some quirk of how the simulated
      hardware is set up caused the code path this is on to be executed when
      it normally wouldn't.
      
      A simple fix might be to change "=g" to "=r".
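      A sketch of the function with that fix applied; the constraint is
      the whole change:

        static inline unsigned long native_save_fl(void)
        {
                unsigned long flags;

                /* "=r" forces flags into a register, so the pop can never
                 * target a stack slot whose offset from %esp was changed
                 * by the preceding pushf */
                asm volatile("# __raw_save_flags\n\t"
                             "pushf ; pop %0"
                             : "=r" (flags)
                             : /* no input */
                             : "memory");

                return flags;
        }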
      Reported-by: Gabe Black <spamforgabe@umich.edu>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Stable Team <stable@kernel.org>
    • x86: Make 64-bit efi_ioremap use ioremap on MMIO regions · 6a7bbd57
      Paul Mackerras authored
      Booting current 64-bit x86 kernels on the latest Apple MacBook
      (MacBook5,2) via EFI gives the following warning:
      
      [    0.182209] ------------[ cut here ]------------
      [    0.182222] WARNING: at arch/x86/mm/pageattr.c:581 __cpa_process_fault+0x44/0xa0()
      [    0.182227] Hardware name: MacBook5,2
      [    0.182231] CPA: called for zero pte. vaddr = ffff8800ffe00000 cpa->vaddr = ffff8800ffe00000
      [    0.182236] Modules linked in:
      [    0.182242] Pid: 0, comm: swapper Not tainted 2.6.31-rc4 #6
      [    0.182246] Call Trace:
      [    0.182254]  [<ffffffff8102c754>] ? __cpa_process_fault+0x44/0xa0
      [    0.182261]  [<ffffffff81048668>] warn_slowpath_common+0x78/0xd0
      [    0.182266]  [<ffffffff81048744>] warn_slowpath_fmt+0x64/0x70
      [    0.182272]  [<ffffffff8102c7ec>] ? update_page_count+0x3c/0x50
      [    0.182280]  [<ffffffff818d25c5>] ? phys_pmd_init+0x140/0x22e
      [    0.182286]  [<ffffffff8102c754>] __cpa_process_fault+0x44/0xa0
      [    0.182292]  [<ffffffff8102ce60>] __change_page_attr_set_clr+0x5f0/0xb40
      [    0.182301]  [<ffffffff810d1035>] ? vm_unmap_aliases+0x175/0x190
      [    0.182307]  [<ffffffff8102d4ae>] change_page_attr_set_clr+0xfe/0x3d0
      [    0.182314]  [<ffffffff8102dcca>] _set_memory_uc+0x2a/0x30
      [    0.182319]  [<ffffffff8102dd4b>] set_memory_uc+0x7b/0xb0
      [    0.182327]  [<ffffffff818afe31>] efi_enter_virtual_mode+0x2ad/0x2c9
      [    0.182334]  [<ffffffff818a1c66>] start_kernel+0x2db/0x3f4
      [    0.182340]  [<ffffffff818a1289>] x86_64_start_reservations+0x99/0xb9
      [    0.182345]  [<ffffffff818a1389>] x86_64_start_kernel+0xe0/0xf2
      [    0.182357] ---[ end trace 4eaa2a86a8e2da22 ]---
      [    0.182982] init_memory_mapping: 00000000ffffc000-0000000100000000
      [    0.182993]  00ffffc000 - 0100000000 page 4k
      
      This happens because the 64-bit version of efi_ioremap calls
      init_memory_mapping for all addresses, regardless of whether they are
      RAM or MMIO.  The EFI tables on this machine ask for runtime access to
      some MMIO regions:
      
      [    0.000000] EFI: mem195: type=11, attr=0x8000000000000000, range=[0x0000000093400000-0x0000000093401000) (0MB)
      [    0.000000] EFI: mem196: type=11, attr=0x8000000000000000, range=[0x00000000ffc00000-0x00000000ffc40000) (0MB)
      [    0.000000] EFI: mem197: type=11, attr=0x8000000000000000, range=[0x00000000ffc40000-0x00000000ffc80000) (0MB)
      [    0.000000] EFI: mem198: type=11, attr=0x8000000000000000, range=[0x00000000ffc80000-0x00000000ffca4000) (0MB)
      [    0.000000] EFI: mem199: type=11, attr=0x8000000000000000, range=[0x00000000ffca4000-0x00000000ffcb4000) (0MB)
      [    0.000000] EFI: mem200: type=11, attr=0x8000000000000000, range=[0x00000000ffcb4000-0x00000000ffffc000) (3MB)
      [    0.000000] EFI: mem201: type=11, attr=0x8000000000000000, range=[0x00000000ffffc000-0x0000000100000000) (0MB)
      
      This arranges to pass the EFI memory type through to efi_ioremap, and
      makes efi_ioremap use ioremap rather than init_memory_mapping if the
      type is EFI_MEMORY_MAPPED_IO.  With this, the above warning goes away.
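      A minimal sketch of the resulting logic (signature per the
      description above; the RAM path is abbreviated and the exact
      return-value handling may differ):

        void __iomem *__init efi_ioremap(unsigned long phys_addr,
                                         unsigned long size, u32 type)
        {
                /* MMIO must not be put into the direct kernel mapping */
                if (type == EFI_MEMORY_MAPPED_IO)
                        return ioremap(phys_addr, size);

                /* RAM regions keep using init_memory_mapping() as before */
                init_memory_mapping(phys_addr, phys_addr + size);
                return (void __iomem *)__va(phys_addr);
        }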
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <19062.55858.533494.471153@cargo.ozlabs.ibm.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  2. 30 Jul 2009, 2 commits
    • lguest: update commentry · a91d74a3
      Rusty Russell authored
      Every so often, after code shuffles, I need to go through and unbitrot
      the Lguest Journey (see drivers/lguest/README).  Since we now use RCU in
      a simple form in one place I took the opportunity to expand that explanation.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
    • lguest: fix comment style · 2e04ef76
      Rusty Russell authored
      I don't really notice it (except to begrudge the extra vertical
      space), but Ingo does.  And he pointed out that since one excuse for
      lguest is as a teaching tool, it should set a good example.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
  3. 28 Jul 2009, 1 commit
    • mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() · 9e1b32ca
      Benjamin Herrenschmidt authored
      Upcoming patches to support the new 64-bit "BookE" powerpc architecture
      will need to have the virtual address corresponding to PTE page when
      freeing it, due to the way the HW table walker works.
      
      Basically, the TLB can be loaded with "large" pages that cover the whole
      virtual space (well, sort-of, half of it actually) represented by a PTE
      page, and which contain an "indirect" bit indicating that this TLB entry
      RPN points to an array of PTEs from which the TLB can then create direct
      entries. Thus, in order to invalidate those when PTE pages are deleted,
      we need the virtual address to pass to tlbilx or tlbivax instructions.
      
      The old trick of sticking it somewhere in the PTE page's struct page
      sucks too much: the address is readily available at almost all call
      sites, and almost everybody implements these as macros, so we may as
      well add the argument everywhere. I added it to the pmd and pud
      variants for consistency.
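      A sketch of the new shape of the wrappers (asm-generic style, names
      per the subject line; the exact macro bodies vary per architecture):

        /* sketch: the covering virtual address is now threaded through
         * to the arch hook, which can use it for tlbilx/tlbivax */
        #define pte_free_tlb(tlb, ptep, address)                \
                do {                                            \
                        __pte_free_tlb(tlb, ptep, address);     \
                } while (0)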
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
      Acked-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 21 Jul 2009, 2 commits
  5. 17 Jul 2009, 1 commit
  6. 11 Jul 2009, 2 commits
  7. 10 Jul 2009, 1 commit
  8. 04 Jul 2009, 2 commits
    • x86: atomic64: Inline atomic64_read() again · a79f0da8
      Eric Dumazet authored
      Now that atomic64_read() is lightweight (no register pressure and
      a small icache footprint), we can inline it again.

      Also use the "=&A" constraint instead of "+A" to avoid a warning
      about the uninitialized 'res' variable (gcc had to force 0 into
      eax/edx).
      
        $ size vmlinux.prev vmlinux.after
           text    data     bss     dec     hex filename
        4908667  451676 1684868 7045211  6b805b vmlinux.prev
        4908651  451676 1684868 7045195  6b804b vmlinux.after
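      For reference, a sketch of the inlined read, assuming the
      cmpxchg8b-based implementation used by this series (LOCK_PREFIX
      comes from <asm/alternative.h>):

        static inline u64 atomic64_read(atomic64_t *ptr)
        {
                u64 res;

                /* cmpxchg8b compares edx:eax against *ptr; by preloading
                 * eax/edx from ebx/ecx, a matching compare rewrites the
                 * same value and a miss loads *ptr into edx:eax - either
                 * way res ends up with an atomic snapshot. "=&A" marks
                 * the eax/edx pair as an early-clobber output. */
                asm volatile("mov %%ebx, %%eax\n\t"
                             "mov %%ecx, %%edx\n\t"
                             LOCK_PREFIX "cmpxchg8b %1\n"
                             : "=&A" (res)
                             : "m" (*ptr)
                             );

                return res;
        }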
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <4A4E1AA2.30002@gmail.com>
      [ Also fix typo in atomic64_set() export ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: atomic64: Improve atomic64_xchg() · 3a8d1788
      Ingo Molnar authored
      Remove the read-first logic from atomic64_xchg() and simplify
      the loop.
      
      This function was the last user of __atomic64_read() - remove it.
      
      Also, change the 'real_val' assumption from the somewhat quirky
      1ULL << 32 value to the (just as arbitrary, but simpler) value
      of 0.
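      A sketch of the simplified loop (the idea, not necessarily the exact
      patch text): feed the value returned by cmpxchg back in as the next
      guess until the swap sticks.

        u64 atomic64_xchg(atomic64_t *ptr, u64 new_val)
        {
                u64 old_val = 0;        /* arbitrary starting guess */

                for (;;) {
                        u64 real_val = atomic64_cmpxchg(ptr, old_val, new_val);

                        if (real_val == old_val)
                                return old_val;  /* swap succeeded */
                        old_val = real_val;      /* retry with observed value */
                }
        }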
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 03 Jul 2009, 7 commits
    • x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPP · 8e049ef0
      Paul Mackerras authored
      Occasionally we get bugs where atomic_read or atomic_set are
      used on atomic64_t variables or vice versa.  These bugs don't
      generate warnings on x86 because atomic_read and atomic_set are
      coded as macros rather than C functions, so we don't get any
      type-checking on their arguments; similarly for atomic64_read
      and atomic64_set in 64-bit kernels.
      
      This converts them to C functions so that the arguments are
      type-checked and bugs like this will get caught more easily. It
      also converts atomic_cmpxchg and atomic_xchg, and
      atomic64_cmpxchg and atomic64_xchg on 64-bit, so we get
      type-checking on their arguments too.
      
      Compiling a typical 64-bit x86 config, this generates no new
      warnings, and the vmlinux text is 86 bytes smaller.
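      The converted form is essentially this (sketch):

        static inline int atomic_read(const atomic_t *v)
        {
                return v->counter;
        }

        static inline void atomic_set(atomic_t *v, int i)
        {
                v->counter = i;
        }

      Passing an atomic64_t * to either of these now draws an
      incompatible-pointer-type warning instead of silently compiling.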
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Remove unused function lapic_watchdog_ok() · c7210e1f
      Jaswinder Singh Rajput authored
      lapic_watchdog_ok() is a global function, but nothing uses it.
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <1246554335.2242.29.camel@jaswinder.satnam>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix fixmap page order for FIX_TEXT_POKE0,1 · 12b9d7cc
      Mathieu Desnoyers authored
      Masami reported:
      
      > Since the fixmap pages are assigned higher address to lower,
      > text_poke() has to use it with inverted order (FIX_TEXT_POKE1
      > to FIX_TEXT_POKE0).
      
      I prefer to just invert the order of the fixmap declaration.
      It's simpler and more straightforward.
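      A sketch of the inverted declaration: fixmap addresses grow
      downward, so declaring POKE1 first gives POKE0 the lower address.

        enum fixed_addresses {
                /* ... */
                FIX_TEXT_POKE1, /* reserve 2 pages for text_poke() */
                FIX_TEXT_POKE0, /* declared last so it gets the lower
                                   address; fixmaps are allocated backward */
                /* ... */
        };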
      
      Backward fixmaps seem to be used by both 32-bit and 64-bit x86.

      It's really rare but a nasty bug, because it only hurts when the
      instructions to patch cross a page boundary. If this happens, the
      fixmap write accesses will spill onto the following fixmap, which
      may very well crash the system. And even when it does not crash
      the system, it can leave illegal instructions in place.
      Thanks to Masami for finding this.
      
      It seems to have crept into the 2.6.30-rc series, so this calls
      for a -stable inclusion.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <20090701213722.GH19926@Krystal>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: atomic64: Make atomic_read() type-safe · 32171208
      Ingo Molnar authored
      Linus noticed that atomic64_xchg() uses atomic_read(), which
      happens to work because atomic_read() is a macro so the
      .counter value gets u64-read on 32-bit too - but this is really
      bogus and serious bugs are waiting to happen.
      
      Change atomic_read() to be a type-safe inline, and this exposes
      the atomic64 bogosity as well:
      
        arch/x86/lib/atomic64_32.c: In function ‘atomic64_xchg’:
        arch/x86/lib/atomic64_32.c:39: warning: passing argument 1 of ‘atomic_read’ from incompatible pointer type
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file · b7882b7c
      Ingo Molnar authored
      Linus noted that the atomic64_t primitives are currently all
      inlines, which is crazy because these functions have a large
      register footprint anyway.
      
      Move them to a separate file: arch/x86/lib/atomic64_32.c
      
      Also, while at it, rename all uses of 'unsigned long long' to
      the much shorter u64.
      
      This makes the appearance of the prototypes a lot nicer - and
      it also uncovered a few bugs where (yet unused) API variants
      had 'long' as their return type instead of u64.
      
      [ More intrusive changes are not yet done in this patch. ]
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too · bbf2a330
      Eric Dumazet authored
      Locked instructions on two cache lines at once are painful. If
      atomic64_t uses two cache lines, my test program is 10x slower.
      
      The chance of that is significant: 4/32, or 12.5%.

      Make sure an atomic64_t is 8-byte aligned.
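      The fix amounts to this (sketch, using the __aligned(8) form
      mentioned in the note below):

        typedef struct {
                u64 __aligned(8) counter;  /* never straddles a cache line */
        } atomic64_t;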
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      [ changed it to __aligned(8) as per Andrew's suggestion ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: fix power-of-2 round_up/round_down macros · 43644679
      Linus Torvalds authored
      These macros had two bugs:
       - the type of the mask was not correctly expanded to the full size of
         the argument being expanded, resulting in possible loss of high bits
         when mixing types.
       - the alignment argument was evaluated twice, despite the macro looking
         like a fancy function (but it really does need to be a macro, since
         it works on arbitrary integer types)
      
      Noticed by Peter Anvin, and with a fix that is a modification of his
      suggestion (bug noticed by Yinghai Lu).
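      The fixed macros look like this (sketch): the cast expands the mask
      to the full type of x, and the alignment y is expanded only once,
      inside __round_mask().

        #define __round_mask(x, y) ((__typeof__(x))((y)-1))
        #define round_up(x, y)     ((((x)-1) | __round_mask(x, y))+1)
        #define round_down(x, y)   ((x) & ~__round_mask(x, y))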
      
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 02 Jul 2009, 2 commits
    • perf_counter: Ignore the nmi call frames in the x86-64 backtraces · 0406ca6d
      Frederic Weisbecker authored
      Just about every callchain recorded with perf record includes the
      internal perfcounter NMI frames:
      
       perf_callchain
       perf_counter_overflow
       intel_pmu_handle_irq
       perf_counter_nmi_handler
       notifier_call_chain
       atomic_notifier_call_chain
       notify_die
       do_nmi
       nmi
      
      We want to ignore these frames as they're not interesting for
      instrumentation. To solve this, we simply ignore every frame from
      NMI context.
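      Roughly, the x86-64 stack walker is taught to remember when it is
      on the NMI stack and to drop addresses seen there (sketch; helper
      names are approximate):

        static DEFINE_PER_CPU(int, in_nmi_frame);

        static int backtrace_stack(void *data, char *name)
        {
                /* remember whether the walker just entered the NMI stack */
                per_cpu(in_nmi_frame, smp_processor_id()) =
                                x86_is_stack_id(NMI_STACK, name);
                return 0;
        }

        static void backtrace_address(void *data, unsigned long addr,
                                      int reliable)
        {
                struct perf_callchain_entry *entry = data;

                if (per_cpu(in_nmi_frame, smp_processor_id()))
                        return;         /* skip frames from NMI context */

                if (reliable)
                        callchain_store(entry, addr);
        }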
      
      New example of "perf report -s sym -c" after this patch:
      
      9.59%  [k] search_by_key
                   4.88%
                      search_by_key
                      reiserfs_read_locked_inode
                      reiserfs_iget
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      
                   3.19%
                      search_by_key
                      search_by_entry_key
                      reiserfs_find_entry
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      [...]
      
      For now this patch only solves the problem on x86-64.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Fix pci_unmap_addr() et al on i386. · 788d84bb
      David Woodhouse authored
      We can run a 32-bit kernel on boxes with an IOMMU, so we need
      pci_unmap_addr() etc. to work -- without it, drivers will leak mappings.
      
      To be honest, this whole thing looks like it's more pain than it's
      worth; I'm half inclined to remove the no-op #else case altogether.
      
      But this is the minimal fix, which just does the right thing if
      CONFIG_DMAR is set.
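      The minimal fix is essentially to widen the condition that selects
      the real definitions (sketch of the pattern, not a verbatim diff):

        #if defined(CONFIG_X86_64) || defined(CONFIG_DMAR) || \
            defined(CONFIG_DMA_API_DEBUG)

        #define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME)       dma_addr_t ADDR_NAME;
        #define pci_unmap_addr(PTR, ADDR_NAME)          ((PTR)->ADDR_NAME)
        #define pci_unmap_addr_set(PTR, ADDR_NAME, VAL) (((PTR)->ADDR_NAME) = (VAL))

        #else   /* no IOMMU: keep the no-op versions */

        #define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME)
        #define pci_unmap_addr(PTR, ADDR_NAME)          (0)
        #define pci_unmap_addr_set(PTR, ADDR_NAME, VAL) do { } while (0)

        #endif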
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Cc: stable@kernel.org  [ for 2.6.30 ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 01 Jul 2009, 2 commits
  12. 26 Jun 2009, 3 commits
  13. 25 Jun 2009, 1 commit
  14. 22 Jun 2009, 1 commit
    • x86: fix pageattr handling for lpage percpu allocator and re-enable it · e59a1bb2
      Tejun Heo authored
      The lpage allocator aliases a PMD page for each cpu and returns
      whatever is unused to the page allocator.  When the pageattr of the
      recycled pages is changed, this makes the two aliases point to
      overlapping regions with different attributes, which isn't allowed
      and is known to cause subtle data corruption in certain cases.
      
      This can be handled in a similar manner to the x86_64 highmap
      alias: the pageattr code should detect whether the target pages
      have a PMD alias, split that alias, and synchronize the attributes.
      
      pcpur allocator is updated to keep the allocated PMD pages map sorted
      in ascending address order and provide pcpu_lpage_remapped() function
      which binary searches the array to determine whether the given address
      is aliased and if so to which address.  pageattr is updated to use
      pcpu_lpage_remapped() to detect the PMD alias and split it up as
      necessary from cpa_process_alias().
      
      Jan Beulich spotted the original problem and incorrect usage of vaddr
      instead of laddr for lookup.
      
      With this, lpage percpu allocator should work correctly.  Re-enable
      it.
      
      [ Impact: fix subtle lpage pageattr bug and re-enable lpage ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Jan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
  15. 21 Jun 2009, 1 commit
    • x86, 64-bit: Clean up user address masking · 9063c61f
      Linus Torvalds authored
      The discussion about using "access_ok()" in get_user_pages_fast() (see
      commit 7f818906: "x86: don't use
      'access_ok()' as a range check in get_user_pages_fast()" for details and
      end result), made us notice that x86-64 was really being very sloppy
      about virtual address checking.
      
      So be way more careful and straightforward about masking x86-64 virtual
      addresses:
      
       - All the VIRTUAL_MASK* variants now cover half of the address
         space, it's not like we can use the full mask on a signed
         integer, and the larger mask just invites mistakes when
         applying it to either half of the 48-bit address space.
      
       - /proc/kcore's kc_offset_to_vaddr() becomes a lot more
         obvious when it transforms a file offset into a
         (kernel-half) virtual address.
      
       - Unify/simplify the 32-bit and 64-bit USER_DS definition to
         be based on TASK_SIZE_MAX.
      
      This cleanup and more careful/obvious user virtual address checking also
      uncovered a buglet in the x86-64 implementation of strnlen_user(): it
      would do an "access_ok()" check on the whole potential area, even if the
      string itself was much shorter, and thus return an error even for valid
      strings. Our sloppy checking had hidden this.
      
      So this fixes 'strnlen_user()' to do this properly, the same way we
      already handled user strings in 'strncpy_from_user()'.  Namely by just
      checking the first byte, and then relying on fault handling for the
      rest.  That always works, since we impose a guard page that cannot be
      mapped at the end of the user space address space (and even if we
      didn't, we'd have the address space hole).
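      The fix reduces to checking a single byte up front (sketch):

        long strnlen_user(const char __user *s, long n)
        {
                /* validate only the first byte; the guard page at the top
                 * of the user address space lets the fault handler catch
                 * any overrun beyond it */
                if (!access_ok(VERIFY_READ, s, 1))
                        return 0;
                return __strnlen_user(s, n);
        }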
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 19 Jun 2009, 2 commits
  17. 18 Jun 2009, 2 commits
  18. 17 Jun 2009, 4 commits
    • sched, x86: Fix cpufreq + sched_clock() TSC scaling · 84599f8a
      Peter Zijlstra authored
      For frequency-dependent TSCs we only scale the cycles; we do not
      account for the discrepancy in absolute value.
      
      Our current formula is: time = cycles * mult
      
      (where mult is a function of the cpu-speed on variable tsc machines)
      
      Suppose our current cycle count is 10, and we have a multiplier of 5,
      then our time value would end up being 50.
      
      Now cpufreq comes along and changes the multiplier to say 3 or 7,
      which would result in our time being resp. 30 or 70.
      
      That means that we can observe random jumps in the time value due to
      frequency changes in both fwd and bwd direction.
      
      So what this patch does is change the formula to:
      
        time = cycles * frequency + offset
      
      And we calculate offset so that time_before == time_after, thereby
      ridding us of these jumps in time.
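      A simplified sketch of the bookkeeping at a frequency change (the
      idea only, not the exact tsc.c code):

        static u64 mult;        /* current cycles -> ns multiplier */
        static u64 offset;      /* keeps the clock continuous */

        static u64 sched_clock_ns(u64 cycles)
        {
                return cycles * mult + offset;
        }

        static void on_freq_change(u64 cycles_now, u64 new_mult)
        {
                /* pick offset so that time_before == time_after: */
                offset = sched_clock_ns(cycles_now) - cycles_now * new_mult;
                mult = new_mult;
        }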
      
      [ Impact: fix/reduce sched_clock() jumps across frequency changing events ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Chucked-on-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • kmap_types: make most arches use generic header file · e4c9dd0f
      Randy Dunlap authored
      Convert most arches to use asm-generic/kmap_types.h.
      
      Move the KM_FENCE_ macro additions into asm-generic/kmap_types.h,
      controlled by __WITH_KM_FENCE from each arch's kmap_types.h file.
      
      Would be nice to be able to add custom KM_types per arch, but I don't yet
      see a nice, clean way to do that.
      
      Built on x86_64, i386, mips, sparc, alpha(tonyb), powerpc(tonyb), and
      68k(tonyb).
      
      Note: avr32 should be able to remove KM_PTE2 (since it's not used) and
      then just use the generic kmap_types.h file.  Get avr32 maintainer
      approval.
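      After the conversion an arch header shrinks to roughly this (x86
      shown as a sketch):

        /* arch/x86/include/asm/kmap_types.h (sketch) */
        #ifndef _ASM_X86_KMAP_TYPES_H
        #define _ASM_X86_KMAP_TYPES_H

        #if defined(CONFIG_X86_32) && defined(CONFIG_DEBUG_HIGHMEM)
        #define  __WITH_KM_FENCE
        #endif

        #include <asm-generic/kmap_types.h>

        #undef __WITH_KM_FENCE

        #endif /* _ASM_X86_KMAP_TYPES_H */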
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: <linux-arch@vger.kernel.org>
      Acked-by: Mike Frysinger <vapier@gentoo.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Bryan Wu <cooloney@kernel.org>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: "Luck Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • time: move PIT_TICK_RATE to linux/timex.h · 08604bd9
      Arnd Bergmann authored
      PIT_TICK_RATE is currently defined in four architectures, but in three
      different places.  While linux/timex.h is not the perfect place for it, it
      is still a reasonable replacement for those drivers that traditionally use
      asm/timex.h to get CLOCK_TICK_RATE and expect it to be the PIT frequency.
      
      Note that for Alpha, the actual value changed from 1193182UL to
      1193180UL.  This is unlikely to make a difference, and probably can
      only improve accuracy.  There was a discussion on the correct value
      of CLOCK_TICK_RATE a few years ago, after which every existing
      instance was changed to 1193182.  According to the specification,
      it should be 1193181.818181...
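      For reference, the consolidated definition is a one-liner (sketch).
      The fractional value follows from the PIT being clocked at one
      twelfth of four times the NTSC colour-burst frequency:
      4 x (315/88) MHz / 12 = 1193181.8181... Hz.

        /* include/linux/timex.h (sketch) */
        #define PIT_TICK_RATE 1193182ul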
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Dmitry Torokhov <dtor@mail.ru>
      Cc: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86, mce: mce.h cleanup · 58995d2d
      Hidetoshi Seto authored
      Reorder definitions.
      
       - provide a static inline dummy mcheck_init() for !CONFIG_X86_MCE
       - gather the definitions for the exception and threshold handlers
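      The !CONFIG_X86_MCE dummy follows the usual pattern (sketch):

        #ifdef CONFIG_X86_MCE
        extern void mcheck_init(struct cpuinfo_x86 *c);
        #else
        static inline void mcheck_init(struct cpuinfo_x86 *c) {}
        #endif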
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>