1. 12 Nov, 2009 1 commit
    • percpu: restructure pcpu_extend_area_map() to fix bugs and improve readability · 833af842
      Committed by Tejun Heo
      pcpu_extend_area_map() had the following two bugs.
      
      * It should return 1 if pcpu_lock was dropped and reacquired, but it
        returned 0.  This could lead to an oops if free_percpu() races with
        area map extension.
      
      * pcpu_mem_free() was called under pcpu_lock.  pcpu_mem_free() might
        end up calling vfree() which isn't IRQ safe.  This could lead to
        deadlock through lock order inversion via IRQ.
      
      In addition, Linus pointed out that the temporary lock dropping and
      subtle three-way return value of pcpu_extend_area_map() were very ugly
      and suggested splitting the function into two: pcpu_need_to_extend()
      and pcpu_extend_area_map().
      
      This patch restructures pcpu_extend_area_map() as suggested and fixes
      the two bugs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
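The split described in the commit above can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct area_map`, `map_need_to_extend()`, `map_extend()`, the pthread mutex standing in for pcpu_lock, and the headroom and default-size constants are all assumptions made for illustration. The check runs with the lock held, while allocation and freeing happen with the lock released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAP_DEFAULT_ALLOC 16   /* hypothetical initial map size */
#define MAP_HEADROOM       2   /* hypothetical spare entries required */

/* Hypothetical stand-in for a chunk's area map; not the kernel structure. */
struct area_map {
    pthread_mutex_t lock;
    int *map;
    size_t used;    /* entries in use */
    size_t alloc;   /* entries allocated */
};

/* Call with m->lock held.  Returns the size to grow to, or 0 if there is room. */
size_t map_need_to_extend(const struct area_map *m)
{
    size_t new_alloc = m->alloc ? m->alloc : MAP_DEFAULT_ALLOC;

    if (m->alloc >= m->used + MAP_HEADROOM)
        return 0;
    while (new_alloc < m->used + MAP_HEADROOM)
        new_alloc *= 2;
    return new_alloc;
}

/* Call WITHOUT m->lock held.  Returns 0 on success, -1 if allocation fails. */
int map_extend(struct area_map *m, size_t new_alloc)
{
    int *new_map, *old_map = NULL;

    assert(new_alloc > 0);
    new_map = calloc(new_alloc, sizeof(*new_map));   /* allocate unlocked */
    if (!new_map)
        return -1;

    pthread_mutex_lock(&m->lock);
    if (new_alloc > m->alloc) {          /* nobody extended it meanwhile */
        if (m->used)
            memcpy(new_map, m->map, m->used * sizeof(*new_map));
        old_map = m->map;
        m->map = new_map;
        m->alloc = new_alloc;
        new_map = NULL;
    }
    pthread_mutex_unlock(&m->lock);

    /* Free outside the lock: the old map, or the unused new one. */
    free(old_map);
    free(new_map);
    return 0;
}
```

A caller would take the lock, call `map_need_to_extend()`, drop the lock if the result is nonzero, call `map_extend()`, and then retry its operation from the start, since the map may have changed while the lock was released.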
  2. 28 Oct, 2009 2 commits
    • sched: move rq_weight data array out of .percpu · 4a6cc4bd
      Committed by Jiri Kosina
      Commit 34d76c41 introduced the percpu array update_shares_data, whose
      size is proportional to NR_CPUS. Unfortunately, this breaks ia64 for large
      NR_CPUS configurations, as ia64 allows only 64k for the .percpu section.
      
      Fix this by allocating the array dynamically and keeping only a percpu
      pointer to it.
      
      The per-cpu handling doesn't impose a significant performance penalty on
      the potentially contended path in tg_shares_up().
      
      Before:
      
      ...
      ffffffff8104337c:       65 48 8b 14 25 20 cd    mov    %gs:0xcd20,%rdx
      ffffffff81043383:       00 00
      ffffffff81043385:       48 c7 c0 00 e1 00 00    mov    $0xe100,%rax
      ffffffff8104338c:       48 c7 45 a0 00 00 00    movq   $0x0,-0x60(%rbp)
      ffffffff81043393:       00
      ffffffff81043394:       48 c7 45 a8 00 00 00    movq   $0x0,-0x58(%rbp)
      ffffffff8104339b:       00
      ffffffff8104339c:       48 01 d0                add    %rdx,%rax
      ffffffff8104339f:       49 8d 94 24 08 01 00    lea    0x108(%r12),%rdx
      ffffffff810433a6:       00
      ffffffff810433a7:       b9 ff ff ff ff          mov    $0xffffffff,%ecx
      ffffffff810433ac:       48 89 45 b0             mov    %rax,-0x50(%rbp)
      ffffffff810433b0:       bb 00 04 00 00          mov    $0x400,%ebx
      ffffffff810433b5:       48 89 55 c0             mov    %rdx,-0x40(%rbp)
      ...
      
      After:
      
      ...
      ffffffff8104337c:       65 8b 04 25 28 cd 00    mov    %gs:0xcd28,%eax
      ffffffff81043383:       00
      ffffffff81043384:       48 98                   cltq
      ffffffff81043386:       49 8d bc 24 08 01 00    lea    0x108(%r12),%rdi
      ffffffff8104338d:       00
      ffffffff8104338e:       48 8b 15 d3 7f 76 00    mov    0x767fd3(%rip),%rdx        # ffffffff817ab368 <update_shares_data>
      ffffffff81043395:       48 8b 34 c5 00 ee 6d    mov    -0x7e921200(,%rax,8),%rsi
      ffffffff8104339c:       81
      ffffffff8104339d:       48 c7 45 a0 00 00 00    movq   $0x0,-0x60(%rbp)
      ffffffff810433a4:       00
      ffffffff810433a5:       b9 ff ff ff ff          mov    $0xffffffff,%ecx
      ffffffff810433aa:       48 89 7d c0             mov    %rdi,-0x40(%rbp)
      ffffffff810433ae:       48 c7 45 a8 00 00 00    movq   $0x0,-0x58(%rbp)
      ffffffff810433b5:       00
      ffffffff810433b6:       bb 00 04 00 00          mov    $0x400,%ebx
      ffffffff810433bb:       48 01 f2                add    %rsi,%rdx
      ffffffff810433be:       48 89 55 b0             mov    %rdx,-0x50(%rbp)
      ...
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Tejun Heo <tj@kernel.org>
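The idea behind the commit above, replacing a statically sized per-CPU array with runtime-sized storage reached through a pointer, can be sketched in plain C. This is only an illustration under stated assumptions: `calloc()` stands in for the kernel's percpu allocator, and `rq_weight_data`, `rq_weight_init()` and `rq_weight_for_cpu()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Conceptually, "before" was a static array per CPU, so total static
 * storage grew with NR_CPUS * NR_CPUS:
 *
 *     static unsigned long rq_weight[NR_CPUS];   (one copy per CPU)
 *
 * "After": only a pointer is static; the storage is sized at runtime.
 */
static unsigned long *rq_weight_data;   /* hypothetical name */
static size_t rq_weight_cpus;

/* Allocate nr_cpus slots for each of nr_cpus CPUs.  Returns 0 or -1. */
int rq_weight_init(size_t nr_cpus)
{
    unsigned long *data;

    /* Reject zero and sizes whose byte count would overflow size_t. */
    if (nr_cpus == 0 || nr_cpus > SIZE_MAX / sizeof(*data) / nr_cpus)
        return -1;

    data = calloc(nr_cpus * nr_cpus, sizeof(*data));
    if (!data)
        return -1;

    free(rq_weight_data);
    rq_weight_data = data;
    rq_weight_cpus = nr_cpus;
    return 0;
}

/* Return the slot array belonging to the given CPU. */
unsigned long *rq_weight_for_cpu(size_t cpu)
{
    assert(rq_weight_data != NULL && cpu < rq_weight_cpus);
    return rq_weight_data + cpu * rq_weight_cpus;
}
```

The extra pointer load and offset addition in `rq_weight_for_cpu()` loosely correspond to the additional instructions visible in the "After" disassembly.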
    • percpu: allow pcpu_alloc() to be called with IRQs off · 403a91b1
      Committed by Jiri Kosina
      pcpu_alloc() and pcpu_extend_area_map() perform a series of
      spin_lock_irq()/spin_unlock_irq() calls, which makes them unsafe
      to call from contexts which have IRQs off.
      
      This patch converts the code to save and restore the IRQ flags instead,
      allowing pcpu_alloc() (or __alloc_percpu() respectively) to be called
      from the early kernel startup stage, where IRQs are off.
      
      This is needed for proper initialization of per-cpu rq_weight data from
      sched_init().
      
      tj: added comment explaining why irqsave/restore is used in alloc path.
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Tejun Heo <tj@kernel.org>
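Why the unconditional lock/unlock pair is a problem for callers that already have IRQs off can be shown with a toy model. Nothing here is a kernel API: a single boolean stands in for the CPU interrupt flag, and all `toy_*` and `alloc_with_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;   /* toy model of the CPU interrupt flag */

/* Models the spin_lock_irq() style: disable on lock, always enable on unlock. */
static void toy_lock_irq(void)   { irqs_enabled = false; }
static void toy_unlock_irq(void) { irqs_enabled = true; }

/* Models the irqsave style: remember the previous state and put it back. */
static void toy_lock_irqsave(bool *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
}

static void toy_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}

void alloc_with_lock_irq(void)
{
    toy_lock_irq();
    assert(!irqs_enabled);      /* critical section runs with IRQs masked */
    toy_unlock_irq();           /* enables IRQs even if the caller had them off */
}

void alloc_with_irqsave(void)
{
    bool flags;

    toy_lock_irqsave(&flags);
    assert(!irqs_enabled);
    toy_unlock_irqrestore(flags);   /* caller's IRQ state is preserved */
}

/* Run alloc_fn with the given starting IRQ state; report the state afterwards. */
bool irqs_on_after(void (*alloc_fn)(void), bool irqs_on_before)
{
    irqs_enabled = irqs_on_before;
    alloc_fn();
    return irqs_enabled;
}
```

In this model, `irqs_on_after(alloc_with_lock_irq, false)` reports IRQs as enabled on return, which is the behavior that makes the original code unsafe to call during early boot, while the save/restore variant leaves the caller's state untouched.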
  3. 12 Oct, 2009 1 commit
  4. 29 Sep, 2009 6 commits
    • percpu: make allocation failures more verbose · f2badb0c
      Committed by Tejun Heo
      Warn and dump stack when percpu allocation fails.  The percpu allocator
      is still young, and unchecked NULL percpu pointer usage can result in
      random memory corruption when combined with the pointer shifting in
      access macros.  Allocation failures should be rare, and the warning
      message will be disabled after it has been printed a certain number
      of times.
      Signed-off-by: Tejun Heo <tj@kernel.org>
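A warning that disables itself after a fixed number of reports, as the commit above describes, can be sketched as follows. The budget of 10, the message text and the function name `warn_alloc_failure()` are assumptions for illustration; the kernel version also dumps the stack, which this userspace sketch does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int warn_limit = 10;   /* hypothetical budget of warnings */

/* Report a failed allocation.  Returns true if a warning was printed. */
bool warn_alloc_failure(size_t size, size_t align)
{
    if (warn_limit <= 0)
        return false;           /* warnings already disabled */

    fprintf(stderr, "percpu sketch: allocation failed, size=%zu align=%zu\n",
            size, align);
    if (--warn_limit == 0)
        fprintf(stderr, "percpu sketch: limit reached, disabling warning\n");
    return true;
}
```

Calling `warn_alloc_failure()` repeatedly prints at most the budgeted number of messages and then stays silent, so a persistent failure cannot flood the log.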
    • percpu: make pcpu_setup_first_chunk() failures more verbose · 635b75fc
      Committed by Tejun Heo
      The parameters to pcpu_setup_first_chunk() come from different sources
      depending on architecture and can be quite complex.  The function runs
      various sanity checks on the parameters and triggers BUG() if
      something isn't right.  However, this is very early during the boot
      and not reporting exactly what the problem is makes debugging even
      harder.
      
      Add a PCPU_SETUP_BUG() macro which prints out enough information about
      the parameters.  As the macro still places a separate BUG() for each
      check, no information is lost even in situations where only
      the program counter can be retrieved.
      
      While at it, also bump pcpu_dump_alloc_info() message to KERN_INFO so
      that it's visible on the console if boot fails to complete.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: make embedding first chunk allocator check vmalloc space size · 6ea529a2
      Committed by Tejun Heo
      The embedding first chunk allocator maintains the distances between units
      in the vmalloc area and thus needs vmalloc space to be larger than the
      maximum distance between units; otherwise, it wouldn't be able to
      create any dynamic chunks.  This patch makes the embedding first chunk
      allocator check the vmalloc space size and, if the maximum distance
      between units is larger than 75% of it, print a warning and, if the page
      mapping allocator is available, fail initialization so that the system
      falls back onto it.
      
      This should work around percpu allocation failure problems on certain
      sparc64 configurations where distances between NUMA nodes are larger
      than the vmalloc area, and make the percpu allocator more robust for
      future configurations.
      Signed-off-by: Tejun Heo <tj@kernel.org>
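The 75% rule from the commit above reduces to a single comparison. The sketch below is an assumption-laden illustration: `embed_fits_vmalloc()` is a hypothetical name, and the division is done before the multiplication to avoid overflow, which rounds the threshold down slightly for sizes that are not a multiple of four.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the largest distance between units is at most 75% of the
 * vmalloc space, i.e. the embedding allocator may be used.  A false result
 * corresponds to the "warn and fall back" case in the commit description.
 */
bool embed_fits_vmalloc(size_t max_distance, size_t vmalloc_total)
{
    size_t limit = vmalloc_total / 4 * 3;   /* 75%, computed overflow-safely */

    return max_distance <= limit;
}
```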
    • sparc64: implement page mapping percpu first chunk allocator · a70c6913
      Committed by Tejun Heo
      Implement the page mapping percpu first chunk allocator as a fallback to
      the embedding allocator.  The next patch will make the embedding
      allocator check distances between units to determine whether it fits
      within the vmalloc area so that this fallback can be used in such
      cases.
      
      sparc64 currently has a relatively small vmalloc area, which makes it
      impossible to create any dynamic chunks on certain configurations,
      leading to percpu allocation failures.  This and the next patch should
      allow those configurations to keep working until a proper solution is
      found.
      
      While at it, mark pcpu_cpu_distance() with __init.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David S. Miller <davem@davemloft.net>
    • percpu: make pcpu_build_alloc_info() clear static buffers · fb59e72e
      Committed by Tejun Heo
      pcpu_build_alloc_info() may be called multiple times when percpu is
      falling back to a different first chunk allocator.  Make it clear its
      static buffers so that they don't contain values from previous runs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
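The hazard addressed by the commit above, a function with static scratch buffers being called more than once, can be sketched with a small counting routine. `count_cpus_per_group()` and `SKETCH_MAX_GROUPS` are hypothetical; the point is only the `memset()` at the top, without which a second call would add to the counts left by the first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_GROUPS 8   /* hypothetical upper bound on groups */

/*
 * Count how many CPUs fall into each group.  The result lives in a static
 * buffer, so it must be cleared on every call to stay correct when the
 * function runs again with different input.
 */
const int *count_cpus_per_group(const int *cpu_group, size_t nr_cpus)
{
    static int group_cnt[SKETCH_MAX_GROUPS];

    memset(group_cnt, 0, sizeof(group_cnt));   /* drop values from earlier runs */

    for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
        assert(cpu_group[cpu] >= 0 && cpu_group[cpu] < SKETCH_MAX_GROUPS);
        group_cnt[cpu_group[cpu]]++;
    }
    return group_cnt;
}
```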
    • percpu: fix unit_map[] verification in pcpu_setup_first_chunk() · ffe0d5a5
      Committed by Tejun Heo
      pcpu_setup_first_chunk() incorrectly used NR_CPUS as the impossible
      unit number, although a unit number can equal or exceed NR_CPUS with a
      sparse unit map.  This triggers BUG_ON() spuriously on machines which
      have a non-power-of-two number of CPUs.  Use UINT_MAX instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-and-tested-by: Tony Vroon <tony@linx.net>
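The sentinel problem fixed by the commit above can be reproduced in a few lines. The sketch assumes a hypothetical layout in which six CPUs are spread over two groups padded to four units each, giving unit numbers 0, 1, 2, 4, 5 and 6; `all_cpus_mapped()` is an illustrative helper, not a kernel function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if no CPU still carries the "unassigned" sentinel.  If the
 * sentinel is a value that a real unit number can take, a valid map is
 * wrongly reported as incomplete.
 */
bool all_cpus_mapped(const unsigned int *unit_map, size_t nr_cpus,
                     unsigned int unassigned)
{
    for (size_t cpu = 0; cpu < nr_cpus; cpu++)
        if (unit_map[cpu] == unassigned)
            return false;
    return true;
}
```

With the map `{0, 1, 2, 4, 5, 6}` and six CPUs, using 6 (the CPU count) as the sentinel makes the check fail because the last CPU legitimately sits in unit 6, whereas using `UINT_MAX` accepts the map.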
  5. 28 Sep, 2009 8 commits
  6. 27 Sep, 2009 22 commits