1. 03 9月, 2014 5 次提交
    • T
      percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated · b539b87f
      Tejun Heo 提交于
      pcpu_nr_empty_pop_pages counts the number of empty populated pages
      across all chunks and chunk->nr_populated counts the number of
      populated pages in a chunk.  Both will be used to implement pre/async
      population for atomic allocations.
      
      pcpu_chunk_[de]populated() are added to update chunk->populated,
      chunk->nr_populated and pcpu_nr_empty_pop_pages together.  All
      successful chunk [de]populations should be followed by the
      corresponding pcpu_chunk_[de]populated() calls.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      b539b87f
    • T
      percpu: restructure locking · b38d08f3
      Tejun Heo 提交于
      At first, the percpu allocator required a sleepable context for both
      alloc and free paths and used pcpu_alloc_mutex to protect everything.
      Later, pcpu_lock was introduced to protect the index data structure so
      that the free path can be invoked from atomic contexts.  The
      conversion only updated what's necessary and left most of the
      allocation path under pcpu_alloc_mutex.
      
      The percpu allocator is planned to add support for atomic allocation
      and this patch restructures locking so that the coverage of
      pcpu_alloc_mutex is further reduced.
      
      * pcpu_alloc() now grab pcpu_alloc_mutex only while creating a new
        chunk and populating the allocated area.  Everything else is now
        protected soley by pcpu_lock.
      
        After this change, multiple instances of pcpu_extend_area_map() may
        race but the function already implements sufficient synchronization
        using pcpu_lock.
      
        This also allows multiple allocators to arrive at new chunk
        creation.  To avoid creating multiple empty chunks back-to-back, a
        new chunk is created iff there is no other empty chunk after
        grabbing pcpu_alloc_mutex.
      
      * pcpu_lock is now held while modifying chunk->populated bitmap.
        After this, all data structures are protected by pcpu_lock.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      b38d08f3
    • T
      percpu: make percpu-km set chunk->populated bitmap properly · a63d4ac4
      Tejun Heo 提交于
      percpu-km instantiates the whole chunk on creation and doesn't make
      use of chunk->populated bitmap and leaves it as zero.  While this
      currently doesn't cause any problem, the inconsistency makes it
      difficult to build further logic on top of chunk->populated.  This
      patch makes percpu-km fill chunk->populated on creation so that the
      bitmap is always consistent.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NChristoph Lameter <cl@linux.com>
      a63d4ac4
    • T
      percpu: move region iterations out of pcpu_[de]populate_chunk() · a93ace48
      Tejun Heo 提交于
      Previously, pcpu_[de]populate_chunk() were called with the range which
      may contain multiple target regions in it and
      pcpu_[de]populate_chunk() iterated over the regions.  This has the
      benefit of batching up cache flushes for all the regions; however,
      we're planning to add more bookkeeping logic around [de]population to
      support atomic allocations and this delegation of iterations gets in
      the way.
      
      This patch moves the region iterations out of
      pcpu_[de]populate_chunk() into its callers - pcpu_alloc() and
      pcpu_reclaim() - so that we can later add logic to track more states
      around them.  This change may make cache and tlb flushes more frequent
      but multi-region [de]populations are rare anyway and if this actually
      becomes a problem, it's not difficult to factor out cache flushes as
      separate callbacks which are directly invoked from percpu.c.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      a93ace48
    • T
      percpu: move common parts out of pcpu_[de]populate_chunk() · dca49645
      Tejun Heo 提交于
      percpu-vm and percpu-km implement separate versions of
      pcpu_[de]populate_chunk() and some part which is or should be common
      are currently in the specific implementations.  Make the following
      changes.
      
      * Allocate area clearing is moved from the pcpu_populate_chunk()
        implementations to pcpu_alloc().  This makes percpu-km's version
        noop.
      
      * Quick exit tests in pcpu_[de]populate_chunk() of percpu-vm are moved
        to their respective callers so that they are applied to percpu-km
        too.  This doesn't make any meaningful difference as both functions
        are noop for percpu-km; however, this is more consistent and will
        help implementing atomic allocation support.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      dca49645
  2. 02 10月, 2010 2 次提交
    • T
      percpu: clear memory allocated with the km allocator · ed6c1115
      Tejun Heo 提交于
      Percpu allocator should clear memory before returning it but the km
      allocator forgot to do it.  Fix it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      ed6c1115
    • T
      percpu: use percpu allocator on UP too · 9b8327bb
      Tejun Heo 提交于
      On UP, percpu allocations were redirected to kmalloc.  This has the
      following problems.
      
      * For certain amount of allocations (determined by
        PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), percpu
        allocator can be used before the usual kernel memory allocator is
        brought online.  On SMP, this is used to initialize the kernel
        memory allocator.
      
      * percpu allocator honors alignment upto PAGE_SIZE but kmalloc()
        doesn't.  For example, workqueue makes use of larger alignments for
        cpu_workqueues.
      
      Currently, users of percpu allocators need to handle UP differently,
      which is somewhat fragile and ugly.  Other than small amount of
      memory, there isn't much to lose by enabling percpu allocator on UP.
      It can simply use kernel memory based chunk allocation which was added
      for SMP archs w/o MMUs.
      
      This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and
      makes UP build use percpu-km.  As percpu addresses and kernel
      addresses are always identity mapped and static percpu variables don't
      need any special treatment, nothing is arch dependent and mm/percpu.c
      implements generic setup_per_cpu_areas() for UP.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      9b8327bb
  3. 10 9月, 2010 1 次提交
  4. 08 9月, 2010 1 次提交
    • T
      percpu: use percpu allocator on UP too · bbddff05
      Tejun Heo 提交于
      On UP, percpu allocations were redirected to kmalloc.  This has the
      following problems.
      
      * For certain amount of allocations (determined by
        PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), percpu
        allocator can be used before the usual kernel memory allocator is
        brought online.  On SMP, this is used to initialize the kernel
        memory allocator.
      
      * percpu allocator honors alignment upto PAGE_SIZE but kmalloc()
        doesn't.  For example, workqueue makes use of larger alignments for
        cpu_workqueues.
      
      Currently, users of percpu allocators need to handle UP differently,
      which is somewhat fragile and ugly.  Other than small amount of
      memory, there isn't much to lose by enabling percpu allocator on UP.
      It can simply use kernel memory based chunk allocation which was added
      for SMP archs w/o MMUs.
      
      This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and
      makes UP build use percpu-km.  As percpu addresses and kernel
      addresses are always identity mapped and static percpu variables don't
      need any special treatment, nothing is arch dependent and mm/percpu.c
      implements generic setup_per_cpu_areas() for UP.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
      Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
      bbddff05
  5. 01 5月, 2010 1 次提交
    • T
      percpu: implement kernel memory based chunk allocation · b0c9778b
      Tejun Heo 提交于
      Implement an alternate percpu chunk management based on kernel memeory
      for nommu SMP architectures.  Instead of mapping into vmalloc area,
      chunks are allocated as a contiguous kernel memory using
      alloc_pages().  As such, percpu allocator on nommu will have the
      following restrictions.
      
      * It can't fill chunks on-demand page-by-page.  It has to allocate
        each chunk fully upfront.
      
      * It can't support sparse chunk for NUMA configurations.  SMP w/o mmu
        is crazy enough.  Let's hope no one does NUMA w/o mmu.  :-P
      
      * If chunk size isn't power-of-two multiple of PAGE_SIZE, the
        unaligned amount will be wasted on each chunk.  So, archs which use
        this better align chunk size.
      
      For instructions on how to use this, read the comment on top of
      mm/percpu-km.c.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NDavid Howells <dhowells@redhat.com>
      Cc: Graff Yang <graff.yang@gmail.com>
      Cc: Sonic Zhang <sonic.adi@gmail.com>
      b0c9778b