1. 27 Sep, 2018 1 commit
  2. 20 Mar, 2018 1 commit
    • T
      percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods · b3a5d111
      Tejun Heo committed
      percpu_ref internally uses sched-RCU to implement the percpu -> atomic
      mode switching and the documentation suggested that this could be
      depended upon.  This doesn't seem like a good idea.
      
      * percpu_ref uses sched-RCU which has different grace periods from regular
        RCU.  Users may combine percpu_ref with regular RCU usage and
        incorrectly believe that regular RCU grace periods are performed by
        percpu_ref.  This can lead to, for example, use-after-free due to
        premature freeing.
      
      * percpu_ref has a grace period when switching from percpu to atomic
        mode.  It doesn't have one between the last put and release.  This
        distinction is subtle and can lead to surprising bugs.
      
      * percpu_ref allows starting in and switching to atomic mode manually
        for debugging and other purposes.  This means that there may not be
        any grace periods from kill to release.
      
      This patch makes it clear that the grace periods are percpu_ref's
      internal implementation detail and can't be depended upon by the
      users.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      b3a5d111
  3. 05 Dec, 2017 1 commit
  4. 23 Mar, 2017 1 commit
  5. 12 Aug, 2016 1 commit
    • R
      percpu-refcount: init ->confirm_switch member properly · a67823c1
      Roman Pen committed
      This patch targets two things which are related to ->confirm_switch:
      
       1. Init ->confirm_switch pointer with NULL on percpu_ref_init() or
          kernel frightfully complains with WARN_ON_ONCE(ref->confirm_switch)
          at __percpu_ref_switch_to_atomic if memory chunk was not properly
          zeroed.
      
       2. Warn if RCU callback is still in progress on percpu_ref_exit().
          The race still exists, because percpu_ref_call_confirm_rcu()
          drops ->confirm_switch to NULL early, but that is only a warning
          and still the caller is responsible that ref is no longer in
          active use.  Hopefully that can help to catch incorrect usage
          of percpu-refcount.
      Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Tejun Heo <tj@kernel.org>
      a67823c1
  6. 11 Aug, 2016 5 commits
    • T
      percpu_ref: allow operation mode switching operations to be called concurrently · 33e465ce
      Tejun Heo committed
      percpu_ref initially didn't have explicit mode switching operations.
      It started out in percpu mode and switched to atomic mode on kill and
      then released.  Ensuring that kill operation is initiated only after
      init completes was naturally the caller's responsibility.
      
      percpu_ref_reinit() was introduced later but it didn't shift the
      synchronization responsibility.  Reinit can't be performed until kill
      is confirmed, so there was nothing to worry about
      synchronization-wise.  Also, as both reinit and kill manipulate the
      base reference, invocations of the same function couldn't be allowed
      to race each other.
      
      The latest additions of percpu_ref_switch_to_atomic/percpu() changed
      the situation.  These two functions can be called any time as long as
      the percpu_ref is between init and exit and thus there are valid
      usage scenarios where these new functions race with each other or
      against reinit/kill.  Mostly from inertia, f47ad457 ("percpu_ref:
      decouple switching to percpu mode and reinit") still left
      synchronization among percpu mode switching operations to its users.
      
      That the new switch functions can be freely mixed with kill/reinit but
      the operations themselves should be synchronized is too subtle a
      requirement and led to a very subtle race condition in blk-mq freezing
      path.
      
      This patch fixes the situation by introducing percpu_ref_switch_lock
      to protect mode switching operations.  This ensures that percpu-ref
      users don't have to worry about mode changing operations racing
      against each other, e.g. switch_to_percpu against kill, as long as the
      sequence of operations is valid.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Akinobu Mita <akinobu.mita@gmail.com>
      Link: http://lkml.kernel.org/g/1443287365-4244-7-git-send-email-akinobu.mita@gmail.com
      Fixes: f47ad457 ("percpu_ref: decouple switching to percpu mode and reinit")
      33e465ce
    • T
      percpu_ref: restructure operation mode switching · 3f49bdd9
      Tejun Heo committed
      Restructure atomic/percpu mode switching.
      
      * The users of __percpu_ref_switch_to_atomic/percpu() now call a new
        function __percpu_ref_switch_mode() which calls either of the
        original switching functions depending on the current state of
        ref->force_atomic and the __PERCPU_REF_DEAD flag.  The callers no
        longer check whether switching is necessary but always invoke
        __percpu_ref_switch_mode().
      
      * !ref->confirm_switch waiting is collected into
        __percpu_ref_switch_mode().
      
      This patch doesn't cause any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      3f49bdd9
    • T
      percpu_ref: unify staggered atomic switching wait behavior · 18808354
      Tejun Heo committed
      When an atomic or percpu switching starts before the previous atomic
      switching finishes, the taken behaviors are
      
      * If the new atomic switching has confirmation callback, it waits
        for the previous atomic switching to complete.
      
      * If the new percpu switching is the first percpu switching following
        the previous atomic switching, it waits for the previous atomic
        switching to complete.
      
      No percpu_ref user depends on these subtleties.  The only meaningful
      part is that, if the caller ensures that atomic switching isn't in
      progress, mode switching operations can be issued from any context.
      
      This patch pulls the wait logic to the top of both switching functions
      so that they always wait for the previous atomic switching to
      complete.  This makes the behavior simpler and consistent for both
      directions and will help allowing concurrent invocations of mode
      switching functions.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      18808354
    • T
      percpu_ref: reorganize __percpu_ref_switch_to_atomic() and relocate percpu_ref_switch_to_atomic() · b2302c7f
      Tejun Heo committed
      Reorganize __percpu_ref_switch_to_atomic() so that it looks
      structurally similar to __percpu_ref_switch_to_percpu() and relocate
      percpu_ref_switch_to_atomic so that the two internal functions are
      co-located.
      
      This patch doesn't introduce any functional differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      b2302c7f
    • T
      percpu_ref: remove unnecessary RCU grace period for staggered atomic switching confirmation · a2f5630c
      Tejun Heo committed
      At the beginning, percpu_ref guaranteed a RCU grace period between a
      call to percpu_ref_kill_and_confirm() and the invocation of the
      confirmation callback.  This guarantee exposed internal implementation
      details and got rescinded while switching over to sched RCU; however,
      __percpu_ref_switch_to_atomic() still inserts a full sched RCU grace
      period even when it can simply wait for the previous attempt.
      
      Remove the unnecessary grace period and perform the confirmation
      synchronously for staggered atomic switching attempts.  Update
      comments accordingly.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      a2f5630c
  7. 15 Feb, 2016 1 commit
  8. 25 Sep, 2014 10 commits
    • T
      percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky · 1cae13e7
      Tejun Heo committed
      Currently, a percpu_ref which is initialized with
      PERCPU_REF_INIT_ATOMIC or switched to atomic mode via
      switch_to_atomic() automatically reverts to percpu mode on the first
      percpu_ref_reinit().  This makes the atomic mode difficult to use for
      cases where a percpu_ref is used as a persistent on/off switch which
      may be cycled multiple times.
      
      This patch makes such atomic state sticky so that it survives through
      kill/reinit cycles.  After this patch, atomic state is cleared only by
      an explicit percpu_ref_switch_to_percpu() call.
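The sticky behavior described above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: `toy_ref` and every `toy_*` helper are hypothetical names, not the kernel's implementation, and the model tracks nothing beyond the two pieces of state the message discusses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct percpu_ref. */
struct toy_ref {
	bool force_atomic;	/* set by INIT_ATOMIC or switch_to_atomic() */
	bool dead;		/* set by kill, cleared by reinit */
};

static void toy_init(struct toy_ref *ref, bool init_atomic)
{
	ref->force_atomic = init_atomic;
	ref->dead = false;
}

static void toy_switch_to_atomic(struct toy_ref *ref)
{
	ref->force_atomic = true;
}

/* The only operation in this model that clears the sticky atomic state. */
static void toy_switch_to_percpu(struct toy_ref *ref)
{
	ref->force_atomic = false;
}

static void toy_kill(struct toy_ref *ref)
{
	assert(!ref->dead);
	ref->dead = true;
}

/* Revives the ref but deliberately leaves force_atomic untouched. */
static void toy_reinit(struct toy_ref *ref)
{
	assert(ref->dead);
	ref->dead = false;
}

/* Percpu mode is only possible when neither forced atomic nor dead. */
static bool toy_is_percpu(const struct toy_ref *ref)
{
	return !ref->force_atomic && !ref->dead;
}
```

In this model a ref initialized with `init_atomic` set stays out of percpu mode across any number of `toy_kill()`/`toy_reinit()` cycles until `toy_switch_to_percpu()` is called.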
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      1cae13e7
    • T
      percpu_ref: add PERCPU_REF_INIT_* flags · 2aad2a86
      Tejun Heo committed
      With the recent addition of percpu_ref_reinit(), percpu_ref now can be
      used as a persistent switch which can be turned on and off repeatedly
      where turning off maps to killing the ref and waiting for it to drain;
      however, there currently isn't a way to initialize a percpu_ref in its
      off (killed and drained) state, which can be inconvenient for certain
      persistent switch use cases.
      
      Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic
      selection of operation mode; however, currently a newly initialized
      percpu_ref is always in percpu mode making it impossible to avoid the
      latency overhead of switching to atomic mode.
      
      This patch adds @flags to percpu_ref_init() and implements the
      following flags.
      
      * PERCPU_REF_INIT_ATOMIC	: start ref in atomic mode
      * PERCPU_REF_INIT_DEAD		: start ref killed and drained
      
      These flags should be able to serve the above two use cases.
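As a rough illustration of what the two flags select, the following userspace sketch maps init flags to an initial state. The `TOY_*` names and values are hypothetical stand-ins chosen for this example, not the kernel's definitions, and the rule that a dead ref is also atomic is taken from the "DEAD still requires ATOMIC" note in f47ad457.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; the kernel defines its own. */
enum {
	TOY_INIT_ATOMIC	= 1 << 0,	/* start in atomic mode */
	TOY_INIT_DEAD	= 1 << 1,	/* start killed and drained */
};

struct toy_ref {
	long count;	/* 1 for the base reference, 0 if started drained */
	bool atomic;
	bool dead;
};

static void toy_init(struct toy_ref *ref, unsigned int flags)
{
	ref->dead = (flags & TOY_INIT_DEAD) != 0;
	/* A dead ref is always atomic, so DEAD implies ATOMIC here. */
	ref->atomic = ref->dead || (flags & TOY_INIT_ATOMIC) != 0;
	ref->count = ref->dead ? 0 : 1;
}

/* Only hands out a reference while the ref is live. */
static bool toy_tryget_live(struct toy_ref *ref)
{
	if (ref->dead)
		return false;
	ref->count++;
	return true;
}
```

A ref started with `TOY_INIT_DEAD` behaves like a switch that begins in the off position: `toy_tryget_live()` fails until something revives it.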
      
      v2: target_core_tpg.c conversion was missing.  Fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      2aad2a86
    • T
      percpu_ref: decouple switching to percpu mode and reinit · f47ad457
      Tejun Heo committed
      percpu_ref has treated the dropping of the base reference and
      switching to atomic mode as an integral operation; however, there's
      nothing inherent tying the two together.
      
      The use cases for percpu_ref have been expanding continuously.  While
      the current init/kill/reinit/exit model can cover a lot, the coupling
      of kill/reinit with atomic/percpu mode switching is turning out to be
      too restrictive for use cases where many percpu_refs are created and
      destroyed back-to-back with only some of them reaching extended
      operation.  The coupling also makes implementing always-atomic debug
      mode difficult.
      
      This patch separates out percpu mode switching into
      percpu_ref_switch_to_percpu() and reimplements percpu_ref_reinit() on
      top of it.
      
      * DEAD still requires ATOMIC.  A dead ref can't be switched to percpu
        mode w/o going through reinit.
      
      v2: __percpu_ref_switch_to_percpu() was missing static.  Fixed.
          Reported by Fengguang aka kbuild test robot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      f47ad457
    • T
      percpu_ref: decouple switching to atomic mode and killing · 490c79a6
      Tejun Heo committed
      percpu_ref has treated the dropping of the base reference and
      switching to atomic mode as an integral operation; however, there's
      nothing inherent tying the two together.
      
      The use cases for percpu_ref have been expanding continuously.  While
      the current init/kill/reinit/exit model can cover a lot, the coupling
      of kill/reinit with atomic/percpu mode switching is turning out to be
      too restrictive for use cases where many percpu_refs are created and
      destroyed back-to-back with only some of them reaching extended
      operation.  The coupling also makes implementing always-atomic debug
      mode difficult.
      
      This patch separates out atomic mode switching into
      percpu_ref_switch_to_atomic() and reimplements
      percpu_ref_kill_and_confirm() on top of it.
      
      * The handling of __PERCPU_REF_ATOMIC and __PERCPU_REF_DEAD is now
        differentiated.  Among get/put operations, percpu_ref_tryget_live()
        is the only one which cares about DEAD.
      
      * percpu_ref_switch_to_atomic() can be called multiple times on the
        same ref.  This means that multiple @confirm_switch may get queued
        up which we can't do reliably without extra memory area.  This is
        handled by making the later invocation synchronously wait for the
        completion of the previous one.  This isn't particularly desirable
        but such synchronous waits shouldn't happen in most cases.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      490c79a6
    • T
      percpu_ref: add PCPU_REF_DEAD · 27344a90
      Tejun Heo committed
      percpu_ref will be restructured so that percpu/atomic mode switching
      and reference killing are decoupled.  In preparation, add
      PCPU_REF_DEAD and PCPU_REF_ATOMIC_DEAD which is OR of ATOMIC and DEAD.
      For now, ATOMIC and DEAD are changed together and all PCPU_REF_ATOMIC
      uses are converted to PCPU_REF_ATOMIC_DEAD without causing any
      behavior changes.
      
      percpu_ref_init() now specifies an explicit alignment when allocating
      the percpu counters so that the pointer has enough unused low bits to
      accommodate the flags.  Note that one flag was fine as min alignment
      for percpu memory is 2 bytes but two flags are already too many for
      the natural alignment of unsigned longs on archs like cris and m68k.
      
      v2: The original patch had BUILD_BUG_ON() which triggers if unsigned
          long's alignment isn't enough to accommodate the flags, which
          triggered on cris and m68k.  percpu_ref_init() updated to specify
          the required alignment explicitly.  Reported by Fengguang.
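The alignment requirement follows from storing flags in the low bits of a pointer: with two flag bits, the pointer must be at least 4-byte aligned so that those bits are otherwise zero. The userspace sketch below shows the technique in general terms. All `TOY_*` and `toy_*` names are hypothetical, and the static array merely stands in for the percpu allocation.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_REF_ATOMIC		((uintptr_t)1 << 0)
#define TOY_REF_DEAD		((uintptr_t)1 << 1)
#define TOY_REF_FLAG_BITS	2
#define TOY_REF_FLAG_MASK	(((uintptr_t)1 << TOY_REF_FLAG_BITS) - 1)

/*
 * Request the needed alignment explicitly instead of relying on the
 * natural alignment of long, which may be only 2 bytes on some archs.
 * With several alignment specifiers, the strictest one applies.
 */
static _Alignas(long) _Alignas(1 << TOY_REF_FLAG_BITS) long toy_counters[4];

/* Combine a suitably aligned pointer and flag bits into one word. */
static uintptr_t toy_pack(long *counters, uintptr_t flags)
{
	uintptr_t p = (uintptr_t)counters;

	assert((p & TOY_REF_FLAG_MASK) == 0);	/* low bits must be free */
	assert((flags & ~TOY_REF_FLAG_MASK) == 0);
	return p | flags;
}

/* Mask the flags off to recover the pointer. */
static long *toy_counters_of(uintptr_t packed)
{
	return (long *)(packed & ~TOY_REF_FLAG_MASK);
}

static uintptr_t toy_flags_of(uintptr_t packed)
{
	return packed & TOY_REF_FLAG_MASK;
}
```

The first assertion in `toy_pack()` is the runtime counterpart of the build-time check mentioned in the v2 note: it fails if the pointer's alignment leaves too few free low bits.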
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      27344a90
    • T
      percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch · 9e804d1f
      Tejun Heo committed
      percpu_ref will be restructured so that percpu/atomic mode switching
      and reference killing are decoupled.  In preparation, do the following
      renames.
      
      * percpu_ref->confirm_kill	-> percpu_ref->confirm_switch
      * __PERCPU_REF_DEAD		-> __PERCPU_REF_ATOMIC
      * __percpu_ref_alive()		-> __ref_is_percpu()
      
      This patch is pure rename and doesn't introduce any functional
      changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      9e804d1f
    • T
      percpu_ref: replace pcpu_ prefix with percpu_ · eecc16ba
      Tejun Heo committed
      percpu_ref uses pcpu_ prefix for internal stuff and percpu_ for
      externally visible ones.  This is the same convention used in the
      percpu allocator implementation.  It works fine there but percpu_ref
      doesn't have too much internal-only stuff and scattered usages of
      pcpu_ prefix are more confusing than helpful.
      
      This patch replaces all pcpu_ prefixes with percpu_.  This is pure
      rename and there's no functional change.  Note that PCPU_REF_DEAD is
      renamed to __PERCPU_REF_DEAD to signify that the flag is internal.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      eecc16ba
    • T
      percpu_ref: minor code and comment updates · 6251f997
      Tejun Heo committed
      * Some comments became stale.  Updated.
      * percpu_ref_tryget() unnecessarily initializes @ret.  Removed.
      * A blank line removed from percpu_ref_kill_rcu().
      * Explicit function name in a WARN format string replaced with __func__.
      * WARN_ON() in percpu_ref_reinit() converted to WARN_ON_ONCE().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      6251f997
    • T
      percpu_ref: relocate percpu_ref_reinit() · a2237370
      Tejun Heo committed
      percpu_ref is gonna go through restructuring.  Move
      percpu_ref_reinit() after percpu_ref_kill_and_confirm().  This will
      make later changes easier to follow and result in cleaner
      organization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      a2237370
    • T
      Revert "blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe" · 9eca8046
      Tejun Heo committed
      This reverts commit 0a30288d, which
      was a temporary fix for SCSI blk-mq stall issue.  The following
      patches will fix the issue properly by introducing atomic mode to
      percpu_ref.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@lst.de>
      9eca8046
  9. 24 Sep, 2014 1 commit
    • T
      blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe · 0a30288d
      Tejun Heo committed
      blk-mq uses percpu_ref for its usage counter which tracks the number
      of in-flight commands and is used to synchronously drain the queue on
      freeze.  percpu_ref shutdown takes measurable wallclock time as it
      involves a sched RCU grace period.  This means that draining a blk-mq
      takes measurable wallclock time.  One would think that this shouldn't
      matter as queue shutdown should be a rare event which takes place
      asynchronously w.r.t. userland.
      
      Unfortunately, SCSI probing involves synchronously setting up and then
      tearing down a lot of request_queues back-to-back for non-existent
      LUNs.  This means that SCSI probing may take more than ten seconds
      when scsi-mq is used.
      
      This will be properly fixed by implementing a mechanism to keep
      q->mq_usage_counter in atomic mode till genhd registration; however,
      that involves rather big updates to percpu_ref which is difficult to
      apply late in the devel cycle (v3.17-rc6 at the moment).  As a
      stop-gap measure till the proper fix can be implemented in the next
      cycle, this patch introduces __percpu_ref_kill_expedited() and makes
      blk_mq_freeze_queue() use it.  This is heavy-handed but should work
      for testing the experimental SCSI blk-mq implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Christoph Hellwig <hch@infradead.org>
      Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
      Fixes: add703fd ("blk-mq: use percpu_ref for mq usage count")
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Tested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      0a30288d
  10. 20 Sep, 2014 2 commits
    • T
      percpu-refcount: make percpu_ref based on longs instead of ints · e625305b
      Tejun Heo committed
      percpu_ref is currently based on ints and the number of refs it can
      cover is (1 << 31).  This makes it impossible to use a percpu_ref to
      count memory objects or pages on 64bit machines as it may overflow.
      This forces those users to somehow aggregate the references before
      contributing to the percpu_ref which is often cumbersome and sometimes
      challenging to get the same level of performance as using the
      percpu_ref directly.
      
      While using ints for the percpu counters makes them pack tighter on
      64bit machines, the possible gain from using ints instead of longs is
      extremely small compared to the overall gain from per-cpu operation.
      This patch makes percpu_ref based on longs so that it can be used to
      directly count memory objects or pages.
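The overflow concern comes down to simple arithmetic, sketched below. The 4 KiB page size and the 16 TiB figure are illustrative assumptions for this example rather than numbers from the patch, the helper names are hypothetical, and `INT64_MAX` stands in for the range of a long on a 64-bit machine.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	4096ULL	/* assumed page size for the example */

/* Number of whole pages needed to cover `bytes` of memory. */
static uint64_t toy_pages(uint64_t bytes)
{
	return bytes / TOY_PAGE_SIZE;
}

/* Whether `refs` references fit in a counter with the given maximum. */
static bool toy_fits(uint64_t refs, uint64_t counter_max)
{
	return refs <= counter_max;
}
```

With these assumptions, 16 TiB of memory is 2^32 pages, which `toy_fits()` rejects for `INT_MAX` and accepts for `INT64_MAX`.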
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      e625305b
    • T
      percpu-refcount: improve WARN messages · 4843c332
      Tejun Heo committed
      percpu_ref's WARN messages can be a lot more helpful by indicating
      who's the culprit.  Make them report the release function that the
      offending percpu-refcount is associated with.  This should make it a
      lot easier to track down the reported invalid refcnting operations.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      4843c332
  11. 08 Sep, 2014 1 commit
    • T
      percpu-refcount: add @gfp to percpu_ref_init() · a34375ef
      Tejun Heo committed
      Percpu allocator now supports allocation mask.  Add @gfp to
      percpu_ref_init() so that !GFP_KERNEL allocation masks can be used
      with percpu_refs too.
      
      This patch doesn't make any functional difference.
      
      v2: blk-mq conversion was missing.  Updated.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      a34375ef
  12. 28 Jun, 2014 5 commits
    • T
      percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero() · 2d722782
      Tejun Heo committed
      Now that explicit invocation of percpu_ref_exit() is necessary to free
      the percpu counter, we can implement percpu_ref_reinit() which
      reinitializes a released percpu_ref.  This can be used to implement
      scalable gating switch which can be drained and then re-opened without
      worrying about memory allocation failures.
      
      percpu_ref_is_zero() is added to be used in a sanity check in
      percpu_ref_exit().  As this function will be useful for other purposes
      too, make it a public interface.
      
      v2: Use smp_read_barrier_depends() instead of smp_load_acquire().  We
          only need data dep barrier and smp_load_acquire() is stronger and
          heavier on some archs.  Spotted by Lai Jiangshan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      2d722782
    • T
      percpu-refcount: require percpu_ref to be exited explicitly · 9a1049da
      Tejun Heo committed
      Currently, a percpu_ref undoes percpu_ref_init() automatically by
      freeing the allocated percpu area when the percpu_ref is killed.
      While seemingly convenient, this has the following niggles.
      
      * It's impossible to re-init a released reference counter without
        going through re-allocation.
      
      * In the similar vein, it's impossible to initialize a percpu_ref
        count with static percpu variables.
      
      * We need and have an explicit destructor anyway for failure paths -
        percpu_ref_cancel_init().
      
      This patch removes the automatic percpu counter freeing in
      percpu_ref_kill_rcu() and repurposes percpu_ref_cancel_init() into a
      generic destructor now named percpu_ref_exit().  percpu_ref_destroy()
      is considered but it gets confusing with percpu_ref_kill() while
      "exit" clearly indicates that it's the counterpart of
      percpu_ref_init().
      
      All percpu_ref_cancel_init() users are updated to invoke
      percpu_ref_exit() instead and explicit percpu_ref_exit() calls are
      added to the destruction path of all percpu_ref users.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Li Zefan <lizefan@huawei.com>
      9a1049da
    • T
      percpu-refcount: use unsigned long for pcpu_count pointer · 7d742075
      Tejun Heo committed
      percpu_ref->pcpu_count is a percpu pointer with a status flag in its
      lowest bit.  As such, it always goes through arithmetic operations
      which is very cumbersome to do on a pointer.  It has to be first
      cast to unsigned long and then back.
      
      Let's just make the field unsigned long so that we can skip the first
      casts.  While at it, rename it to pcpu_counter_ptr to clarify that
      it's a pointer value.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      7d742075
    • T
      percpu-refcount: add helpers for ->percpu_count accesses · eae7975d
      Tejun Heo committed
      * All four percpu_ref_*() operations implemented in the header file
        perform the same operation to determine whether the percpu_ref is
        alive and extract the percpu pointer.  Factor out the common logic
        into __pcpu_ref_alive().  This doesn't change the generated code.
      
      * There are a couple places in percpu-refcount.c which masks out
        PCPU_REF_DEAD to obtain the percpu pointer.  Factor it out into
        pcpu_count_ptr().
      
      * The above changes make the WARN_ON_ONCE() conditional at the top of
        percpu_ref_kill_and_confirm() the only user of REF_STATUS().  Test
        PCPU_REF_DEAD directly and remove REF_STATUS().
      
      This patch doesn't introduce any functional change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      eae7975d
    • T
      percpu-refcount: one bit is enough for REF_STATUS · d630dc4c
      Tejun Heo committed
      percpu-refcount currently reserves two lowest bits of its percpu
      pointer to indicate its state; however, only one bit is used for
      PCPU_REF_DEAD.
      
      Simplify it by removing PCPU_STATUS_BITS/MASK and testing
      PCPU_REF_DEAD directly.  This also allows the compiler to choose a
      more efficient instruction depending on the architecture.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      d630dc4c
  13. 21 Jan, 2014 1 commit
  14. 08 Nov, 2013 1 commit
  15. 17 Oct, 2013 1 commit
  16. 17 Jun, 2013 1 commit
    • T
      percpu-refcount: use RCU-sched instead of normal RCU · a4244454
      Tejun Heo committed
      percpu-refcount was incorrectly using preempt_disable/enable() for RCU
      critical sections against call_rcu().  6a24474d ("percpu-refcount:
      consistently use plain (non-sched) RCU") fixed it by replacing the
      preemption operations with rcu_read_[un]lock() citing that there isn't
      any advantage in using sched-RCU over using the usual one; however,
      rcu_read_[un]lock() for the preemptible RCU implementation -
      CONFIG_TREE_PREEMPT_RCU, chosen when CONFIG_PREEMPT - are slightly
      more expensive than preempt_disable/enable().
      
      In a contrived microbench which repeats the following,
      
       - percpu_ref_get()
       - copy 32 bytes of data into percpu buffer
       - percpu_ref_put()
       - copy 32 bytes of data into percpu buffer
      
      rcu_read_[un]lock() used in percpu_ref_get/put() makes it go slower by
      about 15% when compared to using sched-RCU.
      
      As the RCU critical sections are extremely short, using sched-RCU
      shouldn't have any latency implications.  Convert to RCU-sched.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Kent Overstreet <koverstreet@google.com>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      a4244454
  17. 14 Jun, 2013 3 commits
    • T
      percpu-refcount: implement percpu_tryget() along with percpu_ref_kill_and_confirm() · dbece3a0
      Tejun Heo committed
      Implement percpu_tryget() which stops giving out references once the
      percpu_ref is visible as killed.  Because the refcnt is per-cpu,
      different CPUs will start to see a refcnt as killed at different
      points in time and tryget() may continue to succeed on subset of cpus
      for a while after percpu_ref_kill() returns.
      
      For use cases where it's necessary to know when all CPUs start to see
      the refcnt as dead, percpu_ref_kill_and_confirm() is added.  The new
      function takes an extra argument @confirm_kill which is invoked when
      the refcnt is guaranteed to be viewed as killed on all CPUs.
      
      While this isn't the prettiest interface, it doesn't force synchronous
      wait and is much safer than requiring the caller to do its own
      call_rcu().
      
      v2: Patch description rephrased to emphasize that tryget() may
          continue to succeed on some CPUs after kill() returns as suggested
          by Kent.
      
      v3: Function comment in percpu_ref_kill_and_confirm() updated warning
          people to not depend on the implied RCU grace period from the
          confirm callback as it's an implementation detail.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Slightly-Grumpily-Acked-by: Kent Overstreet <koverstreet@google.com>
      dbece3a0
    • T
      percpu-refcount: implement percpu_ref_cancel_init() · bc497bd3
      Tejun Heo committed
      Normally, percpu_ref_init() initializes and percpu_ref_kill()
      initiates destruction which completes asynchronously.  The
      asynchronous destruction can be problematic in init failure path where
      the caller wants to destroy half-constructed object - distinguishing
      half-constructed objects from the usual release method can be painful
      for complex objects.
      
      This patch implements percpu_ref_cancel_init() which synchronously
      destroys the percpu_ref without invoking release.  To avoid
      unintentional misuses, the function requires the ref to have finished
      percpu_ref_init() but never used and triggers WARN otherwise.
      
      v2: Explain the weird name and usage restriction in the function
          comment.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Kent Overstreet <koverstreet@google.com>
      bc497bd3
    • T
      percpu-refcount: add __must_check to percpu_ref_init() and don't use... · acac7883
      Tejun Heo committed
      percpu-refcount: add __must_check to percpu_ref_init() and don't use ACCESS_ONCE() in percpu_ref_kill_rcu()
      
      Two small changes.
      
      * Unlike most init functions, percpu_ref_init() allocates memory and
        may fail.  Let's mark it with __must_check in case the caller
        forgets.
      
      * percpu_ref_kill_rcu() is unnecessarily using ACCESS_ONCE() to
        dereference @ref->pcpu_count, which can be misleading.  The pointer
        is guaranteed to be valid and visible and can't change underneath
        the function.  Drop ACCESS_ONCE().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      acac7883
  18. 13 Jun, 2013 1 commit
  19. 04 Jun, 2013 2 commits
    • K
      percpu-refcount: Don't use silly cmpxchg() · c1ae6e9b
      Kent Overstreet committed
      The cmpxchg() was just to ensure the debug check didn't race, which was
      a bit excessive. The caller is supposed to do the appropriate
      synchronization, which means percpu_ref_kill() can just do a simple
      store.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      c1ae6e9b
    • K
      percpu: implement generic percpu refcounting · 215e262f
      Kent Overstreet committed
      This implements a refcount with similar semantics to
      atomic_get()/atomic_dec_and_test() - but percpu.
      
      It also implements two stage shutdown, as we need it to tear down the
      percpu counts.  Before dropping the initial refcount, you must call
      percpu_ref_kill(); this puts the refcount in "shutting down mode" and
      switches back to a single atomic refcount with the appropriate
      barriers (synchronize_rcu()).
      
      It's also legal to call percpu_ref_kill() multiple times - it only
      returns true once, so callers don't have to reimplement shutdown
      synchronization.
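The two-stage scheme described above can be approximated with a single-threaded userspace model. This is a sketch under stated assumptions, not the kernel code: `toy_percpu_ref`, `TOY_NR_CPUS` and the `toy_ref_*` helpers are hypothetical, the "CPU" is passed in as a plain index, and the RCU synchronization the kernel needs is reduced to a comment because nothing runs concurrently here.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS	4	/* hypothetical CPU count for the model */

struct toy_percpu_ref {
	long percpu[TOY_NR_CPUS];	/* per-CPU deltas; a slot may go negative */
	long atomic_count;		/* authoritative count once killed */
	bool killed;
	bool released;
};

static void toy_ref_init(struct toy_percpu_ref *ref)
{
	for (int i = 0; i < TOY_NR_CPUS; i++)
		ref->percpu[i] = 0;
	ref->atomic_count = 1;		/* the initial reference */
	ref->killed = false;
	ref->released = false;
}

static void toy_ref_get(struct toy_percpu_ref *ref, int cpu)
{
	if (!ref->killed)
		ref->percpu[cpu]++;	/* fast path: touch only the local slot */
	else
		ref->atomic_count++;
}

static void toy_ref_put(struct toy_percpu_ref *ref, int cpu)
{
	if (!ref->killed) {
		ref->percpu[cpu]--;	/* zero cannot be detected in this mode */
		return;
	}
	if (--ref->atomic_count == 0)
		ref->released = true;
}

/* Switch to the single shared count; returns true only on the first call. */
static bool toy_ref_kill(struct toy_percpu_ref *ref)
{
	if (ref->killed)
		return false;
	ref->killed = true;
	/* The kernel must wait for in-flight percpu users here (RCU). */
	for (int i = 0; i < TOY_NR_CPUS; i++) {
		ref->atomic_count += ref->percpu[i];
		ref->percpu[i] = 0;
	}
	return true;
}
```

The model shows why the kill step has to come before dropping the initial reference: in percpu mode a get on one CPU and a put on another leave individual slots at +1 and -1, so only the sum folded into `atomic_count` can be compared against zero.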
      
      [akpm@linux-foundation.org: fix build]
      [akpm@linux-foundation.org: coding-style tweak]
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      215e262f