1. 08 2月, 2014 13 次提交
    • T
      kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends · 3eef34ad
      Tejun Heo 提交于
      kernfs_node->parent and ->name are currently marked as "published"
      indicating that kernfs users may access them directly; however, those
      fields may get updated by kernfs_rename[_ns]() and unrestricted access
      may lead to erroneous values or oops.
      
      Protect ->parent and ->name updates with a irq-safe spinlock
      kernfs_rename_lock and implement the following accessors for these
      fields.
      
      * kernfs_name()		- format the node's name into the specified buffer
      * kernfs_path()		- format the node's path into the specified buffer
      * pr_cont_kernfs_name()	- pr_cont a node's name (doesn't need buffer)
      * pr_cont_kernfs_path()	- pr_cont a node's path (doesn't need buffer)
      * kernfs_get_parent()	- pin and return a node's parent
      
      All can be called under any context.  The recursive sysfs_pathname()
      in fs/sysfs/dir.c is replaced with kernfs_path() and
      sysfs_rename_dir_ns() is updated to use kernfs_get_parent() instead of
      dereferencing parent directly.
      
      v2: Dummy definition of kernfs_path() for !CONFIG_KERNFS was missing
          static inline making it cause a lot of build warnings.  Add it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3eef34ad
    • T
      kernfs: implement kernfs_node_from_dentry(), kernfs_root_from_sb() and kernfs_rename() · 0c23b225
      Tejun Heo 提交于
      Implement helpers to determine node from dentry and root from
      super_block.  Also add a kernfs_rename_ns() wrapper which assumes NULL
      namespace.  These generally make sense and will be used by cgroup.
      
      v2: Some dummy implementations for !CONFIG_SYSFS was missing.  Fixed.
          Reported by kbuild test robot.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c23b225
    • T
      kernfs: add kernfs_open_file->priv · 2536390d
      Tejun Heo 提交于
      Add a private data field to be used by kernfs file operations.  This
      generally makes sense and will be used by cgroup.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2536390d
    • T
      kernfs: implement kernfs_ops->atomic_write_len · 4d3773c4
      Tejun Heo 提交于
      A write to a kernfs_node is buffered through a kernel buffer.  Writes
      <= PAGE_SIZE are performed atomically, while larger ones are executed
      in PAGE_SIZE chunks.  While this is enough for sysfs, cgroup which is
      scheduled to be converted to use kernfs needs a bit more control over
      it.
      
      This patch adds kernfs_ops->atomic_write_len.  If not set (zero), the
      behavior stays the same.  If set, writes upto the size are executed
      atomically and larger writes are rejected with -E2BIG.
      
      A different implementation strategy would be allowing configuring
      chunking size while making the original write size available to the
      write method; however, such strategy, while being more complicated,
      doesn't really buy anything.  If the write implementation has to
      handle chunking, the specific chunk size shouldn't matter all that
      much.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d3773c4
    • T
      kernfs: allow nodes to be created in the deactivated state · d35258ef
      Tejun Heo 提交于
      Currently, kernfs_nodes are made visible to userland on creation,
      which makes it difficult for kernfs users to atomically succeed or
      fail creation of multiple nodes.  In addition, if something fails
      after creating some nodes, the created nodes might already be in use
      and their active refs need to be drained for removal, which has the
      potential to introduce tricky reverse locking dependency on active_ref
      depending on how the error path is synchronized.
      
      This patch introduces per-root flag KERNFS_ROOT_CREATE_DEACTIVATED.
      If set, all nodes under the root are created in the deactivated state
      and stay invisible to userland until explicitly enabled by the new
      kernfs_activate() API.  Also, nodes which have never been activated
      are guaranteed to bypass draining on removal thus allowing error paths
      to not worry about lockding dependency on active_ref draining.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d35258ef
    • T
      kernfs: implement kernfs_syscall_ops->remount_fs() and ->show_options() · 6a7fed4e
      Tejun Heo 提交于
      Add two super_block related syscall callbacks ->remount_fs() and
      ->show_options() to kernfs_syscall_ops.  These simply forward the
      matching super_operations.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6a7fed4e
    • T
      kernfs: rename kernfs_dir_ops to kernfs_syscall_ops · 90c07c89
      Tejun Heo 提交于
      We're gonna need non-dir syscall callbacks, which will make dir_ops a
      misnomer.  Let's rename kernfs_dir_ops to kernfs_syscall_ops.
      
      This is pure rename.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      90c07c89
    • T
      kernfs: invoke dir_ops while holding active ref of the target node · 07c7530d
      Tejun Heo 提交于
      kernfs_dir_ops are currently being invoked without any active
      reference, which makes it tricky for the invoked operations to
      determine whether the objects associated those nodes are safe to
      access and will remain that way for the duration of such operations.
      
      kernfs already has active_ref mechanism to deal with this which makes
      the removal of a given node the synchronization point for gating the
      file operations.  There's no reason for dir_ops to be any different.
      Update the dir_ops handling so that active_ref is held while the
      dir_ops are executing.  This guarantees that while a dir_ops is
      executing the target nodes stay alive.
      
      As kernfs_dir_ops doesn't have any in-kernel user at this point, this
      doesn't affect anybody.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      07c7530d
    • T
      kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers · 6b0afc2a
      Tejun Heo 提交于
      Sometimes it's necessary to implement a node which wants to delete
      nodes including itself.  This isn't straightforward because of kernfs
      active reference.  While a file operation is in progress, an active
      reference is held and kernfs_remove() waits for all such references to
      drain before completing.  For a self-deleting node, this is a deadlock
      as kernfs_remove() ends up waiting for an active reference that itself
      is sitting on top of.
      
      This currently is worked around in the sysfs layer using
      sysfs_schedule_callback() which makes such removals asynchronous.
      While it works, it's rather cumbersome and inherently breaks
      synchronicity of the operation - the file operation which triggered
      the operation may complete before the removal is finished (or even
      started) and the removal may fail asynchronously.  If a removal
      operation is immmediately followed by another operation which expects
      the specific name to be available (e.g. removal followed by rename
      onto the same name), there's no way to make the latter operation
      reliable.
      
      The thing is there's no inherent reason for this to be asynchrnous.
      All that's necessary to do this synchronous is a dedicated operation
      which drops its own active ref and deactivates self.  This patch
      implements kernfs_remove_self() and its wrappers in sysfs and driver
      core.  kernfs_remove_self() is to be called from one of the file
      operations, drops the active ref the task is holding, removes the self
      node, and restores active ref to the dead node so that the ref is
      balanced afterwards.  __kernfs_remove() is updated so that it takes an
      early exit if the target node is already fully removed so that the
      active ref restored by kernfs_remove_self() after removal doesn't
      confuse the deactivation path.
      
      This makes implementing self-deleting nodes very easy.  The normal
      removal path doesn't even need to be changed to use
      kernfs_remove_self() for the self-deleting node.  The method can
      invoke kernfs_remove_self() on itself before proceeding the normal
      removal path.  kernfs_remove() invoked on the node by the normal
      deletion path will simply be ignored.
      
      This will replace sysfs_schedule_callback().  A subtle feature of
      sysfs_schedule_callback() is that it collapses multiple invocations -
      even if multiple removals are triggered, the removal callback is run
      only once.  An equivalent effect can be achieved by testing the return
      value of kernfs_remove_self() - only the one which gets %true return
      value should proceed with actual deletion.  All other instances of
      kernfs_remove_self() will wait till the enclosing kernfs operation
      which invoked the winning instance of kernfs_remove_self() finishes
      and then return %false.  This trivially makes all users of
      kernfs_remove_self() automatically show correct synchronous behavior
      even when there are multiple concurrent operations - all "echo 1 >
      delete" instances will finish only after the whole operation is
      completed by one of the instances.
      
      Note that manipulation of active ref is implemented in separate public
      functions - kernfs_[un]break_active_protection().
      kernfs_remove_self() is the only user at the moment but this will be
      used to cater to more complex cases.
      
      v2: For !CONFIG_SYSFS, dummy version kernfs_remove_self() was missing
          and sysfs_remove_file_self() had incorrect return type.  Fix it.
          Reported by kbuild test bot.
      
      v3: kernfs_[un]break_active_protection() separated out from
          kernfs_remove_self() and exposed as public API.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b0afc2a
    • T
      kernfs: remove KERNFS_REMOVED · 81c173cb
      Tejun Heo 提交于
      KERNFS_REMOVED is used to mark half-initialized and dying nodes so
      that they don't show up in lookups and deny adding new nodes under or
      renaming it; however, its role overlaps that of deactivation.
      
      It's necessary to deny addition of new children while removal is in
      progress; however, this role considerably intersects with deactivation
      - KERNFS_REMOVED prevents new children while deactivation prevents new
      file operations.  There's no reason to have them separate making
      things more complex than necessary.
      
      This patch removes KERNFS_REMOVED.
      
      * Instead of KERNFS_REMOVED, each node now starts its life
        deactivated.  This means that we now use both atomic_add() and
        atomic_sub() on KN_DEACTIVATED_BIAS, which is INT_MIN.  The compiler
        generates an overflow warnings when negating INT_MIN as the negation
        can't be represented as a positive number.  Nothing is actually
        broken but let's bump BIAS by one to avoid the warnings for archs
        which negates the subtrahend..
      
      * A new helper kernfs_active() which tests whether kn->active >= 0 is
        added for convenience and lockdep annotation.  All KERNFS_REMOVED
        tests are replaced with negated kernfs_active() tests.
      
      * __kernfs_remove() is updated to deactivate, but not drain, all nodes
        in the subtree instead of setting KERNFS_REMOVED.  This removes
        deactivation from kernfs_deactivate(), which is now renamed to
        kernfs_drain().
      
      * Sanity check on KERNFS_REMOVED in kernfs_put() is replaced with
        checks on the active ref.
      
      * Some comment style updates in the affected area.
      
      v2: Reordered before removal path restructuring.  kernfs_active()
          dropped and kernfs_get/put_active() used instead.  RB_EMPTY_NODE()
          used in the lookup paths.
      
      v3: Reverted most of v2 except for creating a new node with
          KN_DEACTIVATED_BIAS.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81c173cb
    • T
      kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep() · 182fd64b
      Tejun Heo 提交于
      There currently are two mechanisms gating active ref lockdep
      annotations - KERNFS_LOCKDEP flag and KERNFS_ACTIVE_REF type mask.
      The former disables lockdep annotations in kernfs_get/put_active()
      while the latter disables all of kernfs_deactivate().
      
      While KERNFS_ACTIVE_REF also behaves as an optimization to skip the
      deactivation step for non-file nodes, the benefit is marginal and it
      needlessly diverges code paths.  Let's drop KERNFS_ACTIVE_REF.
      
      While at it, add a test helper kernfs_lockdep() to test KERNFS_LOCKDEP
      flag so that it's more convenient and the related code can be compiled
      out when not enabled.
      
      v2: Refreshed on top of ("kernfs: make kernfs_deactivate() honor
          KERNFS_LOCKDEP flag").  As the earlier patch already added
          KERNFS_LOCKDEP tests to kernfs_deactivate(), those additions are
          dropped from this patch and the existing ones are simply converted
          to kernfs_lockdep().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      182fd64b
    • T
      kernfs: remove kernfs_addrm_cxt · 988cd7af
      Tejun Heo 提交于
      kernfs_addrm_cxt and the accompanying kernfs_addrm_start/finish() were
      added because there were operations which should be performed outside
      kernfs_mutex after adding and removing kernfs_nodes.  The necessary
      operations were recorded in kernfs_addrm_cxt and performed by
      kernfs_addrm_finish(); however, after the recent changes which
      relocated deactivation and unmapping so that they're performed
      directly during removal, the only operation kernfs_addrm_finish()
      performs is kernfs_put(), which can be moved inside the removal path
      too.
      
      This patch moves the kernfs_put() of the base ref to __kernfs_remove()
      and remove kernfs_addrm_cxt and kernfs_addrm_start/finish().
      
      * kernfs_add_one() is updated to grab and release kernfs_mutex itself.
        sysfs_addrm_start/finish() invocations around it are removed from
        all users.
      
      * __kernfs_remove() puts an unlinked node directly instead of chaining
        it to kernfs_addrm_cxt.  Its callers are updated to grab and release
        kernfs_mutex instead of calling kernfs_addrm_start/finish() around
        it.
      
      v2: Rebased on top of "kernfs: associate a new kernfs_node with its
          parent on creation" which dropped @parent from kernfs_add_one().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      988cd7af
    • T
      kernfs: replace kernfs_node->u.completion with kernfs_root->deactivate_waitq · abd54f02
      Tejun Heo 提交于
      kernfs_node->u.completion is used to notify deactivation completion
      from kernfs_put_active() to kernfs_deactivate().  We now allow
      multiple racing removals of the same node and the current removal
      scheme is no longer correct - kernfs_remove() invocation may return
      before the node is properly deactivated if it races against another
      removal.  The removal path will be restructured to address the issue.
      
      To help such restructure which requires supporting multiple waiters,
      this patch replaces kernfs_node->u.completion with
      kernfs_root->deactivate_waitq.  This makes deactivation event
      notifications share a per-root waitqueue_head; however, the wait path
      is quite cold and this will also allow shaving one pointer off
      kernfs_node.
      
      v2: Refreshed on top of ("kernfs: make kernfs_deactivate() honor
          KERNFS_LOCKDEP flag").
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      abd54f02
  2. 18 1月, 2014 1 次提交
  3. 14 1月, 2014 7 次提交
  4. 11 1月, 2014 7 次提交
    • T
      kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers · 1ae06819
      Tejun Heo 提交于
      Sometimes it's necessary to implement a node which wants to delete
      nodes including itself.  This isn't straightforward because of kernfs
      active reference.  While a file operation is in progress, an active
      reference is held and kernfs_remove() waits for all such references to
      drain before completing.  For a self-deleting node, this is a deadlock
      as kernfs_remove() ends up waiting for an active reference that itself
      is sitting on top of.
      
      This currently is worked around in the sysfs layer using
      sysfs_schedule_callback() which makes such removals asynchronous.
      While it works, it's rather cumbersome and inherently breaks
      synchronicity of the operation - the file operation which triggered
      the operation may complete before the removal is finished (or even
      started) and the removal may fail asynchronously.  If a removal
      operation is immmediately followed by another operation which expects
      the specific name to be available (e.g. removal followed by rename
      onto the same name), there's no way to make the latter operation
      reliable.
      
      The thing is there's no inherent reason for this to be asynchrnous.
      All that's necessary to do this synchronous is a dedicated operation
      which drops its own active ref and deactivates self.  This patch
      implements kernfs_remove_self() and its wrappers in sysfs and driver
      core.  kernfs_remove_self() is to be called from one of the file
      operations, drops the active ref and deactivates using
      __kernfs_deactivate_self(), removes the self node, and restores active
      ref to the dead node using __kernfs_reactivate_self() so that the ref
      is balanced afterwards.  __kernfs_remove() is updated so that it takes
      an early exit if the target node is already fully removed so that the
      active ref restored by kernfs_remove_self() after removal doesn't
      confuse the deactivation path.
      
      This makes implementing self-deleting nodes very easy.  The normal
      removal path doesn't even need to be changed to use
      kernfs_remove_self() for the self-deleting node.  The method can
      invoke kernfs_remove_self() on itself before proceeding the normal
      removal path.  kernfs_remove() invoked on the node by the normal
      deletion path will simply be ignored.
      
      This will replace sysfs_schedule_callback().  A subtle feature of
      sysfs_schedule_callback() is that it collapses multiple invocations -
      even if multiple removals are triggered, the removal callback is run
      only once.  An equivalent effect can be achieved by testing the return
      value of kernfs_remove_self() - only the one which gets %true return
      value should proceed with actual deletion.  All other instances of
      kernfs_remove_self() will wait till the enclosing kernfs operation
      which invoked the winning instance of kernfs_remove_self() finishes
      and then return %false.  This trivially makes all users of
      kernfs_remove_self() automatically show correct synchronous behavior
      even when there are multiple concurrent operations - all "echo 1 >
      delete" instances will finish only after the whole operation is
      completed by one of the instances.
      
      v2: For !CONFIG_SYSFS, dummy version kernfs_remove_self() was missing
          and sysfs_remove_file_self() had incorrect return type.  Fix it.
          Reported by kbuild test bot.
      
      v3: Updated to use __kernfs_{de|re}activate_self().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ae06819
    • T
      kernfs: implement kernfs_{de|re}activate[_self]() · 9f010c2a
      Tejun Heo 提交于
      This patch implements four functions to manipulate deactivation state
      - deactivate, reactivate and the _self suffixed pair.  A new fields
      kernfs_node->deact_depth is added so that concurrent and nested
      deactivations are handled properly.  kernfs_node->hash is moved so
      that it's paired with the new field so that it doesn't increase the
      size of kernfs_node.
      
      A kernfs user's lock would normally nest inside active ref but during
      removal the user may want to perform kernfs_remove() while holding the
      said lock, which would introduce a reverse locking dependency.  This
      function can be used to break such reverse dependency by allowing
      deactivation step to performed separately outside user's critical
      section.
      
      This will also be used implement kernfs_remove_self().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f010c2a
    • T
      kernfs: remove kernfs_addrm_cxt · 99177a34
      Tejun Heo 提交于
      kernfs_addrm_cxt and the accompanying kernfs_addrm_start/finish() were
      added because there were operations which should be performed outside
      kernfs_mutex after adding and removing kernfs_nodes.  The necessary
      operations were recorded in kernfs_addrm_cxt and performed by
      kernfs_addrm_finish(); however, after the recent changes which
      relocated deactivation and unmapping so that they're performed
      directly during removal, the only operation kernfs_addrm_finish()
      performs is kernfs_put(), which can be moved inside the removal path
      too.
      
      This patch moves the kernfs_put() of the base ref to __kernfs_remove()
      and remove kernfs_addrm_cxt and kernfs_addrm_start/finish().
      
      * kernfs_add_one() is updated to grab and release the parent's active
        ref and kernfs_mutex itself.  kernfs_get/put_active() and
        kernfs_addrm_start/finish() invocations around it are removed from
        all users.
      
      * __kernfs_remove() puts an unlinked node directly instead of chaining
        it to kernfs_addrm_cxt.  Its callers are updated to grab and release
        kernfs_mutex instead of calling kernfs_addrm_start/finish() around
        it.
      
      v2: Updated to fit the v2 restructuring of removal path.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      99177a34
    • T
      kernfs: restructure removal path to fix possible premature return · 45a140e5
      Tejun Heo 提交于
      The recursive nature of kernfs_remove() means that, even if
      kernfs_remove() is not allowed to be called multiple times on the same
      node, there may be race conditions between removal of parent and its
      descendants.  While we can claim that kernfs_remove() shouldn't be
      called on one of the descendants while the removal of an ancestor is
      in progress, such rule is unnecessarily restrictive and very difficult
      to enforce.  It's better to simply allow invoking kernfs_remove() as
      the caller sees fit as long as the caller ensures that the node is
      accessible.
      
      The current behavior in such situations is broken.  Whoever enters
      removal path first takes the node off the hierarchy and then
      deactivates.  Following removers either return as soon as it notices
      that it's not the first one or can't even find the target node as it
      has already been removed from the hierarchy.  In both cases, the
      following removers may finish prematurely while the nodes which should
      be removed and drained are still being processed by the first one.
      
      This patch restructures so that multiple removers, whether through
      recursion or direction invocation, always follow the following rules.
      
      * When there are multiple concurrent removers, only one puts the base
        ref.
      
      * Regardless of which one puts the base ref, all removers are blocked
        until the target node is fully deactivated and removed.
      
      To achieve the above, removal path now first deactivates the subtree,
      drains it and then unlinks one-by-one.  __kernfs_deactivate() is
      called directly from __kernfs_removal() and drops and regrabs
      kernfs_mutex for each descendant to drain active refs.  As this means
      that multiple removers can enter __kernfs_deactivate() for the same
      node, the function is updated so that it can handle multiple
      deactivators of the same node - only one actually deactivates but all
      wait till drain completion.
      
      The restructured removal path guarantees that a removed node gets
      unlinked only after the node is deactivated and drained.  Combined
      with proper multiple deactivator handling, this guarantees that any
      invocation of kernfs_remove() returns only after the node itself and
      all its descendants are deactivated, drained and removed.
      
      v2: Draining separated into a separate loop (used to be in the same
          loop as unlink) and done from __kernfs_deactivate().  This is to
          allow exposing deactivation as a separate interface later.
      
          Root node removal was broken in v1 patch.  Fixed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      45a140e5
    • T
      kernfs: remove KERNFS_REMOVED · ae34372e
      Tejun Heo 提交于
      KERNFS_REMOVED is used to mark half-initialized and dying nodes so
      that they don't show up in lookups and deny adding new nodes under or
      renaming it; however, its role overlaps those of deactivation and
      removal from rbtree.
      
      It's necessary to deny addition of new children while removal is in
      progress; however, this role considerably intersects with deactivation
      - KERNFS_REMOVED prevents new children while deactivation prevents new
      file operations.  There's no reason to have them separate making
      things more complex than necessary.
      
      KERNFS_REMOVED is also used to decide whether a node is still visible
      to vfs layer, which is rather redundant as equivalent determination
      can be made by testing whether the node is on its parent's children
      rbtree or not.
      
      This patch removes KERNFS_REMOVED.
      
      * Instead of KERNFS_REMOVED, each node now starts its life
        deactivated.  This means that we now use both atomic_add() and
        atomic_sub() on KN_DEACTIVATED_BIAS, which is INT_MIN.  The compiler
        generates an overflow warnings when negating INT_MIN as the negation
        can't be represented as a positive number.  Nothing is actually
        broken but let's bump BIAS by one to avoid the warnings for archs
        which negates the subtrahend..
      
      * KERNFS_REMOVED tests in add and rename paths are replaced with
        kernfs_get/put_active() of the target nodes.  Due to the way the add
        path is structured now, active ref handling is done in the callers
        of kernfs_add_one().  This will be consolidated up later.
      
      * kernfs_remove_one() is updated to deactivate instead of setting
        KERNFS_REMOVED.  This removes deactivation from kernfs_deactivate(),
        which is now renamed to kernfs_drain().
      
      * kernfs_dop_revalidate() now tests RB_EMPTY_NODE(&kn->rb) instead of
        KERNFS_REMOVED and KERNFS_REMOVED test in kernfs_dir_pos() is
        dropped.  A node which is removed from the children rbtree is not
        included in the iteration in the first place.  This means that a
        node may be visible through vfs a bit longer - it's now also visible
        after deactivation until the actual removal.  This slightly enlarged
        window difference doesn't make any difference to the userland.
      
      * Sanity check on KERNFS_REMOVED in kernfs_put() is replaced with
        checks on the active ref.
      
      * Some comment style updates in the affected area.
      
      v2: Reordered before removal path restructuring.  kernfs_active()
          dropped and kernfs_get/put_active() used instead.  RB_EMPTY_NODE()
          used in the lookup paths.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae34372e
    • T
      kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep() · a69d001c
      Tejun Heo 提交于
      There currently are two mechanisms gating active ref lockdep
      annotations - KERNFS_LOCKDEP flag and KERNFS_ACTIVE_REF type mask.
      The former disables lockdep annotations in kernfs_get/put_active()
      while the latter disables all of kernfs_deactivate().
      
      While KERNFS_ACTIVE_REF also behaves as an optimization to skip the
      deactivation step for non-file nodes, the benefit is marginal and it
      needlessly diverges code paths.  Let's drop KERNFS_ACTIVE_REF and use
      KERNFS_LOCKDEP in kernfs_deactivate() too.
      
      While at it, add a test helper kernfs_lockdep() to test KERNFS_LOCKDEP
      flag so that it's more convenient and the related code can be compiled
      out when not enabled.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a69d001c
    • T
      kernfs: replace kernfs_node->u.completion with kernfs_root->deactivate_waitq · ea1c472d
      Tejun Heo 提交于
      kernfs_node->u.completion is used to notify deactivation completion
      from kernfs_put_active() to kernfs_deactivate().  We now allow
      multiple racing removals of the same node and the current removal
      scheme is no longer correct - kernfs_remove() invocation may return
      before the node is properly deactivated if it races against another
      removal.  The removal path will be restructured to address the issue.
      
      To help such restructure which requires supporting multiple waiters,
      this patch replaces kernfs_node->u.completion with
      kernfs_root->deactivate_waitq.  This makes deactivation event
      notifications share a per-root waitqueue_head; however, the wait path
      is quite cold and this will also allow shaving one pointer off
      kernfs_node.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea1c472d
  5. 18 12月, 2013 3 次提交
    • T
      kernfs: add kernfs_dir_ops · 80b9bbef
      Tejun Heo 提交于
      Add support for mkdir(2), rmdir(2) and rename(2) syscalls.  This is
      implemented through optional kernfs_dir_ops callback table which can
      be specified on kernfs_create_root().  An implemented callback is
      invoked when the matching syscall is invoked.
      
      As kernfs keep dcache syncs with internal representation and
      revalidates dentries on each access, the implementation of these
      methods is extremely simple.  Each just discovers the relevant
      kernfs_node(s) and invokes the requested callback which is allowed to
      do any kernfs operations and the end result doesn't necessarily have
      to match the expected semantics of the syscall.
      
      This will be used to convert cgroup to use kernfs instead of its own
      filesystem implementation.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      80b9bbef
    • T
      kernfs: mark static names with KERNFS_STATIC_NAME · 2063d608
      Tejun Heo 提交于
      Because sysfs used struct attribute which are supposed to stay
      constant, sysfs didn't copy names when creating regular files.  The
      specified string for name was supposed to stay constant.  Such
      distinction isn't inherent for kernfs.  kernfs_create_file[_ns]()
      should be able to take the same @name as kernfs_create_dir[_ns]()
      
      As there can be huge number of sysfs attributes, we still want to be
      able to use static names for sysfs attributes.  This patch renames
      kernfs_create_file_ns_key() to __kernfs_create_file() and adds
      @name_is_static parameter so that the caller can explicitly indicate
      that @name can be used without copying.  kernfs is updated to use
      KERNFS_STATIC_NAME to distinguish static and copied names.
      
      This patch doesn't introduce any behavior changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2063d608
    • T
      kernfs: add @mode to kernfs_create_dir[_ns]() · bb8b9d09
      Tejun Heo 提交于
      sysfs assumed 0755 for all newly created directories and kernfs
      inherited it.  This assumption is unnecessarily restrictive and
      inconsistent with kernfs_create_file[_ns]().  This patch adds @mode
      parameter to kernfs_create_dir[_ns]() and update uses in sysfs
      accordingly.  Among others, this will be useful for implementations of
      the planned ->mkdir() method.
      
      This patch doesn't introduce any behavior differences.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bb8b9d09
  6. 12 12月, 2013 4 次提交
    • T
      kernfs: s/sysfs/kernfs/ in constants · df23fc39
      Tejun Heo 提交于
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      This patch performs the following renames.
      
      * s/SYSFS_DIR/KERNFS_DIR/
      * s/SYSFS_KOBJ_ATTR/KERNFS_FILE/
      * s/SYSFS_KOBJ_LINK/KERNFS_LINK/
      * s/SYSFS_{TYPE_FLAGS}/KERNFS_{TYPE_FLAGS}/
      * s/SYSFS_FLAG_{FLAG}/KERNFS_{FLAG}/
      * s/sysfs_type()/kernfs_type()/
      * s/SD_DEACTIVATED_BIAS/KN_DEACTIVATED_BIAS/
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      df23fc39
    • T
      kernfs: s/sysfs/kernfs/ in various data structures · c525aadd
      Tejun Heo 提交于
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      This patch performs the following renames.
      
      * s/sysfs_open_dirent/kernfs_open_node/
      * s/sysfs_open_file/kernfs_open_file/
      * s/sysfs_inode_attrs/kernfs_iattrs/
      * s/sysfs_addrm_cxt/kernfs_addrm_cxt/
      * s/sysfs_super_info/kernfs_super_info/
      * s/sysfs_info()/kernfs_info()/
      * s/sysfs_open_dirent_lock/kernfs_open_node_lock/
      * s/sysfs_open_file_mutex/kernfs_open_file_mutex/
      * s/sysfs_of()/kernfs_of()/
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c525aadd
    • T
      kernfs: drop s_ prefix from kernfs_node members · adc5e8b5
      Tejun Heo 提交于
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      s_ prefix for kernfs members is used inconsistently and a misnomer
      now.  It's not like kernfs_node is used widely across the kernel
      making the ability to grep for the members particularly useful.  Let's
      just drop the prefix.
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      adc5e8b5
    • T
      kernfs: s/sysfs_dirent/kernfs_node/ and rename its friends accordingly · 324a56e1
      Tejun Heo 提交于
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      This patch performs the following renames.
      
      * s/sysfs_elem_dir/kernfs_elem_dir/
      * s/sysfs_elem_symlink/kernfs_elem_symlink/
      * s/sysfs_elem_attr/kernfs_elem_file/
      * s/sysfs_dirent/kernfs_node/
      * s/sd/kn/ in kernfs proper
      * s/parent_sd/parent/
      * s/target_sd/target/
      * s/dir_sd/parent/
      * s/to_sysfs_dirent()/rb_to_kn()/
      * misc renames of local vars when they conflict with the above
      
      Because md, mic and gpio dig into sysfs details, this patch ends up
      modifying them.  All are sysfs_dirent renames and trivial.  While we
      can avoid these by introducing a dummy wrapping struct sysfs_dirent
      around kernfs_node, given the limited usage outside kernfs and sysfs
      proper, I don't think such workaround is called for.
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      
      - mic / gpio renames were missing.  Spotted by kbuild test robot.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      324a56e1
  7. 30 11月, 2013 5 次提交
    • T
      sysfs, kernfs: implement kernfs_ns_enabled() · ac9bba03
      Tejun Heo 提交于
      fs/sysfs/symlink.c::sysfs_delete_link() tests @sd->s_flags for
      SYSFS_FLAG_NS.  Let's add kernfs_ns_enabled() so that sysfs doesn't
      have to test sysfs_dirent flag directly.  This makes things tidier for
      kernfs proper too.
      
      This is purely cosmetic.
      
      v2: To avoid possible NULL deref, use noop dummy implementation which
          always returns false when !CONFIG_SYSFS.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac9bba03
    • T
      sysfs, kernfs: make sysfs_dirent definition public · cf9e5a73
      Tejun Heo 提交于
      sysfs_dirent includes some information which should be available to
      kernfs users - the type, flags, name and parent pointer.  This patch
      moves sysfs_dirent definition from kernfs/kernfs-internal.h to
      include/linux/kernfs.h so that kernfs users can access them.
      
      The type part of flags is exported as enum kernfs_node_type, the flags
      kernfs_node_flag, sysfs_type() and kernfs_enable_ns() are moved to
      include/linux/kernfs.h and the former is updated to return the enum
      type.  sysfs_dirent->s_parent and ->s_name are marked explicitly as
      public.
      
      This patch doesn't introduce any functional changes.
      
      v2: Flags exported too and kernfs_enable_ns() definition moved.
      
      v3: While moving kernfs_enable_ns() to include/linux/kernfs.h, v1 and
          v2 put the definition outside CONFIG_SYSFS replacing the dummy
          implementation with the actual implementation too.  Unfortunately,
          this can lead to oops when !CONFIG_SYSFS because
          kernfs_enable_ns() may be called on a NULL @sd and now tries to
          dereference @sd instead of not doing anything.  This issue was
          reported by Yuanhan Liu.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NYuanhan Liu <yuanhan.liu@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf9e5a73
    • T
      sysfs, kernfs: prepare mount path for kernfs · 4b93dc9b
      Tejun Heo 提交于
      We're in the process of separating out core sysfs functionality into
      kernfs which will deal with sysfs_dirents directly.  This patch
      rearranges mount path so that the kernfs and sysfs parts are separate.
      
      * As sysfs_super_info won't be visible outside kernfs proper,
        kernfs_super_ns() is added to allow kernfs users to access a
        super_block's namespace tag.
      
      * Generic mount operation is separated out into kernfs_mount_ns().
        sysfs_mount() now just performs sysfs-specific permission check,
        acquires namespace tag, and invokes kernfs_mount_ns().
      
      * Generic superblock release is separated out into kernfs_kill_sb()
        which can be used directly as file_system_type->kill_sb().  As sysfs
        needs to put the namespace tag, sysfs_kill_sb() wraps
        kernfs_kill_sb() with ns tag put.
      
      * sysfs_dir_cachep init and sysfs_inode_init() are separated out into
        kernfs_init().  kernfs_init() uses only small amount of memory and
        trying to handle and propagate kernfs_init() failure doesn't make
        much sense.  Use SLAB_PANIC for sysfs_dir_cachep and make
        sysfs_inode_init() panic on failure.
      
        After this change, kernfs_init() should be called before
        sysfs_init(), fs/namespace.c::mnt_init() modified accordingly.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b93dc9b
    • T
      sysfs, kernfs: make inode number ida per kernfs_root · bc755553
      Tejun Heo 提交于
      kernfs is being updated to allow multiple sysfs_dirent hierarchies so
      that it can also be used by other users.  Currently, inode number is
      allocated using a global ida, sysfs_ino_ida; however, inos for
      different hierarchies should be handled separately.
      
      This patch makes ino allocation per kernfs_root.  sysfs_ino_ida is
      replaced by kernfs_root->ino_ida and sysfs_new_dirent() is updated to
      take @root and allocate ino from it.  ida_simple_get/remove() are used
      instead of sysfs_ino_lock and sysfs_alloc/free_ino().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc755553
    • T
      sysfs, kernfs: implement kernfs_create/destroy_root() · ba7443bc
      Tejun Heo 提交于
      There currently is single kernfs hierarchy in the whole system which
      is used for sysfs.  kernfs needs to support multiple hierarchies to
      allow other users.  This patch introduces struct kernfs_root which
      serves as the root of each kernfs hierarchy and implements
      kernfs_create/destroy_root().
      
      * Each kernfs_root is associated with a root sd (sysfs_dentry).  The
        root is freed when the root sd is released and kernfs_destory_root()
        simply invokes kernfs_remove() on the root sd.  sysfs_remove_one()
        is updated to handle release of the root sd.  Note that ps_iattr
        update in sysfs_remove_one() is trivially updated for readability.
      
      * Root sd's are now dynamically allocated using sysfs_new_dirent().
        Update sysfs_alloc_ino() so that it gives out ino from 1 so that the
        root sd still gets ino 1.
      
      * While kernfs currently only points to the root sd, it'll soon grow
        fields which are specific to each hierarchy.  As determining a given
        sd's root will be necessary, sd->s_dir.root is added.  This backlink
        fits better as a separate field in sd; however, sd->s_dir is inside
        union with space to spare, so use it to save space and provide
        kernfs_root() accessor to determine the root sd.
      
      * As hierarchies may be destroyed now, each mount needs to hold onto
        the hierarchy it's attached to.  Update sysfs_fill_super() and
        sysfs_kill_sb() so that they get and put the kernfs_root
        respectively.
      
      * sysfs_root is replaced with kernfs_root which is dynamically created
        by invoking kernfs_create_root() from sysfs_init().
      
      This patch doesn't introduce any visible behavior changes.
      
      v2: kernfs_create_root() forgot to set @sd->priv.  Fixed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ba7443bc