1. 20 May, 2014 · 1 commit
    • sysfs: make sure read buffer is zeroed · f5c16f29
      Committed by Tejun Heo
      13c589d5 ("sysfs: use seq_file when reading regular files")
      switched sysfs from custom read implementation to seq_file to enable
      later transition to kernfs.  After the change, the buffer passed to
      ->show() is acquired through seq_get_buf(); unfortunately, this
      introduces a subtle behavior change.  Before the commit, the buffer
      passed to ->show() was always zero as it was allocated using
      get_zeroed_page().  Because seq_file doesn't clear buffers on
      allocation and neither does seq_get_buf(), after the commit, depending
      on the behavior of ->show(), we may end up exposing uninitialized data
      to userland thus possibly altering userland visible behavior and
      leaking information.
      
      Fix it by explicitly clearing the buffer.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Ron <ron@debian.org>
      Fixes: 13c589d5 ("sysfs: use seq_file when reading regular files")
      Cc: stable <stable@vger.kernel.org> # 3.13+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
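The effect of the fix can be illustrated with a small userspace model. This is only a sketch, not the kernel code: `read_attr()`, `sloppy_show()` and `BUF_SIZE` are hypothetical names, and the buffer is a plain array rather than one obtained through seq_get_buf(). It assumes a ->show() style callback that reports more bytes than it actually wrote.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical ->show() style callback: it writes 3 bytes but reports 8. */
static size_t sloppy_show(char *buf)
{
    memcpy(buf, "ok\n", 3);
    return 8;
}

/* Model of the read path: clear the buffer before calling the callback. */
static size_t read_attr(char *buf, size_t size, size_t (*show)(char *))
{
    memset(buf, 0, size);   /* without this, stale bytes could be exposed */
    return show(buf);
}
```

With the memset() in place, the bytes beyond what the callback wrote read back as zero instead of whatever the buffer held before.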
  2. 13 May, 2014 · 1 commit
    • kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs · 555724a8
      Committed by Tejun Heo
      The kernfs open method - kernfs_fop_open() - inherited extra
      permission checks from sysfs.  While the vfs layer allows ignoring the
      read/write permissions checks if the issuer has CAP_DAC_OVERRIDE,
      sysfs explicitly denied open regardless of the cap if the file doesn't
      have any of the UGO perms of the requested access or doesn't implement
      the requested operation.  It can be debated whether this was a good
      idea or not but the behavior is too subtle and dangerous to change at
      this point.
      
      After cgroup got converted to kernfs, this extra perm check also got
      applied to cgroup breaking libcgroup which opens write-only files with
      O_RDWR as root.  This patch gates the extra open permission check with
      a new flag KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK and enables it for sysfs.
      For sysfs, nothing changes.  For cgroup, root now can perform any
      operation regardless of the permissions as it was before kernfs
      conversion.  Note that kernfs still fails unimplemented operations
      with -EINVAL.
      
      While at it, add comments explaining KERNFS_ROOT flags.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Andrey Wagin <avagin@gmail.com>
      Tested-by: Andrey Wagin <avagin@gmail.com>
      Cc: Li Zefan <lizefan@huawei.com>
      References: http://lkml.kernel.org/g/CANaxB-xUm3rJ-Cbp72q-rQJO5mZe1qK6qXsQM=vh0U8upJ44+A@mail.gmail.com
      Fixes: 2bd59d48 ("cgroup: convert to kernfs")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
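The gating described in this commit can be sketched as a userspace model. All names here (`file_model`, `open_check()`, `ROOT_EXTRA_OPEN_PERM_CHECK`, `WANT_READ`, `WANT_WRITE`) are hypothetical stand-ins, and returning -EACCES for a denied open is an assumption of the sketch rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the per-root flag KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK. */
#define ROOT_EXTRA_OPEN_PERM_CHECK 0x1u

#define WANT_READ  0x1u
#define WANT_WRITE 0x2u

struct file_model {
    unsigned int root_flags;  /* flags of the root this file lives under */
    bool any_read_perm;       /* at least one UGO read bit is set */
    bool any_write_perm;      /* at least one UGO write bit is set */
    bool has_read_op;         /* a read operation is implemented */
    bool has_write_op;        /* a write operation is implemented */
};

/* The extra check runs only when the root opted in (sysfs in the commit). */
static int open_check(const struct file_model *f, unsigned int want)
{
    if (f->root_flags & ROOT_EXTRA_OPEN_PERM_CHECK) {
        if ((want & WANT_WRITE) && (!f->any_write_perm || !f->has_write_op))
            return -EACCES;  /* assumed error code for this sketch */
        if ((want & WANT_READ) && (!f->any_read_perm || !f->has_read_op))
            return -EACCES;
    }
    return 0;
}
```

A write-only file opened for reading and writing is refused under a root with the flag set and allowed under a root without it, which mirrors the libcgroup case mentioned in the message.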
  3. 17 Apr, 2014 · 1 commit
  4. 26 Mar, 2014 · 1 commit
  5. 24 Mar, 2014 · 1 commit
  6. 25 Feb, 2014 · 1 commit
    • sysfs: fix namespace refcnt leak · fed95bab
      Committed by Li Zefan
      As mount() and kill_sb() are not a one-to-one match, we shouldn't get
      the ns refcnt unconditionally in sysfs_mount(); instead we should
      get the refcnt only when kernfs_mount() allocated a new superblock.
      
      v2:
      - Changed the name of the new argument, suggested by Tejun.
      - Made the argument optional, suggested by Tejun.
      
      v3:
      - Made the new argument the second-to-last arg, suggested by Tejun.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Tejun Heo <tj@kernel.org>
       ---
       fs/kernfs/mount.c      | 8 +++++++-
       fs/sysfs/mount.c       | 5 +++--
       include/linux/kernfs.h | 9 +++++----
       3 files changed, 15 insertions(+), 7 deletions(-)
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
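The refcount rule in this commit can be modelled in userspace as follows. This is a sketch under simplifying assumptions: a single cached superblock, and hypothetical names (`ns_tag`, `fs_model`, `mount_model()`, `sysfs_mount_model()`, `kill_sb_model()`) that do not exist in the kernel. The out-parameter plays the role of the new optional argument the message refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ns_tag   { int refcnt; };
struct fs_model { struct ns_tag *sb_ns; bool sb_live; };

/* Stand-in for kernfs_mount(): reuses the live superblock if there is one
 * and reports through the optional out-parameter whether it created one. */
static void mount_model(struct fs_model *fs, struct ns_tag *ns, bool *new_sb)
{
    bool created = !fs->sb_live;

    if (created) {
        fs->sb_ns = ns;
        fs->sb_live = true;
    }
    if (new_sb)
        *new_sb = created;
}

/* Stand-in for sysfs_mount(): keep the ns reference only for a new sb. */
static void sysfs_mount_model(struct fs_model *fs, struct ns_tag *ns)
{
    bool new_sb;

    ns->refcnt++;
    mount_model(fs, ns, &new_sb);
    if (!new_sb)
        ns->refcnt--;   /* an existing superblock already holds its ref */
}

/* Stand-in for kill_sb(): runs once per superblock, not once per mount. */
static void kill_sb_model(struct fs_model *fs)
{
    fs->sb_ns->refcnt--;
    fs->sb_live = false;
}
```

Two mounts followed by a single kill_sb leave the count balanced, which is the property the unconditional get in sysfs_mount() broke.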
  7. 16 Feb, 2014 · 1 commit
    • sysfs: create bin_attributes under the requested group · aabaf4c2
      Committed by Cody P Schafer
      bin_attributes created/updated in create_files() (such as those listed
      via (struct device).attribute_groups) were not placed under the
      specified group, and instead appeared in the base kobj directory.
      
      Fix this by making bin_attributes use creation code similar to that
      of normal attributes.
      
      A quick grep shows that no one is using bin_attrs in a named attribute
      group yet, so we can do this without breaking anything in userspace.
      
      Note that I do not add is_visible() support to
      bin_attributes, though that could be done as well.
      Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. 08 Feb, 2014 · 5 commits
    • kernfs: add CONFIG_KERNFS · ba341d55
      Committed by Tejun Heo
      As sysfs was kernfs's only user, kernfs has been piggybacking on
      CONFIG_SYSFS; however, kernfs is scheduled to grow a new user very
      soon.  Introduce a separate config option CONFIG_KERNFS which is to be
      selected by kernfs users.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends · 3eef34ad
      Committed by Tejun Heo
      kernfs_node->parent and ->name are currently marked as "published"
      indicating that kernfs users may access them directly; however, those
      fields may get updated by kernfs_rename[_ns]() and unrestricted access
      may lead to erroneous values or oops.
      
      Protect ->parent and ->name updates with an irq-safe spinlock
      kernfs_rename_lock and implement the following accessors for these
      fields.
      
      * kernfs_name()		- format the node's name into the specified buffer
      * kernfs_path()		- format the node's path into the specified buffer
      * pr_cont_kernfs_name()	- pr_cont a node's name (doesn't need buffer)
      * pr_cont_kernfs_path()	- pr_cont a node's path (doesn't need buffer)
      * kernfs_get_parent()	- pin and return a node's parent
      
      All can be called under any context.  The recursive sysfs_pathname()
      in fs/sysfs/dir.c is replaced with kernfs_path() and
      sysfs_rename_dir_ns() is updated to use kernfs_get_parent() instead of
      dereferencing parent directly.
      
      v2: Dummy definition of kernfs_path() for !CONFIG_KERNFS was missing
          static inline making it cause a lot of build warnings.  Add it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs: allow nodes to be created in the deactivated state · d35258ef
      Committed by Tejun Heo
      Currently, kernfs_nodes are made visible to userland on creation,
      which makes it difficult for kernfs users to atomically succeed or
      fail creation of multiple nodes.  In addition, if something fails
      after creating some nodes, the created nodes might already be in use
      and their active refs need to be drained for removal, which has the
      potential to introduce tricky reverse locking dependency on active_ref
      depending on how the error path is synchronized.
      
      This patch introduces per-root flag KERNFS_ROOT_CREATE_DEACTIVATED.
      If set, all nodes under the root are created in the deactivated state
      and stay invisible to userland until explicitly enabled by the new
      kernfs_activate() API.  Also, nodes which have never been activated
      are guaranteed to bypass draining on removal thus allowing error paths
      to not worry about locking dependency on active_ref draining.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner() · ce8b04aa
      Committed by Tejun Heo
      All device_schedule_callback_owner() users are converted to use
      device_remove_file_self().  Remove now unused
      {sysfs|device}_schedule_callback_owner().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers · 6b0afc2a
      Committed by Tejun Heo
      Sometimes it's necessary to implement a node which wants to delete
      nodes including itself.  This isn't straightforward because of kernfs
      active reference.  While a file operation is in progress, an active
      reference is held and kernfs_remove() waits for all such references to
      drain before completing.  For a self-deleting node, this is a deadlock
      as kernfs_remove() ends up waiting for an active reference that it
      itself is sitting on top of.
      
      This currently is worked around in the sysfs layer using
      sysfs_schedule_callback() which makes such removals asynchronous.
      While it works, it's rather cumbersome and inherently breaks
      synchronicity of the operation - the file operation which triggered
      the operation may complete before the removal is finished (or even
      started) and the removal may fail asynchronously.  If a removal
      operation is immediately followed by another operation which expects
      the specific name to be available (e.g. removal followed by rename
      onto the same name), there's no way to make the latter operation
      reliable.
      
      The thing is there's no inherent reason for this to be asynchronous.
      All that's necessary to make this synchronous is a dedicated operation
      which drops its own active ref and deactivates self.  This patch
      implements kernfs_remove_self() and its wrappers in sysfs and driver
      core.  kernfs_remove_self() is to be called from one of the file
      operations, drops the active ref the task is holding, removes the self
      node, and restores active ref to the dead node so that the ref is
      balanced afterwards.  __kernfs_remove() is updated so that it takes an
      early exit if the target node is already fully removed so that the
      active ref restored by kernfs_remove_self() after removal doesn't
      confuse the deactivation path.
      
      This makes implementing self-deleting nodes very easy.  The normal
      removal path doesn't even need to be changed to use
      kernfs_remove_self() for the self-deleting node.  The method can
      invoke kernfs_remove_self() on itself before proceeding with the normal
      removal path.  kernfs_remove() invoked on the node by the normal
      deletion path will simply be ignored.
      
      This will replace sysfs_schedule_callback().  A subtle feature of
      sysfs_schedule_callback() is that it collapses multiple invocations -
      even if multiple removals are triggered, the removal callback is run
      only once.  An equivalent effect can be achieved by testing the return
      value of kernfs_remove_self() - only the one which gets %true return
      value should proceed with actual deletion.  All other instances of
      kernfs_remove_self() will wait till the enclosing kernfs operation
      which invoked the winning instance of kernfs_remove_self() finishes
      and then return %false.  This trivially makes all users of
      kernfs_remove_self() automatically show correct synchronous behavior
      even when there are multiple concurrent operations - all "echo 1 >
      delete" instances will finish only after the whole operation is
      completed by one of the instances.
      
      Note that manipulation of active ref is implemented in separate public
      functions - kernfs_[un]break_active_protection().
      kernfs_remove_self() is the only user at the moment but this will be
      used to cater to more complex cases.
      
      v2: For !CONFIG_SYSFS, dummy version kernfs_remove_self() was missing
          and sysfs_remove_file_self() had incorrect return type.  Fix it.
          Reported by kbuild test bot.
      
      v3: kernfs_[un]break_active_protection() separated out from
          kernfs_remove_self() and exposed as public API.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
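The return-value convention described in this message (only the caller that gets a true return proceeds with the actual deletion) can be sketched with a single-threaded userspace model. The names `node_model`, `remove_self_model()` and `delete_store()` are hypothetical. The model deliberately leaves out what makes the real function hard: dropping and restoring the active reference, and making losing callers wait until the winner's operation has finished.

```c
#include <assert.h>
#include <stdbool.h>

struct node_model {
    bool removed;     /* set once the node has removed itself */
    int  deletions;   /* how many times the real deletion work ran */
};

/* Stand-in for kernfs_remove_self(): true only for the first caller. */
static bool remove_self_model(struct node_model *n)
{
    if (n->removed)
        return false;
    n->removed = true;
    return true;
}

/* Hypothetical write handler behind a "delete" file. */
static void delete_store(struct node_model *n)
{
    if (remove_self_model(n))
        n->deletions++;   /* only the winning invocation does the work */
}
```

Calling the handler several times runs the deletion work once, which corresponds to the collapsing behavior that sysfs_schedule_callback() used to provide.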
  9. 14 Jan, 2014 · 2 commits
  10. 11 Jan, 2014 · 2 commits
    • sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner() · d1ba277e
      Committed by Tejun Heo
      All device_schedule_callback_owner() users are converted to use
      device_remove_file_self().  Remove now unused
      {sysfs|device}_schedule_callback_owner().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers · 1ae06819
      Committed by Tejun Heo
      Sometimes it's necessary to implement a node which wants to delete
      nodes including itself.  This isn't straightforward because of kernfs
      active reference.  While a file operation is in progress, an active
      reference is held and kernfs_remove() waits for all such references to
      drain before completing.  For a self-deleting node, this is a deadlock
      as kernfs_remove() ends up waiting for an active reference that it
      itself is sitting on top of.
      
      This currently is worked around in the sysfs layer using
      sysfs_schedule_callback() which makes such removals asynchronous.
      While it works, it's rather cumbersome and inherently breaks
      synchronicity of the operation - the file operation which triggered
      the operation may complete before the removal is finished (or even
      started) and the removal may fail asynchronously.  If a removal
      operation is immediately followed by another operation which expects
      the specific name to be available (e.g. removal followed by rename
      onto the same name), there's no way to make the latter operation
      reliable.
      
      The thing is there's no inherent reason for this to be asynchronous.
      All that's necessary to make this synchronous is a dedicated operation
      which drops its own active ref and deactivates self.  This patch
      implements kernfs_remove_self() and its wrappers in sysfs and driver
      core.  kernfs_remove_self() is to be called from one of the file
      operations, drops the active ref and deactivates using
      __kernfs_deactivate_self(), removes the self node, and restores active
      ref to the dead node using __kernfs_reactivate_self() so that the ref
      is balanced afterwards.  __kernfs_remove() is updated so that it takes
      an early exit if the target node is already fully removed so that the
      active ref restored by kernfs_remove_self() after removal doesn't
      confuse the deactivation path.
      
      This makes implementing self-deleting nodes very easy.  The normal
      removal path doesn't even need to be changed to use
      kernfs_remove_self() for the self-deleting node.  The method can
      invoke kernfs_remove_self() on itself before proceeding with the normal
      removal path.  kernfs_remove() invoked on the node by the normal
      deletion path will simply be ignored.
      
      This will replace sysfs_schedule_callback().  A subtle feature of
      sysfs_schedule_callback() is that it collapses multiple invocations -
      even if multiple removals are triggered, the removal callback is run
      only once.  An equivalent effect can be achieved by testing the return
      value of kernfs_remove_self() - only the one which gets %true return
      value should proceed with actual deletion.  All other instances of
      kernfs_remove_self() will wait till the enclosing kernfs operation
      which invoked the winning instance of kernfs_remove_self() finishes
      and then return %false.  This trivially makes all users of
      kernfs_remove_self() automatically show correct synchronous behavior
      even when there are multiple concurrent operations - all "echo 1 >
      delete" instances will finish only after the whole operation is
      completed by one of the instances.
      
      v2: For !CONFIG_SYSFS, dummy version kernfs_remove_self() was missing
          and sysfs_remove_file_self() had incorrect return type.  Fix it.
          Reported by kbuild test bot.
      
      v3: Updated to use __kernfs_{de|re}activate_self().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  11. 18 Dec, 2013 · 3 commits
    • kernfs: add kernfs_dir_ops · 80b9bbef
      Committed by Tejun Heo
      Add support for mkdir(2), rmdir(2) and rename(2) syscalls.  This is
      implemented through optional kernfs_dir_ops callback table which can
      be specified on kernfs_create_root().  An implemented callback is
      invoked when the matching syscall is invoked.
      
      As kernfs keeps dcache in sync with its internal representation and
      revalidates dentries on each access, the implementation of these
      methods is extremely simple.  Each just discovers the relevant
      kernfs_node(s) and invokes the requested callback which is allowed to
      do any kernfs operations and the end result doesn't necessarily have
      to match the expected semantics of the syscall.
      
      This will be used to convert cgroup to use kernfs instead of its own
      filesystem implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs: mark static names with KERNFS_STATIC_NAME · 2063d608
      Committed by Tejun Heo
      Because sysfs used struct attribute which are supposed to stay
      constant, sysfs didn't copy names when creating regular files.  The
      specified string for name was supposed to stay constant.  Such
      distinction isn't inherent for kernfs.  kernfs_create_file[_ns]()
      should be able to take the same @name as kernfs_create_dir[_ns]().
      
      As there can be a huge number of sysfs attributes, we still want to be
      able to use static names for sysfs attributes.  This patch renames
      kernfs_create_file_ns_key() to __kernfs_create_file() and adds
      @name_is_static parameter so that the caller can explicitly indicate
      that @name can be used without copying.  kernfs is updated to use
      KERNFS_STATIC_NAME to distinguish static and copied names.
      
      This patch doesn't introduce any behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
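The distinction between static and copied names can be sketched in userspace like this. `node_model`, `node_init()`, `node_release()` and `NODE_STATIC_NAME` are hypothetical names chosen for the sketch; only the idea of a flag recording whether the name may be used without copying comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for KERNFS_STATIC_NAME. */
#define NODE_STATIC_NAME 0x1u

struct node_model {
    const char  *name;
    unsigned int flags;
};

/* Borrow the caller's string when it is static, otherwise keep a copy. */
static int node_init(struct node_model *n, const char *name, bool name_is_static)
{
    if (name_is_static) {
        n->name = name;
        n->flags = NODE_STATIC_NAME;
        return 0;
    }

    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, name, len);
    n->name = copy;
    n->flags = 0;
    return 0;
}

/* Free the name only if this node owns a private copy. */
static void node_release(struct node_model *n)
{
    if (!(n->flags & NODE_STATIC_NAME))
        free((void *)n->name);
    n->name = NULL;
}
```

The flag is what lets the release path decide whether the name is owned by the node, so callers with constant strings avoid an allocation per attribute.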
    • kernfs: add @mode to kernfs_create_dir[_ns]() · bb8b9d09
      Committed by Tejun Heo
      sysfs assumed 0755 for all newly created directories and kernfs
      inherited it.  This assumption is unnecessarily restrictive and
      inconsistent with kernfs_create_file[_ns]().  This patch adds a @mode
      parameter to kernfs_create_dir[_ns]() and updates its users in sysfs
      accordingly.  Among others, this will be useful for implementations of
      the planned ->mkdir() method.
      
      This patch doesn't introduce any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  12. 12 Dec, 2013 · 4 commits
    • kernfs: s/sysfs/kernfs/ in constants · df23fc39
      Committed by Tejun Heo
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      This patch performs the following renames.
      
      * s/SYSFS_DIR/KERNFS_DIR/
      * s/SYSFS_KOBJ_ATTR/KERNFS_FILE/
      * s/SYSFS_KOBJ_LINK/KERNFS_LINK/
      * s/SYSFS_{TYPE_FLAGS}/KERNFS_{TYPE_FLAGS}/
      * s/SYSFS_FLAG_{FLAG}/KERNFS_{FLAG}/
      * s/sysfs_type()/kernfs_type()/
      * s/SD_DEACTIVATED_BIAS/KN_DEACTIVATED_BIAS/
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs: s/sysfs/kernfs/ in various data structures · c525aadd
      Committed by Tejun Heo
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      This patch performs the following renames.
      
      * s/sysfs_open_dirent/kernfs_open_node/
      * s/sysfs_open_file/kernfs_open_file/
      * s/sysfs_inode_attrs/kernfs_iattrs/
      * s/sysfs_addrm_cxt/kernfs_addrm_cxt/
      * s/sysfs_super_info/kernfs_super_info/
      * s/sysfs_info()/kernfs_info()/
      * s/sysfs_open_dirent_lock/kernfs_open_node_lock/
      * s/sysfs_open_file_mutex/kernfs_open_file_mutex/
      * s/sysfs_of()/kernfs_of()/
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs: drop s_ prefix from kernfs_node members · adc5e8b5
      Committed by Tejun Heo
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      s_ prefix for kernfs members is used inconsistently and a misnomer
      now.  It's not like kernfs_node is used widely across the kernel
      making the ability to grep for the members particularly useful.  Let's
      just drop the prefix.
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs: s/sysfs_dirent/kernfs_node/ and rename its friends accordingly · 324a56e1
      Committed by Tejun Heo
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      This patch performs the following renames.
      
      * s/sysfs_elem_dir/kernfs_elem_dir/
      * s/sysfs_elem_symlink/kernfs_elem_symlink/
      * s/sysfs_elem_attr/kernfs_elem_file/
      * s/sysfs_dirent/kernfs_node/
      * s/sd/kn/ in kernfs proper
      * s/parent_sd/parent/
      * s/target_sd/target/
      * s/dir_sd/parent/
      * s/to_sysfs_dirent()/rb_to_kn()/
      * misc renames of local vars when they conflict with the above
      
      Because md, mic and gpio dig into sysfs details, this patch ends up
      modifying them.  All are sysfs_dirent renames and trivial.  While we
      can avoid these by introducing a dummy wrapping struct sysfs_dirent
      around kernfs_node, given the limited usage outside kernfs and sysfs
      proper, I don't think such workaround is called for.
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      
      - mic / gpio renames were missing.  Spotted by kbuild test robot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  13. 11 Dec, 2013 · 2 commits
    • sysfs: fix use-after-free in sysfs_kill_sb() · a7560a01
      Committed by Tejun Heo
      While restructuring the [u]mount path, 4b93dc9b ("sysfs, kernfs:
      prepare mount path for kernfs") incorrectly updated sysfs_kill_sb() so
      that it first kills super_block and then tries to dereference its
      namespace tag to drop it.  Fix it by caching namespace tag before
      killing the superblock and then drop the cached namespace tag.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      Tested-by: Vlastimil Babka <vbabka@suse.cz>
      Link: http://lkml.kernel.org/g/20131205031051.GC5135@yliu-dev.sh.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs: bail early from kernfs_file_mmap() to avoid spurious lockdep warning · 9b2db6e1
      Committed by Tejun Heo
      This is the v3.14 fix for the same issue that a8b14744 ("sysfs: give
      different locking key to regular and bin files") addresses for v3.13.
      Due to the extensive kernfs reorganization in the v3.14 branch, the
      same fix couldn't be ported as-is.  The v3.13 fix was ignored while
      merging it into the v3.14 branch.
      
      027a485d ("sysfs: use a separate locking class for open files
      depending on mmap") assigned different lockdep key to
      sysfs_open_file->mutex depending on whether the file implements mmap
      or not in an attempt to avoid spurious lockdep warning caused by
      merging of regular and bin file paths.
      
      While this restored some of the original behavior of using different
      locks (at least as far as lockdep is concerned) for the different
      classes of files, the restoration wasn't full because now the lockdep
      key assignment depends on whether the file has mmap or not instead of
      whether it's a regular file or not.
      
      This means that bin files which don't implement mmap will get assigned
      the same lockdep class as regular files.  This is problematic because
      file_operations for bin files still implements the mmap file operation
      and checking whether the sysfs file actually implements mmap happens
      in the file operation after grabbing @sysfs_open_file->mutex.  We
      still end up adding locking dependency from mmap locking to
      sysfs_open_file->mutex to the regular file mutex which triggers
      spurious circular locking warning.
      
      For v3.13, a8b14744 ("sysfs: give different locking key to regular
      and bin files") fixed it by giving sysfs_open_file->mutex different
      lockdep keys depending on whether the file is regular or bin instead
      of whether mmap exists or not; however, due to the way sysfs is now
      layered behind kernfs, this approach is no longer viable.  kernfs can
      tell whether a sysfs node has mmap implemented or not but can't tell
      a bin file from a regular one.
      
      This patch updates kernfs such that kernfs_file_mmap() checks
      SYSFS_FLAG_HAS_MMAP and bails before grabbing sysfs_open_file->mutex so
      that it doesn't add spurious locking dependency from mmap to
      sysfs_open_file->mutex and changes sysfs so that it specifies
      kernfs_ops->mmap iff the sysfs file implements mmap.  Combined, this
      ensures that sysfs_open_file->mutex is grabbed under mmap path iff the
      sysfs file actually implements mmap.  As sysfs_open_file->mutex is
      already given a different lockdep key if mmap is implemented, this
      removes the spurious locking dependency.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Dave Jones <davej@redhat.com>
      Link: http://lkml.kernel.org/g/20131203184324.GA11320@redhat.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  14. 08 Dec, 2013 · 1 commit
    • sysfs: give different locking key to regular and bin files · a8b14744
      Committed by Tejun Heo
      027a485d ("sysfs: use a separate locking class for open files
      depending on mmap") assigned different lockdep key to
      sysfs_open_file->mutex depending on whether the file implements mmap
      or not in an attempt to avoid spurious lockdep warning caused by
      merging of regular and bin file paths.
      
      While this restored some of the original behavior of using different
      locks (at least as far as lockdep is concerned) for the different
      classes of files, the restoration wasn't full because now the lockdep
      key assignment depends on whether the file has mmap or not instead of
      whether it's a regular file or not.
      
      This means that bin files which don't implement mmap will get assigned
      the same lockdep class as regular files.  This is problematic because
      file_operations for bin files still implements the mmap file operation
      and checking whether the sysfs file actually implements mmap happens
      in the file operation after grabbing @sysfs_open_file->mutex.  We
      still end up adding locking dependency from mmap locking to
      sysfs_open_file->mutex to the regular file mutex which triggers
      spurious circular locking warning.
      
      Fix it by restoring the original behavior fully by differentiating
      lockdep key by whether the file is regular or bin, instead of the
      existence of mmap.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Dave Jones <davej@redhat.com>
      Link: http://lkml.kernel.org/g/20131203184324.GA11320@redhat.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  15. 30 Nov, 2013 · 14 commits
    • sysfs, kernfs: remove cross inclusions of internal headers · bfc5c173
      Committed by Tejun Heo
      fs/kernfs/kernfs-internal.h needed to include fs/sysfs/sysfs.h because
      part of kernfs core implementation was living in sysfs.
      
      fs/sysfs/sysfs.h needed to include fs/kernfs/kernfs-internal.h because
      include/linux/kernfs.h didn't expose enough interface.
      
      The separation is complete and neither is true anymore.  Remove the
      cross inclusion and make sysfs a proper user of kernfs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: implement kernfs_ns_enabled() · ac9bba03
      Committed by Tejun Heo
      fs/sysfs/symlink.c::sysfs_delete_link() tests @sd->s_flags for
      SYSFS_FLAG_NS.  Let's add kernfs_ns_enabled() so that sysfs doesn't
      have to test sysfs_dirent flag directly.  This makes things tidier for
      kernfs proper too.
      
      This is purely cosmetic.
      
      v2: To avoid possible NULL deref, use noop dummy implementation which
          always returns false when !CONFIG_SYSFS.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: move mount core code to fs/kernfs/mount.c · fa736a95
      Committed by Tejun Heo
      Move core mount code to fs/kernfs/mount.c.  The respective
      declarations in fs/sysfs/sysfs.h are moved to
      fs/kernfs/kernfs-internal.h.
      
      This is pure relocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: prepare mount path for kernfs · 4b93dc9b
      Committed by Tejun Heo
      We're in the process of separating out core sysfs functionality into
      kernfs which will deal with sysfs_dirents directly.  This patch
      rearranges mount path so that the kernfs and sysfs parts are separate.
      
      * As sysfs_super_info won't be visible outside kernfs proper,
        kernfs_super_ns() is added to allow kernfs users to access a
        super_block's namespace tag.
      
      * Generic mount operation is separated out into kernfs_mount_ns().
        sysfs_mount() now just performs sysfs-specific permission check,
        acquires namespace tag, and invokes kernfs_mount_ns().
      
      * Generic superblock release is separated out into kernfs_kill_sb()
        which can be used directly as file_system_type->kill_sb().  As sysfs
        needs to put the namespace tag, sysfs_kill_sb() wraps
        kernfs_kill_sb() with ns tag put.
      
      * sysfs_dir_cachep init and sysfs_inode_init() are separated out into
        kernfs_init().  kernfs_init() uses only a small amount of memory and
        trying to handle and propagate kernfs_init() failure doesn't make
        much sense.  Use SLAB_PANIC for sysfs_dir_cachep and make
        sysfs_inode_init() panic on failure.
      
        After this change, kernfs_init() should be called before
        sysfs_init(); fs/namespace.c::mnt_init() is modified accordingly.
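The split between generic and sysfs-specific mount code listed above can be sketched in user space as below. This is only an illustration under stated assumptions: every `demo_*` name is hypothetical, the namespace tag is modeled as a plain reference counter, and the permission check is reduced to a boolean argument.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical namespace tag with a reference count. */
struct demo_ns {
	int refcnt;
};

/* Hypothetical superblock: remembers which tag it was mounted with. */
struct demo_sb {
	const struct demo_ns *ns;
	int alive;
};

/* Generic layer: lets users read a superblock's namespace tag. */
static const struct demo_ns *demo_super_ns(const struct demo_sb *sb)
{
	return sb->ns;
}

/* Generic layer: mount with a given tag, no policy decisions. */
static int demo_mount_ns(struct demo_sb *sb, const struct demo_ns *ns)
{
	sb->ns = ns;
	sb->alive = 1;
	return 0;
}

/* Generic layer: superblock release. */
static void demo_kill_sb(struct demo_sb *sb)
{
	sb->alive = 0;
	sb->ns = NULL;
}

/* Specific layer: permission check, acquire the tag, then delegate. */
static int demo_sysfs_mount(struct demo_sb *sb, struct demo_ns *ns,
			    bool permitted)
{
	int ret;

	if (!permitted)
		return -EPERM;

	ns->refcnt++;			/* acquire the namespace tag */
	ret = demo_mount_ns(sb, ns);
	if (ret)
		ns->refcnt--;		/* undo on failure */
	return ret;
}

/* Specific layer: wrap the generic release with a tag put. */
static void demo_sysfs_kill_sb(struct demo_sb *sb)
{
	/* Fetch the tag first; the generic release clears it. */
	struct demo_ns *ns = (struct demo_ns *)demo_super_ns(sb);

	demo_kill_sb(sb);
	ns->refcnt--;			/* put the namespace tag */
}
```

The wrapper reads the tag before calling the generic release because the release clears the superblock's state; the explicit cast drops the const qualifier that the accessor adds.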
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b93dc9b
    • T
      sysfs, kernfs: make super_blocks bind to different kernfs_roots · df394fb5
      Tejun Heo committed
      kernfs is being updated to allow multiple sysfs_dirent hierarchies so
      that it can also be used by other users.  Currently, sysfs
      super_blocks are always attached to one kernfs_root - sysfs_root - and
      distinguished only by their namespace tags.
      
      This patch adds sysfs_super_info->root and updates
      sysfs_fill/test_super() so that super_blocks are identified by the
      combination of both the associated kernfs_root and namespace tag.
      This allows mounting different kernfs hierarchies.
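The matching rule described above, where a superblock is identified by both its root and its namespace tag, might look roughly like the following user-space sketch. `demo_root`, `demo_super_info` and `demo_test_super()` are hypothetical names, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hierarchy root. */
struct demo_root {
	int id;
};

/* Hypothetical per-superblock info: which root, which namespace tag. */
struct demo_super_info {
	struct demo_root *root;
	const void *ns;
};

/*
 * Two superblocks are the same mount target only if BOTH the root and
 * the namespace tag match; comparing the tag alone is no longer enough
 * once several hierarchies exist.
 */
static bool demo_test_super(const struct demo_super_info *a,
			    const struct demo_super_info *b)
{
	assert(a && b);
	return a->root == b->root && a->ns == b->ns;
}
```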
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      df394fb5
    • T
      sysfs, kernfs: implement kernfs_create/destroy_root() · ba7443bc
      Tejun Heo committed
      There is currently a single kernfs hierarchy in the whole system which
      is used for sysfs.  kernfs needs to support multiple hierarchies to
      allow other users.  This patch introduces struct kernfs_root which
      serves as the root of each kernfs hierarchy and implements
      kernfs_create/destroy_root().
      
      * Each kernfs_root is associated with a root sd (sysfs_dirent).  The
        root is freed when the root sd is released and kernfs_destroy_root()
        simply invokes kernfs_remove() on the root sd.  sysfs_remove_one()
        is updated to handle release of the root sd.  Note that ps_iattr
        update in sysfs_remove_one() is trivially updated for readability.
      
      * Root sd's are now dynamically allocated using sysfs_new_dirent().
        Update sysfs_alloc_ino() to give out inos starting from 1 so that
        the root sd still gets ino 1.
      
      * While kernfs_root currently only points to the root sd, it'll soon
        grow fields which are specific to each hierarchy.  As determining a
        given sd's root will be necessary, sd->s_dir.root is added.  This
        backlink fits better as a separate field in sd; however, sd->s_dir
        is inside a union with space to spare, so use it to save space and
        provide the kernfs_root() accessor to determine a given sd's root.
      
      * As hierarchies may be destroyed now, each mount needs to hold onto
        the hierarchy it's attached to.  Update sysfs_fill_super() and
        sysfs_kill_sb() so that they get and put the kernfs_root
        respectively.
      
      * sysfs_root is replaced with kernfs_root which is dynamically created
        by invoking kernfs_create_root() from sysfs_init().
      
      This patch doesn't introduce any visible behavior changes.
      
      v2: kernfs_create_root() forgot to set @sd->priv.  Fixed.
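The relationships listed above (a dynamically allocated root node, ino numbering that starts at 1, and a backlink from each node to its root) can be sketched in user space as below. This is a simplified illustration with hypothetical `demo_*` names: the backlink is stored in every node rather than inside a union, and reference counting of the root is omitted.

```c
#include <stdlib.h>

struct demo_root;

/* Hypothetical node; the real structure keeps the backlink in a union. */
struct demo_node {
	unsigned int ino;
	struct demo_node *parent;
	struct demo_root *root;		/* backlink to the owning hierarchy */
	void *priv;
};

/* Hypothetical per-hierarchy root; for now it only points to its node. */
struct demo_root {
	struct demo_node *sd;
};

/* Inode numbers are handed out starting from 1. */
static unsigned int demo_next_ino = 1;

static struct demo_node *demo_new_node(struct demo_root *root,
				       struct demo_node *parent, void *priv)
{
	struct demo_node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->ino = demo_next_ino++;
	n->parent = parent;
	n->root = root;
	n->priv = priv;
	return n;
}

/* Create a hierarchy: allocate the root and its dynamically allocated node. */
static struct demo_root *demo_create_root(void *priv)
{
	struct demo_root *root = calloc(1, sizeof(*root));

	if (!root)
		return NULL;
	root->sd = demo_new_node(root, NULL, priv);
	if (!root->sd) {
		free(root);
		return NULL;
	}
	return root;
}

/* Accessor: find the hierarchy a node belongs to. */
static struct demo_root *demo_root_of(const struct demo_node *n)
{
	return n->root;
}

/* Destroy a hierarchy; children are assumed to be freed by the caller. */
static void demo_destroy_root(struct demo_root *root)
{
	free(root->sd);
	free(root);
}
```

In this sketch the first root created in a process gets ino 1 for its root node, mirroring the numbering behavior the message describes for the single sysfs hierarchy.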
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ba7443bc
    • T
      sysfs, kernfs: introduce sysfs_root_sd · 061447a4
      Tejun Heo committed
      Currently, it's assumed that there's a single kernfs hierarchy in the
      system anchored at sysfs_root which is defined as a global struct.  To
      allow other users of kernfs, this will be made dynamic.  Introduce a
      new global variable sysfs_root_sd which points to &sysfs_root and
      convert all &sysfs_root users.
      
      This patch doesn't introduce any behavior difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      061447a4
    • T
      sysfs, kernfs: no need to kern_mount() sysfs from sysfs_init() · 9e30cc95
      Tejun Heo committed
      It has been a long time since sysfs depended on vfs to keep track of
      internal states, and whether sysfs is mounted or not doesn't make any
      difference to sysfs's internal operation.
      
      In addition to init and filesystem type registration, sysfs_init()
      invokes kern_mount() to create an in-kernel mount of sysfs.  This
      internal mounting doesn't serve any purpose anymore.  Remove it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9e30cc95
    • T
      sysfs, kernfs: make sysfs_super_info->ns const · 51a35e9f
      Tejun Heo committed
      Add const qualifier to sysfs_super_info->ns so that it's consistent
      with other namespace tag usages in sysfs.  Because kobject doesn't use
      const qualifier for namespace tags, this ends up requiring an explicit
      cast to drop const qualifier in free_sysfs_super_info().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      51a35e9f
    • T
      sysfs, kernfs: drop unused params from sysfs_fill_super() · ccc532dc
      Tejun Heo committed
      sysfs_fill_super() takes three params - @sb, @data and @silent - but
      uses only @sb.  Drop the latter two.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ccc532dc
    • T
      sysfs, kernfs: move symlink core code to fs/kernfs/symlink.c · 2072f1af
      Tejun Heo committed
      Move core symlink code to fs/kernfs/symlink.c.  fs/sysfs/symlink.c now
      only contains sysfs wrappers around kernfs interfaces.  The respective
      declarations in fs/sysfs/sysfs.h are moved to
      fs/kernfs/kernfs-internal.h.
      
      This is pure relocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2072f1af
    • T
      sysfs, kernfs: move file core code to fs/kernfs/file.c · 414985ae
      Tejun Heo committed
      Move core file code to fs/kernfs/file.c.  fs/sysfs/file.c now contains
      sysfs kernfs_ops callbacks, sysfs wrappers around kernfs interfaces,
      and sysfs_schedule_callback().  The respective declarations in
      fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h.
      
      This is pure relocation.
      
      v2: Refreshed on top of the v2 of "sysfs, kernfs: prepare read path
          for kernfs".
      
      v3: Refreshed on top of the v3 of "sysfs, kernfs: prepare read path
          for kernfs".
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      414985ae
    • T
      sysfs, kernfs: move dir core code to fs/kernfs/dir.c · fd7b9f7b
      Tejun Heo committed
      Move core dir code to fs/kernfs/dir.c.  fs/sysfs/dir.c now only
      contains sysfs_warn_dup() and sysfs wrappers around kernfs interfaces.
      The respective declarations in fs/sysfs/sysfs.h are moved to
      fs/kernfs/kernfs-internal.h.
      
      This is pure relocation.
      
      v2: sysfs_symlink_target_lock was mistakenly relocated to kernfs.  It
          should remain with sysfs.  Fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fd7b9f7b
    • T
      sysfs, kernfs: move inode code to fs/kernfs/inode.c · ffed24e2
      Tejun Heo committed
      There's nothing sysfs-specific in fs/sysfs/inode.c.  Move everything
      in it to fs/kernfs/inode.c.  The respective declarations in
      fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h.
      
      This is pure relocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ffed24e2