1. 23 Jan, 2018 2 commits
  2. 28 Nov, 2017 1 commit
    • Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Linus Torvalds committed
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      
      The script to do this was:
      
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
          SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
                DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
                POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
                I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
                ACTIVE NOUSER"
      
          SED_PROG=
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
      
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
      
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
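The MS_*/SB_* split described in 1751e8a6 can be sketched in userspace C. This is a minimal illustration, not kernel code: the `DEMO_` constants, `struct demo_super_block`, and both helpers are hypothetical names, and the mask of "superblock-relevant" flags is an assumption made for the example. It only shows the idea that mount(2) receives MS_* values while the superblock stores SB_* values that currently mirror them.

```c
#include <assert.h>
#include <stdbool.h>

/* Flags as passed to mount(2); values copied here for illustration. */
enum {
    DEMO_MS_RDONLY      = 1,
    DEMO_MS_SYNCHRONOUS = 16,
    DEMO_MS_DIRSYNC     = 128,
    DEMO_MS_BIND        = 4096,
};

/* Superblock flags; per the commit they mirror the MS_* values for now. */
enum {
    DEMO_SB_RDONLY      = DEMO_MS_RDONLY,
    DEMO_SB_SYNCHRONOUS = DEMO_MS_SYNCHRONOUS,
    DEMO_SB_DIRSYNC     = DEMO_MS_DIRSYNC,
};

/* Hypothetical stand-in for struct super_block. */
struct demo_super_block {
    unsigned long s_flags;
};

/* Keep only the flags this sketch treats as superblock state. */
static void demo_set_sb_flags(struct demo_super_block *sb,
                              unsigned long ms_flags)
{
    const unsigned long mask =
        DEMO_SB_RDONLY | DEMO_SB_SYNCHRONOUS | DEMO_SB_DIRSYNC;

    sb->s_flags = ms_flags & mask;
}

/* Superblock-side code tests SB_* names, never MS_* names. */
static bool demo_sb_rdonly(const struct demo_super_block *sb)
{
    return (sb->s_flags & DEMO_SB_RDONLY) != 0;
}
```

Because the values are identical today, the rename is purely a naming boundary; it leaves room for the two sets to diverge later.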
  3. 24 Jun, 2016 2 commits
    • kernfs: The cgroup filesystem also benefits from SB_I_NOEXEC · 29a517c2
      Eric W. Biederman committed
      The cgroup filesystem is in the same boat as sysfs.  No one ever
      permits executables of any kind on the cgroup filesystem, and there is
      no reasonable case for supporting executables there in the future.
      
      Therefore move the setting of SB_I_NOEXEC, which makes the code proof
      against future mistakes of accidentally creating executables, from
      sysfs to kernfs itself.  This makes the code simpler and covers the
      sysfs, cgroup, and cgroup2 filesystems.
      Acked-by: Seth Forshee <seth.forshee@canonical.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • mnt: Refactor fs_fully_visible into mount_too_revealing · 8654df4e
      Eric W. Biederman committed
      Replace the call of fs_fully_visible in do_new_mount from before the
      new superblock is allocated with a call of mount_too_revealing after
      the superblock is allocated.   This winds up being a much better location
      for maintainability of the code.
      
      The first change this enables is the replacement of FS_USERNS_VISIBLE
      with SB_I_USERNS_VISIBLE, moving the flag from struct file_system_type
      to s_iflags on the superblock.
      
      Unfortunately mount_too_revealing fundamentally needs to touch
      mnt_flags adding several MNT_LOCKED_XXX flags at the appropriate
      times.  If the mnt_flags did not need to be touched the code
      could be easily moved into the filesystem specific mount code.
      Acked-by: Seth Forshee <seth.forshee@canonical.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  4. 10 Jul, 2015 1 commit
    • vfs: Commit to never having exectuables on proc and sysfs. · 90f8572b
      Eric W. Biederman committed
      Today proc and sysfs do not contain any executable files.  Several
      applications today mount proc or sysfs without noexec and nosuid and
      then depend on there being no executable files on proc or sysfs.
      Having any executable files show on proc or sysfs would cause
      a user space visible regression, and most likely security problems.
      
      Therefore commit to never allowing executables on proc and sysfs by
      adding a new flag to mark them as filesystems without executables and
      enforce that flag.
      
      Test the flag where MNT_NOEXEC is tested today, so that the only user
      visible effect will be that executables will be treated as if the
      execute bit is cleared.
      
      The filesystems proc and sysfs do not currently incorporate any
      executable files so this does not result in any user visible effects.
      
      This makes it unnecessary to vet changes to proc and sysfs tightly for
      adding executable files or changes to chattr that would modify
      existing files, as no matter what the individual files say they will
      not be treated as executable files by the vfs.
      
      Not having to vet changes too closely is important, as without this we
      are only one proc_create call (or another goof up in the
      implementation of notify_change) from having problematic executables
      on proc.  Those mistakes are all too easy to make and would create
      a situation where there are security issues or where the assumptions
      of some program have to be broken (causing userspace regressions).
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
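The rule in 90f8572b, testing the new superblock flag wherever MNT_NOEXEC is tested, can be sketched as follows. This is a userspace illustration under stated assumptions: the `demo_` structs and functions are hypothetical, and the flag values are chosen for the example rather than taken from kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MNT_NOEXEC  0x04u /* per-mount "noexec" option (illustrative) */
#define DEMO_SB_I_NOEXEC 0x02u /* "this fs never has executables" marker */

/* Hypothetical stand-ins for struct super_block and struct vfsmount. */
struct demo_super_block {
    unsigned int s_iflags;
};

struct demo_mount {
    unsigned int mnt_flags;
    const struct demo_super_block *sb;
};

/* True when nothing reached through this mount may be executed. */
static bool demo_path_noexec(const struct demo_mount *mnt)
{
    return (mnt->mnt_flags & DEMO_MNT_NOEXEC) ||
           (mnt->sb->s_iflags & DEMO_SB_I_NOEXEC);
}

/* The execute bits of a file only count when the path allows execution. */
static bool demo_may_exec(const struct demo_mount *mnt, unsigned int mode)
{
    return !demo_path_noexec(mnt) && (mode & 0111u) != 0;
}
```

With this shape, a file on proc or sysfs that somehow gained an execute bit is still treated as non-executable, even when the filesystem was mounted without noexec.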
  5. 14 May, 2015 1 commit
    • mnt: Refactor the logic for mounting sysfs and proc in a user namespace · 1b852bce
      Eric W. Biederman committed
      Fresh mounts of proc and sysfs are a very special case that works very
      much like a bind mount.  Unfortunately the current structure can not
      preserve the MNT_LOCK... mount flags.  Therefore refactor the logic
      into a form that can be modified to preserve those lock bits.
      
      Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount
      of the filesystem be fully visible in the current mount namespace,
      before the filesystem may be mounted.
      
      Move the logic for calling fs_fully_visible from proc and sysfs into
      fs/namespace.c where it has greater access to mount namespace state.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  6. 03 Jun, 2014 1 commit
  7. 28 May, 2014 1 commit
  8. 13 May, 2014 1 commit
    • kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs · 555724a8
      Tejun Heo committed
      The kernfs open method - kernfs_fop_open() - inherited extra
      permission checks from sysfs.  While the vfs layer allows ignoring the
      read/write permissions checks if the issuer has CAP_DAC_OVERRIDE,
      sysfs explicitly denied open regardless of the cap if the file doesn't
      have any of the UGO perms of the requested access or doesn't implement
      the requested operation.  It can be debated whether this was a good
      idea or not but the behavior is too subtle and dangerous to change at
      this point.
      
      After cgroup got converted to kernfs, this extra perm check also got
      applied to cgroup, breaking libcgroup, which opens write-only files with
      O_RDWR as root.  This patch gates the extra open permission check with
      a new flag KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK and enables it for sysfs.
      For sysfs, nothing changes.  For cgroup, root now can perform any
      operation regardless of the permissions as it was before kernfs
      conversion.  Note that kernfs still fails unimplemented operations
      with -EINVAL.
      
      While at it, add comments explaining KERNFS_ROOT flags.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Andrey Wagin <avagin@gmail.com>
      Tested-by: Andrey Wagin <avagin@gmail.com>
      Cc: Li Zefan <lizefan@huawei.com>
      References: http://lkml.kernel.org/g/CANaxB-xUm3rJ-Cbp72q-rQJO5mZe1qK6qXsQM=vh0U8upJ44+A@mail.gmail.com
      Fixes: 2bd59d48 ("cgroup: convert to kernfs")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  9. 25 Feb, 2014 1 commit
    • sysfs: fix namespace refcnt leak · fed95bab
      Li Zefan committed
      As mount() and kill_sb() are not a one-to-one match, we shouldn't get
      the ns refcnt unconditionally in sysfs_mount(); instead we should
      get the refcnt only when kernfs_mount() allocated a new superblock.
      
      v2:
      - Changed the name of the new argument, suggested by Tejun.
      - Made the argument optional, suggested by Tejun.
      
      v3:
      - Made the new argument the second-to-last arg, suggested by Tejun.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Tejun Heo <tj@kernel.org>
       ---
       fs/kernfs/mount.c      | 8 +++++++-
       fs/sysfs/mount.c       | 5 +++--
       include/linux/kernfs.h | 9 +++++----
       3 files changed, 15 insertions(+), 7 deletions(-)
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
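The leak fixed in fed95bab comes from pairing every mount() with a reference while only one kill_sb() runs per superblock. A minimal sketch of the fixed pattern, assuming a one-slot superblock cache and hypothetical `demo_` names throughout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_ns {
    int refcnt;
};

/* One-slot cache standing in for the superblock lookup. */
struct demo_super_block {
    const struct demo_ns *ns;
    bool in_use;
};

/*
 * Reuse the cached superblock when it matches @ns, otherwise set one up.
 * @new_sb is optional and reports whether a superblock was allocated.
 * (A real lookup would keep several superblocks; this sketch keeps one.)
 */
static struct demo_super_block *
demo_kernfs_mount(struct demo_super_block *slot, const struct demo_ns *ns,
                  bool *new_sb)
{
    bool created = !(slot->in_use && slot->ns == ns);

    if (created) {
        slot->ns = ns;
        slot->in_use = true;
    }
    if (new_sb)
        *new_sb = created;
    return slot;
}

/* Take the namespace reference only for a freshly allocated superblock. */
static struct demo_super_block *
demo_sysfs_mount(struct demo_super_block *slot, struct demo_ns *ns)
{
    bool new_sb;
    struct demo_super_block *sb = demo_kernfs_mount(slot, ns, &new_sb);

    if (new_sb)
        ns->refcnt++;
    return sb;
}

/* Runs once per superblock, so it drops exactly one reference. */
static void demo_sysfs_kill_sb(struct demo_super_block *sb,
                               struct demo_ns *ns)
{
    sb->in_use = false;
    sb->ns = NULL;
    ns->refcnt--;
}
```

Mounting twice and killing the superblock once leaves the count balanced, which an unconditional increment in the mount path would not.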
  10. 08 Feb, 2014 1 commit
    • kernfs: allow nodes to be created in the deactivated state · d35258ef
      Tejun Heo committed
      Currently, kernfs_nodes are made visible to userland on creation,
      which makes it difficult for kernfs users to atomically succeed or
      fail creation of multiple nodes.  In addition, if something fails
      after creating some nodes, the created nodes might already be in use
      and their active refs need to be drained for removal, which has the
      potential to introduce tricky reverse locking dependency on active_ref
      depending on how the error path is synchronized.
      
      This patch introduces per-root flag KERNFS_ROOT_CREATE_DEACTIVATED.
      If set, all nodes under the root are created in the deactivated state
      and stay invisible to userland until explicitly enabled by the new
      kernfs_activate() API.  Also, nodes which have never been activated
      are guaranteed to bypass draining on removal, thus allowing error paths
      to not worry about locking dependency on active_ref draining.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
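The create-then-activate pattern from d35258ef can be sketched like this. The `demo_` types and functions are hypothetical, and activation is modelled per node for brevity, which is a simplification made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_ROOT_CREATE_DEACTIVATED 0x1u

struct demo_root {
    unsigned int flags;
};

struct demo_node {
    const struct demo_root *root;
    bool active; /* visible to userland only when true */
};

/* New nodes start hidden when the root carries the flag. */
static void demo_create(struct demo_node *kn, const struct demo_root *root)
{
    kn->root = root;
    kn->active = !(root->flags & DEMO_ROOT_CREATE_DEACTIVATED);
}

static void demo_activate(struct demo_node *kn)
{
    kn->active = true;
}

static bool demo_visible(const struct demo_node *kn)
{
    return kn->active;
}

/*
 * Create @n nodes and expose them only if every creation succeeded.
 * @fail_at simulates a failure at that index (-1 for no failure).
 */
static bool demo_create_all(struct demo_node *nodes, int n,
                            const struct demo_root *root, int fail_at)
{
    for (int i = 0; i < n; i++) {
        if (i == fail_at)
            return false; /* nothing was activated, nothing is visible */
        demo_create(&nodes[i], root);
    }
    for (int i = 0; i < n; i++)
        demo_activate(&nodes[i]);
    return true;
}
```

On the failure path no node was ever visible, so in this model there is nothing in use that would need draining before removal.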
  11. 18 Dec, 2013 1 commit
    • kernfs: add kernfs_dir_ops · 80b9bbef
      Tejun Heo committed
      Add support for mkdir(2), rmdir(2) and rename(2) syscalls.  This is
      implemented through optional kernfs_dir_ops callback table which can
      be specified on kernfs_create_root().  An implemented callback is
      invoked when the matching syscall is invoked.
      
      As kernfs keeps the dcache in sync with its internal representation and
      revalidates dentries on each access, the implementation of these
      methods is extremely simple.  Each just discovers the relevant
      kernfs_node(s) and invokes the requested callback, which is allowed to
      do any kernfs operations; the end result doesn't necessarily have
      to match the expected semantics of the syscall.
      
      This will be used to convert cgroup to use kernfs instead of its own
      filesystem implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  12. 12 Dec, 2013 1 commit
    • kernfs: s/sysfs_dirent/kernfs_node/ and rename its friends accordingly · 324a56e1
      Tejun Heo committed
      kernfs has just been separated out from sysfs and we're already in
      full conflict mode.  Nothing can make the situation any worse.  Let's
      take the chance to name things properly.
      
      This patch performs the following renames.
      
      * s/sysfs_elem_dir/kernfs_elem_dir/
      * s/sysfs_elem_symlink/kernfs_elem_symlink/
      * s/sysfs_elem_attr/kernfs_elem_file/
      * s/sysfs_dirent/kernfs_node/
      * s/sd/kn/ in kernfs proper
      * s/parent_sd/parent/
      * s/target_sd/target/
      * s/dir_sd/parent/
      * s/to_sysfs_dirent()/rb_to_kn()/
      * misc renames of local vars when they conflict with the above
      
      Because md, mic and gpio dig into sysfs details, this patch ends up
      modifying them.  All are sysfs_dirent renames and trivial.  While we
      can avoid these by introducing a dummy wrapping struct sysfs_dirent
      around kernfs_node, given the limited usage outside kernfs and sysfs
      proper, I don't think such workaround is called for.
      
      This patch is strictly rename only and doesn't introduce any
      functional difference.
      
      - mic / gpio renames were missing.  Spotted by kbuild test robot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  13. 11 Dec, 2013 1 commit
  14. 30 Nov, 2013 9 commits
    • sysfs, kernfs: move mount core code to fs/kernfs/mount.c · fa736a95
      Tejun Heo committed
      Move core mount code to fs/kernfs/mount.c.  The respective
      declarations in fs/sysfs/sysfs.h are moved to
      fs/kernfs/kernfs-internal.h.
      
      This is pure relocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: prepare mount path for kernfs · 4b93dc9b
      Tejun Heo committed
      We're in the process of separating out core sysfs functionality into
      kernfs which will deal with sysfs_dirents directly.  This patch
      rearranges mount path so that the kernfs and sysfs parts are separate.
      
      * As sysfs_super_info won't be visible outside kernfs proper,
        kernfs_super_ns() is added to allow kernfs users to access a
        super_block's namespace tag.
      
      * Generic mount operation is separated out into kernfs_mount_ns().
        sysfs_mount() now just performs sysfs-specific permission check,
        acquires namespace tag, and invokes kernfs_mount_ns().
      
      * Generic superblock release is separated out into kernfs_kill_sb()
        which can be used directly as file_system_type->kill_sb().  As sysfs
        needs to put the namespace tag, sysfs_kill_sb() wraps
        kernfs_kill_sb() with ns tag put.
      
      * sysfs_dir_cachep init and sysfs_inode_init() are separated out into
        kernfs_init().  kernfs_init() uses only a small amount of memory and
        trying to handle and propagate kernfs_init() failure doesn't make
        much sense.  Use SLAB_PANIC for sysfs_dir_cachep and make
        sysfs_inode_init() panic on failure.
      
        After this change, kernfs_init() should be called before
        sysfs_init(); fs/namespace.c::mnt_init() is modified accordingly.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: make super_blocks bind to different kernfs_roots · df394fb5
      Tejun Heo committed
      kernfs is being updated to allow multiple sysfs_dirent hierarchies so
      that it can also be used by other users.  Currently, sysfs
      super_blocks are always attached to one kernfs_root - sysfs_root - and
      distinguished only by their namespace tags.
      
      This patch adds sysfs_super_info->root and updates
      sysfs_fill/test_super() so that super_blocks are identified by the
      combination of both the associated kernfs_root and namespace tag.
      This allows mounting different kernfs hierarchies.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
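The matching rule from df394fb5, where a superblock is identified by both its kernfs_root and its namespace tag, can be sketched as a comparison function. `struct demo_super_info` and `demo_test_super()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-superblock info: which hierarchy, which namespace. */
struct demo_super_info {
    const void *root; /* stands in for the kernfs_root pointer */
    const void *ns;   /* namespace tag */
};

/*
 * An existing superblock may be reused for a new mount only when both
 * the hierarchy and the namespace tag match.
 */
static bool demo_test_super(const struct demo_super_info *have,
                            const struct demo_super_info *want)
{
    return have->root == want->root && have->ns == want->ns;
}
```

Comparing the namespace tag alone would conflate two different hierarchies mounted in the same namespace, which is why the root pointer joins the key.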
    • sysfs, kernfs: implement kernfs_create/destroy_root() · ba7443bc
      Tejun Heo committed
      There is currently a single kernfs hierarchy in the whole system, which
      is used for sysfs.  kernfs needs to support multiple hierarchies to
      allow other users.  This patch introduces struct kernfs_root, which
      serves as the root of each kernfs hierarchy, and implements
      kernfs_create/destroy_root().
      
      * Each kernfs_root is associated with a root sd (sysfs_dirent).  The
        root is freed when the root sd is released and kernfs_destroy_root()
        simply invokes kernfs_remove() on the root sd.  sysfs_remove_one()
        is updated to handle release of the root sd.  Note that ps_iattr
        update in sysfs_remove_one() is trivially updated for readability.
      
      * Root sd's are now dynamically allocated using sysfs_new_dirent().
        Update sysfs_alloc_ino() so that it gives out ino from 1 so that the
        root sd still gets ino 1.
      
      * While kernfs currently only points to the root sd, it'll soon grow
        fields which are specific to each hierarchy.  As determining a given
        sd's root will be necessary, sd->s_dir.root is added.  This backlink
        fits better as a separate field in sd; however, sd->s_dir is inside
        union with space to spare, so use it to save space and provide
        kernfs_root() accessor to determine the root sd.
      
      * As hierarchies may be destroyed now, each mount needs to hold onto
        the hierarchy it's attached to.  Update sysfs_fill_super() and
        sysfs_kill_sb() so that they get and put the kernfs_root
        respectively.
      
      * sysfs_root is replaced with kernfs_root which is dynamically created
        by invoking kernfs_create_root() from sysfs_init().
      
      This patch doesn't introduce any visible behavior changes.
      
      v2: kernfs_create_root() forgot to set @sd->priv.  Fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: introduce sysfs_root_sd · 061447a4
      Tejun Heo committed
      Currently, it's assumed that there's a single kernfs hierarchy in the
      system anchored at sysfs_root which is defined as a global struct.  To
      allow other users of kernfs, this will be made dynamic.  Introduce a
      new global variable sysfs_root_sd which points to &sysfs_root and
      convert all &sysfs_root users.
      
      This patch doesn't introduce any behavior difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: no need to kern_mount() sysfs from sysfs_init() · 9e30cc95
      Tejun Heo committed
      It has been very long since sysfs depended on vfs to keep track of
      internal states and whether sysfs is mounted or not doesn't make any
      difference to sysfs's internal operation.
      
      In addition to init and filesystem type registration, sysfs_init()
      invokes kern_mount() to create an in-kernel mount of sysfs.  This
      internal mount doesn't serve any purpose anymore.  Remove it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: make sysfs_super_info->ns const · 51a35e9f
      Tejun Heo committed
      Add const qualifier to sysfs_super_info->ns so that it's consistent
      with other namespace tag usages in sysfs.  Because kobject doesn't use
      const qualifier for namespace tags, this ends up requiring an explicit
      cast to drop const qualifier in free_sysfs_super_info().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: drop unused params from sysfs_fill_super() · ccc532dc
      Tejun Heo committed
      sysfs_fill_super() takes three params - @sb, @data and @silent - but
      uses only @sb.  Drop the latter two.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysfs, kernfs: introduce kernfs[_find_and]_get() and kernfs_put() · ccf73cf3
      Tejun Heo committed
      Introduce kernfs interface for finding, getting and putting
      sysfs_dirents.
      
      * sysfs_find_dirent() is renamed to kernfs_find_ns() and lockdep
        assertion for sysfs_mutex is added.
      
      * sysfs_get_dirent_ns() is renamed to kernfs_find_and_get().
      
      * Macro inline dancing around __sysfs_get/put() is removed and
        kernfs_get/put() are made proper functions implemented in
        fs/sysfs/dir.c.
      
      While the conversions are mostly equivalent, there's one difference -
      kernfs_get() doesn't return the input param as its return value.  This
      change is intentional.  While passing through the input increases
      writability in some areas, it is unnecessary and has been shown to
      cause confusion regarding how the last ref is handled.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  15. 28 Nov, 2013 1 commit
    • sysfs: drop kobj_ns_type handling, take #2 · c84a3b27
      Tejun Heo committed
      The way namespace tags are implemented in sysfs is more complicated
      than necessary.  As each tag is a pointer value and required to be
      non-NULL under a namespace enabled parent, there's no need to record
      separately what type each tag is.  If multiple namespace types are
      needed, which currently aren't, we can simply compare the tag to a set
      of allowed tags in the superblock assuming that the tags, being
      pointers, won't have the same value across multiple types.
      
      This patch rips out kobj_ns_type handling from sysfs.  sysfs now has
      an enable switch to turn on namespace under a node.  If enabled, all
      children are required to have non-NULL namespace tags and filtered
      against the super_block's tag.
      
      kobject namespace determination is now performed in
      lib/kobject.c::create_dir() making sysfs_read_ns_type() unnecessary.
      The sanity checks are also moved.  create_dir() is restructured to
      ease such addition.  This removes most kobject namespace knowledge
      from sysfs proper which will enable proper separation and layering of
      sysfs.
      
      This is the second try.  The first one was cb26a311 ("sysfs: drop
      kobj_ns_type handling") which tried to automatically enable namespace
      if there are children with non-NULL namespace tags; however, it was
      broken for symlinks as they should inherit the target's tag iff
      namespace is enabled in the parent.  This led to namespace filtering
      enabled incorrectly for wireless net class devices through phy80211
      symlinks and thus network configuration failure.  a1212d27
      ("Revert "sysfs: drop kobj_ns_type handling"") reverted the commit.
      
      This shouldn't introduce any behavior changes, for real.
      
      v2: Dummy implementation of sysfs_enable_ns() for !CONFIG_SYSFS was
          missing and caused build failure.  Reported by kbuild test robot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kay Sievers <kay@vrfy.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
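The filtering rule that c84a3b27 settles on, an explicit enable switch on the parent with children compared against the superblock's tag, can be sketched as below. `struct demo_dirent` and `demo_child_visible()` are hypothetical, and the sketch ignores symlink tag inheritance, which is the detail that broke the first attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node: a namespace tag plus, for directories, the switch. */
struct demo_dirent {
    const void *ns;  /* namespace tag, NULL when untagged */
    bool ns_enabled; /* directory filters its children by tag */
};

/*
 * Under a namespace-enabled parent, a child is shown only when its
 * non-NULL tag equals the superblock's tag.  Elsewhere nothing is
 * filtered.
 */
static bool demo_child_visible(const struct demo_dirent *parent,
                               const struct demo_dirent *child,
                               const void *sb_ns)
{
    if (!parent->ns_enabled)
        return true;
    return child->ns != NULL && child->ns == sb_ns;
}
```

Because filtering keys off the parent's switch rather than off whether any child happens to carry a tag, a tagged child under a parent that never enabled namespaces does not trigger filtering.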
  16. 07 Nov, 2013 1 commit
    • Revert "sysfs: drop kobj_ns_type handling" · a1212d27
      Linus Torvalds committed
      This reverts commit cb26a311.
      
      It mysteriously causes NetworkManager to not find the wireless device
      for me.  As far as I can tell, Tejun *meant* for this commit to not make
      any semantic changes, but there clearly are some.  So revert it, taking
      into account some of the calling convention changes that happened in
      this area in subsequent commits.
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 27 Sep, 2013 1 commit
    • sysfs: drop kobj_ns_type handling · cb26a311
      Tejun Heo committed
      The way namespace tags are implemented in sysfs is more complicated
      than necessary.  As each tag is a pointer value and required to be
      non-NULL under a namespace enabled parent, there's no need to record
      separately what type each tag is or where namespace is enabled.
      
      If multiple namespace types are needed, which currently aren't, we can
      simply compare the tag to a set of allowed tags in the superblock
      assuming that the tags, being pointers, won't have the same value
      across multiple types.  Also, whether to filter by namespace tag or
      not can be trivially determined by whether the node has any tagged
      children or not.
      
      This patch rips out kobj_ns_type handling from sysfs.  sysfs no longer
      cares whether specific type of namespace is enabled or not.  If a
      sysfs_dirent has a non-NULL tag, the parent is marked as needing
      namespace filtering and the value is tested against the allowed set of
      tags for the superblock (currently only one but increasing this number
      isn't difficult) and the sysfs_dirent is ignored if it doesn't match.
      
      This removes most kobject namespace knowledge from sysfs proper which
      will enable proper separation and layering of sysfs.  The namespace
      sanity checks in fs/sysfs/dir.c are replaced by the new sanity check
      in kobject_namespace().  As this is the only place ktype->namespace()
      is called for sysfs, this doesn't weaken the sanity check
      significantly.  I omitted converting the sanity check in
      sysfs_do_create_link_sd().  While the check can be shifted to upper
      layer, mistakes there are well contained and should be easily visible
      anyway.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kay Sievers <kay@vrfy.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  18. 29 Aug, 2013 1 commit
    • sysfs: Restrict mounting sysfs · 7dc5dbc8
      Eric W. Biederman committed
      Don't allow mounting sysfs unless the caller has CAP_SYS_ADMIN rights
      over the net namespace.  The principle here is if you create or have
      capabilities over it you can mount it, otherwise you get to live with
      what other people have mounted.
      
      Instead of testing this with a straightforward ns_capable call,
      perform this check the long and tortuous way with kobject helpers;
      this keeps direct knowledge of namespaces out of sysfs and preserves
      the existing sysfs abstractions.
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  19. 27 Aug, 2013 1 commit
    • userns: Better restrictions on when proc and sysfs can be mounted · e51db735
      Eric W. Biederman committed
      Rely on the fact that another flavor of the filesystem is already
      mounted and do not rely on state in the user namespace.
      
      Verify that the mounted filesystem is not covered in any significant
      way.  I would love to verify that the previously mounted filesystem
      has no mounts on top but there are at least the directories
      /proc/sys/fs/binfmt_misc and /sys/fs/cgroup/ that exist explicitly
      for other filesystems to mount on top of.
      
      Refactor the test into a function named fs_fully_visible and call that
      function from the mount routines of proc and sysfs.  This makes this
      test local to the filesystems involved and the results current as of when
      the mounts take place, removing a weird threading of the user
      namespace, the mount namespace and the filesystems themselves.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  20. 22 Aug, 2013 1 commit
  21. 27 Mar, 2013 1 commit
    • userns: Restrict when proc and sysfs can be mounted · 87a8ebd6
      Eric W. Biederman committed
      Only allow unprivileged mounts of proc and sysfs if they are already
      mounted when the user namespace is created.
      
      proc and sysfs are interesting because they have content that is
      per namespace, and so fresh mounts are needed when new namespaces
      are created while at the same time proc and sysfs have content that
      is shared between every instance.
      
      Respect the policy of who may see the shared content of proc and sysfs
      by only allowing new mounts if there was an existing mount at the time
      the user namespace was created.
      
      In practice there are only two interesting cases: proc and sysfs are
      mounted at their usual places, proc and sysfs are not mounted at all
      (some form of mount namespace jail).
      
      Cc: stable@vger.kernel.org
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  22. 18 Jan, 2013 1 commit
  23. 20 Nov, 2012 1 commit
  24. 14 Jul, 2012 2 commits
  25. 21 Mar, 2012 1 commit
  26. 25 Jan, 2012 1 commit
  27. 13 Jun, 2011 1 commit
    • Delay struct net freeing while there's a sysfs instance refering to it · a685e089
      Al Viro committed
      	* new refcount in struct net, controlling actual freeing of the memory
      	* new method in kobj_ns_type_operations (->drop_ns())
      	* ->current_ns() semantics change - it's supposed to be followed by
      corresponding ->drop_ns().  For struct net in case of CONFIG_NET_NS it bumps
      the new refcount; net_drop_ns() decrements it and calls net_free() if the
      last reference has been dropped.  Method renamed to ->grab_current_ns().
      	* old net_free() callers call net_drop_ns() instead.
      	* sysfs_exit_ns() is gone, along with a large part of callchain
      leading to it; now that the references stored in ->ns[...] stay valid we
      do not need to hunt them down and replace them with NULL.  That fixes
      problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
      of sb->s_instances abuse.
      
      	Note that struct net *shutdown* logic has not changed - net_cleanup()
      is called exactly when it used to be called.  The only thing postponed by
      having a sysfs instance referring to that struct net is actual freeing of
      memory occupied by struct net.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  28. 29 Oct, 2010 1 commit
  29. 10 Aug, 2010 1 commit