1. 31 5月, 2018 2 次提交
  2. 21 3月, 2018 2 次提交
    • E
      fuse: Support fuse filesystems outside of init_user_ns · 8cb08329
      Eric W. Biederman 提交于
      In order to support mounts from namespaces other than init_user_ns, fuse
      must translate uids and gids to/from the userns of the process servicing
      requests on /dev/fuse. This patch does that, with a couple of restrictions
      on the namespace:
      
       - The userns for the fuse connection is fixed to the namespace
         from which /dev/fuse is opened.
      
       - The namespace must be the same as s_user_ns.
      
      These restrictions simplify the implementation by avoiding the need to pass
      around userns references and by allowing fuse to rely on the checks in
      setattr_prepare for ownership changes.  Either restriction could be relaxed
      in the future if needed.
      
      For cuse the userns used is the opener of /dev/cuse.  Semantically the cuse
      support does not appear safe for unprivileged users.  Practically the
      permissions on /dev/cuse only make it accessible to the global root user.
      If something slips through the cracks in a user namespace the only users
      who will be able to use the cuse device are those users mapped into the
      user namespace.
      
      Translation in the posix acl is updated to use the uuser namespace of the
      filesystem.  Avoiding cases which might bypass this translation is handled
      in a following change.
      
      This change is stronlgy based on a similar change from Seth Forshee and
      Dongsu Park.
      
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Cc: Dongsu Park <dongsu@kinvolk.io>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      8cb08329
    • S
      fuse: return -ECONNABORTED on /dev/fuse read after abort · 3b7008b2
      Szymon Lukasz 提交于
      Currently the userspace has no way of knowing whether the fuse
      connection ended because of umount or abort via sysfs. It makes it hard
      for filesystems to free the mountpoint after abort without worrying
      about removing some new mount.
      
      The patch fixes it by returning different errors when userspace reads
      from /dev/fuse (-ENODEV for umount and -ECONNABORTED for abort).
      
      Add a new capability flag FUSE_ABORT_ERROR. If set and the connection is
      gone because of sysfs abort, reading from the device will return
      -ECONNABORTED.
      Signed-off-by: NSzymon Lukasz <noh4hss@gmail.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      3b7008b2
  3. 28 11月, 2017 1 次提交
    • L
      Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Linus Torvalds 提交于
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      
      The script to do this was:
      
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
          SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
                DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
                POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
                I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
                ACTIVE NOUSER"
      
          SED_PROG=
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
      
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
      
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1751e8a6
  4. 16 11月, 2017 1 次提交
  5. 31 10月, 2017 1 次提交
    • K
      treewide: Fix function prototypes for module_param_call() · e4dca7b7
      Kees Cook 提交于
      Several function prototypes for the set/get functions defined by
      module_param_call() have a slightly wrong argument types. This fixes
      those in an effort to clean up the calls when running under type-enforced
      compiler instrumentation for CFI. This is the result of running the
      following semantic patch:
      
      @match_module_param_call_function@
      declarer name module_param_call;
      identifier _name, _set_func, _get_func;
      expression _arg, _mode;
      @@
      
       module_param_call(_name, _set_func, _get_func, _arg, _mode);
      
      @fix_set_prototype
       depends on match_module_param_call_function@
      identifier match_module_param_call_function._set_func;
      identifier _val, _param;
      type _val_type, _param_type;
      @@
      
       int _set_func(
      -_val_type _val
      +const char * _val
       ,
      -_param_type _param
      +const struct kernel_param * _param
       ) { ... }
      
      @fix_get_prototype
       depends on match_module_param_call_function@
      identifier match_module_param_call_function._get_func;
      identifier _val, _param;
      type _val_type, _param_type;
      @@
      
       int _get_func(
      -_val_type _val
      +char * _val
       ,
      -_param_type _param
      +const struct kernel_param * _param
       ) { ... }
      
      Two additional by-hand changes are included for places where the above
      Coccinelle script didn't notice them:
      
      	drivers/platform/x86/thinkpad_acpi.c
      	fs/lockd/svc.c
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NJessica Yu <jeyu@kernel.org>
      e4dca7b7
  6. 19 10月, 2017 1 次提交
  7. 17 5月, 2017 1 次提交
  8. 21 4月, 2017 2 次提交
  9. 18 4月, 2017 2 次提交
  10. 18 10月, 2016 1 次提交
  11. 01 10月, 2016 4 次提交
  12. 31 7月, 2016 1 次提交
  13. 29 7月, 2016 1 次提交
  14. 30 6月, 2016 1 次提交
  15. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  16. 15 1月, 2016 1 次提交
    • V
      kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov 提交于
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5d097056
  17. 01 7月, 2015 11 次提交
  18. 16 4月, 2015 1 次提交
  19. 21 1月, 2015 1 次提交
  20. 06 1月, 2015 2 次提交
    • M
      fuse: add memory barrier to INIT · 9759bd51
      Miklos Szeredi 提交于
      Theoretically we need to order setting of various fields in fc with
      fc->initialized.
      
      No known bug reports related to this yet.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      9759bd51
    • M
      fuse: fix LOOKUP vs INIT compat handling · 21f62174
      Miklos Szeredi 提交于
      Analysis from Marc:
      
       "Commit 7078187a ("fuse: introduce fuse_simple_request() helper")
        from the above pull request triggers some EIO errors for me in some tests
        that rely on fuse
      
        Looking at the code changes and a bit of debugging info I think there's a
        general problem here that fuse_get_req checks and possibly waits for
        fc->initialized, and this was always called first.  But this commit
        changes the ordering and in many places fc->minor is now possibly used
        before fuse_get_req, and we can't be sure that fc has been initialized.
        In my case fuse_lookup_init sets req->out.args[0].size to the wrong size
        because fc->minor at that point is still 0, leading to the EIO error."
      
      Fix by moving the compat adjustments into fuse_simple_request() to after
      fuse_get_req().
      
      This is also more readable than the original, since now compatibility is
      handled in a single function instead of cluttering each operation.
      Reported-by: NMarc Dionne <marc.c.dionne@gmail.com>
      Tested-by: NMarc Dionne <marc.c.dionne@gmail.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Fixes: 7078187a ("fuse: introduce fuse_simple_request() helper")
      21f62174
  21. 12 12月, 2014 2 次提交
    • M
      fuse: introduce fuse_simple_request() helper · 7078187a
      Miklos Szeredi 提交于
      The following pattern is repeated many times:
      
      	req = fuse_get_req_nopages(fc);
      	/* Initialize req->(in|out).args */
      	fuse_request_send(fc, req);
      	err = req->out.h.error;
      	fuse_put_request(req);
      
      Create a new replacement helper:
      
      	/* Initialize args */
      	err = fuse_simple_request(fc, &args);
      
      In addition to reducing the code size, this will ease moving from the
      complex arg-based to a simpler page-based I/O on the fuse device.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      7078187a
    • M
      fuse: flush requests on umount · 580640ba
      Miklos Szeredi 提交于
      Use fuse_abort_conn() instead of fuse_conn_kill() in fuse_put_super().
      This flushes and aborts requests still on any queues.  But since we've
      already reset fc->connected, those requests would not be useful anyway and
      would be flushed when the fuse device is closed.
      
      Next patches will rely on requests being flushed before the superblock is
      destroyed.
      
      Use fuse_abort_conn() in cuse_process_init_reply() too, since it makes no
      difference there, and we can get rid of fuse_conn_kill().
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      580640ba