1. 20 Jun, 2016 1 commit
  2. 05 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Committed by Kirill A. Shutemov
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE, and it is a constant source of confusion whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Switching globally to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header files,
      so I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
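As a rough user-space illustration of why the first two rewrite rules are safe: the removed macros were aliases of their PAGE_* counterparts, so the shift by (PAGE_CACHE_SHIFT - PAGE_SHIFT) was always a shift by zero. The 4 KiB page size and the helper names index_old() and index_new() are assumptions made for this sketch; they are not taken from the patch.

```c
/* Illustrative value only: 4 KiB pages (an assumption for this sketch). */
#define PAGE_SHIFT        12
#define PAGE_SIZE         (1UL << PAGE_SHIFT)

/* Before the patch, the page-cache macros simply aliased the PAGE_* ones. */
#define PAGE_CACHE_SHIFT  PAGE_SHIFT
#define PAGE_CACHE_SIZE   PAGE_SIZE

/* Old style: "convert" a page-cache index to a page index (hypothetical helper). */
static unsigned long index_old(unsigned long index)
{
    return index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);   /* shift by 0 */
}

/* New style: the no-op shift is dropped (hypothetical helper). */
static unsigned long index_new(unsigned long index)
{
    return index;
}
```

Both helpers return the same value for any index, which is the property the semantic patch relies on.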
  3. 27 Jan, 2016 1 commit
  4. 15 Jan, 2016 1 commit
    • kmemcg: account certain kmem allocations to memcg · 5d097056
      Committed by Vladimir Davydov
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 16 Apr, 2015 1 commit
  6. 25 Feb, 2015 1 commit
    • eCryptfs: ensure copy to crypt_stat->cipher does not overrun · 2a559a8b
      Committed by Colin Ian King
      The patch 237fead6: "[PATCH] ecryptfs: fs/Makefile and
      fs/Kconfig" from Oct 4, 2006, leads to the following static checker
      warning:
      
        fs/ecryptfs/crypto.c:846 ecryptfs_new_file_context()
        error: off-by-one overflow 'crypt_stat->cipher' size 32.  rl = '0-32'
      
      There is a mismatch between the size of ecryptfs_crypt_stat.cipher
      and ecryptfs_mount_crypt_stat.global_default_cipher_name, causing the
      copy of the cipher name to produce an off-by-one string copy error. This
      fix ensures the space reserved for this string is the same size, including
      the trailing zero at the end, throughout ecryptfs.
      
      This fix avoids increasing the size of ecryptfs_crypt_stat.cipher
      and also ecryptfs_parse_tag_70_packet_silly_stack.cipher_string and instead
      reduces the value of ECRYPTFS_MAX_CIPHER_NAME_SIZE to 31 and includes the
      + 1 for the end-of-string terminator.
      
      NOTE: An overflow is not possible in practice since the value copied
      into global_default_cipher_name is validated by
      ecryptfs_code_for_cipher_string() at mount time. None of the allowed
      cipher strings are long enough to cause the potential buffer overflow
      fixed by this patch.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      [tyhicks: Added the NOTE about the overflow not being triggerable]
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
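The sizing convention described in this commit can be sketched in user-space C as follows. The struct names, the MAX_CIPHER_NAME_SIZE macro, and copy_cipher_name() are hypothetical stand-ins for this sketch, not the eCryptfs definitions; the point is only that when both buffers are declared as "maximum name length + 1", any NUL-terminated source string fits in the destination.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ECRYPTFS_MAX_CIPHER_NAME_SIZE after the fix. */
#define MAX_CIPHER_NAME_SIZE 31

/* Both buffers reserve the same space: 31 name bytes plus the terminator. */
struct crypt_stat_sketch {
    char cipher[MAX_CIPHER_NAME_SIZE + 1];
};

struct mount_crypt_stat_sketch {
    char global_default_cipher_name[MAX_CIPHER_NAME_SIZE + 1];
};

/* Copy the default cipher name; returns 0 on success, -1 if unterminated. */
static int copy_cipher_name(struct crypt_stat_sketch *dst,
                            const struct mount_crypt_stat_sketch *src)
{
    const char *name = src->global_default_cipher_name;
    const char *end = memchr(name, '\0', sizeof(src->global_default_cipher_name));

    if (end == NULL)
        return -1;                       /* source is not NUL-terminated */

    /* Same-sized buffers: length + terminator always fits in dst->cipher. */
    memcpy(dst->cipher, name, (size_t)(end - name) + 1);
    return 0;
}
```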
  7. 21 Jan, 2015 1 commit
  8. 24 Oct, 2014 1 commit
    • fs: limit filesystem stacking depth · 69c433ed
      Committed by Miklos Szeredi
      Add a simple read-only counter to super_block that indicates how deep this
      is in the stack of filesystems.  Previously ecryptfs was the only stackable
      filesystem and it explicitly disallowed multiple layers of itself.
      
      Overlayfs, however, can be stacked recursively and also may be stacked
      on top of ecryptfs or vice versa.
      
      To limit the kernel stack usage we must limit the depth of the
      filesystem stack.  Initially the limit is set to 2.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
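A minimal user-space sketch of the depth check described above, assuming a per-superblock counter that is 0 for a filesystem not stacked on anything. The struct sb_sketch type, the MAX_STACK_DEPTH name, and stack_on() are hypothetical; only the counter idea and the initial limit of 2 come from the commit message.

```c
/* Initial limit named in the commit message. */
#define MAX_STACK_DEPTH 2

/* Hypothetical stand-in for the relevant part of struct super_block. */
struct sb_sketch {
    int s_stack_depth;   /* 0 for a filesystem that sits on no other filesystem */
};

/*
 * Stack "upper" on top of "lower". Returns 0 on success, or -1 when the
 * resulting stack would be deeper than MAX_STACK_DEPTH.
 */
static int stack_on(struct sb_sketch *upper, const struct sb_sketch *lower)
{
    int depth = lower->s_stack_depth + 1;

    if (depth > MAX_STACK_DEPTH)
        return -1;

    upper->s_stack_depth = depth;
    return 0;
}
```

With this limit, two stacked layers on top of a regular filesystem are accepted and a third is refused.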
  9. 23 Oct, 2014 1 commit
    • eCryptfs: Force RO mount when encrypted view is enabled · 332b122d
      Committed by Tyler Hicks
      The ecryptfs_encrypted_view mount option greatly changes the
      functionality of an eCryptfs mount. Instead of encrypting and decrypting
      lower files, it provides a unified view of the encrypted files in the
      lower filesystem. The presence of the ecryptfs_encrypted_view mount
      option is intended to force a read-only mount and modifying files is not
      supported when the feature is in use. See the following commit for more
      information:
      
        e77a56dd [PATCH] eCryptfs: Encrypted passthrough
      
      This patch forces the mount to be read-only when the
      ecryptfs_encrypted_view mount option is specified by setting the
      MS_RDONLY flag on the superblock. Additionally, this patch removes some
      broken logic in ecryptfs_open() that attempted to prevent modifications
      of files when the encrypted view feature was in use. The check in
      ecryptfs_open() was not sufficient to prevent file modifications using
      system calls that do not operate on a file descriptor.
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Reported-by: Priya Bansal <p.bansal@samsung.com>
      Cc: stable@vger.kernel.org # v2.6.21+: e77a56dd [PATCH] eCryptfs: Encrypted passthrough
  10. 25 Oct, 2013 1 commit
  11. 10 Jul, 2013 1 commit
  12. 04 Mar, 2013 1 commit
    • fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Committed by Eric W. Biederman
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems, trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  This allows simple, safe,
      well-understood work-arounds to known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as its module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases, the most significant being autofs, which lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regard to the user's permissions.  In general, all a filesystem
      module does once loaded is call register_filesystem() and go to sleep,
      which means there is not much attack surface exposed by loading a
      filesystem module unless the filesystem is mounted.  In a user
      namespace, filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Reported-by: Kees Cook <keescook@google.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
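The naming scheme can be sketched in user-space C: the requested filesystem type is prefixed with "fs-" before it is handed to the module loader, so only modules that declare a matching alias can be pulled in by mount. The helper fs_module_name() is hypothetical and merely builds the string; it is not the kernel's request_module() call.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build the module alias that would be requested for a filesystem type
 * (hypothetical helper). Returns 0 on success, -1 if the result does not
 * fit in buf.
 */
static int fs_module_name(char *buf, size_t size, const char *fstype)
{
    int n = snprintf(buf, size, "fs-%s", fstype);

    if (n < 0 || (size_t)n >= size)
        return -1;                /* encoding error or truncated output */
    return 0;
}
```

Under this scheme a mount of type "autofs" asks for "fs-autofs", which an alias can map to the autofs4 module even though the two names differ.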
  13. 03 Oct, 2012 1 commit
  14. 21 Sep, 2012 1 commit
  15. 15 Sep, 2012 1 commit
  16. 23 Jul, 2012 1 commit
  17. 14 Jul, 2012 2 commits
  18. 09 Jul, 2012 1 commit
    • eCryptfs: Copy up POSIX ACL and read-only flags from lower mount · 069ddcda
      Committed by Tyler Hicks
      When the eCryptfs mount options do not include '-o acl', but the lower
      filesystem's mount options do include 'acl', the MS_POSIXACL flag is not
      flipped on in the eCryptfs super block flags. This flag is what the VFS
      checks in do_last() when deciding if the current umask should be applied
      to a newly created inode's mode or not. When a default POSIX ACL mask is
      set on a directory, the current umask is incorrectly applied to new
      inodes created in the directory. This patch ignores the MS_POSIXACL flag
      passed into ecryptfs_mount() and sets the flag on the eCryptfs super
      block depending on the flag's presence on the lower super block.
      
      Additionally, it is incorrect to allow a writeable eCryptfs mount on top
      of a read-only lower mount. This missing check did not allow writes to
      the read-only lower mount because permissions checks are still performed
      on the lower filesystem's objects but it is best to simply not allow a
      rw mount on top of ro mount. However, a ro eCryptfs mount on top of a rw
      mount is valid and still allowed.
      
      https://launchpad.net/bugs/1009207
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Reported-by: Stefan Beller <stefanbeller@googlemail.com>
      Cc: John Johansen <john.johansen@canonical.com>
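The two rules in this commit can be sketched as a small flag computation. The SK_RDONLY and SK_POSIXACL constants and upper_flags() are hypothetical names chosen for this sketch so they do not clash with system headers; the logic shown is an interpretation of the commit message, not the eCryptfs code.

```c
/* Hypothetical flag bits standing in for MS_RDONLY and MS_POSIXACL. */
#define SK_RDONLY    (1UL << 0)
#define SK_POSIXACL  (1UL << 16)

/*
 * Compute the flags for the upper (eCryptfs-like) superblock.
 * Returns 0 on success, or -1 for a read-write mount on a read-only lower mount.
 */
static int upper_flags(unsigned long requested, unsigned long lower,
                       unsigned long *out)
{
    /* A rw upper mount on top of a ro lower mount is refused. */
    if ((lower & SK_RDONLY) && !(requested & SK_RDONLY))
        return -1;

    /* Ignore the requested ACL bit and copy it up from the lower mount. */
    *out = (requested & ~SK_POSIXACL) | (lower & SK_POSIXACL);
    return 0;
}
```

A read-only upper mount over a read-write lower mount passes the check, matching the last sentence of the commit message.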
  19. 21 Mar, 2012 2 commits
  20. 10 Aug, 2011 1 commit
  21. 30 May, 2011 3 commits
  22. 26 Apr, 2011 1 commit
    • eCryptfs: Add reference counting to lower files · 332ab16f
      Committed by Tyler Hicks
      For any given lower inode, eCryptfs keeps only one lower file open and
      multiplexes all eCryptfs file operations through that lower file. The
      lower file was considered "persistent" and stayed open from the first
      lookup through the lifetime of the inode.
      
      This patch keeps the notion of a single, per-inode lower file, but adds
      reference counting around the lower file so that it is closed when not
      currently in use. If the reference count is at 0 when an operation (such
      as open, create, etc.) needs to use the lower file, a new lower file is
      opened. Since the file is no longer persistent, all references to the
      term persistent file are changed to lower file.
      
      Locking is added around the sections of code that open the lower file
      and assign the pointer in the inode info, as well as the code that fputs
      the lower file when all eCryptfs users are done with it.
      
      This patch is needed to fix issues, when mounted on top of the NFSv3
      client, where the lower file is left silly renamed until the eCryptfs
      inode is destroyed.
      Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
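The open-on-first-use, close-on-last-put behaviour can be sketched in single-threaded user-space C. All type and function names here are hypothetical, the "file" is just a flag, and the locking that the commit adds around these sections is deliberately left out to keep the sketch short.

```c
#include <stddef.h>

/* Hypothetical stand-in for an open lower file. */
struct lower_file_sketch {
    int is_open;
};

/* Hypothetical stand-in for the per-inode info. */
struct inode_info_sketch {
    int lower_file_count;                  /* current users of the lower file */
    struct lower_file_sketch *lower_file;  /* NULL while no one is using it */
    struct lower_file_sketch storage;      /* backing object for this sketch */
};

/* Take a reference; the lower file is opened when the count leaves 0. */
static void get_lower_file(struct inode_info_sketch *ii)
{
    /* NOTE: real code would hold a lock here; omitted in this sketch. */
    if (ii->lower_file_count++ == 0) {
        ii->storage.is_open = 1;
        ii->lower_file = &ii->storage;
    }
}

/* Drop a reference; the lower file is closed when the count returns to 0. */
static void put_lower_file(struct inode_info_sketch *ii)
{
    /* NOTE: real code would hold a lock here; omitted in this sketch. */
    if (--ii->lower_file_count == 0) {
        ii->lower_file->is_open = 0;
        ii->lower_file = NULL;
    }
}
```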
  23. 31 Mar, 2011 1 commit
  24. 28 Mar, 2011 3 commits
  25. 18 Jan, 2011 2 commits
  26. 14 Jan, 2011 1 commit
  27. 13 Jan, 2011 1 commit
  28. 07 Jan, 2011 1 commit
    • fs: dcache reduce branches in lookup path · fb045adb
      Committed by Nick Piggin
      Reduce some branches and memory accesses in dcache lookup by adding dentry
      flags to indicate common d_ops are set, rather than having to check them.
      This saves a pointer memory access (dentry->d_op) in common path lookup
      situations, and saves another pointer load and branch in cases where we
      have d_op but not the particular operation.
      
      Patched with:
      
      git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
  29. 29 Oct, 2010 2 commits
  30. 05 Oct, 2010 2 commits
    • BKL: Remove BKL from ecryptfs · 18dfe89d
      Committed by Arnd Bergmann
      The BKL is only used in fill_super, which is protected by the superblock's
      s_umount rw_semaphore, and in fasync, which does not do anything that
      could require the BKL. Therefore it is safe to remove the BKL entirely.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Dustin Kirkland <kirkland@canonical.com>
      Cc: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
      Cc: ecryptfs-devel@lists.launchpad.net
    • BKL: Explicitly add BKL around get_sb/fill_super · db719222
      Committed by Jan Blunck
      This patch is a preparation necessary to remove the BKL from do_new_mount().
      It explicitly adds calls to lock_kernel()/unlock_kernel() around
      get_sb/fill_super operations for filesystems that still use the BKL.
      
      I've read through all the code formerly covered by the BKL inside
      do_kern_mount() and have satisfied myself that it doesn't need the BKL
      any more.
      
      do_kern_mount() is already called without the BKL when mounting the rootfs
      and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
      from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
      through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
      afs_mntpt_follow_link(). Both of the latter functions are actually the
      filesystem's follow_link inode operation. vfs_kern_mount() calls the
      specified get_sb function and lets the filesystem do its job by calling
      the given fill_super function.
      
      Therefore I think it is safe to push down the BKL from the VFS to the
      low-level filesystems' get_sb/fill_super operations.
      
      [arnd: do not add the BKL to those file systems that already
             don't use it elsewhere]
      Signed-off-by: Jan Blunck <jblunck@infradead.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Christoph Hellwig <hch@infradead.org>
  31. 22 May, 2010 1 commit
    • Ban ecryptfs over ecryptfs · 4403158b
      Committed by Al Viro
      This is a seriously simplified patch from Eric Sandeen; copy of
      rationale follows:
      ===
        mounting stacked ecryptfs on ecryptfs has been shown to lead to bugs
        in testing.  For crypto info in xattr, there is no mechanism for handling
        this at all, and for normal file headers, we run into other trouble:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
        IP: [<ffffffffa015b0b3>] ecryptfs_d_revalidate+0x43/0xa0 [ecryptfs]
        ...
      
        There doesn't seem to be any good usecase for this, so I'd suggest just
        disallowing the configuration.
      
        Based on a patch originally, I believe, from Mike Halcrow.
      ===
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>