1. 18 Jun, 2017 1 commit
    • ufs: fix the logics for tail relocation · 77e9ce32
      Al Viro authored
      * original hysteresis loop got broken by typo back in 2002; now
      it never switches out of OPTTIME state.  Fixed.
      * critical levels for switching from OPTTIME to OPTSPACE and back
      ought to be calculated once, at mount time.
      * we should use mul_u64_u32_div() for those calculations, now that
      ->s_dsize is 64bit.
      * to quote Kirk McKusick (in 1995 FreeBSD commit message):
          The threshold for switching from time-space and space-time is too small
          when minfree is 5%...so make it stay at space in this case.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
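The hysteresis described in the commit above can be sketched in user-space C. This is a minimal illustration, not the kernel code: the type and function names (`opt_limits`, `compute_limits`, `next_optim`, `mul_div`) are hypothetical, `mul_div` is a portable stand-in for the kernel's `mul_u64_u32_div()`, and the threshold formulas (half of the minfree reserve, and minfree minus 2%) are assumptions modeled loosely on the classic FFS heuristic.

```c
#include <assert.h>
#include <stdint.h>

enum optim { OPT_TIME, OPT_SPACE };

/* Hypothetical container for the two switch levels, computed once at mount. */
struct opt_limits {
    uint64_t space_to_time; /* leave OPT_SPACE when free fragments drop below this */
    uint64_t time_to_space; /* leave OPT_TIME when free fragments reach this */
};

/* floor(a * mul / div) without overflowing the intermediate product. */
static uint64_t mul_div(uint64_t a, uint32_t mul, uint32_t div)
{
    assert(div != 0);
    return (a / div) * mul + (a % div) * mul / div;
}

/* Assumed formulas: low mark = minfree/2 percent, high mark = (minfree - 2) percent. */
static struct opt_limits compute_limits(uint64_t dsize, uint32_t minfree)
{
    struct opt_limits l;

    if (minfree <= 5) {
        /* Band too narrow to be useful: never leave OPT_SPACE, always leave OPT_TIME. */
        l.space_to_time = 0;
        l.time_to_space = 0;
    } else {
        l.space_to_time = mul_div(dsize, minfree, 200);
        l.time_to_space = mul_div(dsize, minfree - 2, 100);
    }
    return l;
}

/* One step of the hysteresis loop; nffree is the current free-fragment count. */
static enum optim next_optim(enum optim cur, uint64_t nffree,
                             const struct opt_limits *l)
{
    if (cur == OPT_SPACE && nffree < l->space_to_time)
        return OPT_TIME;
    if (cur == OPT_TIME && nffree >= l->time_to_space)
        return OPT_SPACE;
    return cur;
}
```

Between the two marks the current mode is kept, which is what makes the loop a hysteresis rather than a single threshold. With minfree at 5% or below, the sketch pins the mode to space optimization, in the spirit of the McKusick quote.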
  2. 15 Jun, 2017 2 commits
    • ufs: fix s_size/s_dsize users · c596961d
      Al Viro authored
      For UFS2 we need 64bit variants; we even store them in uspi, but
      use 32bit ones instead.  One wrinkle is in handling of reserved
      space - recalculating it every time had been stupid all along, but
      now it would become really ugly.  Just calculate it once...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ufs: fix logics in "ufs: make fsck -f happy" · 96ecff14
      Al Viro authored
      Storing stats _only_ at new locations is wrong for UFS1; old
      locations should always be kept updated.  The check for "has
      been converted to use of new locations" is also wrong - it
      should be "->fs_maxbsize is equal to ->fs_bsize".
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  3. 10 Jun, 2017 1 commit
  4. 25 Dec, 2016 1 commit
  5. 11 Apr, 2016 1 commit
  6. 15 Jan, 2016 1 commit
    • kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov authored
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 07 Jul, 2015 2 commits
    • ufs: kill lock_ufs() · dff7cfd3
      Al Viro authored
      There were 3 remaining users; in two of them we took ->s_lock immediately
      after lock_ufs() and held it until just before unlock_ufs(); the third
      one (statfs) could not be called from itself or from other two (remount
      and sync_fs).  Just use ->s_lock in statfs and don't bother with lock_ufs
      at all.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ufs: don't use lock_ufs() for block pointers tree protection · 724bb09f
      Al Viro authored
      * stores to block pointers are under per-inode seqlock (meta_lock) and
      mutex (truncate_mutex)
      * fetches of block pointers are either under truncate_mutex, or wrapped
      into seqretry loop on meta_lock
      * all changes of ->i_size are under truncate_mutex and i_mutex
      * all changes of ->i_lastfrag are under truncate_mutex
      
      It's similar to what ext2 is doing; the main difference is that unlike
      ext2 we can't rely upon the atomicity of stores into block pointers -
      on UFS2 they are 64bit.  So we can't cut the corner when switching
      a pointer from NULL to non-NULL as we could in ext2_splice_branch()
      and need to use meta_lock on all modifications.
      
      We use seqlock where ext2 uses rwlock; ext2 could probably also benefit
      from such change...
      
      Another non-trivial difference is that with UFS we *cannot* have reader
      grab truncate_mutex in case of race - it has to keep retrying.  That
      might be possible to change, but not until we lift tail unpacking
      several levels up in call chain.
      
      After that commit we do *NOT* hold fs-wide serialization on accesses
      to block pointers anymore.  Moreover, lock_ufs() can become a normal
      mutex now - it's only used on statfs, remount and sync_fs and none
      of those uses are recursive.  As a matter of fact, *now* it can be
      collapsed with ->s_lock, and be eventually replaced with saner
      per-cylinder-group spinlocks, but that's a separate story.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
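The locking rule in the commit above (every store under a seqlock, every lockless fetch wrapped in a retry loop) can be sketched in user-space C11. This is only an illustration under stated assumptions: `blkptr_sketch`, `blkptr_store` and `blkptr_fetch` are hypothetical names, the 64-bit pointer is split into two 32-bit halves to model a store that is not atomic, and sequentially consistent atomics are used for simplicity rather than the kernel's seqlock primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one block pointer guarded by a sequence counter. */
struct blkptr_sketch {
    atomic_uint seq;          /* even: stable; odd: update in progress */
    _Atomic uint32_t lo, hi;  /* two halves model a non-atomic 64-bit store */
};

/*
 * Writer side. Callers are assumed to be serialized already
 * (the commit uses truncate_mutex for that).
 */
static void blkptr_store(struct blkptr_sketch *p, uint64_t v)
{
    unsigned s = atomic_load(&p->seq);

    assert((s & 1) == 0);             /* no other update may be in flight */
    atomic_store(&p->seq, s + 1);     /* mark the update as in progress */
    atomic_store(&p->lo, (uint32_t)v);
    atomic_store(&p->hi, (uint32_t)(v >> 32));
    atomic_store(&p->seq, s + 2);     /* publish the new value */
}

/* Reader side: retry until both halves were read under one stable, even count. */
static uint64_t blkptr_fetch(struct blkptr_sketch *p)
{
    unsigned s;
    uint32_t lo, hi;

    do {
        s = atomic_load(&p->seq);
        lo = atomic_load(&p->lo);
        hi = atomic_load(&p->hi);
    } while ((s & 1) || atomic_load(&p->seq) != s);

    return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks and never takes the writer's mutex; it just keeps retrying, which matches the commit's remark that a racing reader cannot fall back to grabbing truncate_mutex.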
  8. 18 Jun, 2015 1 commit
  9. 16 Jun, 2015 1 commit
  10. 02 Jun, 2015 1 commit
    • writeback: separate out include/linux/backing-dev-defs.h · 66114cad
      Tejun Heo authored
      With the planned cgroup writeback support, backing-dev related
      declarations will be more widely used across block and cgroup;
      unfortunately, including backing-dev.h from include/linux/blkdev.h
      makes cyclic include dependency quite likely.
      
      This patch separates out backing-dev-defs.h which only has the
      essential definitions and updates blkdev.h to include it.  c files
      which need access to more backing-dev details now include
      backing-dev.h directly.  This takes backing-dev.h off the common
      include dependency chain making it a lot easier to use it across block
      and cgroup.
      
      v2: fs/fat build failure fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  11. 16 Apr, 2015 1 commit
  12. 18 Feb, 2015 2 commits
  13. 09 Aug, 2014 5 commits
  14. 07 Jun, 2014 1 commit
    • ufs: sb mutex merge + mutex_destroy · 0244756e
      Fabian Frederick authored
      Commit 788257d6 ("ufs: remove the BKL") replaced BKL with mutex
      protection using functions lock_ufs, unlock_ufs and struct mutex 'mutex'
      in sb_info.
      
      Commit b6963327 ("ufs: drop lock/unlock super") removed lock/unlock
      super and added struct mutex 's_lock' in sb_info.
      
      Those 2 mutexes are generally locked/unlocked at the same time except in
      allocation (balloc, ialloc).
      
      This patch merges the 2 mutexes and propagates the first commit's solution.
      It also adds mutex destruction before kfree during ufs_fill_super
      failure and ufs_put_super.
      
      [akpm@linux-foundation.org: avoid ifdefs, return -EROFS not -EINVAL]
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Evgeniy Dushistov <dushistov@mail.ru>
      Cc: "Chen, Jet" <jet.chen@intel.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 08 Apr, 2014 4 commits
  16. 13 Mar, 2014 1 commit
    • fs: push sync_filesystem() down to the file system's remount_fs() · 02b9984d
      Theodore Ts'o authored
      Previously, the no-op "mount -o remount /dev/xxx" operation when the
      file system is already mounted read-write causes an implied,
      unconditional syncfs().  This seems pretty stupid, and it's certainly
      not documented or guaranteed to do this, nor is it particularly useful,
      except in the case where the file system was mounted rw and is getting
      remounted read-only.
      
      However, it's possible that there might be some file systems that are
      actually depending on this behavior.  In most file systems, it's
      probably fine to only call sync_filesystem() when transitioning from
      read-write to read-only, and there are some file systems where this is
      not needed at all (for example, for a pseudo-filesystem or something
      like romfs).
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Artem Bityutskiy <dedekind1@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Evgeniy Dushistov <dushistov@mail.ru>
      Cc: Jan Kara <jack@suse.cz>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Anders Larsen <al@alarsen.net>
      Cc: Phillip Lougher <phillip@squashfs.org.uk>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: Petr Vandrovec <petr@vandrovec.name>
      Cc: xfs@oss.sgi.com
      Cc: linux-btrfs@vger.kernel.org
      Cc: linux-cifs@vger.kernel.org
      Cc: samba-technical@lists.samba.org
      Cc: codalist@coda.cs.cmu.edu
      Cc: linux-ext4@vger.kernel.org
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Cc: fuse-devel@lists.sourceforge.net
      Cc: cluster-devel@redhat.com
      Cc: linux-mtd@lists.infradead.org
      Cc: jfs-discussion@lists.sourceforge.net
      Cc: linux-nfs@vger.kernel.org
      Cc: linux-nilfs@vger.kernel.org
      Cc: linux-ntfs-dev@lists.sourceforge.net
      Cc: ocfs2-devel@oss.oracle.com
      Cc: reiserfs-devel@vger.kernel.org
  17. 04 Mar, 2013 1 commit
    • fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman authored
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems, trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives, allowing simple, safe,
      well-understood work-arounds to known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as its module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant is autofs, which lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regard to the user's permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep.
      Which means there is not much attack surface exposed by loading a
      filesystem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Reported-by: Kees Cook <keescook@google.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
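The naming scheme from the commit above is easy to illustrate. The sketch below is user-space C, not the kernel code; `fs_module_alias` is a hypothetical helper that only shows how a filesystem type string maps to the prefixed module alias, so that a type such as "ufs" resolves to "fs-ufs" and nothing outside the fs- namespace can be requested through mount.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the module alias for a filesystem type, e.g. "ufs" -> "fs-ufs".
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int fs_module_alias(char *buf, size_t buflen, const char *fstype)
{
    int n;

    assert(buf != NULL && fstype != NULL);
    n = snprintf(buf, buflen, "fs-%s", fstype);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

On the module side, the matching alias is what lets the lookup succeed even when the module file is named differently from the filesystem type, as in the autofs/autofs4 case mentioned in the message.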
  18. 10 Oct, 2012 1 commit
  19. 03 Oct, 2012 1 commit
  20. 23 Jul, 2012 3 commits
    • fs/ufs: get rid of write_super · 9e9ad5f4
      Artem Bityutskiy authored
      This patch makes UFS stop using the VFS '->write_super()' method along with
      the 's_dirt' superblock flag, because they are on their way out.
      
      The way we implement this is that we schedule a delayed job instead of relying on
      's_dirt' and '->write_super()'.
      
      The whole "superblock write-out" VFS infrastructure is served by the
      'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
      writes out all dirty superblocks using the '->write_super()' call-back.  But the
      problem with this thread is that it wastes power by waking up the system every
      5 seconds, even if there are no dirty superblocks, or there are no client
      file-systems which would need this (e.g., btrfs does not use
      '->write_super()'). So we want to kill it completely and thus, we need to make
      file-systems stop using the '->write_super()' VFS service, and then remove
      it together with the kernel thread.
      
      Tested using fsstress from the LTP project.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs/ufs: re-arrange the code a bit · 7bd54ef7
      Artem Bityutskiy authored
      This patch does not do any functional changes. It only moves 3 functions
      in fs/ufs/super.c a little bit up in order to prepare for further changes
      where I'll need this new arrangement to avoid forward declarations.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs/ufs: remove extra superblock write on unmount · 65e5e83f
      Artem Bityutskiy authored
      UFS calls 'ufs_write_super()' from 'ufs_put_super()' in order to write the
      superblocks to the media. However, it is not needed because VFS calls
      '->sync_fs()' before calling '->put_super()' - so by the time we are in
      'ufs_write_super()', the superblocks are already synchronized.
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  21. 11 May, 2012 1 commit
    • vfs: make it possible to access the dentry hash/len as one 64-bit entry · 26fe5750
      Linus Torvalds authored
      This allows comparing hash and len in one operation on 64-bit
      architectures.  Right now only __d_lookup_rcu() takes advantage of this,
      since that is the case we care most about.
      
      The use of anonymous struct/unions hides the alternate 64-bit approach
      from most users, the exception being a few cases where we initialize a
      'struct qstr' with a static initializer.  This makes the problematic
      cases use a new QSTR_INIT() helper function for that (but initializing
      just the name pointer with a "{ .name = xyzzy }" initializer remains
      valid, as does just copying another qstr structure).
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 29 Mar, 2012 1 commit
  23. 21 Mar, 2012 2 commits
  24. 07 Jan, 2012 1 commit
  25. 04 Jan, 2012 1 commit
    • vfs: fix the stupidity with i_dentry in inode destructors · 6b520e05
      Al Viro authored
      Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
      it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
      the cost of taking it into inode_init_always() will be negligible for pipes
      and sockets and negative for everything else.  Not to mention the removal of
      boilerplate code from ->destroy_inode() instances...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  26. 31 Mar, 2011 1 commit
  27. 03 Mar, 2011 1 commit
    • ufs: remove the BKL · 788257d6
      Arnd Bergmann authored
      This introduces a new per-superblock mutex in UFS to replace
      the big kernel lock. I have been careful to avoid nested
      calls to lock_ufs and to get the lock order right with
      respect to other mutexes, in particular lock_super.
      
      I did not make any attempt to prove that the big kernel
      lock is not needed in a particular place in the code,
      which is very possible.
      
      The mutex has a significant performance impact, so it is only
      used on SMP or PREEMPT configurations.
      
      As Nick Piggin noticed, any allocation inside of the lock
      may end up deadlocking when we get to ufs_getfrag_block
      in the reclaim task, so we now use GFP_NOFS.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Tested-by: Nick Bowler <nbowler@elliptictech.com>
      Cc: Evgeniy Dushistov <dushistov@mail.ru>
      Cc: Nick Piggin <npiggin@gmail.com>