1. 19 12月, 2014 1 次提交
    • S
      hfsplus: fix longname handling · 89ac9b4d
      Sougata Santra 提交于
      Longname is not correctly handled by hfsplus driver.  If an attempt to
      create a longname(>255) file/directory is made, it succeeds by creating a
      file/directory with HFSPLUS_MAX_STRLEN and incorrect catalog key.  Thus
      leaving the volume in an inconsistent state.  This patch fixes this issue.
      
      Although lookup is always called first to create a negative entry, so just
      doing a check in lookup would probably fix this issue.  I choose to
      propagate error to other iops as well.
      
      Please NOTE: I have factored out hfsplus_cat_build_key_with_cnid from
      hfsplus_cat_build_key, to avoid unncessary branching.
      
      Thanks a lot.
      
        TEST:
        ------
        dir="TEST_DIR"
        cdir=`pwd`
        name255="_123456789_123456789_123456789_123456789_123456789_123456789\
        _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
        _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
        _123456789_123456789_123456789_123456789_123456789_1234"
        name256="${name255}5"
      
        mkdir $dir
        cd $dir
        touch $name255
        rm -f $name255
        touch $name256
        ls -la
        cd $cdir
        rm -rf $dir
      
        RESULT:
        -------
        [sougata@ultrabook tmp]$ cdir=`pwd`
        [sougata@ultrabook tmp]$
        name255="_123456789_123456789_123456789_123456789_123456789_123456789\
         > _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
         > _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
         > _123456789_123456789_123456789_123456789_123456789_1234"
        [sougata@ultrabook tmp]$ name256="${name255}5"
        [sougata@ultrabook tmp]$
        [sougata@ultrabook tmp]$ mkdir $dir
        [sougata@ultrabook tmp]$ cd $dir
        [sougata@ultrabook TEST_DIR]$ touch $name255
        [sougata@ultrabook TEST_DIR]$ rm -f $name255
        [sougata@ultrabook TEST_DIR]$ touch $name256
        [sougata@ultrabook TEST_DIR]$ ls -la
        ls: cannot access
        _123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234:
        No such file or directory
        total 0
        drwxrwxr-x 1 sougata sougata 3 Feb 20 19:56 .
        drwxrwxrwx 1 root    root    6 Feb 20 19:56 ..
        -????????? ? ?       ?       ?            ?
        _123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234
        [sougata@ultrabook TEST_DIR]$ cd $cdir
        [sougata@ultrabook tmp]$ rm -rf $dir
        rm: cannot remove `TEST_DIR': Directory not empty
      
      -ENAMETOOLONG returned from hfsplus_asc2uni was not propaged to iops.
      This allowed hfsplus to create files/directories with HFSPLUS_MAX_STRLEN
      and incorrect keys, leaving the FS in an inconsistent state.  This patch
      fixes this issue.
      Signed-off-by: NSougata Santra <sougata@tuxera.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      89ac9b4d
  2. 07 6月, 2014 1 次提交
  3. 04 4月, 2014 1 次提交
    • J
      mm + fs: store shadow entries in page cache · 91b0abe3
      Johannes Weiner 提交于
      Reclaim will be leaving shadow entries in the page cache radix tree upon
      evicting the real page.  As those pages are found from the LRU, an
      iput() can lead to the inode being freed concurrently.  At this point,
      reclaim must no longer install shadow pages because the inode freeing
      code needs to ensure the page tree is really empty.
      
      Add an address_space flag, AS_EXITING, that the inode freeing code sets
      under the tree lock before doing the final truncate.  Reclaim will check
      for this flag before installing shadow pages.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Reviewed-by: NMinchan Kim <minchan@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Metin Doslu <metin@citusdata.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ozgun Erdogan <ozgun@citusdata.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roman Gushchin <klamm@yandex-team.ru>
      Cc: Ryan Mallon <rmallon@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      91b0abe3
  4. 13 3月, 2014 1 次提交
    • T
      fs: push sync_filesystem() down to the file system's remount_fs() · 02b9984d
      Theodore Ts'o 提交于
      Previously, the no-op "mount -o mount /dev/xxx" operation when the
      file system is already mounted read-write causes an implied,
      unconditional syncfs().  This seems pretty stupid, and it's certainly
      documented or guaraunteed to do this, nor is it particularly useful,
      except in the case where the file system was mounted rw and is getting
      remounted read-only.
      
      However, it's possible that there might be some file systems that are
      actually depending on this behavior.  In most file systems, it's
      probably fine to only call sync_filesystem() when transitioning from
      read-write to read-only, and there are some file systems where this is
      not needed at all (for example, for a pseudo-filesystem or something
      like romfs).
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Artem Bityutskiy <dedekind1@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Evgeniy Dushistov <dushistov@mail.ru>
      Cc: Jan Kara <jack@suse.cz>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Anders Larsen <al@alarsen.net>
      Cc: Phillip Lougher <phillip@squashfs.org.uk>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: Petr Vandrovec <petr@vandrovec.name>
      Cc: xfs@oss.sgi.com
      Cc: linux-btrfs@vger.kernel.org
      Cc: linux-cifs@vger.kernel.org
      Cc: samba-technical@lists.samba.org
      Cc: codalist@coda.cs.cmu.edu
      Cc: linux-ext4@vger.kernel.org
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Cc: fuse-devel@lists.sourceforge.net
      Cc: cluster-devel@redhat.com
      Cc: linux-mtd@lists.infradead.org
      Cc: jfs-discussion@lists.sourceforge.net
      Cc: linux-nfs@vger.kernel.org
      Cc: linux-nilfs@vger.kernel.org
      Cc: linux-ntfs-dev@lists.sourceforge.net
      Cc: ocfs2-devel@oss.oracle.com
      Cc: reiserfs-devel@vger.kernel.org
      02b9984d
  5. 13 11月, 2013 1 次提交
  6. 01 5月, 2013 2 次提交
  7. 04 3月, 2013 1 次提交
    • E
      fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman 提交于
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  Allowing simple, safe,
      well understood work-arounds to known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as it's module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant being autofs that lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regards to the users permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep.
      Which means there is not much attack surface exposed by loading a
      filesytem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Reported-by: NKees Cook <keescook@google.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      7f78e035
  8. 28 2月, 2013 1 次提交
  9. 21 12月, 2012 2 次提交
  10. 03 10月, 2012 1 次提交
  11. 31 7月, 2012 1 次提交
  12. 23 7月, 2012 4 次提交
  13. 06 5月, 2012 1 次提交
  14. 21 3月, 2012 1 次提交
  15. 11 1月, 2012 1 次提交
  16. 04 1月, 2012 1 次提交
    • A
      vfs: fix the stupidity with i_dentry in inode destructors · 6b520e05
      Al Viro 提交于
      Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
      it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
      the cost of taking it into inode_init_always() will be negligible for pipes
      and sockets and negative for everything else.  Not to mention the removal of
      boilerplate code from ->destroy_inode() instances...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6b520e05
  17. 16 9月, 2011 2 次提交
  18. 22 7月, 2011 1 次提交
  19. 07 7月, 2011 2 次提交
  20. 30 6月, 2011 2 次提交
  21. 04 2月, 2011 1 次提交
  22. 13 1月, 2011 1 次提交
  23. 07 1月, 2011 2 次提交
    • N
      fs: dcache reduce branches in lookup path · fb045adb
      Nick Piggin 提交于
      Reduce some branches and memory accesses in dcache lookup by adding dentry
      flags to indicate common d_ops are set, rather than having to check them.
      This saves a pointer memory access (dentry->d_op) in common path lookup
      situations, and saves another pointer load and branch in cases where we
      have d_op but not the particular operation.
      
      Patched with:
      
      git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      fb045adb
    • N
      fs: icache RCU free inodes · fa0d7e3d
      Nick Piggin 提交于
      RCU free the struct inode. This will allow:
      
      - Subsequent store-free path walking patch. The inode must be consulted for
        permissions when walking, so an RCU inode reference is a must.
      - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
        to take i_lock no longer need to take sb_inode_list_lock to walk the list in
        the first place. This will simplify and optimize locking.
      - Could remove some nested trylock loops in dcache code
      - Could potentially simplify things a bit in VM land. Do not need to take the
        page lock to follow page->mapping.
      
      The downsides of this is the performance cost of using RCU. In a simple
      creat/unlink microbenchmark, performance drops by about 10% due to inability to
      reuse cache-hot slab objects. As iterations increase and RCU freeing starts
      kicking over, this increases to about 20%.
      
      In cases where inode lifetimes are longer (ie. many inodes may be allocated
      during the average life span of a single inode), a lot of this cache reuse is
      not applicable, so the regression caused by this patch is smaller.
      
      The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
      however this adds some complexity to list walking and store-free path walking,
      so I prefer to implement this at a later date, if it is shown to be a win in
      real situations. I haven't found a regression in any non-micro benchmark so I
      doubt it will be a problem.
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      fa0d7e3d
  24. 17 12月, 2010 1 次提交
  25. 23 11月, 2010 7 次提交
    • C
      hfsplus: flush disk caches in sync and fsync · 34a2d313
      Christoph Hellwig 提交于
      Flush the disk cache in fsync and sync to make sure data actually is
      on disk on completion of these system calls.  There is a nobarrier
      mount option to disable this behaviour.  It's slightly misnamed now
      that barrier actually are gone, but it matches the name used by all
      major filesystems.
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      34a2d313
    • C
      hfsplus: optimize fsync · e3494705
      Christoph Hellwig 提交于
      Avoid doing unessecary work in fsync.  Do nothing unless the inode
      was marked dirty, and only write the various metadata inodes out if
      they contain any dirty state from this inode.  This is archived by
      adding three new dirty bits to the hfsplus-specific inode which are
      set in the correct places.
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      e3494705
    • C
      hfsplus: split up inode flags · b33b7921
      Christoph Hellwig 提交于
      Split the flags field in the hfsplus inode into an extent_state
      flag that is locked by the extent_lock, and a new flags field
      that uses atomic bitops.  The second will grow more flags in the
      next patch.
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      b33b7921
    • C
      hfsplus: avoid useless work in hfsplus_sync_fs · f02e26f8
      Christoph Hellwig 提交于
      There is no reason to write out the metadata inodes or volume headers
      during a non-blocking sync, as we are almost guaranteed to dirty them
      again during the inode writeouts.
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      f02e26f8
    • C
      hfsplus: make sure sync writes out all metadata · 7dc4f001
      Christoph Hellwig 提交于
      hfsplus stores all metadata except for the volume headers in special
      inodes.  While these are marked hashed and periodically written out
      by the flusher threads, we can't rely on that for sync.  For the case
      of a data integrity sync the VM has life-lock avoidance code that
      avoids writing inodes again that are redirtied during the sync,
      which is something that can happen easily for hfsplus.  So make sure
      we explicitly write out the metadata inodes at the beginning of
      hfsplus_sync_fs.
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      7dc4f001
    • C
      hfsplus: use raw bio access for the volume headers · 52399b17
      Christoph Hellwig 提交于
      The hfsplus backup volume header is located two blocks from the end of
      the device.  In case of device sizes that are not 4k aligned this means
      we can't access it using buffer_heads when using the default 4k block
      size.
      
      Switch to using raw bios to read/write all buffer headers.  We were not
      relying on any caching behaviour of the buffer heads anyway.  Additionally
      always read in the backup volume header during mount to verify that we
      can actually read it.
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      52399b17
    • C
      hfsplus: always use hfsplus_sync_fs to write the volume header · 3b5ce8ae
      Christoph Hellwig 提交于
      Remove opencoded writing of the volume header in hfsplus_fill_super
      and hfsplus_put_super and offload it to hfsplus_sync_fs.  In the
      put_super case this means we only write the superblock once instead
      of twice.
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      3b5ce8ae