1. 14 10月, 2010 5 次提交
    • C
      hfsplus: fix link corruption · f6089ff8
      Christoph Hellwig 提交于
      HFS implements hardlink by using indirect catalog entries that refer to a hidden
      directly.  The link target is cached in the dev field in the HFS+ specific
      inode, which is also used for the device number for device files, and inside
      for passing the nlink value of the indirect node from hfsplus_cat_write_inode
      to a helper function.  Now if we happen to write out the indirect node while
      hfsplus_link is creating the catalog entry we'll get a link pointing to the
      linkid of the current nlink value.  This can easily be reproduced by a large
      enough loop of local git-clone operations.
      
      Stop abusing the dev field in the HFS+ inode for short term storage by
      refactoring the way the permission structure in the catalog entry is
      set up, and rename the dev field to linkid to avoid any confusion.
      
      While we're at it also prevent creating hard links to special files, as
      the HFS+ dev and linkid share the same space in the on-disk structure.
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      f6089ff8
    • C
      hfsplus: validate btree flags · 13571a69
      Christoph Hellwig 提交于
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      13571a69
    • E
      hfsplus: handle more on-disk corruptions without oopsing · 9250f925
      Eric Sandeen 提交于
      hfs seems prone to bad things when it encounters on disk corruption.  Many
      values are read from disk, and used as lengths to memcpy, as an example.
      This patch fixes up several of these problematic cases.
      
      o sanity check the on-disk maximum key lengths on mount
        (these are set to a defined value at mkfs time and shouldn't differ)
      o check on-disk node keylens against the maximum key length for each tree
      o fix hfs_btree_open so that going out via free_tree: doesn't wind
        up in hfs_releasepage, which wants to follow the very pointer
        we were trying to set up:
      	HFS_SB(sb)->cat_tree = hfs_btree_open()
          .
        failure gets to hfs_releasepage and tries to follow HFS_SB(sb)->cat_tree
      
      Tested with the fsfuzzer; it survives more than it used to.
      
      [hch: ported of commit cf059462 from hfs]
      [hch: added the fixes from 5581d018ed3493d226e7a4d645d9c8a5af6c36b]
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      9250f925
    • A
      hfsplus: hfs_bnode_find() can fail, resulting in hfs_bnode_split() breakage · b6b41424
      Al Viro 提交于
      oops and fs corruption; the latter can happen even on valid fs in case of oom.
      
      [hch: port of commit 3d10a15d from hfs]
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      b6b41424
    • J
      hfsplus: fix oops on mount with corrupted btree extent records · ee527162
      Jeff Mahoney 提交于
      A particular fsfuzzer run caused an hfs file system to crash on mount. This
      is due to a corrupted MDB extent record causing a miscalculation of
      HFSPLUS_I(inode)->first_blocks for the extent tree. If the extent records
      are zereod out, then it won't trigger the first_blocks special case and
      instead falls through to the extent code, which we're in the middle
      of initializing.
      
      This patch catches the 0 size extent records, reports the corruption,
      and fails the mount.
      
      [hch: ported of commit 47f365eb from hfs]
      Reported-by: NRamon de Carvalho Valle <rcvalle@linux.vnet.ibm.com>
      Signed-off-by: NJeff Mahoney <jeffm@suse.com>
      Signed-off-by: NChristoph Hellwig <hch@tuxera.com>
      ee527162
  2. 01 10月, 2010 20 次提交
  3. 24 9月, 2010 5 次提交
  4. 23 9月, 2010 4 次提交
    • K
      /proc/pid/smaps: fix dirty pages accounting · 1c2499ae
      KOSAKI Motohiro 提交于
      Currently, /proc/<pid>/smaps has wrong dirty pages accounting.
      Shared_Dirty and Private_Dirty output only pte dirty pages and ignore
      PG_dirty page flag.  It is difference against documentation, but also
      inconsistent against Referenced field.  (Referenced checks both pte and
      page flags)
      
      This patch fixes it.
      
      Test program:
      
       large-array.c
       ---------------------------------------------------
       #include <stdio.h>
       #include <stdlib.h>
       #include <string.h>
       #include <unistd.h>
      
       char array[1*1024*1024*1024L];
      
       int main(void)
       {
               memset(array, 1, sizeof(array));
               pause();
      
               return 0;
       }
       ---------------------------------------------------
      
      Test case:
       1. run ./large-array
       2. cat /proc/`pidof large-array`/smaps
       3. swapoff -a
       4. cat /proc/`pidof large-array`/smaps again
      
      Test result:
       <before patch>
      
      00601000-40601000 rw-p 00000000 00:00 0
      Size:            1048576 kB
      Rss:             1048576 kB
      Pss:             1048576 kB
      Shared_Clean:          0 kB
      Shared_Dirty:          0 kB
      Private_Clean:    218992 kB   <-- showed pages as clean incorrectly
      Private_Dirty:    829584 kB
      Referenced:       388364 kB
      Swap:                  0 kB
      KernelPageSize:        4 kB
      MMUPageSize:           4 kB
      
       <after patch>
      
      00601000-40601000 rw-p 00000000 00:00 0
      Size:            1048576 kB
      Rss:             1048576 kB
      Pss:             1048576 kB
      Shared_Clean:          0 kB
      Shared_Dirty:          0 kB
      Private_Clean:         0 kB
      Private_Dirty:   1048576 kB  <-- fixed
      Referenced:       388480 kB
      Swap:                  0 kB
      KernelPageSize:        4 kB
      MMUPageSize:           4 kB
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: NHugh Dickins <hughd@google.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1c2499ae
    • J
      aio: do not return ERESTARTSYS as a result of AIO · a0c42bac
      Jan Kara 提交于
      OCFS2 can return ERESTARTSYS from its write function when the process is
      signalled while waiting for a cluster lock (and the filesystem is mounted
      with intr mount option).  Generally, it seems reasonable to allow
      filesystems to return this error code from its IO functions.  As we must
      not leak ERESTARTSYS (and similar error codes) to userspace as a result of
      an AIO operation, we have to properly convert it to EINTR inside AIO code
      (restarting the syscall isn't really an option because other AIO could
      have been already submitted by the same io_submit syscall).
      Signed-off-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a0c42bac
    • A
      /proc/vmcore: fix seeking · c227e690
      Arnd Bergmann 提交于
      Commit 73296bc6 ("procfs: Use generic_file_llseek in /proc/vmcore")
      broke seeking on /proc/vmcore.  This changes it back to use default_llseek
      in order to restore the original behaviour.
      
      The problem with generic_file_llseek is that it only allows seeks up to
      inode->i_sb->s_maxbytes, which is zero on procfs and some other virtual
      file systems.  We should merge generic_file_llseek and default_llseek some
      day and clean this up in a proper way, but for 2.6.35/36, reverting vmcore
      is the safer solution.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Reported-by: NCAI Qian <caiqian@redhat.com>
      Tested-by: NCAI Qian <caiqian@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c227e690
    • D
      Prevent freeing uninitialized pointer in compat_do_readv_writev · 767b68e9
      Dan Rosenberg 提交于
      In 32-bit compatibility mode, the error handling for
      compat_do_readv_writev() may free an uninitialized pointer, potentially
      leading to all sorts of ugly memory corruption.  This is reliably
      triggerable by unprivileged users by invoking the readv()/writev()
      syscalls with an invalid iovec pointer.  The below patch fixes this to
      emulate the non-compat version.
      
      Introduced by commit b8373363 ("compat: factor out
      compat_rw_copy_check_uvector from compat_do_readv_writev")
      Signed-off-by: NDan Rosenberg <dan.j.rosenberg@gmail.com>
      Cc: stable@kernel.org (2.6.35)
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      767b68e9
  5. 22 9月, 2010 2 次提交
    • J
      bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends · 692ebd17
      Jan Kara 提交于
      Inodes of devices such as /dev/zero can get dirty for example via
      utime(2) syscall or due to atime update. Backing device of such inodes
      (zero_bdi, etc.) is however unable to handle dirty inodes and thus
      __mark_inode_dirty complains.  In fact, inode should be rather dirtied
      against backing device of the filesystem holding it. This is generally a
      good rule except for filesystems such as 'bdev' or 'mtd_inodefs'. Inodes
      in these pseudofilesystems are referenced from ordinary filesystem
      inodes and carry mapping with real data of the device. Thus for these
      inodes we have to use inode->i_mapping->backing_dev_info as we did so
      far. We distinguish these filesystems by checking whether sb->s_bdi
      points to a non-trivial backing device or not.
      
      Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /.
      There's a device inode A described by a path "/dev/sdb" on this
      filesystem. This inode will be dirtied against backing device "8:0"
      after this patch. bdev filesystem contains block device inode B coupled
      with our inode A. When someone modifies a page of /dev/sdb, it's B that
      gets dirtied and the dirtying happens against the backing device "8:16".
      Thus both inodes get filed to a correct bdi list.
      
      Cc: stable@kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      692ebd17
    • J
      char: Mark /dev/zero and /dev/kmem as not capable of writeback · 371d217e
      Jan Kara 提交于
      These devices don't do any writeback but their device inodes still can get
      dirty so mark bdi appropriately so that bdi code does the right thing and files
      inodes to lists of bdi carrying the device inodes.
      
      Cc: stable@kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      371d217e
  6. 20 9月, 2010 1 次提交
  7. 18 9月, 2010 3 次提交