1. 21 Oct, 2010 5 commits
  2. 15 Oct, 2010 2 commits
    • Un-inline the core-dump helper functions · 3aa0ce82
      Authored by Linus Torvalds
      Tony Luck reports that the addition of the access_ok() check in commit
      0eead9ab ("Don't dump task struct in a.out core-dumps") broke the
      ia64 compile due to missing the necessary header file includes.
      
      Rather than add yet another include (<asm/unistd.h>) to make everything
      happy, just uninline the silly core dump helper functions and move the
      bodies to fs/exec.c where they make a lot more sense.
      
      dump_seek() in particular was too big to be an inline function anyway,
      and none of them are in any way performance-critical.  And we really
      don't need to mess up our include file headers more than they already
      are.
      Reported-and-tested-by: Tony Luck <tony.luck@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Don't dump task struct in a.out core-dumps · 0eead9ab
      Authored by Linus Torvalds
      akiphie points out that a.out core-dumps have that odd task struct
      dumping that was never used and was never really a good idea (it goes
      back into the mists of history, probably the original core-dumping
      code).  Just remove it.
      
      Also do the access_ok() check on dump_write().  It probably doesn't
      matter (since normal filesystems all seem to do it anyway), but he
      points out that it's normally done by the VFS layer, so ...
      
      [ I suspect that we should possibly do "vfs_write()" instead of
        calling ->write directly.  That also does the whole fsnotify and write
        statistics thing, which may or may not be a good idea. ]
      
      And just to be anal, do this all for the x86-64 32-bit a.out emulation
      code too, even though it's not enabled (and won't currently even
      compile)
      Reported-by: akiphie <akiphie@lavabit.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 14 Oct, 2010 1 commit
  4. 12 Oct, 2010 1 commit
    • fanotify: disable fanotify syscalls · 7c534773
      Authored by Eric Paris
      This patch disables the fanotify syscalls by just not building them and
      letting the cond_syscall() statements in kernel/sys_ni.c redirect them
      to sys_ni_syscall().
      
      It was pointed out by Tvrtko Ursulin that the fanotify interface did not
      include an explicit prioritization between groups.  This is necessary
      for fanotify to be usable for hierarchical storage management software,
      as they must get first access to the file, before inotify-like notifiers
      see the file.
      
      This feature can be added in an ABI compatible way in the next release
      (by using a number of bits in the flags field to carry the info) but it
      was suggested by Alan that maybe we should just hold off and do it in
      the next cycle, likely with an (new) explicit argument to the syscall.
      I don't like this approach best as I know people are already starting to
      use the current interface, but Alan is all wise and no one on list backed
      me up with just using what we have.  I feel this is needlessly ripping
      the rug out from under people at the last minute, but if others think it
      needs to be a new argument it might be the best way forward.
      
      Three choices:
      Go with what we got (and implement the new feature next cycle).  Add a
      new field right now (and implement the new feature next cycle).  Wait
      till next cycle to release the ABI (and implement the new feature next
      cycle).  This is number 3.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 08 Oct, 2010 1 commit
    • exofs: Fix double page_unlock BUG in write_begin/end · f17b1f9f
      Authored by Boaz Harrosh
      This BUG has been there since the first submission of the code, but was only
      triggered in the latest kernel. It's timing related, due to the asynchronous
      object-creation behaviour of exofs (which should be investigated further).

      The bug is obvious, hence the fix.
      
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
  6. 07 Oct, 2010 7 commits
  7. 04 Oct, 2010 2 commits
    • writeback: always use sb->s_bdi for writeback purposes · aaead25b
      Authored by Christoph Hellwig
      We currently use struct backing_dev_info for various different purposes.
      Originally it was introduced to describe a backing device which includes
      an unplug and congestion function and various bits of readahead information
      and VM-relevant flags.  We're also using it for tracking dirty inodes for
      writeback.
      
      To make writeback properly find all inodes we need to only access the
      per-filesystem backing_dev_info pointed to by the superblock in ->s_bdi
      inside the writeback code, and not the instances pointed to by
      inode->i_mapping->backing_dev_info which can be overridden by special devices
      or might not be set at all by some filesystems.
      
      Long term we should split out the writeback-relevant bits of struct
      backing_dev_info (which includes more than the current bdi_writeback)
      and only point to it from the superblock while leaving the traditional
      backing device as a separate structure that can be overridden by devices.
      
      The one exception for now is the block device filesystem which really
      wants different writeback contexts for its different (internal) inodes
      to handle the writeout more efficiently.  For now we do this with
      a hack in fs-writeback.c because we're so late in the cycle, but in
      the future I plan to replace this with a superblock method that allows
      for multiple writeback contexts per filesystem.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • fuse: Initialize total_len in fuse_retrieve() · 0157443c
      Authored by Geert Uytterhoeven
      fs/fuse/dev.c:1357: warning: ‘total_len’ may be used uninitialized in this
      function
      
      Initialize total_len to zero, else its value will be undefined.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  8. 02 Oct, 2010 4 commits
    • reiserfs: fix unwanted reiserfs lock recursion · 9d8117e7
      Authored by Frederic Weisbecker
      Prevent recursive locking of the reiserfs lock in reiserfs_unpack()
      because we may call journal_begin() that requires the lock to be taken
      only once, otherwise it won't be able to release the lock while taking
      other mutexes, ending up in inverted dependencies between the journal
      mutex and the reiserfs lock for example.
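
      The "taken only once" requirement can be illustrated with a small
      single-threaded model. This is only a sketch: model_lock, lock_recursive(),
      lock_once() and unlock_once() are hypothetical names invented here, not the
      reiserfs API. The idea shown is that a caller who already owns the lock does
      not nest it, so a later release really drops the lock.

```c
#include <stdbool.h>

/* Hypothetical model of a lock that records its owner and nesting depth. */
struct model_lock {
    int owner;   /* id of the task holding the lock, -1 when free */
    int depth;   /* how many times the owner has taken it */
};

/* Plain recursive acquire: nests if the caller already owns the lock. */
void lock_recursive(struct model_lock *l, int task)
{
    l->owner = task;
    l->depth++;
}

/*
 * "Take it only once": acquire only when the caller does not already
 * own the lock. Returns true if this call actually took the lock, so
 * the matching unlock_once() knows whether it has to release it.
 */
bool lock_once(struct model_lock *l, int task)
{
    if (l->owner == task && l->depth > 0)
        return false;
    l->owner = task;
    l->depth = 1;
    return true;
}

/* Release only if the paired lock_once() call really acquired the lock. */
void unlock_once(struct model_lock *l, bool taken)
{
    if (!taken)
        return;
    l->depth = 0;
    l->owner = -1;
}
```

      With lock_recursive() two nested calls leave the depth at 2, so a single
      release would not free the lock; with lock_once() the depth never exceeds 1.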
      
      This fixes:
      
        =======================================================
        [ INFO: possible circular locking dependency detected ]
        2.6.35.4.4a #3
        -------------------------------------------------------
        lilo/1620 is trying to acquire lock:
         (&journal->j_mutex){+.+...}, at: [<d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]
      
        but task is already holding lock:
         (&REISERFS_SB(s)->lock){+.+.+.}, at: [<d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
               [<c10562b7>] lock_acquire+0x67/0x80
               [<c12facad>] __mutex_lock_common+0x4d/0x410
               [<c12fb0c8>] mutex_lock_nested+0x18/0x20
               [<d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]
               [<d0325c06>] do_journal_begin_r+0x86/0x340 [reiserfs]
               [<d0325f77>] journal_begin+0x77/0x140 [reiserfs]
               [<d0315be4>] reiserfs_remount+0x224/0x530 [reiserfs]
               [<c10b6a20>] do_remount_sb+0x60/0x110
               [<c10cee25>] do_mount+0x625/0x790
               [<c10cf014>] sys_mount+0x84/0xb0
               [<c12fca3d>] syscall_call+0x7/0xb
      
        -> #0 (&journal->j_mutex){+.+...}:
               [<c10560f6>] __lock_acquire+0x1026/0x1180
               [<c10562b7>] lock_acquire+0x67/0x80
               [<c12facad>] __mutex_lock_common+0x4d/0x410
               [<c12fb0c8>] mutex_lock_nested+0x18/0x20
               [<d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]
               [<d0325f77>] journal_begin+0x77/0x140 [reiserfs]
               [<d0326271>] reiserfs_persistent_transaction+0x41/0x90 [reiserfs]
               [<d030d06c>] reiserfs_get_block+0x22c/0x1530 [reiserfs]
               [<c10db9db>] __block_prepare_write+0x1bb/0x3a0
               [<c10dbbe6>] block_prepare_write+0x26/0x40
               [<d030b738>] reiserfs_prepare_write+0x88/0x170 [reiserfs]
               [<d03294d6>] reiserfs_unpack+0xe6/0x120 [reiserfs]
               [<d0329782>] reiserfs_ioctl+0x272/0x320 [reiserfs]
               [<c10c3188>] vfs_ioctl+0x28/0xa0
               [<c10c3bbd>] do_vfs_ioctl+0x32d/0x5c0
               [<c10c3eb3>] sys_ioctl+0x63/0x70
               [<c12fca3d>] syscall_call+0x7/0xb
      
        other info that might help us debug this:
      
        2 locks held by lilo/1620:
         #0:  (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<d032945a>] reiserfs_unpack+0x6a/0x120 [reiserfs]
         #1:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]
      
        stack backtrace:
        Pid: 1620, comm: lilo Not tainted 2.6.35.4.4a #3
        Call Trace:
         [<c10560f6>] __lock_acquire+0x1026/0x1180
         [<c10562b7>] lock_acquire+0x67/0x80
         [<c12facad>] __mutex_lock_common+0x4d/0x410
         [<c12fb0c8>] mutex_lock_nested+0x18/0x20
         [<d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]
         [<d0325f77>] journal_begin+0x77/0x140 [reiserfs]
         [<d0326271>] reiserfs_persistent_transaction+0x41/0x90 [reiserfs]
         [<d030d06c>] reiserfs_get_block+0x22c/0x1530 [reiserfs]
         [<c10db9db>] __block_prepare_write+0x1bb/0x3a0
         [<c10dbbe6>] block_prepare_write+0x26/0x40
         [<d030b738>] reiserfs_prepare_write+0x88/0x170 [reiserfs]
         [<d03294d6>] reiserfs_unpack+0xe6/0x120 [reiserfs]
         [<d0329782>] reiserfs_ioctl+0x272/0x320 [reiserfs]
         [<c10c3188>] vfs_ioctl+0x28/0xa0
         [<c10c3bbd>] do_vfs_ioctl+0x32d/0x5c0
         [<c10c3eb3>] sys_ioctl+0x63/0x70
         [<c12fca3d>] syscall_call+0x7/0xb
      Reported-by: Jarek Poplawski <jarkao2@gmail.com>
      Tested-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: All since 2.6.32 <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • reiserfs: fix dependency inversion between inode and reiserfs mutexes · 3f259d09
      Authored by Frederic Weisbecker
      The reiserfs mutex already depends on the inode mutex, so we can't lock
      the inode mutex in reiserfs_unpack() without using the safe locking API,
      because reiserfs_unpack() is always called with the reiserfs mutex locked.
      
      This fixes:
      
        =======================================================
        [ INFO: possible circular locking dependency detected ]
        2.6.35c #13
        -------------------------------------------------------
        lilo/1606 is trying to acquire lock:
         (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]
      
        but task is already holding lock:
         (&REISERFS_SB(s)->lock){+.+.+.}, at: [<d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
               [<c1056347>] lock_acquire+0x67/0x80
               [<c12f083d>] __mutex_lock_common+0x4d/0x410
               [<c12f0c58>] mutex_lock_nested+0x18/0x20
               [<d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]
               [<d0329e9a>] reiserfs_lookup_privroot+0x2a/0x90 [reiserfs]
               [<d0316b81>] reiserfs_fill_super+0x941/0xe60 [reiserfs]
               [<c10b7d17>] get_sb_bdev+0x117/0x170
               [<d0313e21>] get_super_block+0x21/0x30 [reiserfs]
               [<c10b74ba>] vfs_kern_mount+0x6a/0x1b0
               [<c10b7659>] do_kern_mount+0x39/0xe0
               [<c10cebe0>] do_mount+0x340/0x790
               [<c10cf0b4>] sys_mount+0x84/0xb0
               [<c12f25cd>] syscall_call+0x7/0xb
      
        -> #0 (&sb->s_type->i_mutex_key#8){+.+.+.}:
               [<c1056186>] __lock_acquire+0x1026/0x1180
               [<c1056347>] lock_acquire+0x67/0x80
               [<c12f083d>] __mutex_lock_common+0x4d/0x410
               [<c12f0c58>] mutex_lock_nested+0x18/0x20
               [<d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]
               [<d0329772>] reiserfs_ioctl+0x272/0x320 [reiserfs]
               [<c10c3228>] vfs_ioctl+0x28/0xa0
               [<c10c3c5d>] do_vfs_ioctl+0x32d/0x5c0
               [<c10c3f53>] sys_ioctl+0x63/0x70
               [<c12f25cd>] syscall_call+0x7/0xb
      
        other info that might help us debug this:
      
        1 lock held by lilo/1606:
         #0:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]
      
        stack backtrace:
        Pid: 1606, comm: lilo Not tainted 2.6.35c #13
        Call Trace:
         [<c1056186>] __lock_acquire+0x1026/0x1180
         [<c1056347>] lock_acquire+0x67/0x80
         [<c12f083d>] __mutex_lock_common+0x4d/0x410
         [<c12f0c58>] mutex_lock_nested+0x18/0x20
         [<d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]
         [<d0329772>] reiserfs_ioctl+0x272/0x320 [reiserfs]
         [<c10c3228>] vfs_ioctl+0x28/0xa0
         [<c10c3c5d>] do_vfs_ioctl+0x32d/0x5c0
         [<c10c3f53>] sys_ioctl+0x63/0x70
         [<c12f25cd>] syscall_call+0x7/0xb
      Reported-by: Jarek Poplawski <jarkao2@gmail.com>
      Tested-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: <stable@kernel.org>		[2.6.32 and later]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: make /proc/pid/limits world readable · 3036e7b4
      Authored by Jiri Olsa
      Having the limits file world readable will ease the task of system
      management on systems where root privileges might be restricted.
      
      An admin whose root privileges are restricted could not otherwise check
      other users' process limits.
      
      Also it'd align with most of the /proc stat files.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Cc: Eugene Teo <eugene@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cifs: prevent infinite recursion in cifs_reconnect_tcon · f569599a
      Authored by Jeff Layton
      cifs_reconnect_tcon is called from smb_init. After a successful
      reconnect, cifs_reconnect_tcon will call reset_cifs_unix_caps. That
      function will, in turn, call CIFSSMBQFSUnixInfo and CIFSSMBSetFSUnixInfo.
      Those functions also call smb_init.
      
      It's possible for the session and tcon reconnect to succeed, and then
      for another cifs_reconnect to occur before CIFSSMBQFSUnixInfo or
      CIFSSMBSetFSUnixInfo is called. That'll cause those functions to call
      smb_init and cifs_reconnect_tcon again, ad infinitum...
      
      Break the infinite recursion by having those functions use a new
      smb_init variant that doesn't attempt to perform a reconnect.
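
      The shape of the fix can be sketched with a toy model. Everything below is
      hypothetical (the toy_* names, the fields, and the choice of -EAGAIN as the
      failure code are assumptions made for illustration, not the cifs code): the
      follow-up request issued from inside the reconnect path uses a setup routine
      that never reconnects, so the reconnect path cannot re-enter itself.

```c
#include <errno.h>

/* Toy connection state; every name here is hypothetical. */
struct toy_conn {
    int need_reconnect;  /* 1 while the link is down */
    int drops_left;      /* times the link drops again right after a reconnect */
    int reconnects;      /* reconnect attempts made so far */
    int depth;           /* current nesting of toy_reconnect() */
    int max_depth;       /* deepest nesting observed */
};

/* Request setup that never reconnects: it simply fails while the link is down. */
int toy_init_no_reconnect(struct toy_conn *c)
{
    return c->need_reconnect ? -EAGAIN : 0;
}

/* Reconnect, then issue the follow-up capability request. */
int toy_reconnect(struct toy_conn *c)
{
    int rc;

    c->depth++;
    if (c->depth > c->max_depth)
        c->max_depth = c->depth;
    c->reconnects++;
    c->need_reconnect = 0;

    /* Simulate the link dropping again before the follow-up request. */
    if (c->drops_left > 0) {
        c->drops_left--;
        c->need_reconnect = 1;
    }

    /* Non-reconnecting setup: this call cannot re-enter toy_reconnect(). */
    rc = toy_init_no_reconnect(c);
    c->depth--;
    return rc;
}

/* Ordinary request setup: reconnects first when the link is down. */
int toy_init(struct toy_conn *c)
{
    if (c->need_reconnect)
        return toy_reconnect(c);
    return 0;
}
```

      In this model a link that drops mid-reconnect makes the current request fail
      and leaves the retry to the next caller, with the nesting depth bounded at 1.
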
      Reported-and-Tested-by: Michal Suchanek <hramrach@centrum.cz>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
  9. 30 Sep, 2010 2 commits
  10. 29 Sep, 2010 1 commit
    • xfs: force background CIL push under sustained load · 80168676
      Authored by Dave Chinner
      I have been seeing occasional pauses in transaction throughput up to
      30s long under heavy parallel workloads. The only notable thing was
      that the xfsaild was trying to be active during the pauses, but
      making no progress. It was running exactly 20 times a second (on the
      50ms no-progress backoff), and the number of pushbuf events was
      constant across this time as well.  IOWs, the xfsaild appeared to be
      stuck on buffers that it could not push out.
      
      Further investigation indicated that it was trying to push out inode
      buffers that were pinned and/or locked. The xfsbufd was also getting
      woken at the same frequency (by the xfsaild, no doubt) to push out
      delayed write buffers. The xfsbufd was not making any progress
      because all the buffers in the delwri queue were pinned. This scan-
      and-make-no-progress dance went on in the trace for some seconds,
      before the xfssyncd came along and issued a log force, and then
      things started going again.
      
      However, I noticed something strange about the log force - there
      were way too many IO's issued. 516 log buffers were written, to be
      exact. That added up to 129MB of log IO, which got me very
      interested because it's almost exactly 25% of the size of the log.
      The delayed logging code is supposed to aggregate the minimum of 25%
      of the log or 8MB worth of changes before flushing. That's what
      really puzzled me - why did a log force write 129MB instead of only
      8MB?
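
      The intended limit described above (the smaller of 25% of the log and 8MB)
      can be written down as a one-line calculation. This is a sketch of the rule
      as stated in the message, not the XFS code; cil_push_threshold() and the MiB
      macro are names made up for the example. For scale, 129MB spread over 516
      log buffers works out to 256KB per buffer.

```c
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Space a checkpoint may aggregate before a background push is due:
 * min(25% of the log, 8MB), as described in the commit message.
 */
uint64_t cil_push_threshold(uint64_t log_bytes)
{
    uint64_t quarter = log_bytes / 4;
    uint64_t cap = 8 * MiB;

    return quarter < cap ? quarter : cap;
}
```

      For a hypothetical 512MB log the result is 8MB, far below the 129MB that
      the log force actually wrote; only for logs smaller than 32MB does the 25%
      term become the limit.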
      
      Essentially what has happened is that no CIL pushes had occurred
      since the previous tail push which cleared out 25% of the log space.
      That caused all the new transactions to block because there wasn't
      log space for them, but they kick the xfsaild to push the tail.
      However, the xfsaild was not making progress because there were
      buffers it could not lock and flush, and the xfsbufd could not flush
      them because they were pinned. As a result, both the xfsaild and the
      xfsbufd could not move the tail of the log forward without the CIL
      first committing.
      
      The cause of the problem was that the background CIL push, which
      should happen when 8MB of aggregated changes have been committed, is
      being held off by the concurrent transaction commit load. The
      background push does a down_write_trylock() which will fail if there
      is a concurrent transaction commit holding the push lock in read
      mode. With 8 CPUs all doing transactions as fast as they can, there
      were enough concurrent transaction commits to hold off the background
      push until tail-pushing could no longer free log space, and the halt
      would occur.
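
      Why a trylock-based push can be held off indefinitely is easy to see in a
      toy reader/writer semaphore. The model below is single-threaded and purely
      illustrative; the toy_* names are hypothetical and this is not the kernel
      rwsem implementation. A write trylock succeeds only when no reader holds
      the semaphore, so overlapping readers keep it failing.

```c
#include <stdbool.h>

/* Toy reader/writer semaphore: a reader count plus a writer flag. */
struct toy_rwsem {
    int  readers;
    bool writer;
};

/* Readers (transaction commits in the analogy) share the semaphore. */
bool toy_down_read_trylock(struct toy_rwsem *s)
{
    if (s->writer)
        return false;
    s->readers++;
    return true;
}

void toy_up_read(struct toy_rwsem *s)
{
    s->readers--;
}

/* The writer (background push in the analogy) needs exclusive access. */
bool toy_down_write_trylock(struct toy_rwsem *s)
{
    if (s->writer || s->readers > 0)
        return false;   /* gives up immediately instead of waiting */
    s->writer = true;
    return true;
}

void toy_up_write(struct toy_rwsem *s)
{
    s->writer = false;
}
```

      As long as at least one reader is always inside the semaphore, every
      toy_down_write_trylock() call fails; a blocking acquire would instead wait
      its turn.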
      
      It should be noted that there is no reason why it would halt at 25%
      of log space used by a single CIL checkpoint. This bug could
      definitely violate the "no transaction should be larger than half
      the log" requirement and hence result in corruption if the system
      crashed under heavy load. This sort of bug is exactly the reason why
      delayed logging was tagged as experimental....
      
      The fix is to start blocking background pushes once the threshold
      has been exceeded. Rework the threshold calculations to keep the
      amount of log space a CIL checkpoint can use to below that of the
      AIL push threshold to avoid the problem completely.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
  11. 24 Sep, 2010 5 commits
  12. 23 Sep, 2010 4 commits
    • /proc/pid/smaps: fix dirty pages accounting · 1c2499ae
      Authored by KOSAKI Motohiro
      Currently, /proc/<pid>/smaps has wrong dirty pages accounting.
      Shared_Dirty and Private_Dirty count only pte-dirty pages and ignore the
      PG_dirty page flag.  This differs from the documentation and is also
      inconsistent with the Referenced field.  (Referenced checks both pte and
      page flags.)
      
      This patch fixes it.
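
      The difference between the two accounting rules can be sketched in a few
      lines. The struct and function names below are hypothetical and only model
      the rule described above; they are not the kernel's smaps code.

```c
#include <stdbool.h>

/* Toy view of one mapped page: the two places a dirty bit can live. */
struct toy_page {
    bool pte_dirty;  /* dirty bit in the page table entry */
    bool pg_dirty;   /* PG_dirty flag on the page itself */
};

/* Old rule: only the pte dirty bit is consulted. */
bool counted_dirty_before(struct toy_page p)
{
    return p.pte_dirty;
}

/* New rule: a page is dirty if either the pte or the page flag says so. */
bool counted_dirty_after(struct toy_page p)
{
    return p.pte_dirty || p.pg_dirty;
}
```

      A page with a clean pte but PG_dirty set is reported as clean under the old
      rule and as dirty under the new one, which is the kind of page that shows
      up under Private_Clean in the "before patch" output.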
      
      Test program:
      
       large-array.c
       ---------------------------------------------------
       #include <stdio.h>
       #include <stdlib.h>
       #include <string.h>
       #include <unistd.h>
      
       char array[1*1024*1024*1024L];
      
       int main(void)
       {
               memset(array, 1, sizeof(array));
               pause();
      
               return 0;
       }
       ---------------------------------------------------
      
      Test case:
       1. run ./large-array
       2. cat /proc/`pidof large-array`/smaps
       3. swapoff -a
       4. cat /proc/`pidof large-array`/smaps again
      
      Test result:
       <before patch>
      
      00601000-40601000 rw-p 00000000 00:00 0
      Size:            1048576 kB
      Rss:             1048576 kB
      Pss:             1048576 kB
      Shared_Clean:          0 kB
      Shared_Dirty:          0 kB
      Private_Clean:    218992 kB   <-- showed pages as clean incorrectly
      Private_Dirty:    829584 kB
      Referenced:       388364 kB
      Swap:                  0 kB
      KernelPageSize:        4 kB
      MMUPageSize:           4 kB
      
       <after patch>
      
      00601000-40601000 rw-p 00000000 00:00 0
      Size:            1048576 kB
      Rss:             1048576 kB
      Pss:             1048576 kB
      Shared_Clean:          0 kB
      Shared_Dirty:          0 kB
      Private_Clean:         0 kB
      Private_Dirty:   1048576 kB  <-- fixed
      Referenced:       388480 kB
      Swap:                  0 kB
      KernelPageSize:        4 kB
      MMUPageSize:           4 kB
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • aio: do not return ERESTARTSYS as a result of AIO · a0c42bac
      Authored by Jan Kara
      OCFS2 can return ERESTARTSYS from its write function when the process is
      signalled while waiting for a cluster lock (and the filesystem is mounted
      with intr mount option).  Generally, it seems reasonable to allow
      filesystems to return this error code from their IO functions.  As we must
      not leak ERESTARTSYS (and similar error codes) to userspace as a result of
      an AIO operation, we have to properly convert it to EINTR inside AIO code
      (restarting the syscall isn't really an option because other AIO could
      have been already submitted by the same io_submit syscall).
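
      The conversion can be sketched as a small filter on the result code. This
      is an illustration, not the patch itself: aio_fixup_result() is a
      hypothetical name, and because the kernel-internal restart codes are not
      exported to userspace headers, the K_* constants are defined locally with
      the values used in the kernel's include/linux/errno.h.

```c
#include <errno.h>

/* Kernel-internal restart codes, redefined here only for the sketch. */
#define K_ERESTARTSYS            512
#define K_ERESTARTNOINTR         513
#define K_ERESTARTNOHAND         514
#define K_ERESTART_RESTARTBLOCK  516

/*
 * Map restart codes, which must never reach userspace, to -EINTR.
 * Every other result (byte counts, ordinary errors) passes through.
 */
long aio_fixup_result(long ret)
{
    switch (ret) {
    case -K_ERESTARTSYS:
    case -K_ERESTARTNOINTR:
    case -K_ERESTARTNOHAND:
    case -K_ERESTART_RESTARTBLOCK:
        return -EINTR;
    default:
        return ret;
    }
}
```

      Which of the "similar error codes" the real change covers is an assumption
      here; the message only names ERESTARTSYS explicitly.
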
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • /proc/vmcore: fix seeking · c227e690
      Authored by Arnd Bergmann
      Commit 73296bc6 ("procfs: Use generic_file_llseek in /proc/vmcore")
      broke seeking on /proc/vmcore.  This changes it back to use default_llseek
      in order to restore the original behaviour.
      
      The problem with generic_file_llseek is that it only allows seeks up to
      inode->i_sb->s_maxbytes, which is zero on procfs and some other virtual
      file systems.  We should merge generic_file_llseek and default_llseek some
      day and clean this up in a proper way, but for 2.6.35/36, reverting vmcore
      is the safer solution.
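
      The effect of a zero s_maxbytes can be shown with two toy seek helpers.
      They handle only the absolute-offset case and use hypothetical names; they
      model the bounds check described above rather than reproduce the kernel
      functions.

```c
#include <errno.h>

/* Toy absolute seek with an upper bound, as generic_file_llseek applies. */
long long toy_generic_llseek(long long offset, long long maxbytes)
{
    if (offset < 0 || offset > maxbytes)
        return -EINVAL;
    return offset;
}

/* Toy absolute seek without the s_maxbytes bound, as default_llseek behaves. */
long long toy_default_llseek(long long offset)
{
    if (offset < 0)
        return -EINVAL;
    return offset;
}
```

      With maxbytes equal to zero, the bounded helper rejects every offset other
      than 0, which matches the broken seeking reported for /proc/vmcore.
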
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Reported-by: CAI Qian <caiqian@redhat.com>
      Tested-by: CAI Qian <caiqian@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Prevent freeing uninitialized pointer in compat_do_readv_writev · 767b68e9
      Authored by Dan Rosenberg
      In 32-bit compatibility mode, the error handling for
      compat_do_readv_writev() may free an uninitialized pointer, potentially
      leading to all sorts of ugly memory corruption.  This is reliably
      triggerable by unprivileged users by invoking the readv()/writev()
      syscalls with an invalid iovec pointer.  The below patch fixes this to
      emulate the non-compat version.
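
      The bug class and one common way to avoid it can be sketched in userspace.
      This is an illustration under stated assumptions, not the kernel patch: the
      toy_* names, the struct, and TOY_FAST_IOV are invented here, and the sketch
      assumes the safe pattern is to point the vector at an on-stack array before
      any step that can fail, so the cleanup path always sees a valid pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_FAST_IOV 8   /* size of the on-stack array; arbitrary for this sketch */

struct toy_iovec {
    void  *base;
    size_t len;
};

/*
 * Hypothetical copy/validate step. On an early failure it returns
 * without touching *iov; for large vectors it swaps in a heap buffer.
 */
long toy_copy_check(const struct toy_iovec *user, size_t nr,
                    struct toy_iovec **iov)
{
    size_t i;

    if (user == NULL)
        return -EFAULT;          /* early exit: *iov is left untouched */

    if (nr > TOY_FAST_IOV) {
        struct toy_iovec *heap = malloc(nr * sizeof(*heap));
        if (heap == NULL)
            return -ENOMEM;
        *iov = heap;
    }
    for (i = 0; i < nr; i++)
        (*iov)[i] = user[i];
    return (long)nr;
}

long toy_readv_writev(const struct toy_iovec *user, size_t nr)
{
    struct toy_iovec stack[TOY_FAST_IOV];
    struct toy_iovec *iov = stack;   /* initialised before anything can fail */
    long ret = toy_copy_check(user, nr, &iov);

    /* Safe on every path: iov is either the stack array or a heap buffer. */
    if (iov != stack)
        free(iov);
    return ret;
}
```

      Had iov been left uninitialised, the early -EFAULT return would reach the
      cleanup with a garbage pointer, and the comparison against the stack array
      would not prevent it from being freed.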
      
      Introduced by commit b8373363 ("compat: factor out
      compat_rw_copy_check_uvector from compat_do_readv_writev")
      Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
      Cc: stable@kernel.org (2.6.35)
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 22 Sep, 2010 2 commits
    • bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends · 692ebd17
      Authored by Jan Kara
      Inodes of devices such as /dev/zero can get dirty for example via
      utime(2) syscall or due to atime update. Backing device of such inodes
      (zero_bdi, etc.) is however unable to handle dirty inodes and thus
      __mark_inode_dirty complains.  In fact, inode should be rather dirtied
      against backing device of the filesystem holding it. This is generally a
      good rule except for filesystems such as 'bdev' or 'mtd_inodefs'. Inodes
      in these pseudofilesystems are referenced from ordinary filesystem
      inodes and carry mapping with real data of the device. Thus for these
      inodes we have to use inode->i_mapping->backing_dev_info as we did so
      far. We distinguish these filesystems by checking whether sb->s_bdi
      points to a non-trivial backing device or not.
      
      Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /.
      There's a device inode A described by a path "/dev/sdb" on this
      filesystem. This inode will be dirtied against backing device "8:0"
      after this patch. bdev filesystem contains block device inode B coupled
      with our inode A. When someone modifies a page of /dev/sdb, it's B that
      gets dirtied and the dirtying happens against the backing device "8:16".
      Thus both inodes get filed to a correct bdi list.
      
      Cc: stable@kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • char: Mark /dev/zero and /dev/kmem as not capable of writeback · 371d217e
      Authored by Jan Kara
      These devices don't do any writeback but their device inodes still can get
      dirty so mark bdi appropriately so that bdi code does the right thing and files
      inodes to lists of bdi carrying the device inodes.
      
      Cc: stable@kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  14. 20 Sep, 2010 1 commit
  15. 18 Sep, 2010 2 commits