1. 17 Apr, 2005 13 commits
    • A
      [PATCH] jbd dirty buffer leak fix · d13df84f
      akpm@osdl.org committed
      This fixes the lots-of-fsx-linux-instances-cause-a-slow-leak bug.
      
      It's been there since 2.6.6, caused by:
      
      ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.5/2.6.5-mm4/broken-out/jbd-move-locked-buffers.patch
      
      That patch moves under-writeout ordered-data buffers onto a separate journal
      list during commit.  It took out the old code which was based on a single
      list.
      
      The old code (necessarily) had logic which would restart I/O against buffers
      which had been redirtied while they were on the committing transaction's
      t_sync_datalist list.  The new code only writes buffers once, ignoring
      redirtyings by a later transaction, which is good.
      
      But over on the truncate side of things, in journal_unmap_buffer(), we're
      treating buffers on the t_locked_list as inviolable things which belong to the
      committing transaction, and we just leave them alone during concurrent
      truncate-vs-commit.
      
      The net effect is that when truncate tries to invalidate a page whose buffers
      are on t_locked_list and have been redirtied, journal_unmap_buffer() just
      leaves those buffers alone.  truncate will remove the page from its mapping
      and we end up with an anonymous clean page with dirty buffers, which is an
      illegal state for a page.  The JBD commit will not clean those buffers as they
      are removed from t_locked_list.  The VM (try_to_free_buffers) cannot reclaim
      these pages.
      
      The patch teaches journal_unmap_buffer() about buffers which are on the
      committing transaction's t_locked_list.  These buffers have been written and
      I/O has completed.  We can take them off the transaction and undirty them
      within the context of journal_invalidatepage()->journal_unmap_buffer().
      Acked-by: "Stephen C. Tweedie" <sct@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d13df84f
    • D
      [PATCH] Direct IO async short read fix · 29504ff3
      Daniel McNeil committed
      The direct I/O code is mapping the read request to the file system block.  If
      the file size was not on a block boundary, the result would show the read
      reading past EOF.  This was only happening for the AIO case.  The non-AIO case
      truncates the result to match file size (in direct_io_worker).  This patch
      does the same thing for the AIO case: it truncates the result to match the
      file size if the read reads past EOF.
      
      When I/O completes the result can be truncated to match the file size
      without using i_size_read(), thus the aio result now matches the number of
      bytes read to the end of file.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      29504ff3
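The truncation described in the commit above can be sketched in plain C. This is a minimal illustration, not the kernel's code: the helper name `clamp_read_result` and its parameters are assumptions made for the example.

```c
#include <sys/types.h>

/*
 * Hypothetical helper (not taken from the kernel): clamp the byte count
 * reported for a read that started at `offset` so that it never extends
 * past `file_size`.
 */
static ssize_t clamp_read_result(ssize_t result, off_t offset, off_t file_size)
{
    if (result <= 0)
        return result;                      /* errors and empty reads pass through */
    if (offset >= file_size)
        return 0;                           /* the read started at or beyond EOF */
    if ((off_t)result > file_size - offset)
        return (ssize_t)(file_size - offset);   /* block-rounded read went past EOF */
    return result;
}
```

For a 1000-byte file read from offset 0 through a 4096-byte block, the reported result would be clamped from 4096 to 1000.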
    • D
      [PATCH] undo do_readv_writev() behavior change · 1f08ad02
      Dave Hansen committed
      Bugme bug 4326: http://bugme.osdl.org/show_bug.cgi?id=4326 reports:
      
      Executing the system call readv with a bad argument
      (->len == -1) gives error EFAULT instead of EINVAL.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      1f08ad02
    • N
      [PATCH] quota: possible bug in quota format v2 support · e821d94d
      Niu YaWei committed
      Don't put root block of quota tree to the free list (when quota file is
      completely empty).  That should not actually happen anyway (somebody should
      get accounted for the filesystem root and so quota file should never be
      empty) but better prevent it here than solve magical quota file
      corruption.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e821d94d
    • J
      [PATCH] quota: fix possible oops on quotaoff · 31e7ad6a
      Jan Kara committed
      Remove dquot structures from quota file on quotaon - quota code does not
      expect them to be there.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      31e7ad6a
    • B
      [PATCH] ext2 corruption - regression between 2.6.9 and 2.6.10 · e072c6f2
      Bernard Blackham committed
      Whilst trying to stress test a Promise SX8 card, we stumbled across
      some nasty filesystem corruption in ext2. Our tests involved
      creating an ext2 partition, mounting, running several concurrent
      fsx's over it, umounting, and fsck'ing, all scripted[1]. The fsck
      would always return with errors.
      
      This regression was traced back to a change between 2.6.9 and
      2.6.10, which moved the functionality of ext2_put_inode into
      ext2_clear_inode.  The attached patch reverses this change, and
      eliminates the source of corruption.
      
      Mingming Cao <cmm@us.ibm.com> said:
      
      I think his patch for ext2 is correct.  The corruption on ext3 is not the same
      issue he saw on ext2.  I believe that's the race between discard reservation
      and reservation in-use that we already fixed in 2.6.12-rc1.
      
      For the problem related to ext2: at the time we designed reservation for
      ext3, we decided we only needed to discard the reservation at the last file
      close, so we have ext3_discard_reservation on iput_final->ext3_clear_inode.
      
      ext2 handled discarding preallocation differently at that time: it discarded
      the preallocation at each iput(), not in iput_final().  We thought it was
      unnecessary to thrash it so frequently, and that the right thing to do, as we
      did for ext3 reservation, was to discard preallocation on the last iput().  So
      we moved ext2_discard_preallocation from ext2_put_inode() to ext2_clear_inode.
      
      Since ext2 preallocation is done on disk, it is possible that at unmount time
      someone still holds a reference to the inode, so the preallocation for the
      file has not been discarded yet.  Those blocks are still marked allocated on
      disk while they are not actually in the inode's block map, so fsck will
      catch/fix that error later.
      
      This is not an issue for ext3, as ext3 reservation (pre-allocation) is done in
      memory.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e072c6f2
    • B
      [PATCH] ASYNC IO using signals other than SIGIO · fc9c9ab2
      Bharath Ramesh committed
      A question on sigwaitinfo based IO mechanism in multithreaded applications.
      
      I am trying to use RT signals instead of SIGIO to notify me of IO events
      in a multithreaded application.  I noticed that there was
      some discussion on lkml during November 1999 with the subject
      "Signal driven IO".  In the thread I noticed that RT signals
      were being delivered to the worker thread.  I am running a 2.6.10 kernel and
      I am trying to use the very same mechanism, and I find that only SIGIO is
      propagated to the worker threads, while RT signals are propagated only to
      the main thread and not to the worker threads where I actually want them to be
      propagated.  On further inspection I found that the following patch,
      which I have attached, solves the problem.
      
      I am not sure if this is a bug or feature in the kernel.
      
      
      Roland McGrath <roland@redhat.com> said:
      
      This relates only to fcntl F_SETSIG, which is a Linux extension.  So there is
      no POSIX issue.  When changing various things like the normal SIGIO signalling
      to do group signals, I was concerned strictly with the POSIX semantics and
      generally avoided touching things in the domain of Linux inventions.  That's
      why I didn't change this when I changed the call right next to it.  There is
      no reason I can see that F_SETSIG-requested signals shouldn't use a group
      signal like normal SIGIO does.  I'm happy to ACK this patch, there is nothing
      wrong with its change to the semantics in my book.  But neither POSIX nor I
      care a whit what F_SETSIG does.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      fc9c9ab2
    • B
      [PATCH] ppc64: Improve mapping of vDSO · 547ee84c
      Benjamin Herrenschmidt committed
      This patch reworks the way the ppc64 vDSO is mapped in user memory by the kernel
      to make it more robust against possible collisions with executable
      segments.  Instead of just whacking a VMA at 1Mb, I now use
      get_unmapped_area() with a hint, and I moved the mapping of the vDSO to
      after the mapping of the various ELF segments and of the interpreter, so
      that conflicts get caught properly (it still has to be before
      create_elf_tables since the latter will fill the AT_SYSINFO_EHDR with the
      proper address).
      
      While I was at it, I also changed the 32 and 64 bits vDSO's to link at
      their "natural" address of 1Mb instead of 0.  This is the address where
      they are normally mapped in absence of conflict.  By doing so, it should be
      possible to properly prelink once it's been verified to work on glibc.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      547ee84c
    • M
      [PATCH] meminfo: add Cached underflow check · 4c4c402d
      Martin Hicks committed
      Working on some code lately I've been getting huge values for "Cached".
      The cause is that get_page_cache_size() is an approximate value, and for a
      sufficiently small returned value of get_page_cache_size() the value
      underflows.
      Signed-off-by: Martin Hicks <mort@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4c4c402d
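The underflow in the commit above is the usual unsigned-subtraction problem: when an approximate total is smaller than the amount subtracted from it, the result wraps to a huge value. A minimal sketch of a clamped subtraction follows; the function name `cached_pages` and the `excluded_pages` parameter are assumptions for illustration, since the commit message does not say which quantities are subtracted.

```c
/*
 * Hypothetical sketch: derive a "Cached" page count from an approximate
 * page cache size. If the approximate total is smaller than the amount
 * to exclude, report 0 instead of letting the unsigned value wrap.
 */
static unsigned long cached_pages(unsigned long page_cache_pages,
                                  unsigned long excluded_pages)
{
    if (excluded_pages > page_cache_pages)
        return 0;                       /* would underflow: clamp to zero */
    return page_cache_pages - excluded_pages;
}
```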
    • A
      76c3073a
    • A
      [PATCH] oom-killer disable for iscsi/lvm2/multipath userland critical sections · 79befd0c
      Andrea Arcangeli committed
      iscsi/lvm2/multipath needs guaranteed protection from the oom-killer, so
      make the magical value of -17 in /proc/<pid>/oom_adj defeat the oom-killer
      altogether.
      
      (akpm: we still need to document oom_adj and friends in
      Documentation/filesystems/proc.txt!)
      Signed-off-by: Andrea Arcangeli <andrea@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      79befd0c
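The effect of the magic value in the commit above can be pictured as a victim-selection loop that skips any task whose oom_adj is -17. The sketch below is an illustration only: `struct task_sketch`, its `badness` field and `pick_victim` are hypothetical names, not kernel structures or functions.

```c
#include <stddef.h>

#define OOM_DISABLE (-17)   /* the "magical value" named in the commit message */

/* Hypothetical stand-in for a task; not a kernel structure. */
struct task_sketch {
    int pid;
    int oom_adj;
    unsigned long badness;  /* assumed score: higher means more likely to be killed */
};

/* Return the highest-scoring task that has not opted out, or NULL if none. */
static const struct task_sketch *pick_victim(const struct task_sketch *tasks,
                                             size_t n)
{
    const struct task_sketch *victim = NULL;

    for (size_t i = 0; i < n; i++) {
        if (tasks[i].oom_adj == OOM_DISABLE)
            continue;                   /* protected task: never selected */
        if (victim == NULL || tasks[i].badness > victim->badness)
            victim = &tasks[i];
    }
    return victim;
}
```

In this sketch a protected task is skipped even when it has the highest score, which is the guarantee the commit message asks for on behalf of iscsi/lvm2/multipath daemons.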
    • A
      [PATCH] Fix acl Oops · e493073d
      akpm@osdl.org committed
      From: Andreas Gruenbacher <agruen@suse.de>
      
      ext[23]_get_acl will return an error when reading the attribute fails or
      out-of-memory occurs.  Catch this case.
      Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e493073d
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4