1. 21 Feb, 2007 2 commits
  2. 13 Feb, 2007 2 commits
  3. 12 Feb, 2007 2 commits
    • [PATCH] buffer: memorder fix · 72ed3d03
      Nick Piggin committed
      unlock_buffer(), like unlock_page(), must not clear the lock without
      ensuring that the critical section is closed.
      
      Mingming later sent the same patch, saying:
      
        We are running the SDET benchmark and saw a double free of an ext3 extended
        attribute block, which complains that the same xattr block has already been
        freed (in ext3_xattr_release_block()).  The problem can also be triggered by
        multiple threads looping over untar/rm of a kernel tree.
      
        The race is caused by a missing memory barrier in unlock_buffer() before the
        lock bit is cleared, which allows concurrent h_refcount updates.  That causes
        a reference count leak, which later leads to the double free that we have
        seen.
      
        Inside unlock_buffer(), a memory barrier is placed *after* the lock bit is
        cleared; however, there is no memory barrier *before* the bit is cleared.  On
        some architectures the h_refcount update and the clear-bit instruction can be
        reordered, which lets the critical section be re-entered.
      
        The race is like this: For example, if the h_refcount is initialized as 1,
      
        cpu 0:                                   cpu1
        --------------------------------------   -----------------------------------
        lock_buffer() /* test_and_set_bit */
        clear_buffer_locked(bh);
                                                lock_buffer() /* test_and_set_bit */
        h_refcount = h_refcount+1; /* = 2*/     h_refcount = h_refcount + 1; /*= 2 */
                                                clear_buffer_locked(bh);
        ....                                    ......
      
        We lost an h_refcount update here.  We need a memory barrier before the buffer
        head lock bit is cleared to force the order of the two writes.  Please apply.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
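The ordering requirement described in the commit above can be illustrated outside the kernel. What follows is a minimal userspace sketch in C11, not the kernel code: `demo_buffer`, `DEMO_LOCK_BIT`, and the `demo_*` functions are hypothetical names, and C11 acquire/release atomics stand in for the kernel's bit operations and explicit barriers. The point it shows is that the store clearing the lock bit must be ordered after every write made inside the critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_LOCK_BIT 1u        /* hypothetical stand-in for the buffer lock bit */

struct demo_buffer {
    atomic_uint state;          /* holds the lock bit */
    int refcount;               /* protected by the lock bit, like h_refcount */
};

/* Try to take the lock. Acquire ordering keeps the critical section
 * from being reordered before the test-and-set. */
static bool demo_trylock(struct demo_buffer *b)
{
    unsigned old = atomic_fetch_or_explicit(&b->state, DEMO_LOCK_BIT,
                                            memory_order_acquire);
    return (old & DEMO_LOCK_BIT) == 0;
}

/* Clear the lock bit. Release ordering guarantees that all writes made
 * while holding the lock are visible before the bit is seen as clear. */
static void demo_unlock(struct demo_buffer *b)
{
    assert(atomic_load_explicit(&b->state, memory_order_relaxed) & DEMO_LOCK_BIT);
    atomic_fetch_and_explicit(&b->state, ~DEMO_LOCK_BIT, memory_order_release);
}

/* Increment the protected counter inside the critical section. */
static int demo_get_ref(struct demo_buffer *b)
{
    while (!demo_trylock(b))
        ;                       /* spin until the lock bit is ours */
    int value = ++b->refcount;
    demo_unlock(b);
    return value;
}
```

If `demo_unlock()` used `memory_order_relaxed` instead, the increment could in principle become visible after the lock bit is cleared on weakly ordered hardware, which is the kind of reordering the commit message describes.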
    • [PATCH] remove invalidate_inode_pages() · fc0ecff6
      Andrew Morton committed
      Convert all calls to invalidate_inode_pages() into open-coded calls to
      invalidate_mapping_pages().
      
      Leave the invalidate_inode_pages() wrapper in place for now, marked as
      deprecated.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 30 Jan, 2007 1 commit
  5. 27 Jan, 2007 1 commit
    • Resurrect 'try_to_free_buffers()' VM hackery · ecdfc978
      Linus Torvalds committed
      It's not pretty, but it appears that ext3 with data=journal will clean
      pages without ever actually telling the VM that they are clean.  This,
      in turn, results in the VM (and balance_dirty_pages() in particular)
      never realizing that the pages got cleaned, and waiting forever for an
      event that already happened.
      
      Technically, this seems to be a problem with ext3 itself, but it used to
      be hidden by 'try_to_free_buffers()' noticing this situation on its own,
      and just working around the filesystem problem.
      
      This commit re-instates that hack, in order to avoid a regression for
      the 2.6.20 release. This fixes bugzilla 7844:
      
      	http://bugzilla.kernel.org/show_bug.cgi?id=7844
      
      Peter Zijlstra points out that we should probably retain the debugging
      code that this removes from cancel_dirty_page(), and I agree, but for
      the imminent release we might as well just silence the warning too
      (since it's not a new bug: anything that triggers that warning has been
      around forever).
      Acked-by: Randy Dunlap <rdunlap@xenotime.net>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 12 Jan, 2007 1 commit
  7. 22 Dec, 2006 1 commit
  8. 11 Dec, 2006 3 commits
    • [PATCH] io-accounting: write-cancel accounting · e08748ce
      Andrew Morton committed
      Account for the number of byte writes which this process caused to not happen
      after all.
      
      Cc: Jay Lan <jlan@sgi.com>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Chris Sturtivant <csturtiv@sgi.com>
      Cc: Tony Ernst <tee@sgi.com>
      Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
      Cc: David Wright <daw@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] io-accounting: write accounting · 55e829af
      Andrew Morton committed
      Accounting writes is fairly simple: whenever a process flips a page from clean
      to dirty, we accuse it of having caused a write to underlying storage of
      PAGE_CACHE_SIZE bytes.
      
      This may overestimate the amount of writing: the page-dirtying may cause only
      one buffer_head's worth of writeout.  Fixing that is possible, but probably a
      bit messy and isn't obviously important.
      
      Cc: Jay Lan <jlan@sgi.com>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Chris Sturtivant <csturtiv@sgi.com>
      Cc: Tony Ernst <tee@sgi.com>
      Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
      Cc: David Wright <daw@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
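The accounting rule described in the two io-accounting commits above can be sketched as a small userspace model. This is an assumption-laden illustration, not the kernel implementation: `demo_task_io`, `demo_page`, `DEMO_PAGE_SIZE`, and both functions are hypothetical names, and the page size of 4096 bytes is assumed. A clean-to-dirty transition charges one page of writes to the process; discarding a still-dirty page records one page of cancelled writes.

```c
#include <stdbool.h>

#define DEMO_PAGE_SIZE 4096u    /* assumed page size for this sketch */

/* Hypothetical per-process I/O counters. */
struct demo_task_io {
    unsigned long long write_bytes;
    unsigned long long cancelled_write_bytes;
};

struct demo_page {
    bool dirty;
};

/* Charge a full page only on the clean -> dirty transition.
 * Returns true if the page state changed. */
static bool demo_set_page_dirty(struct demo_task_io *io, struct demo_page *page)
{
    if (page->dirty)
        return false;           /* already dirty: nothing new to charge */
    page->dirty = true;
    io->write_bytes += DEMO_PAGE_SIZE;
    return true;
}

/* Record a cancelled write when a dirty page is dropped before writeout.
 * Returns true if a write was actually cancelled. */
static bool demo_cancel_dirty_page(struct demo_task_io *io, struct demo_page *page)
{
    if (!page->dirty)
        return false;
    page->dirty = false;
    io->cancelled_write_bytes += DEMO_PAGE_SIZE;
    return true;
}
```

As the commit message notes, charging a whole page can overestimate the real writeout when only one buffer_head's worth of data is written; the sketch keeps that simplification.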
    • [PATCH] clean up __set_page_dirty_nobuffers() · 8c08540f
      Andrew Morton committed
      Save a tabstop in __set_page_dirty_nobuffers() and __set_page_dirty_buffers()
      and a few other places.  No functional changes.
      
      Cc: Jay Lan <jlan@sgi.com>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Chris Sturtivant <csturtiv@sgi.com>
      Cc: Tony Ernst <tee@sgi.com>
      Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
      Cc: David Wright <daw@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 08 Dec, 2006 2 commits
  10. 17 Oct, 2006 1 commit
    • [PATCH] Fix IO error reporting on fsync() · 58ff407b
      Jan Kara committed
      When an IO error happens on a metadata buffer, the buffer is freed from
      memory, and fsync() is called later, filesystems like ext2 fail to report EIO.
      We solve the problem by introducing a pointer to the associated address space
      into the buffer_head.  When a buffer is removed from the list of metadata
      buffers associated with an address space, the IO error is transferred from the
      buffer to the address space, so that fsync() can later report it.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
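The error hand-off described in the commit above can be sketched roughly as follows. This is a simplified userspace model under stated assumptions, not the kernel code: `demo_mapping`, `demo_bh`, their fields, and both functions are hypothetical names chosen for illustration. It shows the idea that a buffer carries a back pointer to its address space, and that a pending write error is moved to the address space when the buffer leaves the metadata list.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space that remembers an IO error. */
struct demo_mapping {
    bool io_error;
};

/* Hypothetical stand-in for a buffer_head with a back pointer to the
 * address space it is associated with. */
struct demo_bh {
    struct demo_mapping *assoc_map;
    bool write_io_error;
};

/* Detach the buffer from its address space, transferring any pending
 * write error so that it survives the buffer being freed. */
static void demo_remove_assoc(struct demo_bh *bh)
{
    if (bh->assoc_map != NULL && bh->write_io_error)
        bh->assoc_map->io_error = true;
    bh->assoc_map = NULL;
}

/* Report a remembered error once, then clear it. */
static int demo_fsync(struct demo_mapping *mapping)
{
    if (mapping->io_error) {
        mapping->io_error = false;
        return -EIO;
    }
    return 0;
}
```

Reporting the error once and then clearing it is a design choice of this sketch; the commit message itself only states that fsync can later report the error.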
  11. 12 Oct, 2006 2 commits
  12. 10 Oct, 2006 1 commit
    • [PATCH] mm: bug in set_page_dirty_buffers · ebf7a227
      Nick Piggin committed
      This was triggered by, but is not the fault of, the dirty page accounting
      patches. Suitable for -stable as well, after it goes upstream.
      
        Unable to handle kernel NULL pointer dereference at virtual address 0000004c
        EIP is at _spin_lock+0x12/0x66
        Call Trace:
         [<401766e7>] __set_page_dirty_buffers+0x15/0xc0
         [<401401e7>] set_page_dirty+0x2c/0x51
         [<40140db2>] set_page_dirty_balance+0xb/0x3b
         [<40145d29>] __do_fault+0x1d8/0x279
         [<40147059>] __handle_mm_fault+0x125/0x951
         [<401133f1>] do_page_fault+0x440/0x59f
         [<4034d0c1>] error_code+0x39/0x40
         [<08048a33>] 0x8048a33
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  13. 01 Oct, 2006 1 commit
    • [PATCH] BLOCK: Move functions out of buffer code [try #6] · cf9a2ae8
      David Howells committed
      Move some functions out of the buffering code that aren't strictly buffering
      specific.  This is a precursor to being able to disable the block layer.
      
       (*) Moved some stuff out of fs/buffer.c:
      
           (*) The file sync and general sync stuff moved to fs/sync.c.
      
           (*) The superblock sync stuff moved to fs/super.c.
      
           (*) do_invalidatepage() moved to mm/truncate.c.
      
           (*) try_to_release_page() moved to mm/filemap.c.
      
       (*) Moved some related declarations between header files:
      
           (*) declarations for do_invalidatepage() and try_to_release_page() moved
           	 to linux/mm.h.
      
           (*) __set_page_dirty_buffers() moved to linux/buffer_head.h.
      Signed-Off-By: David Howells <dhowells@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  14. 26 Sep, 2006 1 commit
    • [PATCH] mm: tracking shared dirty pages · d08b3851
      Peter Zijlstra committed
      Tracking of dirty pages in shared writeable mmap()s.
      
      The idea is simple: write protect clean shared writeable pages, catch the
      write-fault, make writeable and set dirty.  On page write-back clean all the
      PTE dirty bits and write protect them once again.
      
      The implementation is a tad harder, mainly because the default
      backing_dev_info capabilities were too loosely maintained.  Hence it is not
      enough to test the backing_dev_info for cap_account_dirty.
      
      The current heuristic is as follows, a VMA is eligible when:
       - it is shared writeable
          (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
       - it is not a 'special' mapping
          (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
       - the backing_dev_info is cap_account_dirty
          mapping_cap_account_dirty(vma->vm_file->f_mapping)
       - f_op->mmap() didn't change the default page protection
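The four conditions above can be combined into a single predicate. The following is a minimal sketch, not the kernel's code: the `DEMO_VM_*` flag values are illustrative and not taken from kernel headers, and `demo_vma`, its boolean fields, and `demo_vma_tracks_dirty()` are hypothetical stand-ins for the real VMA, `mapping_cap_account_dirty()`, and the page-protection check.

```c
#include <stdbool.h>

/* Illustrative flag values only; they are not taken from kernel headers. */
#define DEMO_VM_WRITE      0x01ul
#define DEMO_VM_SHARED     0x02ul
#define DEMO_VM_PFNMAP     0x04ul
#define DEMO_VM_INSERTPAGE 0x08ul

struct demo_vma {
    unsigned long vm_flags;
    bool bdi_accounts_dirty;    /* stands in for mapping_cap_account_dirty() */
    bool prot_is_default;       /* f_op->mmap() left the page protection alone */
};

/* Returns true when the mapping qualifies for dirty-page tracking
 * under the heuristic listed in the commit message. */
static bool demo_vma_tracks_dirty(const struct demo_vma *vma)
{
    const unsigned long shared_rw = DEMO_VM_WRITE | DEMO_VM_SHARED;
    const unsigned long special = DEMO_VM_PFNMAP | DEMO_VM_INSERTPAGE;

    if ((vma->vm_flags & shared_rw) != shared_rw)
        return false;           /* not a shared writeable mapping */
    if (vma->vm_flags & special)
        return false;           /* 'special' mapping */
    if (!vma->bdi_accounts_dirty)
        return false;           /* backing device does not account dirty pages */
    return vma->prot_is_default;
}
```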
      
      Pages from remap_pfn_range() are explicitly excluded because their COW
      semantics are already horrid enough (see vm_normal_page() in do_wp_page()) and
      because they don't have a backing store anyway.
      
      mprotect() is taught about the new behaviour as well.  However it overrides
      the last condition.
      
      Cleaning the pages on write-back is done with page_mkclean(), a new rmap call.
      It can be called on any page, but is currently only implemented for mapped
      pages; if the page is found to belong to a VMA that accounts dirty pages, it
      will also write-protect the PTE.
      
      Finally, in fs/buffer.c:try_to_free_buffers(), remove clear_page_dirty() from
      under ->private_lock.  This seems to be safe, since ->private_lock is used to
      serialize access to the buffers, not the page itself.  This is needed because
      clear_page_dirty() will call into page_mkclean() and would thereby violate
      locking order.
      
      [dhowells@redhat.com: Provide a page_mkclean() implementation for NOMMU]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  15. 01 Aug, 2006 1 commit
  16. 01 Jul, 2006 2 commits
  17. 29 Jun, 2006 1 commit
  18. 28 Jun, 2006 1 commit
  19. 23 Jun, 2006 1 commit
    • [PATCH] Kill PF_SYNCWRITE flag · b31dc66a
      Jens Axboe committed
      A process flag to indicate whether we are doing sync io is incredibly
      ugly. It also causes performance problems when one does a lot of async
      io and then proceeds to sync it. Part of the io will go out as async,
      and the other part as sync. This causes a disconnect between the
      previously submitted io and the synced io. For io schedulers such as CFQ,
      this will cause us lost merges and suboptimal behaviour in scheduling.
      
      Remove PF_SYNCWRITE completely from the fsync/msync paths, and let
      the O_DIRECT path just directly indicate that the writes are sync
      by using WRITE_SYNC instead.
      Signed-off-by: Jens Axboe <axboe@suse.de>
  20. 28 Mar, 2006 1 commit
  21. 27 Mar, 2006 5 commits
  22. 26 Mar, 2006 1 commit
  23. 24 Mar, 2006 4 commits
  24. 23 Mar, 2006 1 commit
  25. 22 Mar, 2006 1 commit
    • [PATCH] page migration reorg · b20a3503
      Christoph Lameter committed
      Centralize the page migration functions in anticipation of additional
      tinkering.  Creates a new file mm/migrate.c
      
      1. Extract buffer_migrate_page() from fs/buffer.c
      
      2. Extract central migration code from vmscan.c
      
      3. Extract some components from mempolicy.c
      
      4. Export pageout() and remove_from_swap() from vmscan.c
      
      5. Make it possible to configure NUMA systems without page migration
         and non-NUMA systems with page migration.
      
      I had to do some #ifdeffing in mempolicy.c that may need a cleanup.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>