1. 27 1月, 2007 2 次提交
  2. 12 1月, 2007 1 次提交
    • T
      [PATCH] NFS: Fix race in nfs_release_page() · e3db7691
      Trond Myklebust 提交于
          NFS: Fix race in nfs_release_page()
      
          invalidate_inode_pages2() may find the dirty bit has been set on a page
          owing to the fact that the page may still be mapped after it was locked.
          Only after the call to unmap_mapping_range() are we sure that the page
          can no longer be dirtied.
          In order to fix this, NFS has hooked the releasepage() method and tries
          to write the page out between the call to unmap_mapping_range() and the
          call to remove_mapping(). This, however leads to deadlocks in the page
          reclaim code, where the page may be locked without holding a reference
          to the inode or dentry.
      
          Fix is to add a new address_space_operation, launder_page(), which will
          attempt to write out a dirty page without releasing the page lock.
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      
          Also, the bare SetPageDirty() can skew all sort of accounting leading to
          other nasties.
      
      [akpm@osdl.org: cleanup]
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e3db7691
  3. 24 12月, 2006 1 次提交
  4. 23 12月, 2006 1 次提交
  5. 22 12月, 2006 2 次提交
    • A
      [PATCH] truncate: clear page dirtiness before running try_to_free_buffers() · 3e67c098
      Andrew Morton 提交于
      truncate presently invalidates the dirty page's buffer_heads then shoots down
      the page.  But try_to_free_buffers() will now bale out because the page is
      dirty.
      
      Net effect: the LRU gets filled with dirty pages which have invalidated
      buffer_heads attached.  They have no ->mapping and hence cannot be cleaned.
      The machine leaks memory at an enormous rate.
      
      Fix this by cleaning the page before running try_to_free_buffers(), so
      try_to_free_buffers() can do its work.
      
      Also, remember to do dirty-page-acoounting in cancel_dirty_page() so the
      machine won't wedge up trying to write non-existent dirty pages.
      
      Probably still wrong, but now less so.
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3e67c098
    • L
      VM: Remove "clear_page_dirty()" and "test_clear_page_dirty()" functions · fba2591b
      Linus Torvalds 提交于
      They were horribly easy to mis-use because of their tempting naming, and
      they also did way more than any users of them generally wanted them to
      do.
      
      A dirty page can become clean under two circumstances:
      
       (a) when we write it out.  We have "clear_page_dirty_for_io()" for
           this, and that function remains unchanged.
      
           In the "for IO" case it is not sufficient to just clear the dirty
           bit, you also have to mark the page as being under writeback etc.
      
       (b) when we actually remove a page due to it becoming inaccessible to
           users, notably because it was truncate()'d away or the file (or
           metadata) no longer exists, and we thus want to cancel any
           outstanding dirty state.
      
      For the (b) case, we now introduce "cancel_dirty_page()", which only
      touches the page state itself, and verifies that the page is not mapped
      (since cancelling writes on a mapped page would be actively wrong as it
      is still accessible to users).
      
      Some filesystems need to be fixed up for this: CIFS, FUSE, JFS,
      ReiserFS, XFS all use the old confusing functions, and will be fixed
      separately in subsequent commits (with some of them just removing the
      offending logic, and others using clear_page_dirty_for_io()).
      
      This was confirmed by Martin Michlmayr to fix the apt database
      corruption on ARM.
      
      Cc: Martin Michlmayr <tbm@cyrius.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Andrei Popa <andrei.popa@i-neo.ro>
      Cc: Andrew Morton <akpm@osdl.org>
      Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      Cc: Gordon Farquharson <gordonfarquharson@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fba2591b
  6. 11 12月, 2006 1 次提交
  7. 17 10月, 2006 1 次提交
  8. 12 10月, 2006 2 次提交
  9. 01 10月, 2006 3 次提交
    • A
      [PATCH] invalidate_inode_pages2(): ignore page refcounts · bd4c8ce4
      Andrew Morton 提交于
      The recent fix to invalidate_inode_pages() (git commit 016eb4a0) managed to
      unfix invalidate_inode_pages2().
      
      The problem is that various bits of code in the kernel can take transient refs
      on pages: the page scanner will do this when inspecting a batch of pages, and
      the lru_cache_add() batching pagevecs also hold a ref.
      
      Net result is transient failures in invalidate_inode_pages2().  This affects
      NFS directory invalidation (observed) and presumably also block-backed
      direct-io (not yet reported).
      
      Fix it by reverting invalidate_inode_pages2() back to the old version which
      ignores the page refcounts.
      
      We may come up with something more clever later, but for now we need a 2.6.18
      fix for NFS.
      
      Cc: Chuck Lever <cel@citi.umich.edu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      bd4c8ce4
    • D
      [PATCH] BLOCK: Make it possible to disable the block layer [try #6] · 9361401e
      David Howells 提交于
      Make it possible to disable the block layer.  Not all embedded devices require
      it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
      the block layer to be present.
      
      This patch does the following:
      
       (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
           support.
      
       (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
           an item that uses the block layer.  This includes:
      
           (*) Block I/O tracing.
      
           (*) Disk partition code.
      
           (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
      
           (*) The SCSI layer.  As far as I can tell, even SCSI chardevs use the
           	 block layer to do scheduling.  Some drivers that use SCSI facilities -
           	 such as USB storage - end up disabled indirectly from this.
      
           (*) Various block-based device drivers, such as IDE and the old CDROM
           	 drivers.
      
           (*) MTD blockdev handling and FTL.
      
           (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
           	 taking a leaf out of JFFS2's book.
      
       (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
           linux/elevator.h contingent on CONFIG_BLOCK being set.  sector_div() is,
           however, still used in places, and so is still available.
      
       (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
           parts of linux/fs.h.
      
       (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
      
       (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
      
       (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
           is not enabled.
      
       (*) fs/no-block.c is created to hold out-of-line stubs and things that are
           required when CONFIG_BLOCK is not set:
      
           (*) Default blockdev file operations (to give error ENODEV on opening).
      
       (*) Makes some /proc changes:
      
           (*) /proc/devices does not list any blockdevs.
      
           (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
      
       (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
      
       (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
           given command other than Q_SYNC or if a special device is specified.
      
       (*) In init/do_mounts.c, no reference is made to the blockdev routines if
           CONFIG_BLOCK is not defined.  This does not prohibit NFS roots or JFFS2.
      
       (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
           error ENOSYS by way of cond_syscall if so).
      
       (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
           CONFIG_BLOCK is not set, since they can't then happen.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      9361401e
    • D
      [PATCH] BLOCK: Move functions out of buffer code [try #6] · cf9a2ae8
      David Howells 提交于
      Move some functions out of the buffering code that aren't strictly buffering
      specific.  This is a precursor to being able to disable the block layer.
      
       (*) Moved some stuff out of fs/buffer.c:
      
           (*) The file sync and general sync stuff moved to fs/sync.c.
      
           (*) The superblock sync stuff moved to fs/super.c.
      
           (*) do_invalidatepage() moved to mm/truncate.c.
      
           (*) try_to_release_page() moved to mm/filemap.c.
      
       (*) Moved some related declarations between header files:
      
           (*) declarations for do_invalidatepage() and try_to_release_page() moved
           	 to linux/mm.h.
      
           (*) __set_page_dirty_buffers() moved to linux/buffer_head.h.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      cf9a2ae8
  10. 27 9月, 2006 1 次提交
  11. 09 9月, 2006 1 次提交
    • A
      [PATCH] invalidate_complete_page() race fix · 016eb4a0
      Andrew Morton 提交于
      If a CPU faults this page into pagetables after invalidate_mapping_pages()
      checked page_mapped(), invalidate_complete_page() will still proceed to remove
      the page from pagecache.  This leaves the page-faulting process with a
      detached page.  If it was MAP_SHARED then file data loss will ensue.
      
      Fix that up by checking the page's refcount after taking tree_lock.
      
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      016eb4a0
  12. 23 6月, 2006 1 次提交
    • N
      [PATCH] Remove semi-softlockup from invalidate_mapping_pages · e0f23603
      NeilBrown 提交于
      If invalidate_mapping_pages is called to invalidate a very large mapping
      (e.g.  a very large block device) and if the only active page in that
      device is near the end (or at least, at a very large index), such as, say,
      the superblock of an md array, and if that page happens to be locked when
      invalidate_mapping_pages is called, then
      
        pagevec_lookup will return this page and
        as it is locked, 'next' will be incremented and pagevec_lookup
        will be called again. and again. and again.
        while we count from 0 upto a very large number.
      
      We should really always set 'next' to 'page->index+1' before going around
      the loop again, not just if the page isn't locked.
      
      Cc: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
      Signed-off-by: NNeil Brown <neilb@suse.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e0f23603
  13. 10 1月, 2006 1 次提交
  14. 09 1月, 2006 1 次提交
    • A
      [PATCH] drop-pagecache · 9d0243bc
      Andrew Morton 提交于
      Add /proc/sys/vm/drop_caches.  When written to, this will cause the kernel to
      discard as much pagecache and/or reclaimable slab objects as it can.  THis
      operation requires root permissions.
      
      It won't drop dirty data, so the user should run `sync' first.
      
      Caveats:
      
      a) Holds inode_lock for exorbitant amounts of time.
      
      b) Needs to be taught about NUMA nodes: propagate these all the way through
         so the discarding can be controlled on a per-node basis.
      
      This is a debugging feature: useful for getting consistent results between
      filesystem benchmarks.  We could possibly put it under a config option, but
      it's less than 300 bytes.
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      9d0243bc
  15. 07 1月, 2006 1 次提交
    • H
      [PATCH] reiser4: vfs: add truncate_inode_pages_range() · d7339071
      Hans Reiser 提交于
      This patch makes truncate_inode_pages_range from truncate_inode_pages.
      truncate_inode_pages became a one-liner call to truncate_inode_pages_range.
      
      Reiser4 needs truncate_inode_pages_ranges because it tries to keep
      correspondence between existences of metadata pointing to data pages and pages
      to which those metadata point to.  So, when metadata of certain part of file
      is removed from filesystem tree, only pages of corresponding range are to be
      truncated.
      
      (Needed by the madvise(MADV_REMOVE) patch)
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d7339071
  16. 24 11月, 2005 1 次提交
  17. 31 10月, 2005 1 次提交
    • J
      [PATCH] ext3: Fix unmapped buffers in transaction's lists · aaa4059b
      Jan Kara 提交于
      Fix the problem (BUG 4964) with unmapped buffers in transaction's
      t_sync_data list.  The problem is we need to call filesystem's own
      invalidatepage() from block_write_full_page().
      
      block_write_full_page() must call filesystem's invalidatepage().  Otherwise
      following nasty race can happen:
      
         proc 1                                        proc 2
         ------                                        ------
      - write some new data to 'offset'
        => bh gets to the transactions data list
                                                    - starts truncate
                                                      => i_size set to new size
      - mpage_writepages()
        - ext3_ordered_writepage() to 'offset'
          - block_write_full_page()
            - page->index > end_index+1
              - block_invalidatepage()
                - discard_buffer()
                  - clear_buffer_mapped()
      
      - commit triggers and finds unmapped buffer - BOOM!
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      aaa4059b
  18. 01 5月, 2005 1 次提交
  19. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4