1. 22 9月, 2006 1 次提交
    • S
      [GFS2] Tidy up meta_io code · 7276b3b0
      Steven Whitehouse 提交于
      Fix a bug in the directory reading code, where we might have dereferenced
      a NULL pointer in case of OOM. Updated the directory code to use the new
      & improved version of gfs2_meta_ra() which now returns the first block
      that was being read. Previously it was releasing it requiring following
      code to grab the block again at each point it was called.
      
      Also turned off readahead on directory lookups since we are reading a
      hash table, and therefore reading the entries in order is very
      unlikely. Readahead is still used for all other calls to the
      directory reading function (e.g. when growing the hash table).
      
      Removed the DIO_START constant. Everywhere this was used, it was
      used to unconditionally start i/o aside from a couple of places, so
      I've removed it and made the couple of exceptions to this rule into
      separate functions.
      
      Also hunted through the other DIO flags and removed them as arguments
      from functions which were always called with the same combination of
      arguments.
      
      Updated gfs2_meta_indirect_buffer to be a bit more efficient and
      hopefully also be a bit easier to read.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7276b3b0
  2. 21 9月, 2006 1 次提交
  3. 20 9月, 2006 1 次提交
  4. 19 9月, 2006 10 次提交
  5. 18 9月, 2006 1 次提交
  6. 17 9月, 2006 4 次提交
  7. 16 9月, 2006 1 次提交
  8. 15 9月, 2006 1 次提交
  9. 13 9月, 2006 2 次提交
  10. 12 9月, 2006 2 次提交
    • S
      [GFS2] Use hlist for glock hash chains · b6397893
      Steven Whitehouse 提交于
      This results in smaller list heads, so that we can have more chains
      in the same amount of memory (twice as many). I've multiplied the
      size of the table by four though - this is because we are saving
      memory by not having one lock per chain any more. So we land up
      using about the same amount of memory for the hash table as we
      did before I started these changes, the difference being that we
      now have four times as many hash chains.
      
      The reason that I say "about the same amount of memory" is that the
      actual amount now depends upon the NR_CPUS and some of the config
      variables, so that its not exact and in some cases we do use more
      memory. Eventually we might want to scale the hash table size
      according to the size of physical ram as measured on module load.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b6397893
    • S
      [GFS2] Rewrite of examine_bucket() · 24264434
      Steven Whitehouse 提交于
      The existing implementation of this function in glock.c was not
      very efficient as it relied upon keeping a cursor element upon the
      hash chain in question and moving it along. This new version improves
      upon this by using the current element as a cursor. This is possible
      since we only look at the "next" element in the list after we've
      taken the read_lock() subsequent to calling the examiner function.
      Obviously we have to eventually drop the ref count that we are then
      left with and we cannot do that while holding the read_lock, so we
      do that next time we drop the lock. That means either just before
      we examine another glock, or when the loop has terminated.
      
      The new implementation has several advantages: it uses only a
      read_lock() rather than a write_lock(), so it can run simnultaneously
      with other code, it doesn't need a "plug" element, so that it removes
      a test not only from this list iterator, but from all the other glock
      list iterators too. So it makes things faster and smaller.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      24264434
  11. 10 9月, 2006 4 次提交
  12. 09 9月, 2006 4 次提交
    • D
      [DLM] confirm master for recovered waiting requests · fa9f0e49
      David Teigland 提交于
      Fixing the following scenario:
      - A request is on the waiters list waiting for a reply from a remote node.
      - The request is the first one on the resource, so first_lkid is set.
      - The remote node fails causing recovery.
      - During recovery the requesting node becomes master.
      - The request is now processed locally instead of being a remote operation.
      - At this point we need to call confirm_master() on the resource since
        we're certain we're now the master node.  This will clear first_lkid.
      - We weren't calling confirm_master(), so first_lkid was not being cleared
        causing subsequent requests on that resource to get stuck.
      Signed-off-by: NDavid Teigland <teigland@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      fa9f0e49
    • S
      [GFS2] Move rwlocks in glock.c into their own array · 37b2fa6a
      Steven Whitehouse 提交于
      This splits the rwlocks guarding the hash chains of the glock hash
      table into their own array. This will reduce memory usage in some
      cases due to better alignment, although the real reason for doing it
      is to allow the two tables to be different sizes in future (i.e.
      the locks will be sized proportionally with the max number of CPUs
      and the hash chains sized proportinally with the size of physical memory)
      
      In order to allow this, the gl_bucket member of struct gfs2_glock has
      now become gl_hash, so we record the hash rather than a pointer to the
      bucket itself.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      37b2fa6a
    • T
      [PATCH] NFS: large non-page-aligned direct I/O clobbers memory · e9f7bee1
      Trond Myklebust 提交于
      The logic in nfs_direct_read_schedule and nfs_direct_write_schedule can
      allow data->npages to be one larger than rpages.  This causes a page
      pointer to be written beyond the end of the pagevec in nfs_read_data (or
      nfs_write_data).
      
      Fix this by making nfs_(read|write)_alloc() calculate the size of the
      pagevec array, and initialise data->npages.
      
      Also get rid of the redundant argument to nfs_commit_alloc().
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e9f7bee1
    • B
      [PATCH] ext3_getblk() should handle HOLE correctly · 3665d0e5
      Badari Pulavarty 提交于
      It has been reported that ext3_getblk() is not doing the right thing and
      triggering following WARN():
      
      BUG: warning at fs/ext3/inode.c:1016/ext3_getblk()
       <c01c5140> ext3_getblk+0x98/0x2a6  <c03b2806> md_wakeup_thread+0x26/0x2a
       <c01c536d> ext3_bread+0x1f/0x88  <c01cedf9> ext3_quota_read+0x136/0x1ae
       <c018b683> v1_read_dqblk+0x61/0xac  <c0188f32> dquot_acquire+0xf6/0x107
       <c01ceaba> ext3_acquire_dquot+0x46/0x68  <c01897d4> dqget+0x155/0x1e7
       <c018a97b> dquot_transfer+0x3e0/0x3e9  <c016fe52> dput+0x23/0x13e
       <c01c7986> ext3_setattr+0xc3/0x240  <c0120f66> current_fs_time+0x52/0x6a
       <c017320e> notify_change+0x2bd/0x30d  <c0159246> chown_common+0x9c/0xc5
       <c02a222c> strncpy_from_user+0x3b/0x68  <c0167fe6> do_path_lookup+0xdf/0x266
       <c016841b> __user_walk_fd+0x44/0x5a  <c01592b9> sys_chown+0x4a/0x55
       <c015a43c> vfs_write+0xe7/0x13c  <c01695d4> sys_mkdir+0x1f/0x23
       <c0102a97> syscall_call+0x7/0xb
      
      Looking at the code, it looks like it's not handle HOLE correctly.  It ends
      up returning -EIO.  Here is the patch to fix it.
      
      If we really want to be paranoid, we can allow return values 0 (HOLE), 1
      (we asked for one block) and return -EIO for more than 1 block.  But I
      really don't see a reason for doing it - all we need is the block# here.
      (doesn't matter how many blocks are mapped).
      
      ext3_get_blocks_handle() returns number of blocks it mapped.  It returns 0
      in case of HOLE.  ext3_getblk() should handle HOLE properly (currently its
      dumping warning stack and returning -EIO).
      Signed-off-by: NBadari Pulavarty <pbadari@us.ibm.com>
      Acked-by: NMingming Cao <cmm@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3665d0e5
  13. 08 9月, 2006 7 次提交
  14. 07 9月, 2006 1 次提交