1. 29 Mar, 2009 4 commits
    • H
      xfs: pagecache usage optimization · bddaafa1
      Hisashi Hifumi committed
      Hi.
      
      I introduced the "is_partially_uptodate" aop for XFS.
      
      A page can have multiple buffers, and even if a page is not uptodate,
      some of its buffers can be uptodate in a pagesize != blocksize environment.
      
      This aop checks that all buffers which correspond to the part of a file
      that we want to read are uptodate. If so, we do not have to issue actual
      read IO to the HDD even if the page is not uptodate, because the portion we
      want to read is uptodate.
      
      The "block_is_partially_uptodate" function is already used by ext2/3/4.
      With this patch, random read/write mixed workloads or random read
      after random write workloads can be optimized, and performance
      improves.
      
      I did a performance test using sysbench.
      
      #sysbench --num-threads=4 --max-requests=100000 --test=fileio --file-num=1 \
      --file-block-size=8K --file-total-size=1G --file-test-mode=rndrw \
      --file-fsync-freq=0 --file-rw-ratio=0.5 run
      
      -2.6.29-rc6
      Test execution summary:
          total time:                          123.8645s
          total number of events:              100000
          total time taken by event execution: 442.4994
          per-request statistics:
               min:                            0.0000s
               avg:                            0.0044s
               max:                            0.3387s
               approx.  95 percentile:         0.0118s
      
      -2.6.29-rc6-patched
      Test execution summary:
          total time:                          108.0757s
          total number of events:              100000
          total time taken by event execution: 417.7505
          per-request statistics:
               min:                            0.0000s
               avg:                            0.0042s
               max:                            0.3217s
               approx.  95 percentile:         0.0118s
      
      arch: ia64
      pagesize: 16k
      blocksize: 4k
      Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Felix Blyakher <felixb@sgi.com>
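The check described in the commit above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: `struct page_model` and `range_is_uptodate` are hypothetical names invented for illustration, and the code is not the kernel's `block_is_partially_uptodate` implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a page whose blocks track their own state. */
#define MAX_BLOCKS_PER_PAGE 16

struct page_model {
    size_t page_size;   /* e.g. 16384 on the ia64 test machine */
    size_t block_size;  /* e.g. 4096 */
    bool   block_uptodate[MAX_BLOCKS_PER_PAGE];
};

/*
 * Return true when every block overlapping [offset, offset + len) within
 * the page is uptodate, so the read could be served without disk I/O.
 */
bool range_is_uptodate(const struct page_model *pg, size_t offset, size_t len)
{
    assert(pg->block_size > 0 && pg->page_size % pg->block_size == 0);
    assert(pg->page_size / pg->block_size <= MAX_BLOCKS_PER_PAGE);

    if (len == 0 || offset >= pg->page_size)
        return false;
    if (len > pg->page_size - offset)
        len = pg->page_size - offset;   /* clamp the range to this page */

    size_t first = offset / pg->block_size;
    size_t last  = (offset + len - 1) / pg->block_size;

    for (size_t i = first; i <= last; i++) {
        if (!pg->block_uptodate[i])
            return false;               /* at least one block needs real I/O */
    }
    return true;
}
```

With the 16k page and 4k block sizes from the test setup, a page holds four blocks, and an 8K read at offset 0 touches only the first two. If those two are uptodate, the model reports that the read can be satisfied even though the page as a whole is not uptodate.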
    • C
      xfs: remove m_litino · 6447c362
      Christoph Hellwig committed
      With the upcoming v3 inodes the inode data/attr area size needs to be
      calculated for each specific inode, so we can't cache it in the superblock
      anymore.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
      Reviewed-by: Felix Blyakher <felixb@sgi.com>
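To illustrate why a single cached value stops working, here is a minimal sketch of a per-inode calculation. The function name `inode_literal_area` and both core-size constants are assumptions made for this example; the commit message does not state them.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMED on-disk inode core sizes, used for illustration only. */
#define CORE_SIZE_V2 100u
#define CORE_SIZE_V3 176u

/*
 * Hypothetical helper: the space left for the data/attr forks depends on
 * the inode's own version, so it is computed per inode instead of being
 * read from one value cached for the whole filesystem.
 */
size_t inode_literal_area(size_t inode_size, int version)
{
    size_t core = (version >= 3) ? CORE_SIZE_V3 : CORE_SIZE_V2;

    assert(inode_size > core);
    return inode_size - core;
}
```

Under these assumed sizes, two inodes of the same on-disk size but different versions get different literal areas, which is exactly the case a superblock-wide cache cannot express.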
    • C
      xfs: kill ino64 mount option · a19d9f88
      Christoph Hellwig committed
      The ino64 mount option adds a fixed offset to 32bit inode numbers
      to bring them into the 64bit range.  There's no need for this kind
      of debug tool given that it's easy to produce real 64bit inode numbers
      for testing.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
      Reviewed-by: Felix Blyakher <felixb@sgi.com>
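The behaviour of the removed option can be pictured with a short sketch. The offset value and the name `debug_ino64` are hypothetical; the commit message says only that a fixed offset was added, not what it was.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical offset: the value used by the removed option is not given above. */
#define DEBUG_INO64_OFFSET (UINT64_C(1) << 32)

/* Push a 32-bit inode number into the 64-bit range by adding a fixed offset. */
uint64_t debug_ino64(uint32_t ino)
{
    uint64_t shifted = (uint64_t)ino + DEBUG_INO64_OFFSET;

    assert(shifted > UINT32_MAX);   /* result no longer fits in 32 bits */
    return shifted;
}
```

Because the result is only a shifted copy of a 32-bit number, it exercises the 64-bit code paths without the filesystem ever having allocated a genuinely large inode number, which is why the commit treats it as a dispensable debug tool.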
    • C
      xfs: kill mutex_t typedef · a0b0b8a5
      Christoph Hellwig committed
      People continue to complain about this for weird reasons, but there's
      really no point in keeping this typedef for a couple of users anyway.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
      Reviewed-by: Felix Blyakher <felixb@sgi.com>
  2. 23 Mar, 2009 3 commits
  3. 20 Mar, 2009 3 commits
  4. 18 Mar, 2009 2 commits
    • B
      NFSD: provide encode routine for OP_OPENATTR · 84f09f46
      Benny Halevy committed
      Although this operation is unsupported by our implementation
      we still need to provide an encode routine for it to
      merely encode its (error) status back in the compound reply.
      
      Thanks to Bill Baker at sun.com for testing with Sun's
      OpenSolaris client, and for finding and reporting this bug at
      Connectathon 2009.
      
      This bug was introduced in 2.6.27.
      Signed-off-by: Benny Halevy <bhalevy@panasas.com>
      Cc: stable@kernel.org
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
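The idea of giving an unsupported operation its own encoder can be sketched with a small dispatch table. Everything here is a hypothetical model: the opcode and status values, `struct reply`, and the function names are invented for illustration and are not the nfsd code or the NFSv4 wire values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes and status codes; not the real NFSv4 protocol values. */
enum model_op { MODEL_OP_READ, MODEL_OP_OPENATTR, MODEL_OP_COUNT };
#define MODEL_OK      0u
#define MODEL_NOTSUPP 1u

#define REPLY_WORDS 8

struct reply {
    uint32_t words[REPLY_WORDS];
    size_t   len;
};

static void reply_put(struct reply *r, uint32_t word)
{
    assert(r->len < REPLY_WORDS);
    r->words[r->len++] = word;
}

typedef void (*encode_fn)(struct reply *r, uint32_t status);

/* A supported operation appends a result body when it succeeded. */
static void encode_read(struct reply *r, uint32_t status)
{
    if (status == MODEL_OK)
        reply_put(r, 0);   /* placeholder result body */
}

/* An unsupported operation has no body, but still needs a table entry. */
static void encode_noop(struct reply *r, uint32_t status)
{
    (void)r;
    (void)status;
}

static const encode_fn encoders[MODEL_OP_COUNT] = {
    [MODEL_OP_READ]     = encode_read,
    [MODEL_OP_OPENATTR] = encode_noop,
};

/* Encode opcode and status, then the per-operation body; false if no encoder. */
bool encode_operation(struct reply *r, enum model_op op, uint32_t status)
{
    if ((unsigned)op >= MODEL_OP_COUNT || encoders[op] == NULL)
        return false;

    reply_put(r, (uint32_t)op);
    reply_put(r, status);
    encoders[op](r, status);
    return true;
}
```

In this model, leaving the `MODEL_OP_OPENATTR` slot empty would make the dispatcher fail for a request it should simply answer with an error status. Registering a no-op encoder lets the opcode and status reach the compound reply with no body attached.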
    • L
      Avoid 64-bit "switch()" statements on 32-bit architectures · ee568b25
      Linus Torvalds committed
      Commit ee6f779b ("filp->f_pos not
      correctly updated in proc_task_readdir") changed the proc code to use
      filp->f_pos directly, rather than through a temporary variable.  In the
      process, that caused the operations to be done on the full 64 bits, even
      though the offset is never that big.
      
      That's all fine and dandy per se, but for some unfathomable reason gcc
      generates absolutely horrid code when using 64-bit values in switch()
      statements.  To the point of actually calling out to gcc helper
      functions like __cmpdi2 rather than just doing the trivial comparisons
      directly the way gcc does for normal compares.  At which point we get
      link failures, because we really don't want to support that kind of
      crazy code.
      
      Fix this by just casting the f_pos value to "unsigned long", which
      is plenty big enough for /proc, and avoids the gcc code generation issue.
      Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Zhang Le <r0bertz@gentoo.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
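The cast described above follows a simple pattern, sketched here in isolation. `classify_pos` and its return values are hypothetical stand-ins, not the /proc readdir code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a position-dependent readdir step.
 * The 64-bit offset is narrowed to unsigned long before the switch, so on
 * a 32-bit target the comparisons are made on a native-width value.
 */
int classify_pos(int64_t f_pos)
{
    assert(f_pos >= 0);

    unsigned long pos = (unsigned long)f_pos;

    switch (pos) {
    case 0:
        return 0;   /* first special entry */
    case 1:
        return 1;   /* second special entry */
    default:
        return 2;   /* everything after that */
    }
}
```

The narrowing is safe only because the caller knows the offset never exceeds what `unsigned long` can hold, which is the same argument the commit makes for /proc.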
  5. 17 Mar, 2009 1 commit
    • E
      ext4: fix bb_prealloc_list corruption due to wrong group locking · d33a1976
      Eric Sandeen committed
      This is for Red Hat bug 490026: EXT4 panic, list corruption in
      ext4_mb_new_inode_pa
      
      ext4_lock_group(sb, group) is supposed to protect this list for
      each group, and a common code flow to remove a pa is like
      this:
      
          ext4_get_group_no_and_offset(sb, pa->pa_pstart, &grp, NULL);
          ext4_lock_group(sb, grp);
          list_del(&pa->pa_group_list);
          ext4_unlock_group(sb, grp);
      
      so it's critical that we get the right group number back for
      this prealloc context, to lock the right group (the one 
      associated with this pa) and prevent concurrent list manipulation.
      
      However, ext4_mb_put_pa() passes in (pa->pa_pstart - 1) with a
      comment, "-1 is to protect from crossing allocation group".
      
      This makes sense for the group_pa, where pa_pstart is advanced
      by the length which has been used (in ext4_mb_release_context()),
      and when the entire length has been used, pa_pstart has been
      advanced to the first block of the next group.
      
      However, for inode_pa, pa_pstart is never advanced; it's just
      set once to the first block in the group and not moved after
      that.  So in this case, if we subtract one in ext4_mb_put_pa(),
      we are actually locking the *previous* group, and opening the
      race with the other threads which do not subtract off the extra
      block.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
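The off-by-one effect on the group number can be reproduced with plain integer arithmetic. This sketch assumes a simplified mapping in which group numbers are just `block / blocks_per_group`; `group_of_block` is a hypothetical helper, not `ext4_get_group_no_and_offset`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified, assumed mapping from a block number to its group number.
 * Any offset for the first data block is ignored in this model.
 */
uint32_t group_of_block(uint64_t block, uint32_t blocks_per_group)
{
    assert(blocks_per_group > 0);
    return (uint32_t)(block / blocks_per_group);
}
```

Take 512 blocks per group. A group pa in group 1 whose whole length has been used has `pa_pstart` advanced to block 1024, so `group_of_block(1024 - 1, 512)` correctly yields group 1. An inode pa that starts at block 1024 belongs to group 2, yet the same subtraction also yields group 1, so the wrong group's lock is taken while other paths lock group 2.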
  6. 16 Mar, 2009 8 commits
  7. 15 Mar, 2009 6 commits
  8. 14 Mar, 2009 1 commit
    • E
      ext4: fix bogus BUG_ONs in mballoc code · 8d03c7a0
      Eric Sandeen committed
      Thiemo Nagel reported that:
      
      # dd if=/dev/zero of=image.ext4 bs=1M count=2
      # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
        -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
      # mount -o loop image.ext4 mnt/
      # dd if=/dev/zero of=mnt/file
      
      oopsed, with a BUG_ON in ext4_mb_normalize_request because
      size == EXT4_BLOCKS_PER_GROUP
      
      It appears to me (esp. after talking to Andreas) that the BUG_ON
      is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
      be allowed, though larger sizes do indicate a problem.
      
      Fix that and another (apparently rare) codepath with a similar check.
      Reported-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
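The difference between the bogus and the relaxed check comes down to one comparison operator. The two predicates below are a sketch under the assumption that the old assertion rejected `size == EXPT4_BLOCKS_PER_GROUP`-style equality as the report describes; the function names are hypothetical and the exact form of the kernel checks is not shown in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of the old check: a full-group request is treated as a bug. */
bool request_size_ok_old(uint32_t size, uint32_t blocks_per_group)
{
    assert(blocks_per_group > 0);
    return size < blocks_per_group;
}

/* Relaxed check: exactly one group's worth of blocks is allowed. */
bool request_size_ok_fixed(uint32_t size, uint32_t blocks_per_group)
{
    assert(blocks_per_group > 0);
    return size <= blocks_per_group;
}
```

With `-g 512` as in the reproducer, a request of exactly 512 blocks fails the old predicate and passes the relaxed one, while a request of 513 blocks is still rejected by both.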
  9. 13 Mar, 2009 9 commits
  10. 12 Mar, 2009 2 commits
  11. 11 Mar, 2009 1 commit