1. 13 May, 2009 1 commit
  2. 14 May, 2009 3 commits
    • ext4: Add documentation to the ext4_*get_block* functions · b920c755
      Committed by Theodore Ts'o
      This adds more documentation to various internal functions in
      fs/ext4/inode.c, most notably ext4_ind_get_blocks(),
      ext4_da_get_block_write(), ext4_da_get_block_prep(),
      ext4_normal_get_block_write().
      
      In addition, the static function ext4_normal_get_block_write() has
      been renamed noalloc_get_block_write(), since it is used in many
      places far beyond ext4_normal_writepage().
      
      Plenty of warnings have been added to the noalloc_get_block_write()
      function, since the way it is used is amazingly fragile.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      b920c755
    • ext4: Define a new set of flags for ext4_get_blocks() · c2177057
      Committed by Theodore Ts'o
      The functions ext4_get_blocks(), ext4_ext_get_blocks(), and
      ext4_ind_get_blocks() took an ad-hoc set of integer arguments that
      served as boolean flags.  Use a single flags parameter and a
      standard set of bitfield flags instead.  This saves space on the
      call stack, and it also makes the code a bit more understandable.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      c2177057
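
The change described above can be sketched in plain C. This is only an illustration of the pattern (several int arguments used as booleans collapsed into one flags word); the flag names, values, and both functions are hypothetical and are not the kernel's actual definitions.

```c
#include <assert.h>

/* Hypothetical flag names and values, for illustration only. */
enum get_blocks_flags {
    GB_CREATE          = 1 << 0, /* allocate blocks if none are mapped */
    GB_EXTEND_DISKSIZE = 1 << 1, /* grow the on-disk size if needed */
    GB_UPDATE_RESERVE  = 1 << 2, /* adjust the delayed-allocation reservation */
    GB_ALL             = GB_CREATE | GB_EXTEND_DISKSIZE | GB_UPDATE_RESERVE
};

/* Before: three separate int arguments, each used as a boolean.
 * Returns how many options were requested. */
static int lookup_old(int create, int extend_disksize, int update_reserve)
{
    return (create != 0) + (extend_disksize != 0) + (update_reserve != 0);
}

/* After: one flags word carries the same information. */
static int lookup_new(unsigned int flags)
{
    int requested = 0;

    assert((flags & ~(unsigned int)GB_ALL) == 0); /* reject unknown bits */
    if (flags & GB_CREATE)
        requested++;
    if (flags & GB_EXTEND_DISKSIZE)
        requested++;
    if (flags & GB_UPDATE_RESERVE)
        requested++;
    return requested;
}
```

A call such as lookup_old(1, 0, 1) becomes lookup_new(GB_CREATE | GB_UPDATE_RESERVE), which names each option at the call site instead of relying on argument position.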
    • ext4: Rename ext4_get_blocks_wrap() to be ext4_get_blocks() · 12b7ac17
      Committed by Theodore Ts'o
      Another function rename for clarity's sake.  The _wrap suffix simply
      confuses people, and didn't add much for people trying to follow the
      code paths.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      12b7ac17
  3. 12 May, 2009 2 commits
  4. 15 May, 2009 1 commit
  5. 03 May, 2009 2 commits
  6. 02 May, 2009 1 commit
  7. 14 May, 2009 1 commit
  8. 03 May, 2009 1 commit
  9. 02 May, 2009 3 commits
  10. 04 May, 2009 1 commit
  11. 02 May, 2009 2 commits
    • ext4: Move the ext4_i.h header file into ext4.h · d444c3c3
      Committed by Theodore Ts'o
      There is no longer a reason for a separate ext4_i.h header file, so
      move it into ext4.h just to make life easier for developers to find
      the relevant data structures and typedefs.  Should also speed up
      compiles slightly, too.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      d444c3c3
    • ext4: Don't avoid using BLOCK_UNINIT block groups in mballoc · 75507efb
      Committed by Theodore Ts'o
      By avoiding the use of not-yet-used block groups (i.e., block groups
      with the BLOCK_UNINIT flag), mballoc had a tendency to create large
      files with large non-contiguous gaps.  In addition avoiding the use of
      new block groups had a tendency to push regular file data into the
      first block group in a flex_bg group, which slows down the speed of
      e2fsck pass 2, since it has a tendency to seek much more.  For
      example:
      
                     Before Patch                       After Patch
                    Time in seconds                   Time in seconds
                  Real /  User/  Sys   MB/s      Real /  User/  Sys    MB/s
      Pass 1      8.52 / 2.21 / 0.46  20.43      8.84 / 4.97 / 1.11   19.68
      Pass 2     21.16 / 1.02 / 1.86  11.30      6.54 / 1.77 / 1.78   36.39
      Pass 3      0.01 / 0.00 / 0.00 139.00      0.01 / 0.01 / 0.00  128.90
      Pass 4      0.16 / 0.15 / 0.00   0.00      0.17 / 0.17 / 0.00    0.00
      Pass 5      2.52 / 1.99 / 0.09   0.79      2.31 / 1.78 / 0.06    0.86
      Total      32.40 / 5.11 / 2.49  12.81     17.99 / 8.75 / 2.98   23.01
      
      This was on a sample 80 gig root filesystem which was approximately
      50% full.  Note the improved e2fsck pass 2 performance, by over a
      factor of 3, due to a decreased number of seeks.  (The total amount of
      I/O in pass 2 was unchanged; the layout of the directory blocks was
      simply much better from e2fsck's perspective.)
      
      Other changes as a result of this patch on this sample filesystem:
      
                                   Before Patch    After Patch
      # of non-contig files           762             779
      # of non-contig directories     571             570
      # of BLOCK_UNINIT bg's          307             293
      # of INODE_UNINIT bg's          503             503
      
      Out of 640 block groups, of which 333 were in use, this patch caused
      an extra 14 block groups to be utilized.  The number of non-contiguous
      files did go up slightly, but when measured against the 99.9% of the
      files (603,154) which were contiguously allocated, this is pretty
      insignificant.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Andreas Dilger <adilger@sun.com>
      75507efb
  12. 26 April, 2009 2 commits
  13. 01 May, 2009 1 commit
    • ext4: ext4_mark_recovery_complete() doesn't need to use lock_super · a63c9eb2
      Committed by Theodore Ts'o
      The function ext4_mark_recovery_complete() is called from two call
      paths: either (a) while mounting the filesystem, in which case there's
      no danger of any other CPU calling write_super() until the mount is
      completed, or (b) while remounting the filesystem read-write, in
      which case the fs core has already locked the superblock.  This also
      allows us to take out a very vile unlock_super()/lock_super() pair in
      ext4_remount().
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a63c9eb2
  14. 26 April, 2009 1 commit
  15. 01 May, 2009 1 commit
    • ext4: Avoid races caused by on-line resizing and SMP memory reordering · 8df9675f
      Committed by Theodore Ts'o
      Ext4's on-line resizing adds a new block group and then, only at the
      last step, adjusts s_groups_count.  However, it's possible on SMP
      systems that another CPU could see the updated s_groups_count and
      not see the newly initialized data structures for the just-added block
      group.  For this reason, it's important to insert an SMP read barrier
      after reading s_groups_count and before reading any data (for example,
      the new block group descriptors) allowed by the increased value of
      s_groups_count.
      
      Unfortunately, we rather blatantly violate this locking protocol
      documented in fs/ext4/resize.c.  Fortunately, (1) on-line resizes
      happen relatively rarely, and (2) it seems rare that the filesystem
      code will immediately try to use a just-added block group before any
      memory ordering issues resolve themselves.  So apparently problems
      here are relatively hard to hit, since ext3 has been vulnerable to the
      same issue for years with no one apparently complaining.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      8df9675f
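
The ordering requirement in this message can be illustrated in userspace with C11 atomics. This is a rough analogy rather than the kernel code: the kernel would use its own barrier primitives, and every name below (groups, groups_count, add_group, total_free) is hypothetical. The writer fills in the new entry before publishing the larger count with release ordering, and the reader loads the count with acquire ordering before touching any entry that count permits.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_GROUPS 8

/* Hypothetical stand-in for a block group descriptor. */
struct group_desc {
    int free_blocks;
};

static struct group_desc groups[MAX_GROUPS];
static atomic_uint groups_count; /* plays the role of s_groups_count */

/* Writer (resize path): initialize the new group first, and only then
 * publish the larger count.  The release store keeps the initialization
 * from being reordered after the count update. */
static int add_group(int free_blocks)
{
    unsigned int n = atomic_load_explicit(&groups_count, memory_order_relaxed);

    if (n >= MAX_GROUPS)
        return -1;
    groups[n].free_blocks = free_blocks;
    atomic_store_explicit(&groups_count, n + 1, memory_order_release);
    return 0;
}

/* Reader: the acquire load acts like "read the count, then a read
 * barrier", so every descriptor below the count is seen initialized. */
static long total_free(void)
{
    unsigned int n = atomic_load_explicit(&groups_count, memory_order_acquire);
    long sum = 0;

    assert(n <= MAX_GROUPS);
    for (unsigned int i = 0; i < n; i++)
        sum += groups[i].free_blocks;
    return sum;
}
```

The sketch assumes a single writer; it shows only the publish/consume ordering, not the locking a real resize path would also need.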
  16. 02 May, 2009 1 commit
  17. 01 May, 2009 2 commits
    • ext4: Fix and simplify s_dirt handling · 7234ab2a
      Committed by Theodore Ts'o
      The s_dirt flag wasn't completely handled correctly, but it didn't
      really matter when journalling was enabled.  It turns out that when
      ext4 runs without a journal, we don't clear s_dirt in places where we
      should have, with the result that the high-level write_super()
      function was writing the superblock when it wasn't necessary.
      
      So we fix this by making ext4_commit_super() clear the s_dirt flag,
      and removing many of the other places where s_dirt is manipulated.
      When journalling is enabled, the s_dirt flag might be left set more
      often, but s_dirt really doesn't matter when journalling is enabled.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      7234ab2a
    • ext4: Simplify ext4_commit_super()'s function signature · e2d67052
      Committed by Theodore Ts'o
      The ext4_commit_super() function took both a struct super_block * and
      a struct ext4_super_block *, but the struct ext4_super_block can be
      derived from the struct super_block.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      e2d67052
  18. 25 April, 2009 1 commit
  19. 28 April, 2009 1 commit
    • ext4: Fallback to vmalloc if kmalloc can't allocate s_flex_groups array · c5ca7c76
      Committed by Theodore Ts'o
      For very large filesystems, the s_flex_groups array can get quite big.
      For example, a filesystem that can be resized up to 16TB will have
      8192 flex groups (assuming the default flex_bg size of 16), so the
      array is 96k, which is *very* marginal for kmalloc().  On the other
      hand, a 160GB filesystem without the resize_inode feature will only
      require 960 bytes.  So we try to allocate the array first using
      kmalloc(), and if that fails, we'll try to use vmalloc() instead.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      c5ca7c76
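
The sizes quoted in this message can be reproduced with a little arithmetic, and the fallback itself is a simple try-then-retry pattern. The sketch below is a userspace illustration under stated assumptions: 128 MiB block groups (the usual figure for 4 KiB blocks) and 12 bytes per array entry (inferred from 96k / 8192, not taken from the source). The allocator stand-ins and the 64 KiB failure threshold are hypothetical; they only simulate a primary allocator that refuses large requests.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define GROUP_BYTES  (128ULL << 20) /* assumed: 128 MiB per block group */
#define FLEX_BG_SIZE 16ULL          /* default flex_bg size per the message */
#define ENTRY_BYTES  12ULL          /* assumed: inferred from 96k / 8192 */

/* Size in bytes of the flex groups array for a filesystem of fs_bytes. */
static uint64_t flex_array_bytes(uint64_t fs_bytes)
{
    uint64_t groups = fs_bytes / GROUP_BYTES;
    uint64_t flex_groups = (groups + FLEX_BG_SIZE - 1) / FLEX_BG_SIZE;

    return flex_groups * ENTRY_BYTES;
}

typedef void *(*alloc_fn)(size_t);

/* Hypothetical stand-in for an allocator that fails on large requests. */
static void *limited_alloc(size_t size)
{
    return size > 64 * 1024 ? NULL : malloc(size);
}

/* Try the primary allocator first and fall back only if it fails.
 * *used_fallback records which allocator satisfied the request. */
static void *alloc_with_fallback(size_t size, alloc_fn primary,
                                 alloc_fn fallback, int *used_fallback)
{
    void *p;

    assert(primary && fallback && used_fallback);
    p = primary(size);
    *used_fallback = 0;
    if (!p) {
        p = fallback(size);
        *used_fallback = (p != NULL);
    }
    return p;
}
```

In the kernel, memory obtained from the two allocators must be released with the matching free function, so real code has to remember which path succeeded; the used_fallback output plays that role here.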
  20. 13 May, 2009 2 commits
    • ext4: Mark the unwritten buffer_head as mapped during write_begin · 29fa89d0
      Committed by Aneesh Kumar K.V
      Setting BH_Unwritten buffer_heads as BH_Mapped avoids multiple
      (unnecessary) calls to get_block() during the call to the write(2)
      system call.  Setting BH_Unwritten buffer heads as BH_Mapped requires
      that the writepages() functions can handle BH_Unwritten buffer_heads.
      
      After this commit, things work as follows:
      
      ext4_ext_get_block() returns an unmapped, unwritten buffer head when
      called with create = 0 for prealloc space. This makes sure we handle
      the read path and non-delayed allocation case correctly.  Even though
      the buffer head is marked unmapped, we have valid b_blocknr and b_bdev
      values in the buffer_head.
      
      ext4_da_get_block_prep() called for block reservation will now return
      a mapped, unwritten, new buffer_head for prealloc space. This avoids
      multiple calls to get_block() for writes to the same offset. By marking
      such buffers as BH_New, we also ensure that sub-block zeroing of
      buffered writes happens correctly.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      29fa89d0
    • vfs: Add BUG_ON for delayed and unwritten flags in submit_bh() · 8fb0e342
      Committed by Aneesh Kumar K.V
      The BH_Delay and BH_Unwritten flags should never leak out to
      submit_bh().  So add some BUG_ON() checks to submit_bh so we can get a
      stack trace and determine how and why this might have happened.
      
      (Note that only XFS and ext4 use these buffer head flags, and XFS does
      not use submit_bh().  So this patch should only modify behavior for
      ext4.)
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
      8fb0e342
  21. 14 May, 2009 1 commit
  22. 03 June, 2009 6 commits
  23. 02 June, 2009 3 commits
    • net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel... · 12186be7
      Committed by Minoru Usui
      net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup
      
      This patch fixes a bug in which an unconfigured struct tcf_proto
      stays chained in tc_ctl_tfilter(), and avoids a kernel panic in
      cls_cgroup_classify() when cls_cgroup is used.
      
      When we execute 'tc filter add', a tcf_proto is allocated, initialized
      by the classifier's init(), and chained.  After it is chained,
      tc_ctl_tfilter() calls the classifier's change().  If change() fails,
      tc_ctl_tfilter() does not free the tcf_proto and leaves it chained.
      
      In addition, cls_cgroup is initialized in change(), not in init().  It
      therefore accesses the unconfigured struct tcf_proto, which was chained
      before change() ran, and hits an Oops.
      Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
      Tested-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      12186be7
    • e1000: add missing length check to e1000 receive routine · ea30e119
      Committed by Neil Horman
      Patch to fix bad length checking in e1000.  E1000 by default does two
      things:
      
      1) Spans rx descriptors for packets that don't fit into 1 skb on receive
      2) Strips the crc from a frame by subtracting 4 bytes from the length prior to
      doing an skb_put
      
      Since the e1000 driver isn't written to support receiving packets that span
      multiple rx buffers, it checks the End of Packet bit of every frame, and
      discards it if it's not set.  This places us in a situation where, if we have a
      spanning packet, the first part is discarded, but the second part is not (since
      it is the end of packet, and it passes the EOP bit test).  If the second part of
      the frame is small (4 bytes or less), we subtract 4 from it to remove its crc,
      underflow the length, and wind up in skb_over_panic when we try to skb_put a
      huge number of bytes into the skb.  This amounts to a remote DoS attack through
      careful selection of frame size in relation to interface MTU.  The fix for this
      is already in the e1000e driver, as well as the e1000 sourceforge driver, but no
      one ever pushed it to e1000.  This is lifted straight from e1000e, and prevents
      small frames from causing the underflow described above.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Tested-by: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ea30e119
    • forcedeth: add phy_power_down parameter, leave phy powered up by default (v2) · 5a9a8e32
      Committed by Ed Swierk
      Add a phy_power_down parameter to forcedeth: set to 1 to power down the
      phy and disable the link when an interface goes down; set to 0 to always
      leave the phy powered up.
      
      The phy power state persists across reboots; Windows, some BIOSes, and
      older versions of Linux don't bother to power up the phy again, forcing
      users to remove all power to get the interface working (see
      http://bugzilla.kernel.org/show_bug.cgi?id=13072).  Leaving the phy
      powered on is the safest default behavior.  Users accustomed to seeing
      the link state reflect the interface state and/or wanting to minimize
      power consumption can set phy_power_down=1 if compatibility with other
      OSes is not an issue.
      Signed-off-by: Ed Swierk <eswierk@aristanetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a9a8e32