1. 28 Aug, 2009 1 commit
    • T
      ext4: fix extent sanity checking code with AGGRESSIVE_TEST · 55ad63bf
      Theodore Ts'o committed
      The extents sanity-checking code depends on the ext4_ext_space_*()
      functions returning the maximum allowable size for eh_max; however,
      when the debugging #ifdef AGGRESSIVE_TEST is enabled to test the
      extent tree handling code, this prevents a normally created ext4
      filesystem from being mounted with the errors:
      
      Aug 26 15:43:50 bsd086 kernel: [   96.070277] EXT4-fs error (device sda8): ext4_ext_check_inode: bad header/extent in inode #8: too large eh_max - magic f30a, entries 1, max 4(3), depth 0(0)
      Aug 26 15:43:50 bsd086 kernel: [   96.070526] EXT4-fs (sda8): no journal found
      
      Bug reported by Akira Fujita.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      55ad63bf
  2. 26 Aug, 2009 3 commits
    • E
      ext4: use ext4_grpblk_t more extensively · a36b4498
      Eric Sandeen committed
      unsigned short is potentially too small to track blocks within
      a group; today it is safe due to restrictions in e2fsprogs but
      we have _lo / _hi bits for group blocks with the intent to go
      up to 32 bits, so clean this up now.
      
      There are many more places where we use unsigned/int/unsigned int
      to contain a group block but this should at least fix all the
      short types.
      
      I added a few comments to the struct ext4_group_info definition
      as well.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a36b4498
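To illustrate why a 16-bit type is too narrow for an in-group block offset, here is a minimal userspace sketch. The typedef `demo_grpblk_t` and both function names are hypothetical stand-ins, not the kernel's definitions; the sketch only assumes that the wider type is at least 32 bits.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit in-group block offset type. */
typedef int32_t demo_grpblk_t;

/* Store an in-group offset in a 16-bit field, as a short-typed field would. */
unsigned short demo_store_as_short(demo_grpblk_t offset)
{
	/* The conversion silently keeps only the low 16 bits. */
	return (unsigned short)offset;
}

/* Store the same offset in the wider type; nothing is lost. */
demo_grpblk_t demo_store_as_grpblk(demo_grpblk_t offset)
{
	return offset;
}
```

Any offset at or above 65536 comes back wrong from the short-typed version, which is the failure mode the commit guards against ahead of larger groups.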
    • E
      ext4: use variables not types in sizeofs() for allocations · 1927805e
      Eric Sandeen committed
      Precursor to changing some types; to keep things in sync, it
      seems better to allocate/memset based on the size of the
      variables we are using rather than on some disconnected
      basic type like "unsigned short".
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      1927805e
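A minimal sketch of the idiom described above, in plain userspace C. The struct and function names are hypothetical; the point is only that `sizeof(*c->buckets)` follows the field's type automatically if that type is later widened.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter array; the element type may change later. */
struct demo_counters {
	unsigned short *buckets;
	size_t nr;
};

/* Allocate and zero the array, sizing by the variable rather than by a type name. */
int demo_counters_init(struct demo_counters *c, size_t nr)
{
	c->buckets = malloc(nr * sizeof(*c->buckets));
	if (!c->buckets)
		return -1;
	memset(c->buckets, 0, nr * sizeof(*c->buckets));
	c->nr = nr;
	return 0;
}

/* Release the array and reset the bookkeeping. */
void demo_counters_free(struct demo_counters *c)
{
	free(c->buckets);
	c->buckets = NULL;
	c->nr = 0;
}
```

Had the sizes been written as `sizeof(unsigned short)`, changing the field's type would leave the allocation and the memset quietly out of sync.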
    • A
      ext4: Add missing unlock_new_inode() call in extent migration code · a8526e84
      Aneesh Kumar K.V committed
      We need to unlock the new inode before iput.  This patch fixes the
      following warning when calling chattr +e to migrate a file to use
      extents.  It also fixes problems when e4defrag attempts to
      defragment an inode.
      
      [  470.400044] ------------[ cut here ]------------
      [  470.400065] WARNING: at fs/inode.c:1210 generic_delete_inode+0x65/0x16a()
      [  470.400072] Hardware name: N/A
      .....
      ...
      [  470.400353] Pid: 4451, comm: chattr Not tainted 2.6.31-rc7-red-debug #4
      [  470.400359] Call Trace:
      [  470.400372]  [<ffffffff81037771>] warn_slowpath_common+0x77/0x8f
      [  470.400385]  [<ffffffff81037798>] warn_slowpath_null+0xf/0x11
      [  470.400395]  [<ffffffff810b7f28>] generic_delete_inode+0x65/0x16a
      [  470.400405]  [<ffffffff810b8044>] generic_drop_inode+0x17/0x1bd
      [  470.400413]  [<ffffffff810b7083>] iput+0x61/0x65
      [  470.400455]  [<ffffffffa003b229>] ext4_ext_migrate+0x5eb/0x66a [ext4]
      [  470.400492]  [<ffffffffa002b1f8>] ext4_ioctl+0x340/0x756 [ext4]
      [  470.400507]  [<ffffffff810b1a91>] vfs_ioctl+0x1d/0x82
      [  470.400517]  [<ffffffff810b1ff0>] do_vfs_ioctl+0x483/0x4c9
      [  470.400527]  [<ffffffff81059c30>] ? trace_hardirqs_on+0xd/0xf
      [  470.400537]  [<ffffffff810b2087>] sys_ioctl+0x51/0x74
      [  470.400549]  [<ffffffff8100ba6b>] system_call_fastpath+0x16/0x1b
      [  470.400557] ---[ end trace ab85723542352dac ]---
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a8526e84
  3. 18 Aug, 2009 7 commits
    • E
      ext4: Add feature set check helper for mount & remount paths · a13fb1a4
      Eric Sandeen committed
      A user reported that although his root ext4 filesystem was mounting
      fine, other filesystems would not mount, with the:
      
      "Filesystem with huge files cannot be mounted RDWR without CONFIG_LBDAF"
      
      error on his 32-bit box built without CONFIG_LBDAF.  This is because
      the test at mount time for this situation was not being re-checked
      on remount, and the normal boot process makes an ro->rw transition,
      so this was being missed.
      
      Refactor to make a common helper function to test the filesystem
      features against the type of mount request (RO vs. RW) so that we 
      stay consistent.
      
      Addresses Red-Hat-Bugzilla: #517650
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a13fb1a4
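The refactoring described above can be sketched as one helper that both the mount path and the remount path call. Everything here is hypothetical (the flag bit, the function names, the `kernel_has_lbdaf` parameter); it is a userspace illustration of the shape of the fix, not the kernel code.

```c
/* Hypothetical feature bit standing in for "filesystem has huge files". */
#define DEMO_FEATURE_HUGE_FILE 0x0008u

/* Common check: may this feature set be used with the requested access mode? */
int demo_feature_set_ok(unsigned features, int readonly, int kernel_has_lbdaf)
{
	if (readonly)
		return 1;	/* read-only access is always acceptable in this sketch */
	if ((features & DEMO_FEATURE_HUGE_FILE) && !kernel_has_lbdaf)
		return 0;	/* read-write needs large block device support */
	return 1;
}

/* Mount path: uses the shared helper. */
int demo_mount(unsigned features, int readonly, int kernel_has_lbdaf)
{
	return demo_feature_set_ok(features, readonly, kernel_has_lbdaf) ? 0 : -1;
}

/* Remount read-write path: repeats the same check instead of skipping it. */
int demo_remount_rw(unsigned features, int kernel_has_lbdaf)
{
	return demo_feature_set_ok(features, 0, kernel_has_lbdaf) ? 0 : -1;
}
```

With a single helper, an ro->rw transition during boot is held to the same rules as a direct read-write mount.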
    • E
      simplify some logic in ext4_mb_normalize_request · 38877f4e
      Eric Sandeen committed
      While reading through some of the mballoc code it seems that a couple
      spots in the size normalization function could be streamlined.
      
      The test for non-overlapping PAs can be or'd for the start & end
      conditions, and the tests for adjacent PAs can be else-if'd - 
      it's essentially independently testing:
      
      	if (A + B <= C)
      		...
      	if (A > C)
      		...
      
      These cannot both be true so it seems like the else-if might
      be slightly more efficient and/or informative.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      38877f4e
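A small sketch of the else-if form argued for above. The function name and its return codes are hypothetical, and the sketch assumes `a + b` does not overflow; with a non-negative length `b`, the two conditions are mutually exclusive, so the second test can be skipped once the first matches.

```c
/*
 * Classify the range [a, a + b) relative to the point c:
 *   -1  the range ends at or before c
 *    1  the range starts after c
 *    0  the range covers c
 * Assumes a + b does not overflow.
 */
int demo_classify(unsigned long a, unsigned long b, unsigned long c)
{
	if (a + b <= c)
		return -1;
	else if (a > c)
		return 1;
	return 0;
}
```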
    • E
      ext4: open-code ext4_mb_update_group_info · 0373130d
      Eric Sandeen committed
      ext4_mb_update_group_info is only called in one place, and it's
      extremely simple.  There's no reason to have it in a separate function
      in a separate file as far as I can tell, it just obfuscates what's
      really going on.
      
      Perhaps it was intended to keep the grp->bb_* manipulation local to
      mballoc.c but we're already accessing other grp-> fields in balloc.c
      directly so this seems ok.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      0373130d
    • E
      ext4: reject too-large filesystems on 32-bit kernels · bf43d84b
      Eric Sandeen committed
      ext4 will happily mount a > 16T filesystem on a 32-bit box, but
      this is not safe; writes to the block device will wrap past 16T
      and the page cache can't index past 16T (2^32 index * 4k pages).
      
      Adding another test to the existing "too many sectors" test
      should do the trick.
      
      Add a comment, a relevant return value, and fix the reference
      to the CONFIG_LBD(AF) option as well.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      bf43d84b
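The 16T figure follows from a 32-bit page index multiplied by 4 KiB pages. The sketch below works through that arithmetic in userspace C; the function names and the shape of the size check are hypothetical and are not the test the patch adds.

```c
#include <stdint.h>

/* Bytes addressable with an index of index_bits bits and the given page size. */
uint64_t demo_page_cache_limit(unsigned index_bits, uint64_t page_size)
{
	return ((uint64_t)1 << index_bits) * page_size;
}

/* Hypothetical check: does a filesystem of this size exceed the limit? */
int demo_fs_too_large(uint64_t blocks, uint64_t block_size,
		      unsigned index_bits, uint64_t page_size)
{
	return blocks * block_size > demo_page_cache_limit(index_bits, page_size);
}
```

With `index_bits = 32` and `page_size = 4096` the limit is 2^44 bytes, which is 16 TiB.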
    • H
      jbd2: bitfields should be unsigned · 0ccff1a4
      H Hartley Sweeten committed
      This fixes sparse noise:
        error: dubious one-bit signed bitfield
      Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Jan Kara <jack@ucw.cz>
      0ccff1a4
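As background on why sparse calls a one-bit signed bitfield dubious: a signed field one bit wide has no room for the value 1, so a flag stored there never compares equal to 1. The sketch below uses hypothetical struct and function names; the exact value a signed field ends up holding is implementation-defined (commonly -1), so the code only relies on it not being 1.

```c
/* One-bit flag declared signed: it cannot represent the value 1. */
struct demo_flags_signed {
	signed int dirty:1;
};

/* One-bit flag declared unsigned: it holds exactly 0 or 1. */
struct demo_flags_unsigned {
	unsigned int dirty:1;
};

/* Store v in the signed flag and report whether it reads back as 1. */
int demo_signed_reads_as_one(int v)
{
	struct demo_flags_signed f;

	f.dirty = v;	/* storing 1 is an out-of-range conversion */
	return f.dirty == 1;
}

/* Store v in the unsigned flag and report whether it reads back as 1. */
int demo_unsigned_reads_as_one(int v)
{
	struct demo_flags_unsigned f;

	f.dirty = v;
	return f.dirty == 1;
}
```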
    • J
      ext4: Fix possible deadlock between ext4_truncate() and ext4_get_blocks() · 487caeef
      Jan Kara committed
      During truncate we are sometimes forced to start a new transaction as
      the number of blocks to be journaled is both quite large and hard to
      predict. So far we restarted a transaction while holding i_data_sem
      and that violates lock ordering because i_data_sem ranks below a
      transaction start (and it can lead to a real deadlock with
      ext4_get_blocks() mapping blocks in some page while having a
      transaction open).
      
      We fix the problem by dropping the i_data_sem before restarting the
      transaction and acquire it afterwards. It's slightly subtle that this
      works:
      
      1) By the time ext4_truncate() is called, all the page cache for the
      truncated part of the file is dropped so get_block() should not be
      called on it (we only have to invalidate extent cache after we
      reacquire i_data_sem because some extent from not-truncated part could
      extend also into the part we are going to truncate).
      
      2) Writes, migrate or defrag hold i_mutex so they are stopped for all
      the time of the truncate.
      
      This bug has been found and analyzed by Theodore Tso <tytso@mit.edu>.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      487caeef
    • J
      jbd2: Annotate transaction start also for jbd2_journal_restart() · 9599b0e5
      Jan Kara committed
      lockdep annotation for a transaction start has been at the end of
      jbd2_journal_start(). But a transaction is also started from
      jbd2_journal_restart(). Move the lockdep annotation to start_this_handle()
      which covers both cases.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      9599b0e5
  4. 19 Sep, 2009 1 commit
  5. 01 Sep, 2009 1 commit
    • M
      ext4: Compile warning fix when EXT_DEBUG enabled · 84fe3bef
      Mingming committed
      When EXT_DEBUG is enabled I received the following compile warning on
      PPC64:
      
        CC [M]  fs/ext4/inode.o
        CC [M]  fs/ext4/extents.o
      fs/ext4/extents.c: In function ‘ext4_ext_rm_leaf’:
      fs/ext4/extents.c:2097: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 2 has type ‘ext4_lblk_t’
      fs/ext4/extents.c: In function ‘ext4_ext_get_blocks’:
      fs/ext4/extents.c:2789: warning: format ‘%u’ expects type ‘unsigned int’, but argument 4 has type ‘long unsigned int’
      fs/ext4/extents.c:2852: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 3 has type ‘ext4_lblk_t’
      fs/ext4/extents.c:2953: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 4 has type ‘unsigned int’
        CC [M]  fs/ext4/migrate.o
      
      This patch fixes the compile warnings.
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      
      84fe3bef
  6. 19 Sep, 2009 1 commit
    • T
      ext4: Avoid group preallocation for closed files · 50797481
      Theodore Ts'o committed
      Currently the group preallocation code tries to find a large (512)
      free block from which to do per-cpu group allocation for small files.
      The problem with this scheme is that it leaves the filesystem horribly
      fragmented.  In the worst case, if the filesystem is unmounted and
      remounted (after a system shutdown, for example) we forget the fact
      that we were using a particular (now-partially filled) 512 block
      extent.  So the next time we try to allocate space for a small file,
      we will find *another* completely free 512 block chunk to allocate
      small files.  Given that there are 32,768 blocks in a block group,
      after 64 iterations of "mount, write one 4k file in a directory,
      unmount", the block group will have 64 files, each separated by 511
      blocks, and the block group will no longer have any completely
      free 512-block chunks for group preallocation space.
      
      So if we try to allocate blocks for a file that has been closed, such
      that we know the final size of the file, and the filesystem is not
      busy, avoid using group preallocation.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      50797481
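The worst case described above (32,768 blocks per group, 512-block chunks, one single-block file per mount cycle) can be simulated in a few lines. This is a toy model under stated assumptions: chunks are treated as aligned 512-block runs, each cycle writes exactly one block into the first completely free chunk, and all names are hypothetical.

```c
#include <string.h>

enum { DEMO_BLOCKS_PER_GROUP = 32768, DEMO_CHUNK_BLOCKS = 512 };

/* Return the first block of a completely free, aligned chunk, or -1 if none. */
static int demo_find_free_chunk(const unsigned char *used)
{
	for (int start = 0; start < DEMO_BLOCKS_PER_GROUP; start += DEMO_CHUNK_BLOCKS) {
		int busy = 0;

		for (int i = 0; i < DEMO_CHUNK_BLOCKS; i++)
			busy |= used[start + i];
		if (!busy)
			return start;
	}
	return -1;
}

/* Count "mount, write one small file, unmount" cycles until no free chunk is left. */
int demo_cycles_until_fragmented(void)
{
	static unsigned char used[DEMO_BLOCKS_PER_GROUP];
	int cycles = 0;
	int start;

	memset(used, 0, sizeof(used));
	while ((start = demo_find_free_chunk(used)) >= 0) {
		used[start] = 1;	/* one 4k file occupies one block */
		cycles++;
	}
	return cycles;
}
```

In this model the group runs out of completely free chunks after 32768 / 512 = 64 cycles, with only 64 blocks actually in use.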
  7. 10 Aug, 2009 2 commits
    • T
      ext4: Fix bugs in mballoc's stream allocation mode · 4ba74d00
      Theodore Ts'o committed
      The logic around sbi->s_mb_last_group and sbi->s_mb_last_start was all
      screwed up.  These fields were getting set unconditionally all the
      time, even when stream allocation had not taken place, and they were
      being used even when the file was smaller than s_mb_stream_request,
      which is when the allocator should _not_ be doing stream allocation.
      
      Fix this by determining whether or not stream allocation should
      take place once, in ext4_mb_group_or_file(), and setting a flag which
      gets used in ext4_mb_regular_allocator() and ext4_mb_use_best_found().
      This simplifies the code and assures that we are consistently using
      (or not using) the stream allocation logic.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      4ba74d00
    • T
      ext4: Display the mballoc flags in mb_history in hex instead of decimal · 0ef90db9
      Theodore Ts'o committed
      Displaying the flags in base 16 makes it easier to see which flags
      have been set.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      0ef90db9
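A quick illustration of why hex is easier to read for bit flags. The two flag values below are made up for the example and are not the mballoc flag definitions; the formatting helpers are likewise hypothetical.

```c
#include <stdio.h>

/* Hypothetical flag bits, chosen only for illustration. */
#define DEMO_FLAG_A 0x0020u
#define DEMO_FLAG_B 0x0400u

/* Format flags in hex: each set bit stays visible in its own digit. */
int demo_format_flags_hex(char *buf, size_t len, unsigned flags)
{
	return snprintf(buf, len, "0x%04x", flags);
}

/* Format flags in decimal: the individual bits are no longer recognisable. */
int demo_format_flags_dec(char *buf, size_t len, unsigned flags)
{
	return snprintf(buf, len, "%u", flags);
}
```

For `DEMO_FLAG_A | DEMO_FLAG_B` the hex form is `0x0420`, where both bits can be read off directly, while the decimal form is `1056`.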
  8. 19 Sep, 2009 1 commit
  9. 11 Aug, 2009 3 commits
  10. 28 Jul, 2009 1 commit
  11. 06 Jul, 2009 2 commits
  12. 17 Jul, 2009 2 commits
    • C
      ext4: More buffer head reference leaks · 6487a9d3
      Curt Wohlgemuth committed
      After the patch I posted last week regarding buffer head ref leaks in
      no-journal mode, I looked at all the code that uses buffer heads and
      searched for more potential leaks.
      
      The patch below fixes the issues I found; these can occur even when a
      journal is present.
      
      The change to inode.c fixes a double release if
      ext4_journal_get_create_access() fails.
      
      The changes to namei.c are more complicated.  add_dirent_to_buf() will
      release the input buffer head EXCEPT when it returns -ENOSPC.  There are
      some callers of this routine that don't always do the brelse() in the event
      that -ENOSPC is returned.  Unfortunately, to put this fix into ext4_add_entry()
      required capturing the return value of make_indexed_dir() and
      add_dirent_to_buf().
      Signed-off-by: Curt Wohlgemuth <curtw@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      6487a9d3
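The ownership rule described for add_dirent_to_buf() (the callee releases the buffer on every path except -ENOSPC) can be modelled with a toy reference count. All names below are hypothetical userspace stand-ins, not the kernel functions; the sketch only shows why the caller has to capture the return value.

```c
#include <errno.h>

/* Toy buffer with a reference count, standing in for a buffer head. */
struct demo_buf {
	int refcount;
};

/* Drop one reference. */
void demo_brelse(struct demo_buf *b)
{
	if (b)
		b->refcount--;
}

/* Callee: releases the buffer on every return path EXCEPT -ENOSPC. */
int demo_add_to_buf(struct demo_buf *b, int has_room)
{
	if (!has_room)
		return -ENOSPC;	/* the caller still owns the reference */
	demo_brelse(b);
	return 0;
}

/* Caller: captures the return value so the -ENOSPC path does not leak. */
int demo_add_entry(struct demo_buf *b, int has_room)
{
	int ret = demo_add_to_buf(b, has_room);

	if (ret == -ENOSPC)
		demo_brelse(b);
	return ret;
}
```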
    • J
      jbd2: Fail to load a journal if it is too short · f6f50e28
      Jan Kara committed
      Due to on-disk corruption, it can happen that the journal is too short.
      Fail to load it in such a case so that we don't oops somewhere later.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      f6f50e28
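One plausible form of such a check, as a userspace sketch. The minimum length constant, the function name, and the first/last block parameters are assumptions made for illustration and may not match what the patch actually tests.

```c
#include <errno.h>

/* Assumed minimum usable journal length, in blocks (illustrative value). */
#define DEMO_MIN_JOURNAL_BLOCKS 1024UL

/*
 * Refuse to load a journal whose usable range [first, last] is shorter
 * than the minimum, instead of failing later in a less controlled way.
 */
int demo_check_journal_length(unsigned long first, unsigned long last)
{
	if (first + DEMO_MIN_JOURNAL_BLOCKS > last + 1)
		return -EINVAL;
	return 0;
}
```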
  13. 28 Jul, 2009 2 commits
  14. 17 Jul, 2009 1 commit
  15. 16 Sep, 2009 12 commits
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 · ab86e576
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
        Driver Core: devtmpfs - kernel-maintained tmpfs-based /dev
        debugfs: Modify default debugfs directory for debugging pktcdvd.
        debugfs: Modified default dir of debugfs for debugging UHCI.
        debugfs: Change debugfs directory of IWMC3200
        debugfs: Change debugfs directory of trace-events-sample.h
        debugfs: Fix mount directory of debugfs by default in events.txt
        hpilo: add poll f_op
        hpilo: add interrupt handler
        hpilo: staging for interrupt handling
        driver core: platform_device_add_data(): use kmemdup()
        Driver core: Add support for compatibility classes
        uio: add generic driver for PCI 2.3 devices
        driver-core: move dma-coherent.c from kernel to driver/base
        mem_class: fix bug
        mem_class: use minor as index instead of searching the array
        driver model: constify attribute groups
        UIO: remove 'default n' from Kconfig
        Driver core: Add accessor for device platform data
        Driver core: move dev_get/set_drvdata to drivers/base/dd.c
        Driver core: add new device to bus's list before probing
      ab86e576
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6 · 7ea61767
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (641 commits)
        Staging: remove sxg driver
        Staging: remove heci driver
        Staging: remove at76_usb wireless driver.
        Staging: rspiusb: remove the driver
        Staging: meilhaus: remove the drivers
        Staging: remove me4000 driver.
        Staging: line6: ffzb returns an unsigned integer
        Staging: line6: pod.c: style cleanups
        Staging: iio: introduce missing kfree
        Staging: dream: introduce missing kfree
        Staging: comedi: addi-data: NULL dereference of amcc in v_pci_card_list_init()
        Staging: vt665x: fix built-in compiling
        Staging: rt3090: enable NATIVE_WPA_SUPPLICANT_SUPPORT option
        Staging: rt3090: port changes in WPA_MIX_PAIR_CIPHER to rt3090
        Staging: rt3090: rename device from raX to wlanX
        Staging: rt3090: remove possible conflict with rt2860
        Staging: rt2860/rt2870/rt3070/rt3090: fix compiler warning on x86_64
        Staging: rt2860: add new device ids
        Staging: rt3090: add device id 1462:891a
        Staging: asus_oled: Cleaned up checkpatch issues.
        ...
      7ea61767
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pcmcia-2.6 · 0950efd1
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pcmcia-2.6:
        pcmcia: document return value of pcmcia_loop_config
        pcmcia: dtl1_cs: fix pcmcia_loop_config logic
        pcmcia: drop non-existent includes
        pcmcia: disable prefetch/burst for OZ6933
        pcmcia: fix incorrect argument order to list_add_tail()
        pcmcia: drivers/pcmcia/pcmcia_resource.c: Remove unnecessary semicolons
        pcmcia: Use phys_addr_t for physical addresses
        pcmcia: drivers/pcmcia: Make static
      0950efd1
    • L
      Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 · 4406c56d
      Linus Torvalds committed
      * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits)
        PCI hotplug: clean up acpi_run_hpp()
        PCI hotplug: acpiphp: use generic pci_configure_slot()
        PCI hotplug: shpchp: use generic pci_configure_slot()
        PCI hotplug: pciehp: use generic pci_configure_slot()
        PCI hotplug: add pci_configure_slot()
        PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface
        PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge
        PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation
        PCI: Clear saved_state after the state has been restored
        PCI PM: Return error codes from pci_pm_resume()
        PCI: use dev_printk in quirk messages
        PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset()
        PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle
        PCI Hotplug: acpiphp: find bridges the easy way
        PCI: pcie portdrv: remove unused variable
        PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support
        ACPI PM: Replace wakeup.prepared with reference counter
        PCI PM: Introduce device flag wakeup_prepared
        PCI / ACPI PM: Rework some debug messages
        PCI PM: Simplify PCI wake-up code
        ...
      
      Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree
      scanning having been moved and merged for the 32- and 64-bit cases.  The
      'needs_freset' initialization added in 6e19314c ("PCI/powerpc: support
      PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.
      4406c56d
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · 6b7b352f
      Linus Torvalds committed
      * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
        block: fix linkage problem with blk_iopoll and !CONFIG_BLOCK
      6b7b352f
    • L
      Merge branch 'writeback' of git://git.kernel.dk/linux-2.6-block · a3eb51ec
      Linus Torvalds committed
      * 'writeback' of git://git.kernel.dk/linux-2.6-block:
        writeback: fix possible bdi writeback refcounting problem
        writeback: Fix bdi use after free in wb_work_complete()
        writeback: improve scalability of bdi writeback work queues
        writeback: remove smp_mb(), it's not needed with list_add_tail_rcu()
        writeback: use schedule_timeout_interruptible()
        writeback: add comments to bdi_work structure
        writeback: splice dirty inode entries to default bdi on bdi_destroy()
        writeback: separate starting of sync vs opportunistic writeback
        writeback: inline allocation failure handling in bdi_alloc_queue_work()
        writeback: use RCU to protect bdi_list
        writeback: only use bdi_writeback_all() for WB_SYNC_NONE writeout
        fs: Assign bdi in super_block
        writeback: make wb_writeback() take an argument structure
        writeback: merely wakeup flusher thread if work allocation fails for WB_SYNC_NONE
        writeback: get rid of wbc->for_writepages
        fs: remove bdev->bd_inode_backing_dev_info
      a3eb51ec
    • N
      writeback: fix possible bdi writeback refcounting problem · 1ef7d9aa
      Nick Piggin committed
      wb_clear_pending AFAIKS should not be called after the item has been
      put on the list, except by the worker threads. It could lead to the
      situation where the refcount is decremented below 0 and cause lots of
      problems.
      
      Presumably the !wb_has_dirty_io case is not a common one, so it can
      be discovered when the thread wakes up to check?
      
      Also add a comment in bdi_work_clear.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      1ef7d9aa
    • N
      writeback: Fix bdi use after free in wb_work_complete() · 77b9d059
      Nick Piggin committed
      By the time bdi_work_on_stack gets evaluated again in bdi_work_free, it
      can already have been deallocated and used for something else in the
      !on stack case, giving a false positive in this test and causing
      corruption.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      77b9d059
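The general remedy for this class of bug is to sample the flag once, before anything can release the object, and use only the local copy afterwards. The sketch below is a simplified userspace model with hypothetical names; it is not the writeback code and leaves out RCU and threading entirely.

```c
#include <stdlib.h>

/* Toy work item: either owned by the submitter (on stack) or heap-allocated. */
struct demo_work {
	int on_stack;	/* nonzero if the submitter owns the storage */
	int *done;	/* completion flag owned by the submitter */
};

/* Complete a work item without touching it after it may have gone away. */
void demo_work_complete(struct demo_work *w)
{
	/* Read the flag first: after completion is signalled, w may be reused. */
	const int on_stack = w->on_stack;

	*w->done = 1;
	if (!on_stack)
		free(w);	/* heap items are released here; w is not read again */
}
```

Re-reading `w->on_stack` after the completion step is the pattern to avoid, since by then the memory may hold something unrelated.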
    • N
      writeback: improve scalability of bdi writeback work queues · 77fad5e6
      Nick Piggin committed
      If you're going to do an atomic RMW on each list entry, there's not much
      point in all the RCU complexities of the list walking. This is only going
      to help the multi-thread case I guess, but it doesn't hurt to do now.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      77fad5e6
    • N
      writeback: remove smp_mb(), it's not needed with list_add_tail_rcu() · deed62ed
      Nick Piggin committed
      list_add_tail_rcu contains required barriers.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      deed62ed
    • J
      writeback: use schedule_timeout_interruptible() · 49db0414
      Jens Axboe committed
      Gets rid of a manual set_current_state().
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      49db0414
    • J
      writeback: add comments to bdi_work structure · 8010c3b6
      Jens Axboe committed
      And document its retriever, get_next_work_item().
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      8010c3b6