1. 28 Oct, 2010 40 commits
    • Fixed Regression in NFS Direct I/O path · 568a810d
      Steve Dickson committed
      A typo, introduced by commit f11ac8db, in the nfs_direct_write()
      routine causes writes with O_DIRECT set to fail with an ENOMEM error.
      Found-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve Dickson <steved@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      568a810d
    • Merge branch 'upstream-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 81280572
      Linus Torvalds committed
      * 'upstream-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (50 commits)
        ext4,jbd2: convert tracepoints to use major/minor numbers
        ext4: optimize orphan_list handling for ext4_setattr
        ext4: fix unbalanced mutex unlock in error path of ext4_li_request_new
        ext4: fix compile error in ext4_fallocate()
        ext4: move ext4_mb_{get,put}_buddy_cache_lock and make them static
        ext4: rename mark_bitmap_end() to ext4_mark_bitmap_end()
        ext4: move flush_completed_IO to fs/ext4/fsync.c and make it static
        ext4: rename {ext,idx}_pblock and inline small extent functions
        ext4: make various ext4 functions be static
        ext4: rename {exit,init}_ext4_*() to ext4_{exit,init}_*()
        ext4: fix kernel oops if the journal superblock has a non-zero j_errno
        ext4: update writeback_index based on last page scanned
        ext4: implement writeback livelock avoidance using page tagging
        ext4: tidy up a void argument in inode.c
        ext4: add batched_discard into ext4 feature list
        ext4: Add batched discard support for ext4
        fs: Add FITRIM ioctl
        ext4: Use return value from sb_issue_discard()
        ext4: Check return value of sb_getblk() and friends
        ext4: use bio layer instead of buffer layer in mpage_da_submit_io
        ...
      81280572
    • Merge branch 'next' into upstream-merge · a107e5a3
      Theodore Ts'o committed
      Conflicts:
      	fs/ext4/inode.c
      	fs/ext4/mballoc.c
      	include/trace/events/ext4.h
      a107e5a3
    • Merge branch 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · b83db1de
      Linus Torvalds committed
      * 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
        drm/radeon/kms: enable unmappable vram for evergreen
        drm/radeon/kms: fix tiled db height calculation on 6xx/7xx
        drm/radeon/kms: fix handling of tex lookup disable in cs checker on r2xx
      b83db1de
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 · 7d2f280e
      Linus Torvalds committed
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (24 commits)
        quota: Fix possible oops in __dquot_initialize()
        ext3: Update kernel-doc comments
        jbd/2: fixed typos
        ext2: fixed typo.
        ext3: Fix debug messages in ext3_group_extend()
        jbd: Convert atomic_inc() to get_bh()
        ext3: Remove misplaced BUFFER_TRACE() in ext3_truncate()
        jbd: Fix debug message in do_get_write_access()
        jbd: Check return value of __getblk()
        ext3: Use DIV_ROUND_UP() on group desc block counting
        ext3: Return proper error code on ext3_fill_super()
        ext3: Remove unnecessary casts on bh->b_data
        ext3: Cleanup ext3_setup_super()
        quota: Fix issuing of warnings from dquot_transfer
        quota: fix dquot_disable vs dquot_transfer race v2
        jbd: Convert bitops to buffer fns
        ext3/jbd: Avoid WARN() messages when failing to write the superblock
        jbd: Use offset_in_page() instead of manual calculation
        jbd: Remove unnecessary goto statement
        jbd: Use printk_ratelimited() in journal_alloc_journal_head()
        ...
      7d2f280e
    • ext4,jbd2: convert tracepoints to use major/minor numbers · a269029d
      Theodore Ts'o committed
      Unfortunately perf can't deal with anything other than direct structure
      accesses in the TP_printk() section.  It will drop dead when it sees
      jbd2_dev_to_name() in the "print fmt" section of the tracepoint.

      Addresses-Google-Bug: 3138508
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a269029d
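
The idea of printing a device as plain major/minor integers, rather than calling a name-lookup helper from the format string, can be sketched in user space as follows. This is an illustration only: the `SKETCH_`/`sketch_` names are hypothetical, and the 20-bit minor split is an assumption meant to mirror the kernel-internal `dev_t` layout, not code taken from the commit.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: the kernel-internal dev_t keeps the minor number in the
 * low 20 bits and the major number above it. All names are hypothetical. */
#define SKETCH_MINORBITS 20u
#define SKETCH_MINORMASK ((1u << SKETCH_MINORBITS) - 1u)

/* Extract the major number from a packed device number. */
static unsigned int sketch_major(unsigned int dev)
{
    return dev >> SKETCH_MINORBITS;
}

/* Extract the minor number from a packed device number. */
static unsigned int sketch_minor(unsigned int dev)
{
    return dev & SKETCH_MINORMASK;
}

/* Pack a major/minor pair; the minor must fit in SKETCH_MINORBITS. */
static unsigned int sketch_mkdev(unsigned int ma, unsigned int mi)
{
    assert(mi <= SKETCH_MINORMASK);
    return (ma << SKETCH_MINORBITS) | mi;
}

/* Format "major,minor" using only integer arithmetic on the stored
 * value, which is the kind of expression a trace format can evaluate
 * without calling back into a helper function. */
static int sketch_format_dev(char *buf, size_t len, unsigned int dev)
{
    return snprintf(buf, len, "%u,%u", sketch_major(dev), sketch_minor(dev));
}
```

A caller would pack the pair once with `sketch_mkdev()` when recording the event and format it later with `sketch_format_dev()`.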
    • ext4: optimize orphan_list handling for ext4_setattr · 3d287de3
      Dmitry Monakhov committed
      Surprisingly, chown() on ext4 is not an SMP-scalable operation.
      The unconditional orphan_del(NULL, inode) call in ext4_setattr()
      results in significant performance overhead because of the global orphan
      mutex, especially in no-journal mode (where orphan_add() is a no-op).
      The explicit orphan_del can be skipped in some cases.
      Results of an fchown() micro-benchmark in no-journal mode:
      while (1) {
         iteration++;
         fchown(fd, uid, gid);
         fchown(fd, uid + 1, gid + 1);
      }
      measured: iterations per millisecond
      | nr_tasks | w/o patch | with patch |
      |        1 |       142 |        185 |
      |        4 |       109 |        642 |
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      3d287de3
    • Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx · e3e1288e
      Linus Torvalds committed
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (48 commits)
        DMAENGINE: move COH901318 to arch_initcall
        dma: imx-dma: fix signedness bug
        dma/timberdale: simplify conditional
        ste_dma40: remove channel_type
        ste_dma40: remove enum for endianess
        ste_dma40: remove TIM_FOR_LINK option
        ste_dma40: move mode_opt to separate config
        ste_dma40: move channel mode to a separate field
        ste_dma40: move priority to separate field
        ste_dma40: add variable to indicate valid dma_cfg
        async_tx: make async_tx channel switching opt-in
        move async raid6 test to lib/Kconfig.debug
        dmaengine: Add Freescale i.MX1/21/27 DMA driver
        intel_mid_dma: change the slave interface
        intel_mid_dma: fix the WARN_ONs
        intel_mid_dma: Add sg list support to DMA driver
        intel_mid_dma: Allow DMAC2 to share interrupt
        intel_mid_dma: Allow IRQ sharing
        intel_mid_dma: Add runtime PM support
        DMAENGINE: define a dummy filter function for ste_dma40
        ...
      e3e1288e
    • Merge branch 'viafb-next' of git://github.com/schandinat/linux-2.6 · 9ae6d039
      Linus Torvalds committed
      * 'viafb-next' of git://github.com/schandinat/linux-2.6: (29 commits)
        viafb: add initial VX900 support
        viafb: fix hardware acceleration for suspend & resume
        viafb: make suspend and resume work (on all machines?)
        viafb: restore display on resume
        Minimal support for viafb suspend/resume
        viafb: use proper register for colour when doing fill ops
        viafb: add documentation for proc interface
        viafb: rename output devices
        viafb: add a mapping of supported output devices
        viafb: set sync polarity for all output devices
        viafb: add function to change sync polarity per device
        viafb: reduce I2C timeout and delay
        viafb: enable I2C for CRT
        viafb: fix i2c_transfer error handling
        viafb: vt1636 cleanup
        viafb: introduce per output device power management
        viafb: limit LCD code impact
        viafb: add interface for output device configuration
        viafb: merge the remaining output path with enable functions
        viafb: use new device routing
        ...
      9ae6d039
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300 · bdab2250
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300: (44 commits)
        MN10300: Save frame pointer in thread_info struct rather than global var
        MN10300: Change "Matsushita" to "Panasonic".
        MN10300: Create a defconfig for the ASB2364 board
        MN10300: Update the ASB2303 defconfig
        MN10300: ASB2364: Add support for SMSC911X and SMC911X
        MN10300: ASB2364: Handle the IRQ multiplexer in the FPGA
        MN10300: Generic time support
        MN10300: Specify an ELF HWCAP flag for MN10300 Atomic Operations Unit support
        MN10300: Map userspace atomic op regs as a vmalloc page
        MN10300: And Panasonic AM34 subarch and implement SMP
        MN10300: Delete idle_timestamp from irq_cpustat_t
        MN10300: Make various interrupt priority settings configurable
        MN10300: Optimise do_csum()
        MN10300: Implement atomic ops using atomic ops unit
        MN10300: Make the FPU operate in non-lazy mode under SMP
        MN10300: SMP TLB flushing
        MN10300: Use the [ID]PTEL2 registers rather than [ID]PTEL for TLB control
        MN10300: Make the use of PIDR to mark TLB entries controllable
        MN10300: Rename __flush_tlb*() to local_flush_tlb*()
        MN10300: AM34 erratum requires MMUCTR read and write on exception entry
        ...
      bdab2250
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 · 7c5814c7
      Linus Torvalds committed
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
        ALSA: usb-audio: automatically detect feedback format
        ASoC: sound/wm9090: add missing __devexit marker
        ASoC: sound/max98088: add missing __devexit marker
        ASoC: sound/ad73311: add missing __devexit marker
        ASoC: fsl - fix build error in pcm030-audio-fabric.c
        sound/oss/sb_ess.c: delete double assignment
        ALSA: hda - Change BTL amp level on some HP notebooks
      7c5814c7
    • Merge branch 'perf-fixes-for-linus' of... · a042e261
      Linus Torvalds committed
      Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
        perf python scripting: Add futex-contention script
        perf python scripting: Fixup cut'n'paste error in sctop script
        perf scripting: Shut up 'perf record' final status
        perf record: Remove newline character from perror() argument
        perf python scripting: Support fedora 11 (audit 1.7.17)
        perf python scripting: Improve the syscalls-by-pid script
        perf python scripting: print the syscall name on sctop
        perf python scripting: Improve the syscalls-counts script
        perf python scripting: Improve the failed-syscalls-by-pid script
        kprobes: Remove redundant text_mutex lock in optimize
        x86/oprofile: Fix uninitialized variable use in debug printk
        tracing: Fix 'faild' -> 'failed' typo
        perf probe: Fix format specified for Dwarf_Off parameter
        perf trace: Fix detection of script extension
        perf trace: Use $PERF_EXEC_PATH in canned report scripts
        perf tools: Document event modifiers
        perf tools: Remove direct slang.h include
        perf_events: Fix for transaction recovery in group_sched_in()
        perf_events: Revert: Fix transaction recovery in group_sched_in()
        perf, x86: Use NUMA aware allocations for PEBS/BTS/DS allocations
        ...
      a042e261
    • Merge branch 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus · f66dd539
      Linus Torvalds committed
      * 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
        NULL-terminate all pci_device_id tables
        (trivial) Fix compiler warning in kernel/modules.c
      f66dd539
    • Merge branch 'akpm-incoming-2' · 17bb51d5
      Linus Torvalds committed
      * akpm-incoming-2: (139 commits)
        epoll: make epoll_wait() use the hrtimer range feature
        select: rename estimate_accuracy() to select_estimate_accuracy()
        Remove duplicate includes from many files
        ramoops: use the platform data structure instead of module params
        kernel/resource.c: handle reinsertion of an already-inserted resource
        kfifo: fix kfifo_alloc() to return a signed int value
        w1: don't allow arbitrary users to remove w1 devices
        alpha: remove dma64_addr_t usage
        mips: remove dma64_addr_t usage
        sparc: remove dma64_addr_t usage
        fuse: use release_pages()
        taskstats: use real microsecond granularity for CPU times
        taskstats: split fill_pid function
        taskstats: separate taskstats commands
        delayacct: align to 8 byte boundary on 64-bit systems
        delay-accounting: reimplement -c for getdelays.c to report information on a target command
        namespaces Kconfig: move namespace menu location after the cgroup
        namespaces Kconfig: remove the cgroup device whitelist experimental tag
        namespaces Kconfig: remove pointless cgroup dependency
        namespaces Kconfig: make namespace a submenu
        ...
      17bb51d5
    • Merge branch 'x86-fixes-for-linus' of... · 0671b767
      Linus Torvalds committed
      Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        percpu: Remove the multi-page alignment facility
        x86-32: Allocate irq stacks seperate from percpu area
        x86-32, mm: Remove duplicated #include
        x86, printk: Get rid of <0> from stack output
        x86, kexec: Make sure to stop all CPUs before exiting the kernel
        x86/vsmp: Eliminate kconfig dependency warning
      0671b767
    • proc_bus_pci_ioctl: remove pointless BKL usage · 0b2d8d9e
      Linus Torvalds committed
      The BKL was pushed into this function when it was converted to use the
      unlocked_ioctl interface, but nothing that the function touches is
      actually protected by the BKL.  So just remove the BKL entirely, so that
      we finally can get a realistic system build without the BKL being
      enabled at all.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0b2d8d9e
    • ext4: fix compile error in ext4_fallocate() · a6371b63
      Kazuya Mio committed
      When I compiled 2.6.36-rc3 kernel with EXT4FS_DEBUG definition, I got
      the following compile error.
      
        CC [M]  fs/ext4/extents.o
      fs/ext4/extents.c: In function 'ext4_fallocate':
      fs/ext4/extents.c:3772: error: 'block' undeclared (first use in this function)
      fs/ext4/extents.c:3772: error: (Each undeclared identifier is reported only once
      fs/ext4/extents.c:3772: error: for each function it appears in.)
      make[2]: *** [fs/ext4/extents.o] Error 1
      
      The patch fixes this problem.
      Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a6371b63
    • ext4: move ext4_mb_{get,put}_buddy_cache_lock and make them static · eee4adc7
      Eric Sandeen committed
      These functions are only used within fs/ext4/mballoc.c, so move them
      so they are used after they are defined, and then make them static.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      eee4adc7
    • ext4: rename mark_bitmap_end() to ext4_mark_bitmap_end() · 61d08673
      Theodore Ts'o committed
      Fix a namespace leak from fs/ext4
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      
      61d08673
    • ext4: move flush_completed_IO to fs/ext4/fsync.c and make it static · 4a873a47
      Theodore Ts'o committed
      Fix a namespace leak by moving the function to the file where it is
      used and making it static.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      4a873a47
    • ext4: rename {ext,idx}_pblock and inline small extent functions · bf89d16f
      Theodore Ts'o committed
      Clean up namespace leaks from fs/ext4 and inline the trivial functions
      ext4_{ext,idx}_pblock() and ext4_{ext,idx}_store_pblock(), since the
      code size actually shrinks when we make these functions inline;
      they're so trivial.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      bf89d16f
    • ext4: make various ext4 functions be static · 1f109d5a
      Theodore Ts'o committed
      These functions have no need to be exported beyond file context.

      No functions needed to be moved for this commit; just some function
      declarations changed to be static and removed from header files.

      (A similar patch was submitted by Eric Sandeen, but I wanted to handle
      code movement in separate patches to make sure code changes didn't
      accidentally get dropped.)
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      1f109d5a
    • ext4: rename {exit,init}_ext4_*() to ext4_{exit,init}_*() · 5dabfc78
      Theodore Ts'o committed
      This is a cleanup to avoid namespace leaks out of fs/ext4
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      5dabfc78
    • ext4: fix kernel oops if the journal superblock has a non-zero j_errno · 7f93cff9
      Theodore Ts'o committed
      Commit 84061e07 fixed an accounting bug only to introduce the
      possibility of a kernel OOPS if the journal has a non-zero j_errno
      field indicating that the file system had detected a fs inconsistency.
      After the journal replay, if the journal superblock indicates that the
      file system has an error, this indication is transferred to the file
      system and then ext4_commit_super() is called to write this to the
      disk.

      But since the percpu counters are now initialized after the journal
      replay, the call to ext4_commit_super() will cause a kernel oops since
      it needs to use the percpu counters in the ext4 superblock structure.

      The fix is to skip setting the ext4 free block and free inode fields
      if the percpu counter has not been set.

      Thanks to Ken Sumrall for reporting and analyzing the root causes of
      this bug.

      Addresses-Google-Bug: #3054080
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      7f93cff9
    • ext4: update writeback_index based on last page scanned · 72f84e65
      Eric Sandeen committed
      As pointed out in a prior patch, updating the mapping's
      writeback_index based on pages written isn't quite right;
      what the writeback index is really supposed to reflect is
      the next page which should be scanned for writeback during
      periodic flush.
      
      As in write_cache_pages(), write_cache_pages_da() does
      this scanning for us as we assemble the mpd for later
      writeout.  If we keep track of the next page after the
      current scan, we can easily update writeback_index without
      worrying about pages written vs. pages skipped, etc.
      
      Without this, an fsync will reset writeback_index to
      0 (its starting index) + however many pages it wrote, which
      can mess up the progress of periodic flush.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      72f84e65
    • ext4: implement writeback livelock avoidance using page tagging · 5b41d924
      Eric Sandeen committed
      This is analogous to Jan Kara's commit,
      f446daae
      mm: implement writeback livelock avoidance using page tagging
      
      but since we forked write_cache_pages, we need to reimplement
      it there (and in ext4_da_writepages, since range_cyclic handling
      was moved to there)
      
      If you start a large buffered IO to a file, and then issue an
      fsync after it, you'll find that fsync does not complete
      until the other IO stops.
      
      If you continue re-dirtying the file (say, putting dd
      with conv=notrunc in a loop), when fsync finally completes
      (after all IO is done), it reports via tracing that
      it has written many more pages than the file contains;
      in other words it has synced and re-synced pages in
      the file multiple times.
      
      This then leads to problems with our writeback_index
      update, since it advances it by pages written, and
      essentially sets writeback_index off the end of the
      file...
      
      With the following patch, we only sync as much as was
      dirty at the time of the sync.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      5b41d924
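
The tagging approach described above can be illustrated with a toy user-space model: pages that are dirty when the sync starts are tagged, and the write pass touches only tagged pages, so pages dirtied during the sync wait for the next one. This is a sketch under stated assumptions, not the ext4 implementation; `toy_page`, `toy_tag_for_writeback()` and `toy_sync()` are hypothetical names, and the `redirty` flag merely simulates a concurrent writer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Hypothetical per-page state for the model. */
struct toy_page {
    bool dirty;   /* page has data that needs writing */
    bool towrite; /* page was tagged for the current sync */
};

/* Step 1: tag every page that is dirty at the moment the sync starts. */
static size_t toy_tag_for_writeback(struct toy_page *p, size_t n)
{
    size_t tagged = 0;

    assert(p != NULL);
    for (size_t i = 0; i < n; i++) {
        if (p[i].dirty) {
            p[i].towrite = true;
            tagged++;
        }
    }
    return tagged;
}

/* Step 2: write only the tagged pages. When redirty is true, a simulated
 * writer re-dirties each page right after it is written and also dirties
 * the following page. Untagged pages are skipped, so the sync terminates
 * after writing what was dirty when it began. */
static size_t toy_sync(struct toy_page *p, size_t n, bool redirty)
{
    size_t written = 0;

    toy_tag_for_writeback(p, n);
    for (size_t i = 0; i < n; i++) {
        if (!p[i].towrite)
            continue;
        p[i].towrite = false;
        p[i].dirty = false;
        written++;
        if (redirty) {
            p[i].dirty = true;
            if (i + 1 < n)
                p[i + 1].dirty = true;
        }
    }
    return written;
}
```

In this model, a scan that followed the dirty flag alone would keep chasing the simulated writer to the end of the array, while the tagged scan writes only the pages that were dirty at the start.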
    • ext4: tidy up a void argument in inode.c · bbd08344
      Eric Sandeen committed
      This doesn't fix anything at all, it just removes a vestige
      of prior use from __mpage_da_writepage()

      __mpage_da_writepage() had a void * argument left over from
      its previous life as a callback; make it reflect the actual type.

      Fixing this up makes it slightly more obvious to read, and
      enables proper typechecking.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      bbd08344
    • ext4: add batched_discard into ext4 feature list · 27ee40df
      Lukas Czerner committed
      Should be applied on top of the "lazy inode table initialization"
      and "batched discard support" patch-sets.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      27ee40df
    • ext4: Add batched discard support for ext4 · 7360d173
      Lukas Czerner committed
      Walk through allocation groups and trim all free extents. It can be
      invoked through the FITRIM ioctl on the file system. The main idea is to
      provide a way to trim the whole file system if needed, since some SSDs
      may suffer from performance loss after the whole device has been filled (it
      does not mean that the fs is full!).

      It searches for free extents in the allocation groups specified by the
      byte range start -> start+len. When a free extent is within this range,
      its blocks are marked as used and then trimmed. Afterwards these blocks
      are marked as free in the per-group bitmap.

      Since fstrim is a long operation, it is good to be able to
      interrupt it with a signal. This was added by Dmitry Monakhov.
      Thanks Dmitry.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      7360d173
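
The range walk described above can be sketched as a scan over a block bitmap that collects runs of free blocks inside the requested window and ignores runs shorter than the minimum length. This is a simplified model, not the ext4 code: `toy_trim_range()` is a hypothetical name, it works in blocks rather than bytes, and it omits the locking and the temporary mark-as-used step the commit message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the blocks that would be trimmed in [start, start + len).
 * used[i] == true means block i is allocated. Free runs shorter than
 * minlen blocks are skipped. A real file system would issue a discard
 * for each qualifying run instead of only counting it. */
static size_t toy_trim_range(const bool *used, size_t nblocks,
                             size_t start, size_t len, size_t minlen)
{
    size_t end, i, trimmed = 0;

    assert(used != NULL);
    if (start >= nblocks)
        return 0;
    /* Clamp the window to the device size without overflowing. */
    end = (len > nblocks - start) ? nblocks : start + len;

    i = start;
    while (i < end) {
        size_t run = 0;

        if (used[i]) {
            i++;
            continue;
        }
        /* Measure the free run that starts at block i. */
        while (i + run < end && !used[i + run])
            run++;
        if (run >= minlen)
            trimmed += run;
        i += run;
    }
    return trimmed;
}
```

Passing a `len` larger than the device simply clamps the window to the end of the bitmap, which corresponds to the whole-file-system case.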
    • fs: Add FITRIM ioctl · 367a51a3
      Lukas Czerner committed
      Adds a filesystem-independent ioctl to allow implementation of file
      system batched discard support. It takes an fstrim_range structure as an
      argument. fstrim_range is defined in include/linux/fs.h and its
      definition is as follows.

      struct fstrim_range {
      	__u64 start;
      	__u64 len;
      	__u64 minlen;
      };
      
      start	- first Byte to trim
      len	- number of Bytes to trim from start
      minlen	- minimum extent length to trim, free extents shorter than this
      	  number of Bytes will be ignored. This will be rounded up to fs
      	  block size.
      
      It is also possible to specify NULL as an argument. In this case the
      arguments default to the following values:
      
      start = 0;
      len = ULLONG_MAX;
      minlen = 0;
      
      So it will trim the whole file system at one run.
      
      After the FITRIM is done, the number of actually discarded Bytes is stored
      in fstrim_range.len to give the user better insight on how much storage
      space has been really released for wear-leveling.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      367a51a3
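
From user space, the interface described above might be exercised roughly as follows on Linux. This is a minimal sketch: `sketch_whole_fs_range()` and `sketch_trim_mountpoint()` are hypothetical helper names, the range is passed explicitly rather than as NULL, and a real call needs sufficient privileges plus a file system and device that support discard.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <linux/fs.h>   /* struct fstrim_range, FITRIM */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: fill in a range that covers the whole file
 * system, using the default values listed in the commit message. */
static void sketch_whole_fs_range(struct fstrim_range *r)
{
    assert(r != NULL);
    r->start = 0;
    r->len = ULLONG_MAX;
    r->minlen = 0;
}

/* Hypothetical helper: trim the file system mounted at mountpoint.
 * Returns 0 on success and stores the number of bytes the kernel
 * reports as discarded; returns -1 with errno set on failure. */
static int sketch_trim_mountpoint(const char *mountpoint,
                                  unsigned long long *discarded)
{
    struct fstrim_range r;
    int fd, ret, saved_errno;

    sketch_whole_fs_range(&r);
    fd = open(mountpoint, O_RDONLY);
    if (fd < 0)
        return -1;

    ret = ioctl(fd, FITRIM, &r);
    if (ret == 0 && discarded != NULL)
        *discarded = r.len; /* the kernel writes the discarded byte count here */

    saved_errno = errno;    /* keep the ioctl error across close() */
    close(fd);
    errno = saved_errno;
    return ret;
}
```

A caller would typically pass a mount point such as the root of the file system and report `errno` when the call returns -1, since unsupported devices and missing privileges are common failure cases.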
    • ext4: Use return value from sb_issue_discard() · 77ca6cdf
      Lukas Czerner committed
      Use the return value from sb_issue_discard() as the return value of
      ext4_issue_discard(). Since sb_issue_discard() may result in more
      serious errors than just -EOPNOTSUPP, it is worth informing the callers
      of this function about them so they can handle error cases properly.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      77ca6cdf
    • ext4: Check return value of sb_getblk() and friends · 87783690
      Namhyung Kim committed
      Fail block allocation if sb_getblk() returns NULL. In that case,
      sb_find_get_block() is also likely to fail, so calling ext4_forget()
      should be skipped.
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      87783690
    • ext4: use bio layer instead of buffer layer in mpage_da_submit_io · bd2d0210
      Theodore Ts'o committed
      Call the block I/O layer directly instead of going through the buffer
      layer.  This should give us much better performance and scalability,
      as well as lowering our CPU utilization when doing buffered writeback.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      bd2d0210
    • ext4: move mpage_put_bnr_to_bhs()'s functionality to mpage_da_submit_io() · 1de3e3df
      Theodore Ts'o committed
      This massively simplifies the ext4_da_writepages() code path by
      completely removing mpage_put_bnr_to_bhs(), which is almost 100 lines of
      code iterating over a set of pages using pagevec_lookup(), and folds
      that functionality into mpage_da_submit_io()'s existing
      pagevec_lookup() loop.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      1de3e3df
    • ext4: inline walk_page_buffers() into mpage_da_submit_io · 3ecdb3a1
      Theodore Ts'o committed
      Expand the call:

        if (walk_page_buffers(NULL, page_bufs, 0, len, NULL,
                              ext4_bh_delay_or_unwritten))
                goto redirty_page;

      into mpage_da_submit_io().

      This will allow us to merge in mpage_put_bnr_to_bhs() in the next
      patch.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      3ecdb3a1
    • ext4: inline ext4_writepage() into mpage_da_submit_io() · cb20d518
      Theodore Ts'o committed
      As a preparatory step to switching to submit_bio(), inline
      ext4_writepage() into mpage_da_submit_io() and then simplify things a
      bit.  This makes it clearer what mpage_da_submit_io() needs to do.

      Also, move the ClearPageChecked(page) call into
      __ext4_journalled_writepage(), as a minor bit of cleanup refactoring.

      This also allows us to pull i_size_read() and
      ext4_should_journal_data() out of the loop, which should be a very
      minor CPU savings.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      cb20d518
    • ext4: simplify ext4_writepage() · a42afc5f
      Theodore Ts'o committed
      The actual code in ext4_writepage() is unnecessarily convoluted.
      Simplify it so it is easier to understand, but otherwise logically
      equivalent.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      a42afc5f
    • ext4: call mpage_da_submit_io() from mpage_da_map_blocks() · 5a87b7a5
      Theodore Ts'o committed
      Eventually we need to completely reorganize the ext4 writepage
      callpath, but for now, we simplify things a little by calling
      mpage_da_submit_io() from mpage_da_map_blocks(), since every
      place where we call mpage_da_map_blocks() is followed by a call
      to mpage_da_submit_io().

      We're also a wee bit better with respect to error handling, but there
      are still a number of issues where it's not clear what the right thing
      to do is when ext4 functions deep in the writeback codepath fail.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      5a87b7a5
    • ext4: use KMEM_CACHE instead of kmem_cache_create · 16828088
      Theodore Ts'o committed
      Also remove the SLAB_RECLAIM_ACCOUNT flag from the system zone kmem
      cache.  This slab tends to be fairly static, so it shouldn't be marked
      as likely to have free pages that can be reclaimed.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      
      16828088