1. 12 Aug, 2010 21 commits
    • mmc: add erase, secure erase, trim and secure trim operations · dfe86cba
      Adrian Hunter committed
      SD/MMC cards tend to support an erase operation.  In addition, eMMC v4.4
      cards can support secure erase, trim and secure trim operations that are
      all variants of the basic erase command.
      
      SD/MMC device attributes "erase_size" and "preferred_erase_size" have been
      added.
      
      "erase_size" is the minimum size, in bytes, of an erase operation.  For
      MMC, "erase_size" is the erase group size reported by the card.  Note that
      "erase_size" does not apply to trim or secure trim operations where the
      minimum size is always one 512 byte sector.  For SD, "erase_size" is 512
      if the card is block-addressed, 0 otherwise.
      
      SD/MMC cards can erase an arbitrarily large area up to and
      including the whole card.  When erasing a large area it may
      be desirable to do it in smaller chunks for three reasons:
      
          1. A single erase command will make all other I/O on the card
             wait.  This is not a problem if the whole card is being erased, but
             erasing one partition will make I/O for another partition on the
             same card wait for the duration of the erase, which could be
             several minutes.
      
          2. To be able to inform the user of erase progress.
      
          3. The erase timeout becomes too large to be very useful.
             Because the erase timeout contains a margin which is multiplied by
             the size of the erase area, the value can end up being several
             minutes for large areas.
      
      "erase_size" is not the most efficient unit to erase (especially for SD
      where it is just one sector), hence "preferred_erase_size" provides a good
      chunk size for erasing large areas.
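      
      The chunking idea above can be sketched in plain userspace C.  This is
      only an illustration, not the kernel's code: erase_in_chunks(), the
      erase_fn callback type and count_erase_calls() are hypothetical names,
      and the caller is assumed to have already read a chunk size in bytes
      (for example the value of "preferred_erase_size").

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: erase `len` bytes starting at byte offset `start`.
 * Returns 0 on success, non-zero on failure. */
typedef int (*erase_fn)(uint64_t start, uint64_t len, void *ctx);

/*
 * Erase [start, start + len) in pieces of at most `chunk` bytes, so that
 * other I/O can be interleaved and progress can be reported between
 * commands. Returns the number of bytes erased before the first failure.
 * Alignment of chunk boundaries is deliberately not handled here.
 */
static uint64_t erase_in_chunks(uint64_t start, uint64_t len, uint64_t chunk,
                                erase_fn erase, void *ctx)
{
    uint64_t done = 0;

    assert(chunk > 0);
    while (done < len) {
        uint64_t n = len - done;

        if (n > chunk)
            n = chunk;
        if (erase(start + done, n, ctx) != 0)
            break;
        done += n; /* a natural point to report progress */
    }
    return done;
}

/* Example callback that only counts how many erase commands were issued. */
static int count_erase_calls(uint64_t start, uint64_t len, void *ctx)
{
    (void)start;
    (void)len;
    (*(int *)ctx)++;
    return 0;
}
```

      Each call to the callback stands for one bounded erase command, so the
      timeout per command stays proportional to the chunk size rather than to
      the whole area.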
      
      For MMC, "preferred_erase_size" is the high-capacity erase size if a card
      specifies one, otherwise it is based on the capacity of the card.
      
      For SD, "preferred_erase_size" is the allocation unit size specified by
      the card.
      
      "preferred_erase_size" is in bytes.
      
      Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Cc: Kyungmin Park <kmpark@infradead.org>
      Cc: Madhusudhan Chikkature <madhu.cr@ti.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Ben Gardiner <bengardiner@nanometrics.ca>
      Cc: <linux-mmc@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix writeback_in_progress() · 81d73a32
      Jan Kara committed
      Commit 83ba7b07 ("writeback: simplify the write back thread queue")
      broke writeback_in_progress() as in that commit we started to remove work
      items from the list at the moment we start working on them and not at the
      moment they are finished.  Thus if the flusher thread was doing some work
      but there was no other work queued, writeback_in_progress() returned
      false.  This could in particular cause unnecessary queueing of background
      writeback from balance_dirty_pages() or writeout work from
      writeback_inodes_sb_if_idle().
      
      This patch fixes the problem by introducing a bit in the bdi state which
      indicates that the flusher thread is processing some work and uses this
      bit for writeback_in_progress() test.
      
      NOTE: Both callsites of writeback_in_progress() (namely,
      writeback_inodes_sb_if_idle() and balance_dirty_pages()) would actually
      need different information from what writeback_in_progress() provides.
      They would need to know whether *the kind of writeback they are going to
      submit* is already queued.  But this information isn't that simple to
      provide, so let's fix writeback_in_progress() for the time being.
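      
      The difference between the two tests can be shown with a toy userspace
      model.  All names here (struct toy_bdi, flusher_start() and so on) are
      made up for illustration and do not correspond to the kernel's data
      structures; the model only assumes that a work item leaves the queue
      when processing starts and that a separate flag covers the processing
      time.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a backing device's flusher state; names are illustrative. */
struct toy_bdi {
    int queued;   /* work items still waiting on the list */
    bool running; /* set while the flusher processes an item */
};

/* Old-style test: looks only at the queue, which loses an item as soon as
 * the flusher starts working on it. */
static bool in_progress_by_queue(const struct toy_bdi *bdi)
{
    return bdi->queued > 0;
}

/* Fixed-style test: consults a flag that stays set for the whole time the
 * flusher is busy. */
static bool in_progress_by_flag(const struct toy_bdi *bdi)
{
    return bdi->running;
}

/* The flusher dequeues one item and marks itself busy. */
static void flusher_start(struct toy_bdi *bdi)
{
    assert(bdi->queued > 0);
    bdi->queued--;
    bdi->running = true;
}

/* The flusher has finished its current work. */
static void flusher_finish(struct toy_bdi *bdi)
{
    bdi->running = false;
}
```

      With a single queued item, the queue-based test reports "idle" as soon
      as flusher_start() has run, while the flag-based test keeps reporting
      "busy" until flusher_finish().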
      
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Acked-by: Jens Axboe <jaxboe@fusionio.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: merge for_kupdate and !for_kupdate cases · a50aeb40
      Wu Fengguang committed
      Unify the logic for kupdate and non-kupdate cases.  There won't be
      starvation because the inodes requeued into b_more_io will later be
      spliced _after_ the remaining inodes in b_io, hence won't stand in the way
      of other inodes in the next run.
      
      It avoids unnecessary redirty_tail() calls, hence the update of
      i_dirtied_when.  The timestamp update is undesirable because it could
      later delay the inode's periodic writeback, or may exclude the inode from
      the data integrity sync operation (which checks timestamp to avoid extra
      work and livelock).
      
      ===
      How the redirty_tail() comes about:
      
      It is a long story.  This redirty_tail() was introduced with
      wbc.more_io.  The initial patch for more_io did not actually have the
      redirty_tail(), and when it was merged, several 100% iowait bug reports
      arose:
      
      reiserfs:
              http://lkml.org/lkml/2007/10/23/93
      
      jfs:
              commit 29a424f2
              JFS: clear PAGECACHE_TAG_DIRTY for no-write pages
      
      ext2:
              http://www.spinics.net/linux/lists/linux-ext4/msg04762.html
      
      They are all old bugs hidden in various filesystems that became "visible"
      with the more_io patch.  At the time, the ext2 bug was thought to be
      "trivial", so it was not fixed.  Instead the following updated more_io
      patch with redirty_tail() was merged:
      
      	http://www.spinics.net/linux/lists/linux-ext4/msg04507.html
      
      This will in general prevent 100% iowait on ext2 and possibly other
      unknown FS bugs.
      
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: fix queue_io() ordering · 4ea879b9
      Wu Fengguang committed
      This was not a bug, since b_io is empty for kupdate writeback.  The next
      patch will do requeue_io() for non-kupdate writeback, so let's fix it.
      
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: don't redirty tail an inode with dirty pages · 23539afc
      Wu Fengguang committed
      Avoid delaying writeback for an expired inode with lots of dirty pages, but
      no active dirtier at the moment.  Previously we only did that for the
      kupdate case.
      
      Any filesystem that does delayed allocation or unwritten extent conversion
      after IO completion will cause this - for example, XFS.
      
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Acked-by: Jan Kara <jack@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: add comment to the dirty limit functions · 1babe183
      Wu Fengguang committed
      Document global_dirty_limits() and bdi_dirty_limit().
      
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: avoid unnecessary calculation of bdi dirty thresholds · 16c4042f
      Wu Fengguang committed
      Split get_dirty_limits() into global_dirty_limits()+bdi_dirty_limit(), so
      that the latter can be avoided when under global dirty background
      threshold (which is the normal state for most systems).
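      
      A rough userspace sketch of why the split helps: once the cheap global
      limits are available on their own, the per-device calculation can be
      skipped whenever the system is under the background threshold.  The
      function names, the 10%/20% ratios and the share_percent parameter are
      all assumptions for illustration, not the kernel's actual computation.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_limits {
    unsigned long background; /* start background writeback above this */
    unsigned long dirty;      /* throttle writers above this */
};

/* Counts how often the (notionally expensive) per-device step runs. */
static int bdi_limit_calls;

/* Cheap global step; the 10% and 20% ratios are illustrative only. */
static struct toy_limits toy_global_limits(unsigned long dirtyable_pages)
{
    struct toy_limits l;

    l.background = dirtyable_pages / 10;
    l.dirty = dirtyable_pages / 5;
    return l;
}

/* Per-device step: this device's share of the global dirty threshold. */
static unsigned long toy_bdi_limit(unsigned long dirty_thresh,
                                   unsigned int share_percent)
{
    bdi_limit_calls++;
    return dirty_thresh * share_percent / 100;
}

/* Returns true if the writer should be throttled. */
static bool toy_should_throttle(unsigned long dirtyable_pages,
                                unsigned long nr_dirty,
                                unsigned long bdi_dirty,
                                unsigned int share_percent)
{
    struct toy_limits l = toy_global_limits(dirtyable_pages);

    /* Common case: under the background threshold, so the per-device
     * limit is never computed. */
    if (nr_dirty <= l.background)
        return false;

    return bdi_dirty >= toy_bdi_limit(l.dirty, share_percent);
}
```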
      
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: balance_dirty_pages(): reduce calls to global_page_state · e50e3720
      Wu Fengguang committed
      Reducing the number of times balance_dirty_pages calls global_page_state
      reduces the cache references and so improves write performance on a
      variety of workloads.
      
      'perf stats' of simple fio write tests shows the reduction in cache
      access.  The test is fio 'write,mmap,600Mb,pre_read' on AMD AthlonX2
      with 3Gb memory (dirty_threshold approx 600 Mb), running each test 10
      times, dropping the fastest & slowest values, then taking the average &
      standard deviation:
      
      		average (s.d.) in millions (10^6)
      2.6.31-rc8	648.6 (14.6)
      +patch		620.1 (16.5)
      
      The reduction is achieved by dropping clip_bdi_dirty_limit(), which
      rereads the counters to apply the dirty_threshold, and moving this check
      up into balance_dirty_pages(), where the counters have already been read.
      
      Also, rearranging the for loop to contain only one copy of the limit tests
      allows the pdflush test after the loop to use the local copies of the
      counters rather than rereading them.
      
      In the common case with no throttling it now calls global_page_state 5
      fewer times and bdi_stat 2 fewer.
      
      Fengguang:
      
      This patch slightly changes behavior by replacing clip_bdi_dirty_limit()
      with the explicit check (nr_reclaimable + nr_writeback >= dirty_thresh) to
      avoid exceeding the dirty limit.  Since the bdi dirty limit is mostly
      accurate we don't need to clip routinely.  A simple dirty limit check
      would be enough.
      
      The check is necessary because, in principle we should throttle everything
      calling balance_dirty_pages() when we're over the total limit, as said by
      Peter.
      
      We now set and clear dirty_exceeded not only based on bdi dirty limits,
      but also on the global dirty limit.  The global limit check is added in
      place of clip_bdi_dirty_limit() for safety and not intended as a behavior
      change.  The bdi limits should be tight enough to keep all dirty pages
      under the global limit most of the time; occasionally exceeding it by a
      small amount should be OK though.  The change makes the logic more
      obvious: the global limit is the ultimate goal and shall always be
      imposed.
      
      We may now start background writeback work based on outdated conditions.
      That's safe because the bdi flush thread will (and has to) double check
      the states.  It reduces overall overhead because the test based on old
      states still has a good chance of being right.
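      
      The combined test described in this message, with dirty_exceeded set
      when either the per-device limit or the global limit is reached, can be
      written as a small standalone function.  This is a sketch built from the
      explicit check quoted above; toy_dirty_exceeded() is a hypothetical name
      and the parameters simply stand for counters that the caller has already
      read once.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether the dirty limits are exceeded, using counter values the
 * caller has already sampled (so nothing is reread here).
 */
static bool toy_dirty_exceeded(unsigned long bdi_nr_reclaimable,
                               unsigned long bdi_nr_writeback,
                               unsigned long bdi_thresh,
                               unsigned long nr_reclaimable,
                               unsigned long nr_writeback,
                               unsigned long dirty_thresh)
{
    /* Per-device limit. */
    bool over_bdi = bdi_nr_reclaimable + bdi_nr_writeback >= bdi_thresh;
    /* Global limit: the explicit check that replaces the clipping step. */
    bool over_global = nr_reclaimable + nr_writeback >= dirty_thresh;

    return over_bdi || over_global;
}
```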
      
      [akpm@linux-foundation.org] fix uninitialized dirty_exceeded
      Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • parisc: fix wrong page aligned size calculation in ioremapping code · a292dfa0
      Florian Zumbiehl committed
      parisc __ioremap(): fix off-by-one error in page alignment of allocation
      size for sizes where size%PAGE_SIZE==1.
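      
      One way such an off-by-one can arise is by rounding up the address of
      the last byte instead of the address one past it.  The sketch below
      shows that pattern in userspace C; it is an assumption about the shape
      of the bug, not a copy of the parisc code, and the TOY_ macros and
      map_len_*() helpers are hypothetical.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_PAGE_MASK (~(TOY_PAGE_SIZE - 1))
/* Round x up to the next page boundary (x itself if already aligned). */
#define TOY_PAGE_ALIGN(x) (((x) + TOY_PAGE_SIZE - 1) & TOY_PAGE_MASK)

/* Buggy variant: aligns the address of the last byte. When that address is
 * itself page aligned (size % PAGE_SIZE == 1 for an aligned start), the
 * final page is left out. */
static unsigned long map_len_buggy(unsigned long phys, unsigned long size)
{
    unsigned long last = phys + size - 1;
    unsigned long base = phys & TOY_PAGE_MASK;

    return TOY_PAGE_ALIGN(last) - base;
}

/* Fixed variant: aligns the address one past the last byte. */
static unsigned long map_len_fixed(unsigned long phys, unsigned long size)
{
    unsigned long last = phys + size - 1;
    unsigned long base = phys & TOY_PAGE_MASK;

    return TOY_PAGE_ALIGN(last + 1) - base;
}
```

      For a page-aligned start and a size of 4097 bytes, the buggy variant
      yields one page where two are needed; for sizes that are an exact
      multiple of the page size both variants agree.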
      
      Signed-off-by: Florian Zumbiehl <florz@florz.de>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Acked-by: Helge Deller <deller@gmx.de>
      Tested-by: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • score: fix dereference of NULL pointer in local_flush_tlb_page() · 17e46503
      Roel Kluin committed
      Don't dereference vma if it's NULL.
      
      Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pc8736x_gpio: depends on X86_32 · 7b958090
      Randy Dunlap committed
      Fix kconfig dependency warning for PC8736x_GPIO by restricting it to
      X86_32.
      
        warning: (SCx200_GPIO && SCx200 || PC8736x_GPIO && X86) selects NSC_GPIO which has unmet direct dependencies (X86_32)
      
      NSC_GPIO is X86_32 only.  The other driver (SCx200_GPIO) that selects
      NSC_GPIO is X86_32 only (indirectly, since SCx200 depends on X86_32), so
      limit this driver also.
      
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Jordan Crouse <jordan.crouse@amd.com>
      Cc: Jim Cromie <jim.cromie@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix fatal kernel-doc error · 3c111a07
      Randy Dunlap committed
      Fix a fatal kernel-doc error due to a #define coming between a function's
      kernel-doc notation and the function signature.  (kernel-doc cannot handle
      this)
      
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • acpi: fix bogus preemption logic · 0a7992c9
      Thomas Gleixner committed
      The ACPI_PREEMPTION_POINT() logic was introduced in commit 8bd108d1
      (ACPICA: add preemption point after each opcode parse).  The follow up
      commits abe1dfab, 138d1569, c084ca70 tried to fix the preemption logic
      back and forth, but nobody noticed that the usage of
      in_atomic_preempt_off() in that context is wrong.
      
      The check which guards the call of cond_resched() is:
      
          if (!in_atomic_preempt_off() && !irqs_disabled())
      
      in_atomic_preempt_off() is not intended for general use as the comment
      above the macro definition clearly says:
      
       * Check whether we were atomic before we did preempt_disable():
       * (used by the scheduler, *after* releasing the kernel lock)
      
      On a CONFIG_PREEMPT=n kernel the usage of in_atomic_preempt_off() works by
      accident, but with CONFIG_PREEMPT=y it's just broken.
      
      The whole purpose of the ACPI_PREEMPTION_POINT() is to reduce the latency
      on a CONFIG_PREEMPT=n kernel, so make ACPI_PREEMPTION_POINT() depend on
      CONFIG_PREEMPT=n and remove the in_atomic_preempt_off() check.
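      
      The resulting shape of the macro can be sketched in userspace with
      stubs.  Here irqs_disabled() and cond_resched() are stand-in functions
      that only mimic the kernel names, and TOY_PREEMPTION_POINT() and
      toy_parse_opcodes() are hypothetical; the point is only that the
      voluntary reschedule exists when CONFIG_PREEMPT is not defined and is
      guarded by the interrupt state alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel primitives (assumptions). */
static bool toy_irqs_off;
static int toy_resched_calls;

static bool irqs_disabled(void) { return toy_irqs_off; }
static void cond_resched(void) { toy_resched_calls++; }

#ifndef CONFIG_PREEMPT
/* Non-preemptible build: offer a voluntary reschedule point, guarded only
 * by the interrupt state. */
#define TOY_PREEMPTION_POINT() \
    do { \
        if (!irqs_disabled()) \
            cond_resched(); \
    } while (0)
#else
/* Preemptible build: the scheduler can preempt anyway, so do nothing. */
#define TOY_PREEMPTION_POINT() do { } while (0)
#endif

/* Stand-in for an interpreter loop with one preemption point per opcode. */
static void toy_parse_opcodes(int n)
{
    for (int i = 0; i < n; i++)
        TOY_PREEMPTION_POINT();
}
```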
      
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16210
      
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Francois Valenduc <francois.valenduc@tvcablenet.be>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/kfifo.c: add handling of chained scatterlists · d78a3eda
      Stefani Seibold committed
      The current kfifo scatterlist implementation will not work with chained
      scatterlists.  It assumes that struct scatterlist arrays are allocated
      contiguously, which is not the case when chained scatterlists (struct
      sg_table) are in use.
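      
      Why plain array indexing fails on a chained list can be seen in a toy
      model.  The struct and helpers below are hypothetical simplifications
      (the kernel encodes chain links differently and offers sg_next() for
      walking); the model only assumes that one slot of an array may be a
      link to another array rather than a data entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy scatterlist entry; a non-NULL `chain` marks a link slot, not data. */
struct toy_sg {
    unsigned int len;     /* bytes described by this entry */
    struct toy_sg *chain; /* if set, continue in this other array */
    bool is_last;         /* final data entry of the whole list */
};

/* Step to the next data entry, following a chain link if one is found.
 * Simply incrementing the pointer would treat the link slot as data. */
static struct toy_sg *toy_sg_next(struct toy_sg *sg)
{
    if (sg->is_last)
        return NULL;
    sg++;
    if (sg->chain != NULL)
        sg = sg->chain;
    return sg;
}

/* Sum the lengths of all data entries, walking across chained arrays. */
static unsigned int toy_total_len(struct toy_sg *first)
{
    unsigned int total = 0;

    for (struct toy_sg *sg = first; sg != NULL; sg = toy_sg_next(sg))
        total += sg->len;
    return total;
}
```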
      
      Signed-off-by: Stefani Seibold <stefani@seibold.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 · 5af568cb
      Linus Torvalds committed
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
        isofs: Fix lseek() to position beyond 4 GB
        vfs: remove unused MNT_STRICTATIME
        vfs: show unreachable paths in getcwd and proc
        vfs: only add " (deleted)" where necessary
        vfs: add prepend_path() helper
        vfs: __d_path: dont prepend the name of the root dentry
        ia64: perfmon: add d_dname method
        vfs: add helpers to get root and pwd
        cachefiles: use path_get instead of lone dget
        fs/sysv/super.c: add support for non-PDP11 v7 filesystems
        V7: Adjust sanity checks for some volumes
        Add v7 alias
        v9fs: fixup for inode_setattr being removed
      
      Manual merge to take Al's version of the fs/sysv/super.c file: it merged
      cleanly, but Al had removed an unnecessary header include, so his side
      was better.
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus · 062e27ec
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
        Squashfs: fix checkpatch.pl warnings
        Squashfs: fix filename typo
        Squashfs: update Kconfig and documentation for LZO
        Squashfs: fix block size use in LZO decompressor
        Squashfs: Add LZO compression support
        squashfs: fix filename in header comment
        Squashfs: Make XATTR config name consistent with other file systems
        squashfs: fix compiler inline warning
    • Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd · bf25db36
      Linus Torvalds committed
      * 'for-linus' of git://git.open-osd.org/linux-open-osd:
        exofs: Fix groups code when num_devices is not divisible by group_width
        exofs: Remove useless optimization
        exofs: exofs_file_fsync and exofs_file_flush correctness
        exofs: Remove superfluous dependency on buffer_head and writeback
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 682c30ed
      Linus Torvalds committed
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (39 commits)
        ceph: generalize mon requests, add pool op support
        ceph: only queue async writeback on cap revocation if there is dirty data
        ceph: do not ignore osd_idle_ttl mount option
        ceph: constify dentry_operations
        ceph: whitespace cleanup
        ceph: add flock/fcntl lock support
        ceph: define on-wire types, constants for file locking support
        ceph: add CEPH_FEATURE_FLOCK to the supported feature bits
        ceph: support v2 reconnect encoding
        ceph: support v2 client_caps encoding
        ceph: move AES iv definition to shared header
        ceph: fix decoding of pool snap info
        ceph: make ->sync_fs not wait if wait==0
        ceph: warn on missing snap realm
        ceph: print useful error message when crush rule not found
        ceph: use %pU to print uuid (fsid)
        ceph: sync header defs with server code
        ceph: clean up header guards
        ceph: strip misleading/obsolete version, feature info
        ceph: specify supported features in super.h
        ...
    • Merge branch 'msm-video' of git://codeaurora.org/quic/kernel/dwalker/linux-msm · 84479f3c
      Linus Torvalds committed
      * 'msm-video' of git://codeaurora.org/quic/kernel/dwalker/linux-msm:
        video: msm: Fix section mismatch in mddi.c.
        drivers: video: msm: drop some unused variables
    • Merge branch 'ixp4xx' of git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux-2.6 · 946880fa
      Linus Torvalds committed
      * 'ixp4xx' of git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux-2.6:
        IXP4xx: Fix LL debugging on little-endian CPU.
        IXP4xx: Fix sparse warnings in I/O primitives.
        IXP4xx: Make mdio_bus struct static in the Ethernet driver.
        IXP4xx: Fix ixp4xx_crypto little-endian operation.
        IXP4xx: Prevent HSS transmitter lockup by disabling FRaMe signals.
        ixp4xx/vulcan: add PCI support
        ixp4xx: base support for Arcom Vulcan
    • Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm · 636d1742
      Linus Torvalds committed
      * 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (226 commits)
        ARM: 6323/1: cam60: don't use __init for cam60_spi_{flash_platform_data,partitions}
        ARM: 6324/1: cam60: move cam60_spi_devices to .init.data
        ARM: 6322/1: imx/pca100: Fix name of spi platform data
        ARM: 6321/1: fix syntax error in main Kconfig file
        ARM: 6297/1: move U300 timer to dynamic clock lookup
        ARM: 6296/1: clock U300 intcon and timer properly
        ARM: 6295/1: fix U300 apb_pclk split
        ARM: 6306/1: fix inverted MMC card detect in U300
        ARM: 6299/1: errata: TLBIASIDIS and TLBIMVAIS operations can broadcast a faulty ASID
        ARM: 6294/1: etm: do a dummy read from OSSRR during initialization
        ARM: 6292/1: coresight: add ETM management registers
        ARM: 6288/1: ftrace: document mcount formats
        ARM: 6287/1: ftrace: clean up mcount assembly indentation
        ARM: 6286/1: fix Thumb-2 decompressor broken by "Auto calculate ZRELADDR"
        ARM: 6281/1: video/imxfb.c: allow usage without BACKLIGHT_CLASS_DEVICE
        ARM: 6280/1: imx: Fix build failure when including <mach/gpio.h> without <linux/spinlock.h>
        ARM: S5PV210: Fix on missing s3c-sdhci card detection method for hsmmc3
        ARM: S5P: Fix on missing S5P_DEV_FIMC in plat-s5p/Kconfig
        ARM: S5PV210: Override FIMC driver name on Aquila board
        ARM: S5PC100: enable FIMC on SMDKC100
        ...
      
      Fix up conflicts in arch/arm/mach-{s5pc100,s5pv210}/cpu.c due to
      different subsystem 'setname' calls, and trivial port types in
      include/linux/serial_core.h
  2. 11 Aug, 2010 19 commits