1. 10 9月, 2010 8 次提交
    • T
      block: implement REQ_FLUSH/FUA based interface for FLUSH/FUA requests · 4fed947c
      Tejun Heo 提交于
      Now that the backend conversion is complete, export sequenced
      FLUSH/FUA capability through REQ_FLUSH/FUA flags.  REQ_FLUSH means the
      device cache should be flushed before executing the request.  REQ_FUA
      means that the data in the request should be on non-volatile media on
      completion.
      
      Block layer will choose the correct way of implementing the semantics
      and execute it.  The request may be passed to the device directly if
      the device can handle it; otherwise, it will be sequenced using one or
      more proxy requests.  Devices will never see REQ_FLUSH and/or FUA
      which it doesn't support.
      
      Also, unlike the original REQ_HARDBARRIER, REQ_FLUSH/FUA requests are
      never failed with -EOPNOTSUPP.  If the underlying device doesn't
      support FLUSH/FUA, the block layer simply make those noop.  IOW, it no
      longer distinguishes between writeback cache which doesn't support
      cache flush and writethrough/no cache.  Devices which have WB cache
      w/o flush are very difficult to come by these days and there's nothing
      much we can do anyway, so it doesn't make sense to require everyone to
      implement -EOPNOTSUPP handling.  This will simplify filesystems and
      block drivers as they can drop -EOPNOTSUPP retry logic for barriers.
      
      * QUEUE_ORDERED_* are removed and QUEUE_FSEQ_* are moved into
        blk-flush.c.
      
      * REQ_FLUSH w/o data can also be directly passed to drivers without
        sequencing but some drivers assume that zero length requests don't
        have rq->bio which isn't true for these requests requiring the use
        of proxy requests.
      
      * REQ_COMMON_MASK now includes REQ_FLUSH | REQ_FUA so that they are
        copied from bio to request.
      
      * WRITE_BARRIER is marked deprecated and WRITE_FLUSH, WRITE_FUA and
        WRITE_FLUSH_FUA are added.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      4fed947c
    • T
      block: rename barrier/ordered to flush · dd4c133f
      Tejun Heo 提交于
      With ordering requirements dropped, barrier and ordered are misnomers.
      Now all block layer does is sequencing FLUSH and FUA.  Rename them to
      flush.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      dd4c133f
    • T
      block: drop barrier ordering by queue draining · 28e7d184
      Tejun Heo 提交于
      Filesystems will take all the responsibilities for ordering requests
      around commit writes and will only indicate how the commit writes
      themselves should be handled by block layers.  This patch drops
      barrier ordering by queue draining from block layer.  Ordering by
      draining implementation was somewhat invasive to request handling.
      List of notable changes follow.
      
      * Each queue has 1 bit color which is flipped on each barrier issue.
        This is used to track whether a given request is issued before the
        current barrier or not.  REQ_ORDERED_COLOR flag and coloring
        implementation in __elv_add_request() are removed.
      
      * Requests which shouldn't be processed yet for draining were stalled
        by returning -EAGAIN from blk_do_ordered() according to the test
        result between blk_ordered_req_seq() and blk_blk_ordered_cur_seq().
        This logic is removed.
      
      * Draining completion logic in elv_completed_request() removed.
      
      * All barrier sequence requests were queued to request queue and then
        trckled to lower layer according to progress and thus maintaining
        request orders during requeue was necessary.  This is replaced by
        queueing the next request in the barrier sequence only after the
        current one is complete from blk_ordered_complete_seq(), which
        removes the need for multiple proxy requests in struct request_queue
        and the request sorting logic in the ELEVATOR_INSERT_REQUEUE path of
        elv_insert().
      
      * As barriers no longer have ordering constraints, there's no need to
        dump the whole elevator onto the dispatch queue on each barrier.
        Insert barriers at the front instead.
      
      * If other barrier requests come to the front of the dispatch queue
        while one is already in progress, they are stored in
        q->pending_barriers and restored to dispatch queue one-by-one after
        each barrier completion from blk_ordered_complete_seq().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      28e7d184
    • T
      block: misc cleanups in barrier code · dd831006
      Tejun Heo 提交于
      Make the following cleanups in preparation of barrier/flush update.
      
      * blk_do_ordered() declaration is moved from include/linux/blkdev.h to
        block/blk.h.
      
      * blk_do_ordered() now returns pointer to struct request, with %NULL
        meaning "try the next request" and ERR_PTR(-EAGAIN) "try again
        later".  The third case will be dropped with further changes.
      
      * In the initialization of proxy barrier request, data direction is
        already set by init_request_from_bio().  Drop unnecessary explicit
        REQ_WRITE setting and move init_request_from_bio() above REQ_FUA
        flag setting.
      
      * add_request() is collapsed into __make_request().
      
      These changes don't make any functional difference.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      dd831006
    • T
      block: remove spurious uses of REQ_HARDBARRIER · 9cbbdca4
      Tejun Heo 提交于
      REQ_HARDBARRIER is deprecated.  Remove spurious uses in the following
      users.  Please note that other than osdblk, all other uses were
      already spurious before deprecation.
      
      * osdblk: osdblk_rq_fn() won't receive any request with
        REQ_HARDBARRIER set.  Remove the test for it.
      
      * pktcdvd: use of REQ_HARDBARRIER in pkt_generic_packet() doesn't mean
        anything.  Removed.
      
      * aic7xxx_old: Setting MSG_ORDERED_Q_TAG on REQ_HARDBARRIER is
        spurious.  Removed.
      
      * sas_scsi_host: Setting TASK_ATTR_ORDERED on REQ_HARDBARRIER is
        spurious.  Removed.
      
      * scsi_tcq: The ordered tag path wasn't being used anyway.  Removed.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NBoaz Harrosh <bharrosh@panasas.com>
      Cc: James Bottomley <James.Bottomley@suse.de>
      Cc: Peter Osterlund <petero2@telia.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      9cbbdca4
    • T
      block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() · 4913efe4
      Tejun Heo 提交于
      Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
      requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
      -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
      blk_queue_flush().
      
      blk_queue_flush() takes combinations of REQ_FLUSH and FUA.  If a
      device has write cache and can flush it, it should set REQ_FLUSH.  If
      the device can handle FUA writes, it should also set REQ_FUA.
      
      All blk_queue_ordered() users are converted.
      
      * ORDERED_DRAIN is mapped to 0 which is the default value.
      * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
      * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NBoaz Harrosh <bharrosh@panasas.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      4913efe4
    • T
      block: kill QUEUE_ORDERED_BY_TAG · 6958f145
      Tejun Heo 提交于
      Nobody is making meaningful use of ORDERED_BY_TAG now and queue
      draining for barrier requests will be removed soon which will render
      the advantage of tag ordering moot.  Kill ORDERED_BY_TAG.  The
      following users are affected.
      
      * brd: converted to ORDERED_DRAIN.
      * virtio_blk: ORDERED_TAG path was already marked deprecated.  Removed.
      * xen-blkfront: ORDERED_TAG case dropped.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      6958f145
    • T
      ide: remove unnecessary blk_queue_flushing() test in do_ide_request() · 0da2f509
      Tejun Heo 提交于
      Unplugging from a request function doesn't really help much (it's
      already in the request_fn) and soon block layer will be updated to mix
      barrier sequence with other commands, so there's no need to treat
      queue flushing any differently.
      
      ide was the only user of blk_queue_flushing().  Remove it.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      0da2f509
  2. 22 8月, 2010 1 次提交
    • A
      workqueue: Add basic tracepoints to track workqueue execution · e36c886a
      Arjan van de Ven 提交于
      With the introduction of the new unified work queue thread pools,
      we lost one feature: It's no longer possible to know which worker
      is causing the CPU to wake out of idle. The result is that PowerTOP
      now reports a lot of "kworker/a:b" instead of more readable results.
      
      This patch adds a pair of tracepoints to the new workqueue code,
      similar in style to the timer/hrtimer tracepoints.
      
      With this pair of tracepoints, the next PowerTOP can correctly
      report which work item caused the wakeup (and how long it took):
      
      Interrupt (43)            i915      time   3.51ms    wakeups 141
      Work      ieee80211_iface_work      time   0.81ms    wakeups  29
      Work              do_dbs_timer      time   0.55ms    wakeups  24
      Process                   Xorg      time  21.36ms    wakeups   4
      Timer    sched_rt_period_timer      time   0.01ms    wakeups   1
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e36c886a
  3. 21 8月, 2010 2 次提交
  4. 19 8月, 2010 1 次提交
  5. 18 8月, 2010 11 次提交
    • J
      ALSA: emu10k1 - delay the PCM interrupts (add pcm_irq_delay parameter) · 56385a12
      Jaroslav Kysela 提交于
      With some hardware combinations, the PCM interrupts are acknowledged
      before the period boundary from the emu10k1 chip. The midlevel PCM code
      gets confused and the playback stream is interrupted.
      
      It seems that the interrupt processing shift by 2 samples is enough
      to fix this issue. This default value does not harm other,
      non-affected hardware.
      
      More information: Kernel bugzilla bug#16300
      
      [A copmile warning fixed by tiwai]
      Signed-off-by: NJaroslav Kysela <perex@perex.cz>
      Cc: <stable@kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      56385a12
    • N
      fs: scale files_lock · 6416ccb7
      Nick Piggin 提交于
      fs: scale files_lock
      
      Improve scalability of files_lock by adding per-cpu, per-sb files lists,
      protected with an lglock. The lglock provides fast access to the per-cpu lists
      to add and remove files. It also provides a snapshot of all the per-cpu lists
      (although this is very slow).
      
      One difficulty with this approach is that a file can be removed from the list
      by another CPU. We must track which per-cpu list the file is on with a new
      variale in the file struct (packed into a hole on 64-bit archs). Scalability
      could suffer if files are frequently removed from different cpu's list.
      
      However loads with frequent removal of files imply short interval between
      adding and removing the files, and the scheduler attempts to avoid moving
      processes too far away. Also, even in the case of cross-CPU removal, the
      hardware has much more opportunity to parallelise cacheline transfers with N
      cachelines than with 1.
      
      A worst-case test of 1 CPU allocating files subsequently being freed by N CPUs
      degenerates to contending on a single lock, which is no worse than before. When
      more than one CPU are allocating files, even if they are always freed by
      different CPUs, there will be more parallelism than the single-lock case.
      
      Testing results:
      
      On a 2 socket, 8 core opteron, I measure the number of times the lock is taken
      to remove the file, the number of times it is removed by the same CPU that
      added it, and the number of times it is removed by the same node that added it.
      
      Booting:    locks=  25049 cpu-hits=  23174 (92.5%) node-hits=  23945 (95.6%)
      kbuild -j16 locks=2281913 cpu-hits=2208126 (96.8%) node-hits=2252674 (98.7%)
      dbench 64   locks=4306582 cpu-hits=4287247 (99.6%) node-hits=4299527 (99.8%)
      
      So a file is removed from the same CPU it was added by over 90% of the time.
      It remains within the same node 95% of the time.
      
      Tim Chen ran some numbers for a 64 thread Nehalem system performing a compile.
      
                      throughput
      2.6.34-rc2      24.5
      +patch          24.9
      
                      us      sys     idle    IO wait (in %)
      2.6.34-rc2      51.25   28.25   17.25   3.25
      +patch          53.75   18.5    19      8.75
      
      So significantly less CPU time spent in kernel code, higher idle time and
      slightly higher throughput.
      
      Single threaded performance difference was within the noise of microbenchmarks.
      That is not to say penalty does not exist, the code is larger and more memory
      accesses required so it will be slightly slower.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6416ccb7
    • N
      lglock: introduce special lglock and brlock spin locks · 2dc91abe
      Nick Piggin 提交于
      lglock: introduce special lglock and brlock spin locks
      
      This patch introduces "local-global" locks (lglocks). These can be used to:
      
      - Provide fast exclusive access to per-CPU data, with exclusive access to
        another CPU's data allowed but possibly subject to contention, and to provide
        very slow exclusive access to all per-CPU data.
      - Or to provide very fast and scalable read serialisation, and to provide
        very slow exclusive serialisation of data (not necessarily per-CPU data).
      
      Brlocks are also implemented as a short-hand notation for the latter use
      case.
      
      Thanks to Paul for local/global naming convention.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      2dc91abe
    • N
      tty: fix fu_list abuse · d996b62a
      Nick Piggin 提交于
      tty: fix fu_list abuse
      
      tty code abuses fu_list, which causes a bug in remount,ro handling.
      
      If a tty device node is opened on a filesystem, then the last link to the inode
      removed, the filesystem will be allowed to be remounted readonly. This is
      because fs_may_remount_ro does not find the 0 link tty inode on the file sb
      list (because the tty code incorrectly removed it to use for its own purpose).
      This can result in a filesystem with errors after it is marked "clean".
      
      Taking idea from Christoph's initial patch, allocate a tty private struct
      at file->private_data and put our required list fields in there, linking
      file and tty. This makes tty nodes behave the same way as other device nodes
      and avoid meddling with the vfs, and avoids this bug.
      
      The error handling is not trivial in the tty code, so for this bugfix, I take
      the simple approach of using __GFP_NOFAIL and don't worry about memory errors.
      This is not a problem because our allocator doesn't fail small allocs as a rule
      anyway. So proper error handling is left as an exercise for tty hackers.
      
      [ Arguably filesystem's device inode would ideally be divorced from the
      driver's pseudo inode when it is opened, but in practice it's not clear whether
      that will ever be worth implementing. ]
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d996b62a
    • N
      fs: cleanup files_lock locking · ee2ffa0d
      Nick Piggin 提交于
      fs: cleanup files_lock locking
      
      Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
      manipulate the per-sb files list; unexport the files_lock spinlock.
      
      Cc: linux-kernel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Acked-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      ee2ffa0d
    • N
      fs: fs_struct rwlock to spinlock · 2a4419b5
      Nick Piggin 提交于
      fs: fs_struct rwlock to spinlock
      
      struct fs_struct.lock is an rwlock with the read-side used to protect root and
      pwd members while taking references to them. Taking a reference to a path
      typically requires just 2 atomic ops, so the critical section is very small.
      Parallel read-side operations would have cacheline contention on the lock, the
      dentry, and the vfsmount cachelines, so the rwlock is unlikely to ever give a
      real parallelism increase.
      
      Replace it with a spinlock to avoid one or two atomic operations in typical
      path lookup fastpath.
      Signed-off-by: NNick Piggin <npiggin@kernel.dk>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      2a4419b5
    • C
      remove SWRITE* I/O types · 9cb569d6
      Christoph Hellwig 提交于
      These flags aren't real I/O types, but tell ll_rw_block to always
      lock the buffer instead of giving up on a failed trylock.
      
      Instead add a new write_dirty_buffer helper that implements this semantic
      and use it from the existing SWRITE* callers.  Note that the ll_rw_block
      code had a bug where it didn't promote WRITE_SYNC_PLUG properly, which
      this patch fixes.
      
      In the ufs code clean up the helper that used to call ll_rw_block
      to mirror sync_dirty_buffer, which is the function it implements for
      compound buffers.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      9cb569d6
    • C
      kill BH_Ordered flag · 87e99511
      Christoph Hellwig 提交于
      Instead of abusing a buffer_head flag just add a variant of
      sync_dirty_buffer which allows passing the exact type of write
      flag required.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      87e99511
    • E
      spi.h: missing kernel-doc notation, please fix · 5c79a5ae
      Ernst Schwab 提交于
      Added comments in kernel-doc notation for previously added struct fields.
      Signed-off-by: NErnst Schwab <eschwab@online.de>
      Acked-by: NRandy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
      5c79a5ae
    • D
      Make do_execve() take a const filename pointer · d7627467
      David Howells 提交于
      Make do_execve() take a const filename pointer so that kernel_execve() compiles
      correctly on ARM:
      
      arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type
      
      This also requires the argv and envp arguments to be consted twice, once for
      the pointer array and once for the strings the array points to.  This is
      because do_execve() passes a pointer to the filename (now const) to
      copy_strings_kernel().  A simpler alternative would be to cast the filename
      pointer in do_execve() when it's passed to copy_strings_kernel().
      
      do_execve() may not change any of the strings it is passed as part of the argv
      or envp lists as they are some of them in .rodata, so marking these strings as
      const should be fine.
      
      Further kernel_execve() and sys_execve() need to be changed to match.
      
      This has been test built on x86_64, frv, arm and mips.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NRalf Baechle <ralf@linux-mips.org>
      Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d7627467
    • R
      VIDEO: amba clcd: don't disable an already disabled clock · 99c796df
      Russell King 提交于
      Fix the clock enable/disable tracking in the AMBA CLCD driver so
      that the driver doesn't try to disable an already disabled clock,
      thereby causing the clock (if shared) to become unbalanced.
      
      This resolves a problem with CLCD on LPC32xx ARM platforms.
      Reported-by: NKevin Wells <wellsk40@gmail.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      99c796df
  6. 15 8月, 2010 4 次提交
  7. 14 8月, 2010 3 次提交
  8. 13 8月, 2010 4 次提交
  9. 12 8月, 2010 6 次提交
    • A
      block: add secure discard · 8d57a98c
      Adrian Hunter 提交于
      Secure discard is the same as discard except that all copies of the
      discarded sectors (perhaps created by garbage collection) must also be
      erased.
      Signed-off-by: NAdrian Hunter <adrian.hunter@nokia.com>
      Acked-by: NJens Axboe <axboe@kernel.dk>
      Cc: Kyungmin Park <kmpark@infradead.org>
      Cc: Madhusudhan Chikkature <madhu.cr@ti.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Ben Gardiner <bengardiner@nanometrics.ca>
      Cc: <linux-mmc@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8d57a98c
    • A
      mmc: add erase, secure erase, trim and secure trim operations · dfe86cba
      Adrian Hunter 提交于
      SD/MMC cards tend to support an erase operation.  In addition, eMMC v4.4
      cards can support secure erase, trim and secure trim operations that are
      all variants of the basic erase command.
      
      SD/MMC device attributes "erase_size" and "preferred_erase_size" have been
      added.
      
      "erase_size" is the minimum size, in bytes, of an erase operation.  For
      MMC, "erase_size" is the erase group size reported by the card.  Note that
      "erase_size" does not apply to trim or secure trim operations where the
      minimum size is always one 512 byte sector.  For SD, "erase_size" is 512
      if the card is block-addressed, 0 otherwise.
      
      SD/MMC cards can erase an arbitrarily large area up to and
      including the whole card.  When erasing a large area it may
      be desirable to do it in smaller chunks for three reasons:
      
          1. A single erase command will make all other I/O on the card
             wait.  This is not a problem if the whole card is being erased, but
             erasing one partition will make I/O for another partition on the
             same card wait for the duration of the erase - which could be a
             several minutes.
      
          2. To be able to inform the user of erase progress.
      
          3. The erase timeout becomes too large to be very useful.
             Because the erase timeout contains a margin which is multiplied by
             the size of the erase area, the value can end up being several
             minutes for large areas.
      
      "erase_size" is not the most efficient unit to erase (especially for SD
      where it is just one sector), hence "preferred_erase_size" provides a good
      chunk size for erasing large areas.
      
      For MMC, "preferred_erase_size" is the high-capacity erase size if a card
      specifies one, otherwise it is based on the capacity of the card.
      
      For SD, "preferred_erase_size" is the allocation unit size specified by
      the card.
      
      "preferred_erase_size" is in bytes.
      Signed-off-by: NAdrian Hunter <adrian.hunter@nokia.com>
      Acked-by: NJens Axboe <axboe@kernel.dk>
      Cc: Kyungmin Park <kmpark@infradead.org>
      Cc: Madhusudhan Chikkature <madhu.cr@ti.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Ben Gardiner <bengardiner@nanometrics.ca>
      Cc: <linux-mmc@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dfe86cba
    • J
      mm: fix writeback_in_progress() · 81d73a32
      Jan Kara 提交于
      Commit 83ba7b07 ("writeback: simplify the write back thread queue")
      broke writeback_in_progress() as in that commit we started to remove work
      items from the list at the moment we start working on them and not at the
      moment they are finished.  Thus if the flusher thread was doing some work
      but there was no other work queued, writeback_in_progress() returned
      false.  This could in particular cause unnecessary queueing of background
      writeback from balance_dirty_pages() or writeout work from
      writeback_sb_if_idle().
      
      This patch fixes the problem by introducing a bit in the bdi state which
      indicates that the flusher thread is processing some work and uses this
      bit for writeback_in_progress() test.
      
      NOTE: Both callsites of writeback_in_progress() (namely,
      writeback_inodes_sb_if_idle() and balance_dirty_pages()) would actually
      need a different information than what writeback_in_progress() provides.
      They would need to know whether *the kind of writeback they are going to
      submit* is already queued.  But this information isn't that simple to
      provide so let's fix writeback_in_progress() for the time being.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Acked-by: NJens Axboe <jaxboe@fusionio.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      81d73a32
    • W
      writeback: avoid unnecessary calculation of bdi dirty thresholds · 16c4042f
      Wu Fengguang 提交于
      Split get_dirty_limits() into global_dirty_limits()+bdi_dirty_limit(), so
      that the latter can be avoided when under global dirty background
      threshold (which is the normal state for most systems).
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      16c4042f
    • T
      acpi: fix bogus preemption logic · 0a7992c9
      Thomas Gleixner 提交于
      The ACPI_PREEMPTION_POINT() logic was introduced in commit 8bd108d1
      (ACPICA: add preemption point after each opcode parse).  The follow up
      commits abe1dfab, 138d1569, c084ca70 tried to fix the preemption logic
      back and forth, but nobody noticed that the usage of
      in_atomic_preempt_off() in that context is wrong.
      
      The check which guards the call of cond_resched() is:
      
          if (!in_atomic_preempt_off() && !irqs_disabled())
      
      in_atomic_preempt_off() is not intended for general use as the comment
      above the macro definition clearly says:
      
       * Check whether we were atomic before we did preempt_disable():
       * (used by the scheduler, *after* releasing the kernel lock)
      
      On a CONFIG_PREEMPT=n kernel the usage of in_atomic_preempt_off() works by
      accident, but with CONFIG_PREEMPT=y it's just broken.
      
      The whole purpose of the ACPI_PREEMPTION_POINT() is to reduce the latency
      on a CONFIG_PREEMPT=n kernel, so make ACPI_PREEMPTION_POINT() depend on
      CONFIG_PREEMPT=n and remove the in_atomic_preempt_off() check.
      
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16210
      
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Francois Valenduc <francois.valenduc@tvcablenet.be>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0a7992c9
    • M
      mfd: Add TPS6586x driver · c6c19332
      Mike Rapoport 提交于
      Add mfd core driver for TPS6586x PMICs family.
      The driver provides I/O access for the sub-device drivers and performs
      regstration of the sub-devices based on the platform requirements.
      In addition it implements GPIOlib interface for the chip GPIOs.
      
      TODO:
              - add interrupt support
              - add platform data for PWM, backlight leds and charger
      Signed-off-by: NMike Rapoport <mike@compulab.co.il>
      Signed-off-by: NMike Rapoport <mike.rapoport@gmail.com>
      Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>
      c6c19332