1. 31 Jan, 2018 1 commit
    • blk-mq: introduce BLK_STS_DEV_RESOURCE · 86ff7c2a
      Ming Lei committed
      This status is returned from driver to block layer if device related
      resource is unavailable, but driver can guarantee that IO dispatch
      will be triggered in future when the resource is available.
      
      Convert some drivers to return BLK_STS_DEV_RESOURCE.  Also, if driver
      returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after
      a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls.  BLK_MQ_DELAY_QUEUE is
      3 ms because both scsi-mq and nvmefc are using that magic value.
      
      If a driver can make sure there is in-flight IO, it is safe to return
      BLK_STS_DEV_RESOURCE because:
      
      1) If all in-flight IOs complete before examining SCHED_RESTART in
      blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue
      is run immediately in this case by blk_mq_dispatch_rq_list();
      
      2) if there is any in-flight IO after/when examining SCHED_RESTART
      in blk_mq_dispatch_rq_list():
      - if SCHED_RESTART isn't set, queue is run immediately as handled in 1)
      - otherwise, this request will be dispatched after any in-flight IO is
        completed via blk_mq_sched_restart()
      
      3) if SCHED_RESTART is set concurrently in context because of
      BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two
      cases and make sure IO hang can be avoided.
      
      One invariant is that queue will be rerun if SCHED_RESTART is set.
      Suggested-by: Jens Axboe <axboe@kernel.dk>
      Tested-by: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
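
The rerun rules described in this commit message can be sketched as a small decision function. This is a user-space model, not the kernel code: the enum names, `pick_rerun()` and `RERUN_DELAY_MS` are hypothetical stand-ins, and the sketch leaves out any other conditions the real dispatcher may check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver return statuses named in the message. */
enum sts { STS_OK, STS_RESOURCE, STS_DEV_RESOURCE };

/* What happens to requests left over after a dispatch attempt. */
enum rerun {
	RERUN_NONE,          /* nothing left over */
	RERUN_NOW,           /* run the queue immediately */
	RERUN_DELAYED,       /* run the queue after RERUN_DELAY_MS */
	RERUN_ON_COMPLETION  /* an in-flight IO completion restarts the queue */
};

#define RERUN_DELAY_MS 3 /* the 3 ms delay mentioned in the commit message */

enum rerun pick_rerun(enum sts ret, bool sched_restart_set)
{
	if (ret == STS_OK)
		return RERUN_NONE;
	if (!sched_restart_set)
		return RERUN_NOW;          /* cases 1) and the first branch of 2) */
	if (ret == STS_RESOURCE)
		return RERUN_DELAYED;      /* case 3): delayed run avoids a stall */
	assert(ret == STS_DEV_RESOURCE);
	return RERUN_ON_COMPLETION;    /* driver guarantees a later restart */
}
```

In this model the invariant from the message holds: whenever SCHED_RESTART is set, some path (a delayed run or a completion) reruns the queue.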
  2. 26 Jan, 2018 3 commits
    • regulator: add PM suspend and resume hooks · f7efad10
      Chunyan Zhang committed
      This patch allows consumers to set a suspend voltage, which simply
      sets the "uV" in constraint::regulator_state. When the PM core calls
      regulator_suspend_late() through its callback as the system enters
      suspend, the regulator device then performs its suspend activity.
      
      It assumes that if any consumer sets a suspend voltage, the regulator
      device should be enabled in the suspend state.  If the suspend voltage
      of a regulator device is set to zero for all consumers, the regulator
      device is off in the suspend state.
      
      This patch also provides a new function hook to regulator devices for
      resuming from suspend states.
      Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
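
The enable rule stated in this message (on in suspend if any consumer set a suspend voltage, off if all of them are zero) can be illustrated with a minimal sketch. The function name and the flat array of per-consumer voltages are assumptions made for illustration; the regulator core stores this state differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: suspend_uV[i] is the suspend voltage requested by
 * consumer i, in microvolts, with 0 meaning "no suspend voltage set".
 * Returns true if the regulator should stay enabled in the suspend state.
 */
bool enabled_in_suspend(const int *suspend_uV, size_t n_consumers)
{
	assert(suspend_uV != NULL || n_consumers == 0);

	for (size_t i = 0; i < n_consumers; i++)
		if (suspend_uV[i] != 0)
			return true;  /* at least one consumer wants power */

	return false;                 /* every consumer requested zero */
}
```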
    • regulator: empty the old suspend functions · aa27bbc6
      Chunyan Zhang committed
      Regulator suspend/resume functions should only be called by the PM
      suspend core via registered dev_pm_ops, and regulator devices should
      implement the callback functions.  Thus, no regulator consumer should
      call the regulator suspend/resume functions directly.
      
      In order to avoid compile errors, two empty functions with the same
      names are left in place for the time being.
      Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
    • regulator: leave one item to record whether regulator is enabled · 72069f99
      Chunyan Zhang committed
      The items "disabled" and "enabled" are a little redundant, since only one
      of them would be set to record whether the regulator device should stay
      on or be switched off in suspend states.
      
      So this patch removes "disabled" and leaves only "enabled":
        - enabled == 1 for regulator-on-in-suspend
        - enabled == 0 for regulator-off-in-suspend
        - enabled == -1 means do nothing when entering suspend mode.
      Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
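
As a rough illustration of the tri-state described above, the following sketch maps the old pair of flags onto a single value. The enum constants and `from_old_flags()` are illustrative names chosen here, not necessarily the identifiers used in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names for the three values listed in the commit message. */
enum suspend_enable {
	SUSPEND_DO_NOTHING = -1, /* leave the regulator alone on suspend */
	SUSPEND_OFF        =  0, /* regulator-off-in-suspend */
	SUSPEND_ON         =  1  /* regulator-on-in-suspend */
};

/* Map the old "enabled"/"disabled" pair onto the single tri-state field. */
int from_old_flags(bool enabled, bool disabled)
{
	assert(!(enabled && disabled)); /* only one of them was ever set */

	if (enabled)
		return SUSPEND_ON;
	if (disabled)
		return SUSPEND_OFF;
	return SUSPEND_DO_NOTHING;
}
```

A single field cannot hold the contradictory "both set" state, which is one practical benefit of dropping the second flag.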
  3. 25 Jan, 2018 1 commit
  4. 24 Jan, 2018 2 commits
  5. 22 Jan, 2018 1 commit
  6. 20 Jan, 2018 4 commits
  7. 19 Jan, 2018 1 commit
  8. 18 Jan, 2018 2 commits
  9. 17 Jan, 2018 5 commits
  10. 16 Jan, 2018 4 commits
    • blkcg: simplify statistic accumulation code · ddc21231
      Arnd Bergmann committed
      Some older compilers (gcc-4.4 through 4.6 in particular) struggle
      with the way that blkg_rwstat_read() returns a structure, leading
      to excessive stack usage and rather inefficient code:
      
      block/blk-cgroup.c: In function 'blkg_destroy':
      block/blk-cgroup.c:354:1: error: the frame size of 1296 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
      block/cfq-iosched.c: In function 'cfqg_stats_add_aux':
      block/cfq-iosched.c:753:1: error: the frame size of 1928 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
      block/bfq-cgroup.c: In function 'bfqg_stats_add_aux':
      block/bfq-cgroup.c:299:1: error: the frame size of 1928 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
      
      I also notice that there is no point in using atomic accesses
      for the local variables, so storing the temporaries in simple 'u64'
      variables not only avoids the stack usage on older compilers but
      also improves the object code on modern versions.
      
      Fixes: e6269c44 ("blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it")
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
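
The idea of holding temporaries in plain 64-bit variables, rather than passing a structure of atomic counters around by value, can be sketched in user space as follows. The `struct rwstat` layout and function names are simplified assumptions built on C11 atomics; the kernel's blkg_rwstat is organized differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum { RW_READ, RW_WRITE, RW_NR };

/* Simplified stand-in for a read/write statistics structure. */
struct rwstat {
	_Atomic uint64_t cnt[RW_NR];
};

/* Add the counters in "from" to "to" using plain u64 temporaries. */
void rwstat_add_aux(struct rwstat *to, struct rwstat *from)
{
	uint64_t tmp[RW_NR]; /* locals need no atomic access */
	int i;

	assert(to && from);
	for (i = 0; i < RW_NR; i++)
		tmp[i] = atomic_load(&from->cnt[i]);
	for (i = 0; i < RW_NR; i++)
		atomic_fetch_add(&to->cnt[i], tmp[i]);
}

/* Read one counter. */
uint64_t rwstat_get(struct rwstat *s, int idx)
{
	return atomic_load(&s->cnt[idx]);
}
```

Only the shared counters are accessed atomically here; the temporaries stay small and live entirely on the stack.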
    • delayacct: Account blkio completion on the correct task · c96f5471
      Josh Snyder committed
      Before commit:
      
        e33a9bba ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
      
      delayacct_blkio_end() was called after context-switching into the task which
      completed I/O.
      
      This resulted in double counting: the task would account a delay both waiting
      for I/O and for time spent in the runqueue.
      
      With e33a9bba, delayacct_blkio_end() is called by try_to_wake_up().
      In ttwu, we have not yet context-switched. This is more correct, in that
      the delay accounting ends when the I/O is complete.
      
      But delayacct_blkio_end() relies on 'get_current()', and we have not yet
      context-switched into the task whose I/O completed. This results in the
      wrong task having its delay accounting statistics updated.
      
      Instead of doing that, pass the task_struct being woken to delayacct_blkio_end(),
      so that it can update the statistics of the correct task.
      Signed-off-by: Josh Snyder <joshs@netflix.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: Brendan Gregg <bgregg@netflix.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-block@vger.kernel.org
      Fixes: e33a9bba ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
      Link: http://lkml.kernel.org/r/1513613712-571-1-git-send-email-joshs@netflix.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
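
The difference between updating "the current task" and updating the task being woken can be modelled in a few lines. Everything below is a hypothetical user-space sketch: `struct task`, `current_task` (standing in for get_current()) and the function names are assumptions, not the kernel's delayacct API.

```c
#include <assert.h>
#include <stdint.h>

struct task {
	uint64_t blkio_start; /* timestamp when the task began waiting for IO */
	uint64_t blkio_delay; /* accumulated IO wait time */
};

struct task *current_task; /* stand-in for get_current() */

void blkio_begin(struct task *p, uint64_t now)
{
	p->blkio_start = now;
}

/* Problematic variant: charges whichever task happens to be running. */
void blkio_end_current(uint64_t now)
{
	current_task->blkio_delay += now - current_task->blkio_start;
}

/* Fixed variant: the caller passes the task whose IO completed. */
void blkio_end(struct task *p, uint64_t now)
{
	assert(p);
	p->blkio_delay += now - p->blkio_start;
}

/* Models the wakeup path: runs in the waker's context, not the sleeper's. */
void wake_up_task(struct task *p, uint64_t now)
{
	blkio_end(p, now);
}
```

With the waker as `current_task`, `wake_up_task(&sleeper, now)` updates the sleeper's statistics, whereas `blkio_end_current(now)` would charge the waker.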
    • netlink: extack: avoid parenthesized string constant warning · 6311b7ce
      Johannes Berg committed
      NL_SET_ERR_MSG() and NL_SET_ERR_MSG_ATTR() lead to the following warning
      in newer versions of gcc:
        warning: array initialized from parenthesized string constant
      
      Just remove the parentheses; they're not needed in this context,
      since there can be no operator precedence issues or similar.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
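
A minimal sketch of the pattern follows, using a hypothetical `SET_ERR_MSG()` macro and `struct ext_ack` rather than the real netlink definitions. The macro argument initializes a character array, so it is used without surrounding parentheses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an extended-ack structure. */
struct ext_ack {
	const char *message;
};

/*
 * The array is initialized from the bare macro argument. Writing
 * "= (text)" instead is what produces "array initialized from
 * parenthesized string constant" on newer gcc versions.
 */
#define SET_ERR_MSG(ack, text) do {			\
	static const char msg_buf_[] = text;		\
	struct ext_ack *ack_ = (ack);			\
	if (ack_)					\
		ack_->message = msg_buf_;		\
} while (0)

const char *report_bad_attr(struct ext_ack *ack)
{
	SET_ERR_MSG(ack, "invalid attribute");
	return ack ? ack->message : NULL;
}

/* Usage: the message is stored when an ack is supplied, skipped otherwise. */
void ext_ack_selfcheck(void)
{
	struct ext_ack ack = { NULL };

	assert(strcmp(report_bad_attr(&ack), "invalid attribute") == 0);
	assert(report_bad_attr(NULL) == NULL);
}
```

The pointer argument `(ack)` keeps its parentheses because it is an ordinary expression; only the string initializer must stay bare.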
    • ptr_ring: document usage around __ptr_ring_peek · 66940f35
      Michael S. Tsirkin committed
      This explains why the net usage of __ptr_ring_peek
      is actually OK without locks.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 15 Jan, 2018 1 commit
    • block: allow gendisk's request_queue registration to be deferred · fa70d2e2
      Mike Snitzer committed
      Since I can remember DM has forced the block layer to allow the
      allocation and initialization of the request_queue to be distinct
      operations.  Reason for this is block/genhd.c:add_disk() requires
      that the request_queue (and associated bdi) be tied to the gendisk
      before add_disk() is called -- because add_disk() also deals with
      exposing the request_queue via blk_register_queue().
      
      DM's dynamic creation of arbitrary device types (and associated
      request_queue types) requires the DM device's gendisk be available so
      that DM table loads can establish a master/slave relationship with
      subordinate devices that are referenced by loaded DM tables -- using
      bd_link_disk_holder().  But until these DM tables, and their associated
      subordinate devices, are known DM cannot know what type of request_queue
      it needs -- nor what its queue_limits should be.
      
      This chicken and egg scenario has created all manner of problems for DM
      and, at times, the block layer.
      
      Summary of changes:
      
      - Add device_add_disk_no_queue_reg() and add_disk_no_queue_reg() variant
        that drivers may use to add a disk without also calling
        blk_register_queue().  Driver must call blk_register_queue() once its
        request_queue is fully initialized.
      
      - Return early from blk_unregister_queue() if QUEUE_FLAG_REGISTERED
        is not set.  It won't be set if driver used add_disk_no_queue_reg()
        but driver encounters an error and must del_gendisk() before calling
        blk_register_queue().
      
      - Export blk_register_queue().
      
      These changes allow DM to use add_disk_no_queue_reg() to anchor its
      gendisk as the "master" for master/slave relationships DM must establish
      with subordinate devices referenced in DM tables that get loaded.  Once
      all "slave" devices for a DM device are known its request_queue can be
      properly initialized and then advertised via sysfs -- important
      improvement being that no request_queue resource initialization
      performed by blk_register_queue() is missed for DM devices anymore.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
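
The ordering this commit enables can be modelled with two flags. The sketch below is a toy user-space model under stated assumptions: the structures, the `teardowns` counter and the simplified function signatures are hypothetical, and only the function names echo those in the summary above.

```c
#include <assert.h>
#include <stdbool.h>

struct queue {
	bool registered; /* models QUEUE_FLAG_REGISTERED */
	int teardowns;   /* counts real unregister work, for illustration */
};

struct disk {
	struct queue *q;
	bool added;
};

/* Add the disk without registering its queue. */
void add_disk_no_queue_reg(struct disk *d)
{
	d->added = true;
}

/* Called by the driver once the queue is fully initialized. */
void register_queue(struct disk *d)
{
	assert(d->added);
	d->q->registered = true;
}

/* Returns early if the queue was never registered. */
void unregister_queue(struct disk *d)
{
	if (!d->q->registered)
		return;
	d->q->registered = false;
	d->q->teardowns++;
}

/* Classic path: add the disk and register the queue in one step. */
void add_disk(struct disk *d)
{
	add_disk_no_queue_reg(d);
	register_queue(d);
}

/* Removing the disk always attempts to unregister the queue. */
void del_disk(struct disk *d)
{
	unregister_queue(d);
	d->added = false;
}
```

In this model, a driver that hits an error after `add_disk_no_queue_reg()` can call `del_disk()` safely, because the unregister step does nothing when the flag is clear.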
  12. 14 Jan, 2018 1 commit
  13. 13 Jan, 2018 1 commit
    • kdump: Write the correct address of mem_section into vmcoreinfo · 9f15b912
      Kirill A. Shutemov committed
      Depending on configuration mem_section can now be an array or a pointer
      to an array allocated dynamically. In most cases, we can continue to refer
      to it as 'mem_section' regardless of what it is.
      
      But there's one exception: '&mem_section' means "address of the array" if
      mem_section is an array, but if mem_section is a pointer, it would mean
      "address of the pointer".
      
      We've stepped onto this in the kdump code: VMCOREINFO_SYMBOL(mem_section)
      writes down the address of pointer into vmcoreinfo, not the array as we wanted,
      breaking kdump.
      
      Let's introduce VMCOREINFO_SYMBOL_ARRAY() that would handle the
      situation correctly for both cases.
      
      Mike Galbraith <efault@gmx.de>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Baoquan He <bhe@redhat.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: kexec@lists.infradead.org
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org
      Fixes: 83e3c487 ("mm/sparsemem: Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y")
      Link: http://lkml.kernel.org/r/20180112162532.35896-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
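
The array-versus-pointer distinction behind this fix is plain C and can be checked directly. In the sketch below, `section_array`, `section_ptr` and the two macros are hypothetical stand-ins for mem_section and the vmcoreinfo macros; they only compute addresses and do not write anything to vmcoreinfo.

```c
#include <assert.h>
#include <stdint.h>

int section_array[4];             /* the "array" configuration */
int *section_ptr = section_array; /* the "pointer to array" configuration */

/* Takes the address of the symbol itself. */
#define SYMBOL_ADDR(name)       ((uintptr_t)&(name))

/* Uses the value of the symbol: an array decays to its first element. */
#define SYMBOL_ARRAY_ADDR(name) ((uintptr_t)(name))

void check_symbol_addresses(void)
{
	uintptr_t data = (uintptr_t)&section_array[0];

	/* For a real array, both forms yield the address of the data. */
	assert(SYMBOL_ADDR(section_array) == data);
	assert(SYMBOL_ARRAY_ADDR(section_array) == data);

	/* For a pointer, "&name" is the address of the pointer variable. */
	assert(SYMBOL_ADDR(section_ptr) != data);
	assert(SYMBOL_ARRAY_ADDR(section_ptr) == data);
}
```

Using the symbol's value rather than its address gives the same result in both configurations, which is the property the new macro relies on according to the message.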
  14. 12 Jan, 2018 5 commits
  15. 11 Jan, 2018 5 commits
  16. 10 Jan, 2018 3 commits