1. 22 9月, 2018 1 次提交
    • O
      block: use nanosecond resolution for iostat · b57e99b4
      Omar Sandoval 提交于
      Klaus Kusche reported that the I/O busy time in /proc/diskstats was not
      updating properly on 4.18. This is because we started using ktime to
      track elapsed time, and we convert nanoseconds to jiffies when we update
      the partition counter. However, this gets rounded down, so any I/Os that
      take less than a jiffy are not accounted for. Previously in this case,
      the value of jiffies would sometimes increment while we were doing I/O,
      so at least some I/Os were accounted for.
      
      Let's convert the stats to use nanoseconds internally. We still report
      milliseconds as before, now more accurately than ever. The value is
      still truncated to 32 bits for backwards compatibility.
      
      Fixes: 522a7775 ("block: consolidate struct request timestamp fields")
      Cc: stable@vger.kernel.org
      Reported-by: NKlaus Kusche <klaus.kusche@computerix.info>
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      b57e99b4
  2. 20 9月, 2018 3 次提交
  3. 17 9月, 2018 1 次提交
  4. 13 9月, 2018 1 次提交
    • J
      null_blk: fix zoned support for non-rq based operation · b228ba1c
      Jens Axboe 提交于
      The supported added for zones in null_blk seem to assume that only rq
      based operation is possible. But this depends on the queue_mode setting,
      if this is set to 0, then cmd->bio is what we need to be operating on.
      Right now any attempt to load null_blk with queue_mode=0 will
      insta-crash, since cmd->rq is NULL and null_handle_cmd() assumes it to
      always be set.
      
      Make the zoned code deal with bio's instead, or pass in the
      appropriate sector/nr_sectors instead.
      
      Fixes: ca4b2a01 ("null_blk: add zone support")
      Tested-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      b228ba1c
  5. 12 9月, 2018 1 次提交
  6. 10 9月, 2018 1 次提交
  7. 07 9月, 2018 1 次提交
  8. 06 9月, 2018 2 次提交
  9. 05 9月, 2018 1 次提交
  10. 01 9月, 2018 3 次提交
    • D
      blkcg: use tryget logic when associating a blkg with a bio · 31118850
      Dennis Zhou (Facebook) 提交于
      There is a very small change a bio gets caught up in a really
      unfortunate race between a task migration, cgroup exiting, and itself
      trying to associate with a blkg. This is due to css offlining being
      performed after the css->refcnt is killed which triggers removal of
      blkgs that reach their blkg->refcnt of 0.
      
      To avoid this, association with a blkg should use tryget and fallback to
      using the root_blkg.
      
      Fixes: 08e18eab ("block: add bi_blkg to the bio for cgroups")
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDennis Zhou <dennisszhou@gmail.com>
      Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      31118850
    • D
      blkcg: delay blkg destruction until after writeback has finished · 59b57717
      Dennis Zhou (Facebook) 提交于
      Currently, blkcg destruction relies on a sequence of events:
        1. Destruction starts. blkcg_css_offline() is called and blkgs
           release their reference to the blkcg. This immediately destroys
           the cgwbs (writeback).
        2. With blkgs giving up their reference, the blkcg ref count should
           become zero and eventually call blkcg_css_free() which finally
           frees the blkcg.
      
      Jiufei Xue reported that there is a race between blkcg_bio_issue_check()
      and cgroup_rmdir(). To remedy this, blkg destruction becomes contingent
      on the completion of all writeback associated with the blkcg. A count of
      the number of cgwbs is maintained and once that goes to zero, blkg
      destruction can follow. This should prevent premature blkg destruction
      related to writeback.
      
      The new process for blkcg cleanup is as follows:
        1. Destruction starts. blkcg_css_offline() is called which offlines
           writeback. Blkg destruction is delayed on the cgwb_refcnt count to
           avoid punting potentially large amounts of outstanding writeback
           to root while maintaining any ongoing policies. Here, the base
           cgwb_refcnt is put back.
        2. When the cgwb_refcnt becomes zero, blkcg_destroy_blkgs() is called
           and handles destruction of blkgs. This is where the css reference
           held by each blkg is released.
        3. Once the blkcg ref count goes to zero, blkcg_css_free() is called.
           This finally frees the blkg.
      
      It seems in the past blk-throttle didn't do the most understandable
      things with taking data from a blkg while associating with current. So,
      the simplification and unification of what blk-throttle is doing caused
      this.
      
      Fixes: 08e18eab ("block: add bi_blkg to the bio for cgroups")
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDennis Zhou <dennisszhou@gmail.com>
      Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      59b57717
    • D
      Revert "blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()" · 6b065462
      Dennis Zhou (Facebook) 提交于
      This reverts commit 4c699480.
      
      Destroying blkgs is tricky because of the nature of the relationship. A
      blkg should go away when either a blkcg or a request_queue goes away.
      However, blkg's pin the blkcg to ensure they remain valid. To break this
      cycle, when a blkcg is offlined, blkgs put back their css ref. This
      eventually lets css_free() get called which frees the blkcg.
      
      The above commit (4c699480) breaks this order of events by trying to
      destroy blkgs in css_free(). As the blkgs still hold references to the
      blkcg, css_free() is never called.
      
      The race between blkcg_bio_issue_check() and cgroup_rmdir() will be
      addressed in the following patch by delaying destruction of a blkg until
      all writeback associated with the blkcg has been finished.
      
      Fixes: 4c699480 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()")
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDennis Zhou <dennisszhou@gmail.com>
      Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      6b065462
  11. 30 8月, 2018 1 次提交
  12. 29 8月, 2018 1 次提交
  13. 28 8月, 2018 15 次提交
  14. 26 8月, 2018 6 次提交
  15. 25 8月, 2018 2 次提交
    • L
      Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 05193597
      Linus Torvalds 提交于
      Pull libata updates from Tejun Heo:
       "Nothing too interesting. Mostly ahci and ahci_platform changes, many
        around power management"
      
      * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (22 commits)
        ata: ahci_platform: enable to get and control reset
        ata: libahci_platform: add reset control support
        ata: add an extra argument to ahci_platform_get_resources()
        ata: sata_rcar: Add r8a77965 support
        ata: sata_rcar: exclude setting of PHY registers in Gen3
        ata: sata_rcar: really mask all interrupts on Gen2 and later
        Revert "ata: ahci_platform: allow disabling of hotplug to save power"
        ata: libahci: Allow reconfigure of DEVSLP register
        ata: libahci: Correct setting of DEVSLP register
        ata: ahci: Enable DEVSLP by default on x86 with SLP_S0
        ata: ahci: Support state with min power but Partial low power state
        Revert "ata: ahci_platform: convert kcalloc to devm_kcalloc"
        ata: sata_rcar: Add rudimentary Runtime PM support
        ata: sata_rcar: Provide a short-hand for &pdev->dev
        ata: Only output sg element mapped number in verbose debug
        ata: Guard ata_scsi_dump_cdb() by ATA_VERBOSE_DEBUG
        ata: ahci_platform: convert kcalloc to devm_kcalloc
        ata: ahci_platform: convert kzallloc to kcalloc
        ata: ahci_platform: correct parameter documentation for ahci_platform_shutdown
        libata: remove ata_sff_data_xfer_noirq()
        ...
      05193597
    • L
      Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 59676610
      Linus Torvalds 提交于
      Pull cgroup updates from Tejun Heo:
       "Just one commit from Steven to take out spin lock from trace event
        handlers"
      
      * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup/tracing: Move taking of spin lock out of trace event handlers
      59676610