1. 21 4月, 2017 16 次提交
  2. 20 4月, 2017 13 次提交
  3. 19 4月, 2017 11 次提交
    • C
      block: remove the osdblk driver · 10081552
      Christoph Hellwig 提交于
      This was just a proof of concept user for the SCSI OSD library, and
      never had any real users.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Acked-by: NBoaz Harrosh <ooo@electrozaur.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      10081552
    • J
      block: Make writeback throttling defaults consistent for SQ devices · 8330cdb0
      Jan Kara 提交于
      When CFQ is used as an elevator, it disables writeback throttling
      because they don't play well together. Later when a different elevator
      is chosen for the device, writeback throttling doesn't get enabled
      again as it should. Make sure CFQ enables writeback throttling (if it
      should be enabled by default) when we switch from it to another IO
      scheduler.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      8330cdb0
    • P
      block, bfq: split bfq-iosched.c into multiple source files · ea25da48
      Paolo Valente 提交于
      The BFQ I/O scheduler features an optimal fair-queuing
      (proportional-share) scheduling algorithm, enriched with several
      mechanisms to boost throughput and reduce latency for interactive and
      real-time applications. This makes BFQ a large and complex piece of
      code. This commit addresses this issue by splitting BFQ into three
      main, independent components, and by moving each component into a
      separate source file:
      1. Main algorithm: handles the interaction with the kernel, and
      decides which requests to dispatch; it uses the following two further
      components to achieve its goals.
      2. Scheduling engine (Hierarchical B-WF2Q+ scheduling algorithm):
      computes the schedule, using weights and budgets provided by the above
      component.
      3. cgroups support: handles group operations (creation, destruction,
      move, ...).
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      ea25da48
    • P
      block, bfq: remove all get and put of I/O contexts · 6fa3e8d3
      Paolo Valente 提交于
      When a bfq queue is set in service and when it is merged, a reference
      to the I/O context associated with the queue is taken. This reference
      is then released when the queue is deselected from service or
      split. More precisely, the release of the reference is postponed to
      when the scheduler lock is released, to avoid nesting between the
      scheduler and the I/O-context lock. In fact, such nesting would lead
      to deadlocks, because of other code paths that take the same locks in
      the opposite order. This postponing of I/O-context releases does
      complicate code.
      
      This commit addresses these issue by modifying involved operations in
      such a way to not need to get the above I/O-context references any
      more. Then it also removes any get and release of these references.
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      6fa3e8d3
    • A
      block, bfq: handle bursts of queue activations · e1b2324d
      Arianna Avanzini 提交于
      Many popular I/O-intensive services or applications spawn or
      reactivate many parallel threads/processes during short time
      intervals. Examples are systemd during boot or git grep.  These
      services or applications benefit mostly from a high throughput: the
      quicker the I/O generated by their processes is cumulatively served,
      the sooner the target job of these services or applications gets
      completed. As a consequence, it is almost always counterproductive to
      weight-raise any of the queues associated to the processes of these
      services or applications: in most cases it would just lower the
      throughput, mainly because weight-raising also implies device idling.
      
      To address this issue, an I/O scheduler needs, first, to detect which
      queues are associated with these services or applications. In this
      respect, we have that, from the I/O-scheduler standpoint, these
      services or applications cause bursts of activations, i.e.,
      activations of different queues occurring shortly after each
      other. However, a shorter burst of activations may be caused also by
      the start of an application that does not consist in a lot of parallel
      I/O-bound threads (see the comments on the function bfq_handle_burst
      for details).
      
      In view of these facts, this commit introduces:
      1) an heuristic to detect (only) bursts of queue activations caused by
         services or applications consisting in many parallel I/O-bound
         threads;
      2) the prevention of device idling and weight-raising for the queues
         belonging to these bursts.
      Signed-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      e1b2324d
    • P
      block, bfq: boost the throughput with random I/O on NCQ-capable HDDs · e01eff01
      Paolo Valente 提交于
      This patch is basically the counterpart, for NCQ-capable rotational
      devices, of the previous patch. Exactly as the previous patch does on
      flash-based devices and for any workload, this patch disables device
      idling on rotational devices, but only for random I/O. In fact, only
      with these queues disabling idling boosts the throughput on
      NCQ-capable rotational devices. To not break service guarantees,
      idling is disabled for NCQ-enabled rotational devices only when the
      same symmetry conditions considered in the previous patches hold.
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      e01eff01
    • P
      block, bfq: boost the throughput on NCQ-capable flash-based devices · bf2b79e7
      Paolo Valente 提交于
      This patch boosts the throughput on NCQ-capable flash-based devices,
      while still preserving latency guarantees for interactive and soft
      real-time applications. The throughput is boosted by just not idling
      the device when the in-service queue remains empty, even if the queue
      is sync and has a non-null idle window. This helps to keep the drive's
      internal queue full, which is necessary to achieve maximum
      performance. This solution to boost the throughput is a port of
      commits a68bbddb and f7d7b7a7 for CFQ.
      
      As already highlighted in a previous patch, allowing the device to
      prefetch and internally reorder requests trivially causes loss of
      control on the request service order, and hence on service guarantees.
      Fortunately, as discussed in detail in the comments on the function
      bfq_bfqq_may_idle(), if every process has to receive the same
      fraction of the throughput, then the service order enforced by the
      internal scheduler of a flash-based device is relatively close to that
      enforced by BFQ. In particular, it is close enough to let service
      guarantees be substantially preserved.
      
      Things change in an asymmetric scenario, i.e., if not every process
      has to receive the same fraction of the throughput. In this case, to
      guarantee the desired throughput distribution, the device must be
      prevented from prefetching requests. This is exactly what this patch
      does in asymmetric scenarios.
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      bf2b79e7
    • A
      block, bfq: reduce idling only in symmetric scenarios · 1de0c4cd
      Arianna Avanzini 提交于
      A seeky queue (i..e, a queue containing random requests) is assigned a
      very small device-idling slice, for throughput issues. Unfortunately,
      given the process associated with a seeky queue, this behavior causes
      the following problem: if the process, say P, performs sync I/O and
      has a higher weight than some other processes doing I/O and associated
      with non-seeky queues, then BFQ may fail to guarantee to P its
      reserved share of the throughput. The reason is that idling is key
      for providing service guarantees to processes doing sync I/O [1].
      
      This commit addresses this issue by allowing the device-idling slice
      to be reduced for a seeky queue only if the scenario happens to be
      symmetric, i.e., if all the queues are to receive the same share of
      the throughput.
      
      [1] P. Valente, A. Avanzini, "Evolution of the BFQ Storage I/O
          Scheduler", Proceedings of the First Workshop on Mobile System
          Technologies (MST-2015), May 2015.
          http://algogroup.unimore.it/people/paolo/disk_sched/mst-2015.pdfSigned-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NRiccardo Pizzetti <riccardo.pizzetti@gmail.com>
      Signed-off-by: NSamuele Zecchini <samuele.zecchini92@gmail.com>
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      1de0c4cd
    • A
      block, bfq: add Early Queue Merge (EQM) · 36eca894
      Arianna Avanzini 提交于
      A set of processes may happen to perform interleaved reads, i.e.,
      read requests whose union would give rise to a sequential read pattern.
      There are two typical cases: first, processes reading fixed-size chunks
      of data at a fixed distance from each other; second, processes reading
      variable-size chunks at variable distances. The latter case occurs for
      example with QEMU, which splits the I/O generated by a guest into
      multiple chunks, and lets these chunks be served by a pool of I/O
      threads, iteratively assigning the next chunk of I/O to the first
      available thread. CFQ denotes as 'cooperating' a set of processes that
      are doing interleaved I/O, and when it detects cooperating processes,
      it merges their queues to obtain a sequential I/O pattern from the union
      of their I/O requests, and hence boost the throughput.
      
      Unfortunately, in the following frequent case, the mechanism
      implemented in CFQ for detecting cooperating processes and merging
      their queues is not responsive enough to handle also the fluctuating
      I/O pattern of the second type of processes. Suppose that one process
      of the second type issues a request close to the next request to serve
      of another process of the same type. At that time the two processes
      would be considered as cooperating. But, if the request issued by the
      first process is to be merged with some other already-queued request,
      then, from the moment at which this request arrives, to the moment
      when CFQ controls whether the two processes are cooperating, the two
      processes are likely to be already doing I/O in distant zones of the
      disk surface or device memory.
      
      CFQ uses however preemption to get a sequential read pattern out of
      the read requests performed by the second type of processes too.  As a
      consequence, CFQ uses two different mechanisms to achieve the same
      goal: boosting the throughput with interleaved I/O.
      
      This patch introduces Early Queue Merge (EQM), a unified mechanism to
      get a sequential read pattern with both types of processes. The main
      idea is to immediately check whether a newly-arrived request lets some
      pair of processes become cooperating, both in the case of actual
      request insertion and, to be responsive with the second type of
      processes, in the case of request merge. Both types of processes are
      then handled by just merging their queues.
      Signed-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NMauro Andreolini <mauro.andreolini@unimore.it>
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      36eca894
    • P
      block, bfq: reduce latency during request-pool saturation · cfd69712
      Paolo Valente 提交于
      This patch introduces an heuristic that reduces latency when the
      I/O-request pool is saturated. This goal is achieved by disabling
      device idling, for non-weight-raised queues, when there are weight-
      raised queues with pending or in-flight requests. In fact, as
      explained in more detail in the comment on the function
      bfq_bfqq_may_idle(), this reduces the rate at which processes
      associated with non-weight-raised queues grab requests from the pool,
      thereby increasing the probability that processes associated with
      weight-raised queues get a request immediately (or at least soon) when
      they need one. Along the same line, if there are weight-raised queues,
      then this patch halves the service rate of async (write) requests for
      non-weight-raised queues.
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      cfd69712
    • P
      block, bfq: preserve a low latency also with NCQ-capable drives · bcd56426
      Paolo Valente 提交于
      I/O schedulers typically allow NCQ-capable drives to prefetch I/O
      requests, as NCQ boosts the throughput exactly by prefetching and
      internally reordering requests.
      
      Unfortunately, as discussed in detail and shown experimentally in [1],
      this may cause fairness and latency guarantees to be violated. The
      main problem is that the internal scheduler of an NCQ-capable drive
      may postpone the service of some unlucky (prefetched) requests as long
      as it deems serving other requests more appropriate to boost the
      throughput.
      
      This patch addresses this issue by not disabling device idling for
      weight-raised queues, even if the device supports NCQ. This allows BFQ
      to start serving a new queue, and therefore allows the drive to
      prefetch new requests, only after the idling timeout expires. At that
      time, all the outstanding requests of the expired queue have been most
      certainly served.
      
      [1] P. Valente and M. Andreolini, "Improving Application
          Responsiveness with the BFQ Disk I/O Scheduler", Proceedings of
          the 5th Annual International Systems and Storage Conference
          (SYSTOR '12), June 2012.
          Slightly extended version:
          http://algogroup.unimore.it/people/paolo/disk_sched/bfq-v1-suite-
      							results.pdf
      Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      bcd56426