Commits · 4ece44af733ff63a7cd12aaa8c85afb6d9fdc664 · openeuler / Kernel

04 Mar, 2016 16 commits

lightnvm: rename ->nr_pages to ->nr_sects · 4ece44af

Committed by Matias Bjørling on Feb 20, 2016

The struct rrpc->nr_pages can easily be interpreted as the number of
flash pages allocated to rrpc, while it is actually the number of sectors
(nr_sects). Make sure that this is reflected in the variable name.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

lightnvm: update closed list outside of intr context · 6adb03de

Committed by Javier González on Feb 20, 2016

When an I/O finishes, full blocks are moved from the open to the closed
list - a lock is taken to protect the list. At the moment this happens
in interrupt context, which is not correct.

This patch moves this logic to the block workqueue instead, avoiding
holding a spinlock without interrupt save in an interrupt context.
Signed-off-by: Javier González <javier@cnexlabs.com>
Fixes: ff0e498b ("lightnvm: manage open and closed blocks sepa...")
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

xen/blback: Fit the important information of the thread in 17 characters · fa3184b8

Committed by Konrad Rzeszutek Wilk on Feb 03, 2016

Process names are truncated to 17 characters, while we had sized the
process name at 20 - which meant that while we filled it out with
various details, the last 3 characters (which had the queue number)
never surfaced to user space.

To simplify this and be able to fit the device name, domain id,
and the queue number we remove the 'blkback' from the name.

Prior to this patch the device name is "blkback.<domid>.<name>"
for example: blkback.8.xvda, blkback.11.hda.

With the multiqueue block backend we add "-%d" for the queue.
But sadly this is already way past the limit so it gets stripped.

A possible solution had been identified by Ian:
http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg03516.html

  "
  If you are pressed for space then the "xvd" is probably a bit redundant
  in a string which starts blkbk.

  The guest may not even call the device xvdN (iirc BSD has another
  prefix) any how, so having blkback say so seems of limited use anyway.

  Since this seems to not include a partition number how does this work in
  the split partition scheme? (i.e. one where the guest is given xvda1 and
  xvda2 rather than xvda with a partition table)

[It will be 'blkback.8.xvda1', and 'blkback.11.xvda2']

  Perhaps something derived from one of the schemes in
  http://xenbits.xen.org/docs/unstable/misc/vbd-interface.txt might be a
  better fit?
  "

After a bit of discussion (see
http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg01588.html)
we settled on dropping the "blkback" part.

This will make it possible to have the <domid>.<name>-<queue>:

 [1.xvda-0]
 [1.xvda-1]

And we have enough space to make it go up to:

 [32100.xvdfg9-5]
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
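
The truncation described in this commit can be reproduced in user space. The following is a minimal sketch, not the driver code: NAME_LIMIT is taken from the commit message (17 characters) and has not been checked against kernel headers, and format_old_name() and format_new_name() are hypothetical helpers introduced only for illustration.

```c
#include <stdio.h>

/* Assumed limit, taken from the commit message: names are cut to 17 characters. */
#define NAME_LIMIT 17

/* Old scheme: "blkback.<domid>.<dev>-<queue>", cut off at NAME_LIMIT characters. */
static void format_old_name(char *out, unsigned int domid, const char *dev,
                            unsigned int queue)
{
    /* snprintf() writes at most NAME_LIMIT characters plus the terminating NUL. */
    snprintf(out, NAME_LIMIT + 1, "blkback.%u.%s-%u", domid, dev, queue);
}

/* New scheme: "<domid>.<dev>-<queue>", cut off at the same limit. */
static void format_new_name(char *out, unsigned int domid, const char *dev,
                            unsigned int queue)
{
    snprintf(out, NAME_LIMIT + 1, "%u.%s-%u", domid, dev, queue);
}
```

With the largest example from the commit message (domain 32100, device xvdfg9, queue 5), the old scheme yields "blkback.32100.xvd", losing both the end of the device name and the queue number, while the new scheme yields the full "32100.xvdfg9-5".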

lightnvm: fold get bb tbl when using dual/quad plane mode · d5bdec8d

Committed by Matias Bjørling on Feb 19, 2016

When the media manager runs in dual or quad plane mode, lightnvm
abstracts away plane specific commands. This poses a problem for
get bad block table, as it reports bad blocks per plane, making the
table either two or four times bigger than expected. Fold the bad block
list before returning.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
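
As an illustration of what folding a per-plane table means, here is a minimal user-space sketch. It is not the lightnvm implementation: the function name fold_bb_tbl(), the table layout (the entries for one block stored next to each other, one per plane) and the convention that any nonzero entry marks a bad block are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fold a per-plane bad block table in place.
 *
 * Assumed layout: blks holds nr_blks * planes entries, and the entries for
 * block b are blks[b * planes] .. blks[b * planes + planes - 1].
 * Assumed encoding: 0 means good, any nonzero value is a bad-block state.
 *
 * After the call, blks[0] .. blks[nr_blks - 1] hold one entry per block.
 * A block is reported bad if it is bad on any plane.
 */
static size_t fold_bb_tbl(uint8_t *blks, size_t nr_blks, unsigned int planes)
{
    for (size_t blk = 0; blk < nr_blks; blk++) {
        uint8_t state = 0;

        /* Take the first nonzero state found across the planes of this block. */
        for (unsigned int pl = 0; pl < planes; pl++) {
            uint8_t entry = blks[blk * planes + pl];

            if (entry != 0) {
                state = entry;
                break;
            }
        }

        /* Safe in place: index blk never exceeds the indices still to be read. */
        blks[blk] = state;
    }

    return nr_blks;
}
```

In dual plane mode a table of 8 entries therefore shrinks to 4, which is the size a caller that is unaware of planes would expect.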

lightnvm: fix up nonsensical configure overrun checking · 5e422cff

Committed by Alan on Feb 19, 2016

Instead of checking a constant 0, actually check the space available. Even
better, remember to allow for the header and also check the right amount of
space is needed.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

xen-blkback: advertise indirect segment support earlier · 5a705845

Committed by Jan Beulich on Feb 10, 2016

There's no reason to defer this until the connect phase, and in fact
there are frontend implementations expecting this to be available
earlier. Move it into the probe function.
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

xen-blkfront: rename indirect descriptor parameter · 14e710fe

Committed by Jan Beulich on Feb 10, 2016

"max" is rather ambiguous and carries pretty little meaning, all the more
so since there are also "max_queues" and "max_ring_page_order". Make this
"max_indirect_segments" instead, and at the same time change the type from
int to uint (to match the respective variable's type).
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

mtip32xx: Cleanup queued requests after surprise removal · 008e56d2

Committed by Asai Thambi SP on Feb 24, 2016

Fail all pending requests after surprise removal of a drive.
Signed-off-by: Vignesh Gunasekaran <vgunasekaran@micron.com>
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

mtip32xx: Implement timeout handler · abb0ccd1

Committed by Asai Thambi SP on Feb 24, 2016

Added timeout handler. Replaced blk_mq_end_request() with
blk_mq_complete_request() to avoid double completion of a request.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

mtip32xx: Handle FTL rebuild failure state during device initialization · aae4a033

Committed by Asai Thambi SP on Feb 24, 2016

Allow device initialization to finish gracefully when it is in
FTL rebuild failure state. Also, recover device out of this state
after successfully secure erasing it.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Vignesh Gunasekaran <vgunasekaran@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

mtip32xx: Handle safe removal during IO · 51c6570e

Committed by Asai Thambi SP on Feb 24, 2016

Flush inflight IOs using fsync_bdev() when the device is safely
removed. Also, block further IOs in device open function.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

mtip32xx: Fix for rmmod crash when drive is in FTL rebuild · 59cf70e2

Committed by Asai Thambi SP on Feb 24, 2016

When FTL rebuild is in progress, alloc_disk() initializes the disk
but device node will be created by add_disk() only after successful
completion of FTL rebuild. So, skip deletion of device node in
removal path when FTL rebuild is in progress.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

mtip32xx: Avoid issuing standby immediate cmd during FTL rebuild · d8a18d2d

Committed by Asai Thambi SP on Feb 24, 2016

Prevent standby immediate command from being issued in remove,
suspend and shutdown paths, while drive is in FTL rebuild process.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Vignesh Gunasekaran <vgunasekaran@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

mtip32xx: Print exact time when an internal command is interrupted · 5b7e0a8a

Committed by Asai Thambi SP on Feb 24, 2016

Print exact time when an internal command is interrupted.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

mtip32xx: Remove unwanted code from taskfile error handler · e35b9473

Committed by Asai Thambi SP on Feb 24, 2016

Remove setting and clearing MTIP_PF_EH_ACTIVE_BIT flag in
mtip_handle_tfe() as they are redundant. Also avoid waking
up service thread from mtip_handle_tfe() because it is
already woken up in case of taskfile error.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

mtip32xx: Fix broken service thread handling · cfc05bd3

Committed by Asai Thambi SP on Feb 24, 2016

Service thread does not detect the need for taskfile error handling. Fixed the
flag condition to process taskfile error.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>

03 Mar, 2016 1 commit

Merge tag 'nbd-for-4.6' of git://git.pengutronix.de/git/mpa/linux-nbd into for-4.6/drivers · ff482f7f

Committed by Jens Axboe on Mar 03, 2016

NBD for 4.6

Markus writes:

This pull request contains 7 patches for 4.6.

Patch 1 fixes some unnecessarily complicated code I introduced some versions
ago for debugfs.

Patch 2 removes the criticised signal usage within NBD to kill the NBD threads
after a timeout. This code was used for the last years and is now replaced by
simply killing the tcp connection.

Patches 3-6 are some smaller cleanups.

Patch 7 adds uevents for the userspace. This way udev/systemd can react to
connected NBD devices.

01 Mar, 2016 1 commit

nvme: expose cntlid in sysfs · 931e1c22

Committed by Ming Lin on Feb 26, 2016

For NVMe over Fabrics, the cntlid will be used by systemd/udev to
create a link to the device, for example,

/dev/disk/by-path/<fabrics-info>-<cntlid>-<namespace> -> /dev/nvme0n1
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

29 Feb, 2016 4 commits

nvme: return the whole CQE through the request passthrough interface · 1cb3cce5

Committed by Christoph Hellwig on Feb 29, 2016

Both LightNVM and NVMe over Fabrics need to look at more than just the
status and result field.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matias Bjørling <m@bjorling.me>
Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nvme: replace the kthread with a per-device watchdog timer · 2d55cd5f

Committed by Christoph Hellwig on Feb 29, 2016

The only work left in the kthread is the periodic health check for each
controller.  There is no need to run this from process context or keep
a thread context around for it, so replace it with a simpler timer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

nvme: don't poll the CQ from the kthread · 79f2b358

Committed by Christoph Hellwig on Feb 29, 2016

There is no reason to do unconditional polling of CQs per the NVMe
spec.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

nvme: use a work item to submit async event requests · 9396dec9

Committed by Christoph Hellwig on Feb 29, 2016

Use a dedicated work item to submit async event requests instead of the
global kthread.  This simplifies the code and reduces the latencies to
resubmit a request once an event notification happened.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

15 Feb, 2016 1 commit

nbd: Create size change events for userspace · 37091fdd

Committed by Markus Pargmann on Jul 27, 2015

The userspace needs to know when nbd devices are ready for use.
Currently no events are created for the userspace which doesn't work for
systemd.

See the discussion here: https://github.com/systemd/systemd/pull/358

This patch uses a central point to set up the nbd-internal sizes. An ioctl
to set a size does not lead to a visible size change. The size of the
block device will be kept at 0 until nbd is connected. As soon as it
connects, the size will be changed to the real value and a uevent is
created. When disconnecting, the blockdevice is set to 0 size and
another uevent is generated.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>

11 Feb, 2016 4 commits

nvme: split pci module out of core module · 576d55d6

Committed by Ming Lin on Feb 10, 2016

NVMe over Fabrics drivers are going to reuse the core,
so split nvme.ko into 2 modules:

nvme-core.ko: the core part
nvme.ko: the PCI driver

Export symbols from nvme-core.ko.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nvme: split dev_list_lock · 9f2482b9

Committed by Ming Lin on Feb 10, 2016

Split dev_list_lock into one in the core and one in the PCI driver.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nvme: move timeout variables to core.c · ba0ba7d3

Committed by Ming Lin on Feb 10, 2016

These variables are used by PCI driver and will also be used in the
forthcoming NVMe over Fabrics drivers.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nvme/host: reference the fabric module for each bdev open callout · e439bb12

Committed by Sagi Grimberg on Feb 10, 2016

We don't want to be able to unload the fabric driver when we have
open references to our namespaces. Thus, for each nvme_open we
take a reference on the fabric driver and put it in nvme_release.
This behavior is consistent with the scsi model.

This resolves the panic when unloading a fabric module with
mpath holders.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ian Bakshan <ianb@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
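
The get-on-open, put-on-release pattern described here can be modelled in user space. This is a simplified sketch, not the driver code: struct fabric_module and the functions fabric_try_get(), fabric_put(), fabric_try_unload(), ns_open() and ns_release() are hypothetical names for this example. In the kernel the same role is typically played by try_module_get() and module_put().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a loadable transport driver. */
struct fabric_module {
    int refs;        /* number of open namespace handles pinning the module */
    bool unloading;  /* set once unload has started; new opens must fail */
};

/* Take a reference unless the module is already going away. */
static bool fabric_try_get(struct fabric_module *m)
{
    if (m->unloading)
        return false;
    m->refs++;
    return true;
}

/* Drop a reference taken by fabric_try_get(). */
static void fabric_put(struct fabric_module *m)
{
    m->refs--;
}

/* Unload succeeds only when nothing holds the module. */
static bool fabric_try_unload(struct fabric_module *m)
{
    if (m->refs > 0)
        return false;
    m->unloading = true;
    return true;
}

/* Open path: pin the module for as long as the namespace stays open. */
static int ns_open(struct fabric_module *m)
{
    return fabric_try_get(m) ? 0 : -ENODEV;
}

/* Release path: drop the reference taken in ns_open(). */
static void ns_release(struct fabric_module *m)
{
    fabric_put(m);
}
```

While at least one namespace is open, fabric_try_unload() refuses to proceed, which is the property the commit relies on to avoid the panic. The sketch is single-threaded and ignores the locking a real implementation needs.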

10 Feb, 2016 4 commits

nvme: Log the ctrl device name instead of the underlying pci device name · 1b3c47c1

Committed by Sagi Grimberg on Feb 10, 2016

Having the ctrl name "nvmeX" seems much more friendly than
the underlying device name. Also, with other nvme transports
such as the soon to come nvme-loop we don't have an underlying
device so it doesn't make sense to make up one.

In order to help matching an instance name to a pci function,
we add an info print in nvme_probe.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Keith Busch <keith.busch@intel.com>

Manually fixed up the hunk in nvme_cancel_queue_ios().
Signed-off-by: Jens Axboe <axboe@fb.com>

nvme: fix drvdata setup for the nvme device · f4f0f63e

Committed by Christoph Hellwig on Feb 09, 2016

Pass the right private data to device_create_with_groups from the
beginning, and remove the superfluous call to dev_set_drvdata.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

NVMe: Fix possible queue use after freed · 949928c1

Committed by Keith Busch on Dec 17, 2015

This notifies blk-mq when the tag set contains a different number of
queues prior to freeing unused ones that the request queue points to.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

blk-mq: dynamic h/w context count · 868f2f0b

Committed by Keith Busch on Dec 17, 2015

The hardware's provided queue count may change at runtime with resource
provisioning. This patch allows a block driver to alter the number of
h/w queues available when its resource count changes.

The main part is a new blk-mq API to request a new number of h/w queues
for a given live tag set. The new API freezes all queues using that set,
then adjusts the allocated count prior to remapping these to CPUs.

The bulk of the rest just shifts where h/w contexts and all their
artifacts are allocated and freed.

The number of max h/w contexts is capped to the number of possible cpus
since there is no use for more than that. As such, all pre-allocated
memory for pointers needs to account for the max possible rather than
the initial number of queues.

A side effect of this is that the blk-mq will proceed successfully as
long as it can allocate at least one h/w context. Previously it would
fail request queue initialization if less than the requested number
was allocated.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

05 Feb, 2016 9 commits

nbd: ratelimit error msgs after socket close · da6ccaaa

Committed by Dan Streetman on Jan 14, 2016

Make the "Attempted send on closed socket" error messages generated in
nbd_request_handler() ratelimited.

When the nbd socket is shutdown, the nbd_request_handler() function emits
an error message for every request remaining in its queue.  If the queue
is large, this will spam a large amount of messages to the log.  There's
no need for a separate error message for each request, so this patch
ratelimits it.

In the specific case this was found, the system was virtual and the error
messages were logged to the serial port, which overwhelmed it.

Fixes: 4d48a542 ("nbd: fix I/O hang on disconnected nbds")
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>

nbd: Move flag parsing to a function · d02cf531

Committed by Markus Pargmann on Oct 29, 2015

nbd changes properties of the blockdevice depending on flags that were
received. This patch moves this flag parsing into a separate function
nbd_parse_flags().
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>

nbd: Cleanup reset of nbd and bdev after a disconnect · 0e4f0f6f

Committed by Markus Pargmann on Oct 29, 2015

Group all variables that are reset after a disconnect into reset
functions. This patch adds two of these functions, nbd_reset() and
nbd_bdev_reset().
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>

nbd: Timeouts are not user requested disconnects · 1f7b5cf1

Committed by Markus Pargmann on Oct 29, 2015

It may be useful to know in the client that a connection timed out. The
current code returns success for a timeout.

This patch reports the error code -ETIMEDOUT for a timeout.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>

nbd: Remove signal usage · 23272a67

Committed by Markus Pargmann on Oct 29, 2015

As discussed on the mailing list, the usage of signals for timeout
handling has a lot of potential issues. The nbd driver has used signals
for timeouts for some time. These signals were able to get the threads
out of the blocking socket operations.

This patch removes all signal usage and uses a socket shutdown instead.
The socket descriptor itself is cleared later when the whole nbd device
is closed.

The tasks_lock is removed as we do not depend on this anymore. Instead
a new lock for the socket is introduced so we can safely work with the
socket in the timeout handler outside of the two main threads.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>

cfq-iosched: Allow parent cgroup to preempt its child · 3984aa55

Committed by Jan Kara on Jan 12, 2016

Currently we don't allow sync workload of one cgroup to preempt sync
workload of any other cgroup. This is because we want to achieve service
separation between cgroups. However, in cases where the preempting cgroup
is an ancestor of the current cgroup, there is no need for separation and
idling introduces unnecessary overhead. This hurts, for example, the case
when a workload is isolated within a cgroup but journalling threads are in
the root cgroup. A simple way to demonstrate the issue is using:

dbench4 -c /usr/share/dbench4/client.txt -t 10 -D /mnt 1

on ext4 filesystem on plain SATA drive (mounted with barrier=0 to make
difference more visible). When all processes are in the root cgroup,
reported throughput is 153.132 MB/sec. When dbench process gets its own
blkio cgroup, reported throughput drops to 26.1006 MB/sec.

Fix the problem by making check in cfq_should_preempt() more benevolent
and allow preemption by ancestor cgroup. This improves the throughput
reported by dbench4 to 48.9106 MB/sec.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
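
The relaxed group check can be pictured as a walk up the cgroup hierarchy. The following is a minimal sketch under stated assumptions, not the code in cfq_should_preempt(): struct group, is_ancestor() and group_allows_preempt() are hypothetical names, and the real function applies further conditions that are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup node; only the parent link matters here. */
struct group {
    struct group *parent; /* NULL for the root group */
};

/* Return true if anc is a strict ancestor of grp. */
static bool is_ancestor(const struct group *anc, const struct group *grp)
{
    for (const struct group *g = grp->parent; g != NULL; g = g->parent) {
        if (g == anc)
            return true;
    }
    return false;
}

/*
 * Group part of the preemption decision only: the new queue's group passes
 * if it is the current queue's group or one of its ancestors. Unrelated
 * groups (for example siblings) and descendants stay isolated.
 */
static bool group_allows_preempt(const struct group *new_grp,
                                 const struct group *cur_grp)
{
    return new_grp == cur_grp || is_ancestor(new_grp, cur_grp);
}
```

In the scenario from the commit message, the journalling threads sit in the root group and dbench in a child group, so a request from the root group passes this check, while a request from a sibling group does not.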

cfq-iosched: Allow sync noidle workloads to preempt each other · a257ae3e

Committed by Jan Kara on Jan 12, 2016

The original idea with preemption of sync noidle queues (introduced in
commit 718eee05 "cfq-iosched: fairness for sync no-idle queues") was
that we service all sync noidle queues together, we don't idle on any of
the queues individually and we idle only if there is no sync noidle
queue to be served. This intention also matches the original test:

	if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
	   && new_cfqq->service_tree == cfqq->service_tree)
		return true;

However since at that time cfqq->service_tree was not set for idling
queues, this test was unreliable and was replaced in commit e4a22919
"cfq-iosched: fix no-idle preemption logic" by:

	if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD &&
	    cfqq_type(new_cfqq) == SYNC_NOIDLE_WORKLOAD &&
	    new_cfqq->service_tree->count == 1)
		return true;

That was a reliable test but was actually doing something different -
now we preempt sync noidle queue only if the new queue is the only one
busy in the service tree.

These days cfq queue is kept in service tree even if it is idling and
thus the original check would be safe again. But since we actually check
that cfq queues are in the same cgroup, of the same priority class and
workload type (sync noidle), we know that new_cfqq is fine to preempt
cfqq. So just remove the service tree check.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

cfq-iosched: Reorder checks in cfq_should_preempt() · 6c80731c

Committed by Jan Kara on Jan 12, 2016

Move check for preemption by rt class up. There is no functional change
but it makes arguing about conditions simpler since we can be sure both
cfq queues are from the same ioprio class.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

cfq-iosched: Don't group_idle if cfqq has big thinktime · e795421e

Committed by Jan Kara on Jan 12, 2016

There is no point in idling on a cfq group if the only cfq queue that is
there has too big thinktime.
Signed-off-by: Jan Kara <jack@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>

openeuler / Kernel synced successfully over 1 year ago