提交 · ce79bfc2c7b415f0f296adc59a54c9c781d63617 · openeuler / Kernel

21 4月, 2017 40 次提交

nvme_fcloop: split job struct from transport for req_release · ce79bfc2

由 James Smart 提交于 4月 11, 2017

Current design has the fcloop job struct, used for both initiator and
target processing, allocated as part of the initiator request structure.
On aborts, the initiator side (based on the request) may terminate, yet
the target side wants to continue processing. the target side can't do
that if the initiator side goes away.
Revise fcloop to allocate an independent target side structure when it
starts an io from the initiator.

Added a lock to the request struct as well to synchronize pointer updates
on abort calls.

Modified target downcalls to recognize conditions where initiator has
aborted the io (thus nulled the pointer between job structs), thus
avoid referencing sgl lists which are gone and no longer making upcalls
to the initiator.

In conditions where the targetport is no longer connected, have the
initiator return an access failure rather than simulating a command
completion.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

ce79bfc2

nvmet_fc: add req_release to lldd api · 19b58d94

由 James Smart 提交于 4月 11, 2017

With the advent of the opdone calls changing context, the lldd can no
longer assume that once the op->done call returns for RSP operations
that the request struct is no longer being accessed.

As such, revise the lldd api for a req_release callback that the
transport will call when the job is complete. This will also be used
with abort cases.

Fixed text in api header for change in io complete semantics.

Revised lpfc to support the new req_release api.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

19b58d94

nvmet_fc: add target feature flags for upcall isr contexts · 39498fae

由 James Smart 提交于 4月 11, 2017

Two new feature flags were added to control whether upcalls to the
transport result in context switches or stay in the calling context.

NVMET_FCTGTFEAT_CMD_IN_ISR:
  By default, if the flag is not set, the transport assumes the
  lldd is in a non-isr context and in the cpu context it should be
  for the io queue. As such, the cmd handler is called directly in the
  calling context.
  If the flag is set, indicating the upcall is an isr context, the
  transport mandates a transition to a workqueue. The workqueue assigned
  to the queue is used for the context.
NVMET_FCTGTFEAT_OPDONE_IN_ISR
  By default, if the flag is not set, the transport assumes the
  lldd is in a non-isr context and in the cpu context it should be
  for the io queue. As such, the fcp operation done callback is called
  directly in the calling context.
  If the flag is set, indicating the upcall is an isr context, the
  transport mandates a transition to a workqueue. The workqueue assigned
  to the queue is used for the context.

Updated lpfc for flags
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

39498fae

nvmet: convert from kmap to nvmet_copy_from_sgl · 1c05cf90

由 Logan Gunthorpe 提交于 4月 18, 2017

This is safer as it doesn't rely on the data being stored in
a single page in an sgl.

It also aids our effort to start phasing out users of sg_page. See [1].

For this we kmalloc some memory, copy to it and free at the end. Note:
we can't allocate this memory on the stack as the kbuild test robot
reports some frame size overflows on i386.

[1] https://lwn.net/Articles/720053/Signed-off-by: NLogan Gunthorpe <logang@deltatee.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

1c05cf90

nvme: improve performance for virtual NVMe devices · f9f38e33

由 Helen Koike 提交于 4月 10, 2017

This change provides a mechanism to reduce the number of MMIO doorbell
writes for the NVMe driver. When running in a virtualized environment
like QEMU, the cost of an MMIO is quite hefy here. The main idea for
the patch is provide the device two memory location locations:
 1) to store the doorbell values so they can be lookup without the doorbell
    MMIO write
 2) to store an event index.
I believe the doorbell value is obvious, the event index not so much.
Similar to the virtio specification, the virtual device can tell the
driver (guest OS) not to write MMIO unless you are writing past this
value.

FYI: doorbell values are written by the nvme driver (guest OS) and the
event index is written by the virtual device (host OS).

The patch implements a new admin command that will communicate where
these two memory locations reside. If the command fails, the nvme
driver will work as before without any optimizations.

Contributions:
  Eric Northup <digitaleric@google.com>
  Frank Swiderski <fes@google.com>
  Ted Tso <tytso@mit.edu>
  Keith Busch <keith.busch@intel.com>

Just to give an idea on the performance boost with the vendor
extension: Running fio [1], a stock NVMe driver I get about 200K read
IOPs with my vendor patch I get about 1000K read IOPs. This was
running with a null device i.e. the backing device simply returned
success on every read IO request.

[1] Running on a 4 core machine:
  fio --time_based --name=benchmark --runtime=30
  --filename=/dev/nvme0n1 --nrfiles=1 --ioengine=libaio --iodepth=32
  --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4
  --rw=randread --blocksize=4k --randrepeat=false
Signed-off-by: NRob Nelson <rlnelson@google.com>
[mlin: port for upstream]
Signed-off-by: NMing Lin <mlin@kernel.org>
[koike: updated for upstream]
Signed-off-by: NHelen Koike <helen.koike@collabora.co.uk>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NKeith Busch <keith.busch@intel.com>

f9f38e33

nvme/pci: Don't set reserved SQ create flags · 81c1cd98

由 Keith Busch 提交于 4月 04, 2017

The QPRIO field is only valid if weighted round robin arbitration is used,
and this driver doesn't enable that controller configuration option.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>

81c1cd98

blk-stat: kill blk_stat_rq_ddir() · 99c749a4

由 Jens Axboe 提交于 4月 21, 2017

No point in providing and exporting this helper. There's just
one (real) user of it, just use rq_data_dir().
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

99c749a4

nbd: set the max segments to USHRT_MAX · 1cc1f17a

由 Josef Bacik 提交于 4月 20, 2017

I lack the basic understanding of what segments mean, so we were being
limited to 512kib requests even with higher max_sectors sizes set.
Setting the maximum number of segments to unlimited allows us to
actually have arbitrarily large IO's go through NBD.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

1cc1f17a

blk-mq: Remove blk_mq_sched_move_to_dispatch() · 246665db

由 Bart Van Assche 提交于 4月 20, 2017

commit c13660a0 ("blk-mq-sched: change ->dispatch_requests()
to ->dispatch_request()") removed the last user of this function.
Hence also remove the function itself.
Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

246665db

blk-mq: add might_sleep check to blk_mq_get_driver_tag() · 5feeacdd

由 Jens Axboe 提交于 4月 20, 2017

If the caller passes in wait=true, it has to be able to block
for a driver tag. We just had a bug where flush insertion
would block on tag allocation, while we had preempt disabled.
Ensure that we catch cases like that earlier next time.
Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

5feeacdd

blk-mq: Fix poll_stat for new size-based bucketing. · 0206319f

由 Stephen Bates 提交于 4月 20, 2017

Fixes an issue where the size of the poll_stat array in request_queue
does not match the size expected by the new size based bucketing for
IO completion polling.

Fixes: 720b8ccc ("blk-mq: Add a polling specific stats function")
Signed-off-by: NStephen Bates <sbates@raithlin.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

0206319f

blk-mq: fix schedule-while-atomic with scheduler attached · b00c53e8

由 Jens Axboe 提交于 4月 20, 2017

We must have dropped the ctx before we call
blk_mq_sched_insert_request() with can_block=true, otherwise we risk
that a flush request can block on insertion if we are currently out of
tags.

[   47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
[   47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
[   47.690572] Preemption disabled at:
[   47.690584] [<ffffffff81326c7c>] blk_mq_sched_get_request+0x6c/0x280
[   47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
[   47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
[   47.718081] Call Trace:
[   47.720903]  dump_stack+0x4f/0x73
[   47.724694]  ? blk_mq_sched_get_request+0x6c/0x280
[   47.730137]  __schedule_bug+0x6c/0xc0
[   47.734314]  __schedule+0x559/0x780
[   47.738302]  schedule+0x3b/0x90
[   47.741899]  io_schedule+0x11/0x40
[   47.745788]  blk_mq_get_tag+0x167/0x2a0
[   47.750162]  ? remove_wait_queue+0x70/0x70
[   47.754901]  blk_mq_get_driver_tag+0x92/0xf0
[   47.759758]  blk_mq_sched_insert_request+0x134/0x170
[   47.765398]  ? blk_account_io_start+0xd0/0x270
[   47.770679]  blk_mq_make_request+0x1b2/0x850
[   47.775766]  generic_make_request+0xf7/0x2d0
[   47.780860]  submit_bio+0x5f/0x120
[   47.784979]  ? submit_bio+0x5f/0x120
[   47.789631]  submit_bh_wbc.isra.46+0x10d/0x130
[   47.794902]  submit_bh+0xb/0x10
[   47.798719]  journal_submit_commit_record+0x190/0x210
[   47.804686]  ? _raw_spin_unlock+0x13/0x30
[   47.809480]  jbd2_journal_commit_transaction+0x180a/0x1d00
[   47.815925]  kjournald2+0xb6/0x250
[   47.820022]  ? kjournald2+0xb6/0x250
[   47.824328]  ? remove_wait_queue+0x70/0x70
[   47.829223]  kthread+0x10e/0x140
[   47.833147]  ? commit_timeout+0x10/0x10
[   47.837742]  ? kthread_create_on_node+0x40/0x40
[   47.843122]  ret_from_fork+0x29/0x40

Fixes: a4d907b6 ("blk-mq: streamline blk_mq_make_request")
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

b00c53e8

blk-mq: Add a polling specific stats function · 720b8ccc

由 Stephen Bates 提交于 4月 07, 2017

Rather than bucketing IO statisics based on direction only we also
bucket based on the IO size. This leads to improved polling
performance. Update the bucket callback function and use it in the
polling latency estimation.
Signed-off-by: NStephen Bates <sbates@raithlin.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

720b8ccc

blk-stat: convert blk-stat bucket callback to signed · a37244e4

由 Stephen Bates 提交于 4月 20, 2017

In order to allow for filtering of IO based on some other properties
of the request than direction we allow the bucket function to return
an int.

If the bucket callback returns a negative do no count it in the stats
accumulation.
Signed-off-by: NStephen Bates <sbates@raithlin.com>

Fixed up Kyber scheduler stat callback.
Signed-off-by: NJens Axboe <axboe@fb.com>

a37244e4

block: remove the errors field from struct request · caf7df12

由 Christoph Hellwig 提交于 4月 20, 2017

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

caf7df12

blktrace: remove the unused block_rq_abort tracepoint · cee4b7ce

由 Christoph Hellwig 提交于 4月 20, 2017

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

cee4b7ce

swim3: remove (commented out) printing of req->errors · c877f424

由 Christoph Hellwig 提交于 4月 20, 2017

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

c877f424

C
ataflop: switch from req->errors to req->error_count · c8e90782
由 Christoph Hellwig 提交于 4月 20, 2017
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>
```
c8e90782
C
floppy: switch from req->errors to req->error_count · 45908795
由 Christoph Hellwig 提交于 4月 20, 2017
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>
```
45908795

block: add a error_count field to struct request · e26738e0

由 Christoph Hellwig 提交于 4月 20, 2017

This is for the legacy floppy and ataflop drivers that currently abuse
->errors for this purpose.  It's stashed away in a union to not grow
the struct size, the other fields are either used by modern drivers
for different purposes or the I/O scheduler before queing the I/O
to drivers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

e26738e0

blk-mq: simplify __blk_mq_complete_request · 453f8341

由 Christoph Hellwig 提交于 4月 20, 2017

Merge blk_mq_ipi_complete_request and blk_mq_stat_add into their only
caller.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

453f8341

blk-mq: remove the error argument to blk_mq_complete_request · 08e0029a

由 Christoph Hellwig 提交于 4月 20, 2017

Now that all drivers that call blk_mq_complete_requests have a
->complete callback we can remove the direct call to blk_mq_end_request,
as well as the error argument to blk_mq_complete_request.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

08e0029a

xen-blkfront: don't use req->errors · 2609587c

由 Christoph Hellwig 提交于 4月 20, 2017

xen-blkfron is the last users using rq->errros for passing back error to
blk-mq, and I'd like to get rid of that.  In the longer run the driver
should be moving more of the completion processing into .complete, but
this is the minimal change to move forward for now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

2609587c

mtip32xx: add a status field to struct mtip_cmd · 4dda4735

由 Christoph Hellwig 提交于 4月 20, 2017

Instead of using req->errors, which will go away.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

4dda4735

nbd: don't use req->errors · 1e388ae0

由 Christoph Hellwig 提交于 4月 20, 2017

Add a nbd-specific field instead.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

1e388ae0

dm mpath: don't check for req->errors · 8fc77980

由 Christoph Hellwig 提交于 4月 20, 2017

We'll get all proper errors reported through ->end_io and ->errors will
go away soon.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

8fc77980

dm rq: don't pass irrelevant error code to blk_mq_complete_request · e0af413a

由 Christoph Hellwig 提交于 4月 20, 2017

dm never uses rq->errors, so there is no need to pass an error argument
to blk_mq_complete_request.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

e0af413a

null_blk: don't pass always-0 req->errors to blk_mq_complete_request · eb1a61a3

由 Christoph Hellwig 提交于 4月 20, 2017

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

eb1a61a3

loop: zero-fill bio on the submitting cpu · fe2cb290

由 Christoph Hellwig 提交于 4月 20, 2017

In thruth I've just audited which blk-mq drivers don't currently have a
complete callback, but I think this change is at least borderline useful.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

fe2cb290

scsi: introduce a result field in struct scsi_request · 17d5363b

由 Christoph Hellwig 提交于 4月 20, 2017

This passes on the scsi_cmnd result field to users of passthrough
requests.  Currently we abuse req->errors for this purpose, but that
field will go away in its current form.

Note that the old IDE code abuses the errors field in very creative
ways and stores all kinds of different values in it.  I didn't dare
to touch this magic, so the abuses are brought forward 1:1.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

17d5363b

virtio_blk: don't use req->errors · d19633d5

由 Christoph Hellwig 提交于 4月 20, 2017

Remove passing req->errors (which at that point is always 0) to
blk_mq_complete_request, and rely on the virtio status code for the
serial number passthrough request.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

d19633d5

virtio: fix spelling of virtblk_scsi_request_done · a1a6e62b

由 Christoph Hellwig 提交于 4月 20, 2017

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

a1a6e62b

nvme: make nvme_error_status private · 65ba6b54

由 Christoph Hellwig 提交于 4月 20, 2017

Currently it's used by the lighnvm passthrough ioctl, but we'd like to make
it private in preparation of block layer specific error code.  Lighnvm already
returns the real NVMe status anyway, so I think we can just limit it to
returning -EIO for any status set.

This will need a careful audit from the lightnvm folks, though.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

65ba6b54

nvme: split nvme status from block req->errors · 27fa9bc5

由 Christoph Hellwig 提交于 4月 20, 2017

We want our own clearly defined error field for NVMe passthrough commands,
and the request errors field is going away in its current form.

Just store the status and result field in the nvme_request field from
hardirq completion context (using a new helper) and then generate a
Linux errno for the block layer only when we actually need it.

Because we can't overload the status value with a negative error code
for cancelled command we now have a flags filed in struct nvme_request
that contains a bit for this condition.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

27fa9bc5

nvme-fc: fix status code handling in nvme_fc_fcpio_done · d663b69f

由 Christoph Hellwig 提交于 4月 20, 2017

nvme_complete_async_event expects the little endian status code
including the phase bit, and a new completion handler I plan to
introduce will do so as well.

Change the status variable into the little endian format with the
phase bit used in the NVMe CQE to fix / enable this.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

d663b69f

block: remove the blk_execute_rq return value · b7819b92

由 Christoph Hellwig 提交于 4月 20, 2017

The function only returns -EIO if rq->errors is non-zero, which is not
very useful and lets a large number of callers ignore the return value.

Just let the callers figure out their error themselves.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

b7819b92

pd: don't check blk_execute_rq return value. · 75a500ef

由 Christoph Hellwig 提交于 4月 20, 2017

The driver never sets req->errors, so blk_execute_rq will always return 0.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

75a500ef

bdi: Drop 'parent' argument from bdi_register[_va]() · 7c4cc300

由 Jan Kara 提交于 4月 12, 2017

Drop 'parent' argument of bdi_register() and bdi_register_va().  It is
always NULL.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

7c4cc300

block: Remove unused functions · 2e82b84c

由 Jan Kara 提交于 4月 12, 2017

Now that all backing_dev_info structure are allocated separately, we can
drop some unused functions.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

2e82b84c

fs: Remove SB_I_DYNBDI flag · c1844d53

由 Jan Kara 提交于 4月 12, 2017

Now that all bdi structures filesystems use are properly refcounted, we
can remove the SB_I_DYNBDI flag.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

c1844d53

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功