- 27 October 2020, 1 commit
-
-
Submitted by Keith Busch
Revalidating nvme zoned namespaces requires IO commands, and there are controller states that prevent IO. For example, a sanitize in progress is required to fail all IO, but we don't want to remove a namespace we've previously added just because the controller is in such a state. Suppress the error in this case.
Reported-by: Michael Nguyen <michael.nguyen@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
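A hedged sketch of the idea (not the exact upstream diff; the nvme_update_zone_info() call site and surrounding variables are assumed from context): treat a failed zone revalidation as non-fatal so an existing namespace is not torn down while the controller temporarily rejects I/O.

```c
/*
 * Sketch only: if gathering zone information fails because the
 * controller currently rejects I/O (e.g. a sanitize is in progress),
 * warn and keep the namespace instead of propagating the error.
 */
ret = nvme_update_zone_info(ns, lbaf);
if (ret) {
	dev_warn(ns->ctrl->device,
		 "zone info update failed (%d), keeping namespace\n", ret);
	ret = 0;
}
```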
-
- 23 October 2020, 4 commits
-
-
Submitted by James Smart
We've had several complaints about the 10s reconnect delay (the default) being used when an error occurs while there is still connectivity to a subsystem. The max_reconnects and reconnect_delay values are set in common code prior to calling the transport to create the controller. This change checks whether the default reconnect delay is being used and, if so, adjusts it to a shorter period (2s) for the nvme-fc transport. It does so by calculating the controller loss tmo window, changing the value of the reconnect delay, and then recalculating the maximum number of reconnect attempts allowed.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
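A hedged sketch of the adjustment (the constant name NVME_FC_DEFAULT_RECONNECT_TMO is assumed for illustration): only override the delay when the generic default is still in effect, and recompute max_reconnects so the overall controller-loss window stays the same.

```c
/* Sketch: shrink the delay for nvme-fc only if the 10s default is in use. */
if (opts->reconnect_delay == NVMF_DEF_RECONNECT_DELAY) {
	int ctrl_loss_tmo = opts->max_reconnects * opts->reconnect_delay;

	opts->reconnect_delay = NVME_FC_DEFAULT_RECONNECT_TMO;	/* 2 seconds */
	opts->max_reconnects = DIV_ROUND_UP(ctrl_loss_tmo,
					    opts->reconnect_delay);
}
```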
-
Submitted by James Smart
On reconnect, the code currently does not freeze the controller before possibly updating the number of hw queues for the controller. Add the freeze before updating the number of hw queues. Note: the queues are already started and remain started through the reconnect.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
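A hedged sketch of the reconnect path (field names such as prior_ioq_cnt are assumptions): freeze I/O before blk_mq_update_nr_hw_queues() may change the queue count, then unfreeze.

```c
/* Sketch: the queues stay started across the reconnect; only a freeze
 * bracket is added around the hw queue count update. */
if (prior_ioq_cnt != nr_io_queues) {
	nvme_start_freeze(&ctrl->ctrl);
	blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
	nvme_unfreeze(&ctrl->ctrl);
}
```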
-
Submitted by James Smart
The loop that backs out of hw I/O queue creation continues through index 0, which corresponds to the admin queue as well. Fix the loop so it only proceeds through indexes 1..n, which correspond to the I/O queues.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
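A hedged sketch of the corrected unwind (the helper name __nvme_fc_delete_hw_queue and loop context are assumed from the fc transport):

```c
/* Sketch: walk back only through the I/O queues; index 0 is the admin
 * queue and must not be deleted here. */
for (; i > 0; i--)	/* previously "i >= 0", which also hit queue 0 */
	__nvme_fc_delete_hw_queue(ctrl, &ctrl->queues[i], i);
```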
-
Submitted by James Smart
Currently, an I/O timeout unconditionally invokes nvme_fc_error_recovery(), which checks for LIVE or CONNECTING state. If LIVE, the routine resets the controller, which initiates a reconnect - which is valid. If CONNECTING, err_work is scheduled. Err_work then calls the terminate_io routine, which also checks for CONNECTING and noops any further action on outstanding I/O. The result is that nothing happens to the timed-out I/O. As such, if the command was dropped on the wire, it will never time out or complete, and the connect process will hang. Change the behavior of the I/O timeout routine to unconditionally abort the I/O. I/O completion handling will note that an I/O failed due to an abort and will terminate the connection/association as needed. If the abort could not be issued, continue with a call to nvme_fc_error_recovery(). To ensure something different happens in nvme_fc_error_recovery(), rework it so that it aborts all I/Os on the association to force a failure. As I/O aborts may now occur outside of delete_association, completion counting must be careful to only count those aborted during delete_association when TERMIO is set on the controller.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 22 October 2020, 8 commits
-
-
Submitted by Chaitanya Kulkarni
By default, we set the passthru request allocation flag such that when BLK_MQ_REQ_NOWAIT is used for request allocation, the error is returned in the following code path and we fail the I/O:
  nvme_alloc_request()
   blk_mq_alloc_request()
    blk_mq_queue_enter()
     if (flag & BLK_MQ_REQ_NOWAIT)
          return -EBUSY; <-- return if busy.
On some controllers, using BLK_MQ_REQ_NOWAIT ends up in an I/O error while the controller is perfectly healthy and not in a degraded state. Block layer request allocation does allow us to wait instead of immediately returning the error when the BLK_MQ_REQ_NOWAIT flag is not used. This has been shown to fix the I/O error problem reported under a heavy random write workload. Remove the BLK_MQ_REQ_NOWAIT parameter for passthru request allocation, which resolves this issue.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
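A hedged sketch of the allocation after the change (the nvme_alloc_request() signature shown matches this era of the driver but is an assumption here): passing 0 instead of BLK_MQ_REQ_NOWAIT lets the block layer wait for a free tag rather than failing with -EBUSY.

```c
/* Sketch: no BLK_MQ_REQ_NOWAIT, so allocation may sleep for a tag
 * instead of erroring out on a busy but healthy controller. */
req = nvme_alloc_request(q, cmd, 0 /* was BLK_MQ_REQ_NOWAIT */, NVME_QID_ANY);
if (IS_ERR(req))
	return PTR_ERR(req);
```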
-
Submitted by Logan Gunthorpe
Clean up some confusing elements of nvmet_passthru_map_sg() by returning early if the request is greater than the maximum bio size. This allows us to drop the sg_cnt variable. This should not result in any functional change but makes the code clearer and more understandable. The original code allocated a truncated bio and then returned EINVAL when bio_add_pc_page() filled that bio. The new code just returns EINVAL early if this would happen.
Fixes: c1fef73f ("nvmet: add passthru code to process commands")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Suggested-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
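A hedged sketch of the early check (field names follow nvmet's struct nvmet_req; treat the exact condition as illustrative):

```c
/* Sketch: refuse up front what a single bio could never map, instead of
 * allocating a truncated bio and failing later in bio_add_pc_page(). */
if (req->sg_cnt > BIO_MAX_PAGES)
	return -EINVAL;
```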
-
Submitted by Logan Gunthorpe
nvmet_passthru_map_sg() only supports mapping a single bio, not a chain, so the effective maximum transfer should also be limited by BIO_MAX_PAGES (presently this works out to 1MB). For PCI passthru devices the max_sectors would typically be more limiting than BIO_MAX_PAGES, but this may not be true for all passthru devices.
Fixes: c1fef73f ("nvmet: add passthru code to process commands")
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
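A hedged sketch of the clamp (the variable name max_hw_sectors is assumed from the passthru Identify-override path):

```c
/* Sketch: a single, unchained bio maps at most BIO_MAX_PAGES pages, so
 * never advertise a larger transfer size than that to the host. */
max_hw_sectors = min_t(u32, max_hw_sectors,
		       BIO_MAX_PAGES << (PAGE_SHIFT - SECTOR_SHIFT));
```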
-
Submitted by zhenwei pi
When connecting a controller with a zero kato value using the following command line
  nvme connect -t tcp -n NQN -a ADDR -s PORT --keep-alive-tmo=0
the warning below can be reproduced:
  WARNING: CPU: 1 PID: 241 at kernel/workqueue.c:1627 __queue_delayed_work+0x6d/0x90
with trace:
  mod_delayed_work_on+0x59/0x90
  nvmet_update_cc+0xee/0x100 [nvmet]
  nvmet_execute_prop_set+0x72/0x80 [nvmet]
  nvmet_tcp_try_recv_pdu+0x2f7/0x770 [nvmet_tcp]
  nvmet_tcp_io_work+0x63f/0xb2d [nvmet_tcp]
  ...
This is caused by queuing up an uninitialized work. Although the keep-alive timer is disabled while allocating the controller (fixed in 0d3b6a8d), ka_work still has a chance to run (called by nvmet_start_ctrl).
Fixes: 0d3b6a8d ("nvmet: Disable keep-alive timer when kato is cleared to 0h")
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
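A hedged sketch of the guard (placed where the target would arm the keep-alive timer; the exact call site is assumed):

```c
/* Sketch: a zero KATO means keep-alive is disabled, so never queue
 * ka_work -- it may not even have been initialized in that case. */
if (!ctrl->kato)
	return;
mod_delayed_work(system_wq, &ctrl->ka_work, ctrl->kato * HZ);
```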
-
Submitted by Kai-Heng Feng
Like commit 5611ec2b ("nvme-pci: prevent SK hynix PC400 from using Write Zeroes command"), Sandisk Skyhawk has the same issue:
  [ 6305.633887] blk_update_request: operation not supported error, dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
So also disable the Write Zeroes command on Sandisk Skyhawk.
BugLink: https://bugs.launchpad.net/bugs/1899503
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
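A hedged sketch of the corresponding pci.c quirk-table entry; the device ID below is a placeholder, not the verified Skyhawk ID (0x15b7 is SanDisk's PCI vendor ID):

```c
{ PCI_DEVICE(0x15b7, 0x0000),	/* placeholder device ID for Skyhawk */
	.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
```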
-
Submitted by Keith Busch
The request's rq_disk isn't set for passthrough IO commands, so tracing uses qid 0 for these, which incorrectly decodes them as admin commands. Use the request_queue's queuedata instead, since that value is always set for the IO queues and never set for the admin queue.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
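A hedged sketch of the qid lookup the trace points can rely on (close to, but not guaranteed to be, the exact helper): queuedata is only set on I/O queues, so a NULL value identifies the admin queue.

```c
static inline u16 nvme_req_qid(struct request *req)
{
	if (!req->q->queuedata)		/* admin queue: queuedata is never set */
		return 0;
	return req->mq_hctx->queue_num + 1;
}
```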
-
Submitted by Chao Leng
A crash happened during error-injection testing. When a CQE carries an incorrect command id due to error injection, the host may find a request which has already been freed. Dereferencing req->mr->rkey then causes a crash in nvme_rdma_process_nvme_rsp because the mr is already freed. Add a check for the mr to fix it.
Signed-off-by: Chao Leng <lengchao@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
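A hedged sketch of the guard (the surrounding remote-invalidate check and error handling are assumptions about the RDMA completion path, not the verified upstream diff): test req->mr before touching rkey, because the request looked up via a corrupted command id may have no MR registered.

```c
/* Sketch: a bogus command id may yield a request without an MR; treat
 * that like an rkey mismatch and recover instead of dereferencing NULL. */
if (unlikely(!req->mr || wc->ex.invalidate_rkey != req->mr->rkey)) {
	dev_err(queue->ctrl->ctrl.device,
		"bogus remote invalidation for rkey %#x\n",
		req->mr ? req->mr->rkey : 0);
	nvme_rdma_error_recovery(queue->ctrl);
}
```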
-
Submitted by Chao Leng
A crash can happen when a connect is rejected. The host establishes the connection after receiving the ConnectReply, and then continues to send the fabrics Connect command. If the controller does not receive the ReadyToUse capsule, the host may receive a ConnectReject reply. nvme_rdma_destroy_queue_ib is then called after the host receives the RDMA_CM_EVENT_REJECTED event. When the fabrics Connect command subsequently times out, nvme_rdma_timeout calls nvme_rdma_complete_rq to fail the request, and a crash happens due to a use-after-free in nvme_rdma_complete_rq. Calling nvme_rdma_destroy_queue_ib when handling the RDMA_CM_EVENT_REJECTED event is redundant, as nvme_rdma_destroy_queue_ib is already called in the connection failure handler.
Signed-off-by: Chao Leng <lengchao@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 14 October 2020, 1 commit
-
-
Submitted by Keith Busch
Translate zoned resource errors to the appropriate blk_status_t.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
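A hedged sketch of the added cases in nvme_error_status(); the NVMe status codes and blk_status_t values exist in this era, but treat the exact mapping as illustrative:

```c
case NVME_SC_ZONE_TOO_MANY_ACTIVE:
	return BLK_STS_ZONE_ACTIVE_RESOURCE;
case NVME_SC_ZONE_TOO_MANY_OPEN:
	return BLK_STS_ZONE_OPEN_RESOURCE;
```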
-
- 07 October 2020, 23 commits
-
-
Submitted by Chaitanya Kulkarni
In nvme_set_queue_limits() we initialize vwc to false and later add a condition to set vwc to true. The value of vwc can instead be initialized at declaration, which makes all the blk_queue_XXX() calls uniform.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
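A hedged sketch of the cleanup in nvme_set_queue_limits(); the elided limit setup is an assumption about the surrounding code:

```c
static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
				  struct request_queue *q)
{
	bool vwc = ctrl->vwc & NVME_CTRL_VWC_PRESENT;	/* was: false, set later */

	/* ... other blk_queue_*() limit setup elided ... */
	blk_queue_write_cache(q, vwc, vwc);
}
```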
-
Submitted by Chaitanya Kulkarni
In nvme_validate_ns() the extra variable ctrl is used only twice. Using ns->ctrl directly still maintains the readability and the original line lengths in the code. Get rid of the extra variable ctrl and use ns->ctrl directly.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig
Just fold it into the only caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
-
Submitted by Christoph Hellwig
Move the logic to revalidate the block_device size or remove the namespace from the caller into nvme_validate_ns. This removes the return value and thus the status code translation. Additionally, it also catches non-permanent errors from nvme_update_ns_info using the existing logic.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Move nvme_validate_ns just above its only remaining caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Check the namespace identifier list as the very first thing when scanning namespaces. This keeps the code that queries the CSI common between the alloc and validate paths, and helps to structure the code better for multiple command set support.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Consolidate the two calls into a single place.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Submitted by Christoph Hellwig
Now that the queue is frozen before updating ->lba_shift, we can't hit the invalid references mentioned in the comment any more. More importantly, this code would not have helped us if the format was changed by another controller or through implementation-defined back channels.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
A Format NVM command can change the capabilities of namespaces, while Sanitize does change the Logical Block Content and must be serialized. Also remove the CSUPP bit for Format - it is not a mandatory command, and we don't check for the bit anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Only set the queue limits once we have the real block size. This also updates the limits on a rescan if needed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
We can no longer reach this code if Identify Namespace failed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Use a single statement to set both the capacity and the fake block size instead of two.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Submitted by Christoph Hellwig
Ensure that there can't be any I/O in flight when we change the disk geometry in nvme_update_ns_info, most notably the LBA size, by lifting the queue freeze from nvme_update_disk_info into the caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Factor out a helper from nvme_update_ns_info that configures the per-namespace metadata and PI settings. Also make sure the helper clears the flags explicitly instead of clearing all of ->features, to allow potentially reusing ->features for future non-metadata flags.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Check whether the namespace actually exists as the very first thing and don't bother with any extra work if not. This should speed up and simplify the sequential scanning for NVMe 1.0 devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Move the check from the two callers into the common helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
Submitted by Christoph Hellwig
Rename __nvme_revalidate_disk to nvme_update_ns_info and pass a namespace instead of the gendisk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Rename _nvme_revalidate_disk to nvme_validate_ns to better describe what the function does, and pass the struct nvme_ns instead of the gendisk to better match the call chain.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
Use a slightly more descriptive name to enable reusing nvme_validate_ns in the next patch for a lower-level function.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
The queue can trivially be derived from the nvme_ns structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Christoph Hellwig
The removal of the ->revalidate_disk method broke the initialization of the zone bitmaps, as nvme_revalidate_disk now never gets called during initialization. Move the zone-related code from nvme_revalidate_disk into a new helper in zns.c, and call it from nvme_alloc_ns in addition to nvme_validate_ns to ensure the zone bitmaps are initialized during probe.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
-
Submitted by Chaitanya Kulkarni
The function nvme_init_ctrl() takes a ctrl reference and, when it fails, puts that ctrl reference in its error unwind code. When creating a loop ctrl in nvme_loop_create_ctrl(), if nvme_init_ctrl() returns a non-zero (i.e. error) value, the code jumps to the "out_put_ctrl" label, which calls nvme_put_ctrl(); that leads to a double ctrl put in the error unwind path. Update nvme_loop_create_ctrl(): remove the "out_put_ctrl" label, add a new "out" label after nvme_put_ctrl() in the error unwind path, and jump to the newly added label when the nvme_init_ctrl() call returns an error.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Chaitanya Kulkarni
When try_module_get() fails in nvme_dev_open(), it returns without releasing the ctrl reference which was taken earlier. Put the ctrl reference, which is taken before calling try_module_get(), in the error return code path.
Fixes: 52a3974f ("nvme-core: get/put ctrl and transport module in nvme_dev_open/release()")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
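A hedged sketch of the fixed error path in nvme_dev_open() (the -EINVAL return value is an assumption):

```c
nvme_get_ctrl(ctrl);
if (!try_module_get(ctrl->ops->module)) {
	nvme_put_ctrl(ctrl);	/* drop the reference taken just above */
	return -EINVAL;
}
```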
-
- 03 October 2020, 1 commit
-
-
Submitted by Coly Li
Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to send slab pages. But pages allocated by __get_free_pages() without __GFP_COMP, which also have a refcount of 0, are still sent by kernel_sendpage() to the remote end, and this is problematic. The newly introduced helper sendpage_ok() checks both the PageSlab flag and the page_count counter, and returns true if the page is OK to be sent by kernel_sendpage(). This patch fixes the page checking issue in nvme_tcp_try_send_data() with sendpage_ok(): if sendpage_ok() returns true, the page is sent by kernel_sendpage(); otherwise sock_no_sendpage() is used to handle it.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
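A hedged sketch of the send-path decision in nvme_tcp_try_send_data() (variable names such as page, offset, len, and flags are assumptions):

```c
/* Sketch: kernel_sendpage() must not see slab pages or pages with a
 * zero refcount; sendpage_ok() covers both checks. */
if (sendpage_ok(page))
	ret = kernel_sendpage(queue->sock, page, offset, len, flags);
else
	ret = sock_no_sendpage(queue->sock, page, offset, len, flags);
```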
-
- 27 September 2020, 2 commits
-
-
Submitted by Jeffle Xu
One queue will be reserved for non-polled IO when nvme.poll_queues is greater than or equal to the number of IO queues that the nvme controller can provide. Currently the reserved queue for non-polled IO reuses the interrupt used by the admin queue in this case, e.g. vector 0. This can work and performance may not be an issue since the admin queue is used infrequently. However, this behaviour is inconsistent with the behaviour when nvme.poll_queues is smaller than the number of IO queues available. Thus allocate a separate interrupt for this reserved queue, making the behaviour consistent.
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
[hch: minor cleanups, mostly to the pre-existing surrounding code]
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig
nvme_submit_sync_cmd can return positive NVMe error codes in addition to negative Linux error codes, and these are currently ignored. Fix this by removing __nvme_ns_report_zones and handling the errors from nvme_submit_sync_cmd in the caller, instead of multiplexing the return value and the number of zones reported into a single return value.
Fixes: 240e6ee2 ("nvme: support for zoned namespaces")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
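A hedged sketch of the caller-side handling (the buffer names and the -EIO translation are illustrative, not the verified upstream diff): positive return values from nvme_submit_sync_cmd() are NVMe status codes and must not be confused with a zone count.

```c
ret = nvme_submit_sync_cmd(ns->queue, &c, report, buflen);
if (ret) {
	if (ret > 0)		/* NVMe status code, not a Linux errno */
		ret = -EIO;
	goto out_free;
}
```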
-