- 02 Nov 2018, 1 commit
-
-
Committed by James Smart
The patch made to avoid a Coverity report of out-of-bounds access on aen_op moved the assignment of a pointer, leaving it NULL when it was subsequently used to calculate a private pointer. Thus the private pointer was bad. Move/correct the private pointer initialization to be in sync with that patch.
Fixes: 0d2bdf9f ("nvme-fc: rework the request initialization code")
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 19 Oct 2018, 2 commits
-
-
Committed by Sagi Grimberg
IP transports will most likely use the same controller-options matching when detecting a duplicate connect. Move it to fabrics.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Sagi Grimberg
If not passed, we set the default trsvcid. We can then rely on trsvcid being present and simplify the controller matching logic.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 18 Oct 2018, 3 commits
-
-
Committed by Chaitanya Kulkarni
This is a cleanup patch that doesn't change any functionality: it removes the duplicate call to blk_integrity_rq() in nvme_map_data().
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Logan Gunthorpe
For P2P requests, we must use the pci_p2pmem_map_sg() function instead of the dma_map_sg functions. With that, we can then indicate PCI_P2P support in the request queue. For this, we create an NVME_F_PCI_P2P flag which tells the core to set QUEUE_FLAG_PCI_P2P in the request queue.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
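Roughly, the mapping choice this describes sits inside nvme_map_data(); a hedged sketch, assuming the iod/scatterlist field names of that era (the helper landed upstream under the name pci_p2pdma_map_sg()):

```c
/* Sketch: take the P2P mapping path when the pages are P2PDMA pages. */
if (is_pci_p2pdma_page(sg_page(iod->sg)))
	nr_mapped = pci_p2pdma_map_sg(dev->dev, iod->sg, iod->nents,
				      dma_dir);
else
	nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
				     dma_dir, DMA_ATTR_NO_WARN);
```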
-
Committed by Logan Gunthorpe
Register the CMB buffer as p2pmem and use the appropriate allocation functions to create and destroy the IO submission queues. If the CMB supports WDS and RDS, publish it for use as P2P memory by other devices. Kernels without CONFIG_PCI_P2PDMA will also no longer support the NVMe CMB; however, seeing that the main use case for the CMB is P2P operations, this seems like a reasonable dependency. We drop the __iomem safety on the buffer because, by convention, it is safe to directly access memory mapped by memremap()/devm_memremap_pages(). Architectures where this is not safe will not be supported by memremap() and therefore will not support PCI P2P, and so will have no CMB support.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
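A hedged sketch of that flow, assuming the pci_p2pdma_add_resource()/pci_p2pmem_publish()/pci_alloc_p2pmem() API and the NVME_CMB_* register helpers; field and macro usage here is an approximation, not the literal diff:

```c
/* Register the CMB BAR region as p2pmem; without CONFIG_PCI_P2PDMA
 * this fails and the CMB is simply not used. */
if (pci_p2pdma_add_resource(pdev, bar, size, offset))
	return;

/* Publish for use by other devices only if the CMB supports WDS and RDS. */
if (NVME_CMB_WDS(dev->cmbsz) && NVME_CMB_RDS(dev->cmbsz))
	pci_p2pmem_publish(pdev, true);

/* IO submission queues are then allocated from the p2p pool. */
nvmeq->sq_cmds = pci_alloc_p2pmem(pdev, SQ_SIZE(depth));
```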
-
- 17 Oct 2018, 10 commits
-
-
Committed by Keith Busch
A removal waits for the reset_work to complete. If a surprise removal occurs around the same time as an error-triggered controller reset, and the reset work happened to dispatch a command to the removed controller, the command won't be recovered, since the timeout work doesn't do anything during error recovery. We wouldn't want to wait for timeout handling anyway, so this patch fixes the race by disabling the controller and killing the admin queues prior to syncing with the reset_work.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Bart Van Assche
The nvme_user_io.slba field is 64 bits wide. That value is copied into the 32-bit bio_integrity_payload.bip_iter.bi_sector field. Make that truncation explicit so that Coverity does not complain about an implicit one. See also Coverity ID 1056486 on http://scan.coverity.com/projects/linux.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
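The usual way to make such a narrowing explicit is the kernel's lower_32_bits() helper; a minimal sketch with illustrative variable names:

```c
#include <linux/kernel.h>	/* lower_32_bits() */

u64 slba = io.slba;		/* 64-bit user-supplied LBA */
u32 seed = lower_32_bits(slba);	/* explicit, analyzer-visible truncation */
```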
-
Committed by Bart Van Assche
Instead of setting and then clearing the first_sgl pointer for AEN requests, leave that pointer zero. This patch does not change how requests are initialized, but avoids Coverity reporting the following complaint for nvme_fc_init_aen_ops(): CID 1418400 (#1 of 1): Out-of-bounds access (OVERRUN) 4. overrun-buffer-val: Overrunning buffer pointed to by aen_op of 312 bytes by passing it to a function which accesses it at byte offset 312.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Bart Van Assche
This patch does not change any functionality but makes the intent of the code clearer.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Bart Van Assche
This patch avoids the kernel-doc tool complaining about several function headers when building with W=1.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Bart Van Assche
This patch avoids the kernel-doc tool complaining about the nvme_suspend_queue() function header when building with W=1.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Bart Van Assche
Although it is easy to see that the code in nvme_init_subnqn() guarantees that the subsys->nqn string is '\0'-terminated, apparently Coverity is not smart enough to see this. Make it easier for Coverity to analyze this code by changing the strncpy() call into a strlcpy() call. This patch does not change the behavior of the code but fixes Coverity ID 1423720.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
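The change boils down to the classic strncpy()-to-strlcpy() swap, since strlcpy() always NUL-terminates; roughly (size constant as used in the fabrics code):

```c
/* strlcpy() guarantees '\0' termination, which strncpy() does not */
strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
```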
-
Committed by Bart Van Assche
This patch silences sparse complaints about missing declarations.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Bart Van Assche
Check whether queue->cm_error holds a value before reading it. This patch addresses Coverity ID 1373774: unchecked return value.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Keith Busch
The nvme namespace paths were being updated only when the current path was not set or was non-optimized. If a new path comes online that is a better path for its NUMA node, the multipath selector could keep using the previously set path on a potentially more distant node. This patch re-runs the path assignment after successfully adding a new optimized path.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 09 Oct 2018, 3 commits
-
-
Committed by Javier González
1.2 devices expose their data and metadata sizes through the separate identify command. Make sure that the NVMe LBA format does not override these values.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Javier González
The lightnvm subsystem provides helpers to retrieve chunk metadata, where the target needs to provide a buffer to store the metadata. An implicit assumption is that this buffer is contiguous and can be used to retrieve the data from the device. If the device exposes too many chunks, kmalloc might fail, which in turn fails instance creation. This patch removes that assumption by implementing an internal buffer in the lightnvm subsystem for retrieving chunk metadata; targets can then use virtual memory allocations. Since this is a target API change, adapt pblk accordingly.
Signed-off-by: Javier González <javier@cnexlabs.com>
Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Matias Bjørling
pblk implements two data paths for recovering line state, one for 1.2 and another for 2.0. Instead of having pblk implement these, combine them in the core to reduce complexity and make them available to other targets. The new interface adheres to the 2.0 chunk definition, including managing open chunks with an active write pointer. To provide this interface, a 1.2 device recovers the state of its chunks by manually detecting whether a chunk is free/open/closed/offline, and, if open, scanning the flash pages sequentially to find the next writable page. This process takes on average ~10 seconds on a device with 64 dies, 1024 blocks and 60us read access time. The process could be parallelized, but that is left out for maintenance simplicity, as the 1.2 specification is deprecated. For 2.0 devices, the logic is maintained internally in the drive and retrieved through the 2.0 interface.
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 08 Oct 2018, 1 commit
-
-
Committed by Keith Busch
The code had been clearing a namespace being deleted as the current path while that namespace was still in the path siblings list. A new I/O could therefore set that namespace back as the current path, since it still appeared to be an eligible path to select, which may result in a use-after-free error. This patch ensures a namespace being removed is no longer eligible to be set as a current path prior to clearing it as the current path.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 03 Oct 2018, 1 commit
-
-
Committed by Oza Pawandeep
After bfcb79fc ("PCI/ERR: Run error recovery callbacks for all affected devices"), AER errors are always cleared by the PCI core and drivers don't need to do it themselves. Remove calls to pci_cleanup_aer_uncorrect_error_status() from device driver error recovery functions.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove PCI core changes, remove unused variables]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
- 02 Oct 2018, 6 commits
-
-
Committed by Christoph Hellwig
Make current_path an array with an entry for every possible node, and cache the best path on a per-node basis. Take the node distance into account when selecting it. This is primarily useful for dual-ported PCIe devices that are connected to PCIe root ports on different sockets.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
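A hedged sketch of the resulting fast path, assuming the head->current_path array and an SRCU-protected lookup; __nvme_find_path() stands in for the distance-aware (re)selection:

```c
/* Sketch: per-NUMA-node cached path with lazy, distance-aware refill. */
static struct nvme_ns *nvme_find_path(struct nvme_ns_head *head)
{
	int node = numa_node_id();
	struct nvme_ns *ns;

	ns = srcu_dereference(head->current_path[node], &head->srcu);
	if (unlikely(!ns || ns->ctrl->state != NVME_CTRL_LIVE))
		ns = __nvme_find_path(head, node);
	return ns;
}
```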
-
Committed by James Smart
When an I/O is rejected by nvmf_check_ready() due to validation of the controller state, nvmf_fail_nonready_command() will normally return BLK_STS_RESOURCE to requeue and retry. However, if the controller is dying or the I/O is marked for NVMe multipath, the I/O is failed so that the controller can terminate or so that the I/O can be issued on a different path. Unfortunately, as this reject point is before the transport has accepted the command, blk-mq ends up completing the I/O without ever calling nvme_complete_rq(), which is where multipath may preserve or re-route the I/O. The end result is that the device user sees an EIO error. Example: single-path connectivity, a controller under load, and an induced reset. An I/O is received: a) while the reset state has been set but the queues have yet to be stopped; or b) after queues are started (at the end of reset) but before the reconnect has completed. The I/O finishes with an EIO status.
This patch makes the following changes:
- Adds the HOST_PATH_ERROR pathing status from TP4028.
- Modifies the reject point so that the command appears to queue successfully, but is actually completed with the new pathing status via nvme_complete_rq().
- nvme_complete_rq() recognizes the new status, avoids resetting the controller (that likely already happened in order to produce this status), and calls the multipath code to clear the current path that errored. The next command (a retry or a new command) can then select a new path if one exists.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Chaitanya Kulkarni
This patch adds a new event for nvme async event notification. We print the async event in decoded form when we recognize the event; otherwise we just dump the result.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by James Smart
The fc transport device should allow for a rediscovery, as userspace might have lost the events. An example is udev events not being handled during system startup. This patch adds a sysfs entry, 'nvme_discovery', on the fc class to have it replay all udev discovery events for all local port/remote port address pairs.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Milan P. Gandhi
Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Milan P. Gandhi
Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 28 Sep 2018, 2 commits
-
-
Committed by Hannes Reinecke
We should be registering the ns_id attributes as default sysfs attribute groups; otherwise we have a race condition between the uevent and the attributes appearing in sysfs.
Suggested-by: Bart van Assche <bvanassche@acm.org>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Hannes Reinecke
Update device_add_disk() to take a 'groups' argument so that individual drivers can register a device with additional sysfs attributes. This avoids the race condition the driver would otherwise have if these groups were created afterwards with sysfs_create_groups().
Signed-off-by: Martin Wilck <martin.wilck@suse.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 26 Sep 2018, 1 commit
-
-
Committed by Susobhan Dey
Signed-off-by: Susobhan Dey <susobhan.dey@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 28 Aug 2018, 1 commit
-
-
Committed by Michal Wnukowski
On many architectures, loads may be reordered with older stores to different locations. In the nvme driver, the following two operations could be reordered:
- Write shadow doorbell (dbbuf_db) into memory.
- Read EventIdx (dbbuf_ei) from memory.
This can result in a potential race condition between the driver and the VM host processing requests (if the given virtual NVMe controller has support for the shadow doorbell). If it occurs, the NVMe controller may decide to wait for an MMIO doorbell from the guest operating system, while the guest driver may decide not to issue an MMIO doorbell on any subsequent commands. This issue is purely timing-dependent, so there is no easy way to reproduce it. Currently the easiest known approach is to run "Oracle IO Numbers" (orion), shipped with Oracle DB:
orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
concat -write 40 -duration 120 -matrix row -testname nvme_test
where nvme_test is a .lun file containing a list of NVMe block devices to run the test against. Limiting the number of vCPUs assigned to a given VM instance seems to increase the chances of this bug occurring. On a test environment with a VM that had 4 NVMe drives and 1 vCPU assigned, the virtual NVMe controller hang could be observed within 10-20 minutes. That corresponds to about 400-500k I/O operations processed (or about 100GB of I/O reads/writes). The orion tool was then used as validation, set to run in a loop for 36 hours (equivalent to pushing 550M I/O operations). No issues were observed, which suggests that the patch fixes the issue.
Fixes: f9f38e33 ("nvme: improve performance for virtual NVMe devices")
Signed-off-by: Michal Wnukowski <wnukowski@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
[hch: updated changelog and comment a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
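The essence of the fix is a single full memory barrier between the shadow-doorbell store and the EventIdx load; a sketch of the helper introduced by f9f38e33, with the added barrier:

```c
static bool nvme_dbbuf_update_and_check_event(u16 value, u32 *dbbuf_db,
					      volatile u32 *dbbuf_ei)
{
	if (dbbuf_db) {
		u16 old_value = *dbbuf_db;

		*dbbuf_db = value;

		/*
		 * Ensure the doorbell store is globally visible before
		 * reading EventIdx; otherwise each side can conclude the
		 * other will act, and the queue hangs.
		 */
		mb();

		if (!nvme_dbbuf_need_event(*dbbuf_ei, value, old_value))
			return false;
	}
	return true;
}
```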
-
- 08 Aug 2018, 2 commits
-
-
Committed by Tal Shorer
When the user supplies a ctrl_loss_tmo < 0, we warn them that this will cause the fabrics layer to attempt reconnection forever. However, in reality the fabrics layer never attempts to reconnect, because the condition that tests whether we should reconnect is backwards in this case.
Signed-off-by: Tal Shorer <tal.shorer@gmail.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
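An illustration of the intended semantics, not the literal diff: a negative ctrl_loss_tmo maps to max_reconnects == -1, which must always answer "reconnect" (the helper name here is hypothetical):

```c
/* Hypothetical helper showing the corrected condition. */
static bool should_reconnect(int nr_reconnects, int max_reconnects)
{
	/* -1 means "retry forever"; otherwise bound the retries */
	return max_reconnects == -1 || nr_reconnects < max_reconnects;
}
```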
-
Committed by Chaitanya Kulkarni
NVMe 1.3 TP 4005 introduces a new field (NSATTR) that indicates whether a given namespace is write protected. This patch sets the gendisk associated with the namespace to read-only based on the Identify Namespace nsattr field.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
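In sketch form, assuming bit 0 of NSATTR carries the write-protected flag per the TP, applied where the disk info is updated:

```c
/* Mark the gendisk read-only when the namespace is write protected. */
if (id->nsattr & (1 << 0))
	set_disk_ro(disk, true);
else
	set_disk_ro(disk, false);
```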
-
- 07 Aug 2018, 1 commit
-
-
Committed by Hannes Reinecke
When the initial discovery fails, the subsystem hasn't been set up yet in nvme_mpath_stop, and we can't dereference ctrl->subsys.
Fixes: 0d0b660f ("nvme: add ANA support")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 06 Aug 2018, 1 commit
-
-
Committed by Matias Bjørling
A minor version number increase should not break backwards compatibility.
Fixes: 3cb98f84 ("lightnvm: add minor version to generic geometry")
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 30 Jul 2018, 2 commits
-
-
Committed by Max Gurtovoy
Also move the remapping logic to the nvme core driver instead of implementing it in the nvme pci driver. This way, all the other nvme transport drivers will benefit from it (in case they implement metadata support).
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Max Gurtovoy
Currently this function is implemented in the SCSI layer, but its actual place should be the block layer, since T10-PI is a general data-integrity feature that is used in the NVMe protocol as well.
Suggested-by: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 28 Jul 2018, 3 commits
-
-
Committed by Christoph Hellwig
Add support for Asymmetric Namespace Access as specified in NVMe 1.3 TP 4004. With ANA, each namespace attached to a controller belongs to an ANA group that describes the characteristics of accessing the namespaces through this controller. In the optimized and non-optimized states, namespaces can be accessed regularly, although in a multi-pathing environment we should always prefer to access a namespace through a controller where an optimized relationship exists. Namespaces in Inaccessible, Persistent-Loss or Change state for a given controller should not be accessed. The states are updated through reading the ANA log page, which is read once during controller initialization, whenever the ANA change notice AEN is received, or when one of the ANA-specific status codes that signal a state change is received on a command. The ANA state is kept in the nvme_ns structure, which makes the checks in the fast path very simple. Updating the ANA state when reading the log page is also very simple; the only downside is that finding the initial ANA state when scanning for namespaces is a bit cumbersome. The gendisk for a ns_head is only registered once a live path for it exists. Without that, the kernel would hang during partition scanning. Includes fixes and improvements from Hannes Reinecke.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
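A hedged sketch of the fast-path classification inside path selection, using the ANA state constants from the spec definitions (loop context and variable names are illustrative):

```c
/* Sketch: classify one candidate namespace by its cached ANA state. */
switch (ns->ana_state) {
case NVME_ANA_OPTIMIZED:
	found = ns;		/* best possible: stop looking */
	break;
case NVME_ANA_NONOPTIMIZED:
	fallback = ns;		/* usable only if no optimized path exists */
	break;
default:
	/* Inaccessible, Persistent-Loss, Change: do not use */
	break;
}
```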
-
Committed by Christoph Hellwig
Now that we just call out to blk_path_error, there isn't really any good reason not to merge it into the only caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-
Committed by Christoph Hellwig
Merge nvme_get_log and nvme_get_log_ext into a single helper that takes a plain nsid instead of the nvme_ns pointer. Also add support for the log-specific field while we're at it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
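The merged helper's shape, per the description (nsid and the log-specific field, lsp, become plain parameters; exact parameter order is a sketch):

```c
int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp,
		 void *log, size_t size, u64 offset);
```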
-