提交 · 9a0be7abb62ff2a7dc3360ab45c31f29b3faf642 · openanolis / cloud-kernel

02 12月, 2015 15 次提交

nvme: refactor set_queue_count · 9a0be7ab

由 Christoph Hellwig 提交于 11月 26, 2015

Split out a helper that just issues the Set Features and interprets the
result which can go to common code, and document why we are ignoring
non-timeout error returns in the PCIe driver.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

9a0be7ab

nvme: move chardev and sysfs interface to common code · f3ca80fc

由 Christoph Hellwig 提交于 11月 28, 2015

For this we need to add a proper controller init routine and a list of
all controllers that is in addition to the list of PCIe controllers,
which stays in pci.c.  Note that we remove the sysfs device when the
last reference to a controller is dropped now - the old code would have
kept it around longer, which doesn't make much sense.

This requires a new ->reset_ctrl operation to implement controleller
resets, and a new ->write_reg32 operation that is required to implement
subsystem resets.  We also now store caches copied of the NVMe compliance
version and the flag if a controller is attached to a subsystem or not in
the generic controller structure now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
[Fixes for pr merge]
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

f3ca80fc

nvme: move namespace scanning to common code · 5bae7f73

由 Christoph Hellwig 提交于 11月 28, 2015

The namespace scanning code has been mostly generic already, we just
need to store a pointer to the tagset in the nvme_ctrl structure, and
add a method to check if a controller is I/O incapable.  The latter
will hopefully be replaced by a proper controller state machine soon.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
[Fixed pr conflicts]
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

5bae7f73

nvme: add a common helper to read Identify Controller data · 7fd8930f

由 Christoph Hellwig 提交于 11月 28, 2015

And add the 64-bit register read operation for it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

7fd8930f

C
nvme: move nvme_{enable,disable,shutdown}_ctrl to common code · 5fd4ce1b
由 Christoph Hellwig 提交于 11月 28, 2015
```
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>
```
5fd4ce1b

nvme: add explicit quirk handling · 106198ed

由 Christoph Hellwig 提交于 11月 26, 2015

Add an enum for all workarounds not in the spec and identify the affected
controllers at probe time.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

106198ed

nvme: move block_device_operations and ns/ctrl freeing to common code · 1673f1f0

由 Christoph Hellwig 提交于 11月 26, 2015

This moves the block_device_operations over to common code mostly
as-is.  The only change is that the ns and ctrl refcounting got some
small refcounting to have wrappers around the kref_put operations.

A new free_ctrl operation is added to allow the PCI driver to free
it's ressources on the final drop.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
[Moved the integrity and pr changes due to merge conflict]
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

1673f1f0

nvme: use the block layer for userspace passthrough metadata · 0b7f1f26

由 Keith Busch 提交于 10月 23, 2015

Use the integrity API to pass through metadata from userspace.  For PI
enabled devices this means that we now validate the reftag, which seems
like an unintentional ommission in the old code.

Thanks to Keith Busch for testing and fixes.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
[Skip metadata setup on admin commands]
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

0b7f1f26

nvme: split __nvme_submit_sync_cmd · 4160982e

由 Christoph Hellwig 提交于 11月 20, 2015

Add a separate nvme_submit_user_cmd for commands that directly DMA
to or from userspace.  We'll add metadata support to that soon and
the common version would become too messy.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

4160982e

nvme: move nvme_setup_flush and nvme_setup_rw to common code · 22944e99

由 Christoph Hellwig 提交于 10月 16, 2015

And mark them inline so that we don't slow down the I/O submission path by
having to turn it into a forced out of line call.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

22944e99

nvme: move nvme_error_status to common code · 15a190f7

由 Christoph Hellwig 提交于 10月 16, 2015

And mark it inline so that we don't slow down the completion path by
having to turn it into a forced out of line call.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

15a190f7

nvme: split a new struct nvme_ctrl out of struct nvme_dev · 1c63dc66

由 Christoph Hellwig 提交于 11月 26, 2015

The new struct nvme_ctrl will be used by the common NVMe code that sits
on top of struct request_queue and the new nvme_ctrl_ops abstraction.
It only contains the bare minimum required, which consists of values
sampled during controller probe, the admin queue pointer and a second
struct device pointer at the moment, but more will follow later.  Only
values that are not used in the I/O fast path should be moved to
struct nvme_ctrl so that drivers can optimize their cache line usage
easily.  That's also the reason why we have two device pointers as
the struct device is used for DMA mapping purposes.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

1c63dc66

nvme: use offset instead of a struct for registers · 7a67cbea

由 Christoph Hellwig 提交于 11月 20, 2015

This makes life easier for future non-PCI drivers where access to the
registers might be more complicated.  Note that Linux drivers are
pretty evenly split between the two versions, and in fact the NVMe
driver already uses offsets for the doorbells.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NKeith Busch <keith.busch@intel.com>
[Fixed CMBSZ offset]
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

7a67cbea

nvme: split command submission helpers out of pci.c · 21d34711

由 Christoph Hellwig 提交于 11月 26, 2015

Create a new core.c and start by adding the command submission helpers
to it, which are already abstracted away from the actual hardware queues
by the block layer.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

21d34711

nvme: move struct nvme_iod to pci.c · 71bd150c

由 Christoph Hellwig 提交于 10月 16, 2015

This structure is specific to the PCIe driver internals and should be moved
to pci.c.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

71bd150c

29 10月, 2015 1 次提交

nvme: LightNVM support · ca064085

由 Matias Bjørling 提交于 10月 29, 2015

The first generation of Open-Channel SSDs is based on NVMe. The NVMe
driver is extended with support for the LightNVM command set.

Detection is made through PCI IDs. Current supported devices are the
qemu nvme simulator and CNEX Labs Westlake SSD. The qemu nvme enables
support through vendor specific bits in the namespace identification and
the CNEX Labs Westlake SSD implements a LightNVM compatible firmware and
is detected using the same method as qemu.

After detection, vendor specific codes are used to identify the device
and enumerate supported features.
Reviewed-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJavier González <jg@lightnvm.io>
Signed-off-by: NMatias Bjørling <m@bjorling.me>
Signed-off-by: NJens Axboe <axboe@fb.com>

ca064085

10 10月, 2015 4 次提交

nvme: move to a new drivers/nvme/host directory · 57dacad5

由 Jay Sternberg 提交于 10月 09, 2015

This patch moves the NVMe driver from drivers/block/ to its own new
drivers/nvme/host/ directory.  This is in preparation of splitting the
current monolithic driver up and add support for the upcoming NVMe
over Fabrics standard.  The drivers/nvme/host/ is chose to leave space
for a NVMe target implementation in addition to this host side driver.
Signed-off-by: NJay Sternberg <jay.e.sternberg@intel.com>
[hch: rebased, renamed core.c to pci.c, slight tweaks]
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

57dacad5

nvme: add a local nvme.h header · f11bb3e2

由 Christoph Hellwig 提交于 10月 03, 2015

Add a new drivers/block/nvme.h which contains all the driver internal
interface.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

f11bb3e2

NVMe: Simplify device resume on io queue failure · 0a7385ad

由 Keith Busch 提交于 10月 02, 2015

Releasing IO queues and disks was done in a work queue outside the
controller resume context to delete namespaces if the controller failed
after a resume from suspend. This is unnecessary since we can resume
a device asynchronously.

This patch makes resume use probe_work so it can directly remove
namespaces if the device is manageable but not IO capable. Since the
deleting disks was the only reason we had the convoluted "reset_workfn",
this patch removes that unnecessary indirection.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

0a7385ad

NVMe: Reference count open namespaces · 188c3568

由 Keith Busch 提交于 10月 01, 2015

Dynamic namespace attachment means the namespace may be removed at any
time, so the namespace reference count can not be tied to the device
reference count. This fixes a NULL dereference if an opened namespace
is detached from a controller.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

188c3568

19 8月, 2015 2 次提交

NVMe: Add nvme subsystem reset IOCTL · 81f03fed

由 Jon Derrick 提交于 8月 10, 2015

Controllers can perform optional subsystem resets as introduced in NVMe
1.1. This patch adds an IOCTL to trigger the subsystem reset by writing
"NVMe" to the NSSR register.
Signed-off-by: NJon Derrick <jonathan.derrick@intel.com>
Acked-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

81f03fed

NVMe: Add nvme subsystem reset support · dfbac8c7

由 Keith Busch 提交于 8月 10, 2015

Controllers part of an NVMe subsystem may be reset by any other controller
in the subsystem. If the device is capable of subsystem resets, this
patch adds detection for such events and performs appropriate controller
initialization upon subsystem reset detection.

The register bit is a RW1C type, so the driver needs to write a 1 to the
status bit to clear the subsystem reset occured bit during initialization.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

dfbac8c7

21 7月, 2015 1 次提交

NVMe: Use CMB for the IO SQes if available · 8ffaadf7

由 Jon Derrick 提交于 7月 20, 2015

Some controllers have a controller-side memory buffer available for use
for submissions, completions, lists, or data.

If a CMB is available, the entire CMB will be ioremapped and it will
attempt to map the IO SQes onto the CMB. The queues will be shrunk as
needed. The CMB will not be used if the queue depth is shrunk below some
threshold where it may have reduced performance over a larger queue
in system memory.
Signed-off-by: NJon Derrick <jonathan.derrick@intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

8ffaadf7

06 6月, 2015 1 次提交

NVMe: Automatic namespace rescan · a5768aa8

由 Keith Busch 提交于 6月 01, 2015

Namespaces may be dynamically allocated and deleted or attached and
detached. This has the driver rescan the device for namespace changes
after each device reset or namespace change asynchronous event.

There could potentially be many detached namespaces that we don't want
polluting /dev/ with unusable block handles, so this will delete disks
if the namespace is not active as indicated by the response from identify
namespace. This also skips adding the disk if no capacity is provisioned
to the namespace in the first place.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

a5768aa8

22 5月, 2015 3 次提交

nvme: submit internal commands through the block layer · d29ec824

由 Christoph Hellwig 提交于 5月 22, 2015

Use block layer queues with an internal cmd_type to submit internally
generated NVMe commands. This both simplifies the code a lot and allow
for a better structure. For example now the LighNVM code can construct
commands without knowing the details of the underlying I/O descriptors.
Or a future NVMe over network target could inject commands, as well as
could the SCSI translation and ioctl code be reused for such a beast.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

d29ec824

nvme: store a struct device pointer in struct nvme_dev · e75ec752

由 Christoph Hellwig 提交于 5月 22, 2015

Most users want the generic device, so store that in struct nvme_dev
instead of the pci_dev.  This also happens to be a nice step towards
making some code reusable for non-PCI transports.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

e75ec752

nvme: consolidate synchronous command submission helpers · f705f837

由 Christoph Hellwig 提交于 5月 22, 2015

Note that we keep the unused timeout argument, but allow callers to
pass 0 instead of a timeout if they want the default.  This will allow
adding a timeout to the pass through path later on.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

f705f837

08 4月, 2015 1 次提交

NVMe: Meta data handling through submit io ioctl · a67a9513

由 Keith Busch 提交于 4月 07, 2015

This adds support for the extended metadata formats through the submit
IO ioctl, and simplifies the rest when using a separate metadata format.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

a67a9513

20 2月, 2015 5 次提交

NVMe: Fix potential corruption during shutdown · 07836e65

由 Keith Busch 提交于 2月 19, 2015

The driver has to end unreturned commands at some point even if the
controller has not provided a completion. The driver tried to be safe by
deleting IO queues prior to ending all unreturned commands. That should
cause the controller to internally abort inflight commands, but IO queue
deletion request does not have to be successful, so all bets are off. We
still have to make progress, so to be extra safe, this patch doesn't
clear a queue to release the dma mapping for a command until after the
pci device has been disabled.

This patch removes the special handling during device initialization
so controller recovery can be done all the time. This is possible since
initialization is not inlined with pci probe anymore.
Reported-by: NNilish Choudhury <nilesh.choudhury@oracle.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>

07836e65

NVMe: Asynchronous controller probe · 2e1d8448

由 Keith Busch 提交于 2月 12, 2015

This performs the longest parts of nvme device probe in scheduled work.
This speeds up probe significantly when multiple devices are in use.
Signed-off-by: NKeith Busch <keith.busch@intel.com>

2e1d8448

NVMe: Register management handle under nvme class · b3fffdef

由 Keith Busch 提交于 2月 03, 2015

This creates a new class type for nvme devices to register their
management character devices with. This is so we do not rely on miscdev
to provide enough minors for as many nvme devices some people plan to
use. The previous limit was approximately 60 NVMe controllers, depending
on the platform and kernel. Now the limit is 1M, which ought to be enough
for anybody.

Since we have a new device class, it makes sense to attach the block
devices under this as well, so part of this patch moves the management
handle initialization prior to the namespaces discovery.
Signed-off-by: NKeith Busch <keith.busch@intel.com>

b3fffdef

NVMe: Update SCSI Inquiry VPD 83h translation · 4f1982b4

由 Keith Busch 提交于 2月 19, 2015

The original translation created collisions on Inquiry VPD 83 for many
existing devices. Newer specifications provide other ways to translate
based on the device's version can be used to create unique identifiers.

Version 1.1 provides an EUI64 field that uniquely identifies each
namespace, and 1.2 added the longer NGUID field for the same reason.
Both follow the IEEE EUI format and readily translate to the SCSI device
identification EUI designator type 2h. For devices implementing either,
the translation will use this type, defaulting to the EUI64 8-byte type if
implemented then NGUID's 16 byte version if not. If neither are provided,
the 1.0 translation is used, and is updated to use the SCSI String format
to guarantee a unique identifier.

Knowing when to use the new fields depends on the nvme controller's
revision. The NVME_VS macro was not decoding this correctly, so that is
fixed in this patch and moved to a more appropriate place.

Since the Identify Namespace structure required an update for the NGUID
field, this patch adds the remaining new 1.2 fields to the structure.
Signed-off-by: NKeith Busch <keith.busch@intel.com>

4f1982b4

NVMe: Metadata format support · e1e5e564

由 Keith Busch 提交于 2月 19, 2015

Adds support for NVMe metadata formats and exposes block devices for
all namespaces regardless of their format. Namespace formats that are
unusable will have disk capacity set to 0, but a handle to the block
device is created to simplify device management. A namespace is not
usable when the format requires host interleave block and metadata in
single buffer, has no provisioned storage, or has better data but failed
to register with blk integrity.

The namespace has to be scanned in two phases to support separate
metadata formats. The first establishes the sector size and capacity
prior to invoking add_disk. If metadata is required, the capacity will
be temporarilly set to 0 until it can be revalidated and registered with
the integrity extenstions after add_disk completes.

The driver relies on the integrity extensions to provide the metadata
buffer. NVMe requires this be a single physically contiguous region,
so only one integrity segment is allowed per command. If the metadata
is used for T10 PI, the driver provides mappings to save and restore
the reftag physical block translation. The driver provides no-op
functions for generate and verify if metadata is not used for protection
information. This way the setup is always provided by the block layer.

If a request does not supply a required metadata buffer, the command
is failed with bad address. This could only happen if a user manually
disables verify/generate on such a disk. The only exception to where
this is okay is if the controller is capable of stripping/generating
the metadata, which is possible on some types of formats.

The metadata scatter gather list now occupies the spot in the nvme_iod
that used to be used to link retryable IOD's, but we don't do that
anymore, so the field was unused.
Signed-off-by: NKeith Busch <keith.busch@intel.com>

e1e5e564

30 1月, 2015 1 次提交

NVMe: avoid kmalloc/kfree for smaller IO · ac3dd5bd

由 Jens Axboe 提交于 1月 22, 2015

Currently we allocate an nvme_iod for each IO, which holds the
sg list, prps, and other IO related info. Set a threshold of
2 pages and/or 8KB of data, below which we can just embed this
in the per-command pdu in blk-mq. For any IO at or below
NVME_INT_PAGES and NVME_INT_BYTES, we save a kmalloc and kfree.

For higher IOPS, this saves up to 1% of CPU time.
Signed-off-by: NJens Axboe <axboe@fb.com>
Reviewed-by: NKeith Busch <keith.busch@intel.com>

ac3dd5bd

05 11月, 2014 3 次提交

NVMe: Convert to blk-mq · a4aea562

由 Matias Bjørling 提交于 11月 04, 2014

This converts the NVMe driver to a blk-mq request-based driver.

The NVMe driver is currently bio-based and implements queue logic within
itself.  By using blk-mq, a lot of these responsibilities can be moved
and simplified.

The patch is divided into the following blocks:

 * Per-command data and cmdid have been moved into the struct request
   field. The cmdid_data can be retrieved using blk_mq_rq_to_pdu() and id
   maintenance are now handled by blk-mq through the rq->tag field.

 * The logic for splitting bio's has been moved into the blk-mq layer.
   The driver instead notifies the block layer about limited gap support in
   SG lists.

 * blk-mq handles timeouts and is reimplemented within nvme_timeout().
   This both includes abort handling and command cancelation.

 * Assignment of nvme queues to CPUs are replaced with the blk-mq
   version. The current blk-mq strategy is to assign the number of
   mapped queues and CPUs to provide synergy, while the nvme driver
   assign as many nvme hw queues as possible. This can be implemented in
   blk-mq if needed.

 * NVMe queues are merged with the tags structure of blk-mq.

 * blk-mq takes care of setup/teardown of nvme queues and guards invalid
   accesses. Therefore, RCU-usage for nvme queues can be removed.

 * IO tracing and accounting are handled by blk-mq and therefore removed.

 * Queue suspension logic is replaced with the logic from the block
   layer.

Contributions in this patch from:

  Sam Bradshaw <sbradshaw@micron.com>
  Jens Axboe <axboe@fb.com>
  Keith Busch <keith.busch@intel.com>
  Robert Nelson <rlnelson@google.com>
Acked-by: NKeith Busch <keith.busch@intel.com>
Acked-by: NJens Axboe <axboe@fb.com>

Updated for new ->queue_rq() prototype.
Signed-off-by: NJens Axboe <axboe@fb.com>

a4aea562

NVMe: Mismatched host/device page size support · 1d090624

由 Keith Busch 提交于 6月 23, 2014

Adds support for devices with max page size smaller than the host's.
In the case we encounter such a host/device combination, the driver will
split a page into as many PRP entries as necessary for the device's page
size capabilities. If the device's reported minimum page size is greater
than the host's, the driver will not attempt to enable the device and
return an error instead.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

1d090624

NVMe: Async event request · 6fccf938

由 Keith Busch 提交于 6月 18, 2014

Submits NVMe asynchronous event requests, one event up to the controller
maximum or number of possible different event types (8), whichever is
smaller. Events successfully returned by the controller are logged.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

6fccf938

13 6月, 2014 1 次提交

NVMe: Fix hot cpu notification dead lock · f3db22fe

由 Keith Busch 提交于 6月 11, 2014

There is a potential dead lock if a cpu event occurs during nvme probe
since it registered with hot cpu notification. This fixes the race by
having the module register with notification outside of probe rather
than have each device register.

The actual work is done in a scheduled work queue instead of in the
notifier since assigning IO queues has the potential to block if the
driver creates additional queues.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

f3db22fe

04 6月, 2014 1 次提交

NVMe: Rename io_timeout to nvme_io_timeout · bd67608a

由 Matthew Wilcox 提交于 6月 03, 2014

It's positively immoral to have a global variable called 'io_timeout'.
Keep the module parameter called io_timeout, though.
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

bd67608a

05 5月, 2014 1 次提交

NVMe: Flush with data support · 53562be7

由 Keith Busch 提交于 4月 29, 2014

It is possible a filesystem may send a flush flagged bio with write
data. There is no such composite NVMe command, so the driver sends flush
and write separately.

The device is allowed to execute these commands in any order, so it was
possible the driver ends the bio after the write completes, but while the
flush is still active. We don't want to let a filesystem believe flush
succeeded before it really has; this could cause data corruption on a
power loss between these events. To fix, this patch splits the flush
and write into chained bios.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

53562be7

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功