- 19 Jun 2017, 14 commits
-
-
Submitted by Mikulas Patocka
This ioctl will record the current global event number in the structure dm_file, so that the next select or poll call will wait until new events arrive after this ioctl. The DM_DEV_ARM_POLL ioctl has the same effect as closing and reopening the handle. Using the DM_DEV_ARM_POLL ioctl is optional - if userspace is OK with closing and reopening the /dev/mapper/control handle after select or poll, there is no need to re-arm via ioctl. Usage:
1. open the /dev/mapper/control device
2. send the DM_DEV_ARM_POLL ioctl
3. scan the event numbers of all devices we are interested in and process them
4. call select, poll or epoll on the handle (it waits until some new event happens since the DM_DEV_ARM_POLL ioctl)
5. go to step 2
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
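A minimal userspace sketch of this re-arm loop, assuming <linux/dm-ioctl.h> provides DM_DEV_ARM_POLL and struct dm_ioctl as described above; the per-device event-number scan and error handling are elided:

```c
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dm-ioctl.h>

int main(void)
{
    /* Step 1: open the control device once and keep it open. */
    int fd = open("/dev/mapper/control", O_RDWR);

    if (fd < 0)
        return 1;

    for (;;) {
        struct dm_ioctl dmi;
        struct pollfd pfd = { .fd = fd, .events = POLLIN };

        /* Step 2: record the current global event number (re-arm). */
        memset(&dmi, 0, sizeof(dmi));
        dmi.version[0] = DM_VERSION_MAJOR;
        dmi.version[1] = DM_VERSION_MINOR;
        dmi.version[2] = DM_VERSION_PATCHLEVEL;
        dmi.data_size = sizeof(dmi);
        if (ioctl(fd, DM_DEV_ARM_POLL, &dmi) < 0)
            break;

        /* Step 3: scan the event numbers of the devices we care
         * about and process any that changed (elided). */

        /* Step 4: wait for any dm event newer than the arm point. */
        poll(&pfd, 1, -1);

        /* Step 5: loop back to step 2. */
    }
    close(fd);
    return 1;
}
```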
-
Submitted by Mikulas Patocka
Add the ability to poll on the /dev/mapper/control device. The select or poll function waits until any event happens on any dm device since opening the /dev/mapper/control device. When select or poll returns the device as readable, we must close and reopen the device to wait for new dm events. Usage:
1. open the /dev/mapper/control device
2. scan the event numbers of all devices we are interested in and process them
3. call select, poll or epoll on the handle (it waits until some new event happens since opening the device)
4. close the /dev/mapper/control handle
5. go to step 1
The next commit allows re-arming the polling without closing and reopening the device. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
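A sketch of the close-and-reopen variant, assuming only the poll support added here (no re-arm ioctl):

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

void wait_for_dm_events(void)
{
    for (;;) {
        /* Step 1: (re)open the control device; events are counted
         * relative to this open. */
        int fd = open("/dev/mapper/control", O_RDWR);

        if (fd < 0)
            return;

        /* Step 2: scan the event numbers of the devices we care
         * about and process any that changed (elided). */

        /* Step 3: block until some dm device generates a new event. */
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        poll(&pfd, 1, -1);

        /* Step 4: close, then loop back to step 1. */
        close(fd);
    }
}
```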
-
Submitted by Ming Lei
When nvme_kill_queues() is run, queues may be in the quiesced state, so forcibly unquiesce them to avoid blocking dispatch and to prevent an I/O hang in the remove path. Previously we used blk_mq_start_stopped_hw_queues() as the counterpart of blk_mq_quiesce_queue(); now that blk_mq_unquiesce_queue() has been introduced, use it explicitly. Cc: linux-nvme@lists.infradead.org Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Ming Lei
blk_mq_unquiesce_queue() is used for unquiescing the queue explicitly, so replace blk_mq_start_stopped_hw_queues() with it. For the SCSI part, this patch takes Bart's suggestion to switch to the block quiesce/unquiesce API completely. Cc: linux-nvme@lists.infradead.org Cc: linux-scsi@vger.kernel.org Cc: dm-devel@redhat.com Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
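A sketch of the pairing this conversion establishes; the function below is illustrative, not taken from any of the touched drivers:

```c
#include <linux/blk-mq.h>

/* Pause dispatch on a queue, update state that the dispatch path
 * depends on, then resume dispatch with the explicit counterpart. */
static void example_update_under_quiesce(struct request_queue *q)
{
    blk_mq_quiesce_queue(q);    /* no .queue_rq() runs past this point */

    /* ... update dispatch-visible driver state here ... */

    blk_mq_unquiesce_queue(q);  /* replaces blk_mq_start_stopped_hw_queues() */
}
```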
-
Submitted by NeilBrown
bio_clone() is no longer used. Only bio_clone_bioset() or bio_clone_fast(). This is for the best, as bio_clone() used fs_bio_set, and filesystems are unlikely to want to use bio_clone(). So remove bio_clone() and all references. This includes a fix to some incorrect documentation. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by NeilBrown
This function allocates a bio, then a collection of pages. It copes with failure. It currently uses a mempool to allocate the bio, but alloc_page() to allocate the pages. These fail in different ways, so the usage is inconsistent. Change the bio_clone() to bio_clone_kmalloc() so that no pool is used either for the bio or the pages. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Kent Overstreet <kent.overstreet@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by NeilBrown
bios that are re-submitted will pass through blk_queue_split() when blk_queue_bio() is called, and this will split the bio if necessary. There is no longer any need to do this splitting in xen-blkfront. Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by NeilBrown
pblk_submit_read() uses bio_clone_bioset() but doesn't change the io_vec, so bio_clone_fast() is a better choice. It also uses fs_bio_set, which is intended for filesystems; using it in a device driver can deadlock. So allocate a new bioset and use bio_clone_fast(). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Javier González <javier@cnexlabs.com> Tested-by: Javier González <javier@cnexlabs.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by NeilBrown
pktcdvd doesn't change the bi_io_vec of the clone bio, so it is more efficient to use bio_clone_fast() and not clone the bi_io_vec. This requires providing a bio_set, and it is safest to provide a dedicated bio_set rather than sharing fs_bio_set, which filesystems use. This new bio_set, pkt_bio_set, can also be used for the bio_split() call, as the two allocations (bio_clone_fast and bio_split) are independent; neither can block a bio allocated by the other. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by NeilBrown
drbd does not modify the bi_io_vec of the cloned bio, so there is no need to clone that part. So bio_clone_fast() is the better choice. For bio_clone_fast() we need to specify a bio_set. We could use fs_bio_set, which bio_clone() uses, or drbd_md_io_bio_set, which drbd uses for metadata, but it is generally best to avoid sharing bio_sets unless you can be certain that there are no interdependencies. So create a new bio_set, drbd_io_bio_set, and use bio_clone_fast(). Also remove a "XXX cannot fail ???" comment because it definitely cannot fail - bio_clone_fast() doesn't fail if the GFP flags allow for sleeping. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by NeilBrown
bio_clone() makes a copy of the bi_io_vec, but rbd never changes that, so there is no need for a copy. bio_clone_fast() can be used instead, which avoids making the copy. This requires that we provide a bio_set. bio_clone() uses fs_bio_set, but it isn't, in general, safe to use the same bio_set at different levels of the stack, as that can lead to deadlocks. As filesystems use fs_bio_set, block devices shouldn't. As rbd never stacks, it is safe to have a single global bio_set for all rbd devices to use. So allocate that when the module is initialised, and use it with bio_clone_fast(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
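A sketch of the pattern these driver conversions follow: a dedicated, module-global bio_set created at init time and used with bio_clone_fast(). The names are illustrative rather than the actual rbd symbols, and the flags argument (0 here) would carry driver-appropriate BIOSET_* flags per the bioset_create() commits below.

```c
#include <linux/bio.h>
#include <linux/module.h>

static struct bio_set *example_bio_set;

static int __init example_init(void)
{
    /* One private pool for the whole driver; never share fs_bio_set. */
    example_bio_set = bioset_create(BIO_POOL_SIZE, 0, 0);
    return example_bio_set ? 0 : -ENOMEM;
}

static struct bio *example_clone(struct bio *bio_src)
{
    /* bio_clone_fast() shares bi_io_vec with the source instead of
     * copying it, which is safe because we never modify it. */
    return bio_clone_fast(bio_src, GFP_NOIO, example_bio_set);
}

static void __exit example_exit(void)
{
    bioset_free(example_bio_set);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```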
-
Submitted by NeilBrown
This patch converts bioset_create() to not create a workqueue by default, so allocations will never trigger punt_bios_to_rescuer(). It also introduces a new flag BIOSET_NEED_RESCUER which tells bioset_create() to preserve the old behavior. All callers of bioset_create() that are inside block device drivers are given the BIOSET_NEED_RESCUER flag. biosets used by filesystems or other top-level users do not need rescuing as the bio can never be queued behind other bios. This includes fs_bio_set, blkdev_dio_pool, btrfs_bioset, xfs_ioend_bioset, and one allocated by target_core_iblock.c. biosets used by md/raid do not need rescuing as their usage was recently audited and revised to never risk deadlock. It is hoped that most, if not all, of the remaining biosets can end up being the non-rescued version. Reviewed-by: Christoph Hellwig <hch@lst.de> Credit-to: Ming Lei <ming.lei@redhat.com> (minor fixes) Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
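A sketch of the resulting call sites once this flag and the bioset_create() flags argument from the commit below are in place; the bioset names are illustrative:

```c
#include <linux/bio.h>
#include <linux/module.h>

static struct bio_set *fs_style_bs, *driver_bs, *split_bs;

static int __init example_biosets_init(void)
{
    /* Filesystem-style: needs bvec mempools, no rescuer workqueue,
     * because its bios are never queued behind other bios. */
    fs_style_bs = bioset_create(BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS);

    /* Block-driver style: keep the rescuer so allocation can still
     * make progress when bios pile up underneath the driver. */
    driver_bs = bioset_create(BIO_POOL_SIZE, 0,
                              BIOSET_NEED_BVECS | BIOSET_NEED_RESCUER);

    /* Split/clone-only: bi_io_vec is shared, so no bvec pool is
     * needed (the drivers/md/raid* bio_split case noted below). */
    split_bs = bioset_create(BIO_POOL_SIZE, 0, 0);

    if (!fs_style_bs || !driver_bs || !split_bs)
        return -ENOMEM;
    return 0;
}
```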
-
Submitted by NeilBrown
"flags" arguments are often seen as good API design as they allow easy extensibility. bioset_create_nobvec() is implemented internally as a variation in flags passed to __bioset_create(). To support future extension, make the internal structure part of the API. i.e. add a 'flags' argument to bioset_create() and discard bioset_create_nobvec(). Note that the bio_split allocations in drivers/md/raid* do not need the bvec mempool - they should have used bioset_create_nobvec(). Suggested-by: NChristoph Hellwig <hch@infradead.org> Reviewed-by: NChristoph Hellwig <hch@infradead.org> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
Submitted by NeilBrown
blk_queue_split() is always called with the last arg being q->bio_split, where 'q' is the first arg. Also blk_queue_split() sometimes uses the passed-in 'bs' and sometimes uses q->bio_split. This is inconsistent and unnecessary. Remove the last arg and always use q->bio_split inside blk_queue_split(). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Credit-to: Javier González <jg@lightnvm.io> (Noticed that lightnvm was missed) Reviewed-by: Javier González <javier@cnexlabs.com> Tested-by: Javier González <javier@cnexlabs.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
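A before/after sketch of a typical call site; the make_request function is illustrative:

```c
#include <linux/blkdev.h>

static blk_qc_t example_make_request(struct request_queue *q, struct bio *bio)
{
    /* Before: the caller always passed q->bio_split back in:
     *     blk_queue_split(q, &bio, q->bio_split);
     * After: the redundant argument is gone and q->bio_split is
     * always used internally. */
    blk_queue_split(q, &bio);

    /* ... normal request handling ... */
    return BLK_QC_T_NONE;
}
```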
-
- 18 Jun 2017, 1 commit
-
-
Submitted by NeilBrown
When a filesystem is mounted from a loop device, writes are throttled by balance_dirty_pages() twice: once when writing to the filesystem and once when loop_handle_cmd() writes to the backing file. This double-throttling can trigger positive feedback loops that create significant delays. The throttling at the lower level is seen by the upper level as a slow device, so it throttles extra hard. The PF_LESS_THROTTLE flag was created to handle exactly this circumstance, though with an NFS filesystem mounted from a local NFS server. It reduces the throttling on the lower layer so that it can proceed largely unthrottled. To demonstrate this, create a filesystem on a loop device and write (e.g. with dd) several large files which combine to consume significantly more than the limit set by /proc/sys/vm/dirty_ratio or dirty_bytes. Measure the total time taken. When I do this directly on a device (no loop device) the total time for several runs (mkfs, mount, write 200 files, umount) is fairly stable: 28-35 seconds. When I do this over a loop device the times are much worse and less stable: 52-460 seconds, half below 100 seconds, half above. When I apply this patch, the times become stable again, though not as fast as the no-loop-back case: 53-72 seconds. There may be room for further improvement as the total overhead still seems too high, but this is a big improvement. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <tom.leiming@gmail.com> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
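A sketch of how PF_LESS_THROTTLE is typically applied around the lower-level write, in the spirit of the change described above rather than the literal patch:

```c
#include <linux/sched.h>

/* Mark the backing-file writer as "less throttled" so that
 * balance_dirty_pages() does not throttle it as hard as the writer
 * sitting above the loop device. */
static void example_write_to_backing_file(void)
{
    unsigned int old_flags = current->flags & PF_LESS_THROTTLE;

    current->flags |= PF_LESS_THROTTLE;

    /* ... issue the write to the backing file ... */

    current_restore_flags(old_flags, PF_LESS_THROTTLE);
}
```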
-
- 16 Jun 2017, 2 commits
-
-
Submitted by Arvind Yadav
of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const.
File size before:
   text    data     bss     dec     hex  filename
   8908    1096     624   10628    2984  drivers/block/swim3.o
File size after constify swim3_match:
   text    data     bss     dec     hex  filename
   9708     296     624   10628    2984  drivers/block/swim3.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
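A sketch of the constification; the compatible string is illustrative, not the exact swim3 entry:

```c
#include <linux/module.h>
#include <linux/of.h>

/* Without const this table lands in .data; marking it const moves it
 * to .rodata, which is what every of_* helper expects anyway. */
static const struct of_device_id example_match[] = {
    { .compatible = "example,floppy-controller" },  /* illustrative */
    { /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, example_match);
```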
-
Submitted by Scott Bauer
The NVMe 1.3 spec introduces Namespace Optimal IO Boundaries (NOIOB), which standardizes the stripe mechanism we currently have quirks for. This patch implements the necessary logic to handle this new feature. Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
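A sketch of how a reported optimal IO boundary can be fed to the block layer as a chunk size so that requests never straddle it; the field names and exact rounding are an approximation of the actual change:

```c
#include <linux/blkdev.h>
#include <linux/log2.h>

/* noiob is reported in logical blocks; convert it to 512-byte sectors
 * and let the block layer split I/O at that boundary. */
static void example_set_chunk_size(struct request_queue *q,
                                   u32 noiob, u8 lba_shift)
{
    if (!noiob)
        return;     /* boundary not reported, nothing to do */

    blk_queue_chunk_sectors(q,
        rounddown_pow_of_two(noiob << (lba_shift - 9)));
}
```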
-
- 15 Jun 2017, 23 commits
-
-
Submitted by Sagi Grimberg
Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig
We don't need to wait for the reset from the delayed work item that is kicked off when we don't get a keepalive. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
Submitted by Christoph Hellwig
This moves the nvme_reset function from the PCIe driver to common code, renaming it to nvme_reset_ctrl in the process. Additionally a new helper nvme_reset_ctrl_sync is added for the case where we want to wait for the reset. To facilitate that, the reset_work work structure is moved to the common nvme_ctrl structure and the ->reset_ctrl method is removed. For now the drivers initialize the reset_work with their own callback, but longer term we should move to callouts for specific parts of the reset process and move even more code to the core. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
Submitted by Christoph Hellwig
Now that we get the tagset passed we can have a single implementation for the I/O and admin queues. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig
Now that we get the tagset passed we can have a single implementation for the I/O and admin queues. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig
Now that we get the tagset passed we can have a single implementation for the I/O and admin queues. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig
Now that we get the tagset passed we can have a single implementation for the I/O and admin queues. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig
It only applies to read/write commands, and this way non-PCIe drivers get the check as well instead of having to duplicate it when adding metadata support. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig
And open code the SHUTDOWN_TIMEOUT macro. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Dan Carpenter
We accidentally return ERR_PTR(0), which is NULL. The caller isn't explicitly checking for that, but I couldn't immediately spot whether this would lead to a NULL dereference. Anyway, we can fix it by adding an error code easily enough. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
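The pitfall in generic form: ERR_PTR(0) is just NULL, so an IS_ERR() check on the caller's side reports success while the pointer is unusable. A minimal sketch (struct foo and the allocator are hypothetical):

```c
#include <linux/err.h>
#include <linux/slab.h>

struct foo {
    int dummy;
};

static struct foo *example_alloc(void)
{
    struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

    if (!f)
        return ERR_PTR(-ENOMEM);    /* not ERR_PTR(0), which is NULL */
    return f;
}

/* Caller side: IS_ERR(ERR_PTR(0)) is false, so a bare NULL return would
 * pass an IS_ERR() check and could be dereferenced later. */
```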
-
Submitted by Guan Junxiong
To let the host know what happens during connection establishment, adjust the behavior of nvmf_log_connect_error to make more connect-specific error codes human-readable. Signed-off-by: Guan Junxiong <guanjunxiong@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Bart Van Assche
This was detected by building the nvmet-fc driver with W=1. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: James Smart <james.smart@broadcom.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Johannes Thumshirn
Change the few left over users of ctrl->dev over to using ctrl->device for logging purposes, so we consistently use the same device. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Johannes Thumshirn
Allow overriding the announced NVMe version of a target via configfs. This is particularly helpful when debugging new features for the host or target side without bumping the hard-coded version (as the target might not be fully compliant to the announced version yet). Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Guan Junxiong <guanjunxiong@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Johannes Thumshirn
Add the UUID field from the NVMe Namespace Identification Descriptor to the nvmet_ns structure and allow its population via configfs. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Johannes Thumshirn
An NVMe Identify NS command with a CNS value of '3' expects a list of Namespace Identification Descriptor structures to be returned to the host for the namespace requested in the identify command. This Namespace Identification Descriptor structure consists of the type of the namespace identifier, the length of the identifier and the actual identifier. Valid types are NGUID and UUID, which we have saved in our nvme_ns structure if they have been configured via configfs. If no value has been assigned to one of these we return an "invalid opcode" back to the host, to maintain backward compatibility with older implementations without Namespace Identify Descriptor list support. Also, as the Namespace Identify Descriptor list is the only mandatory feature change between 1.2.1 and 1.3, we can bump the advertised version as well. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
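A sketch of the descriptor layout being returned, following the NVMe 1.3 definition; the struct and constant names here are illustrative and may not match the kernel's exactly:

```c
#include <linux/types.h>

/* One Namespace Identification Descriptor header. nidl bytes of
 * identifier follow it; descriptors are packed back to back in the
 * Identify (CNS 0x3) payload and the list ends with a zero nidt. */
struct example_ns_id_desc {
    __u8    nidt;       /* identifier type */
    __u8    nidl;       /* identifier length in bytes */
    __le16  reserved;
    /* identifier bytes follow */
};

enum {
    EXAMPLE_NIDT_EUI64  = 0x01, /* 8-byte IEEE EUI-64 */
    EXAMPLE_NIDT_NGUID  = 0x02, /* 16-byte namespace GUID */
    EXAMPLE_NIDT_UUID   = 0x03, /* 16-byte RFC 4122 UUID */
};
```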
-
Submitted by Johannes Thumshirn
Now that we have a way of getting the UUID from a target, provide it to userspace as well. Unfortunately there is already a sysfs attribute called UUID, which is a misnomer as it holds the NGUID value. So instead of creating yet another wrong name, create a new 'nguid' sysfs attribute for the NGUID. For the UUID attribute, add a check whether the namespace has a UUID assigned to it and return this, or return the NGUID to maintain backwards compatibility. This should give userspace a chance to catch up. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@rimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Johannes Thumshirn
If a target identifies itself as NVMe 1.3 compliant, try to get the list of Namespace Identification Descriptors and populate the UUID, NGUID and EUI64 fields in the NVMe namespace structure with these values. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Johannes Thumshirn
The uuid field in the nvme_ns structure represents the nguid field from the identify namespace command. As NVMe 1.3 introduced a UUID in the NVMe Namespace Identification Descriptor, this will collide. So rename the uuid to nguid to prevent any further confusion. Unfortunately we export the nguid to sysfs in the uuid sysfs attribute, but this can't be changed anymore without possibly breaking existing userspace. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Johannes Thumshirn
Use the NVME_IDENTIFY_DATA_SIZE define instead of hard coding the magic 4096 value. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.com> [hch: converted three more users] Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Sagi Grimberg
Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
-
Submitted by Keith Busch
The controller status polling was added to preemptively reset a failed controller. This early detection would give commands that would normally time out a chance to be retried, or find broken links when the platform didn't support hotplug. This once-per-second MMIO read, however, created more problems than it solved. It often raced with PCIe hotplug events, requiring complicated syncing between work queues; it frequently triggered PCIe Completion Timeout errors that also led to fatal machine checks, and it unnecessarily disrupted low power modes by running on idle controllers. This patch removes the watchdog timer, and instead checks controller health only on an I/O timeout, when we have a reason to believe something is wrong. If the controller has failed, the driver will disable it immediately and request scheduling a reset. Suggested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Xu Yu
The existing driver initially maps 8192 bytes of BAR0, which is intended to cover the doorbells of the admin SQ and CQ. However, if a large stride, e.g. 10, is used, the doorbell of the admin CQ will fall outside those 8192 bytes. Consequently, a page fault will be raised when the admin CQ doorbell is accessed in nvme_configure_admin_queue(). This patch fixes the issue by remapping BAR0 before accessing the admin CQ doorbell if the initial mapping is not enough. Signed-off-by: Xu Yu <yu.a.xu@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
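A sketch of the size calculation implied above: with a doorbell stride of 2^DSTRD dwords, each doorbell occupies 4 << DSTRD bytes starting at offset 0x1000, so the admin CQ head doorbell alone lands at byte offset 8192 when DSTRD is 10, just past the initial mapping. The helper name is illustrative:

```c
#define EXAMPLE_NVME_REG_DBS    0x1000  /* start of the doorbell registers */

/* Bytes of BAR0 needed so that the SQ tail and CQ head doorbells of
 * queues 0..nr_queues-1 all fall inside the mapping, given CAP.DSTRD. */
static unsigned long example_db_bar_size(unsigned int nr_queues,
                                         unsigned int db_stride_shift)
{
    unsigned int db_bytes = 4 << db_stride_shift;   /* one doorbell */

    return EXAMPLE_NVME_REG_DBS + 2UL * nr_queues * db_bytes;
}
```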
-