提交 · 2d4fe27850420606155fb1f7d18ab2b40153e67b · OpenHarmony / kernel_linux

10 5月, 2013 1 次提交

NVMe: Use user defined admin ioctl timeout · 94f370ca

由 Keith Busch 提交于 5月 09, 2013

Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

94f370ca

08 5月, 2013 3 次提交

NVMe: Simplify Firmware Activate code slightly · ab3ea5bf

由 Matthew Wilcox 提交于 5月 06, 2013

Add definitions for the three Firmware Activate actions, and change the
SCSI translation code to construct the command into a temporary variable
instead of translating the endianness back-and-forth.
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: NVishal Verma <vishal.l.verma@linux.intel.com>

ab3ea5bf

NVMe: Only clear the enable bit when disabling controller · 44af146a

由 Matthew Wilcox 提交于 5月 04, 2013

Many of the bits in the Controller Configuration register may only be
modified when the Enable bit is clear.  Clearing them at the same time
as the Enable bit might be OK, but let's play it safe and only touch the
Enable bit.
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: NKeith Busch <keith.busch@intel.com>

44af146a

NVMe: Wait for device to acknowledge shutdown · ba47e386

由 Matthew Wilcox 提交于 5月 04, 2013

A recent update to the specification makes it clear that the host
is expected to wait for the device to acknowledge the Enable bit
transitioning to 0 as well as waiting for the device to acknowledge a
transition to 1.
Reported-by: NKhosrow Panah <Khosrow.Panah@idt.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: NKeith Busch <keith.busch@intel.com>

ba47e386

07 5月, 2013 1 次提交

block_device_operations->release() should return void · db2a144b

由 Al Viro 提交于 5月 05, 2013

The value passed is 0 in all but "it can never happen" cases (and those
only in a couple of drivers) *and* it would've been lost on the way
out anyway, even if something tried to pass something meaningful.
Just don't bother.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

db2a144b

03 5月, 2013 17 次提交

NVMe: Schedule timeout for sync commands · 78f8d257

由 Keith Busch 提交于 4月 19, 2013

Schedule a timeout on sync commands in case the command times out and
the device is not being polled for timeouts. This prevents device removal
from hanging forever if the device has stopped responding.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

78f8d257

NVMe: Meta-data support in NVME_IOCTL_SUBMIT_IO · f410c680

由 Keith Busch 提交于 4月 23, 2013

This adds support for namespaces with separate meta-data formats in the
submit io ioctl. The meta-data buffer has to be a contiguous, so such
a buffer is allocated and the mapped user pages are copied to/from this
buffer for write/read commands.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

f410c680

NVMe: Device specific stripe size handling · 159b67d7

由 Keith Busch 提交于 4月 09, 2013

We have an nvme device that has a concept of a stripe size. IO requests
that do not transfer data crossing a stripe boundary has greater
performance compared to IO that does cross it. This patch sets the
stripe size for the device if the device and vendor ids match one with
this feature and splits IO requests that cross the stripe boundary.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

159b67d7

NVMe: Split non-mergeable bio requests · 427e9708

由 Keith Busch 提交于 4月 09, 2013

It is possible a bio request can not be submitted as a single NVMe IO
command if the bio_vec is not mergeable with the NVMe PRP alignement
constraints. This condition was handled by submitting an IO for the
mergeable portion then submitting a follow on IO for the remaining data
after the previous IO completes. The remainder to be sent was tracked
by manipulating the bio->bi_idx and bio->bi_sector. This patch splits
the request as many times as necessary and submits the bios together.

Since submitting the bio may cause it to be requeued on split,
nvme_resubmit_bios had to be modified to remove the wait queue when
the bio list is empty prior to submitting the bio since a split would
have added the wait queue a second time, corrupting the wait queue head
task list.

There are a few other benefits from doing this: it fixes a potential
issue with the previous handling of a non-mergeable bio as the requeuing
method could would use an unlocked nvme_queue if the callback isn't
invoked on the queue's associated cpu; it will be possible to retry a
failed bio if desired at some later time since it does not manipulate
the original bio; the bio integrity extensions require the bio to be in
its original condition for the checks to work correctly if we implement
the end-to-end data protection in the future.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

427e9708

NVMe: Remove dead code in nvme_dev_add · cbb6218f

由 Keith Busch 提交于 5月 01, 2013

There is no situation that could occur where we could error out of this
function and require cleaning up allocated namespaces.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

cbb6218f

NVMe: Check for NULL memory in nvme_dev_add · a9ef4343

由 Keith Busch 提交于 5月 01, 2013

Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

a9ef4343

NVMe: Fix error clean-up on nvme_alloc_queue · 68b8eca5

由 Keith Busch 提交于 5月 01, 2013

The nvme_queue's depth is not set if we fail to allocate submission queue
entries, which was being used to determine how much coherent memory to
free on error. Use the depth variable instead.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

68b8eca5

NVMe: Free admin queue on request_irq error · 025c557a

由 Keith Busch 提交于 5月 01, 2013

Fixes a potential memory leak if requesting the admin queue irq fails.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

025c557a

NVMe: Add scsi unmap to SG_IO · ec503733

由 Keith Busch 提交于 4月 24, 2013

Translates a scsi unmap request from SG_IO ioctl to NVMe
data-set-management deallocate.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Acked-by: NVishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

ec503733

NVMe: queue usage fixes in nvme-scsi · 14385de1

由 Keith Busch 提交于 4月 25, 2013

Fixes nvme queue usages in scsi-to-nvme translation code to not get
a queue more often than it is being put, and not use the queue in an
unsafe way without it being locked.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Acked-by: NVishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: NMatthew Wilcox <matthew.r.wilcox@intel.com>

14385de1

rbd: fix image request leak on parent read · b5b09be3

由 Alex Elder 提交于 5月 01, 2013

When a read for a layered image object finds the target object
doesn't exist, a read image request for the parent image is created
and submitted.  When that completes, the callback routine was
not releasing that parent image request.  Fix that.

The slab allocation stuff just added has greatly simplified the
search for the source of this memory leak.

This resolves:
    http://tracker.ceph.com/issues/4803Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

b5b09be3

rbd: allocate image object names with a slab allocator · 78c2a44a

由 Alex Elder 提交于 5月 01, 2013

The names of objects used for image object requests are always fixed
size.  So create a slab cache to manage them.  Define a new function
rbd_segment_name_free() to match rbd_segment_name() (which is what
supplies the dynamically-allocated name buffer).

This is part of:
    http://tracker.ceph.com/issues/3926Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

78c2a44a

rbd: allocate object requests with a slab allocator · 868311b1

由 Alex Elder 提交于 5月 01, 2013

Create a slab cache to manage rbd_obj_request allocation.  We aren't
using a constructor, and we'll zero-fill object request structures
when they're allocated.

This is part of:
    http://tracker.ceph.com/issues/3926Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

868311b1

rbd: allocate name separate from obj_request · f907ad55

由 Alex Elder 提交于 5月 01, 2013

The next patch will define a slab allocator for a object requests.
To use that we'll need to allocate the name of an object separate
from the request structure itself.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

f907ad55

rbd: allocate image requests with a slab allocator · 1c2a9dfe

由 Alex Elder 提交于 5月 01, 2013

Create a slab cache to manage rbd_img_request allocation.  Nothing
too fancy at this point--we'll still initialize everything at
allocation time (no constructor)

This is part of:
    http://tracker.ceph.com/issues/3926Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

1c2a9dfe

rbd: use binary search for snapshot lookup · 30d1cff8

由 Alex Elder 提交于 5月 01, 2013

Use bsearch(3) to make snapshot lookup by id more efficient.  (There
could be thousands of snapshots, and conceivably many more.)
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

30d1cff8

rbd: clear EXISTS flag if mapped snapshot disappears · 15228ede

由 Alex Elder 提交于 5月 01, 2013

This functionality inadvertently disappeared in the last patch.

Image snapshots can get removed at just about any time.  In
particular it can disappear even if it is in use by an rbd
client as a mapped image.

The rbd client deals with such a disappearance by responding to new
requests with ENXIO.  This is implemented by each rbd device
maintaining an EXISTS flag, which is normally set but cleared if a
snapshot disappears.

This patch (re-)implements the clearing of that flag.

Whenever mapped image header information is refreshed, if the
mapping is for a snapshot, verify the mapped snapshot is still
present in the updated snapshot context.  If it is not, clear the
flag.

It is not necessary to check this in the initial probe, because the
probe will not succeed if the snapshot doesn't exist.

This resolves:
    http://tracker.ceph.com/issues/4880Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

15228ede

02 5月, 2013 18 次提交

rbd: kill off the snapshot list · 33dca39f

由 Alex Elder 提交于 4月 30, 2013

We no longer use the snapshot list for anything.  When we need to
look up a snapshot name, id, size, or feature mask, we just do it
directly rather than relying on this list being updated with every
refresh.  The main reason it existed was for the benefit of the
device/sysfs entries that previously were associated with snapshots.

So get rid of the snapshot list, and struct rbd_snap, and the
hundreds of lines of code that supported them.

This resolves:
    http://tracker.ceph.com/issues/4868Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

33dca39f

rbd: define rbd_snap_size() and rbd_snap_features() · 2ad3d716

由 Alex Elder 提交于 4月 30, 2013

This patch defines a handful of new functions that will allow
us to get rid of the rbd device structure's list of snapshots.

Define rbd_snap_id_by_name() to look up a snapshot id given its
name.  This is efficient for format 1 images but not for format 2.
Fortunately it only gets called at mapping time so it's not that
critical.

Use rbd_snap_id_by_name() to find out the id for a snapshot getting
mapped, and pass that id to new functions rbd_snap_size() and
rbd_snap_features() to look up information about a given snapshot's
size and feature mask given its snapshot id.  All this gets done
in rbd_dev_mapping_set().

As a result, snap_by_name() is no longer needed, so get rid of it.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

2ad3d716

rbd: use snap_id not index to look up snap info · 54cac61f

由 Alex Elder 提交于 4月 30, 2013

In order to align with what was needed for format 1 rbd images,
rbd_dev_v2_snap_info() was set up to take as argument an index into
the array of snapshot ids in a rbd device's snapshot context.

This switches that around, so we pass the snapshot id instead.
In doing this, rbd_snap_name() now returns a dynamically-allocated
string rather than a fixed one, so there's no need to make a
duplicate in its caller, rbd_dev_spec_update().

This means the following functions take a snapshot id where they
previously used an index value:
    rbd_dev_snap_info()
    rbd_dev_v1_snap_info()
    rbd_dev_v2_snap_info()

A new function, rbd_dev_snap_index(), determines the snap index for
format 1 images and uses it to look up the name.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

54cac61f

rbd: look up snapshot name in names buffer · 9682fc6d

由 Alex Elder 提交于 4月 30, 2013

Rather than scanning the list of snapshot structures for it, scan
the snapshot context buffer containing snapshot names in order to
determine for a format 1 image the name associated with a given
snapshot id.

Pull out the part of rbd_dev_v1_snap_info() that does this scan into
a new function, _rbd_dev_v1_snap_name().  Have that function return
a dynamically-allocated copy of the name, and don't duplicate it in
rbd_dev_v1_snap_info().
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

9682fc6d

rbd: drop obj_request->version · dedc81ea

由 Alex Elder 提交于 4月 30, 2013

Nothing ever uses the version field maintained in the object request
structure any more, so get rid of it.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

dedc81ea

rbd: drop rbd_obj_method_sync() version parameter · e2a58ee5

由 Alex Elder 提交于 4月 30, 2013

Only NULL is passed as the version argument to rbd_obj_method_sync(),
so get rid of it.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

e2a58ee5

rbd: more version parameter removal · cc4a38bd

由 Alex Elder 提交于 4月 30, 2013

Continued from the last patch, more parameters that can go away
because we no longer have a need to track object versions.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

cc4a38bd

rbd: get rid of some version parameters · 7097f8df

由 Alex Elder 提交于 4月 30, 2013

Several functions in rbd have parameters meant to allow the version
of an object to be passed in or out.  The purpose of those was to
allow the version of a header object to be maintained, but we no
longer do that.  As a result, these parameters are never actually
needed or used, so get rid of them.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

7097f8df

rbd: stop tracking header object version · b21ebddd

由 Alex Elder 提交于 4月 30, 2013

The rbd code takes care to maintain the version of the header
object.  This was done in hopes of using it to detect a change in
the object between reading it and setting up a watch request to
be notified of changes.

The mechanism was never fully implemented, however.  And we now
avoid the original problem by setting up the watch request before
ever reading the content of the header.

The osd doesn't interpret the object version supplied with a WATCH
osd op, nor does it use the version supplied with a NOTIFY_ACK op
(we can just supply 0 for both).  There is therefore no need to
maintain the header's object version any more, so stop doing so.

We'll be able to simplify some more rbd code in the next few patches
as a result of this.

This resolves:
    http://tracker.ceph.com/issues/3952Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

b21ebddd

rbd: snap names are pointer to constant data · cb75223d

由 Alex Elder 提交于 4月 30, 2013

Make explicit that snapshot names don't change by making functions
return and take parameters that that point to const qualified data.

This resolves:
    http://tracker.ceph.com/issues/4867Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

cb75223d

rbd: don't revalidate so much · a3fbe5d4

由 Alex Elder 提交于 4月 30, 2013

Whenever a header object event causes a mapped rbd image to refresh
its header information, revalidate_disk() is being called.  This was
done in rbd_dev_refresh() outside the control mutex in order to
avoid a lock inversion.  Although a an event like this *might*
indicate the image has changed size, most of the time it does not.

Record the image size before and after the refresh, and only
call revalidate_disk() if it changes.

This resolves:
    http://tracker.ceph.com/issues/4867Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

a3fbe5d4

rbd: fix up the layering warning message · 96882f55

由 Alex Elder 提交于 4月 30, 2013

A warning gets spewed for any image being probed, including parent
images.  Set up a condition such that the warning message only gets
printed for the image being mapped, not any of its parents.

Also, I didn't like the way the warning ended up being so long.
Make it a terse warning instead.  People experimenting with layering
will know what the message means.

This is part of:
    http://tracker.ceph.com/issues/4867Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

96882f55

ceph: use ceph_create_snap_context() · 812164f8

由 Alex Elder 提交于 4月 30, 2013

Now that we have a library routine to create snap contexts, use it.

This is part of:
    http://tracker.ceph.com/issues/4857Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

812164f8

rbd: set up devices only for mapped images · b536f69a

由 Alex Elder 提交于 4月 28, 2013

Stop setting up Linux devices during the image probe operation.
Instead, set up the devices as a separate step after the image
probe, in rbd_add().

A consequence of this is that only mapped images get devices
assigned to them, which is pretty sweet.

This resolves:
    http://tracker.ceph.com/issues/4774Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

b536f69a

rbd: don't have device release destroy rbd_dev · 8ad42cd0

由 Alex Elder 提交于 4月 28, 2013

Currently an rbd_device structure gets destroyed from the release
routine for the device embedded within it.  Stop doing that, instead
calling rbd_dev_image_release() right after rbd_bus_del_dev()
wherever the latter is called.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

8ad42cd0

rbd: define rbd_dev_unprobe() · 6fd48b3b

由 Alex Elder 提交于 4月 28, 2013

Define a new function rbd_dev_unprobe() which undoes state changes
that occur from calling rbd_dev_v1_probe() or rbd_dev_v2_probe().
Note that this is a superset of rbd_header_free(), which is now
getting removed (it seems to have been used improperly anyway).

Flesh out rbd_dev_image_release() so it undoes exactly what
rbd_dev_image_probe() does.

This means that:
    - rbd_dev_device_release() gets called when the last device
      reference gets dropped;
    - that undoes everything done by the rbd_dev_device_setup() call
      at the end of rbd_dev_image_probe() (and nothing more), ending
      by calling rbd_dev_image_release(); and
    - rbd_dev_image_release() undoes everything else done by
      rbd_dev_image_probe() (and this includes a call to
      rbd_dev_unprobe().

This means the image and device portions of an rbd device are fairly
cleanly separated now, so error paths should be a little easier to
verify than they used to be.
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

6fd48b3b

rbd: don't destroy rbd_dev in device release function · 200a6a8b

由 Alex Elder 提交于 4月 28, 2013

Rename rbd_dev_probe_finish() to be rbd_dev_device_setup().  Its
purpose is to set up the Linux side of an rbd device mapping.
Rename rbd_dev_release() to be rbd_dev_device_release(), making
it more obvious it serves as the inverse of the setup function
(or it will).

Encapsulate some of what was done in rbd_dev_release() into a new
function rbd_dev_image_release(), which serves as the inverse of
setting up the ceph side of the mapped rbd image.

Define a new helper rbd_dev_clear_mapping() to simply zero out the
fields of a mapping structure--the inverse of rbd_dev_set_mapping().
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

200a6a8b

rbd: drop module later · 79ab7558

由 Alex Elder 提交于 4月 28, 2013

Drop the module reference at the end of rbd_remove() for symmetry
with adding a reference at the top of rbd_add().
Signed-off-by: NAlex Elder <elder@inktank.com>
Reviewed-by: NJosh Durgin <josh.durgin@inktank.com>

79ab7558

OpenHarmony / kernel_linux 上一次同步 大约 4 年

OpenHarmony / kernel_linux
上一次同步大约 4 年