提交 · 6678d83f18386eb103f8345024e52c5abe61725c · openanolis / cloud-kernel

09 11月, 2013 1 次提交

block: Consolidate duplicated bio_trim() implementations · 6678d83f

由 Kent Overstreet 提交于 8月 07, 2013

Someone cut and pasted md's md_trim_bio() into xen-blkfront.c. Come on,
we should know better than this.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Neil Brown <neilb@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6678d83f

22 6月, 2013 1 次提交

xen-blkfront: set blk_queue_max_hw_sectors correctly · 294caaf2

由 Roger Pau Monne 提交于 6月 21, 2013

Now that indirect segments are enabled blk_queue_max_hw_sectors must
be set to match the maximum number of sectors we can handle in a
request.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Reported-by: NFelipe Franciosi <felipe.franciosi@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

294caaf2

08 6月, 2013 1 次提交

xen/blkback: Use physical sector size for setup · 7c4d7d71

由 Stefan Bader 提交于 5月 13, 2013

Currently xen-blkback passes the logical sector size over xenbus and
xen-blkfront sets up the paravirt disk with that logical block size.
But newer drives usually have the logical sector size set to 512 for
compatibility reasons and would show the actual sector size only in
physical sector size.
This results in the device being partitioned and accessed in dom0 with
the correct sector size, but the guest thinks 512 bytes is the correct
block size. And that results in poor performance.

To fix this, blkback gets modified to pass also physical-sector-size
over xenbus and blkfront to use both values to set up the paravirt
disk. I did not just change the passed in sector-size because I am
not sure having a bigger logical sector size than the physical one
is valid (and that would happen if a newer dom0 kernel hits an older
domU kernel). Also this way a domU set up before should still be
accessible (just some tools might detect the unaligned setup).

[v2: Make xenbus write failure non-fatal]
[v3: Use xenbus_scanf instead of xenbus_gather]
[v4: Rebased against segment changes]
Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

7c4d7d71

05 6月, 2013 1 次提交

xen-blkfront: Introduce a 'max' module parameter to alter the amount of indirect segments. · 2d5dc3ba

由 Konrad Rzeszutek Wilk 提交于 5月 15, 2013

The max module parameter (by default 32) is the maximum number of
segments that the frontend will negotiate with the backend for indirect
descriptors.  Higher value means more potential throughput but more
memory usage. The backend picks the minimum of the frontend and its
default backend value.
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

2d5dc3ba

08 5月, 2013 1 次提交

xen-blkfront: use a different scatterlist for each request · b7649158

由 Roger Pau Monne 提交于 5月 02, 2013

In blkif_queue_request blkfront iterates over the scatterlist in order
to set the segments of the request, and in blkif_completion blkfront
iterates over the raw request, which makes it hard to know the exact
position of the source and destination memory positions.

This can be solved by allocating a scatterlist for each request, that
will be keep until the request is finished, allowing us to copy the
data back to the original memory without having to iterate over the
raw request.

Oracle-Bug: 16660413 - LARGE ASYNCHRONOUS READS APPEAR BROKEN ON 2.6.39-400
CC: stable@vger.kernel.org
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Reported-and-Tested-by: NAnne Milicia <anne.milicia@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

b7649158

07 5月, 2013 1 次提交

block_device_operations->release() should return void · db2a144b

由 Al Viro 提交于 5月 05, 2013

The value passed is 0 in all but "it can never happen" cases (and those
only in a couple of drivers) *and* it would've been lost on the way
out anyway, even if something tried to pass something meaningful.
Just don't bother.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

db2a144b

19 4月, 2013 1 次提交

xen-block: implement indirect descriptors · 402b27f9

由 Roger Pau Monne 提交于 4月 18, 2013

Indirect descriptors introduce a new block operation
(BLKIF_OP_INDIRECT) that passes grant references instead of segments
in the request. This grant references are filled with arrays of
blkif_request_segment_aligned, this way we can send more segments in a
request.

The proposed implementation sets the maximum number of indirect grefs
(frames filled with blkif_request_segment_aligned) to 256 in the
backend and 32 in the frontend. The value in the frontend has been
chosen experimentally, and the backend value has been set to a sane
value that allows expanding the maximum number of indirect descriptors
in the frontend if needed.

The migration code has changed from the previous implementation, in
which we simply remapped the segments on the shared ring. Now the
maximum number of segments allowed in a request can change depending
on the backend, so we have to requeue all the requests in the ring and
in the queue and split the bios in them if they are bigger than the
new maximum number of segments.

[v2: Fixed minor comments by Konrad.
[v1: Added padding to make the indirect request 64bit aligned.
 Added some BUGs, comments; fixed number of indirect pages in
 blkif_get_x86_{32/64}_req. Added description about the indirect operation
 in blkif.h]
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
[v3: Fixed spaces and tabs mix ups]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

402b27f9

20 3月, 2013 2 次提交

xen-blkfront: remove frame list from blk_shadow · b1173e31

由 Roger Pau Monne 提交于 3月 18, 2013

We already have the frame (pfn of the grant page) stored inside struct
grant, so there's no need to keep an aditional list of mapped frames
for a specific request. This reduces memory usage in blkfront.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xen.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

b1173e31

xen-blkfront: pre-allocate pages for requests · 9c1e050c

由 Roger Pau Monne 提交于 3月 18, 2013

This prevents us from having to call alloc_page while we are preparing
the request. Since blkfront was calling alloc_page with a spinlock
held we used GFP_ATOMIC, which can fail if we are requesting a lot of
pages since it is using the emergency memory pools.

Allocating all the pages at init prevents us from having to call
alloc_page, thus preventing possible failures.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xen.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

9c1e050c

19 3月, 2013 2 次提交

xen-blkfront: switch from llist to list · 155b7edb

由 Roger Pau Monne 提交于 3月 18, 2013

The git commit f84adf49
(xen-blkfront: drop the use of llist_for_each_entry_safe)

was a stop-gate to fix a GCC4.1 bug. The appropiate way
is to actually use an list instead of using an llist.

As such this patch replaces the usage of llist with an
list.

Since we always manipulate the list while holding the io_lock, there's
no need for additional locking (llist used previously is safe to use
concurrently without additional locking).
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
CC: stable@vger.kernel.org
[v1: Redid the git commit description]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

155b7edb

xen-blkfront: replace kmalloc and then memcpy with kmemdup · 29d0b218

由 Mihnea Dobrescu-Balaur 提交于 3月 11, 2013

The benefits are:
* code is cleaner
* kmemdup adds additional debugging info useful for tracking the real
place where memory was allocated (CONFIG_DEBUG_SLAB).
Signed-off-by: NMihnea Dobrescu-Balaur <mihneadb@gmail.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

29d0b218

20 2月, 2013 1 次提交

xen-blkfront: drop the use of llist_for_each_entry_safe · f84adf49

由 Konrad Rzeszutek Wilk 提交于 2月 13, 2013

Replace llist_for_each_entry_safe with a while loop.

llist_for_each_entry_safe can trigger a bug in GCC 4.1, so it's best
to remove it and use a while loop and do the deletion manually.

Specifically this bug can be triggered by hot-unplugging a disk, either
by doing xm block-detach or by save/restore cycle.

BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [<ffffffffa0047223>] blkif_free+0x63/0x130 [xen_blkfront]
The crash call trace is:
	...
bad_area_nosemaphore+0x13/0x20
do_page_fault+0x25e/0x4b0
page_fault+0x25/0x30
? blkif_free+0x63/0x130 [xen_blkfront]
blkfront_resume+0x46/0xa0 [xen_blkfront]
xenbus_dev_resume+0x6c/0x140
pm_op+0x192/0x1b0
device_resume+0x82/0x1e0
dpm_resume+0xc9/0x1a0
dpm_resume_end+0x15/0x30
do_suspend+0x117/0x1e0

When drilling down to the assembler code, on newer GCC it does
.L29:
        cmpq    $-16, %r12      #, persistent_gnt check
        je      .L30    	#, out of the loop
.L25:
	... code in the loop
        testq   %r13, %r13      # n
        je      .L29    	#, back to the top of the loop
        cmpq    $-16, %r12      #, persistent_gnt check
        movq    16(%r12), %r13  # <variable>.node.next, n
        jne     .L25    	#,	back to the top of the loop
.L30:

While on GCC 4.1, it is:
L78:
	... code in the loop
	testq   %r13, %r13      # n
        je      .L78    #,	back to the top of the loop
        movq    16(%rbx), %r13  # <variable>.node.next, n
        jmp     .L78    #,	back to the top of the loop

Which basically means that the exit loop condition instead of
being:

	&(pos)->member != NULL;

is:
	;

which makes the loop unbound.

Since xen-blkfront is the only user of the llist_for_each_entry_safe
macro remove it from llist.h.

Orabug: 16263164
CC: stable@vger.kernel.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

f84adf49

18 12月, 2012 2 次提交

xen-blkfront: handle bvecs with partial data · d62f6918

由 Roger Pau Monne 提交于 12月 07, 2012

Currently blkfront fails to handle cases in blkif_completion like the
following:

1st loop in rq_for_each_segment
 * bv_offset: 3584
 * bv_len: 512
 * offset += bv_len
 * i: 0

2nd loop:
 * bv_offset: 0
 * bv_len: 512
 * i: 0

In the second loop i should be 1, since we assume we only wanted to
read a part of the previous page. This patches fixes this cases where
only a part of the shared page is read, and blkif_completion assumes
that if the bv_offset of a bvec is less than the previous bv_offset
plus the bv_size we have to switch to the next shared page.
Reported-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Cc: linux-kernel@vger.kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

d62f6918

llist/xen-blkfront: implement safe version of llist_for_each_entry · ebb351cf

由 Roger Pau Monne 提交于 12月 04, 2012

Implement a safe version of llist_for_each_entry, and use it in
blkif_free. Previously grants where freed while iterating the list,
which lead to dereferences when trying to fetch the next item.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Acked-by: NAndrew Morton <akpm@linux-foundation.org>
[v2: Move the llist_for_each_entry_safe in llist.h]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ebb351cf

27 11月, 2012 1 次提交

xen-blkfront: free allocated page · 07c540a0

由 Roger Pau Monne 提交于 11月 16, 2012

Free the page allocated for the persistent grant.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

07c540a0

04 11月, 2012 1 次提交

xen/blkback: persistent-grants fixes · cb5bd4d1

由 Roger Pau Monne 提交于 11月 02, 2012

This patch contains fixes for persistent grants implementation v2:

 * handle == 0 is a valid handle, so initialize grants in blkback
   setting the handle to BLKBACK_INVALID_HANDLE instead of 0. Reported
   by Konrad Rzeszutek Wilk.

 * new_map is a boolean, use "true" or "false" instead of 1 and 0.
   Reported by Konrad Rzeszutek Wilk.

 * blkfront announces the persistent-grants feature as
   feature-persistent-grants, use feature-persistent instead which is
   consistent with blkback and the public Xen headers.

 * Add a consistency check in blkfront to make sure we don't try to
   access segments that have not been set.
Reported-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NRoger Pau Monne <roger.pau@citrix.com>
[v1: The new_map int->bool had already been changed]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

cb5bd4d1

30 10月, 2012 1 次提交

xen/blkback: Persistent grant maps for xen blk drivers · 0a8704a5

由 Roger Pau Monne 提交于 10月 24, 2012

This patch implements persistent grants for the xen-blk{front,back}
mechanism. The effect of this change is to reduce the number of unmap
operations performed, since they cause a (costly) TLB shootdown. This
allows the I/O performance to scale better when a large number of VMs
are performing I/O.

Previously, the blkfront driver was supplied a bvec[] from the request
queue. This was granted to dom0; dom0 performed the I/O and wrote
directly into the grant-mapped memory and unmapped it; blkfront then
removed foreign access for that grant. The cost of unmapping scales
badly with the number of CPUs in Dom0. An experiment showed that when
Dom0 has 24 VCPUs, and guests are performing parallel I/O to a
ramdisk, the IPIs from performing unmap's is a bottleneck at 5 guests
(at which point 650,000 IOPS are being performed in total). If more
than 5 guests are used, the performance declines. By 10 guests, only
400,000 IOPS are being performed.

This patch improves performance by only unmapping when the connection
between blkfront and back is broken.

On startup blkfront notifies blkback that it is using persistent
grants, and blkback will do the same. If blkback is not capable of
persistent mapping, blkfront will still use the same grants, since it
is compatible with the previous protocol, and simplifies the code
complexity in blkfront.

To perform a read, in persistent mode, blkfront uses a separate pool
of pages that it maps to dom0. When a request comes in, blkfront
transmutes the request so that blkback will write into one of these
free pages. Blkback keeps note of which grefs it has already
mapped. When a new ring request comes to blkback, it looks to see if
it has already mapped that page. If so, it will not map it again. If
the page hasn't been previously mapped, it is mapped now, and a record
is kept of this mapping. Blkback proceeds as usual. When blkfront is
notified that blkback has completed a request, it memcpy's from the
shared memory, into the bvec supplied. A record that the {gref, page}
tuple is mapped, and not inflight is kept.

Writes are similar, except that the memcpy is peformed from the
supplied bvecs, into the shared pages, before the request is put onto
the ring.

Blkback stores a mapping of grefs=>{page mapped to by gref} in
a red-black tree. As the grefs are not known apriori, and provide no
guarantees on their ordering, we have to perform a search
through this tree to find the page, for every gref we receive. This
operation takes O(log n) time in the worst case. In blkfront grants
are stored using a single linked list.

The maximum number of grants that blkback will persistenly map is
currently set to RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST, to
prevent a malicios guest from attempting a DoS, by supplying fresh
grefs, causing the Dom0 kernel to map excessively. If a guest
is using persistent grants and exceeds the maximum number of grants to
map persistenly the newly passed grefs will be mapped and unmaped.
Using this approach, we can have requests that mix persistent and
non-persistent grants, and we need to handle them correctly.
This allows us to set the maximum number of persistent grants to a
lower value than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST, although
setting it will lead to unpredictable performance.

In writing this patch, the question arrises as to if the additional
cost of performing memcpys in the guest (to/from the pool of granted
pages) outweigh the gains of not performing TLB shootdowns. The answer
to that question is `no'. There appears to be very little, if any
additional cost to the guest of using persistent grants. There is
perhaps a small saving, from the reduced number of hypercalls
performed in granting, and ending foreign access.
Signed-off-by: NOliver Chick <oliver.chick@citrix.com>
Signed-off-by: NRoger Pau Monne <roger.pau@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v1: Fixed up the misuse of bool as int]

0a8704a5

21 8月, 2012 1 次提交

workqueue: deprecate flush[_delayed]_work_sync() · 43829731

由 Tejun Heo 提交于 8月 20, 2012

flush[_delayed]_work_sync() are now spurious.  Mark them deprecated
and convert all users to flush[_delayed]_work().

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant and the regular flushes guarantee that the work item is
not pending or running on any CPU on return, so there's no reason to
use the sync flushes at all and they're going away.

This patch doesn't make any functional difference.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mattia Dongili <malattia@linux.it>
Cc: Kent Yoder <key@linux.vnet.ibm.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: Bryan Wu <bryan.wu@canonical.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: Anton Vorontsov <cbou@mail.ru>
Cc: Sangbeom Kim <sbkim73@samsung.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Petr Vandrovec <petr@vandrovec.name>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Avi Kivity <avi@redhat.com>

43829731

19 7月, 2012 1 次提交

xen-blkfront: remove IRQF_SAMPLE_RANDOM which is now a no-op · 89c30f16

由 Theodore Ts'o 提交于 7月 17, 2012

With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a
no-op; interrupt randomness is now collected unconditionally in a very
low-overhead fashion; see commit 775f4b29.  The IRQF_SAMPLE_RANDOM
flag was scheduled to be removed in 2009 on the
feature-removal-schedule, so this patch is preparation for the final
removal of this flag.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>

89c30f16

12 6月, 2012 1 次提交

xen/blkfront: Add WARN to deal with misbehaving backends. · 6878c32e

由 Konrad Rzeszutek Wilk 提交于 5月 25, 2012

Part of the ring structure is the 'id' field which is under
control of the frontend. The frontend stamps it with "some"
value (this some in this implementation being a value less
than BLK_RING_SIZE), and when it gets a response expects
said value to be in the response structure. We have a check
for the id field when spolling new requests but not when
de-spolling responses.

We also add an extra check in add_id_to_freelist to make
sure that the 'struct request' was not NULL - as we cannot
pass a NULL to __blk_end_request_all, otherwise that crashes
(and all the operations that the response is dealing with
end up with __blk_end_request_all).

Lastly we also print the name of the operation that failed.

[v1: s/BUG/WARN/ suggested by Stefano]
[v2: Add extra check in add_id_to_freelist]
[v3: Redid op_name per Jan's suggestion]
[v4: add const * and add WARN on failure returns]
Acked-by: NJan Beulich <jbeulich@suse.com>
Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

6878c32e

12 5月, 2012 2 次提交

xen-blkfront: module exit handling adjustments · 8605067f

由 Jan Beulich 提交于 4月 05, 2012

The blkdev major must be released upon exit, or else the module can't
attach to devices using the same majors upon being loaded again. Also
avoid leaking the minor tracking bitmap.
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8605067f

xen-blkfront: properly name all devices · e77c78c0

由 Jan Beulich 提交于 4月 05, 2012

- devices beyond xvdzz didn't get proper names assigned at all
- extended devices with minors not representable within the kernel's
  major/minor bit split spilled into foreign majors
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

e77c78c0

07 4月, 2012 1 次提交

xen: only check xen_platform_pci_unplug if hvm · e95ae5a4

由 Igor Mammedov 提交于 3月 27, 2012

commit b9136d207f08
  xen: initialize platform-pci even if xen_emul_unplug=never

breaks blkfront/netfront by not loading them because of
xen_platform_pci_unplug=0 and it is never set for PV guest.
Signed-off-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

e95ae5a4

22 3月, 2012 1 次提交

xen: initialize platform-pci even if xen_emul_unplug=never · b9136d20

由 Igor Mammedov 提交于 3月 21, 2012

When xen_emul_unplug=never is specified on kernel command line
reading files from /sys/hypervisor is broken (returns -EBUSY).
It is caused by xen_bus dependency on platform-pci and
platform-pci isn't initialized when xen_emul_unplug=never is
specified.

Fix it by allowing platform-pci to ignore xen_emul_unplug=never,
and do not intialize xen_[blk|net]front instead.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

b9136d20

20 3月, 2012 3 次提交

xen-blkfront: make blkif_io_lock spinlock per-device · 3467811e

由 Steven Noonan 提交于 2月 17, 2012

This patch moves the global blkif_io_lock to the per-device structure. The
spinlock seems to exists for two reasons: to disable IRQs when in the interrupt
handlers for blkfront, and to protect the blkfront VBDs when a detachment is
requested.

Having a global blkif_io_lock doesn't make sense given the use case, and it
drastically hinders performance due to contention. All VBDs with pending IOs
have to take the lock in order to get work done, which serializes everything
pretty badly.
Signed-off-by: NSteven Noonan <snoonan@amazon.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

3467811e

xen/blkfront: don't put bdev right after getting it · dad5cf65

由 Andrew Jones 提交于 2月 16, 2012

We should hang onto bdev until we're done with it.
Signed-off-by: NAndrew Jones <drjones@redhat.com>
[v1: Fixed up git commit description]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

dad5cf65

xen-blkfront: use bitmap_set() and bitmap_clear() · 34ae2e47

由 Akinobu Mita 提交于 1月 21, 2012

Use bitmap_set and bitmap_clear rather than modifying individual bits
in a memory region.
Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

34ae2e47

05 1月, 2012 1 次提交

Xen: consolidate and simplify struct xenbus_driver instantiation · 73db144b

由 Jan Beulich 提交于 12月 22, 2011

The 'name', 'owner', and 'mod_name' members are redundant with the
identically named fields in the 'driver' sub-structure. Rather than
switching each instance to specify these fields explicitly, introduce
a macro to simplify this.

Eliminate further redundancy by allowing the drvname argument to
DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from
the ID table will be used for .driver.name).

Also eliminate the questionable xenbus_register_{back,front}end()
wrappers - their sole remaining purpose was the checking of the
'owner' field, proper setting of which shouldn't be an issue anymore
when the macro gets used.

v2: Restore DRV_NAME for the driver name in xen-pciback.
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

73db144b

17 12月, 2011 1 次提交

xen-blkfront: Use kcalloc instead of kzalloc to allocate array · f094148a

由 Thomas Meyer 提交于 11月 29, 2011

The advantage of kcalloc is, that will prevent integer overflows which could
result from the multiplication of number of elements and size and it is also
a bit nicer to read.

The semantic patch that makes this change is available
in https://lkml.org/lkml/2011/11/25/107Signed-off-by: NThomas Meyer <thomas@m3y3r.de>
[v1: Seperated the drivers/block/cciss_scsi.c out of this patch]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

f094148a

19 11月, 2011 2 次提交

xen/blk[front|back]: Enhance discard support with secure erasing support. · 5ea42986

由 Konrad Rzeszutek Wilk 提交于 10月 12, 2011

Part of the blkdev_issue_discard(xx) operation is that it can also
issue a secure discard operation that will permanantly remove the
sectors in question. We advertise that we can support that via the
'discard-secure' attribute and on the request, if the 'secure' bit
is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE.

CC: Li Dongyang <lidongyang@novell.com>
[v1: Used 'flag' instead of 'secure:1' bit]
[v2: Use 'reserved' uint8_t instead of adding a new value]
[v3: Check for nseg when mapping instead of operation]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

5ea42986

xen/blk[front|back]: Squash blkif_request_rw and blkif_request_discard together · 97e36834

由 Konrad Rzeszutek Wilk 提交于 10月 12, 2011

In a union type structure to deal with the overlapping
attributes in a easier manner.
Suggested-by: NIan Campbell <Ian.Campbell@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

97e36834

13 10月, 2011 4 次提交

xen-blkfront: plug device number leak in xlblk_init() error path · 469738e6

由 Laszlo Ersek 提交于 10月 07, 2011

... though after a failed xenbus_register_frontend() all may be lost.
Acked-by: NIan Campbell <ian.campbell@citrix.com>
Signed-off-by: NLaszlo Ersek <lersek@redhat.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

469738e6

xen-blkfront: If no barrier or flush is supported, use invalid operation. · d11e6158

由 Konrad Rzeszutek Wilk 提交于 9月 16, 2011

Guard against issuing BLKIF_OP_WRITE_BARRIER or BLKIF_OP_FLUSH_CACHE
by checking whether we successfully negotiated with the backend.
The negotiation with the backend also sets the q->flush_flags which
fortunately for us is also used when submitting an bio to us. If
we don't support barriers or flushes it would be set to zero so
we should never end up having to deal with REQ_FLUSH | REQ_FUA.

However, other third party implementations of __make_request that
might be stacked on top of us might not be so smart, so lets fix this up.
Acked-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

d11e6158

xen-blkfront: fix a deadlock while handling discard response · 69ef68ce

由 Li Dongyang 提交于 9月 14, 2011

When we get -EOPNOTSUPP response for a discard request, we will clear
the discard flag on the request queue so we won't attempt to send discard
requests to backend again, and this should be protected under rq->queue_lock.
However, when we setup the request queue, we pass blkif_io_lock to
blk_init_queue so rq->queue_lock is blkif_io_lock indeed, and this lock
is already taken when we are in blkif_interrpt, so remove the
spin_lock/spin_unlock when we clear the discard flag or we will end up
with deadlock here
Signed-off-by: NLi Dongyang <lidongyang@novell.com>
[v1: Updated description a bit and removed comment from source]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

69ef68ce

xen-blkfront: Handle discard requests. · ed30bf31

由 Li Dongyang 提交于 9月 01, 2011

If the backend advertises 'feature-discard', then interrogate
the backend for alignment and granularity. Setup the request
queue with the appropiate values and send the discard operation
as required.
Signed-off-by: NLi Dongyang <lidongyang@novell.com>
[v1: Amended commit description]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ed30bf31

15 7月, 2011 2 次提交

xen-blkfront: Fix one off warning about name clash · 89153b5c

由 Stefan Bader 提交于 7月 14, 2011

Avoid telling users to use xvde and onwards when using xvde.
Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

89153b5c

xen-blkfront: Drop name and minor adjustments for emulated scsi devices · 196cfe2a

由 Stefan Bader 提交于 7月 14, 2011

These were intended to avoid the namespace clash when representing
emulated IDE and SCSI devices. However that seems to confuse users
more than expected (a disk defined as sda becomes xvde).
So for now go back to the scheme which does no adjustments. This
will break when mixing IDE and SCSI names in the configuration of
guests but should be by now expected.
Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

196cfe2a

12 5月, 2011 2 次提交

xen-blkfront: Introduce BLKIF_OP_FLUSH_DISKCACHE support. · edf6ef59

由 Konrad Rzeszutek Wilk 提交于 5月 03, 2011

If the backend supports the 'feature-flush-cache' mode, use that
instead of the 'feature-barrier' support.

Currently there are three backends that support the 'feature-flush-cache'
mode: NetBSD, Solaris and Linux kernel. The 'flush' option is much
light-weight version than the 'barrier' support so lets try to use as
there are no filesystems in the kernel that use full barriers anymore.
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

edf6ef59

xen-blkfront: fix data size for xenbus_gather in blkfront_connect · 4352b47a

由 Marek Marczykowski 提交于 5月 03, 2011

barrier variable is int, not long. This overflow caused another variable
override: "err" (in PV code) and "binfo" (in xenlinux code -
drivers/xen/blkfront/blkfront.c). The later caused incorrect device
flags (RO/removable etc).
Signed-off-by: NMarek Marczykowski <marmarek@mimuw.edu.pl>
Acked-by: NIan Campbell <Ian.Campbell@citrix.com>
[v1: Changed title]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

4352b47a

09 3月, 2011 1 次提交

xen: Union the blkif_request request specific fields · 51de6952

由 Owen Smith 提交于 12月 22, 2010

Prepare for extending the block device ring to allow request
specific fields, by moving the request specific fields for
reads, writes and barrier requests to a union member.
Acked-by: NJens Axboe <jaxboe@fusionio.com>
Signed-off-by: NOwen Smith <owen.smith@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

51de6952

openanolis / cloud-kernel 11 个月 前同步成功

openanolis / cloud-kernel
11 个月前同步成功