提交 · 907c3eb18e0bd86ca12a9de80befe8e3647bac3e · openeuler / Kernel

20 8月, 2015 1 次提交

xen-blkfront: convert to blk-mq APIs · 907c3eb1

由 Bob Liu 提交于 7月 13, 2015

Note: This patch is based on original work of Arianna's internship for
GNOME's Outreach Program for Women.

Only one hardware queue is used now, so there is no significant
performance change

The legacy non-mq code is deleted completely which is the same as other
drivers like virtio, mtip, and nvme.

Also dropped one unnecessary holding of info->io_lock when calling
blk_mq_stop_hw_queues().
Signed-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
Signed-off-by: NBob Liu <bob.liu@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NJens Axboe <axboe@fb.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

907c3eb1

24 7月, 2015 2 次提交

xen-blkfront: don't add indirect pages to list when !feature_persistent · 7b076750

由 Bob Liu 提交于 7月 22, 2015

We should consider info->feature_persistent when adding indirect page to list
info->indirect_pages, else the BUG_ON() in blkif_free() would be triggered.

When we are using persistent grants the indirect_pages list
should always be empty because blkfront has pre-allocated enough
persistent pages to fill all requests on the ring.

CC: stable@vger.kernel.org
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Signed-off-by: NBob Liu <bob.liu@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

7b076750

xen-blkfront: introduce blkfront_gather_backend_features() · d50babbe

由 Bob Liu 提交于 7月 22, 2015

There is a bug when migrate from !feature-persistent host to feature-persistent
host, because domU still thinks new host/backend doesn't support persistent.
Dmesg like:
backed has not unmapped grant: 839
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 839

The fix is to recheck feature-persistent of new backend in blkif_recover().
See: https://lkml.org/lkml/2015/5/25/469

As Roger suggested, we can split the part of blkfront_connect that checks for
optional features, like persistent grants, indirect descriptors and
flush/barrier features to a separate function and call it from both
blkfront_connect and blkif_recover
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Signed-off-by: NBob Liu <bob.liu@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

d50babbe

22 6月, 2015 1 次提交

drivers: xen-blkfront: only talk_to_blkback() when in XenbusStateInitialising · a9b54bb9

由 Bob Liu 提交于 6月 19, 2015

Patch 69b91ede
"drivers: xen-blkback: delay pending_req allocation to connect_ring"
exposed an problem that Xen blkfront has. There is a race
with XenStored and the drivers such that we can see two:

vbd vbd-268440320: blkfront:blkback_changed to state 2.
vbd vbd-268440320: blkfront:blkback_changed to state 2.
vbd vbd-268440320: blkfront:blkback_changed to state 4.

state changes to XenbusStateInitWait ('2'). The end result is that
blkback_changed() receives two notify and calls twice setup_blkring().

While the backend driver may only get the first setup_blkring() which is
wrong and reads out-dated (or reads them as they are being updated
with new ring-ref values).

The end result is that the ring ends up being incorrectly set.

The other drivers in the tree have such checks already in.
Reported-and-Tested-by: NRobert Butera <robert.butera@oracle.com>
Signed-off-by: NBob Liu <bob.liu@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

a9b54bb9

17 6月, 2015 2 次提交

block/xen-blkfront: Remove invalid comment · ee4b7179

由 Julien Grall 提交于 6月 17, 2015

Since commit b7649158 "xen-blkfront: use a different scatterlist for each
request", biovec has been replaced by scatterlist when copying back the
data during a completion request.
Signed-off-by: NJulien Grall <julien.grall@citrix.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

ee4b7179

block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS · db26a686

由 Julien Grall 提交于 6月 17, 2015

Signed-off-by: NJulien Grall <julien.grall@citrix.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

db26a686

06 6月, 2015 2 次提交

xen/block: add multi-page ring support · 86839c56

由 Bob Liu 提交于 6月 03, 2015

Extend xen/block to support multi-page ring, so that more requests can be
issued by using more than one pages as the request ring between blkfront
and backend.
As a result, the performance can get improved significantly.

We got some impressive improvements on our highend iscsi storage cluster
backend. If using 64 pages as the ring, the IOPS increased about 15 times
for the throughput testing and above doubled for the latency testing.

The reason was the limit on outstanding requests is 32 if use only one-page
ring, but in our case the iscsi lun was spread across about 100 physical
drives, 32 was really not enough to keep them busy.

Changes in v2:
 - Rebased to 4.0-rc6.
 - Document on how multi-page ring feature working to linux io/blkif.h.

Changes in v3:
 - Remove changes to linux io/blkif.h and follow the protocol defined
   in io/blkif.h of XEN tree.
 - Rebased to 4.1-rc3

Changes in v4:
 - Turn to use 'ring-page-order' and 'max-ring-page-order'.
 - A few comments from Roger.

Changes in v5:
 - Clarify with 4k granularity to comment
 - Address more comments from Roger
Signed-off-by: NBob Liu <bob.liu@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

86839c56

driver: xen-blkfront: move talk_to_blkback to a more suitable place · 8ab0144a

由 Bob Liu 提交于 6月 03, 2015

The major responsibility of talk_to_blkback() is allocate and initialize
the request ring and write the ring info to xenstore.
But this work should be done after backend entered 'XenbusStateInitWait' as
defined in the protocol file.
See xen/include/public/io/blkif.h in XEN git tree:
Front                                Back
=================================    =====================================
XenbusStateInitialising              XenbusStateInitialising
 o Query virtual device               o Query backend device identification
   properties.                          data.
 o Setup OS device instance.          o Open and validate backend device.
                                      o Publish backend features and
                                        transport parameters.
                                                     |
                                                     |
                                                     V
                                     XenbusStateInitWait

o Query backend features and
  transport parameters.
o Allocate and initialize the
  request ring.

There is no problem with this yet, but it is an violation of the design and
furthermore it would not allow frontend/backend to negotiate 'multi-page'
and 'multi-queue' features.

Changes in v2:
 - Re-write the commit message to be more clear.
Signed-off-by: NBob Liu <bob.liu@oracle.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

8ab0144a

15 4月, 2015 1 次提交

xenbus_client: Extend interface to support multi-page ring · ccc9d90a

由 Wei Liu 提交于 4月 03, 2015

Originally Xen PV drivers only use single-page ring to pass along
information. This might limit the throughput between frontend and
backend.

The patch extends Xenbus driver to support multi-page ring, which in
general should improve throughput if ring is the bottleneck. Changes to
various frontend / backend to adapt to the new interface are also
included.

Affected Xen drivers:
* blkfront/back
* netfront/back
* pcifront/back
* scsifront/back
* vtpmfront

The interface is documented, as before, in xenbus_client.c.
Signed-off-by: NWei Liu <wei.liu2@citrix.com>
Signed-off-by: NPaul Durrant <paul.durrant@citrix.com>
Signed-off-by: NBob Liu <bob.liu@oracle.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

ccc9d90a

13 2月, 2015 1 次提交

kernel.h: remove ancient __FUNCTION__ hack · 02f1f217

由 Rasmus Villemoes 提交于 2月 12, 2015

__FUNCTION__ hasn't been treated as a string literal since gcc 3.4, so
this only helps people who only test-compile using 3.3 (compiler-gcc3.h
barks at anything older than that).  Besides, there are almost no
occurrences of __FUNCTION__ left in the tree.

[akpm@linux-foundation.org: convert remaining __FUNCTION__ references]
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

02f1f217

11 2月, 2015 1 次提交

xen-blkfront: fix accounting of reqs when migrating · 3bb8c98e

由 Roger Pau Monne 提交于 2月 02, 2015

Current migration code uses blk_put_request in order to finish a request
before requeuing it. This function doesn't update the statistics of the
queue, which completely screws accounting. Use blk_end_request_all instead
which properly updates the statistics of the queue.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Reported-and-Tested-by: NOuyang Zhaowei (Charles) <ouyangzhaowei@huawei.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: xen-devel@lists.xenproject.org

3bb8c98e

11 12月, 2014 2 次提交

xen/blkfront: remove redundant flush_op · fdf9b965

由 Vitaly Kuznetsov 提交于 12月 09, 2014

flush_op is unambiguously defined by feature_flush:
    REQ_FUA | REQ_FLUSH -> BLKIF_OP_WRITE_BARRIER
    REQ_FLUSH -> BLKIF_OP_FLUSH_DISKCACHE
    0 -> 0
and thus can be removed. This is just a cleanup.

The patch was suggested by Boris Ostrovsky.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

fdf9b965

xen/blkfront: improve protection against issuing unsupported REQ_FUA · ad42d391

由 Vitaly Kuznetsov 提交于 12月 01, 2014

Guard against issuing unsupported REQ_FUA and REQ_FLUSH was introduced
in d11e6158 and was factored out into blkif_request_flush_valid() in
0f1ca65e. However:
1) This check in incomplete. In case we negotiated to feature_flush = REQ_FLUSH
   and flush_op = BLKIF_OP_FLUSH_DISKCACHE (so FUA is unsupported) FUA request
   will still pass the check.
2) blkif_request_flush_valid() is misnamed. It is bool but returns true when
   the request is invalid.
3) When blkif_request_flush_valid() fails -EIO is being returned. It seems that
   -EOPNOTSUPP is more appropriate here.
Fix all of the above issues.

This patch is based on the original patch by Laszlo Ersek and a comment by
Jeff Moyer.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NLaszlo Ersek <lersek@redhat.com>
Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ad42d391

06 10月, 2014 1 次提交

xen: remove DEFINE_XENBUS_DRIVER() macro · 95afae48

由 David Vrabel 提交于 9月 08, 2014

The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse
errors.

Replace the uses with standard structure definitions instead.  This is
similar to pci and usb device registration.
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

95afae48

02 10月, 2014 1 次提交

xen, blkfront: factor out flush-related checks from do_blkif_request() · 0f1ca65e

由 Arianna Avanzini 提交于 8月 22, 2014

This commit factors out some checks related to the request insertion
path, which can be done in an function instead of by itself.
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Signed-off-by: NArianna Avanzini <avanzini.arianna@gmail.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

0f1ca65e

29 5月, 2014 1 次提交

xen-blkfront: remove type check from blkfront_setup_discard · 1c8cad6c

由 Olaf Hering 提交于 5月 21, 2014

In its initial implementation a check for "type" was added, but only phy
and file are handled. This breaks advertised discard support for other
type values such as qdisk.

Fix and simplify this function: If the backend advertises discard
support it is supposed to implement it properly, so enable
feature_discard unconditionally. If the backend advertises the need for
a certain granularity and alignment then propagate both properties to
the blocklayer. The discard-secure property is a boolean, update the code
to reflect that.
Signed-off-by: NOlaf Hering <olaf@aepfle.de>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

1c8cad6c

16 4月, 2014 1 次提交

block: remove struct request buffer member · b4f42e28

由 Jens Axboe 提交于 4月 10, 2014

This was used in the olden days, back when onions were proper
yellow. Basically it mapped to the current buffer to be
transferred. With highmem being added more than a decade ago,
most drivers map pages out of a bio, and rq->buffer isn't
pointing at anything valid.

Convert old style drivers to just use bio_data().

For the discard payload use case, just reference the page
in the bio.
Signed-off-by: NJens Axboe <axboe@fb.com>

b4f42e28

08 2月, 2014 2 次提交

xen-blkfront: handle backend CLOSED without CLOSING · 36613717

由 David Vrabel 提交于 2月 04, 2014

Backend drivers shouldn't transistion to CLOSED unless the frontend is
CLOSED.  If a backend does transition to CLOSED too soon then the
frontend may not see the CLOSING state and will not properly shutdown.

So, treat an unexpected backend CLOSED state the same as CLOSING.
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

36613717

xen-blkif: drop struct blkif_request_segment_aligned · 80bfa2f6

由 Roger Pau Monne 提交于 2月 04, 2014

This was wrongly introduced in commit 402b27f9, the only difference
between blkif_request_segment_aligned and blkif_request_segment is
that the former has a named padding, while both share the same
memory layout.

Also correct a few minor glitches in the description, including for it
to no longer assume PAGE_SIZE == 4096.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
[Description fix by Jan Beulich]
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Reported-by: NJan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: NMatt Rushton <mrushton@amazon.com>
Cc: Matt Wilson <msw@amazon.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

80bfa2f6

04 1月, 2014 1 次提交

xen/pvhvm: If xen_platform_pci=0 is set don't blow up (v4). · 51c71a3b

由 Konrad Rzeszutek Wilk 提交于 11月 26, 2013

The user has the option of disabling the platform driver:
00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)

which is used to unplug the emulated drivers (IDE, Realtek 8169, etc)
and allow the PV drivers to take over. If the user wishes
to disable that they can set:

  xen_platform_pci=0
  (in the guest config file)

or
  xen_emul_unplug=never
  (on the Linux command line)

except it does not work properly. The PV drivers still try to
load and since the Xen platform driver is not run - and it
has not initialized the grant tables, most of the PV drivers
stumble upon:

input: Xen Virtual Keyboard as /devices/virtual/input/input5
input: Xen Virtual Pointer as /devices/virtual/input/input6M
------------[ cut here ]------------
kernel BUG at /home/konrad/ssd/konrad/linux/drivers/xen/grant-table.c:1206!
invalid opcode: 0000 [#1] SMP
Modules linked in: xen_kbdfront(+) xenfs xen_privcmd
CPU: 6 PID: 1389 Comm: modprobe Not tainted 3.13.0-rc1upstream-00021-ga6c892b-dirty #1
Hardware name: Xen HVM domU, BIOS 4.4-unstable 11/26/2013
RIP: 0010:[<ffffffff813ddc40>]  [<ffffffff813ddc40>] get_free_entries+0x2e0/0x300
Call Trace:
 [<ffffffff8150d9a3>] ? evdev_connect+0x1e3/0x240
 [<ffffffff813ddd0e>] gnttab_grant_foreign_access+0x2e/0x70
 [<ffffffffa0010081>] xenkbd_connect_backend+0x41/0x290 [xen_kbdfront]
 [<ffffffffa0010a12>] xenkbd_probe+0x2f2/0x324 [xen_kbdfront]
 [<ffffffff813e5757>] xenbus_dev_probe+0x77/0x130
 [<ffffffff813e7217>] xenbus_frontend_dev_probe+0x47/0x50
 [<ffffffff8145e9a9>] driver_probe_device+0x89/0x230
 [<ffffffff8145ebeb>] __driver_attach+0x9b/0xa0
 [<ffffffff8145eb50>] ? driver_probe_device+0x230/0x230
 [<ffffffff8145eb50>] ? driver_probe_device+0x230/0x230
 [<ffffffff8145cf1c>] bus_for_each_dev+0x8c/0xb0
 [<ffffffff8145e7d9>] driver_attach+0x19/0x20
 [<ffffffff8145e260>] bus_add_driver+0x1a0/0x220
 [<ffffffff8145f1ff>] driver_register+0x5f/0xf0
 [<ffffffff813e55c5>] xenbus_register_driver_common+0x15/0x20
 [<ffffffff813e76b3>] xenbus_register_frontend+0x23/0x40
 [<ffffffffa0015000>] ? 0xffffffffa0014fff
 [<ffffffffa001502b>] xenkbd_init+0x2b/0x1000 [xen_kbdfront]
 [<ffffffff81002049>] do_one_initcall+0x49/0x170

.. snip..

which is hardly nice. This patch fixes this by having each
PV driver check for:
 - if running in PV, then it is fine to execute (as that is their
   native environment).
 - if running in HVM, check if user wanted 'xen_emul_unplug=never',
   in which case bail out and don't load any PV drivers.
 - if running in HVM, and if PCI device 5853:0001 (xen_platform_pci)
   does not exist, then bail out and not load PV drivers.
 - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=ide-disks',
   then bail out for all PV devices _except_ the block one.
   Ditto for the network one ('nics').
 - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=unnecessary'
   then load block PV driver, and also setup the legacy IDE paths.
   In (v3) make it actually load PV drivers.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it
Reported-by: NAnthony PERARD <anthony.perard@citrix.com>
Reported-and-Tested-by: NFabio Fantoni <fabio.fantoni@m2r.biz>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Add extra logic to handle the myrid ways 'xen_emul_unplug'
can be used per Ian and Stefano suggestion]
[v3: Make the unnecessary case work properly]
[v4: s/disks/ide-disks/ spotted by Fabio]
Reviewed-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> [for PCI parts]
CC: stable@vger.kernel.org

51c71a3b

27 11月, 2013 2 次提交

block: xen-blkfront: Fix possible NULL ptr dereference · 2f089cb8

由 Felipe Pena 提交于 11月 09, 2013

In the blkif_release function the bdget_disk() call might returns
a NULL ptr which might be dereferenced on bdev->bd_openers checking
Signed-off-by: NFelipe Pena <felipensp@gmail.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Added WARN per Roger's suggestion]

2f089cb8

xen-blkfront: Silence pfn maybe-uninitialized warning · 427bfe07

由 Tim Gardner 提交于 11月 14, 2013

pfn cannot actually be used unless (!info->feature_persistent), nor is
pfn accessed in get_grant() unless (!info->feature_persistent), but silence
this warning anyway. gcc-4.8

drivers/block/xen-blkfront.c: In function 'do_blkif_request':
drivers/block/xen-blkfront.c:508:20: warning: 'pfn' may be used uninitialized in this function [-Wmaybe-uninitialized]
     gnt_list_entry = get_grant(&gref_head, pfn, info);
                    ^
drivers/block/xen-blkfront.c:492:19: note: 'pfn' was declared here
     unsigned long pfn;

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: NTim Gardner <tim.gardner@canonical.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>

427bfe07

24 11月, 2013 1 次提交

block: Abstract out bvec iterator · 4f024f37

由 Kent Overstreet 提交于 10月 11, 2013

Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>6

4f024f37

09 11月, 2013 4 次提交

xen-blkfront: restore the non-persistent data path · bfe11d6d

由 Roger Pau Monne 提交于 10月 29, 2013

When persistent grants were added they were always used, even if the
backend doesn't have this feature (there's no harm in always using the
same set of pages). This restores the old data path when the backend
doesn't have persistent grants, removing the burden of doing a memcpy
when it is not actually needed.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Reported-by: NFelipe Franciosi <felipe.franciosi@citrix.com>
Cc: Felipe Franciosi <felipe.franciosi@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Fix up whitespace issues]

bfe11d6d

xen-blkfront: improve aproximation of required grants per request · c47206e2

由 Roger Pau Monne 提交于 8月 12, 2013

Improve the calculation of required grants to process a request by
using nr_phys_segments instead of always assuming a request is going
to use all posible segments.

nr_phys_segments contains the number of scatter-gather DMA addr+len
pairs, which is basically what we put at every granted page.
for_each_sg iterates over the DMA addr+len pairs and uses a grant
page for each of them.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

c47206e2

xen-blkfront: revoke foreign access for grants not mapped by the backend · fbe363c4

由 Roger Pau Monne 提交于 8月 12, 2013

There's no need to keep the foreign access in a grant if it is not
persistently mapped by the backend. This allows us to free grants that
are not mapped by the backend, thus preventing blkfront from hoarding
all grants.

The main effect of this is that blkfront will only persistently map
the same grants as the backend, and it will always try to use grants
that are already mapped by the backend. Also the number of persistent
grants in blkfront is the same as in blkback (and is controlled by the
value in blkback).
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NMatt Wilson <msw@amazon.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

fbe363c4

block: Consolidate duplicated bio_trim() implementations · 6678d83f

由 Kent Overstreet 提交于 8月 07, 2013

Someone cut and pasted md's md_trim_bio() into xen-blkfront.c. Come on,
we should know better than this.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Neil Brown <neilb@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6678d83f

22 6月, 2013 1 次提交

xen-blkfront: set blk_queue_max_hw_sectors correctly · 294caaf2

由 Roger Pau Monne 提交于 6月 21, 2013

Now that indirect segments are enabled blk_queue_max_hw_sectors must
be set to match the maximum number of sectors we can handle in a
request.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Reported-by: NFelipe Franciosi <felipe.franciosi@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

294caaf2

08 6月, 2013 1 次提交

xen/blkback: Use physical sector size for setup · 7c4d7d71

由 Stefan Bader 提交于 5月 13, 2013

Currently xen-blkback passes the logical sector size over xenbus and
xen-blkfront sets up the paravirt disk with that logical block size.
But newer drives usually have the logical sector size set to 512 for
compatibility reasons and would show the actual sector size only in
physical sector size.
This results in the device being partitioned and accessed in dom0 with
the correct sector size, but the guest thinks 512 bytes is the correct
block size. And that results in poor performance.

To fix this, blkback gets modified to pass also physical-sector-size
over xenbus and blkfront to use both values to set up the paravirt
disk. I did not just change the passed in sector-size because I am
not sure having a bigger logical sector size than the physical one
is valid (and that would happen if a newer dom0 kernel hits an older
domU kernel). Also this way a domU set up before should still be
accessible (just some tools might detect the unaligned setup).

[v2: Make xenbus write failure non-fatal]
[v3: Use xenbus_scanf instead of xenbus_gather]
[v4: Rebased against segment changes]
Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

7c4d7d71

05 6月, 2013 1 次提交

xen-blkfront: Introduce a 'max' module parameter to alter the amount of indirect segments. · 2d5dc3ba

由 Konrad Rzeszutek Wilk 提交于 5月 15, 2013

The max module parameter (by default 32) is the maximum number of
segments that the frontend will negotiate with the backend for indirect
descriptors.  Higher value means more potential throughput but more
memory usage. The backend picks the minimum of the frontend and its
default backend value.
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

2d5dc3ba

08 5月, 2013 1 次提交

xen-blkfront: use a different scatterlist for each request · b7649158

由 Roger Pau Monne 提交于 5月 02, 2013

In blkif_queue_request blkfront iterates over the scatterlist in order
to set the segments of the request, and in blkif_completion blkfront
iterates over the raw request, which makes it hard to know the exact
position of the source and destination memory positions.

This can be solved by allocating a scatterlist for each request, that
will be keep until the request is finished, allowing us to copy the
data back to the original memory without having to iterate over the
raw request.

Oracle-Bug: 16660413 - LARGE ASYNCHRONOUS READS APPEAR BROKEN ON 2.6.39-400
CC: stable@vger.kernel.org
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Reported-and-Tested-by: NAnne Milicia <anne.milicia@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

b7649158

07 5月, 2013 1 次提交

block_device_operations->release() should return void · db2a144b

由 Al Viro 提交于 5月 05, 2013

The value passed is 0 in all but "it can never happen" cases (and those
only in a couple of drivers) *and* it would've been lost on the way
out anyway, even if something tried to pass something meaningful.
Just don't bother.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

db2a144b

19 4月, 2013 1 次提交

xen-block: implement indirect descriptors · 402b27f9

由 Roger Pau Monne 提交于 4月 18, 2013

Indirect descriptors introduce a new block operation
(BLKIF_OP_INDIRECT) that passes grant references instead of segments
in the request. This grant references are filled with arrays of
blkif_request_segment_aligned, this way we can send more segments in a
request.

The proposed implementation sets the maximum number of indirect grefs
(frames filled with blkif_request_segment_aligned) to 256 in the
backend and 32 in the frontend. The value in the frontend has been
chosen experimentally, and the backend value has been set to a sane
value that allows expanding the maximum number of indirect descriptors
in the frontend if needed.

The migration code has changed from the previous implementation, in
which we simply remapped the segments on the shared ring. Now the
maximum number of segments allowed in a request can change depending
on the backend, so we have to requeue all the requests in the ring and
in the queue and split the bios in them if they are bigger than the
new maximum number of segments.

[v2: Fixed minor comments by Konrad.
[v1: Added padding to make the indirect request 64bit aligned.
 Added some BUGs, comments; fixed number of indirect pages in
 blkif_get_x86_{32/64}_req. Added description about the indirect operation
 in blkif.h]
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
[v3: Fixed spaces and tabs mix ups]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

402b27f9

20 3月, 2013 2 次提交

xen-blkfront: remove frame list from blk_shadow · b1173e31

由 Roger Pau Monne 提交于 3月 18, 2013

We already have the frame (pfn of the grant page) stored inside struct
grant, so there's no need to keep an aditional list of mapped frames
for a specific request. This reduces memory usage in blkfront.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xen.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

b1173e31

xen-blkfront: pre-allocate pages for requests · 9c1e050c

由 Roger Pau Monne 提交于 3月 18, 2013

This prevents us from having to call alloc_page while we are preparing
the request. Since blkfront was calling alloc_page with a spinlock
held we used GFP_ATOMIC, which can fail if we are requesting a lot of
pages since it is using the emergency memory pools.

Allocating all the pages at init prevents us from having to call
alloc_page, thus preventing possible failures.
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xen.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

9c1e050c

19 3月, 2013 2 次提交

xen-blkfront: switch from llist to list · 155b7edb

由 Roger Pau Monne 提交于 3月 18, 2013

The git commit f84adf49
(xen-blkfront: drop the use of llist_for_each_entry_safe)

was a stop-gate to fix a GCC4.1 bug. The appropiate way
is to actually use an list instead of using an llist.

As such this patch replaces the usage of llist with an
list.

Since we always manipulate the list while holding the io_lock, there's
no need for additional locking (llist used previously is safe to use
concurrently without additional locking).
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
CC: stable@vger.kernel.org
[v1: Redid the git commit description]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

155b7edb

xen-blkfront: replace kmalloc and then memcpy with kmemdup · 29d0b218

由 Mihnea Dobrescu-Balaur 提交于 3月 11, 2013

The benefits are:
* code is cleaner
* kmemdup adds additional debugging info useful for tracking the real
place where memory was allocated (CONFIG_DEBUG_SLAB).
Signed-off-by: NMihnea Dobrescu-Balaur <mihneadb@gmail.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

29d0b218

20 2月, 2013 1 次提交

xen-blkfront: drop the use of llist_for_each_entry_safe · f84adf49

由 Konrad Rzeszutek Wilk 提交于 2月 13, 2013

Replace llist_for_each_entry_safe with a while loop.

llist_for_each_entry_safe can trigger a bug in GCC 4.1, so it's best
to remove it and use a while loop and do the deletion manually.

Specifically this bug can be triggered by hot-unplugging a disk, either
by doing xm block-detach or by save/restore cycle.

BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [<ffffffffa0047223>] blkif_free+0x63/0x130 [xen_blkfront]
The crash call trace is:
	...
bad_area_nosemaphore+0x13/0x20
do_page_fault+0x25e/0x4b0
page_fault+0x25/0x30
? blkif_free+0x63/0x130 [xen_blkfront]
blkfront_resume+0x46/0xa0 [xen_blkfront]
xenbus_dev_resume+0x6c/0x140
pm_op+0x192/0x1b0
device_resume+0x82/0x1e0
dpm_resume+0xc9/0x1a0
dpm_resume_end+0x15/0x30
do_suspend+0x117/0x1e0

When drilling down to the assembler code, on newer GCC it does
.L29:
        cmpq    $-16, %r12      #, persistent_gnt check
        je      .L30    	#, out of the loop
.L25:
	... code in the loop
        testq   %r13, %r13      # n
        je      .L29    	#, back to the top of the loop
        cmpq    $-16, %r12      #, persistent_gnt check
        movq    16(%r12), %r13  # <variable>.node.next, n
        jne     .L25    	#,	back to the top of the loop
.L30:

While on GCC 4.1, it is:
L78:
	... code in the loop
	testq   %r13, %r13      # n
        je      .L78    #,	back to the top of the loop
        movq    16(%rbx), %r13  # <variable>.node.next, n
        jmp     .L78    #,	back to the top of the loop

Which basically means that the exit loop condition instead of
being:

	&(pos)->member != NULL;

is:
	;

which makes the loop unbound.

Since xen-blkfront is the only user of the llist_for_each_entry_safe
macro remove it from llist.h.

Orabug: 16263164
CC: stable@vger.kernel.org
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

f84adf49

18 12月, 2012 2 次提交

xen-blkfront: handle bvecs with partial data · d62f6918

由 Roger Pau Monne 提交于 12月 07, 2012

Currently blkfront fails to handle cases in blkif_completion like the
following:

1st loop in rq_for_each_segment
 * bv_offset: 3584
 * bv_len: 512
 * offset += bv_len
 * i: 0

2nd loop:
 * bv_offset: 0
 * bv_len: 512
 * i: 0

In the second loop i should be 1, since we assume we only wanted to
read a part of the previous page. This patches fixes this cases where
only a part of the shared page is read, and blkif_completion assumes
that if the bv_offset of a bvec is less than the previous bv_offset
plus the bv_size we have to switch to the next shared page.
Reported-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Cc: linux-kernel@vger.kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

d62f6918

llist/xen-blkfront: implement safe version of llist_for_each_entry · ebb351cf

由 Roger Pau Monne 提交于 12月 04, 2012

Implement a safe version of llist_for_each_entry, and use it in
blkif_free. Previously grants where freed while iterating the list,
which lead to dereferences when trying to fetch the next item.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com>
Acked-by: NAndrew Morton <akpm@linux-foundation.org>
[v2: Move the llist_for_each_entry_safe in llist.h]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ebb351cf

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功