- 24 9月, 2015 1 次提交
-
-
由 Roger Pau Monne 提交于
This is due to commit 86839c56 "xen/block: add multi-page ring support" When using an guest under UEFI - after the domain is destroyed the following warning comes from blkback. ------------[ cut here ]------------ WARNING: CPU: 2 PID: 95 at /home/julien/works/linux/drivers/block/xen-blkback/xenbus.c:274 xen_blkif_deferred_free+0x1f4/0x1f8() Modules linked in: CPU: 2 PID: 95 Comm: kworker/2:1 Tainted: G W 4.2.0 #85 Hardware name: APM X-Gene Mustang board (DT) Workqueue: events xen_blkif_deferred_free Call trace: [<ffff8000000890a8>] dump_backtrace+0x0/0x124 [<ffff8000000891dc>] show_stack+0x10/0x1c [<ffff8000007653bc>] dump_stack+0x78/0x98 [<ffff800000097e88>] warn_slowpath_common+0x9c/0xd4 [<ffff800000097f80>] warn_slowpath_null+0x14/0x20 [<ffff800000557a0c>] xen_blkif_deferred_free+0x1f0/0x1f8 [<ffff8000000ad020>] process_one_work+0x160/0x3b4 [<ffff8000000ad3b4>] worker_thread+0x140/0x494 [<ffff8000000b2e34>] kthread+0xd8/0xf0 ---[ end trace 6f859b7883c88cdd ]--- Request allocation has been moved to connect_ring, which is called every time blkback connects to the frontend (this can happen multiple times during a blkback instance life cycle). On the other hand, request freeing has not been moved, so it's only called when destroying the backend instance. Due to this mismatch, blkback can allocate the request pool multiple times, without freeing it. In order to fix it, move the freeing of requests to xen_blkif_disconnect to restore the symmetry between request allocation and freeing. Reported-by: NJulien Grall <julien.grall@citrix.com> Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Tested-by: NJulien Grall <julien.grall@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: xen-devel@lists.xenproject.org CC: stable@vger.kernel.org # 4.2 Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 29 7月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
Currently we have two different ways to signal an I/O error on a BIO: (1) by clearing the BIO_UPTODATE flag (2) by returning a Linux errno value to the bi_end_io callback The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not beeing persistent when bios are queued up, and are not passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns. So add a new bi_error field to store an errno value directly in struct bio and remove the existing mechanisms to clean all this up. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NNeilBrown <neilb@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 24 7月, 2015 1 次提交
-
-
由 Bob Liu 提交于
The BUG_ON() in purge_persistent_gnt() will be triggered when previous purge work haven't finished. There is a work_pending() before this BUG_ON, but it doesn't account if the work is still currently running. CC: stable@vger.kernel.org Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Signed-off-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 17 6月, 2015 1 次提交
-
-
由 Julien Grall 提交于
Make the code less confusing to read now that Linux may not have the same page size as Xen. Signed-off-by: NJulien Grall <julien.grall@citrix.com> Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
- 06 6月, 2015 2 次提交
-
-
由 Bob Liu 提交于
Extend xen/block to support multi-page ring, so that more requests can be issued by using more than one pages as the request ring between blkfront and backend. As a result, the performance can get improved significantly. We got some impressive improvements on our highend iscsi storage cluster backend. If using 64 pages as the ring, the IOPS increased about 15 times for the throughput testing and above doubled for the latency testing. The reason was the limit on outstanding requests is 32 if use only one-page ring, but in our case the iscsi lun was spread across about 100 physical drives, 32 was really not enough to keep them busy. Changes in v2: - Rebased to 4.0-rc6. - Document on how multi-page ring feature working to linux io/blkif.h. Changes in v3: - Remove changes to linux io/blkif.h and follow the protocol defined in io/blkif.h of XEN tree. - Rebased to 4.1-rc3 Changes in v4: - Turn to use 'ring-page-order' and 'max-ring-page-order'. - A few comments from Roger. Changes in v5: - Clarify with 4k granularity to comment - Address more comments from Roger Signed-off-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Bob Liu 提交于
This is a pre-patch for multi-page ring feature. In connect_ring, we can know exactly how many pages are used for the shared ring, delay pending_req allocation here so that we won't waste too much memory. Signed-off-by: NBob Liu <bob.liu@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 27 4月, 2015 2 次提交
-
-
由 Bob Liu 提交于
There are several place using gnttab async unmap and wait for completion, so move the common code to a function gnttab_unmap_refs_sync(). Signed-off-by: NBob Liu <bob.liu@oracle.com> Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Acked-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
由 Bob Liu 提交于
Commit c43cf3ea ("xen-blkback: safely unmap grants in case they are still in use") use gnttab_unmap_refs_async() to wait until the mapped pages are no longer in use before unmapping them, but that commit missed the persistent case. Purge persistent pages can't be unmapped either unless no longer in use. Signed-off-by: NBob Liu <bob.liu@oracle.com> Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
- 15 4月, 2015 1 次提交
-
-
由 Wei Liu 提交于
Originally Xen PV drivers only use single-page ring to pass along information. This might limit the throughput between frontend and backend. The patch extends Xenbus driver to support multi-page ring, which in general should improve throughput if ring is the bottleneck. Changes to various frontend / backend to adapt to the new interface are also included. Affected Xen drivers: * blkfront/back * netfront/back * pcifront/back * scsifront/back * vtpmfront The interface is documented, as before, in xenbus_client.c. Signed-off-by: NWei Liu <wei.liu2@citrix.com> Signed-off-by: NPaul Durrant <paul.durrant@citrix.com> Signed-off-by: NBob Liu <bob.liu@oracle.com> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
- 07 4月, 2015 2 次提交
-
-
由 Tao Chen 提交于
Define pr_fmt macro with {xen-blkback: } prefix, then remove all use of DRV_PFX in the pr sentences. Replace all DPRINTK with pr sentences, and get rid of DPRINTK macro. It will simplify the code. And if the pr sentences miss a \n, add it in the end. If the DPRINTK sentences have redundant \n, remove it. It will format the code. These all make the readability of the code become better. Signed-off-by: NTao Chen <boby.chen@huawei.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
-
由 Tao Chen 提交于
The blkback name is like blkback.domid.xvd[a-z], if domid has four digits (means larger than 1000), then the backmost xvd wouldn't be fully shown. Define a BLKBACK_NAME_LEN macro to be 20, enlarge the array size of blkback name, so it will be fully shown in any case. Signed-off-by: NTao Chen <boby.chen@huawei.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
-
- 11 2月, 2015 1 次提交
-
-
由 David Vrabel 提交于
Prior to the existance of 64-bit backends using the X86_64 ABI, frontends used the X86_32 ABI. These old frontends do not specify the ABI and when used with a 64-bit backend do not work. On x86, default to the X86_32 ABI if one is not specified. Backends on ARM continue to default to their NATIVE ABI. Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com> Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
-
- 28 1月, 2015 2 次提交
-
-
由 Jennifer Herbert 提交于
Use gnttab_unmap_refs_async() to wait until the mapped pages are no longer in use before unmapping them. This allows blkback to use network storage which may retain refs to pages in queued skbs after the block I/O has completed. Signed-off-by: NJennifer Herbert <jennifer.herbert@citrix.com> Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Acked-by: NJens Axboe <axboe@kernel.de> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
由 David Vrabel 提交于
Add gnttab_alloc_pages() and gnttab_free_pages() to allocate/free pages suitable to for granted maps. Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com> Reviewed-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
-
- 06 10月, 2014 1 次提交
-
-
由 David Vrabel 提交于
The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse errors. Replace the uses with standard structure definitions instead. This is similar to pci and usb device registration. Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
- 02 10月, 2014 2 次提交
-
-
由 Roger Pau Monné 提交于
Fix leaking a page when a grant mapping has failed. CC: stable@vger.kernel.org Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Reported-and-Tested-by: NTao Chen <boby.chen@huawei.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Vitaly Kuznetsov 提交于
blkback does not unmap persistent grants when frontend goes to Closed state (e.g. when blkfront module is being removed). This leads to the following in guest's dmesg: [ 343.243825] xen:grant_table: WARNING: g.e. 0x445 still in use! [ 343.243825] xen:grant_table: WARNING: g.e. 0x42a still in use! ... When load module -> use device -> unload module sequence is performed multiple times it is possible to hit BUG() condition in blkfront module: [ 343.243825] kernel BUG at drivers/block/xen-blkfront.c:954! [ 343.243825] invalid opcode: 0000 [#1] SMP [ 343.243825] Modules linked in: xen_blkfront(-) ata_generic pata_acpi [last unloaded: xen_blkfront] ... [ 343.243825] Call Trace: [ 343.243825] [<ffffffff814111ef>] ? unregister_xenbus_watch+0x16f/0x1e0 [ 343.243825] [<ffffffffa0016fbf>] blkfront_remove+0x3f/0x140 [xen_blkfront] ... [ 343.243825] RIP [<ffffffffa0016aae>] blkif_free+0x34e/0x360 [xen_blkfront] [ 343.243825] RSP <ffff88001eb8fdc0> We don't need to keep these grants if we're disconnecting as frontend might already forgot about them. Solve the issue by moving xen_blkbk_free_caches() call from xen_blkif_free() to xen_blkif_disconnect(). Now we can see the following: [ 928.590893] xen:grant_table: WARNING: g.e. 0x587 still in use! [ 928.591861] xen:grant_table: WARNING: g.e. 0x372 still in use! ... [ 929.592146] xen:grant_table: freeing g.e. 0x587 [ 929.597174] xen:grant_table: freeing g.e. 0x372 ... Backend does not keep persistent grants any more, reconnect works fine. CC: stable@vger.kernel.org Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 29 5月, 2014 2 次提交
-
-
由 Valentin Priescu 提交于
Currently xenwatch blocks in VBD disconnect, waiting for all pending I/O requests to finish. If the VBD is attached to a hot-swappable disk, then xenwatch can hang for a long period of time, stalling other watches. INFO: task xenwatch:39 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ffff880057f01bd0 0000000000000246 ffff880057f01ac0 ffffffff810b0782 ffff880057f01ad0 00000000000131c0 0000000000000004 ffff880057edb040 ffff8800344c6080 0000000000000000 ffff880058c00ba0 ffff880057edb040 Call Trace: [<ffffffff810b0782>] ? irq_to_desc+0x12/0x20 [<ffffffff8128f761>] ? list_del+0x11/0x40 [<ffffffff8147a080>] ? wait_for_common+0x60/0x160 [<ffffffff8147bcef>] ? _raw_spin_lock_irqsave+0x2f/0x50 [<ffffffff8147bd49>] ? _raw_spin_unlock_irqrestore+0x19/0x20 [<ffffffff8147a26a>] schedule+0x3a/0x60 [<ffffffffa018fe6a>] xen_blkif_disconnect+0x8a/0x100 [xen_blkback] [<ffffffff81079f70>] ? wake_up_bit+0x40/0x40 [<ffffffffa018ffce>] xen_blkbk_remove+0xae/0x1e0 [xen_blkback] [<ffffffff8130b254>] xenbus_dev_remove+0x44/0x90 [<ffffffff81345cb7>] __device_release_driver+0x77/0xd0 [<ffffffff81346488>] device_release_driver+0x28/0x40 [<ffffffff813456e8>] bus_remove_device+0x78/0xe0 [<ffffffff81342c9f>] device_del+0x12f/0x1a0 [<ffffffff81342d2d>] device_unregister+0x1d/0x60 [<ffffffffa0190826>] frontend_changed+0xa6/0x4d0 [xen_blkback] [<ffffffffa019c252>] ? frontend_changed+0x192/0x650 [xen_netback] [<ffffffff8130ae50>] ? cmp_dev+0x60/0x60 [<ffffffff81344fe4>] ? bus_for_each_dev+0x94/0xa0 [<ffffffff8130b06e>] xenbus_otherend_changed+0xbe/0x120 [<ffffffff8130b4cb>] frontend_changed+0xb/0x10 [<ffffffff81309c82>] xenwatch_thread+0xf2/0x130 [<ffffffff81079f70>] ? wake_up_bit+0x40/0x40 [<ffffffff81309b90>] ? xenbus_directory+0x80/0x80 [<ffffffff810799d6>] kthread+0x96/0xa0 [<ffffffff81485934>] kernel_thread_helper+0x4/0x10 [<ffffffff814839f3>] ? int_ret_from_sys_call+0x7/0x1b [<ffffffff8147c17c>] ? retint_restore_args+0x5/0x6 [<ffffffff81485930>] ? gs_change+0x13/0x13 With this patch, when there is still pending I/O, the actual disconnect is done by the last reference holder (last pending I/O request). In this case, xenwatch doesn't block indefinitely. Signed-off-by: NValentin Priescu <priescuv@amazon.com> Reviewed-by: NSteven Kady <stevkady@amazon.com> Reviewed-by: NSteven Noonan <snoonan@amazon.com> Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Olaf Hering 提交于
Newer toolstacks may provide a boolean property "discard-enable" in the backend node. Its purpose is to disable discard for file backed storage to avoid fragmentation. Recognize this setting also for physical storage. If that property exists and is false, do not advertise "feature-discard" to the frontend. Signed-off-by: NOlaf Hering <olaf@aepfle.de> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 12 2月, 2014 1 次提交
-
-
由 Roger Pau Monne 提交于
Initialize persistent_purge_work work_struct on xen_blkif_alloc (and remove the previous initialization done in purge_persistent_gnt). This prevents flush_work from complaining even if purge_persistent_gnt has not been used. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com> Tested-by: NSander Eikelenboom <linux@eikelenboom.it> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 08 2月, 2014 4 次提交
-
-
由 Roger Pau Monne 提交于
This was wrongly introduced in commit 402b27f9, the only difference between blkif_request_segment_aligned and blkif_request_segment is that the former has a named padding, while both share the same memory layout. Also correct a few minor glitches in the description, including for it to no longer assume PAGE_SIZE == 4096. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> [Description fix by Jan Beulich] Signed-off-by: NJan Beulich <jbeulich@suse.com> Reported-by: NJan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: NMatt Rushton <mrushton@amazon.com> Cc: Matt Wilson <msw@amazon.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Roger Pau Monne 提交于
Introduce a new variable to keep track of the number of in-flight requests. We need to make sure that when xen_blkif_put is called the request has already been freed and we can safely free xen_blkif, which was not the case before. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: NMatt Rushton <mrushton@amazon.com> Reviewed-by: NMatt Rushton <mrushton@amazon.com> Cc: Matt Wilson <msw@amazon.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Roger Pau Monne 提交于
I've at least identified two possible memory leaks in blkback, both related to the shutdown path of a VBD: - blkback doesn't wait for any pending purge work to finish before cleaning the list of free_pages. The purge work will call put_free_pages and thus we might end up with pages being added to the free_pages list after we have emptied it. Fix this by making sure there's no pending purge work before exiting xen_blkif_schedule, and moving the free_page cleanup code to xen_blkif_free. - blkback doesn't wait for pending requests to end before cleaning persistent grants and the list of free_pages. Again this can add pages to the free_pages list or persistent grants to the persistent_gnts red-black tree. Fixed by moving the persistent grants and free_pages cleanup code to xen_blkif_free. Also, add some checks in xen_blkif_free to make sure we are cleaning everything. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: NMatt Rushton <mrushton@amazon.com> Reviewed-by: NMatt Rushton <mrushton@amazon.com> Cc: Matt Wilson <msw@amazon.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Matt Rushton 提交于
Currently shrink_free_pagepool() is called before the pages used for persistent grants are released via free_persistent_gnts(). This results in a memory leak when a VBD that uses persistent grants is torn down. Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com> Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xen.org Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: NMatt Rushton <mrushton@amazon.com> Signed-off-by: NMatt Wilson <msw@amazon.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 03 2月, 2014 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
This reverts commit 08ece5bb. As it breaks ARM builds and needs more attention on the ARM side. Acked-by: NDavid Vrabel <david.vrabel@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 31 1月, 2014 1 次提交
-
-
由 Zoltan Kiss 提交于
The grant mapping API does m2p_override unnecessarily: only gntdev needs it, for blkback and future netback patches it just cause a lock contention, as those pages never go to userspace. Therefore this series does the following: - the original functions were renamed to __gnttab_[un]map_refs, with a new parameter m2p_override - based on m2p_override either they follow the original behaviour, or just set the private flag and call set_phys_to_machine - gnttab_[un]map_refs are now a wrapper to call __gnttab_[un]map_refs with m2p_override false - a new function gnttab_[un]map_refs_userspace provides the old behaviour It also removes a stray space from page.h and change ret to 0 if XENFEAT_auto_translated_physmap, as that is the only possible return value there. v2: - move the storing of the old mfn in page->index to gnttab_map_refs - move the function header update to a separate patch v3: - a new approach to retain old behaviour where it needed - squash the patches into one v4: - move out the common bits from m2p* functions, and pass pfn/mfn as parameter - clear page->private before doing anything with the page, so m2p_find_override won't race with this v5: - change return value handling in __gnttab_[un]map_refs - remove a stray space in page.h - add detail why ret = 0 now at some places v6: - don't pass pfn to m2p* functions, just get it locally Signed-off-by: NZoltan Kiss <zoltan.kiss@citrix.com> Suggested-by: NDavid Vrabel <david.vrabel@citrix.com> Acked-by: NDavid Vrabel <david.vrabel@citrix.com> Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 24 11月, 2013 1 次提交
-
-
由 Kent Overstreet 提交于
Immutable biovecs are going to require an explicit iterator. To implement immutable bvecs, a later patch is going to add a bi_bvec_done member to this struct; for now, this patch effectively just renames things. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Benny Halevy <bhalevy@tonian.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <chris.mason@fusionio.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: xfs@oss.sgi.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchand@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Peng Tao <tao.peng@emc.com> Cc: Andy Adamson <andros@netapp.com> Cc: fanchaoting <fanchaoting@cn.fujitsu.com> Cc: Jie Liu <jeff.liu@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Pankaj Kumar <pankaj.km@samsung.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Mel Gorman <mgorman@suse.de>6
-
- 09 11月, 2013 1 次提交
-
-
由 Vegard Nossum 提交于
If the permission check fails, we drop a reference to the blkif without having taken it in the first place. The bug was introduced in commit 604c499c (xen/blkback: Check device permissions before allowing OP_DISCARD). Cc: stable@vger.kernel.org Cc: Jan Beulich <JBeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NVegard Nossum <vegard.nossum@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 9月, 2013 1 次提交
-
-
由 Jingoo Han 提交于
The use of strict_strtoul() is not preferred, because strict_strtoul() is obsolete. Thus, kstrtoul() should be used. Signed-off-by: NJingoo Han <jg1.han@samsung.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 7月, 2013 1 次提交
-
-
由 Kees Cook 提交于
Calling kthread_run with a single name parameter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents. Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 6月, 2013 1 次提交
-
-
由 Roger Pau Monne 提交于
With the introduction of indirect segments we can receive requests with a number of segments bigger than the maximum number of allowed iovecs in a bios, so make sure that blkback doesn't try to allocate a bios with more iovecs than BIO_MAX_PAGES Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 22 6月, 2013 1 次提交
-
-
由 Roger Pau Monne 提交于
The code generat with gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54) creates an unbound loop for the second foreach_grant_safe loop in purge_persistent_gnt. The workaround is to avoid having this second loop and instead perform all the work inside the first loop by adding a new variable, clean_used, that will be set when all the desired persistent grants have been removed and we need to iterate over the remaining ones to remove the WAS_ACTIVE flag. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Reported-by: NTom O'Neill <toneill@vmem.com> Reported-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 18 6月, 2013 1 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
Check that the ring does not have an insane amount of requests (more than there could fit on the ring). If we detect this case we will stop processing the requests and wait until the XenBus disconnects the ring. The existing check RING_REQUEST_CONS_OVERFLOW which checks for how many responses we have created in the past (rsp_prod_pvt) vs requests consumed (req_cons) and whether said difference is greater or equal to the size of the ring, does not catch this case. Wha the condition does check if there is a need to process more as we still have a backlog of responses to finish. Note that both of those values (rsp_prod_pvt and req_cons) are not exposed on the shared ring. To understand this problem a mini crash course in ring protocol response/request updates is in place. There are four entries: req_prod and rsp_prod; req_event and rsp_event to track the ring entries. We are only concerned about the first two - which set the tone of this bug. The req_prod is a value incremented by frontend for each request put on the ring. Conversely the rsp_prod is a value incremented by the backend for each response put on the ring (rsp_prod gets set by rsp_prod_pvt when pushing the responses on the ring). Both values can wrap and are modulo the size of the ring (in block case that is 32). Please see RING_GET_REQUEST and RING_GET_RESPONSE for the more details. The culprit here is that if the difference between the req_prod and req_cons is greater than the ring size we have a problem. Fortunately for us, the '__do_block_io_op' loop: rc = blk_rings->common.req_cons; rp = blk_rings->common.sring->req_prod; while (rc != rp) { .. blk_rings->common.req_cons = ++rc; /* before make_response() */ } will loop up to the point when rc == rp. The macros inside of the loop (RING_GET_REQUEST) is smart and is indexing based on the modulo of the ring size. If the frontend has provided a bogus req_prod value we will loop until the 'rc == rp' - which means we could be processing already processed requests (or responses) often. The reason the RING_REQUEST_CONS_OVERFLOW is not helping here is b/c it only tracks how many responses we have internally produced and whether we would should process more. The astute reader will notice that the macro RING_REQUEST_CONS_OVERFLOW provides two arguments - more on this later. For example, if we were to enter this function with these values: blk_rings->common.sring->req_prod = X+31415 (X is the value from the last time __do_block_io_op was called). blk_rings->common.req_cons = X blk_rings->common.rsp_prod_pvt = X The RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, blk_rings->common.req_cons) is doing: req_cons - rsp_prod_pvt >= 32 Which is, X - X >= 32 or 0 >= 32 And that is false, so we continue on looping (this bug). If we re-use said macro RING_REQUEST_CONS_OVERFLOW and pass in the rp instead (sring->req_prod) of rc, the this macro can do the check: req_prod - rsp_prov_pvt >= 32 Which is, X + 31415 - X >= 32 , or 31415 >= 32 which is true, so we can error out and break out of the function. Unfortunatly the difference between rsp_prov_pvt and req_prod can be at 32 (which would error out in the macro). This condition exists when the backend is lagging behind with the responses and still has not finished responding to all of them (so make_response has not been called), and the rsp_prov_pvt + 32 == req_cons. This ends up with us not being able to use said macro. Hence introducing a new macro called RING_REQUEST_PROD_OVERFLOW which does a simple check of: req_prod - rsp_prod_pvt > RING_SIZE And with the X values from above: X + 31415 - X > 32 Returns true. Also not that if the ring is full (which is where the RING_REQUEST_CONS_OVERFLOW triggered), we would not hit the same condition: X + 32 - X > 32 Which is false. Lets use that macro. Note that in v5 of this patchset the macro was different - we used an earlier version. Cc: stable@vger.kernel.org [v1: Move the check outside the loop] [v2: Add a pr_warn as suggested by David] [v3: Use RING_REQUEST_CONS_OVERFLOW as suggested by Jan] [v4: Move wake_up after kthread_stop as suggested by Jan] [v5: Use RING_REQUEST_PROD_OVERFLOW instead] [v6: Use RING_REQUEST_PROD_OVERFLOW - Jan's version] Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: NJan Beulich <jbeulich@suse.com> gadsa
-
- 08 6月, 2013 2 次提交
-
-
由 Konrad Rzeszutek Wilk 提交于
We need to make sure that the device is not RO or that the request is not past the number of sectors we want to issue the DISCARD operation for. This fixes CVE-2013-2140. Cc: stable@vger.kernel.org Acked-by: NJan Beulich <JBeulich@suse.com> Acked-by: NIan Campbell <Ian.Campbell@citrix.com> [v1: Made it pr_warn instead of pr_debug] Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Stefan Bader 提交于
Currently xen-blkback passes the logical sector size over xenbus and xen-blkfront sets up the paravirt disk with that logical block size. But newer drives usually have the logical sector size set to 512 for compatibility reasons and would show the actual sector size only in physical sector size. This results in the device being partitioned and accessed in dom0 with the correct sector size, but the guest thinks 512 bytes is the correct block size. And that results in poor performance. To fix this, blkback gets modified to pass also physical-sector-size over xenbus and blkfront to use both values to set up the paravirt disk. I did not just change the passed in sector-size because I am not sure having a bigger logical sector size than the physical one is valid (and that would happen if a newer dom0 kernel hits an older domU kernel). Also this way a domU set up before should still be accessible (just some tools might detect the unaligned setup). [v2: Make xenbus write failure non-fatal] [v3: Use xenbus_scanf instead of xenbus_gather] [v4: Rebased against segment changes] Signed-off-by: NStefan Bader <stefan.bader@canonical.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 07 5月, 2013 1 次提交
-
-
由 Roger Pau Monne 提交于
Allocate pending requests in smaller chunks instead of allocating them all at the same time. This change also removes the global array of pending_reqs, it is no longer necessay. Variables related to the grant mapping have been grouped into a struct called "grant_page", this allows to allocate them in smaller chunks, and also improves memory locality. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Reported-by: NSander Eikelenboom <linux@eikelenboom.it> Tested-by: NSander Eikelenboom <linux@eikelenboom.it> Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 19 4月, 2013 1 次提交
-
-
由 Roger Pau Monne 提交于
Indirect descriptors introduce a new block operation (BLKIF_OP_INDIRECT) that passes grant references instead of segments in the request. This grant references are filled with arrays of blkif_request_segment_aligned, this way we can send more segments in a request. The proposed implementation sets the maximum number of indirect grefs (frames filled with blkif_request_segment_aligned) to 256 in the backend and 32 in the frontend. The value in the frontend has been chosen experimentally, and the backend value has been set to a sane value that allows expanding the maximum number of indirect descriptors in the frontend if needed. The migration code has changed from the previous implementation, in which we simply remapped the segments on the shared ring. Now the maximum number of segments allowed in a request can change depending on the backend, so we have to requeue all the requests in the ring and in the queue and split the bios in them if they are bigger than the new maximum number of segments. [v2: Fixed minor comments by Konrad. [v1: Added padding to make the indirect request 64bit aligned. Added some BUGs, comments; fixed number of indirect pages in blkif_get_x86_{32/64}_req. Added description about the indirect operation in blkif.h] Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> [v3: Fixed spaces and tabs mix ups] Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 18 4月, 2013 3 次提交
-
-
由 Roger Pau Monne 提交于
Preparatory change for implementing indirect descriptors. Change xen_blkbk_{map/unmap} in order to be able to map/unmap a random amount of grants (previously it was limited to BLKIF_MAX_SEGMENTS_PER_REQUEST). Also, remove the usage of pending_req in the map/unmap functions, so we can map/unmap grants without needing to pass a pending_req. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xen.org Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Roger Pau Monne 提交于
Remove the last dependency from blkbk by moving the list of free requests to blkif. This change reduces the contention on the list of available requests. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xen.org Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Roger Pau Monne 提交于
Moving grant ref handles from blkbk to pending_req will allow us to get rid of the shared blkbk structure. Signed-off-by: NRoger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xen.org Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-