Commits · acb7aa0db09b8abd38abeb84334a8a27a52fbb1b · openeuler / Kernel

17 Apr, 2013 1 commit

NVMe: Use round_jiffies_relative() for the periodic, once-per-second timer · acb7aa0d

Committed by Arjan van de Ven on Feb 04, 2013

The nvme driver has a "once per second" event where the management kthread
wakes up the system and then reschedules itself for 1 second later.
For power efficiency reasons, I'd like this timer to happen together
with other wakeups in the system.

This patch makes the schedule_timeout() call in the kthread use
round_jiffies_relative(), causing the wakeup to at least align with other
"once per X seconds" events in the kernel.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

acb7aa0d
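
To illustrate why rounding the timeout helps wakeups coincide, here is a minimal userspace sketch of rounding a relative timeout so that its absolute expiry lands on a whole-second boundary. It is a simplified model, not the kernel implementation: the function name `round_relative_ticks` and its parameters are hypothetical, and the per-CPU skew that the kernel helper also applies is left out.

```c
#include <stdint.h>

/*
 * Simplified model: round a relative timeout (in ticks) so that the
 * absolute expiry time falls on a whole-second boundary.
 * `hz` is the number of ticks per second. Hypothetical helper, not kernel code.
 */
uint64_t round_relative_ticks(uint64_t now, uint64_t delta, uint64_t hz)
{
    uint64_t target = now + delta;   /* absolute expiry before rounding */
    uint64_t rem = target % hz;      /* ticks past the last full second */
    uint64_t rounded;

    /* Just past a second boundary: round down. Otherwise round up. */
    if (rem < hz / 4)
        rounded = target - rem;
    else
        rounded = target - rem + hz;

    /* Never hand back an expiry that is not in the future. */
    if (rounded <= now)
        return delta;

    return rounded - now;
}
```

With timers from different subsystems all rounded to the same boundaries, a single wakeup can service several of them, which is the power-saving effect the commit message describes.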

29 Mar, 2013 1 commit

NVMe: Add nvme-scsi.c · 5d0f6131

Committed by Vishal Verma on Mar 04, 2013

Translates SCSI commands in SG_IO ioctl to NVMe commands.
Uses the scsi-nvme translation spec from nvmexpress.org as reference.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

5d0f6131

27 Mar, 2013 4 commits

NVMe: Add definitions for format command · f8ebf840

Committed by Vishal Verma on Mar 27, 2013

The SCSI emulation has the ability to send format commands, so we need
to add the definition of the command. Also add a missing error code.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

f8ebf840

NVMe: Move structures & definitions to header file · 13c3b0fc

Committed by Vishal Verma on Mar 04, 2013

nvme-scsi.c uses several data structures and definitions that were
previously private to nvme-core.c.  Move the definitions to nvme.h,
protected by __KERNEL__.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

13c3b0fc

NVMe: Rename nvme.c to nvme-core.c · 729dd1bd

Committed by Vishal Verma on Mar 04, 2013

In preparation for adding nvme-scsi.c.
It is preferable to retain the module name 'nvme'.

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

729dd1bd

NVMe: Add discard support for capable devices · 0e5e4f0e

Committed by Keith Busch on Nov 09, 2012

This adds discard support to block queues if the nvme device is capable of
deallocating blocks as indicated by the controller's optional command support.
A discard flagged bio request will submit an NVMe deallocate Data Set
Management command for the requested blocks.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

0e5e4f0e

23 Mar, 2013 1 commit

NVMe: Add namespaces with no LBA range feature · 12209036

Committed by Keith Busch on Jan 31, 2013

The LBA Range Type feature is optional in the NVMe specification,
so we should continue with adding namespaces for controllers that do
not implement this feature.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

12209036

28 Feb, 2013 7 commits

nbd: fix sparse warning · 398eb085

Committed by Alex Elder on Feb 27, 2013

I just fixed this in "drivers/block/rbd.c" and I noticed that
"drivers/block/nbd.c" has the same problem.  Fix a warning issued by
sparse by adding some lockdep annotations to indicate the queue lock gets
dropped (because it's held when do_nbd_request() is called) and
re-acquired within the function.

Signed-off-by: Alex Elder <elder@inktank.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Paul Clements <paul.clements@us.sios.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

398eb085

nbd: show read-only state in sysfs · a83e814b

Committed by Paolo Bonzini on Feb 27, 2013

Pass the read-only flag to set_device_ro, so that it will be visible to
the block layer and in sysfs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a83e814b

nbd: fsync and kill block device on shutdown · 3a2d63f8

Committed by Paolo Bonzini on Feb 27, 2013

There are two problems with shutdown in the NBD driver.

1: Receiving the NBD_DISCONNECT ioctl does not sync the filesystem.

   This patch adds the sync operation into __nbd_ioctl()'s
   NBD_DISCONNECT handler.  This is useful because BLKFLSBUF is restricted
   to processes that have CAP_SYS_ADMIN, and the NBD client may not
   possess it (fsync of the block device does not sync the filesystem,
   either).

2: Once we clear the socket we have no guarantee that later reads will
   come from the same backing storage.

   The patch adds calls to kill_bdev() in __nbd_ioctl()'s socket
   clearing code so the page cache is cleaned; otherwise, reads that hit
   the page cache would return stale data from the previously accessible disk.

Example:

    # qemu-nbd -r -c/dev/nbd0 /dev/sr0
    # file -s /dev/nbd0
    /dev/stdin: # UDF filesystem data (version 1.5) etc.
    # qemu-nbd -d /dev/nbd0
    # qemu-nbd -r -c/dev/nbd0 /dev/sda
    # file -s /dev/nbd0
    /dev/stdin: # UDF filesystem data (version 1.5) etc.

While /dev/sda has:

    # file -s /dev/sda
    /dev/sda: x86 boot sector; etc.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paul Clements <Paul.Clements@steeleye.com>
Cc: Alex Bligh <alex@alex.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3a2d63f8

nbd: support FLUSH requests · 75f187ab

Committed by Alex Bligh on Feb 27, 2013

Currently, the NBD device does not accept flush requests from the Linux
block layer. If the NBD server opened the target with neither O_SYNC nor
O_DSYNC, however, the device will be effectively backed by a writeback
cache. Without issuing flushes properly, operation of the NBD device will
not be safe against power losses.

The NBD protocol has support for both a cache flush command and a FUA
command flag; the server will also pass a flag to note its support for
these features. This patch adds support for the cache flush command and
flag. In the kernel, we receive the flags via the NBD_SET_FLAGS ioctl,
and map NBD_FLAG_SEND_FLUSH to the argument of blk_queue_flush. When the
flag is active the block layer will send REQ_FLUSH requests, which we
translate to NBD_CMD_FLUSH commands.

FUA support is not included in this patch because all free software
servers implement it with a full fdatasync; thus it has no advantage over
supporting flush only. Because I [Paolo] cannot really benchmark it in a
realistic scenario, I cannot tell if it is a good idea or not. It is also
not clear if it is valid for an NBD server to support FUA but not flush.
The Linux block layer gives a warning for this combination; the NBD
protocol documentation says nothing about it.

The patch also fixes a small problem in the handling of flags: nbd->flags
must be cleared at the end of NBD_DO_IT, but the driver was not doing
that. The bug manifests itself as follows. Suppose you use two different
client/server pairs to start the NBD device. Suppose also that the first
client supports NBD_SET_FLAGS, and the first server sends
NBD_FLAG_SEND_FLUSH; the second pair instead does neither of these two
things. Before this patch, the second invocation of NBD_DO_IT will use a
stale value of nbd->flags, and the second server will issue an error every
time it receives an NBD_CMD_FLUSH command.

This bug is pre-existing, but it becomes much more important after this
patch; flush failures make the device pretty much unusable, unlike

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Acked-by: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

75f187ab
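
The stale-flags problem described in this commit can be modelled in a few lines of userspace C. This is only a sketch under assumptions: `struct nbd_model` and the `model_*` functions are hypothetical stand-ins for the driver state and ioctl handlers, and the bit position of `NBD_FLAG_SEND_FLUSH` is taken from the NBD protocol's transmission flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Transmission flag bit announcing flush support (per the NBD protocol). */
#define NBD_FLAG_SEND_FLUSH (1u << 2)

/* Hypothetical stand-in for the per-device driver state. */
struct nbd_model {
    uint32_t flags;
};

/* Models NBD_SET_FLAGS: the client passes on the flags sent by the server. */
void model_set_flags(struct nbd_model *nbd, uint32_t flags)
{
    nbd->flags = flags;
}

/* Would flush requests be forwarded to the server in this session? */
bool model_sends_flush(const struct nbd_model *nbd)
{
    return (nbd->flags & NBD_FLAG_SEND_FLUSH) != 0;
}

/*
 * Models the end of NBD_DO_IT. With clear_flags == false the flags
 * survive into the next session (the old behaviour); with true they
 * are reset (the behaviour after the fix).
 */
void model_do_it_done(struct nbd_model *nbd, bool clear_flags)
{
    if (clear_flags)
        nbd->flags = 0;
}
```

If the first session sets `NBD_FLAG_SEND_FLUSH` and the flags are not cleared, a second client/server pair that never calls the set-flags step still appears to support flush, which matches the failure mode in the message.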

drbd: convert to idr_alloc() · 56de2102

Committed by Tejun Heo on Feb 27, 2013

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

56de2102

block/loop: convert to idr_alloc() · c718aa65

Committed by Tejun Heo on Feb 27, 2013

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c718aa65

block/loop: don't use idr_remove_all() · 9d609166

Committed by Tejun Heo on Feb 27, 2013

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9d609166

27 Feb, 2013 3 commits

libceph: update osd request/reply encoding · 1b83bef2

Committed by Sage Weil on Feb 25, 2013

Use the new version of the encoding for osd requests and replies.  In the
process, update the way we are tracking request ops and reply lengths and
results in the struct ceph_osd_request.  Update the rbd and fs/ceph users
appropriately.

The main changes are:
 - we keep pointers into the request memory for fields we need to update
   each time the request is sent out over the wire
 - we keep information about the result in an array in the request struct
   where the users can easily get at it.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

1b83bef2

rbd: pass length, not op for osd completions · c47f9371

Committed by Alex Elder on Feb 26, 2013

The only thing type-specific osd completion functions do with their
osd op parameter is (in some cases) extract the number of bytes
transferred from it.  In the other cases, the xferred bytes field
is not used, and total message data transfer byte count (which may
well be zero) is used.

Just set the object request transfer count in the main osd request
callback function and provide that to the other routines.  There is
then no longer any need to pass the op pointer to the type-specific
completion routines, so drop those parameters.

Stop doing anything with the total message data length.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

c47f9371

rbd: move rbd_osd_trivial_callback() · 39bf2c5d

Committed by Alex Elder on Feb 26, 2013

This function is slightly out of place, probably the result
of an errant automatic merge or something.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

39bf2c5d

26 Feb, 2013 5 commits

switch vfs_getattr() to struct path · 3dadecce

Committed by Al Viro on Jan 24, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3dadecce

rbd: eliminate sparse warnings · cc344fa1

Committed by Alex Elder on Feb 19, 2013

Fengguang Wu reminded me that there were outstanding sparse reports
in the ceph and rbd code.  This patch fixes these problems in rbd
that lead to those reports:
    - Convert functions that are never referenced externally to have
      static scope.
    - Add a lockdep annotation to rbd_request_fn(), because it
      releases a lock before acquiring it again.

This partially resolves:
    http://tracker.ceph.com/issues/4184

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

cc344fa1

rbd: normalize dout() calls · 37206ee5

Committed by Alex Elder on Feb 20, 2013

Add dout() calls to facilitate tracing of image and object requests.
Change a few existing calls so they use __func__ rather than the
hard-coded function name.  Have calls always add ":" after the name
of the function, and prefix pointer values with a consistent tag
indicating what it represents.  (Note that there remain some older
dout() calls that are left untouched by this patch.)

Issue a warning if rbd_osd_write_callback() ever gets a short write.

This resolves:
    http://tracker.ceph.com/issues/4235

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

37206ee5

rbd: barriers are hard · 632b88ca

Committed by Alex Elder on Feb 21, 2013

Let's go shopping!

I'm afraid this may not have gotten it right:
    07741308  rbd: add barriers near done flag operations

The smp_wmb() should have been done *before* setting the done flag,
to ensure all other data was valid before marking the object request
done.

Switch to use atomic_inc_return() here to set the done flag, which
allows us to verify we don't mark something done more than once.
Doing this also implies general barriers before and after the call.

And although a read memory barrier might have been sufficient before
reading the done flag, convert this to a full memory barrier just
to put this issue to bed.

This resolves:
    http://tracker.ceph.com/issues/4238

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

632b88ca
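
The done-flag pattern from this commit can be sketched in userspace with C11 atomics. This is an analogy under stated assumptions, not the rbd code: `struct obj_request_model`, `model_mark_done()` and `model_is_done()` are hypothetical names, and a sequentially consistent `atomic_fetch_add()` stands in for the kernel's atomic_inc_return().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object request. */
struct obj_request_model {
    int result;       /* data that must be visible before "done" is seen */
    atomic_int done;  /* 0 = still pending, non-zero = done */
};

/*
 * Publish the result, then set the done flag. The sequentially
 * consistent read-modify-write orders the plain store to `result`
 * before the flag update. Returns true only for the first caller,
 * so marking a request done twice can be detected.
 */
bool model_mark_done(struct obj_request_model *req, int result)
{
    req->result = result;
    int previous = atomic_fetch_add(&req->done, 1);
    return previous == 0;
}

/* Sequentially consistent load of the flag before the data is read. */
bool model_is_done(struct obj_request_model *req)
{
    return atomic_load(&req->done) != 0;
}
```

A reader that sees `model_is_done()` return true may then read `result` safely, because the flag update was ordered after the store to it.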

rbd: ignore zero-length requests · 4dda41d3

Committed by Alex Elder on Feb 20, 2013

The old request code simply ignored zero-length requests.  We should
still operate that same way to avoid any changes in behavior.  We
can implement handling for special zero-length requests separately
(see http://tracker.ceph.com/issues/4236).

Add some assertions based on this new constraint.

This resolves:
    http://tracker.ceph.com/issues/4237

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

4dda41d3

23 Feb, 2013 1 commit

new helper: file_inode(file) · 496ad9aa

Committed by Al Viro on Jan 23, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

496ad9aa

22 Feb, 2013 6 commits

loopdev: ignore negative offset when calculate loop device size · b7a1da69

Committed by Guo Chao on Feb 21, 2013

A negative offset may make the loop device size larger than the backing
file size.

 $ fallocate -l 1M a
 $ losetup --offset 0xffffffffffff0000 /dev/loop0 a
 $ blockdev --getsize64 /dev/loop0
 1114112
 $ ls -l a
 -rw-r--r-- 1 root root 1048576 Jan 23 12:46 a
 $ cat /dev/loop0
 cat: /dev/loop0: Input/output error

It makes no sense to do that. Only apply the offset when it is positive.

While here, fix a typo in the comment.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: M. Hindess <hindessm@uk.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b7a1da69
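
The arithmetic behind the example can be reproduced with a small userspace model. This is a sketch, not the driver code: `model_loop_sectors()` and `model_loop_bytes_old()` are hypothetical names, and the "old" variant simply assumes the offset used to be subtracted unconditionally, which is consistent with the 1114112 bytes shown above (0xffffffffffff0000 read as a signed 64-bit value is -65536).

```c
#include <stdint.h>

/*
 * Model of the size calculation after the fix, in 512-byte sectors.
 * Only a positive offset shrinks the device; a sizelimit of 0 means
 * "no limit". Hypothetical helper, not the kernel function.
 */
int64_t model_loop_sectors(int64_t file_bytes, int64_t offset, int64_t sizelimit)
{
    int64_t size = file_bytes;

    if (offset > 0)
        size -= offset;
    if (size < 0)            /* offset lies beyond the end of the file */
        return 0;
    if (sizelimit > 0 && sizelimit < size)
        size = sizelimit;

    return size >> 9;        /* bytes to 512-byte sectors */
}

/* Assumed old behaviour: the offset is subtracted even when negative. */
int64_t model_loop_bytes_old(int64_t file_bytes, int64_t offset)
{
    return file_bytes - offset;
}
```

For the 1 MiB file in the example, the old calculation yields 1048576 + 65536 = 1114112 bytes, while the fixed one yields 2048 sectors, exactly the size of the backing file.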

loopdev: remove an user triggerable oops · b1a66504

Committed by Guo Chao on Feb 21, 2013

When loopdev is built as a module and we pass an invalid parameter,
loop_init() returns directly without deregistering the misc device, which
causes an oops the next time the loop module is inserted because we left
some garbage in the misc device list.

Test case:
sudo modprobe loop max_part=1024
(failed due to invalid parameter)
sudo modprobe loop
(oops)

Clean up nicely to avoid such oops.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: M. Hindess <hindessm@uk.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b1a66504

loopdev: move common code into loop_figure_size() · 7b0576a3

Committed by Guo Chao on Feb 21, 2013

Update the block device size in accordance with the gendisk size and
notify userspace of the change in loop_figure_size(). This cleanup removes
code duplicated in loop_figure_size()'s two callers.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: M. Hindess <hindessm@uk.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7b0576a3

loopdev: update block device size in loop_set_status() · 541c742a

Committed by Guo Chao on Feb 21, 2013

The loop device driver sometimes fails to impose the size limit on the
device. Repeatedly issue the following two commands:

losetup --offset 7517244416 --sizelimit 3224971264 /dev/loop0 backed_file
blockdev --getsize64 /dev/loop0

blockdev reports the file size instead of sizelimit several times out of 100.

The problems are:

	- losetup sets up the device in two ioctls:
		  LOOP_SET_FD and LOOP_SET_STATUS64.

	- LOOP_SET_STATUS64 only updates the size of the gendisk.

The block device size is updated lazily when the device comes into use. If
udev rushes in between the two ioctls, it will bring in a block device whose
size is the backing file size. If the device is not released after the
LOOP_SET_STATUS64 ioctl, blockdev will not see the updated size.

Update the block device size in the LOOP_SET_STATUS64 ioctl.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Reported-by: M. Hindess <hindessm@uk.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

541c742a

loopdev: fix a deadlock · 5370019d

Committed by Guo Chao on Feb 21, 2013

bd_mutex and lo_ctl_mutex can be acquired in different orders.

Path #1:

blkdev_open
 blkdev_get
  __blkdev_get (hold bd_mutex)
   lo_open (hold lo_ctl_mutex)

Path #2:

blkdev_ioctl
 lo_ioctl (hold lo_ctl_mutex)
  lo_set_capacity (hold bd_mutex)

Lockdep does not report it, because path #2 actually holds a subclass of
lo_ctl_mutex.  This subclass seems to have crept into the code by mistake.
The patch author actually just mentioned it in the changelog, see commit
f028f3b2 ("loop: fix circular locking in loop_clr_fd()"), also see:

	http://marc.info/?l=linux-kernel&m=123806169129727&w=2

Path #2 holds bd_mutex to call bd_set_size(). I've protected it
with i_mutex in a previous patch, so drop bd_mutex at this site.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: M. Hindess <hindessm@uk.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5370019d

drivers/block/swim3.c: fix null pointer dereference · 7414d4f6

Committed by Cong Ding on Feb 21, 2013

The use of pointer fs should be after the null check.

Signed-off-by: Cong Ding <dinggnu@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7414d4f6

20 Feb, 2013 9 commits

libceph: drop return value from page vector copy routines · 903bb32e

Committed by Alex Elder on Feb 06, 2013

The return values provided for ceph_copy_to_page_vector() and
ceph_copy_from_page_vector() serve no purpose, so get rid of them.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

903bb32e

rbd: ignore result of ceph_copy_from_page_vector() · 23ed6e13

Committed by Alex Elder on Feb 06, 2013

The result of ceph_copy_from_page_vector() is simply the length
argument it is provided.

This is called by rbd_obj_method_sync(), which returns the result if
it's non-negative.  But we always either ignore or overwrite that
return value.  So explicitly ignore what's returned by the copy
function, and have rbd_obj_method_sync() always return either a
negative errno or 0.

We also return the result of ceph_copy_from_page_vector() in
rbd_obj_read_sync().  There we still want to return the number of
bytes transferred, but we can use the value we already have in hand
rather than what ceph_copy_from_page_vector() provides.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

23ed6e13

rbd: prevent bytes transferred overflow · 1ceae7ef

Committed by Alex Elder on Feb 06, 2013

In rbd_obj_read_sync(), verify the number of bytes transferred won't
exceed what can be represented by a size_t before using it to
indicate the number of bytes to copy to the result buffer.

(The real motivation for this is to prepare for the next patch.)

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

1ceae7ef

libceph: allow STAT osd operations · fbfab539

Committed by Alex Elder on Feb 08, 2013

Add support for CEPH_OSD_OP_STAT operations in the osd client
and in rbd.

This operation sends no data to the osd; everything required is
encoded in identity of the target object.

The result will be ENOENT if the object doesn't exist.  If it does
exist and no other error occurs, the server returns the size and last
modification time of the target object as output data (in little
endian format).  The size is a 64-bit unsigned integer and the time is
a ceph_timespec structure (two unsigned 32-bit integers, representing
a seconds and a nanoseconds value).

This resolves:
    http://tracker.ceph.com/issues/4007

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

fbfab539
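
The reply layout described in this commit (a 64-bit size followed by two 32-bit time fields, all little endian) can be decoded as in the sketch below. It assumes the fields are packed back to back in that order with no padding; `struct stat_reply_model` and `decode_stat_reply()` are hypothetical names, not the libceph API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of a STAT reply payload. */
struct stat_reply_model {
    uint64_t size;     /* object size in bytes */
    uint32_t tv_sec;   /* modification time, seconds */
    uint32_t tv_nsec;  /* modification time, nanoseconds */
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a little-endian 64-bit value as two 32-bit halves. */
static uint64_t get_le64(const uint8_t *p)
{
    return (uint64_t)get_le32(p) | ((uint64_t)get_le32(p + 4) << 32);
}

/* Returns 0 on success, -1 if the buffer is too short (assumed 16 bytes). */
int decode_stat_reply(const uint8_t *buf, size_t len,
                      struct stat_reply_model *out)
{
    if (len < 16)
        return -1;

    out->size = get_le64(buf);
    out->tv_sec = get_le32(buf + 8);
    out->tv_nsec = get_le32(buf + 12);
    return 0;
}
```

Decoding byte by byte avoids any dependence on the host's endianness or on struct padding.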

rbd: add parentheses to object request iterator macros · ef06f4d3

Committed by Alex Elder on Feb 08, 2013

The for_each_obj_request*() macros should parenthesize their uses of
the ireq parameter.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

ef06f4d3
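
Why the parentheses matter can be seen in a small standalone example. The macro and struct here (`IMG_OBJ_COUNT`, `struct img_request_model`) are hypothetical and much simpler than the rbd iterator macros; the point is only the general rule that a macro parameter should be wrapped in parentheses wherever it is used.

```c
#include <stddef.h>

/* Hypothetical stand-in for an image request. */
struct img_request_model {
    int id;
    int obj_count;
};

/*
 * The parameter is parenthesized, so `->` applies to the whole argument.
 * Written as (ireq->obj_count), an argument such as `base + i` would
 * expand to `base + i->obj_count`, which does not even compile here.
 */
#define IMG_OBJ_COUNT(ireq) ((ireq)->obj_count)

/* Sum the object counts of n requests, passing pointer expressions. */
int total_obj_count(struct img_request_model *base, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++)
        total += IMG_OBJ_COUNT(base + i);  /* ((base + i)->obj_count) */

    return total;
}
```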

xen-blkback: use balloon pages for persistent grants · 087ffecd

Committed by Roger Pau Monne on Feb 14, 2013

With current persistent grants implementation we are not freeing the
persistent grants after we disconnect the device. Since grant map
operations change the mfn of the allocated page, and we can no longer
pass it to __free_page without setting the mfn to a sane value, use
balloon grant pages instead, as the gntdev device does.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: stable@vger.kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

087ffecd

xen-blkfront: drop the use of llist_for_each_entry_safe · f84adf49

Committed by Konrad Rzeszutek Wilk on Feb 13, 2013

Replace llist_for_each_entry_safe with a while loop.

llist_for_each_entry_safe can trigger a bug in GCC 4.1, so it's best
to remove it and use a while loop and do the deletion manually.

Specifically this bug can be triggered by hot-unplugging a disk, either
by doing xm block-detach or by save/restore cycle.

BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [<ffffffffa0047223>] blkif_free+0x63/0x130 [xen_blkfront]
The crash call trace is:
	...
bad_area_nosemaphore+0x13/0x20
do_page_fault+0x25e/0x4b0
page_fault+0x25/0x30
? blkif_free+0x63/0x130 [xen_blkfront]
blkfront_resume+0x46/0xa0 [xen_blkfront]
xenbus_dev_resume+0x6c/0x140
pm_op+0x192/0x1b0
device_resume+0x82/0x1e0
dpm_resume+0xc9/0x1a0
dpm_resume_end+0x15/0x30
do_suspend+0x117/0x1e0

When drilling down to the assembler code, on newer GCC it does
.L29:
        cmpq    $-16, %r12      #, persistent_gnt check
        je      .L30    	#, out of the loop
.L25:
	... code in the loop
        testq   %r13, %r13      # n
        je      .L29    	#, back to the top of the loop
        cmpq    $-16, %r12      #, persistent_gnt check
        movq    16(%r12), %r13  # <variable>.node.next, n
        jne     .L25    	#,	back to the top of the loop
.L30:

While on GCC 4.1, it is:
.L78:
	... code in the loop
	testq   %r13, %r13      # n
        je      .L78    #,	back to the top of the loop
        movq    16(%rbx), %r13  # <variable>.node.next, n
        jmp     .L78    #,	back to the top of the loop

Which basically means that the exit loop condition instead of
being:

	&(pos)->member != NULL;

is:
	;

which makes the loop unbounded.

Since xen-blkfront is the only user of the llist_for_each_entry_safe
macro remove it from llist.h.

Orabug: 16263164
Cc: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

f84adf49
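
The replacement described in this commit, a plain while loop that deletes entries by hand, follows a familiar pattern: read the next pointer before releasing the current node. The sketch below shows it on an ordinary singly linked list in userspace; `struct grant_node`, `grant_push()` and `grant_free_all()` are hypothetical and do not use the kernel's llist API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node standing in for a persistent grant entry. */
struct grant_node {
    struct grant_node *next;
    int gref;
};

/* Push a new node at the head; returns the new head (old head on failure). */
struct grant_node *grant_push(struct grant_node *head, int gref)
{
    struct grant_node *node = malloc(sizeof(*node));

    if (!node)
        return head;
    node->gref = gref;
    node->next = head;
    return node;
}

/*
 * Free every node with an explicit while loop. The loop condition is a
 * plain NULL test on the node pointer itself, and `next` is read before
 * the node is freed. Returns the number of nodes freed.
 */
size_t grant_free_all(struct grant_node *head)
{
    size_t freed = 0;

    while (head) {
        struct grant_node *next = head->next;

        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

Testing the node pointer directly, rather than the address of a member computed from it, keeps the exit condition simple enough that it should not depend on how a particular compiler treats the member-address comparison.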

xen/blkback: Don't trust the handle from the frontend. · 01c681d4

Committed by Konrad Rzeszutek Wilk on Jan 16, 2013

The 'handle' is the device that the request is from. For the life-time
of the ring we copy it from a request to a response so that the frontend
is not surprised by it. But we do not need it - when we start processing
I/Os we have our own 'struct phys_req' which has only most essential
information about the request. In fact the 'vbd_translate' ends up
over-writing the preq.dev with a value from the backend.

This assignment of preq.dev with the 'handle' value is superfluous
so let's not do it.

Cc: stable@vger.kernel.org
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

01c681d4

xen-blkback: do not leak mode property · 9d092603

Committed by Jan Beulich on Dec 20, 2012

"be->mode" is obtained from xenbus_read(), which does a kmalloc() for
the message body. The short string is never released, so do it along
with freeing "be" itself, and make sure the string isn't kept when
backend_changed() doesn't complete successfully (which made it
desirable to slightly re-structure that function, so that the error
cleanup can be done in one place).

Reported-by: Olaf Hering <olaf@aepfle.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

9d092603

19 Feb, 2013 2 commits

block: IBM RamSan 70/80 driver fixes · c206c709

Committed by Philip J Kelleher on Feb 18, 2013

This patch includes the following driver fixes for the
IBM RamSan 70/80 driver:

o Changed the creg_ctrl lock from a mutex to a spinlock.
o Added a count check for ioctl calls.
o Removed unnecessary casting of void pointers.
o Made every function static that needed to be.
o Added comments to explain things more thoroughly.

Signed-off-by: Philip J Kelleher <pjk1939@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c206c709

libceph: kill ceph_osdc_create_event() "one_shot" parameter · 3c663bbd

Committed by Alex Elder on Feb 15, 2013

There is only one caller of ceph_osdc_create_event(), and it
provides 0 as its "one_shot" argument.  Get rid of that argument and
just use 0 in its place.

Replace the code in handle_watch_notify() that executes if one_shot
is nonzero in the event with a BUG_ON() call.

While modifying "osd_client.c", give handle_watch_notify() static
scope.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

3c663bbd
