提交 · d11bb4462c4cc6ddd45c6927c617ad79fa6fb8fc · openeuler / Kernel

22 8月, 2011 2 次提交

xen-blkback: fixed indentation and comments · 1bc05b0a

由 Joe Jin 提交于 8月 15, 2011

This patch fixes belows:

1. Fix code style issue.
2. Fix incorrect functions name in comments.
Signed-off-by: NJoe Jin <joe.jin@oracle.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

1bc05b0a

xen-blkback: Don't disconnect backend until state switched to XenbusStateClosed. · 6f5986bc

由 Joe Jin 提交于 8月 15, 2011

When do block-attach/block-detach test with below steps, umount hangs
in the guest. Furthermore shutdown ends up being stuck when umounting file-systems.

1. start guest.
2. attach new block device by xm block-attach in Dom0.
3. mount new disk in guest.
4. execute xm block-detach to detach the block device in dom0 until timeout
5. Any request to the disk will hung.

Root cause:
This issue is caused when setting backend device's state to
'XenbusStateClosing', which sends to the frontend the XenbusStateClosing
notification. When frontend receives the notification it tries to release
the disk in blkfront_closing(), but at that moment the disk is still in use
by guest, so frontend refuses to close. Specifically it sets the disk state to
XenbusStateClosing and sends the notification to backend - when backend receives the
event, it disconnects the vbd from real device, and sets the vbd device state to
XenbusStateClosing. The backend disconnects the real device/file, and any IO
requests to the disk in guest will end up in ether, leaving disk DEAD and set to
XenbusStateClosing. When the guest wants to disconnect the disk, umount will
hang on blkif_release()->xlvbd_release_gendisk() as it is unable to send any IO
to the disk, which prevents clean system shutdown.

Solution:
Don't disconnect backend until frontend state switched to XenbusStateClosed.
Signed-off-by: NJoe Jin <joe.jin@oracle.com>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
[v1: Modified description a bit]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

6f5986bc

09 8月, 2011 1 次提交

xen/blkback: Make description more obvious. · ea5e1161

由 Konrad Rzeszutek Wilk 提交于 8月 03, 2011

With the frontend having Xen but the backend not, it just looks odd:

  <*>   Xen virtual block device support
  <*>   Block-device backend driver

Fix it to have the 'Xen' in front of it.
Reported-by: NSander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ea5e1161

03 8月, 2011 1 次提交

block: swim3: fix unterminated of_device_id table · f41c53a5

由 Axel Lin 提交于 8月 03, 2011

of_device_id structures need a NULL terminating entry, add it.
Signed-off-by: NAxel Lin <axel.lin@gmail.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

f41c53a5

02 8月, 2011 1 次提交

drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse · ddad9ef5

由 H Hartley Sweeten 提交于 8月 02, 2011

The buffer 'sc.cpu_mask' is a kernel buffer.  If bitmap_parse is used
instead of __bitmap_parse the extra parameter that indicates a kernel
buffer is not needed.
Signed-off-by: NH Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

ddad9ef5

01 8月, 2011 4 次提交

loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other · 05eb0f25

由 Kay Sievers 提交于 7月 31, 2011

LOOP_CLR_FD takes lo->lo_ctl_mutex and tries to remove the loop sysfs
files. Sysfs calls show() and waits for lo->lo_ctl_mutex. LOOP_CLR_FD
waits for show() to finish to remove the sysfs file.

  cat /sys/class/block/loop0/loop/backing_file
    mutex_lock_nested+0x176/0x350
    ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    dev_attr_show+0x1b/0x60
    ? sysfs_read_file+0x86/0x1a0
    ? __get_free_pages+0x12/0x50
    sysfs_read_file+0xaf/0x1a0

  ioctl(LOOP_CLR_FD):
    wait_for_common+0x12c/0x180
    ? try_to_wake_up+0x2a0/0x2a0
    wait_for_completion+0x18/0x20
    sysfs_deactivate+0x178/0x180
    ? sysfs_addrm_finish+0x43/0x70
    ? sysfs_addrm_start+0x1d/0x20
    sysfs_addrm_finish+0x43/0x70
    sysfs_hash_and_remove+0x85/0xa0
    sysfs_remove_group+0x59/0x100
    loop_clr_fd+0x1dc/0x3f0 [loop]
    lo_ioctl+0x223/0x7a0 [loop]

Instead of taking the lo_ctl_mutex from sysfs code, take the inner
lo->lo_lock, to protect the access to the backing_file data.

Thanks to Tejun for help debugging and finding a solution.

Cc: Milan Broz <mbroz@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
Cc: stable@kernel.org
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

05eb0f25

loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices · d134b00b

由 Kay Sievers 提交于 7月 31, 2011

Instead of unconditionally creating a fixed number of dead loop
devices which need to be investigated by storage handling services,
even when they are never used, we allow distros start with 0
loop devices and have losetup(8) and similar switch to the dynamic
/dev/loop-control interface instead of searching /dev/loop%i for free
devices.
Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

d134b00b

loop: add management interface for on-demand device allocation · 770fe30a

由 Kay Sievers 提交于 7月 31, 2011

Loop devices today have a fixed pre-allocated number of usually 8.
The number can only be changed at module init time. To find a free
device to use, /dev/loop%i needs to be scanned, and all devices need
to be opened until a free one is possibly found.

This adds a new /dev/loop-control device node, that allows to
dynamically find or allocate a free device, and to add and remove loop
devices from the running system:
 LOOP_CTL_ADD adds a specific device. Arg is the number
 of the device. It returns the device i or a negative
 error code.

 LOOP_CTL_REMOVE removes a specific device, Arg is the
 number the device. It returns the device i or a negative
 error code.

 LOOP_CTL_GET_FREE finds the next unbound device or allocates
 a new one. No arg is given. It returns the device i or a
 negative error code.

The loop kernel module gets automatically loaded when
/dev/loop-control is accessed the first time. The alias
specified in the module, instructs udev to create this
'dead' device node, even when the module is not loaded.

Example:
 cfd = open("/dev/loop-control", O_RDWR);

 # add a new specific loop device
 err = ioctl(cfd, LOOP_CTL_ADD, devnr);

 # remove a specific loop device
 err = ioctl(cfd, LOOP_CTL_REMOVE, devnr);

 # find or allocate a free loop device to use
 devnr = ioctl(cfd, LOOP_CTL_GET_FREE);

 sprintf(loopname, "/dev/loop%i", devnr);
 ffd = open("backing-file", O_RDWR);
 lfd = open(loopname, O_RDWR);
 err = ioctl(lfd, LOOP_SET_FD, ffd);

Cc: Tejun Heo <tj@kernel.org>
Cc: Karel Zak  <kzak@redhat.com>
Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

770fe30a

loop: replace linked list of allocated devices with an idr index · 34dd82af

由 Kay Sievers 提交于 7月 31, 2011

Replace the linked list, that keeps track of allocated devices, with an
idr index to allow a more efficient lookup of devices.

Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: NKay Sievers <kay.sievers@vrfy.org>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

34dd82af

27 7月, 2011 3 次提交

atomic: use <linux/atomic.h> · 60063497

由 Arun Sharma 提交于 7月 26, 2011

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: NArun Sharma <asharma@fb.com>
Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: NMike Frysinger <vapier@gentoo.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

60063497

rbd: set blk_queue request sizes to object size · 029bcbd8

由 Josh Durgin 提交于 7月 22, 2011

This improves performance since more requests can be merged.
Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NJosh Durgin <josh.durgin@dreamhost.com>

029bcbd8

rbd: cancel watch request when releasing the device · 79e3057c

由 Yehuda Sadeh 提交于 7月 12, 2011

We were missing this cleanup, so when a device was released
the osd didn't clean up its watchers list, so following notifications
could be slow as osd needed to timeout on the client.
Signed-off-by: NYehuda Sadeh <yehuda@hq.newdream.net>

79e3057c

20 7月, 2011 1 次提交
- A
  kill useless checks for sb->s_op == NULL · e7f59097
  由 Al Viro 提交于 7月 07, 2011
```
never is...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  e7f59097
15 7月, 2011 2 次提交

xen-blkfront: Fix one off warning about name clash · 89153b5c

由 Stefan Bader 提交于 7月 14, 2011

Avoid telling users to use xvde and onwards when using xvde.
Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

89153b5c

xen-blkfront: Drop name and minor adjustments for emulated scsi devices · 196cfe2a

由 Stefan Bader 提交于 7月 14, 2011

These were intended to avoid the namespace clash when representing
emulated IDE and SCSI devices. However that seems to confuse users
more than expected (a disk defined as sda becomes xvde).
So for now go back to the scheme which does no adjustments. This
will break when mixing IDE and SCSI names in the configuration of
guests but should be by now expected.
Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NStefan Bader <stefan.bader@canonical.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

196cfe2a

14 7月, 2011 1 次提交

dt: remove extra xsysace platform_driver registration · 5d10302f

由 Grant Likely 提交于 7月 14, 2011

After commit 1c48a5c9, "dt: Eliminate
of_platform_{,un}register_driver", the xsysace driver attempts to
register two platform_drivers with the same name, which a) doesn't
work, and b) isn't necessary.  This patch merges the two
platform_drivers.
Reported-by: NDaniel Hellstrom <daniel@gaisler.com>
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

5d10302f

09 7月, 2011 1 次提交

cciss: do not attempt to read from a write-only register · 07d0c38e

由 Stephen M. Cameron 提交于 7月 09, 2011

Most smartarrays will tolerate it, but some new ones don't.
Signed-off-by: NStephen M. Cameron <scameron@beardog.cce.hp.com>

Note: this is a regression caused by commit 1ddd5049
Cc: stable@kernel.org
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

07d0c38e

01 7月, 2011 3 次提交

xen/blkback: Add module alias for autoloading · a7e9357f

由 Bastian Blank 提交于 6月 29, 2011

Add xen-backend:vbd module alias to the xen-blkback module. This allows
automatic loading of the module.
Signed-off-by: NBastian Blank <waldi@debian.org>
Acked-by: NIan Campbell <ian.campbell@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

a7e9357f

xen/blkback: Don't let in-flight requests defer pending ones. · b4726a9d

由 Daniel Stodden 提交于 5月 28, 2011

Running RING_FINAL_CHECK_FOR_REQUESTS from make_response is a bad
idea. It means that in-flight I/O is essentially blocking continued
batches. This essentially kills throughput on frontends which unplug
(or even just notify) early and rightfully assume addtional requests
will be picked up on time, not synchronously.
Signed-off-by: NDaniel Stodden <daniel.stodden@citrix.com>
[v1: Rebased and fixed compile problems]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

b4726a9d

xen: Add __attribute__((format(printf... where appropriate · 08b8bfc1

由 Joe Perches 提交于 6月 12, 2011

Use the compiler to verify printf formats and arguments.

Fix fallout.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

08b8bfc1

30 6月, 2011 6 次提交

drbd: we should write meta data updates with FLUSH FUA · 86e1e98e

由 Lars Ellenberg 提交于 6月 28, 2011

We used to write these with BIO_RW_BARRIER aka REQ_HARDBARRIER (unless
disabled in the configuration). The correct semantic now would be to
write with FLUSH/FUA.
For example, with activity log transactions, FUA alone is not enough, we
need the corresponding bitmap update (and all related application
updates) on stable storage as well.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

86e1e98e

drbd: when receive times out on meta socket, also check last receive time on data socket · cb6518cb

由 Lars Ellenberg 提交于 6月 20, 2011

If we have an asymetrically congested network, we may send P_PING,
but due to congestion, the corresponding P_PING_ACK would time out,
and we would drop a (congested, but otherwise) healthy connection
("PingAck did not arrive in time.")
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

cb6518cb

drbd: account bitmap IO during resync as resync-(related-)-io · 5a8b4242

由 Lars Ellenberg 提交于 6月 14, 2011

If we have a good resync rate, we will frequently update the on-disk
bitmap, which, if not accounted for as resync io, may let an otherwise
idle device appear to be "busy", and cause us to throttle resync.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

5a8b4242

drbd: don't cond_resched_lock with IRQs disabled · 8ccee20e

由 Lars Ellenberg 提交于 6月 06, 2011

The last commit, drbd: add missing spinlock to bitmap receive,
introduced a cond_resched_lock(), where the lock in question is taken
with irqs disabled.

As we must not schedule with IRQs disabled,
and cond_resched_lock_irq() does not exist, yet,
we re-aquire the spin_lock_irq() for each bitmap page processed in turn.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

8ccee20e

drbd: add missing spinlock to bitmap receive · 829c6087

由 Lars Ellenberg 提交于 6月 03, 2011

During bitmap exchange, when using the RLE bitmap compression scheme,
we have a code path that can set the whole bitmap at once.

To avoid holding spin_lock_irq() for too long, we used to lock out other
bitmap modifications during bitmap exchange by other means, and then,
knowing we have exclusive access to the bitmap, modify it without
the spinlock, and with IRQs enabled.

Since we now allow local IO to continue, potentially setting additional
bits during the bitmap receive phase, this is no longer true, and we get
uncoordinated updates of bitmap members, causing bm_set to no longer
accurately reflect the total number of set bits.

To actually see this, you'd need to have a large bitmap, use RLE bitmap
compression, and have busy IO during sync handshake and bitmap exchange.

Fix this by taking the spin_lock_irq() in this code path as well, but
calling cond_resched_lock() after each page worth of bits processed.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

829c6087

drbd: Use the correct max_bio_size when creating resync requests · 0cfdd247

由 Philipp Reisner 提交于 5月 25, 2011

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

0cfdd247

09 6月, 2011 1 次提交

i8253: Create linux/i8253.h and use it in all 8253 related files · 334955ef

由 Ralf Baechle 提交于 6月 01, 2011

Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20110601180610.054254048@duck.linux-mips.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

 arch/arm/mach-footbridge/isa-timer.c |    2 +-
 arch/mips/cobalt/time.c              |    2 +-
 arch/mips/jazz/irq.c                 |    2 +-
 arch/mips/kernel/i8253.c             |    2 +-
 arch/mips/mti-malta/malta-time.c     |    2 +-
 arch/mips/sgi-ip22/ip22-time.c       |    2 +-
 arch/mips/sni/time.c                 |    2 +-
 arch/x86/kernel/apic/apic.c          |    2 +-
 arch/x86/kernel/apm_32.c             |    2 +-
 arch/x86/kernel/hpet.c               |    2 +-
 arch/x86/kernel/i8253.c              |    2 +-
 arch/x86/kernel/time.c               |    2 +-
 drivers/block/hd.c                   |    2 +-
 drivers/clocksource/i8253.c          |    2 +-
 drivers/input/gameport/gameport.c    |    2 +-
 drivers/input/joystick/analog.c      |    2 +-
 drivers/input/misc/pcspkr.c          |    2 +-
 include/linux/i8253.h                |   11 +++++++++++
 sound/drivers/pcsp/pcsp.h            |    2 +-
 19 files changed, 29 insertions(+), 18 deletions(-)

334955ef

02 6月, 2011 1 次提交

block: fix mismerge of the DISK_EVENT_MEDIA_CHANGE removal · 0f48f260

由 Linus Torvalds 提交于 6月 02, 2011

Jens' back-merge commit 698567f3 ("Merge commit 'v2.6.39' into
for-2.6.40/core") was incorrectly done, and re-introduced the
DISK_EVENT_MEDIA_CHANGE lines that had been removed earlier in commits

 - 9fd097b1 ("block: unexport DISK_EVENT_MEDIA_CHANGE for
   legacy/fringe drivers")

 - 7eec77a1 ("ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd
   and ide-cd")

because of conflicts with the "g->flags" updates near-by by commit
d4dc210f ("block: don't block events on excl write for non-optical
devices")

As a result, we re-introduced the hanging behavior due to infinite disk
media change reports.

Tssk, tssk, people! Don't do back-merges at all, and *definitely* don't
do them to hide merge conflicts from me - especially as I'm likely
better at merging them than you are, since I do so many merges.
Reported-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0f48f260

01 6月, 2011 2 次提交

xen/blkback: potential null dereference in error handling · 9b83c771

由 Dan Carpenter 提交于 5月 27, 2011

blkbk->pending_pages can be NULL here so I added a check for it.
Signed-off-by: NDan Carpenter <error27@gmail.com>
[v1: Redid the loop a bit]
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

9b83c771

xen/blkback: don't call vbd_size() if bd_disk is NULL · 6464920a

由 Laszlo Ersek 提交于 5月 25, 2011

...because vbd_size() dereferences bd_disk if bd_part is NULL.

Signed-off-by: Laszlo Ersek<lersek@redhat.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

6464920a

30 5月, 2011 2 次提交

drivers, block: virtio_blk: Replace cryptic number with the macro · 6917f83f

由 Liu Yuan 提交于 4月 24, 2011

It is easier to figure out the context by reading SCSI_SENSE_BUFFERSIZE
instead of plain '96'.
Signed-off-by: NLiu Yuan <tailai.ly@taobao.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

6917f83f

virtio_blk: allow re-reading config space at runtime · 7a7c924c

由 Christoph Hellwig 提交于 2月 01, 2011

Wire up the virtio_driver config_changed method to get notified about
config changes raised by the host.  For now we just re-read the device
size to support online resizing of devices, but once we add more
attributes that might be changeable they could be added as well.

Note that the config_changed method is called from irq context, so
we'll have to use the workqueue infrastructure to provide us a proper
user context for our changes.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

7a7c924c

29 5月, 2011 1 次提交

x86 idle floppy: deprecate disable_hlt() · 3b70b2e5

由 Len Brown 提交于 4月 01, 2011

Plan to remove floppy_disable_hlt in 2012, an ancient
workaround with comments that it should be removed.

This allows us to remove clutter and a run-time branch
from the idle code.

WARN_ONCE() on invocation until it is removed.

cc: x86@kernel.org
cc: stable@kernel.org # .39.x
Signed-off-by: NLen Brown <len.brown@intel.com>

3b70b2e5

28 5月, 2011 3 次提交

nbd: adjust 'max_part' according to part_shift · 5988ce23

由 Namhyung Kim 提交于 5月 28, 2011

The 'max_part' parameter determines how many partitions are supported
on each nbd device. However the actual number can be changed to the
power of 2 minus 1 form during the module initialization as
alloc_disk() is called with (1 << part_shift) for some reason.

So adjust 'max_part' also at least for consistency with loop and brd.
It is exported via sysfs already, and a user should check this value
after module loading if [s]he wants to use that number correctly
(i.e. fdisk or something).
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

5988ce23

nbd: limit module parameters to a sane value · 3b271082

由 Namhyung Kim 提交于 5月 28, 2011

The 'max_part' parameter controls the number of maximum partition
a nbd device can have. However if a user specifies very large
value it would exceed the limitation of device minor number and
can cause a kernel oops (or, at least, produce invalid device
nodes in some cases).

In addition, specifying large 'nbds_max' value causes same
problem for the same reason.

On my desktop, following command results to the kernel bug:

$ sudo modprobe nbd max_part=100000
 kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
 invalid opcode: 0000 [#1] SMP
 last sysfs file: /sys/devices/virtual/block/nbd4/range
 CPU 1
 Modules linked in: nbd(+) bridge stp llc kvm_intel kvm asus_atk0110 sg sr_mod cdrom

 Pid: 2522, comm: modprobe Tainted: G        W   2.6.39-leonard+ #159 System manufacturer System Product Name/P5G41TD-M PRO
 RIP: 0010:[<ffffffff8115aa08>]  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
 RSP: 0018:ffff8801009f1de8  EFLAGS: 00010246
 RAX: 00000000ffffffef RBX: ffff880103920478 RCX: 00000000000a7bd3
 RDX: ffffffff81a2dbe0 RSI: 0000000000000000 RDI: ffff880103920478
 RBP: ffff8801009f1e38 R08: ffff880103920468 R09: ffff880103920478
 R10: ffff8801009f1de8 R11: ffff88011eccbb68 R12: ffffffff81a2dbe0
 R13: ffff880103920468 R14: 0000000000000000 R15: ffff880103920400
 FS:  00007f3c49de9700(0000) GS:ffff88011f800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 00007f3b7fe7c000 CR3: 00000000cd58d000 CR4: 00000000000406e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 2522, threadinfo ffff8801009f0000, task ffff8801009a93a0)
 Stack:
  ffff8801009f1e58 ffffffff812e8f6e ffff8801009f1e58 ffffffff812e7a80
  ffff880000000010 ffff880103920400 ffff8801002fd0c0 ffff880103920468
  0000000000000011 ffff880103920400 ffff8801009f1e48 ffffffff8115ab6a
 Call Trace:
  [<ffffffff812e8f6e>] ? device_add+0x4f1/0x5e4
  [<ffffffff812e7a80>] ? dev_set_name+0x41/0x43
  [<ffffffff8115ab6a>] sysfs_create_group+0x13/0x15
  [<ffffffff810b857e>] blk_trace_init_sysfs+0x14/0x16
  [<ffffffff811ee58b>] blk_register_queue+0x4c/0xfd
  [<ffffffff811f3bdf>] add_disk+0xe4/0x29c
  [<ffffffffa007e2ab>] nbd_init+0x2ab/0x30d [nbd]
  [<ffffffffa007e000>] ? 0xffffffffa007dfff
  [<ffffffff8100020f>] do_one_initcall+0x7f/0x13e
  [<ffffffff8107ab0a>] sys_init_module+0xa1/0x1e3
  [<ffffffff814f3542>] system_call_fastpath+0x16/0x1b
 Code: 41 57 41 56 41 55 41 54 53 48 83 ec 28 0f 1f 44 00 00 48 89 fb 41 89 f6 49 89 d4 48 85 ff 74 0b 85 f6 75 0b 48 83
  7f 30 00 75 14 <0f> 0b eb fe b9 ea ff ff ff 48 83 7f 30 00 0f 84 09 01 00 00 49
 RIP  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
  RSP <ffff8801009f1de8>
 ---[ end trace 753285ffbf72c57c ]---
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: stable@kernel.org
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

3b271082

nbd: pass MSG_* flags to kernel_recvmsg() · 35fbf5bc

由 Namhyung Kim 提交于 5月 28, 2011

Unlike kernel_sendmsg(), kernel_recvmsg() requires passing flags explicitly
via last parameter instead of struct msghdr.msg_flags. Therefore calls to
sock_xmit(lo, 0, ..., MSG_WAITALL) have not been processed properly by tcp
layer wrt. the flag. Fix it.
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

35fbf5bc

27 5月, 2011 4 次提交

loop: export module parameters · ac04fee0

由 Namhyung Kim 提交于 5月 27, 2011

Export 'max_loop' and 'max_part' parameters to sysfs so user can know
that how many devices are allowed and how many partitions are supported.

If 'max_loop' is 0, there is no restriction on the number of loop devices.
User can create/use the devices as many as minor numbers available. If
'max_part' is 0, it means simply the device doesn't support partitioning.

Also note that 'max_part' can be adjusted to power of 2 minus 1 form if
needed. User should check this value after the module loading if he/she
want to use that number correctly (i.e. fdisk, mknod, etc.).
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

ac04fee0

brd: export module parameters · 8892cbaf

由 Namhyung Kim 提交于 5月 26, 2011

Export 'rd_nr', 'rd_size' and 'max_part' parameters to sysfs so user can
know that how many devices are allowed, how big each device is and how
many partitions are supported. If 'max_part' is 0, it means simply the
device doesn't support partitioning.

Also note that 'max_part' can be adjusted to power of 2 minus 1 form if
needed. User should check this value after the module loading if he/she
want to use that number correctly (i.e. fdisk, mknod, etc.).
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

8892cbaf

brd: fix comment on initial device creation · 13868b76

由 Namhyung Kim 提交于 5月 26, 2011

If 'rd_nr' param was not specified, 16 (can be adjusted via
CONFIG_BLK_DEV_RAM_COUNT) devices would be created by default
but comment said 1. Fix it.
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

13868b76

brd: handle on-demand devices correctly · af465668

由 Namhyung Kim 提交于 5月 26, 2011

When finding or allocating a ram disk device, brd_probe() did not take
partition numbers into account so that it can result to a different
device. Consider following example (I set CONFIG_BLK_DEV_RAM_COUNT=4
for simplicity) :

$ sudo modprobe brd max_part=15
$ ls -l /dev/ram*
brw-rw---- 1 root disk 1,  0 2011-05-25 15:41 /dev/ram0
brw-rw---- 1 root disk 1, 16 2011-05-25 15:41 /dev/ram1
brw-rw---- 1 root disk 1, 32 2011-05-25 15:41 /dev/ram2
brw-rw---- 1 root disk 1, 48 2011-05-25 15:41 /dev/ram3
$ sudo mknod /dev/ram4 b 1 64
$ sudo dd if=/dev/zero of=/dev/ram4 bs=4k count=256
256+0 records in
256+0 records out
1048576 bytes (1.0 MB) copied, 0.00215578 s, 486 MB/s
namhyung@leonhard:linux$ ls -l /dev/ram*
brw-rw---- 1 root disk 1,    0 2011-05-25 15:41 /dev/ram0
brw-rw---- 1 root disk 1,   16 2011-05-25 15:41 /dev/ram1
brw-rw---- 1 root disk 1,   32 2011-05-25 15:41 /dev/ram2
brw-rw---- 1 root disk 1,   48 2011-05-25 15:41 /dev/ram3
brw-r--r-- 1 root root 1,   64 2011-05-25 15:45 /dev/ram4
brw-rw---- 1 root disk 1, 1024 2011-05-25 15:44 /dev/ram64

After this patch, /dev/ram4 - instead of /dev/ram64 - was
accessed correctly.

In addition, 'range' passed to blk_register_region() should
include all range of dev_t that RAMDISK_MAJOR can address.
It does not need to be limited by partition numbers unless
'rd_nr' param was specified.
Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

af465668

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功