1. 22 Aug, 2011 2 commits
    • xen-blkback: fixed indentation and comments · 1bc05b0a
      Committed by Joe Jin
      This patch fixes the following:
      
      1. Fix code style issues.
      2. Fix incorrect function names in comments.
      Signed-off-by: Joe Jin <joe.jin@oracle.com>
      Cc: Jens Axboe <jaxboe@fusionio.com>
      Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • xen-blkback: Don't disconnect backend until state switched to XenbusStateClosed. · 6f5986bc
      Committed by Joe Jin
      When running a block-attach/block-detach test with the steps below, umount
      hangs in the guest. Furthermore, shutdown gets stuck when unmounting file systems.
      
      1. Start the guest.
      2. Attach a new block device with xm block-attach in Dom0.
      3. Mount the new disk in the guest.
      4. Run xm block-detach in Dom0 to detach the block device until it times out.
      5. Any request to the disk will hang.
      
      Root cause:
      The issue is triggered when the backend device's state is set to
      'XenbusStateClosing', which sends the XenbusStateClosing notification to
      the frontend. When the frontend receives the notification it tries to release
      the disk in blkfront_closing(), but at that moment the disk is still in use
      by the guest, so the frontend refuses to close. Specifically, it sets the disk
      state to XenbusStateClosing and notifies the backend. When the backend receives
      the event, it disconnects the vbd from the real device and sets the vbd device
      state to XenbusStateClosing. Because the backend has disconnected the real
      device/file, any IO requests to the disk in the guest vanish into the ether,
      leaving the disk dead and stuck in XenbusStateClosing. When the guest wants to
      disconnect the disk, umount hangs in blkif_release()->xlvbd_release_gendisk()
      because it is unable to send any IO to the disk, which prevents a clean system
      shutdown.
      
      Solution:
      Don't disconnect the backend until the frontend state has switched to XenbusStateClosed.
      Signed-off-by: Joe Jin <joe.jin@oracle.com>
      Cc: Daniel Stodden <daniel.stodden@citrix.com>
      Cc: Jens Axboe <jaxboe@fusionio.com>
      Cc: Annie Li <annie.li@oracle.com>
      Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
      [v1: Modified description a bit]
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  2. 09 Aug, 2011 1 commit
  3. 03 Aug, 2011 1 commit
  4. 02 Aug, 2011 1 commit
  5. 01 Aug, 2011 4 commits
    • loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other · 05eb0f25
      Committed by Kay Sievers
      LOOP_CLR_FD takes lo->lo_ctl_mutex and tries to remove the loop sysfs
      files. Sysfs calls show() and waits for lo->lo_ctl_mutex. LOOP_CLR_FD
      waits for show() to finish to remove the sysfs file.
      
        cat /sys/class/block/loop0/loop/backing_file
          mutex_lock_nested+0x176/0x350
          ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
          ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
          loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
          dev_attr_show+0x1b/0x60
          ? sysfs_read_file+0x86/0x1a0
          ? __get_free_pages+0x12/0x50
          sysfs_read_file+0xaf/0x1a0
      
        ioctl(LOOP_CLR_FD):
          wait_for_common+0x12c/0x180
          ? try_to_wake_up+0x2a0/0x2a0
          wait_for_completion+0x18/0x20
          sysfs_deactivate+0x178/0x180
          ? sysfs_addrm_finish+0x43/0x70
          ? sysfs_addrm_start+0x1d/0x20
          sysfs_addrm_finish+0x43/0x70
          sysfs_hash_and_remove+0x85/0xa0
          sysfs_remove_group+0x59/0x100
          loop_clr_fd+0x1dc/0x3f0 [loop]
          lo_ioctl+0x223/0x7a0 [loop]
      
      Instead of taking the lo_ctl_mutex from sysfs code, take the inner
      lo->lo_lock, to protect the access to the backing_file data.
      
      Thanks to Tejun for help debugging and finding a solution.
      
      Cc: Milan Broz <mbroz@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices · d134b00b
      Committed by Kay Sievers
      Instead of unconditionally creating a fixed number of dead loop
      devices which need to be investigated by storage handling services,
      even when they are never used, allow distros to start with 0
      loop devices and have losetup(8) and similar tools switch to the dynamic
      /dev/loop-control interface instead of searching /dev/loop%i for free
      devices.
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • loop: add management interface for on-demand device allocation · 770fe30a
      Committed by Kay Sievers
      Loop devices today are pre-allocated in a fixed number, usually 8.
      The number can only be changed at module init time. To find a free
      device to use, /dev/loop%i needs to be scanned, and devices need
      to be opened one by one until a free one is found, if any.
      
      This adds a new /dev/loop-control device node that allows userspace to
      dynamically find or allocate a free device, and to add and remove loop
      devices from the running system:
       LOOP_CTL_ADD adds a specific device. Arg is the number
       of the device. It returns the device i or a negative
       error code.
      
       LOOP_CTL_REMOVE removes a specific device. Arg is the
       number of the device. It returns the device i or a negative
       error code.
      
       LOOP_CTL_GET_FREE finds the next unbound device or allocates
       a new one. No arg is given. It returns the device i or a
       negative error code.
      
      The loop kernel module gets automatically loaded when
      /dev/loop-control is accessed the first time. The alias
      specified in the module instructs udev to create this
      'dead' device node, even when the module is not loaded.
      
      Example:
       cfd = open("/dev/loop-control", O_RDWR);
      
       /* add a new specific loop device */
       err = ioctl(cfd, LOOP_CTL_ADD, devnr);
      
       /* remove a specific loop device */
       err = ioctl(cfd, LOOP_CTL_REMOVE, devnr);
      
       /* find or allocate a free loop device to use */
       devnr = ioctl(cfd, LOOP_CTL_GET_FREE);
      
       sprintf(loopname, "/dev/loop%i", devnr);
       ffd = open("backing-file", O_RDWR);
       lfd = open(loopname, O_RDWR);
       err = ioctl(lfd, LOOP_SET_FD, ffd);
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Karel Zak  <kzak@redhat.com>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
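
The example in this commit message omits declarations and error handling. A fuller user-space sketch of the same sequence might look like the following. It assumes a Linux system whose `<linux/loop.h>` defines `LOOP_CTL_GET_FREE`; the helper names `loop_device_path()` and `loop_attach()` are hypothetical and not part of any library. Attaching a device normally requires root privileges.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/loop.h>   /* LOOP_CTL_GET_FREE, LOOP_SET_FD */

/* Format "/dev/loop<N>" into buf; returns 0 on success, -1 if it does not fit. */
static int loop_device_path(int devnr, char *buf, size_t len)
{
    assert(buf != NULL);
    int n = snprintf(buf, len, "/dev/loop%i", devnr);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Find or allocate a free loop device and bind backing_file to it.
 * Returns the loop device number, or a negative errno value on failure.
 */
static int loop_attach(const char *backing_file)
{
    char loopname[32];
    int cfd, ffd, lfd, devnr;
    int err = 0;

    ffd = open(backing_file, O_RDWR);
    if (ffd < 0)
        return -errno;

    cfd = open("/dev/loop-control", O_RDWR);
    if (cfd < 0) {
        err = -errno;
        goto out_ffd;
    }

    /* Ask the control node for the next unbound device number. */
    devnr = ioctl(cfd, LOOP_CTL_GET_FREE);
    if (devnr < 0)
        err = -errno;
    close(cfd);
    if (err)
        goto out_ffd;

    if (loop_device_path(devnr, loopname, sizeof(loopname)) != 0) {
        err = -ENAMETOOLONG;
        goto out_ffd;
    }

    lfd = open(loopname, O_RDWR);
    if (lfd < 0) {
        err = -errno;
        goto out_ffd;
    }

    /* Bind the backing file to the loop device. */
    if (ioctl(lfd, LOOP_SET_FD, ffd) < 0)
        err = -errno;
    close(lfd);

out_ffd:
    close(ffd);
    return err ? err : devnr;
}
```

A caller would check for a negative return value and otherwise use the returned number to build the device path, for example with `loop_device_path()`.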
    • loop: replace linked list of allocated devices with an idr index · 34dd82af
      Committed by Kay Sievers
      Replace the linked list that keeps track of allocated devices with an
      idr index to allow a more efficient lookup of devices.
      
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  6. 27 Jul, 2011 3 commits
  7. 20 Jul, 2011 1 commit
  8. 15 Jul, 2011 2 commits
  9. 14 Jul, 2011 1 commit
  10. 09 Jul, 2011 1 commit
  11. 01 Jul, 2011 3 commits
  12. 30 Jun, 2011 6 commits
  13. 09 Jun, 2011 1 commit
  14. 02 Jun, 2011 1 commit
    • block: fix mismerge of the DISK_EVENT_MEDIA_CHANGE removal · 0f48f260
      Committed by Linus Torvalds
      Jens' back-merge commit 698567f3 ("Merge commit 'v2.6.39' into
      for-2.6.40/core") was incorrectly done, and re-introduced the
      DISK_EVENT_MEDIA_CHANGE lines that had been removed earlier in commits
      
       - 9fd097b1 ("block: unexport DISK_EVENT_MEDIA_CHANGE for
         legacy/fringe drivers")
      
       - 7eec77a1 ("ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd
         and ide-cd")
      
      because of conflicts with the "g->flags" updates near-by by commit
      d4dc210f ("block: don't block events on excl write for non-optical
      devices")
      
      As a result, we re-introduced the hanging behavior due to infinite disk
      media change reports.
      
      Tssk, tssk, people! Don't do back-merges at all, and *definitely* don't
      do them to hide merge conflicts from me - especially as I'm likely
      better at merging them than you are, since I do so many merges.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jens Axboe <jaxboe@fusionio.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 01 Jun, 2011 2 commits
  16. 30 May, 2011 2 commits
  17. 29 May, 2011 1 commit
    • x86 idle floppy: deprecate disable_hlt() · 3b70b2e5
      Committed by Len Brown
      Plan to remove floppy_disable_hlt in 2012, an ancient
      workaround with comments that it should be removed.
      
      This allows us to remove clutter and a run-time branch
      from the idle code.
      
      WARN_ONCE() on invocation until it is removed.
      
      cc: x86@kernel.org
      cc: stable@kernel.org # .39.x
      Signed-off-by: Len Brown <len.brown@intel.com>
  18. 28 May, 2011 3 commits
    • nbd: adjust 'max_part' according to part_shift · 5988ce23
      Committed by Namhyung Kim
      The 'max_part' parameter determines how many partitions are supported
      on each nbd device. However, the actual number can be rounded up to a
      power-of-2-minus-1 value during module initialization, because
      alloc_disk() is called with (1 << part_shift).
      
      So adjust 'max_part' as well, at least for consistency with loop and brd.
      It is exported via sysfs already, and a user should check this value
      after module loading if he/she wants to use that number correctly
      (e.g. with fdisk).
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • nbd: limit module parameters to a sane value · 3b271082
      Committed by Namhyung Kim
      The 'max_part' parameter controls the maximum number of partitions
      an nbd device can have. However, if a user specifies a very large
      value, it exceeds the limit of the device minor number and
      can cause a kernel oops (or, at least, produce invalid device
      nodes in some cases).
      
      In addition, specifying a large 'nbds_max' value causes the same
      problem for the same reason.
      
      On my desktop, the following command results in a kernel BUG:
      
      $ sudo modprobe nbd max_part=100000
       kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
       invalid opcode: 0000 [#1] SMP
       last sysfs file: /sys/devices/virtual/block/nbd4/range
       CPU 1
       Modules linked in: nbd(+) bridge stp llc kvm_intel kvm asus_atk0110 sg sr_mod cdrom
      
       Pid: 2522, comm: modprobe Tainted: G        W   2.6.39-leonard+ #159 System manufacturer System Product Name/P5G41TD-M PRO
       RIP: 0010:[<ffffffff8115aa08>]  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
       RSP: 0018:ffff8801009f1de8  EFLAGS: 00010246
       RAX: 00000000ffffffef RBX: ffff880103920478 RCX: 00000000000a7bd3
       RDX: ffffffff81a2dbe0 RSI: 0000000000000000 RDI: ffff880103920478
       RBP: ffff8801009f1e38 R08: ffff880103920468 R09: ffff880103920478
       R10: ffff8801009f1de8 R11: ffff88011eccbb68 R12: ffffffff81a2dbe0
       R13: ffff880103920468 R14: 0000000000000000 R15: ffff880103920400
       FS:  00007f3c49de9700(0000) GS:ffff88011f800000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: 00007f3b7fe7c000 CR3: 00000000cd58d000 CR4: 00000000000406e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process modprobe (pid: 2522, threadinfo ffff8801009f0000, task ffff8801009a93a0)
       Stack:
        ffff8801009f1e58 ffffffff812e8f6e ffff8801009f1e58 ffffffff812e7a80
        ffff880000000010 ffff880103920400 ffff8801002fd0c0 ffff880103920468
        0000000000000011 ffff880103920400 ffff8801009f1e48 ffffffff8115ab6a
       Call Trace:
        [<ffffffff812e8f6e>] ? device_add+0x4f1/0x5e4
        [<ffffffff812e7a80>] ? dev_set_name+0x41/0x43
        [<ffffffff8115ab6a>] sysfs_create_group+0x13/0x15
        [<ffffffff810b857e>] blk_trace_init_sysfs+0x14/0x16
        [<ffffffff811ee58b>] blk_register_queue+0x4c/0xfd
        [<ffffffff811f3bdf>] add_disk+0xe4/0x29c
        [<ffffffffa007e2ab>] nbd_init+0x2ab/0x30d [nbd]
        [<ffffffffa007e000>] ? 0xffffffffa007dfff
        [<ffffffff8100020f>] do_one_initcall+0x7f/0x13e
        [<ffffffff8107ab0a>] sys_init_module+0xa1/0x1e3
        [<ffffffff814f3542>] system_call_fastpath+0x16/0x1b
       Code: 41 57 41 56 41 55 41 54 53 48 83 ec 28 0f 1f 44 00 00 48 89 fb 41 89 f6 49 89 d4 48 85 ff 74 0b 85 f6 75 0b 48 83
        7f 30 00 75 14 <0f> 0b eb fe b9 ea ff ff ff 48 83 7f 30 00 0f 84 09 01 00 00 49
       RIP  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
        RSP <ffff8801009f1de8>
       ---[ end trace 753285ffbf72c57c ]---
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
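
A bounds check of the kind this commit describes could look roughly like the sketch below. It is written as user-space C under stated assumptions: 20 bits are available for minor numbers and a disk may have at most 256 minors (the `MODEL_*` constants are assumptions standing in for the kernel's limits), and `check_nbd_params()` and `fls_model()` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_MINORBITS      20    /* assumed number of minor-number bits */
#define MODEL_DISK_MAX_PARTS 256   /* assumed maximum minors per disk */

/* 1-based index of the highest set bit, or 0 when x == 0. */
static int fls_model(unsigned long x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/*
 * Validate the two module parameters before any disk is allocated.
 * Returns 0 when they fit into the minor-number space, -EINVAL otherwise.
 */
static int check_nbd_params(unsigned long max_part, unsigned long nbds_max)
{
    int part_shift = max_part ? fls_model(max_part) : 0;

    /* Too many partitions per device. The first test also keeps the
     * shifts below within the width of unsigned long. */
    if (part_shift >= MODEL_MINORBITS ||
        (1UL << part_shift) > MODEL_DISK_MAX_PARTS)
        return -EINVAL;

    /* Too many devices for the remaining minor bits. */
    if (nbds_max > (1UL << (MODEL_MINORBITS - part_shift)))
        return -EINVAL;

    return 0;
}
```

With these assumed limits, the `max_part=100000` value from the report above is rejected up front instead of reaching add_disk().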
    • nbd: pass MSG_* flags to kernel_recvmsg() · 35fbf5bc
      Committed by Namhyung Kim
      Unlike kernel_sendmsg(), kernel_recvmsg() requires the flags to be passed
      explicitly via its last parameter instead of struct msghdr.msg_flags.
      Therefore calls to sock_xmit(lo, 0, ..., MSG_WAITALL) have not been processed
      properly by the tcp layer wrt. the flag. Fix it.
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Clements <Paul.Clements@steeleye.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  19. 27 May, 2011 4 commits
    • loop: export module parameters · ac04fee0
      Committed by Namhyung Kim
      Export the 'max_loop' and 'max_part' parameters to sysfs so users can see
      how many devices are allowed and how many partitions are supported.
      
      If 'max_loop' is 0, there is no restriction on the number of loop devices.
      Users can create and use as many devices as there are minor numbers available.
      If 'max_part' is 0, the device simply doesn't support partitioning.
      
      Also note that 'max_part' can be adjusted to a power-of-2-minus-1 value if
      needed. Users should check this value after the module is loaded if they
      want to use that number correctly (e.g. with fdisk, mknod, etc.).
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • brd: export module parameters · 8892cbaf
      Committed by Namhyung Kim
      Export the 'rd_nr', 'rd_size' and 'max_part' parameters to sysfs so users can
      see how many devices are allowed, how big each device is and how
      many partitions are supported. If 'max_part' is 0, the device simply
      doesn't support partitioning.
      
      Also note that 'max_part' can be adjusted to a power-of-2-minus-1 value if
      needed. Users should check this value after the module is loaded if they
      want to use that number correctly (e.g. with fdisk, mknod, etc.).
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • brd: fix comment on initial device creation · 13868b76
      Committed by Namhyung Kim
      If the 'rd_nr' param is not specified, 16 devices (adjustable via
      CONFIG_BLK_DEV_RAM_COUNT) are created by default,
      but the comment said 1. Fix it.
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • brd: handle on-demand devices correctly · af465668
      Committed by Namhyung Kim
      When finding or allocating a ram disk device, brd_probe() did not take
      partition numbers into account, so it could end up with a different
      device. Consider the following example (I set CONFIG_BLK_DEV_RAM_COUNT=4
      for simplicity):
      
      $ sudo modprobe brd max_part=15
      $ ls -l /dev/ram*
      brw-rw---- 1 root disk 1,  0 2011-05-25 15:41 /dev/ram0
      brw-rw---- 1 root disk 1, 16 2011-05-25 15:41 /dev/ram1
      brw-rw---- 1 root disk 1, 32 2011-05-25 15:41 /dev/ram2
      brw-rw---- 1 root disk 1, 48 2011-05-25 15:41 /dev/ram3
      $ sudo mknod /dev/ram4 b 1 64
      $ sudo dd if=/dev/zero of=/dev/ram4 bs=4k count=256
      256+0 records in
      256+0 records out
      1048576 bytes (1.0 MB) copied, 0.00215578 s, 486 MB/s
      $ ls -l /dev/ram*
      brw-rw---- 1 root disk 1,    0 2011-05-25 15:41 /dev/ram0
      brw-rw---- 1 root disk 1,   16 2011-05-25 15:41 /dev/ram1
      brw-rw---- 1 root disk 1,   32 2011-05-25 15:41 /dev/ram2
      brw-rw---- 1 root disk 1,   48 2011-05-25 15:41 /dev/ram3
      brw-r--r-- 1 root root 1,   64 2011-05-25 15:45 /dev/ram4
      brw-rw---- 1 root disk 1, 1024 2011-05-25 15:44 /dev/ram64
      
      After this patch, /dev/ram4 - instead of /dev/ram64 - is
      accessed correctly.
      
      In addition, the 'range' passed to blk_register_region() should
      cover the whole range of dev_t that RAMDISK_MAJOR can address.
      It does not need to be limited by partition numbers unless
      the 'rd_nr' param was specified.
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
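
The numbers in the listing above follow from how a minor number is split into a device index and a partition number. The sketch below reproduces that arithmetic in user-space C. It assumes `max_part=15` corresponds to a partition shift of 4, as the minors 0, 16, 32 and 48 suggest; the function names are hypothetical and this is not the brd driver code.

```c
#include <assert.h>

/* Device index encoded in a minor number: drop the partition bits. */
static unsigned int brd_index_from_minor(unsigned int minor,
                                         unsigned int part_shift)
{
    assert(part_shift < 32);
    return minor >> part_shift;
}

/* First minor number belonging to the device with the given index. */
static unsigned int brd_first_minor(unsigned int index,
                                    unsigned int part_shift)
{
    assert(part_shift < 32);
    return index << part_shift;
}
```

With a shift of 4, minor 64 maps to device index 4 (/dev/ram4). Treating the raw minor 64 as the index instead gives a first minor of 1024, which matches the stray /dev/ram64 node in the listing.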