  1. 04 Mar, 2011 1 commit
  2. 19 Jan, 2011 1 commit
    • loop: queue_lock NULL pointer dereference in blk_throtl_exit · ee71a968
      Committed by Sergey Senozhatsky
      Performing
      $ sudo mount -o loop -o umask=0 /dev/sdb1 /mnt/
      mount: wrong fs type, bad option, bad superblock on /dev/loop0,
             missing codepage or helper program, or other error
             In some cases useful info is found in syslog - try
             dmesg | tail  or so
      
      $ sudo modprobe -r loop
      
      results in oops:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
       IP: [<ffffffff812479d4>] do_raw_spin_lock+0x14/0x122
       Process modprobe (pid: 6189, threadinfo ffff88009a898000, task ffff880154a88000)
       Call Trace:
        [<ffffffff81486788>] _raw_spin_lock_irq+0x4a/0x51
        [<ffffffff8123404b>] ? blk_throtl_exit+0x3b/0xa0
        [<ffffffff8105b120>] ? cancel_delayed_work_sync+0xd/0xf
        [<ffffffff8123404b>] blk_throtl_exit+0x3b/0xa0
        [<ffffffff81229bc8>] blk_release_queue+0x21/0x65
        [<ffffffff8123bb06>] kobject_release+0x51/0x66
        [<ffffffff8123bab5>] ? kobject_release+0x0/0x66
        [<ffffffff8123ce1e>] kref_put+0x43/0x4d
        [<ffffffff8123ba27>] kobject_put+0x47/0x4b
        [<ffffffff8122717c>] blk_cleanup_queue+0x56/0x5b
        [<ffffffffa01c3824>] loop_exit+0x68/0x844 [loop]
        [<ffffffff8107cccc>] sys_delete_module+0x1e8/0x25b
        [<ffffffff814864c9>] ? trace_hardirqs_on_thunk+0x3a/0x3f
        [<ffffffff81002112>] system_call_fastpath+0x16/0x1b
      
      because of an attempt to acquire a NULL queue_lock.
      I added the same lines as in blk_queue_make_request -
      `fall back to embedded per-queue lock'.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
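The fix described in this commit message follows a simple pattern: if no lock was supplied for the queue, point queue_lock at a lock embedded in the queue itself before anything tries to take it. The following is a minimal userspace sketch of that pattern, not the kernel patch; `struct sketch_lock`, `struct sketch_queue` and `sketch_queue_fixup_lock()` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock; only its address matters here. */
struct sketch_lock {
	int locked;
};

/* Hypothetical stand-in for a request queue. */
struct sketch_queue {
	struct sketch_lock *queue_lock;		/* lock supplied by the driver, may be NULL */
	struct sketch_lock embedded_lock;	/* per-queue lock that always exists */
};

/* Fall back to the embedded per-queue lock when the driver supplied none. */
static struct sketch_lock *sketch_queue_fixup_lock(struct sketch_queue *q)
{
	assert(q != NULL);

	if (!q->queue_lock)
		q->queue_lock = &q->embedded_lock;
	return q->queue_lock;
}
```

With this in place, a teardown path can take the returned lock unconditionally, whether or not the driver ever provided one.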
  3. 20 Dec, 2010 1 commit
  4. 17 Dec, 2010 1 commit
  5. 10 Nov, 2010 1 commit
    • block: remove REQ_HARDBARRIER · 02e031cb
      Committed by Christoph Hellwig
      REQ_HARDBARRIER is dead now, so remove the leftovers.  What's left
      at this point is:
      
       - various checks inside the block layer.
       - sanity checks in bio based drivers.
       - now unused bio_empty_barrier helper.
       - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it has been dead for a while,
         but Xen really needs to sort out its barrier situation.
       - setting of ordered tags in uas - dead code copied from old scsi
         drivers.
       - scsi different retry for barriers - it's dead and should have been
         removed when flushes were converted to FS requests.
       - blktrace handling of barriers - removed.  Someone who knows blktrace
         better should add support for REQ_FLUSH and REQ_FUA, though.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  6. 28 Oct, 2010 1 commit
  7. 27 Oct, 2010 1 commit
  8. 05 Oct, 2010 1 commit
    • block: autoconvert trivial BKL users to private mutex · 2a48fc0a
      Committed by Arnd Bergmann
      The block device drivers have all gained new lock_kernel
      calls from a recent pushdown, and some of the drivers
      were already using the BKL before.
      
      This turns the BKL into a set of per-driver mutexes.
      Still need to check whether this is safe to do.
      
      file=$1
      name=$2
      if grep -q lock_kernel ${file} ; then
          if grep -q 'include.*linux.mutex.h' ${file} ; then
                  sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
          else
                  sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
          fi
          sed -i ${file} \
              -e "/^#include.*linux.mutex.h/,$ {
                      1,/^\(static\|int\|long\)/ {
                           /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);
      
      } }"  \
          -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
          -e '/[      ]*cycle_kernel_lock();/d'
      else
          sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                      -e '/cycle_kernel_lock()/d'
      fi
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  9. 10 Sep, 2010 3 commits
    • block/loop: implement REQ_FLUSH/FUA support · 6259f284
      Committed by Tejun Heo
      Deprecate REQ_HARDBARRIER and implement REQ_FLUSH/FUA instead.  Also,
      instead of checking file->f_op->fsync() directly, look at the return value
      of vfs_fsync() and ignore an -EINVAL return.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
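The "ignore -EINVAL" rule from this commit message can be illustrated with a small userspace sketch. It assumes the flush attempt returns 0 on success or a negative errno value, and that -EINVAL means the backing file simply has no fsync support. `sketch_flush_status()` is a hypothetical helper, and reporting every other failure as -EIO is an assumption of this sketch rather than something the message states.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: turn the return value of a flush attempt into the
 * status reported for the request.  Mapping other failures to -EIO is an
 * assumption of this sketch.
 */
static int sketch_flush_status(int fsync_ret)
{
	assert(fsync_ret <= 0);	/* 0 on success, negative errno on failure */

	if (fsync_ret == 0 || fsync_ret == -EINVAL)
		return 0;	/* flushed, or the backing file cannot fsync */
	return -EIO;
}
```

Treating -EINVAL as success keeps a loop device usable on backing files that cannot be synced, while real I/O errors still fail the flush.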
    • block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() · 4913efe4
      Committed by Tejun Heo
      Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
      requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
      -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
      blk_queue_flush().
      
      blk_queue_flush() takes combinations of REQ_FLUSH and FUA.  If a
      device has write cache and can flush it, it should set REQ_FLUSH.  If
      the device can handle FUA writes, it should also set REQ_FUA.
      
      All blk_queue_ordered() users are converted.
      
      * ORDERED_DRAIN is mapped to 0 which is the default value.
      * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
      * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
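The three-way mapping listed in this commit message can be written down as a small lookup. This is only an illustrative userspace sketch: the enum and the flag bits below are hypothetical stand-ins, and the numeric values of the kernel's REQ_FLUSH and REQ_FUA are not reproduced here.

```c
#include <assert.h>

/* Hypothetical stand-ins for the old ordered modes named in the message. */
enum sketch_ordered_mode {
	SKETCH_ORDERED_DRAIN,
	SKETCH_ORDERED_DRAIN_FLUSH,
	SKETCH_ORDERED_DRAIN_FLUSH_FUA,
};

/* Hypothetical flag bits; the real REQ_FLUSH and REQ_FUA values differ. */
#define SKETCH_REQ_FLUSH	(1u << 0)
#define SKETCH_REQ_FUA		(1u << 1)

/* Translate an old ordered mode into the flag set a driver would pass. */
static unsigned int sketch_ordered_to_flush(enum sketch_ordered_mode mode)
{
	switch (mode) {
	case SKETCH_ORDERED_DRAIN:
		return 0;	/* default: no cache flush capability advertised */
	case SKETCH_ORDERED_DRAIN_FLUSH:
		return SKETCH_REQ_FLUSH;
	case SKETCH_ORDERED_DRAIN_FLUSH_FUA:
		return SKETCH_REQ_FLUSH | SKETCH_REQ_FUA;
	}
	assert(!"unknown ordered mode");
	return 0;
}
```

A driver that used the DRAIN_FLUSH mode would, under these assumptions, end up advertising only the flush bit.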
    • block/loop: queue ordered mode should be DRAIN_FLUSH · 589d7ed0
      Committed by Tejun Heo
      loop implements FLUSH using fsync but was incorrectly setting its
      ordered mode to DRAIN.  Change it to DRAIN_FLUSH.  In practice, this
      doesn't change anything as loop doesn't make use of the block layer
      ordered implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  10. 23 Aug, 2010 2 commits
  11. 08 Aug, 2010 3 commits
  12. 22 May, 2010 2 commits
    • sanitize vfs_fsync calling conventions · 8018ab05
      Committed by Christoph Hellwig
      Now that the last user passing a NULL file pointer is gone we can remove
      the redundant dentry argument and associated hacks inside vfs_fsync_range.
      
      The next step will be removing the dentry argument from ->fsync, but given
      the luck with the last round of method prototype changes I'd rather
      defer this until after the main merge window.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • generate "change" uevent for loop device · c3473c63
      Committed by David Zeuthen
      Recent udev versions probe loop devices for filesystems meaning that
      the /dev/disk hierarchy may contain useful entries such as
      
       $ ls -l /dev/disk/by-label/Fedora-12-x86_64-Live
       lrwxrwxrwx 1 root root 11 Mar 11 13:41 /dev/disk/by-label/Fedora-12-x86_64-Live -> ../../loop0
      
      Unfortunately, no "change" uevent is generated when the loop device is
      detached so the symlink persists. Additionally, no "change" uevent is
      guaranteed to be generated when attaching an fd or changing capacity.
      For example, user space could open the loop device O_RDONLY (in fact,
      recent util-linux-ng does this) so udev's OPTIONS+="watch" machinery may
      not trigger the "change" uevent.
      
      This patch ensures that the "change" uevent is generated in all of
      these cases. As a result, the /dev/disk hierarchy works as expected
      for loop devices.
      Signed-off-by: David Zeuthen <davidz@redhat.com>
      Acked-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  13. 09 Apr, 2010 1 commit
  14. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Committed by Tejun Heo
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  15. 29 Oct, 2009 1 commit
    • loop: fix NULL dereference if mount fails · cf6e6932
      Committed by Alexey Dobriyan
      Commit bb214884 ("[PATCH] switch loop")
      started to pass NULL bdev to ioctl hook.
      
      Steps to reproduce:
      
      	[boot with loop.max_part=1]
      	[mount -o loop something so mount fails]
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
      IP: [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
      PGD 0
      Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:35/ACPI0003:00/power_supply/ACAD/online
      CPU 0
      Modules linked in: zfs nvidia(P) [last unloaded: zfs]
      Pid: 15177, comm: mount Tainted: P           2.6.32-rc4-zfs #2 Satellite X200
      RIP: 0010:[<ffffffff811486ee>]  [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
      RSP: 0018:ffff88003b3d5bb8  EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 000000000000125f RSI: 0000000000000000 RDI: 0000000000000000
      RBP: ffff88003b3d5ce8 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 00007ffffffff000
      R13: 0000000000000000 R14: ffff880071cef280 R15: 00000000000200da
      FS:  00007fd77cfe7740(0000) GS:ffff880001600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000000000000b8 CR3: 0000000001001000 CR4: 00000000000026f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process mount (pid: 15177, threadinfo ffff88003b3d4000, task ffff88007572f920)
      Stack:
       ffff88003b3d5c38 ffffffff812f95f5 ffff88007eeb6600 0000000000000000
      <0> 0000000000000000 ffff88003b3d5c18 ffffffff811547d9 ffff88001bf11ef0
      <0> 7fffffffffffffff ffff88001bf11ee8 ffff88001bf11ef0 0000000000000000
      Call Trace:
       [<ffffffff812f95f5>] ? schedule_timeout+0x1f5/0x250
       [<ffffffff811547d9>] ? rb_insert_color+0x109/0x140
       [<ffffffff812fb754>] ? _spin_unlock_irq+0x14/0x40
       [<ffffffff812f84c6>] ? wait_for_common+0x66/0x170
       [<ffffffff8105a280>] ? default_wake_function+0x0/0x10
       [<ffffffff810f8258>] ioctl_by_bdev+0x38/0x50
       [<ffffffff811d2481>] loop_clr_fd+0x1e1/0x210
       [<ffffffff811d2522>] lo_release+0x72/0x80
       [<ffffffff810f934c>] __blkdev_put+0x1ac/0x1d0
       [<ffffffff810f937b>] blkdev_put+0xb/0x10
       [<ffffffff810f93b9>] blkdev_close+0x39/0x60
       [<ffffffff810ccef3>] __fput+0xd3/0x230
       [<ffffffff810cd06d>] fput+0x1d/0x30
       [<ffffffff810c9680>] filp_close+0x50/0x80
       [<ffffffff81061f11>] put_files_struct+0x81/0x100
       [<ffffffff81061fde>] exit_files+0x4e/0x60
       [<ffffffff81063ec5>] do_exit+0x6b5/0x730
       [<ffffffff8107b279>] ? up_read+0x9/0x10
       [<ffffffff8104c86e>] ? do_page_fault+0x18e/0x2a0
       [<ffffffff81063f81>] do_group_exit+0x41/0xc0
       [<ffffffff81064012>] sys_exit_group+0x12/0x20
       [<ffffffff81030deb>] system_call_fastpath+0x16/0x1b
      Code: f8 48 89 e5 48 81 ec 30 01 00 00 48 89 5d d8 4c 89 6d e8 4c 89 65 e0 4c 89 75 f0 4c 89 7d f8 48 89 bd e8 fe ff ff 49 89 cd 89 f3 <49> 8b 88 b8 00 00 00 81 fa 68 12 00 00 0f 84 57 05 00 00 0f 86
      RIP  [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
       RSP <ffff88003b3d5bb8>
      CR2: 00000000000000b8
      ---[ end trace c0b4d3c3118d1427 ]---
      Fixing recursive fault but reboot is needed!
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 22 Sep, 2009 1 commit
  17. 11 Sep, 2009 1 commit
  18. 13 Jul, 2009 1 commit
  19. 11 May, 2009 1 commit
  20. 28 Apr, 2009 1 commit
  21. 07 Apr, 2009 1 commit
  22. 01 Apr, 2009 1 commit
    • loop: add ioctl to resize a loop device · 53d66608
      Committed by J. R. Okajima
      Add the ability to 'resize' the loop device on the fly.
      
      One practical application is a loop file with an XFS filesystem that is
      already mounted: you can enlarge the file (append some bytes) and then call
      ioctl(fd, LOOP_SET_CAPACITY, new). The loop driver will learn about the
      new size, and you can use xfs_growfs later on, which lets you use the
      full capacity of the loop file without the need to unmount.
      
      Test app:
      
      /* _GNU_SOURCE only takes effect if defined before any header is included */
      #define _GNU_SOURCE
      #include <linux/fs.h>
      #include <linux/loop.h>
      #include <sys/ioctl.h>
      #include <sys/stat.h>
      #include <sys/types.h>
      #include <assert.h>
      #include <errno.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      #include <getopt.h>
      
      char *me;
      
      void usage(FILE *f)
      {
      	fprintf(f, "%s [options] loop_dev [backend_file]\n"
      		"-s, --set new_size_in_bytes\n"
      		"\twhen backend_file is given, "
      		"it will be expanded too while keeping the original contents\n",
      		me);
      }
      
      struct option opts[] = {
      	{
      		.name		= "set",
      		.has_arg	= 1,
      		.flag		= NULL,
      		.val		= 's'
      	},
      	{
      		.name		= "help",
      		.has_arg	= 0,
      		.flag		= NULL,
      		.val		= 'h'
      	},
      	/* getopt_long() requires an all-zero terminating entry */
      	{ 0 }
      };
      
      void err_size(char *name, __u64 old)
      {
      	fprintf(stderr, "size must be larger than current %s (%llu)\n",
      		name, old);
      }
      
      int main(int argc, char *argv[])
      {
      	int fd, err, c, i, bfd;
      	ssize_t ssz;
      	size_t sz;
      	__u64 old, new, append;
      	char a[BUFSIZ] = { 0 };	/* zero-filled padding appended to the backend file */
      	struct stat st;
      	FILE *out;
      	char *backend, *dev;
      
      	err = EINVAL;
      	out = stderr;
      	me = argv[0];
      	new = 0;
      	while ((c = getopt_long(argc, argv, "s:h", opts, &i)) != -1) {
      		switch (c) {
      		case 's':
      			errno = 0;
      			new = strtoull(optarg, NULL, 0);
      			if (errno) {
      				err = errno;
      				perror(argv[i]);
      				goto out;
      			}
      			break;
      
      		case 'h':
      			err = 0;
      			out = stdout;
      			goto err;
      
      		default:
      			perror(argv[i]);
      			goto err;
      		}
      	}
      
      	if (optind < argc)
      		dev = argv[optind++];
      	else
      		goto err;
      
      	fd = open(dev, O_RDONLY);
      	if (fd < 0) {
      		err = errno;
      		perror(dev);
      		goto out;
      	}
      
      	err = ioctl(fd, BLKGETSIZE64, &old);
      	if (err) {
      		err = errno;
      		perror("ioctl BLKGETSIZE64");
      		goto out;
      	}
      
      	if (!new) {
      		printf("%llu\n", old);
      		goto out;
      	}
      
      	if (new < old) {
      		err = EINVAL;
      		err_size(dev, old);
      		goto out;
      	}
      
      	if (optind < argc) {
      		backend = argv[optind++];
      		bfd = open(backend, O_WRONLY|O_APPEND);
      		if (bfd < 0) {
      			err = errno;
      			perror(backend);
      			goto out;
      		}
      		err = fstat(bfd, &st);
      		if (err) {
      			err = errno;
      			perror(backend);
      			goto out;
      		}
      		if (new < st.st_size) {
      			err = EINVAL;
      			err_size(backend, st.st_size);
      			goto out;
      		}
      		append = new - st.st_size;
      		sz = sizeof(a);
      		while (append > 0) {
      			if (append < sz)
      				sz = append;
      			ssz = write(bfd, a, sz);
      			if (ssz != sz) {
      				err = errno;
      				perror(backend);
      				goto out;
      			}
      			append -= sz;
      		}
      		err = fsync(bfd);
      		if (err) {
      			err = errno;
      			perror(backend);
      			goto out;
      		}
      	}
      
      	err = ioctl(fd, LOOP_SET_CAPACITY, new);
      	if (err) {
      		err = errno;
      		perror("ioctl LOOP_SET_CAPACITY");
      	}
      	goto out;
      
       err:
      	usage(out);
       out:
      	return err;
      }
      Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
      Signed-off-by: Tomas Matejicek <tomas@slax.org>
      Cc: <util-linux-ng@vger.kernel.org>
      Cc: Karel Zak <kzak@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: <linux-api@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 26 Mar, 2009 1 commit
    • loop: fix circular locking in loop_clr_fd() · f028f3b2
      Committed by Nikanth Karthikesan
      With CONFIG_PROVE_LOCKING enabled
      
      $ losetup /dev/loop0 file
      $ losetup -o 32256 /dev/loop1 /dev/loop0
      
      $ losetup -d /dev/loop1
      $ losetup -d /dev/loop0
      
      triggers a [ INFO: possible circular locking dependency detected ]
      
      I think this warning is a false positive.
      
      Open/close on a loop device acquires bd_mutex of the device before
      acquiring lo_ctl_mutex of the same device. For ioctl(LOOP_CLR_FD), after
      acquiring lo_ctl_mutex, fput on the backing_file might acquire the bd_mutex of
      a device, if the backing file is a device and this is the last reference to the
      file being dropped. But it is guaranteed that it is impossible to have a
      circular list of backing devices (say, loop2->loop1->loop0->loop2 is not
      possible), which guarantees that this can never deadlock.
      
      So this warning should be suppressed. It is very difficult to annotate lockdep
      not to warn here in the correct way. A simple way to silence lockdep could be
      to mark the lo_ctl_mutex in ioctl to be a sub class, but this might mask some
      other real bugs.
      
      @@ -1164,7 +1164,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
       	struct loop_device *lo = bdev->bd_disk->private_data;
       	int err;
      
      -	mutex_lock(&lo->lo_ctl_mutex);
      +	mutex_lock_nested(&lo->lo_ctl_mutex, 1);
       	switch (cmd) {
       	case LOOP_SET_FD:
       		err = loop_set_fd(lo, mode, bdev, arg);
      
      Or actually marking the bd_mutex after lo_ctl_mutex as a sub class could be
      a better solution.
      
      Luckily it is easy to avoid calling fput on backing file with lo_ctl_mutex
      held, so no lockdep annotation is required.
      
      If you do not like the special handling of the lo_ctl_mutex just for the
      LOOP_CLR_FD ioctl in lo_ioctl(), the mutex handling could be moved inside
      each of the individual ioctl handlers and I could send you another patch.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  24. 24 Mar, 2009 1 commit
  25. 05 Mar, 2009 1 commit
  26. 29 Dec, 2008 2 commits
    • loop: Do not call loop_unplug for not configured loop device. · 8ae30b89
      Committed by Milan Broz
      The loop_unplug() function expects that the mapping is set
      and that lo->lo_backing_file is not NULL.
      
      Unfortunately loop_set_fd() sets the request queue unplug function,
      but loop_clr_fd() doesn't clear it.
      
      The loop device allows a non-configured loop device to be opened in some
      situations. If unplug is then called on the request queue, the loop module
      oopses because of the missing lo_backing_file.
      
      Simple reproducer:
      	losetup /dev/loop0 /xxx
      	losetup -d /dev/loop0
      	dmsetup create x --table "0 1 linear /dev/loop0 0"
      
       EIP is at loop_unplug+0x1d/0x3b
       ...
        Call Trace:
         blk_unplug+0x57/0x5e
         dm_table_unplug_all+0x34/0x77 [dm_mod]
         destroy_inode+0x27/0x38
         generic_delete_inode+0xd5/0xd9
         iput+0x4b/0x4e
         dm_resume+0xca/0xfe [dm_mod]
         dev_suspend+0x143/0x165 [dm_mod]
         dm_ctl_ioctl+0x18e/0x1cf [dm_mod]
         dev_suspend+0x0/0x165 [dm_mod]
         dm_ctl_ioctl+0x0/0x1cf [dm_mod]
         vfs_ioctl+0x22/0x69
         do_vfs_ioctl+0x39d/0x3c7
         trace_hardirqs_on+0xb/0xd
         remove_vma+0x50/0x56
         do_munmap+0x21c/0x237
         sys_ioctl+0x2c/0x45
         sysenter_do_call+0x12/0x31
      
      Several reports here
      http://www.kerneloops.org/search.php?search=loop_unplug
      
      Fix it by simply clearing the unplug function when the
      backing file is removed.
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • loop: Flush possible running bios when loop device is released. · 14f27939
      Committed by Milan Broz
      When there are still queued bios and the reference count
      drops to zero, the loop device must flush all queued bios.
      
      Otherwise the caller may close the device while some bios
      are still running, and a later endio() call oopses when it
      uses an unallocated mempool.
      
      This happens for example when running dm-crypt over loop,
      here is typical oops backtrace:
      
       Oops: 0000 [#1] PREEMPT SMP
       EIP is at mempool_free+0x12/0x6b
      ...
       crypt_dec_pending+0x50/0x54 [dm_crypt]
       crypt_endio+0x9f/0xa7 [dm_crypt]
       crypt_endio+0x0/0xa7 [dm_crypt]
       bio_endio+0x2b/0x2e
       loop_thread+0x37a/0x3b1
       do_lo_send_aops+0x0/0x165
       autoremove_wake_function+0x0/0x33
       loop_thread+0x0/0x3b1
       kthread+0x3b/0x61
       kthread+0x0/0x61
       kernel_thread_helper+0x7/0x10
      
      (But the crash is also reproducible with other dm targets
      running over a loop device.)
      
      The patch fixes it by flushing the bios in the release call,
      reusing the flush mechanism used for switching the backing store.
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  27. 14 Nov, 2008 1 commit
  28. 31 Oct, 2008 1 commit
  29. 21 Oct, 2008 3 commits
    • 511de73f
    • [PATCH] switch loop · bb214884
      Committed by Al Viro
      ioctl doesn't need BKL here
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] beginning of methods conversion · d4430d62
      Committed by Al Viro
      To keep the size of changesets sane we split the switch by drivers;
      to keep the damn thing bisectable we do the following:
      	1) rename the affected methods, add ones with correct
      prototypes, make (few) callers handle both.  That's this changeset.
      	2) for each driver convert to new methods.  *ALL* drivers
      are converted in this series.
      	3) kill the old (renamed) methods.
      
      Note that it _is_ a flagday; all in-tree drivers are converted and by the
      end of this series no trace of the old methods remains.  The only reason why
      we do that this way is to keep the damn thing bisectable and allow per-driver
      debugging if anything goes wrong.
      
      New methods:
      	open(bdev, mode)
      	release(disk, mode)
      	ioctl(bdev, mode, cmd, arg)		/* Called without BKL */
      	compat_ioctl(bdev, mode, cmd, arg)
      	locked_ioctl(bdev, mode, cmd, arg)	/* Called with BKL, legacy */
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  30. 29 Apr, 2008 1 commit
  31. 21 Apr, 2008 1 commit
    • loop: manage partitions in disk image · 476a4813
      Committed by Laurent Vivier
      This patch allows a loop device to be used with a partitioned disk image.
      
      Original behavior of loop is not modified.
      
      A new parameter is introduced to define how many partitions we want to be
      able to manage per loop device. This parameter is "max_part".
      
      For instance, to manage 63 partitions per loop device, we will do:
      # modprobe loop max_part=63
      # ls -l /dev/loop?*
      brw-rw---- 1 root disk 7,   0 2008-03-05 14:55 /dev/loop0
      brw-rw---- 1 root disk 7,  64 2008-03-05 14:55 /dev/loop1
      brw-rw---- 1 root disk 7, 128 2008-03-05 14:55 /dev/loop2
      brw-rw---- 1 root disk 7, 192 2008-03-05 14:55 /dev/loop3
      brw-rw---- 1 root disk 7, 256 2008-03-05 14:55 /dev/loop4
      brw-rw---- 1 root disk 7, 320 2008-03-05 14:55 /dev/loop5
      brw-rw---- 1 root disk 7, 384 2008-03-05 14:55 /dev/loop6
      brw-rw---- 1 root disk 7, 448 2008-03-05 14:55 /dev/loop7
      
      And to attach a raw partitioned disk image, the original losetup is used:
      
      # losetup -f etch.img
      # ls -l /dev/loop?*
      brw-rw---- 1 root disk 7,   0 2008-03-05 14:55 /dev/loop0
      brw-rw---- 1 root disk 7,   1 2008-03-05 14:57 /dev/loop0p1
      brw-rw---- 1 root disk 7,   2 2008-03-05 14:57 /dev/loop0p2
      brw-rw---- 1 root disk 7,   5 2008-03-05 14:57 /dev/loop0p5
      brw-rw---- 1 root disk 7,  64 2008-03-05 14:55 /dev/loop1
      brw-rw---- 1 root disk 7, 128 2008-03-05 14:55 /dev/loop2
      brw-rw---- 1 root disk 7, 192 2008-03-05 14:55 /dev/loop3
      brw-rw---- 1 root disk 7, 256 2008-03-05 14:55 /dev/loop4
      brw-rw---- 1 root disk 7, 320 2008-03-05 14:55 /dev/loop5
      brw-rw---- 1 root disk 7, 384 2008-03-05 14:55 /dev/loop6
      brw-rw---- 1 root disk 7, 448 2008-03-05 14:55 /dev/loop7
      # mount /dev/loop0p1 /mnt
      # ls /mnt
      bench  cdrom  home        lib         mnt   root     srv  usr
      bin    dev    initrd      lost+found  opt   sbin     sys  var
      boot   etc    initrd.img  media       proc  selinux  tmp  vmlinuz
      # umount /mnt
      # losetup -d /dev/loop0
      
      Of course, the same behavior can be achieved using kpartx on a loop device,
      but modifying loop avoids stacking several layers of block devices (loop +
      device mapper). This is a very light modification (40% of the changes
      are there to manage the new parameter).
      Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
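The minor numbers in the listings of this commit message (0, 64, 128, ... for max_part=63) are consistent with each loop device reserving a power-of-two block of minors just large enough for the whole-disk node plus max_part partitions. The sketch below works that arithmetic out in userspace under this assumption; `sketch_fls()` and `sketch_first_minor()` are hypothetical helpers written for illustration, not code taken from the patch.

```c
#include <assert.h>

/* Position of the highest set bit, counting from 1; 0 when x is 0. */
static unsigned int sketch_fls(unsigned int x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * First minor number of loop device `index`, assuming every device
 * reserves 2^fls(max_part) consecutive minors (an assumption that
 * matches the listing in the commit message).
 */
static unsigned int sketch_first_minor(unsigned int index, unsigned int max_part)
{
	unsigned int shift = sketch_fls(max_part);

	assert(shift < 20);	/* stay well inside the minor number space */
	return index << shift;
}
```

For max_part=63 the shift is 6, so /dev/loop1 starts at minor 64 and /dev/loop7 at 448, and partition N of a device sits N minors after its whole-disk node (loop0p5 is minor 5). With max_part=0 the shift is 0 and the devices keep consecutive minors, which corresponds to the unmodified behavior.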