  1. 04 Jan, 2012 1 commit
  2. 02 Dec, 2011 1 commit
  3. 25 Nov, 2011 1 commit
  4. 16 Nov, 2011 2 commits
    • loop: cleanup set_status interface · 7035b5df
      Dmitry Monakhov committed
      1) Anyone who has read access to loopdev has permission to call set_status
         and may change important parameters such as lo_offset, lo_sizelimit and
         so on, which contradicts the read access pattern and is effectively
         equivalent to write access.
      2) Add lo_offset over i_size check to prevent blkdev_size overflow.
         ##Testcase_begin
         #dd if=/dev/zero of=./file bs=1k count=1
         #losetup /dev/loop0 ./file
         /* userspace_application */
         struct loop_info64 loinf;
         fd = open("/dev/loop0", O_RDONLY);
         ioctl(fd, LOOP_GET_STATUS64, &loinf);
         /* Set offset to any value which is bigger than i_size, and sizelimit
          * to nonzero value*/
         loinf.lo_offset = 4096*1024;
         loinf.lo_sizelimit = 1024;
         ioctl(fd, LOOP_SET_STATUS64, &loinf);
         /* After this loop device will have size similar to 0x7fffffffffxxxx */
         #blockdev --getsz /dev/loop0
         ##OUTPUT: 36028797018955968
         ##Testcase_end
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
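The overflow in point 2 comes from subtracting an offset that is larger than the backing file size. The following is a minimal userspace model of that arithmetic, assuming the size is computed as (i_size - offset), capped by sizelimit, and converted to 512-byte sectors. The function name and the choice to clamp to zero are illustrative assumptions, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the loop size calculation, in 512-byte sectors. */
int64_t loop_size_sectors(int64_t i_size, int64_t offset, int64_t sizelimit)
{
	int64_t bytes;

	assert(i_size >= 0 && offset >= 0 && sizelimit >= 0);

	bytes = i_size - offset;

	/*
	 * Offset beyond the end of the file: without this check the negative
	 * value would later be interpreted as a huge unsigned sector count.
	 */
	if (bytes < 0)
		return 0;

	if (sizelimit > 0 && sizelimit < bytes)
		bytes = sizelimit;

	return bytes >> 9;	/* bytes -> 512-byte sectors */
}
```

With the values from the test case (a 1 KiB file, lo_offset = 4096*1024, lo_sizelimit = 1024), the subtraction goes negative, so this model reports an empty device instead of a wrapped-around size.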
    • loop: prevent information leak after failed read · 3bb90682
      Dmitry Monakhov committed
      If a read was not fully successful, we have to fail the whole bio to
      prevent an information leak of old pages.
      
      ##Testcase_begin
      dd if=/dev/zero of=./file bs=1M count=1
      losetup /dev/loop0 ./file -o 4096
      truncate -s 0 ./file
      # Oops, the loop offset is now beyond i_size, so the read will silently fail.
      # The bio's pages would then not be cleared, which may result in an information leak.
      hexdump -C /dev/loop0
      ##Testcase_end
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
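The same rule (a short read is a failed read) can be sketched in userspace. This is only an analogy under stated assumptions: the helper name is hypothetical, and clearing the buffer on failure is an addition for illustration, whereas the commit fails the whole bio.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 0 only if all `len` bytes were read at offset `off`.
 * On an error or a short read, the buffer is cleared and -1 is returned,
 * so the caller never sees stale bytes.
 */
int read_all_or_fail(int fd, void *buf, size_t len, off_t off)
{
	size_t done = 0;

	assert(buf != NULL);

	while (done < len) {
		ssize_t n = pread(fd, (char *)buf + done, len - done,
				  off + (off_t)done);

		if (n <= 0) {			/* error or end of file */
			memset(buf, 0, len);	/* do not leak old contents */
			return -1;
		}
		done += (size_t)n;
	}
	return 0;
}
```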
  5. 17 Oct, 2011 1 commit
    • loop: remove the incorrect write_begin/write_end shortcut · 456be148
      Christoph Hellwig committed
      Currently the loop device tries to call directly into write_begin/write_end
      instead of going through ->write if it can.  This is a fairly nasty shortcut
      as write_begin and write_end are only callbacks for the generic write code
      and expect to be called with filesystem specific locks held.
      
      This code currently causes various issues for clustered filesystems as it
      doesn't take the required cluster locks, and it also causes issues for XFS
      as it doesn't properly lock against the swapext ioctl as called by the
      defragmentation tools.  This in turn causes data corruption if
      defragmentation hits a busy loop device in the wrong time window, as
      reported by RH QA.
      
      The reason why we have this shortcut is that it saves a data copy when
      doing a transformation on the loop device, which is the technical term
      for using cryptoloop (or an XOR transformation).  Given that cryptoloop
      has been deprecated in favour of dm-crypt my opinion is that we should
      simply drop this shortcut instead of finding complicated ways to
      introduce a formal interface for this shortcut.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  6. 21 Sep, 2011 2 commits
  7. 12 Sep, 2011 1 commit
  8. 24 Aug, 2011 1 commit
    • loop: always allow userspace partitions and optionally support automatic scanning · e03c8dd1
      Kay Sievers committed
      Automatic partition scanning can be requested individually per loop
      device during its setup by setting LO_FLAGS_PARTSCAN. By default, no
      partition tables are scanned.
      
      Userspace can now always add and remove partitions from all loop
      devices, regardless of whether the in-kernel partition scanner is
      enabled.
      
      The needed partition minor numbers are allocated from the extended
      minors space; the main loop device numbers will continue to match the
      loop minors, regardless of the number of partitions used.
      
        # grep . /sys/class/block/loop1/loop/*
        /sys/block/loop1/loop/autoclear:0
        /sys/block/loop1/loop/backing_file:/home/kay/data/stuff/part.img
        /sys/block/loop1/loop/offset:0
        /sys/block/loop1/loop/partscan:1
        /sys/block/loop1/loop/sizelimit:0
      
        # ls -l /dev/loop*
        brw-rw---- 1 root disk   7,   0 Aug 14 20:22 /dev/loop0
        brw-rw---- 1 root disk   7,   1 Aug 14 20:23 /dev/loop1
        brw-rw---- 1 root disk 259,   0 Aug 14 20:23 /dev/loop1p1
        brw-rw---- 1 root disk 259,   1 Aug 14 20:23 /dev/loop1p2
        brw-rw---- 1 root disk   7,  99 Aug 14 20:23 /dev/loop99
        brw-rw---- 1 root disk 259,   2 Aug 14 20:23 /dev/loop99p1
        brw-rw---- 1 root disk 259,   3 Aug 14 20:23 /dev/loop99p2
        crw------T 1 root root  10, 237 Aug 14 20:22 /dev/loop-control
      
      Cc: Karel Zak <kzak@redhat.com>
      Cc: Davidlohr Bueso <dave@gnu.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
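A minimal sketch of requesting automatic partition scanning on an already bound loop device, assuming the flag is passed through the LOOP_SET_STATUS64 ioctl and that LO_FLAGS_PARTSCAN is provided by <linux/loop.h>. The helper name is hypothetical.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/loop.h>

/* Set LO_FLAGS_PARTSCAN on the loop device behind lfd; 0 on success, -1 on error. */
int loop_request_partscan(int lfd)
{
	struct loop_info64 info;

	memset(&info, 0, sizeof(info));

	/* Read the current settings so that only the flag changes. */
	if (ioctl(lfd, LOOP_GET_STATUS64, &info) < 0)
		return -1;

	info.lo_flags |= LO_FLAGS_PARTSCAN;

	if (ioctl(lfd, LOOP_SET_STATUS64, &info) < 0)
		return -1;

	return 0;
}
```

The descriptor would typically come from open("/dev/loop1", O_RDWR); whether the flag took effect can be checked in the partscan sysfs attribute shown above.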
  9. 19 Aug, 2011 1 commit
    • loop: add discard support for loop devices · dfaa2ef6
      Lukas Czerner committed
      This commit adds discard support for loop devices. Discard is usually
      supported by SSD and thinly provisioned devices as a method for
      reclaiming unused space. This is no different than trying to reclaim
      back space which is not used by the file system on the image, but it
      still occupies space on the host file system.
      
      We can do the reclamation on file system which does support hole
      punching. So when discard request gets to the loop driver we can
      translate that to punch a hole to the underlying file, hence reclaim
      the free space.
      
      This is very useful for trimming down the size of the image to only what
      is really used by the file system on that image. Fstrim may be used for
      that purpose.
      
      It has been tested on ext4, xfs and btrfs with the image file systems
      ext4, ext3, xfs and btrfs. An ext4 or ext3 image on an ext4 file system
      has some problems, but it seems that the ext4 punch hole implementation
      is somewhat flawed, and this is unrelated to this commit.
      
      This is also a very good method of validating a file system's punch
      hole implementation.
      
      Note that when encryption is used, discard support is disabled, because
      using it might leak information useful to a possible attacker.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
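The hole punching that a discard is translated into has a userspace counterpart in fallocate(2). The sketch below assumes a Linux system and a backing file system that supports hole punching; the helper name is hypothetical and this is not the driver's code.

```c
#define _GNU_SOURCE		/* for fallocate() */
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/*
 * Deallocate the byte range [off, off + len) of the file behind fd
 * without changing the file size. Returns 0 on success, -1 on error.
 */
int punch_hole(int fd, off_t off, off_t len)
{
	assert(off >= 0 && len > 0);

	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 off, len);
}
```

FALLOC_FL_KEEP_SIZE matters here: the image must keep its apparent size while the host file system gets the blocks back.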
  10. 01 Aug, 2011 4 commits
    • loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other · 05eb0f25
      Kay Sievers committed
      LOOP_CLR_FD takes lo->lo_ctl_mutex and tries to remove the loop sysfs
      files. Sysfs calls show() and waits for lo->lo_ctl_mutex. LOOP_CLR_FD
      waits for show() to finish to remove the sysfs file.
      
        cat /sys/class/block/loop0/loop/backing_file
          mutex_lock_nested+0x176/0x350
          ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
          ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
          loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
          dev_attr_show+0x1b/0x60
          ? sysfs_read_file+0x86/0x1a0
          ? __get_free_pages+0x12/0x50
          sysfs_read_file+0xaf/0x1a0
      
        ioctl(LOOP_CLR_FD):
          wait_for_common+0x12c/0x180
          ? try_to_wake_up+0x2a0/0x2a0
          wait_for_completion+0x18/0x20
          sysfs_deactivate+0x178/0x180
          ? sysfs_addrm_finish+0x43/0x70
          ? sysfs_addrm_start+0x1d/0x20
          sysfs_addrm_finish+0x43/0x70
          sysfs_hash_and_remove+0x85/0xa0
          sysfs_remove_group+0x59/0x100
          loop_clr_fd+0x1dc/0x3f0 [loop]
          lo_ioctl+0x223/0x7a0 [loop]
      
      Instead of taking the lo_ctl_mutex from sysfs code, take the inner
      lo->lo_lock, to protect the access to the backing_file data.
      
      Thanks to Tejun for help debugging and finding a solution.
      
      Cc: Milan Broz <mbroz@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices · d134b00b
      Kay Sievers committed
      Instead of unconditionally creating a fixed number of dead loop
      devices which need to be investigated by storage handling services,
      even when they are never used, we allow distros to start with 0
      loop devices and have losetup(8) and similar switch to the dynamic
      /dev/loop-control interface instead of searching /dev/loop%i for free
      devices.
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • loop: add management interface for on-demand device allocation · 770fe30a
      Kay Sievers committed
      Loop devices today have a fixed pre-allocated number of usually 8.
      The number can only be changed at module init time. To find a free
      device to use, /dev/loop%i needs to be scanned, and all devices need
      to be opened until a free one is possibly found.
      
      This adds a new /dev/loop-control device node, which allows userspace to
      dynamically find or allocate a free device, and to add and remove loop
      devices from the running system:
       LOOP_CTL_ADD adds a specific device. Arg is the number
       of the device. It returns the device i or a negative
       error code.
      
       LOOP_CTL_REMOVE removes a specific device. Arg is the
       number of the device. It returns the device i or a negative
       error code.
      
       LOOP_CTL_GET_FREE finds the next unbound device or allocates
       a new one. No arg is given. It returns the device i or a
       negative error code.
      
      The loop kernel module gets automatically loaded when
      /dev/loop-control is accessed the first time. The alias
      specified in the module instructs udev to create this
      'dead' device node, even when the module is not loaded.
      
      Example:
       cfd = open("/dev/loop-control", O_RDWR);
      
       /* add a new specific loop device */
       err = ioctl(cfd, LOOP_CTL_ADD, devnr);
      
       /* remove a specific loop device */
       err = ioctl(cfd, LOOP_CTL_REMOVE, devnr);
      
       /* find or allocate a free loop device to use */
       devnr = ioctl(cfd, LOOP_CTL_GET_FREE);
      
       sprintf(loopname, "/dev/loop%i", devnr);
       ffd = open("backing-file", O_RDWR);
       lfd = open(loopname, O_RDWR);
       err = ioctl(lfd, LOOP_SET_FD, ffd);
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Karel Zak <kzak@redhat.com>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
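The find-and-bind sequence from the commit message can be wrapped into one function with error handling. This is a sketch under assumptions: the function name is hypothetical, the headers are those of a Linux system with <linux/loop.h>, and the caller needs sufficient privileges for the ioctls to succeed.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/loop.h>

/* Bind backing_path to a free loop device; returns the loop fd or -1 on error. */
int attach_to_free_loop(const char *backing_path)
{
	char loopname[32];
	int ffd, cfd, lfd, devnr;

	ffd = open(backing_path, O_RDWR);
	if (ffd < 0)
		return -1;

	cfd = open("/dev/loop-control", O_RDWR);
	if (cfd < 0) {
		close(ffd);
		return -1;
	}

	/* find or allocate a free loop device */
	devnr = ioctl(cfd, LOOP_CTL_GET_FREE);
	close(cfd);
	if (devnr < 0) {
		close(ffd);
		return -1;
	}

	snprintf(loopname, sizeof(loopname), "/dev/loop%d", devnr);
	lfd = open(loopname, O_RDWR);
	if (lfd >= 0 && ioctl(lfd, LOOP_SET_FD, ffd) < 0) {
		close(lfd);
		lfd = -1;
	}

	/* the loop device keeps its own reference to the backing file */
	close(ffd);
	return lfd;
}
```

The backing file is opened first so that a bad path fails before any loop device is allocated.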
    • loop: replace linked list of allocated devices with an idr index · 34dd82af
      Kay Sievers committed
      Replace the linked list, that keeps track of allocated devices, with an
      idr index to allow a more efficient lookup of devices.
      
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  11. 27 May, 2011 1 commit
    • loop: export module parameters · ac04fee0
      Namhyung Kim committed
      Export 'max_loop' and 'max_part' parameters to sysfs so the user can see
      how many devices are allowed and how many partitions are supported.
      
      If 'max_loop' is 0, there is no restriction on the number of loop devices.
      The user can create and use as many devices as there are minor numbers
      available. If 'max_part' is 0, the device simply does not support
      partitioning.
      
      Also note that 'max_part' may be adjusted to the form of a power of 2
      minus 1 if needed. Users should check this value after the module is
      loaded if they want to use that number correctly (e.g. with fdisk or mknod).
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
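The "power of 2 minus 1" adjustment can be illustrated with a small model. It assumes the driver derives a shift from the bit length of max_part and rounds up from there; both function names are hypothetical.

```c
#include <assert.h>

/* Number of bits needed to represent v (0 for v == 0). */
unsigned int bit_length(unsigned int v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/* Round max_part up to the nearest value of the form 2^n - 1. */
unsigned int adjust_max_part(unsigned int max_part)
{
	unsigned int shift = bit_length(max_part);

	assert(shift < 32);	/* keep the shift below well defined */
	return (1u << shift) - 1;
}
```

Under this model a request for 10 partitions becomes 15, while 15 stays 15 and 0 stays 0, which is why the effective value should be read back from sysfs.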
  12. 24 May, 2011 2 commits
    • loop: handle on-demand devices correctly · a1c15c59
      Namhyung Kim committed
      When finding or allocating a loop device, loop_probe() did not take
      partition numbers into account, so it could end up with a different
      device. Consider the following example:
      
      $ sudo modprobe loop max_part=15
      $ ls -l /dev/loop*
      brw-rw---- 1 root disk 7,   0 2011-05-24 22:16 /dev/loop0
      brw-rw---- 1 root disk 7,  16 2011-05-24 22:16 /dev/loop1
      brw-rw---- 1 root disk 7,  32 2011-05-24 22:16 /dev/loop2
      brw-rw---- 1 root disk 7,  48 2011-05-24 22:16 /dev/loop3
      brw-rw---- 1 root disk 7,  64 2011-05-24 22:16 /dev/loop4
      brw-rw---- 1 root disk 7,  80 2011-05-24 22:16 /dev/loop5
      brw-rw---- 1 root disk 7,  96 2011-05-24 22:16 /dev/loop6
      brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7
      $ sudo mknod /dev/loop8 b 7 128
      $ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img
      $ sudo losetup -a
      /dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img)
      $ ls -l /dev/loop*
      brw-rw---- 1 root disk 7,    0 2011-05-24 22:16 /dev/loop0
      brw-rw---- 1 root disk 7,   16 2011-05-24 22:16 /dev/loop1
      brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128
      brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1
      brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2
      brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3
      brw-rw---- 1 root disk 7,   32 2011-05-24 22:16 /dev/loop2
      brw-rw---- 1 root disk 7,   48 2011-05-24 22:16 /dev/loop3
      brw-rw---- 1 root disk 7,   64 2011-05-24 22:16 /dev/loop4
      brw-rw---- 1 root disk 7,   80 2011-05-24 22:16 /dev/loop5
      brw-rw---- 1 root disk 7,   96 2011-05-24 22:16 /dev/loop6
      brw-rw---- 1 root disk 7,  112 2011-05-24 22:16 /dev/loop7
      brw-r--r-- 1 root root 7,  128 2011-05-24 22:17 /dev/loop8
      
      After this patch, /dev/loop8 - instead of /dev/loop128 - is
      accessed correctly.
      
      In addition, the 'range' passed to blk_register_region() should
      cover the whole range of dev_t that LOOP_MAJOR can address. It does
      not need to be limited by partition numbers unless the 'max_loop'
      param was specified.
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
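The listing above follows from simple shift arithmetic. The sketch below assumes each loop device reserves 2^part_shift minor numbers (part_shift = 4 for max_part=15); the function names are hypothetical.

```c
#include <assert.h>

/* First minor number of loop device `index`. */
unsigned int loop_first_minor(unsigned int index, unsigned int part_shift)
{
	assert(part_shift < 20);	/* Linux minor numbers are 20 bits wide */
	return index << part_shift;
}

/* Loop device index that owns `minor`. */
unsigned int loop_index_from_minor(unsigned int minor, unsigned int part_shift)
{
	assert(part_shift < 20);
	return minor >> part_shift;
}
```

With part_shift = 4, minor 128 belongs to index 8. Using the minor directly as the index gives 128, whose first minor is 128 << 4 = 2048, which matches the /dev/loop128 entry in the listing.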
    • loop: limit 'max_part' module param to DISK_MAX_PARTS · 78f4bb36
      Namhyung Kim committed
      The 'max_part' parameter controls the maximum number of partitions
      a loop block device can have. However, if a user specifies a very
      large value, it would exceed the limit of the device minor number
      and can cause a kernel panic (or, at least, produce invalid
      device nodes in some cases).
      
      On my desktop system, the following command kills the kernel. On qemu,
      it triggers a similar oops but the kernel stays alive:
      
      $ sudo modprobe loop max_part=100000
       ------------[ cut here ]------------
       kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
       invalid opcode: 0000 [#1] SMP
       last sysfs file:
       CPU 0
       Modules linked in: loop(+)
      
       Pid: 43, comm: insmod Tainted: G        W   2.6.39-qemu+ #155 Bochs Bochs
       RIP: 0010:[<ffffffff8113ce61>]  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
       RSP: 0018:ffff880007b3fde8  EFLAGS: 00000246
       RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4
       RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878
       RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000
       R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50
       R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800
       FS:  0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
       Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c0)
       Stack:
        ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b
        0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868
        0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8
       Call Trace:
        [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af
        [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e
        [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12
        [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16
        [<ffffffff8116a090>] blk_register_queue+0x47/0xf7
        [<ffffffff8116f527>] add_disk+0xdf/0x290
        [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop]
        [<ffffffffa0006000>] ? 0xffffffffa0005fff
        [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
        [<ffffffff81096804>] sys_init_module+0x9c/0x1e0
        [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b
       Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe 48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49
       RIP  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
        RSP <ffff880007b3fde8>
       ---[ end trace a123eb592043acad ]---
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Laurent Vivier <Laurent.Vivier@bull.net>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
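A bound check of the kind the title describes might look like the sketch below. The value 256 for DISK_MAX_PARTS is an assumption about the kernel constant of that era, and the function name and the -EINVAL return are illustrative rather than taken from the patch.

```c
#include <errno.h>

#define DISK_MAX_PARTS 256	/* assumed value of the kernel constant */

/* Returns 0 if max_part fits within DISK_MAX_PARTS minors, -EINVAL otherwise. */
int check_max_part(unsigned int max_part)
{
	unsigned int part_shift = 0;

	/* smallest shift for which 2^part_shift exceeds max_part */
	while (part_shift < 31 && (1u << part_shift) <= max_part)
		part_shift++;

	/* minors reserved per device must not exceed the per-disk limit */
	if ((1u << part_shift) > DISK_MAX_PARTS)
		return -EINVAL;

	return 0;
}
```

Rejecting an oversized value at module load time keeps the failure in parameter validation instead of deep inside add_disk(), where the trace above ends up.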
  13. 10 Mar, 2011 1 commit
  14. 04 Mar, 2011 1 commit
  15. 03 Mar, 2011 1 commit
  16. 19 Jan, 2011 1 commit
    • loop: queue_lock NULL pointer derefence in blk_throtl_exit · ee71a968
      Sergey Senozhatsky committed
      Performing
      $ sudo mount -o loop -o umask=0 /dev/sdb1 /mnt/
      mount: wrong fs type, bad option, bad superblock on /dev/loop0,
             missing codepage or helper program, or other error
             In some cases useful info is found in syslog - try
             dmesg | tail  or so
      
      $ sudo modprobe -r loop
      
      results in oops:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
       IP: [<ffffffff812479d4>] do_raw_spin_lock+0x14/0x122
       Process modprobe (pid: 6189, threadinfo ffff88009a898000, task ffff880154a88000)
       Call Trace:
        [<ffffffff81486788>] _raw_spin_lock_irq+0x4a/0x51
        [<ffffffff8123404b>] ? blk_throtl_exit+0x3b/0xa0
        [<ffffffff8105b120>] ? cancel_delayed_work_sync+0xd/0xf
        [<ffffffff8123404b>] blk_throtl_exit+0x3b/0xa0
        [<ffffffff81229bc8>] blk_release_queue+0x21/0x65
        [<ffffffff8123bb06>] kobject_release+0x51/0x66
        [<ffffffff8123bab5>] ? kobject_release+0x0/0x66
        [<ffffffff8123ce1e>] kref_put+0x43/0x4d
        [<ffffffff8123ba27>] kobject_put+0x47/0x4b
        [<ffffffff8122717c>] blk_cleanup_queue+0x56/0x5b
        [<ffffffffa01c3824>] loop_exit+0x68/0x844 [loop]
        [<ffffffff8107cccc>] sys_delete_module+0x1e8/0x25b
        [<ffffffff814864c9>] ? trace_hardirqs_on_thunk+0x3a/0x3f
        [<ffffffff81002112>] system_call_fastpath+0x16/0x1b
      
      because of an attempt to acquire a NULL queue_lock.
      I added the same lines as in blk_queue_make_request -
      `fall back to embedded per-queue lock'.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  17. 20 Dec, 2010 1 commit
  18. 17 Dec, 2010 1 commit
  19. 10 Nov, 2010 1 commit
    • block: remove REQ_HARDBARRIER · 02e031cb
      Christoph Hellwig committed
      REQ_HARDBARRIER is dead now, so remove the leftovers.  What's left
      at this point is:
      
       - various checks inside the block layer.
       - sanity checks in bio based drivers.
       - now unused bio_empty_barrier helper.
       - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it has been dead for a
         while, but Xen really needs to sort out its barrier situation.
       - setting of ordered tags in uas - dead code copied from old scsi
         drivers.
       - scsi different retry for barriers - it's dead and should have been
         removed when flushes were converted to FS requests.
       - blktrace handling of barriers - removed.  Someone who knows blktrace
         better should add support for REQ_FLUSH and REQ_FUA, though.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  20. 28 Oct, 2010 1 commit
  21. 27 Oct, 2010 1 commit
  22. 05 Oct, 2010 1 commit
    • block: autoconvert trivial BKL users to private mutex · 2a48fc0a
      Arnd Bergmann committed
      The block device drivers have all gained new lock_kernel
      calls from a recent pushdown, and some of the drivers
      were already using the BKL before.
      
      This turns the BKL into a set of per-driver mutexes.
      Still need to check whether this is safe to do.
      
      file=$1
      name=$2
      if grep -q lock_kernel ${file} ; then
          if grep -q 'include.*linux.mutex.h' ${file} ; then
                  sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
          else
                  sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
          fi
          sed -i ${file} \
              -e "/^#include.*linux.mutex.h/,$ {
                      1,/^\(static\|int\|long\)/ {
                           /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);
      
      } }"  \
          -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
          -e '/[      ]*cycle_kernel_lock();/d'
      else
          sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                      -e '/cycle_kernel_lock()/d'
      fi
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  23. 10 Sep, 2010 3 commits
    • block/loop: implement REQ_FLUSH/FUA support · 6259f284
      Tejun Heo committed
      Deprecate REQ_HARDBARRIER and implement REQ_FLUSH/FUA instead.  Also,
      instead of checking file->f_op->fsync() directly, look at the return
      value of vfs_fsync() and ignore an -EINVAL return.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
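The "ignore -EINVAL" policy has a direct userspace analogue with fsync(2), which fails with EINVAL on descriptors that do not support synchronization. The helper below is a hypothetical sketch of that policy, not the driver's code; mapping all other failures to -EIO is an assumption.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Flush fd. "Sync not supported" (EINVAL) counts as success;
 * any other failure is reported as an I/O error.
 */
int flush_backing_fd(int fd)
{
	if (fsync(fd) == 0 || errno == EINVAL)
		return 0;

	return -EIO;
}
```

Asking for the flush and inspecting the result avoids having to know in advance whether the backing file implements fsync at all.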
    • block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() · 4913efe4
      Tejun Heo committed
      Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
      requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
      -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
      blk_queue_flush().
      
      blk_queue_flush() takes combinations of REQ_FLUSH and REQ_FUA.  If a
      device has write cache and can flush it, it should set REQ_FLUSH.  If
      the device can handle FUA writes, it should also set REQ_FUA.
      
      All blk_queue_ordered() users are converted.
      
      * ORDERED_DRAIN is mapped to 0 which is the default value.
      * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
      * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Pierre Ossman <drzeus@drzeus.cx>
      Cc: Stefan Weinhuber <wein@de.ibm.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
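The three conversions listed above amount to a small lookup. In this sketch the enum names follow the commit text, while the bit values of REQ_FLUSH and REQ_FUA are placeholders chosen for illustration and do not match the kernel's definitions.

```c
#include <assert.h>

/* Placeholder bit values; the kernel defines the real flags elsewhere. */
enum { REQ_FLUSH = 1 << 0, REQ_FUA = 1 << 1 };

enum ordered_mode {
	ORDERED_DRAIN,
	ORDERED_DRAIN_FLUSH,
	ORDERED_DRAIN_FLUSH_FUA
};

/* Map an old ordered mode to the flags passed to blk_queue_flush(). */
unsigned int ordered_to_flush_flags(enum ordered_mode mode)
{
	switch (mode) {
	case ORDERED_DRAIN:
		return 0;			/* the default value */
	case ORDERED_DRAIN_FLUSH:
		return REQ_FLUSH;
	case ORDERED_DRAIN_FLUSH_FUA:
		return REQ_FLUSH | REQ_FUA;
	}

	assert(!"unknown ordered mode");
	return 0;
}
```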
    • block/loop: queue ordered mode should be DRAIN_FLUSH · 589d7ed0
      Tejun Heo committed
      loop implements FLUSH using fsync but was incorrectly setting its
      ordered mode to DRAIN.  Change it to DRAIN_FLUSH.  In practice, this
      doesn't change anything as loop doesn't make use of the block layer
      ordered implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  24. 23 Aug, 2010 2 commits
  25. 08 Aug, 2010 3 commits
  26. 22 May, 2010 2 commits
    • sanitize vfs_fsync calling conventions · 8018ab05
      Christoph Hellwig committed
      Now that the last user passing a NULL file pointer is gone we can remove
      the redundant dentry argument and associated hacks inside vfs_fsync_range.
      
      The next step will be removing the dentry argument from ->fsync, but given
      the luck with the last round of method prototype changes I'd rather
      defer this until after the main merge window.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • generate "change" uevent for loop device · c3473c63
      David Zeuthen committed
      Recent udev versions probe loop devices for filesystems meaning that
      the /dev/disk hierarchy may contain useful entries such as
      
       $ ls -l /dev/disk/by-label/Fedora-12-x86_64-Live
       lrwxrwxrwx 1 root root 11 Mar 11 13:41 /dev/disk/by-label/Fedora-12-x86_64-Live -> ../../loop0
      
      Unfortunately, no "change" uevent is generated when the loop device is
      detached so the symlink persists. Additionally, no "change" uevent is
      guaranteed to be generated when attaching an fd or changing capacity.
      For example,  user space could open the loop device O_RDONLY (in fact,
      recent util-linux-ng does this) so udev's OPTIONS+="watch" machinery may
      not trigger the "change" uevent.
      
      This patch ensures that the "change" uevent is generated in all of
      these cases. As a result, the /dev/disk hierarchy works as expected
      for loop devices.
      Signed-off-by: David Zeuthen <davidz@redhat.com>
      Acked-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  27. 09 Apr, 2010 1 commit
  28. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the following
      script is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>