1. 10 4月, 2018 1 次提交
    • T
      block/loop: fix deadlock after loop_set_status · 1e047eaa
      Tetsuo Handa 提交于
      syzbot is reporting deadlocks at __blkdev_get() [1].
      
      ----------------------------------------
      [   92.493919] systemd-udevd   D12696   525      1 0x00000000
      [   92.495891] Call Trace:
      [   92.501560]  schedule+0x23/0x80
      [   92.502923]  schedule_preempt_disabled+0x5/0x10
      [   92.504645]  __mutex_lock+0x416/0x9e0
      [   92.510760]  __blkdev_get+0x73/0x4f0
      [   92.512220]  blkdev_get+0x12e/0x390
      [   92.518151]  do_dentry_open+0x1c3/0x2f0
      [   92.519815]  path_openat+0x5d9/0xdc0
      [   92.521437]  do_filp_open+0x7d/0xf0
      [   92.527365]  do_sys_open+0x1b8/0x250
      [   92.528831]  do_syscall_64+0x6e/0x270
      [   92.530341]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [   92.931922] 1 lock held by systemd-udevd/525:
      [   92.933642]  #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x73/0x4f0
      ----------------------------------------
      
      The reason of deadlock turned out that wait_event_interruptible() in
      blk_queue_enter() got stuck with bdev->bd_mutex held at __blkdev_put()
      due to q->mq_freeze_depth == 1.
      
      ----------------------------------------
      [   92.787172] a.out           S12584   634    633 0x80000002
      [   92.789120] Call Trace:
      [   92.796693]  schedule+0x23/0x80
      [   92.797994]  blk_queue_enter+0x3cb/0x540
      [   92.803272]  generic_make_request+0xf0/0x3d0
      [   92.807970]  submit_bio+0x67/0x130
      [   92.810928]  submit_bh_wbc+0x15e/0x190
      [   92.812461]  __block_write_full_page+0x218/0x460
      [   92.815792]  __writepage+0x11/0x50
      [   92.817209]  write_cache_pages+0x1ae/0x3d0
      [   92.825585]  generic_writepages+0x5a/0x90
      [   92.831865]  do_writepages+0x43/0xd0
      [   92.836972]  __filemap_fdatawrite_range+0xc1/0x100
      [   92.838788]  filemap_write_and_wait+0x24/0x70
      [   92.840491]  __blkdev_put+0x69/0x1e0
      [   92.841949]  blkdev_close+0x16/0x20
      [   92.843418]  __fput+0xda/0x1f0
      [   92.844740]  task_work_run+0x87/0xb0
      [   92.846215]  do_exit+0x2f5/0xba0
      [   92.850528]  do_group_exit+0x34/0xb0
      [   92.852018]  SyS_exit_group+0xb/0x10
      [   92.853449]  do_syscall_64+0x6e/0x270
      [   92.854944]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [   92.943530] 1 lock held by a.out/634:
      [   92.945105]  #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x3c/0x1e0
      ----------------------------------------
      
      The reason of q->mq_freeze_depth == 1 turned out that loop_set_status()
      forgot to call blk_mq_unfreeze_queue() at error paths for
      info->lo_encrypt_type != NULL case.
      
      ----------------------------------------
      [   37.509497] CPU: 2 PID: 634 Comm: a.out Tainted: G        W        4.16.0+ #457
      [   37.513608] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
      [   37.518832] RIP: 0010:blk_freeze_queue_start+0x17/0x40
      [   37.521778] RSP: 0018:ffffb0c2013e7c60 EFLAGS: 00010246
      [   37.524078] RAX: 0000000000000000 RBX: ffff8b07b1519798 RCX: 0000000000000000
      [   37.527015] RDX: 0000000000000002 RSI: ffffb0c2013e7cc0 RDI: ffff8b07b1519798
      [   37.529934] RBP: ffffb0c2013e7cc0 R08: 0000000000000008 R09: 47a189966239b898
      [   37.532684] R10: dad78b99b278552f R11: 9332dca72259d5ef R12: ffff8b07acd73678
      [   37.535452] R13: 0000000000004c04 R14: 0000000000000000 R15: ffff8b07b841e940
      [   37.538186] FS:  00007fede33b9740(0000) GS:ffff8b07b8e80000(0000) knlGS:0000000000000000
      [   37.541168] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   37.543590] CR2: 00000000206fdf18 CR3: 0000000130b30006 CR4: 00000000000606e0
      [   37.546410] Call Trace:
      [   37.547902]  blk_freeze_queue+0x9/0x30
      [   37.549968]  loop_set_status+0x67/0x3c0 [loop]
      [   37.549975]  loop_set_status64+0x3b/0x70 [loop]
      [   37.549986]  lo_ioctl+0x223/0x810 [loop]
      [   37.549995]  blkdev_ioctl+0x572/0x980
      [   37.550003]  block_ioctl+0x34/0x40
      [   37.550006]  do_vfs_ioctl+0xa7/0x6d0
      [   37.550017]  ksys_ioctl+0x6b/0x80
      [   37.573076]  SyS_ioctl+0x5/0x10
      [   37.574831]  do_syscall_64+0x6e/0x270
      [   37.576769]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      ----------------------------------------
      
      [1] https://syzkaller.appspot.com/bug?id=cd662bc3f6022c0979d01a262c318fab2ee9b56fSigned-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: Nsyzbot <bot+48594378e9851eab70bcd6f99327c7db58c5a28a@syzkaller.appspotmail.com>
      Fixes: ecdd0959 ("block/loop: fix race between I/O and set_status")
      Cc: Ming Lei <tom.leiming@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      1e047eaa
  2. 28 3月, 2018 2 次提交
    • O
      loop: use killable lock in ioctls · 3148ffbd
      Omar Sandoval 提交于
      Even after the previous patch to drop lo_ctl_mutex while calling
      vfs_getattr(), there are other cases where we can end up sleeping for a
      long time while holding lo_ctl_mutex. Let's avoid the uninterruptible
      sleep from the ioctls.
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      3148ffbd
    • O
      loop: don't call into filesystem while holding lo_ctl_mutex · 2d1d4c1e
      Omar Sandoval 提交于
      We hit an issue where a loop device on NFS was stuck in
      loop_get_status() doing vfs_getattr() after the NFS server died, which
      caused a pile-up of uninterruptible processes waiting on lo_ctl_mutex.
      There's no reason to hold this lock while we wait on the filesystem;
      let's drop it so that other processes can do their thing. We need to
      grab a reference on lo_backing_file while we use it, and we can get rid
      of the check on lo_device, which has been unnecessary since commit
      a34c0ae9ebd6 ("[PATCH] loop: remove the bio remapping capability") in
      the linux-history tree.
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      2d1d4c1e
  3. 18 3月, 2018 1 次提交
    • B
      block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> · 233bde21
      Bart Van Assche 提交于
      It happens often while I'm preparing a patch for a block driver that
      I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT
      available for this driver? Do I have to introduce definitions of these
      constants before I can use these constants? To avoid this confusion,
      move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the
      <linux/blkdev.h> header file such that these become available for all
      block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h
      header file conditional to avoid that including that header file after
      <linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE
      redefinition.
      
      Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have
      not been removed from uapi header files nor from NAND drivers in
      which these constants are used for another purpose than converting
      block layer offsets and sizes into a number of sectors.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Reviewed-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      233bde21
  4. 09 3月, 2018 4 次提交
    • R
      loop: Fix lost writes caused by missing flag · 1d037577
      Ross Zwisler 提交于
      The following commit:
      
      commit aa4d8616 ("block: loop: switch to VFS ITER_BVEC")
      
      replaced __do_lo_send_write(), which used ITER_KVEC iterators, with
      lo_write_bvec() which uses ITER_BVEC iterators.  In this change, though,
      the WRITE flag was lost:
      
      -       iov_iter_kvec(&from, ITER_KVEC | WRITE, &kvec, 1, len);
      +       iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len);
      
      This flag is necessary for the DAX case because we make decisions based on
      whether or not the iterator is a READ or a WRITE in dax_iomap_actor() and
      in dax_iomap_rw().
      
      We end up going through this path in configurations where we combine a PMEM
      device with 4k sectors, a loopback device and DAX.  The consequence of this
      missed flag is that what we intend as a write actually turns into a read in
      the DAX code, so no data is ever written.
      
      The very simplest test case is to create a loopback device and try and
      write a small string to it, then hexdump a few bytes of the device to see
      if the write took.  Without this patch you read back all zeros, with this
      you read back the string you wrote.
      
      For XFS this causes us to fail or panic during the following xfstests:
      
      	xfs/074 xfs/078 xfs/216 xfs/217 xfs/250
      
      For ext4 we have a similar issue where writes never happen, but we don't
      currently have any xfstests that use loopback and show this issue.
      
      Fix this by restoring the WRITE flag argument to iov_iter_bvec().  This
      causes the xfstests to all pass.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: stable@vger.kernel.org
      Fixes: commit aa4d8616 ("block: loop: switch to VFS ITER_BVEC")
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMing Lei <ming.lei@redhat.com>
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      1d037577
    • M
      cdrom: do not call check_disk_change() inside cdrom_open() · 2bbea6e1
      Maurizio Lombardi 提交于
      when mounting an ISO filesystem sometimes (very rarely)
      the system hangs because of a race condition between two tasks.
      
      PID: 6766   TASK: ffff88007b2a6dd0  CPU: 0   COMMAND: "mount"
       #0 [ffff880078447ae0] __schedule at ffffffff8168d605
       #1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49
       #2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995
       #3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef
       #4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod]
       #5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50
       #6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3
       #7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs]
       #8 [ffff880078447da8] mount_bdev at ffffffff81202570
       #9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs]
      #10 [ffff880078447e28] mount_fs at ffffffff81202d09
      #11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f
      #12 [ffff880078447ea8] do_mount at ffffffff81220fee
      #13 [ffff880078447f28] sys_mount at ffffffff812218d6
      #14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49
          RIP: 00007fd9ea914e9a  RSP: 00007ffd5d9bf648  RFLAGS: 00010246
          RAX: 00000000000000a5  RBX: ffffffff81698c49  RCX: 0000000000000010
          RDX: 00007fd9ec2bc210  RSI: 00007fd9ec2bc290  RDI: 00007fd9ec2bcf30
          RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000010
          R10: 00000000c0ed0001  R11: 0000000000000206  R12: 00007fd9ec2bc040
          R13: 00007fd9eb6b2380  R14: 00007fd9ec2bc210  R15: 00007fd9ec2bcf30
          ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b
      
      This task was trying to mount the cdrom.  It allocated and configured a
      super_block struct and owned the write-lock for the super_block->s_umount
      rwsem. While exclusively owning the s_umount lock, it called
      sr_block_ioctl and waited to acquire the global sr_mutex lock.
      
      PID: 6785   TASK: ffff880078720fb0  CPU: 0   COMMAND: "systemd-udevd"
       #0 [ffff880078417898] __schedule at ffffffff8168d605
       #1 [ffff880078417900] schedule at ffffffff8168dc59
       #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605
       #3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838
       #4 [ffff8800784179d0] down_read at ffffffff8168cde0
       #5 [ffff8800784179e8] get_super at ffffffff81201cc7
       #6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de
       #7 [ffff880078417a40] flush_disk at ffffffff8123a94b
       #8 [ffff880078417a88] check_disk_change at ffffffff8123ab50
       #9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom]
      #10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod]
      #11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86
      #12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65
      #13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b
      #14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7
      #15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf
      #16 [ffff880078417d00] do_last at ffffffff8120d53d
      #17 [ffff880078417db0] path_openat at ffffffff8120e6b2
      #18 [ffff880078417e48] do_filp_open at ffffffff8121082b
      #19 [ffff880078417f18] do_sys_open at ffffffff811fdd33
      #20 [ffff880078417f70] sys_open at ffffffff811fde4e
      #21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49
          RIP: 00007f29438b0c20  RSP: 00007ffc76624b78  RFLAGS: 00010246
          RAX: 0000000000000002  RBX: ffffffff81698c49  RCX: 0000000000000000
          RDX: 00007f2944a5fa70  RSI: 00000000000a0800  RDI: 00007f2944a5fa70
          RBP: 00007f2944a5f540   R8: 0000000000000000   R9: 0000000000000020
          R10: 00007f2943614c40  R11: 0000000000000246  R12: ffffffff811fde4e
          R13: ffff880078417f78  R14: 000000000000000c  R15: 00007f2944a4b010
          ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b
      
      This task tried to open the cdrom device, the sr_block_open function
      acquired the global sr_mutex lock. The call to check_disk_change()
      then saw an event flag indicating a possible media change and tried
      to flush any cached data for the device.
      As part of the flush, it tried to acquire the super_block->s_umount
      lock associated with the cdrom device.
      This was the same super_block as created and locked by the previous task.
      
      The first task acquires the s_umount lock and then the sr_mutex_lock;
      the second task acquires the sr_mutex_lock and then the s_umount lock.
      
      This patch fixes the issue by moving check_disk_change() out of
      cdrom_open() and let the caller take care of it.
      Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      2bbea6e1
    • B
      block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() · 8b904b5b
      Bart Van Assche 提交于
      This patch has been generated as follows:
      
      for verb in set_unlocked clear_unlocked set clear; do
        replace-in-files queue_flag_${verb} blk_queue_flag_${verb%_unlocked} \
          $(git grep -lw queue_flag_${verb} drivers block/bsg*)
      done
      
      Except for protecting all queue flag changes with the queue lock
      this patch does not change any functionality.
      
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      8b904b5b
    • B
      mtip32xx: Use the blk_queue_flag_*() functions · 4e699cb9
      Bart Van Assche 提交于
      Use the blk_queue_flag_*() functions instead of open-coding these.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      4e699cb9
  5. 08 3月, 2018 1 次提交
  6. 07 3月, 2018 1 次提交
  7. 01 3月, 2018 6 次提交
  8. 28 2月, 2018 1 次提交
  9. 27 2月, 2018 1 次提交
  10. 02 2月, 2018 1 次提交
    • A
      block: skd: fix incorrect linux/slab_def.h inclusion · 1d518775
      Arnd Bergmann 提交于
      skd includes slab_def.h to get access to the slab cache object size.
      However, including this header breaks when we use SLUB or SLOB instead of
      the SLAB allocator, since the structure layout is completely different,
      as shown by this warning when we build this driver in one of the invalid
      configurations with link-time optimizations enabled:
      
      include/linux/slab.h:715:0: error: type of 'kmem_cache_size' does not match original declaration [-Werror=lto-type-mismatch]
       unsigned int kmem_cache_size(struct kmem_cache *s);
      
      mm/slab_common.c:77:14: note: 'kmem_cache_size' was previously declared here
       unsigned int kmem_cache_size(struct kmem_cache *s)
                    ^
      mm/slab_common.c:77:14: note: code may be misoptimized unless -fno-strict-aliasing is used
      include/linux/slab.h:147:0: error: type of 'kmem_cache_destroy' does not match original declaration [-Werror=lto-type-mismatch]
       void kmem_cache_destroy(struct kmem_cache *);
      
      mm/slab_common.c:858:6: note: 'kmem_cache_destroy' was previously declared here
       void kmem_cache_destroy(struct kmem_cache *s)
            ^
      mm/slab_common.c:858:6: note: code may be misoptimized unless -fno-strict-aliasing is used
      include/linux/slab.h:140:0: error: type of 'kmem_cache_create' does not match original declaration [-Werror=lto-type-mismatch]
       struct kmem_cache *kmem_cache_create(const char *name, size_t size,
      
      mm/slab_common.c:534:1: note: 'kmem_cache_create' was previously declared here
       kmem_cache_create(const char *name, size_t size, size_t align,
       ^
      
      This removes the header inclusion and instead uses the kmem_cache_size()
      interface to get the size in a reliable way.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      1d518775
  11. 01 2月, 2018 1 次提交
    • S
      virtio_blk: print capacity at probe time · daf2a501
      Stefan Hajnoczi 提交于
      Print the capacity of the block device when the driver is probed.  Many
      users expect this since SCSI disks (sd) do it.  Moreover, kernel dmesg
      output is the primary source of troubleshooting information so it's
      helpful to include the disk size there.
      
      The capacity is already printed by virtio_blk when a resize event
      occurs.  Extract the code and reuse it from virtblk_probe().
      
      This patch also adds the block device name to the message so it can be
      correlated with a specific device:
      
        virtio_blk virtio0: [vda] 20971520 512-byte logical blocks (10.7 GB/10.0 GiB)
      
      Cc: Rodrigo A B Freire <rfreire@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      daf2a501
  12. 31 1月, 2018 1 次提交
    • M
      blk-mq: introduce BLK_STS_DEV_RESOURCE · 86ff7c2a
      Ming Lei 提交于
      This status is returned from driver to block layer if device related
      resource is unavailable, but driver can guarantee that IO dispatch
      will be triggered in future when the resource is available.
      
      Convert some drivers to return BLK_STS_DEV_RESOURCE.  Also, if driver
      returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after
      a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls.  BLK_MQ_DELAY_QUEUE is
      3 ms because both scsi-mq and nvmefc are using that magic value.
      
      If a driver can make sure there is in-flight IO, it is safe to return
      BLK_STS_DEV_RESOURCE because:
      
      1) If all in-flight IOs complete before examining SCHED_RESTART in
      blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue
      is run immediately in this case by blk_mq_dispatch_rq_list();
      
      2) if there is any in-flight IO after/when examining SCHED_RESTART
      in blk_mq_dispatch_rq_list():
      - if SCHED_RESTART isn't set, queue is run immediately as handled in 1)
      - otherwise, this request will be dispatched after any in-flight IO is
        completed via blk_mq_sched_restart()
      
      3) if SCHED_RESTART is set concurently in context because of
      BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two
      cases and make sure IO hang can be avoided.
      
      One invariant is that queue will be rerun if SCHED_RESTART is set.
      Suggested-by: NJens Axboe <axboe@kernel.dk>
      Tested-by: NLaurence Oberman <loberman@redhat.com>
      Signed-off-by: NMing Lei <ming.lei@redhat.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      86ff7c2a
  13. 30 1月, 2018 1 次提交
  14. 29 1月, 2018 3 次提交
  15. 27 1月, 2018 1 次提交
  16. 17 1月, 2018 1 次提交
    • T
      aoe: use ktime_t instead of timeval · 85cf955d
      Tina Ruchandani 提交于
      'struct frame' uses two variables to store the sent timestamp - 'struct
      timeval' and jiffies. jiffies is used to avoid discrepancies caused by
      updates to system time. 'struct timeval' is deprecated because it uses
      32-bit representation for seconds which will overflow in year 2038.
      
      This patch does the following:
      - Replace the use of 'struct timeval' and jiffies with ktime_t, which
        is the recommended type for timestamping
      - ktime_t provides both long range (like jiffies) and high resolution
        (like timeval). Using ktime_get (monotonic time) instead of wall-clock
        time prevents any discprepancies caused by updates to system time.
      
      [updates by Arnd below]
      The original patch from Tina never went anywhere as we discussed how
      to keep the impact on performance minimal. I've started over now but
      arrived at basically the same patch that she had originally, except for
      an slightly improved tsince_hr() function. I'm making it more robust
      against overflows, and also optimize explicitly for the common case
      in which a frame is less than 4.2 seconds old, using only a 32-bit
      division in that case.
      
      This should make the new version more efficient than the old code,
      since we replace the existing two 32-bit division in do_gettimeofday()
      plus one multiplication with a single single 32-bit division in
      tsince_hr() and drop the double bookkeeping. It's also more efficient
      than the ktime_get_us() API we discussed before, since that would
      also rely on multiple divisions.
      
      Link: https://lists.linaro.org/pipermail/y2038/2015-May/000276.htmlSigned-off-by: NTina Ruchandani <ruchandani.tina@gmail.com>
      Cc: Ed Cashin <ed.cashin@acm.org>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      85cf955d
  17. 11 1月, 2018 2 次提交
    • A
      null_blk: remove explicit 'select FAULT_INJECTION' · 33f782c4
      Arnd Bergmann 提交于
      Selecting FAULT_INJECTION causes a Kconfig warning when CONFIG_DEBUG_KERNEL
      is not set:
      
      warning: (BLK_DEV_NULL_BLK && DRM_I915_SELFTEST) selects FAULT_INJECTION which has unmet direct dependencies (DEBUG_KERNEL)
      
      The other drivers that use FAULT_INJECTION tend to have a separate
      Kconfig symbol for turning on that feature, so let's do the same
      thing here. This may add a bit more complexity than we like, but
      it avoids the warning and is more consistent with the rest of the
      kernel.
      
      Fixes: 93b57046 ("null_blk: add option for managing IO timeouts")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      33f782c4
    • J
      null_blk: add option for managing IO timeouts · 93b57046
      Jens Axboe 提交于
      Use the fault injection framework to provide a way for null_blk
      to configure timeouts. This only works for queue_mode 1 and 2,
      since the bio mode doesn't have code for tracking timeouts.
      
      Let's say you want to have a 10% chance of timing out every
      100,000 requests, and for 5 total timeouts, you could do:
      
      modprobe null_blk timeout="100000,10,0,5"
      
      This is useful for adding blktests to test that IO timeouts
      are handled appropriately.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      93b57046
  18. 10 1月, 2018 3 次提交
  19. 07 1月, 2018 3 次提交
  20. 06 1月, 2018 2 次提交
    • B
      pktcdvd: Fix a recently introduced NULL pointer dereference · 882d4171
      Bart Van Assche 提交于
      Call bdev_get_queue(bdev) after bdev->bd_disk has been initialized
      instead of just before that pointer has been initialized. This patch
      avoids that the following command
      
      pktsetup 1 /dev/sr0
      
      triggers the following kernel crash:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000548
      IP: pkt_setup_dev+0x2db/0x670 [pktcdvd]
      CPU: 2 PID: 724 Comm: pktsetup Not tainted 4.15.0-rc4-dbg+ #1
      Call Trace:
       pkt_ctl_ioctl+0xce/0x1c0 [pktcdvd]
       do_vfs_ioctl+0x8e/0x670
       SyS_ioctl+0x3c/0x70
       entry_SYSCALL_64_fastpath+0x23/0x9a
      Reported-by: NMaciej S. Szmigiero <mail@maciej.szmigiero.name>
      Fixes: commit ca18d6f7 ("block: Make most scsi_req_init() calls implicit")
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Tested-by: NMaciej S. Szmigiero <mail@maciej.szmigiero.name>
      Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Cc: <stable@vger.kernel.org> # v4.13
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      882d4171
    • B
      pktcdvd: Fix pkt_setup_dev() error path · 5a0ec388
      Bart Van Assche 提交于
      Commit 523e1d39 ("block: make gendisk hold a reference to its queue")
      modified add_disk() and disk_release() but did not update any of the
      error paths that trigger a put_disk() call after disk->queue has been
      assigned. That introduced the following behavior in the pktcdvd driver
      if pkt_new_dev() fails:
      
      Kernel BUG at 00000000e98fd882 [verbose debug info unavailable]
      
      Since disk_release() calls blk_put_queue() anyway if disk->queue != NULL,
      fix this by removing the blk_cleanup_queue() call from the pkt_setup_dev()
      error path.
      
      Fixes: commit 523e1d39 ("block: make gendisk hold a reference to its queue")
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Cc: <stable@vger.kernel.org> # v3.2
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      5a0ec388
  21. 05 1月, 2018 1 次提交
  22. 03 1月, 2018 1 次提交
  23. 21 12月, 2017 1 次提交
    • J
      null_blk: unalign call_single_data · 0864fe09
      Jens Axboe 提交于
      Commit 966a9671 randomly added alignment to this structure, but
      it's actually detrimental to performance of null_blk. Test case:
      
      Running on both the home and remote node shows a ~5% degradation
      in performance.
      
      While in there, move blk_status_t to the hole after the integer tag
      in the nullb_cmd structure. After this patch, we shrink the size
      from 192 to 152 bytes.
      
      Fixes: 966a9671 ("smp: Avoid using two cache lines for struct call_single_data")
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      0864fe09