1. 04 May, 2017 4 commits
    • blk-mq-debugfs: clean up flag definitions · 1a435111
      Omar Sandoval committed
      Make sure the spelled out flag names match the definition. This also
      adds a missing hctx state, BLK_MQ_S_START_ON_RUN, and a missing
      cmd_flag, __REQ_NOUNMAP.
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • blk-mq-debugfs: separate flags with | · bec03d6b
      Omar Sandoval committed
      This reads more naturally than spaces.
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
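
To illustrate the two debugfs commits above, here is a minimal userspace sketch of printing the names of set flag bits separated by `|`. It is not the kernel's code: the `demo_` names, the three flag names in the table, and the choice to print an unnamed bit as its bit number are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name table: index i names bit i. Illustrative only. */
static const char *const demo_flag_names[] = {
    "SHOULD_MERGE",   /* bit 0 */
    "TAG_SHARED",     /* bit 1 */
    "SG_MERGE",       /* bit 2 */
};

/*
 * Write the names of all set bits in 'flags' into 'buf', separated by '|'.
 * Bits beyond the name table are written as their bit number (an assumption
 * of this sketch). Stops cleanly if the next entry would not fit.
 * Returns the number of characters written, excluding the terminating NUL.
 */
static size_t demo_flags_show(unsigned long flags, char *buf, size_t len)
{
    const size_t nnames = sizeof(demo_flag_names) / sizeof(demo_flag_names[0]);
    size_t pos = 0;

    assert(len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < 8 * sizeof(flags); i++) {
        char num[24];
        const char *name;
        size_t need;

        if (!(flags & (1UL << i)))
            continue;
        if (i < nnames) {
            name = demo_flag_names[i];
        } else {
            snprintf(num, sizeof(num), "%zu", i);
            name = num;
        }
        need = strlen(name) + (pos ? 1 : 0);   /* name plus optional separator */
        if (pos + need >= len)
            break;                              /* entry would not fit */
        if (pos)
            buf[pos++] = '|';
        memcpy(buf + pos, name, strlen(name) + 1);
        pos += strlen(name);
    }
    return pos;
}
```

Calling `demo_flags_show(0x5, buf, sizeof(buf))` with a 64-byte buffer fills `buf` with `SHOULD_MERGE|SG_MERGE`. Keeping the name table indexed by bit position is what makes a missing or misspelled entry easy to spot against the flag definitions.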
    • block/mq: Cure cpu hotplug lock inversion · eabe0659
      Peter Zijlstra committed
      By poking at /debug/sched_features I triggered the following splat:
      
       [] ======================================================
       [] WARNING: possible circular locking dependency detected
       [] 4.11.0-00873-g964c8b7-dirty #694 Not tainted
       [] ------------------------------------------------------
       [] bash/2109 is trying to acquire lock:
       []  (cpu_hotplug_lock.rw_sem){++++++}, at: [<ffffffff8120cb8b>] static_key_slow_dec+0x1b/0x50
       []
       [] but task is already holding lock:
       []  (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
       []
       [] which lock already depends on the new lock.
       []
       []
       [] the existing dependency chain (in reverse order) is:
       []
       [] -> #2 (&sb->s_type->i_mutex_key#4){+++++.}:
       []        lock_acquire+0x100/0x210
       []        down_write+0x28/0x60
       []        start_creating+0x5e/0xf0
       []        debugfs_create_dir+0x13/0x110
       []        blk_mq_debugfs_register+0x21/0x70
       []        blk_mq_register_dev+0x64/0xd0
       []        blk_register_queue+0x6a/0x170
       []        device_add_disk+0x22d/0x440
       []        loop_add+0x1f3/0x280
       []        loop_init+0x104/0x142
       []        do_one_initcall+0x43/0x180
       []        kernel_init_freeable+0x1de/0x266
       []        kernel_init+0xe/0x100
       []        ret_from_fork+0x31/0x40
       []
       [] -> #1 (all_q_mutex){+.+.+.}:
       []        lock_acquire+0x100/0x210
       []        __mutex_lock+0x6c/0x960
       []        mutex_lock_nested+0x1b/0x20
       []        blk_mq_init_allocated_queue+0x37c/0x4e0
       []        blk_mq_init_queue+0x3a/0x60
       []        loop_add+0xe5/0x280
       []        loop_init+0x104/0x142
       []        do_one_initcall+0x43/0x180
       []        kernel_init_freeable+0x1de/0x266
       []        kernel_init+0xe/0x100
       []        ret_from_fork+0x31/0x40
      
       []  *** DEADLOCK ***
       []
       [] 3 locks held by bash/2109:
       []  #0:  (sb_writers#11){.+.+.+}, at: [<ffffffff81292bcd>] vfs_write+0x17d/0x1a0
       []  #1:  (debugfs_srcu){......}, at: [<ffffffff8155a90d>] full_proxy_write+0x5d/0xd0
       []  #2:  (&sb->s_type->i_mutex_key#4){+++++.}, at: [<ffffffff81140216>] sched_feat_write+0x86/0x170
       []
       [] stack backtrace:
       [] CPU: 9 PID: 2109 Comm: bash Not tainted 4.11.0-00873-g964c8b7-dirty #694
       [] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
       [] Call Trace:
      
       []  lock_acquire+0x100/0x210
       []  get_online_cpus+0x2a/0x90
       []  static_key_slow_dec+0x1b/0x50
       []  static_key_disable+0x20/0x30
       []  sched_feat_write+0x131/0x170
       []  full_proxy_write+0x97/0xd0
       []  __vfs_write+0x28/0x120
       []  vfs_write+0xb5/0x1a0
       []  SyS_write+0x49/0xa0
       []  entry_SYSCALL_64_fastpath+0x23/0xc2
      
      This is because of the cpu hotplug lock rework. Break the chain at #1
      by reversing the lock acquisition order. This way i_mutex_key#4 no
      longer depends on cpu_hotplug_lock and things are good.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
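
The reasoning in the commit above can be modelled with a tiny lock-order graph, in the spirit of the report but far simpler than the kernel's lock validator. This is a sketch under stated assumptions: the `demo_` names are hypothetical, and the three edges replayed in `demo_replay()` are one reading of the splat (cpu_hotplug_lock and all_q_mutex taken together at #1, i_mutex taken under all_q_mutex at #2, and cpu_hotplug_lock taken under i_mutex in sched_feat_write()).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum demo_lock { CPU_HOTPLUG, ALL_Q_MUTEX, I_MUTEX, DEMO_NR_LOCKS };

/* dep[a][b] is true when lock b has been acquired while lock a was held. */
struct demo_lockgraph {
    bool dep[DEMO_NR_LOCKS][DEMO_NR_LOCKS];
};

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool demo_reachable(const struct demo_lockgraph *g, int from, int to,
                           bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < DEMO_NR_LOCKS; next++)
        if (g->dep[from][next] && !seen[next] &&
            demo_reachable(g, next, to, seen))
            return true;
    return false;
}

/*
 * Record "newlock acquired while 'held' is held".
 * Returns false, recording nothing, if that edge would close a cycle,
 * i.e. if newlock already (transitively) precedes 'held'.
 */
static bool demo_acquire_while_holding(struct demo_lockgraph *g,
                                       enum demo_lock held,
                                       enum demo_lock newlock)
{
    bool seen[DEMO_NR_LOCKS] = { false };

    assert(held != newlock);
    if (demo_reachable(g, newlock, held, seen))
        return false;
    g->dep[held][newlock] = true;
    return true;
}

/*
 * Replay the three acquisition orders. 'hotplug_first' selects the
 * pre-fix order at #1; passing false models the reversed order.
 * Returns true when no circular dependency is found.
 */
static bool demo_replay(bool hotplug_first)
{
    struct demo_lockgraph g;
    bool ok = true;

    memset(&g, 0, sizeof(g));
    if (hotplug_first) {
        if (!demo_acquire_while_holding(&g, CPU_HOTPLUG, ALL_Q_MUTEX))
            ok = false;
    } else {
        if (!demo_acquire_while_holding(&g, ALL_Q_MUTEX, CPU_HOTPLUG))
            ok = false;
    }
    if (!demo_acquire_while_holding(&g, ALL_Q_MUTEX, I_MUTEX))
        ok = false;
    if (!demo_acquire_while_holding(&g, I_MUTEX, CPU_HOTPLUG))
        ok = false;
    return ok;
}
```

With the pre-fix order, `demo_replay(true)` returns false because cpu_hotplug_lock already leads to i_mutex through all_q_mutex. With the order reversed at #1, `demo_replay(false)` returns true, since nothing is ordered after cpu_hotplug_lock any more.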
    • blk-mq: don't use sync workqueue flushing from drivers · 2719aa21
      Jens Axboe committed
      A previous commit introduced the sync flush, which we need from
      internal callers like blk_mq_quiesce_queue(). However, we also
      call the stop helpers from drivers, particularly from ->queue_rq()
      when we have to stop processing for a bit. We can't block from
      those locations, and we don't have to guarantee that we're
      fully flushed.
      
      Fixes: 9f993737 ("blk-mq: unify hctx delayed_run_work and run_work")
      Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  2. 03 May, 2017 1 commit
  3. 02 May, 2017 3 commits
  4. 28 Apr, 2017 4 commits
  5. 27 Apr, 2017 10 commits
  6. 24 Apr, 2017 1 commit
  7. 22 Apr, 2017 2 commits
    • block: get rid of blk_integrity_revalidate() · 19b7ccf8
      Ilya Dryomov committed
      Commit 25520d55 ("block: Inline blk_integrity in struct gendisk")
      introduced blk_integrity_revalidate(), which seems to assume ownership
      of the stable pages flag and unilaterally clears it if no blk_integrity
      profile is registered:
      
          if (bi->profile)
                  disk->queue->backing_dev_info->capabilities |=
                          BDI_CAP_STABLE_WRITES;
          else
                  disk->queue->backing_dev_info->capabilities &=
                          ~BDI_CAP_STABLE_WRITES;
      
      It's called from revalidate_disk() and rescan_partitions(), making it
      impossible to enable stable pages for drivers that support partitions
      and don't use blk_integrity: while the call in revalidate_disk() can be
      trivially worked around (see zram, which doesn't support partitions and
      hence gets away with zram_revalidate_disk()), rescan_partitions() can
      be triggered from userspace at any time.  This breaks rbd, where the
      ceph messenger is responsible for generating/verifying CRCs.
      
      Since blk_integrity_{un,}register() "must" be used for (un)registering
      the integrity profile with the block layer, move BDI_CAP_STABLE_WRITES
      setting there.  This way drivers that call blk_integrity_register() and
      use integrity infrastructure won't interfere with drivers that don't
      but still want stable pages.
      
      Fixes: 25520d55 ("block: Inline blk_integrity in struct gendisk")
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org # 4.4+, needs backporting
      Tested-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
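
The effect described in the commit above can be shown with a small userspace model. This is only a sketch: `struct demo_disk`, `DEMO_CAP_STABLE_WRITES` and the `demo_` functions are hypothetical stand-ins for the kernel structures and helpers named in the commit message, not their real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CAP_STABLE_WRITES 0x1u   /* stand-in for BDI_CAP_STABLE_WRITES */

struct demo_disk {
    unsigned int capabilities;   /* stand-in for backing_dev_info->capabilities */
    const void *profile;         /* non-NULL while an integrity profile is registered */
};

/* Old behaviour, following the snippet quoted in the commit message:
 * every revalidate or rescan overwrites the flag based on the profile. */
static void demo_integrity_revalidate(struct demo_disk *d)
{
    if (d->profile)
        d->capabilities |= DEMO_CAP_STABLE_WRITES;
    else
        d->capabilities &= ~DEMO_CAP_STABLE_WRITES;
}

/* New placement: the flag is only touched when a profile is registered... */
static void demo_integrity_register(struct demo_disk *d, const void *profile)
{
    assert(profile != NULL);
    d->profile = profile;
    d->capabilities |= DEMO_CAP_STABLE_WRITES;
}

/* ...or unregistered. */
static void demo_integrity_unregister(struct demo_disk *d)
{
    d->capabilities &= ~DEMO_CAP_STABLE_WRITES;
    d->profile = NULL;
}

/* A partition rescan after the change: it leaves capabilities alone. */
static void demo_rescan_partitions(struct demo_disk *d)
{
    (void)d;
}

static bool demo_stable(const struct demo_disk *d)
{
    return (d->capabilities & DEMO_CAP_STABLE_WRITES) != 0;
}
```

A disk that sets the flag itself and never registers a profile (the rbd case in the commit message) keeps it across `demo_rescan_partitions()`, whereas a single call to `demo_integrity_revalidate()` would have cleared it.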
    • blk-mq: Fix preempt count imbalance · abc25a69
      Bart Van Assche committed
      Avoid triggering the following kernel bug:
      
      BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:349
      in_atomic(): 1, irqs_disabled(): 0, pid: 8019, name: find
      CPU: 10 PID: 8019 Comm: find Tainted: G        W I     4.11.0-rc4-dbg+ #2
      Call Trace:
       dump_stack+0x68/0x93
       ___might_sleep+0x16e/0x230
       __might_sleep+0x4a/0x80
       __ext4_get_inode_loc+0x1e0/0x4e0
       ext4_iget+0x70/0xbc0
       ext4_iget_normal+0x2f/0x40
       ext4_lookup+0xb6/0x1f0
       lookup_slow+0x104/0x1e0
       walk_component+0x19a/0x330
       path_lookupat+0x4b/0x100
       filename_lookup+0x9a/0x110
       user_path_at_empty+0x36/0x40
       vfs_statx+0x67/0xc0
       SYSC_newfstatat+0x20/0x40
       SyS_newfstatat+0xe/0x10
       entry_SYSCALL_64_fastpath+0x18/0xad
      
      This happens because the big if/else in blk_mq_make_request() doesn't
      have a final else section that also drops the ctx. Add that.
      
      Fixes: b00c53e8 ("blk-mq: fix schedule-while-atomic with scheduler attached")
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Cc: Omar Sandoval <osandov@fb.com>
      
      Added a bit more to the commit log.
      Signed-off-by: Jens Axboe <axboe@fb.com>
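
The imbalance fixed by the commit above comes down to a get/put pair where one branch forgets the put. The sketch below models that with a plain counter. It is an illustration under assumptions, not the kernel function: the `demo_` names, the three paths, and the `with_fix` switch are all hypothetical.

```c
#include <assert.h>

/* Stand-in for the preempt counter; nonzero means "atomic context". */
static int demo_preempt_count;

/* Models taking the ctx, which raises the counter. */
static void demo_get_ctx(void)
{
    demo_preempt_count++;
}

/* Models dropping the ctx, which lowers the counter again. */
static void demo_put_ctx(void)
{
    assert(demo_preempt_count > 0);
    demo_preempt_count--;
}

enum demo_path { DEMO_PLUGGED, DEMO_SCHEDULED, DEMO_OTHER };

/*
 * Shape of a request function with a large if/else chain. Every branch
 * must drop the ctx; 'with_fix' adds the final branch that does so for
 * the remaining case.
 */
static void demo_make_request(enum demo_path path, int with_fix)
{
    demo_get_ctx();
    if (path == DEMO_PLUGGED) {
        demo_put_ctx();
        /* ... plugged handling would go here ... */
    } else if (path == DEMO_SCHEDULED) {
        demo_put_ctx();
        /* ... scheduler handling would go here ... */
    } else if (with_fix) {
        demo_put_ctx();   /* the previously missing final drop */
    }
    /* Without the final branch, DEMO_OTHER returns with the count raised. */
}
```

After `demo_make_request(DEMO_OTHER, 0)` the counter stays at 1, which corresponds to `in_atomic(): 1` in the report: later code that may sleep then runs in what looks like atomic context. With `with_fix` set, the counter returns to 0 on every path.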
  8. 21 Apr, 2017 13 commits
  9. 20 Apr, 2017 2 commits