1. 08 Jun, 2018 2 commits
  2. 05 Jun, 2018 1 commit
  3. 03 Jun, 2018 1 commit
  4. 31 May, 2018 4 commits
  5. 29 May, 2018 4 commits
  6. 17 May, 2018 1 commit
  7. 14 May, 2018 1 commit
  8. 09 May, 2018 1 commit
    • block: consolidate struct request timestamp fields · 522a7775
      Omar Sandoval committed
      Currently, struct request has four timestamp fields:
      
      - A start time, set at get_request time, in jiffies, used for iostats
      - An I/O start time, set at start_request time, in ktime nanoseconds,
        used for blk-stats (i.e., wbt, kyber, hybrid polling)
      - Another start time and another I/O start time, used for cfq and bfq
      
      These can all be consolidated into one start time and one I/O start
      time, both in ktime nanoseconds, shaving off up to 16 bytes from struct
      request depending on the kernel config.
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
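The saving described in this commit can be illustrated with a minimal userspace sketch. The struct names `req_times_old` and `req_times_new` and all field names are hypothetical simplifications, not the kernel's actual `struct request` layout, and the old layout assumes the cfq/bfq fields are compiled in.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: four timestamps in mixed units. */
struct req_times_old {
	unsigned long start_time;        /* jiffies, used for iostats */
	uint64_t io_start_time_ns;       /* ktime ns, used for blk-stats */
	uint64_t sched_start_time_ns;    /* second start time, for cfq/bfq */
	uint64_t sched_io_start_time_ns; /* second I/O start time, for cfq/bfq */
};

/* Hypothetical "after" layout: two timestamps, both in nanoseconds. */
struct req_times_new {
	uint64_t start_time_ns;
	uint64_t io_start_time_ns;
};

/* Bytes saved by the consolidation on the build platform. */
size_t timestamp_bytes_saved(void)
{
	return sizeof(struct req_times_old) - sizeof(struct req_times_new);
}

/* Time a request waited between allocation and being started. */
uint64_t queue_delay_ns(const struct req_times_new *t)
{
	return t->io_start_time_ns - t->start_time_ns;
}
```

On a typical 64-bit build `timestamp_bytes_saved()` evaluates to 16, which matches the "up to 16 bytes" in the commit message; other ABIs or kernel configs would give a smaller figure.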
  9. 04 May, 2018 1 commit
  10. 03 May, 2018 6 commits
    • bcache: use pr_info() to inform duplicated CACHE_SET_IO_DISABLE set · 09a44ca2
      Coly Li committed
      It is possible that multiple I/O requests hit a failed cache device or
      backing device, so it is quite common that CACHE_SET_IO_DISABLE is
      already set when a task tries to set the bit from bch_cache_set_error().
      Currently the message "CACHE_SET_IO_DISABLE already set" is printed by
      pr_warn(), which might mislead users into thinking a serious fault has
      happened in the source code.
      
      This patch uses pr_info() to print the information in this situation,
      to avoid unnecessary worry. The information helps in understanding bcache
      behavior during cache device failures, so I still keep it in the source code.
      
      Fixes: 771f393e ("bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags")
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
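The duplicate-set case this commit talks about can be sketched as a test-and-set on a flag word. This is a simplified userspace model: `cache_set_flags_model`, `test_and_set_flag()`, `mark_io_disabled()` and the bit number are all assumptions for illustration, and the plain read-modify-write here stands in for the atomic bit operation kernel code would use.

```c
#include <stdbool.h>

#define CACHE_SET_IO_DISABLE 3 /* bit number is an assumption for this sketch */

enum log_level { LOG_NONE, LOG_INFO };

struct cache_set_flags_model {
	unsigned long flags;
};

/* Set the bit and report whether it was already set (not atomic here). */
bool test_and_set_flag(unsigned long *flags, int bit)
{
	unsigned long mask = 1UL << bit;
	bool was_set = (*flags & mask) != 0;

	*flags |= mask;
	return was_set;
}

/*
 * Returns the level at which the "already set" message would be logged:
 * LOG_INFO for the harmless duplicate case, LOG_NONE for the first setter.
 */
enum log_level mark_io_disabled(struct cache_set_flags_model *c)
{
	if (test_and_set_flag(&c->flags, CACHE_SET_IO_DISABLE))
		return LOG_INFO;
	return LOG_NONE;
}
```

The first caller sets the bit silently; every later caller only finds it already set, which is expected behavior under failure and therefore worth an informational message rather than a warning.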
    • bcache: set dc->io_disable to true in conditional_stop_bcache_device() · 4fd8e138
      Coly Li committed
      Commit 7e027ca4 ("bcache: add stop_when_cache_set_failed option to
      backing device") adds the stop_when_cache_set_failed option and stops the
      bcache device if stop_when_cache_set_failed is auto and there is dirty data
      on the broken cache device. There might exist a small time gap in which the
      cache set is released and set to NULL but the bcache device is not released
      yet (because they are released in parallel). During this time gap, dc->c is
      NULL so CACHE_SET_IO_DISABLE won't be checked, and dc->io_disable is still
      false, so newly arriving I/O requests will be accepted and go directly to
      the backing device because no cache set is attached. If there is dirty data
      on the cache device, this behavior may leave the data inconsistent.
      
      This patch sets dc->io_disable to true before calling bcache_device_stop()
      to make sure the backing device also rejects newly arriving I/O requests,
      so even in the small time gap no I/O goes directly to the backing device
      and corrupts data consistency.
      
      Fixes: 7e027ca4 ("bcache: add stop_when_cache_set_failed option to backing device")
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
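The time gap described in this commit can be modeled in a few lines. This is only a sketch under stated assumptions: `cache_set_model`, `cached_dev_model`, `should_reject_io()` and `stop_device_model()` are hypothetical names, and a plain boolean stands in for the CACHE_SET_IO_DISABLE flag bit.

```c
#include <stdbool.h>
#include <stddef.h>

struct cache_set_model {
	bool io_disable; /* stands in for the CACHE_SET_IO_DISABLE flag bit */
};

struct cached_dev_model {
	struct cache_set_model *c; /* NULL once the cache set is released */
	bool io_disable;
};

/* Decide whether a newly arriving I/O request must be rejected. */
bool should_reject_io(const struct cached_dev_model *dc)
{
	if (dc->io_disable)
		return true;
	/* The cache set flag can only be checked while dc->c is non-NULL. */
	return dc->c != NULL && dc->c->io_disable;
}

/* Model of the fix: refuse I/O first, then start tearing the device down. */
void stop_device_model(struct cached_dev_model *dc)
{
	dc->io_disable = true;
	/* bcache_device_stop() would be called at this point in the kernel. */
}
```

With `dc->c == NULL` and `dc->io_disable == false`, `should_reject_io()` returns false, which is the window the commit closes; once `stop_device_model()` has run, the device rejects I/O regardless of whether the cache set pointer is still valid.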
    • bcache: add wait_for_kthread_stop() in bch_allocator_thread() · ecb2ba8c
      Coly Li committed
      When CACHE_SET_IO_DISABLE is set on cache set flags, bcache allocator
      thread routine bch_allocator_thread() may stop the while-loops and
      exit. Then it is possible to observe the following kernel oops message,
      
      [  631.068366] bcache: bch_btree_insert() error -5
      [  631.069115] bcache: cached_dev_detach_finish() Caching disabled for sdf
      [  631.070220] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [  631.070250] PGD 0 P4D 0
      [  631.070261] Oops: 0002 [#1] SMP PTI
      [snipped]
      [  631.070578] Workqueue: events cache_set_flush [bcache]
      [  631.070597] RIP: 0010:exit_creds+0x1b/0x50
      [  631.070610] RSP: 0018:ffffc9000705fe08 EFLAGS: 00010246
      [  631.070626] RAX: 0000000000000001 RBX: ffff880a622ad300 RCX: 000000000000000b
      [  631.070645] RDX: 0000000000000601 RSI: 000000000000000c RDI: 0000000000000000
      [  631.070663] RBP: ffff880a622ad300 R08: ffffea00190c66e0 R09: 0000000000000200
      [  631.070682] R10: ffff880a48123000 R11: ffff880000000000 R12: 0000000000000000
      [  631.070700] R13: ffff880a4b160e40 R14: ffff880a4b160000 R15: 0ffff880667e2530
      [  631.070719] FS:  0000000000000000(0000) GS:ffff880667e00000(0000) knlGS:0000000000000000
      [  631.070740] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  631.070755] CR2: 0000000000000000 CR3: 000000000200a001 CR4: 00000000003606e0
      [  631.070774] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  631.070793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  631.070811] Call Trace:
      [  631.070828]  __put_task_struct+0x55/0x160
      [  631.070845]  kthread_stop+0xee/0x100
      [  631.070863]  cache_set_flush+0x11d/0x1a0 [bcache]
      [  631.070879]  process_one_work+0x146/0x340
      [  631.070892]  worker_thread+0x47/0x3e0
      [  631.070906]  kthread+0xf5/0x130
      [  631.070917]  ? max_active_store+0x60/0x60
      [  631.070930]  ? kthread_bind+0x10/0x10
      [  631.070945]  ret_from_fork+0x35/0x40
      [snipped]
      [  631.071017] RIP: exit_creds+0x1b/0x50 RSP: ffffc9000705fe08
      [  631.071033] CR2: 0000000000000000
      [  631.071045] ---[ end trace 011c63a24b22c927 ]---
      [  631.071085] bcache: bcache_device_free() bcache0 stopped
      
      The reason is that cache_set_flush() tries to call kthread_stop() to stop
      the allocator thread after it has already exited due to cache device I/O
      errors.
      
      This patch adds wait_for_kthread_stop() at the tail of
      bch_allocator_thread() to prevent the thread routine from exiting directly.
      The allocator thread then blocks in wait_for_kthread_stop() and waits for
      cache_set_flush() to stop it by calling kthread_stop().
      
      changelog:
      v3: add Reviewed-by from Hannes.
      v2: not directly return from allocator_wait(), move 'return 0' to tail of
          bch_allocator_thread().
      v1: initial version.
      
      Fixes: 771f393e ("bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags")
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: count backing device I/O error for writeback I/O · bf78980f
      Coly Li committed
      Commit c7b7bd07 ("bcache: add io_disable to struct cached_dev")
      counts backing device I/O requests and sets dc->io_disable to true if the
      error counter exceeds dc->io_error_limit. But it only counts I/O errors for
      regular I/O requests and neglects errors of writeback I/Os when the backing
      device is offline.
      
      This patch counts the errors of writeback I/Os. In dirty_endio(), if
      bio->bi_status is not 0, an error happened when writing dirty keys
      to the backing device, and bch_count_backing_io_errors() is called.
      
      With this fix, even if no regular I/O request is coming, if writeback I/O
      errors exceed dc->io_error_limit, the bcache device may still be stopped
      for the broken backing device.
      
      Fixes: c7b7bd07 ("bcache: add io_disable to struct cached_dev")
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
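The counting behavior this commit describes can be sketched as follows. The names `backing_dev_model`, `count_backing_io_error()` and `writeback_endio_model()` are hypothetical, the counter is a plain integer rather than an atomic, and treating "reaches the limit" as the trigger is an assumption of this sketch.

```c
#include <stdbool.h>

struct backing_dev_model {
	unsigned int io_errors;   /* errors seen so far */
	unsigned int error_limit; /* threshold for giving up on the device */
	bool io_disable;          /* set once the device is considered broken */
};

/* Count one failed I/O and disable the device when the limit is reached. */
void count_backing_io_error(struct backing_dev_model *dc)
{
	dc->io_errors++;
	if (dc->io_errors >= dc->error_limit)
		dc->io_disable = true;
}

/*
 * Completion handler for one writeback I/O. A non-zero status means that
 * writing a dirty key to the backing device failed, so it is counted the
 * same way as an error on a regular I/O request.
 */
void writeback_endio_model(struct backing_dev_model *dc, int status)
{
	if (status != 0)
		count_backing_io_error(dc);
}
```

Because the writeback completion path feeds the same counter, a device that only sees writeback traffic can still cross the limit and be disabled.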
    • bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error() · 6147305c
      Coly Li committed
      Commit c7b7bd07 ("bcache: add io_disable to struct cached_dev") tries
      to stop the bcache device by calling bcache_device_stop() when too many I/O
      errors have happened on the backing device. But if there is internal I/O
      happening on the cache device (writeback scan, garbage collection, etc.), a
      regular I/O request that triggers the internal I/Os may still hold a
      refcount of dc->count, and the refcount may only be dropped after the
      internal I/O has stopped.
      
      With this patch, bch_cached_dev_error() checks whether the backing device is
      attached to a cache set; if so, CACHE_SET_IO_DISABLE is set in the
      flags of this cache set. Then internal I/Os on the cache device are
      rejected and stopped immediately, and the bcache device can be stopped.
      
      For people who are not familiar with the interesting refcount dependence,
      let me explain a bit more how the fix works. For example, the writeback
      thread scans the cache device for dirty data to write back. Until it stops,
      it holds a refcount of dc->count. When the CACHE_SET_IO_DISABLE bit is set,
      the internal I/O stops and the while-loop in bch_writeback_thread()
      quits and calls cached_dev_put() to drop dc->count. If this is the last
      refcount to drop, then cached_dev_detach_finish() is called. In this
      callback function, closure_put(dc->disk.cl) is called in turn to drop a
      refcount of closure dc->disk.cl. If this is the last refcount of this
      closure to drop, then cached_dev_flush() is called. Then the cached
      device is freed. So if CACHE_SET_IO_DISABLE is not set, the bcache device
      cannot be stopped until all internal cache device I/O has stopped. For a
      large cache device where the writeback thread competes for locks with the
      gc thread, the wait might be quite long.
      
      Fixes: c7b7bd07 ("bcache: add io_disable to struct cached_dev")
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
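The refcount chain explained in this commit message can be traced with a small single-threaded model. Everything here is a hypothetical simplification: `dev_model` and the `*_model()` functions only mirror the order of calls named in the message, and the `pending_rounds` parameter is an invented stand-in for the amount of writeback scanning still to do.

```c
#include <stdbool.h>

struct dev_model {
	int dc_count;     /* models dc->count */
	int closure_refs; /* models the refcount of closure dc->disk.cl */
	bool detached;    /* the cached_dev_detach_finish() step ran */
	bool freed;       /* the cached_dev_flush() step ran, device freed */
};

/* Last closure reference dropped: flush and free the cached device. */
void closure_put_model(struct dev_model *d)
{
	if (--d->closure_refs == 0)
		d->freed = true;
}

/* Last dc->count reference dropped: finish detach, then drop the closure. */
void cached_dev_put_model(struct dev_model *d)
{
	if (--d->dc_count == 0) {
		d->detached = true;
		closure_put_model(d);
	}
}

/*
 * Writeback loop: it holds one dc->count reference until it stops. It stops
 * when its work is done or as soon as io_disabled is observed. Returns the
 * number of scan rounds that ran before the reference was dropped.
 */
int writeback_thread_model(struct dev_model *d, bool io_disabled,
			   int pending_rounds)
{
	int rounds = 0;

	while (pending_rounds > 0 && !io_disabled) {
		pending_rounds--;
		rounds++;
	}
	cached_dev_put_model(d);
	return rounds;
}
```

With `io_disabled` set, the loop exits without running any round and the device is freed right away; without it, the reference is held until every pending round has finished, which is the long wait the commit message describes.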
    • bcache: store disk name in struct cache and struct cached_dev · 6e916a7e
      Coly Li committed
      Current code uses bdevname() or bio_devname() to reference the gendisk
      disk name when bcache needs to display disk names in kernel messages.
      It was safe before the bcache device failure handling patch set was merged,
      because when devices failed, a deadlock prevented bcache from
      printing error messages with the gendisk disk name. But after the failure
      handling patch set was merged, the deadlock is fixed, so it is possible
      that the gendisk structure bdev->bd_disk is released when bdevname() is
      called to reference bdev->bd_disk->disk_name[]. This is why I received
      a bug report of a NULL pointer dereference panic.
      
      This patch stores the gendisk disk name in a buffer inside struct cache and
      struct cached_dev, so printing the offline device name no longer references
      bdev->bd_disk. This patch also avoids extra function calls
      of bdevname() and bio_devname().
      
      Changelog:
      v3, add Reviewed-by from Hannes.
      v2, call bdevname() earlier in register_bdev()
      v1, first version with suggestion from Junhui Tang.
      
      Fixes: c7b7bd07 ("bcache: add io_disable to struct cached_dev")
      Fixes: 5138ac67 ("bcache: fix misleading error message in bch_count_io_errors()")
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
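The idea of keeping a private copy of the disk name can be sketched in userspace C. `gendisk_model`, `named_dev_model`, `register_dev_model()` and `dev_name_for_log()` are hypothetical names for illustration, and the 32-byte buffer is an assumption mirroring the kernel's BDEVNAME_SIZE.

```c
#include <stdio.h>

#define NAME_BUF_SIZE 32 /* assumed size, mirroring BDEVNAME_SIZE */

struct gendisk_model {
	char disk_name[NAME_BUF_SIZE];
};

struct named_dev_model {
	struct gendisk_model *disk;           /* may be released on failure */
	char backing_dev_name[NAME_BUF_SIZE]; /* private copy of the name */
};

/* Copy the disk name once, while the gendisk is known to be valid. */
void register_dev_model(struct named_dev_model *dc, struct gendisk_model *disk)
{
	dc->disk = disk;
	snprintf(dc->backing_dev_name, sizeof(dc->backing_dev_name), "%s",
		 disk->disk_name);
}

/* Error paths read only the private copy and never dereference dc->disk. */
const char *dev_name_for_log(const struct named_dev_model *dc)
{
	return dc->backing_dev_name;
}
```

Since the name is copied at registration time, a later error message can still print it after `dc->disk` has been released or set to NULL, and no lookup call is needed on the error path.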
  11. 01 May, 2018 2 commits
  12. 30 Apr, 2018 2 commits
  13. 09 Apr, 2018 3 commits
  14. 05 Apr, 2018 4 commits
  15. 04 Apr, 2018 7 commits