1. 25 July 2020, 9 commits
  2. 01 July 2020, 3 commits
  3. 15 June 2020, 4 commits
    • bcache: pr_info() format clean up in bcache_device_init() · 4b25bbf5
      Committed by Coly Li
      scripts/checkpatch.pl reports the following warning for the patch
      ("bcache: check and adjust logical block size for backing devices"):
          WARNING: quoted string split across lines
          #146: FILE: drivers/md/bcache/super.c:896:
          +  pr_info("%s: sb/logical block size (%u) greater than page size "
          +	       "(%lu) falling back to device logical block size (%u)",
      
      There are two things to fix up:
      - The kernel message should be printed on a single line.
      - pr_info() no longer adds a newline automatically since v5.8, so a
        '\n' should be added.
      
      This patch just does the above cleanup in bcache_device_init().
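
      A hedged sketch of the cleaned-up call (the argument names are
      assumptions, not the literal diff):

          pr_info("%s: sb/logical block size (%u) greater than page size (%lu) falling back to device logical block size (%u)\n",
                  d->disk->disk_name, block_size, PAGE_SIZE,
                  bdev_logical_block_size(cached_bdev));
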
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: use delayed kworker for asynchronous devices registration · ee4a36f4
      Committed by Coly Li
      This patch changes the asynchronous registration kworker to a delayed
      kworker. There is a (small) probability that queue_work() queues the
      async registration kworker to the same CPU, in which case the process
      writing the sysfs interface to register the bcache device may not
      return immediately. queue_delayed_work() in this patch delays the
      kworker by 10 jiffies before inserting it into the run queue, which
      makes sure the registering process always returns to user space in
      time.
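
      A hedged sketch of the change (the struct and handler names are
      assumptions based on this and the referenced commit message):

          struct async_reg_args {
                  struct delayed_work reg_work;   /* was a plain work_struct */
                  /* ... registration bookkeeping ... */
          };

          /* in the sysfs store handler: hand off with a small delay so the
           * write(2) returns to user space before the worker runs */
          INIT_DELAYED_WORK(&args->reg_work, register_device_worker);
          queue_delayed_work(system_wq, &args->reg_work, 10);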
      
      Fixes: 9e23ccf8 ("bcache: asynchronous devices registration")
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: check and adjust logical block size for backing devices · dcacbc12
      Committed by Mauricio Faria de Oliveira
      It's possible for a block driver to incorrectly set the logical block
      size to a value greater than the page size; e.g. bcache takes the
      value from the superblock, set by the user with make-bcache.
      
      This causes a BUG/NULL pointer dereference in the path:
      
        __blkdev_get()
        -> set_init_blocksize() // set i_blkbits based on ...
           -> bdev_logical_block_size()
              -> queue_logical_block_size() // ... this value
        -> bdev_disk_changed()
           ...
           -> blkdev_readpage()
              -> block_read_full_page()
                 -> create_page_buffers() // size = 1 << i_blkbits
                    -> create_empty_buffers() // give size/take pointer
                       -> alloc_page_buffers() // return NULL
                       .. BUG!
      
      Because alloc_page_buffers() is called with size > PAGE_SIZE, it
      initializes head = NULL, skips the loop, and returns head; then
      create_empty_buffers() gets (and uses) the NULL pointer.
      
      This has been around longer than commit ad6bf88a ("block:
      fix an integer overflow in logical block size"); however, it
      increased the range of values that can trigger the issue.
      
      Previously only 8k/16k/32k (on x86 with 4k page size) would trigger it,
      as greater values overflowed the unsigned short to zero, and
      queue_logical_block_size() would then use the default of 512.
      
      Now the range with unsigned int is much larger, and users with the
      512k value, which happened to be zeroed previously and worked fine,
      started to hit this issue -- as the zero is gone, and
      queue_logical_block_size() does return 512k (> PAGE_SIZE).
      
      Fix this by checking the bcache device's logical block size, and if
      it's greater than the page size, fall back to the backing/cached
      device's logical block size.
      
      This doesn't affect cache devices as those are still checked
      for block/page size in read_super(); only the backing/cached
      devices are not.
      
      Apparently it's a regression from commit 2903381f ("bcache:
      Take data offset from the bdev superblock."), which moved the check
      into BCACHE_SB_VERSION_CDEV only. Now that there are superblocks of
      backing devices out there with this larger value, we cannot refuse
      to load them (i.e., add a similar check for _BDEV.)
      
      Ideally perhaps bcache should use all values from the backing
      device (physical/logical/io_min block size)? But for now just
      fix the problematic case.
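
      A hedged sketch of the check in bcache_device_init() (variable names
      are assumptions; the resulting kernel notice is shown in the "After"
      dmesg below):

          if (block_size > PAGE_SIZE && cached_bdev) {
                  /* fall back to the backing device's logical block size to
                   * avoid the alloc_page_buffers() NULL dereference above */
                  block_size = bdev_logical_block_size(cached_bdev);
          }
          blk_queue_logical_block_size(q, block_size);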
      
      Test-case:
      
          # IMG=/root/disk.img
          # dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G
          # DEV=$(losetup --find --show $IMG)
          # make-bcache --bdev $DEV --block 8k
            < see dmesg >
      
      Before:
      
          # uname -r
          5.7.0-rc7
      
          [   55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000
          ...
          [   55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 #4
          ...
          [   55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100
          ...
          [   55.966434] Call Trace:
          [   55.967021]  create_page_buffers+0x48/0x50
          [   55.967834]  block_read_full_page+0x49/0x380
          [   55.972181]  do_read_cache_page+0x494/0x610
          [   55.974780]  read_part_sector+0x2d/0xaa
          [   55.975558]  read_lba+0x10e/0x1e0
          [   55.977904]  efi_partition+0x120/0x5a6
          [   55.980227]  blk_add_partitions+0x161/0x390
          [   55.982177]  bdev_disk_changed+0x61/0xd0
          [   55.982961]  __blkdev_get+0x350/0x490
          [   55.983715]  __device_add_disk+0x318/0x480
          [   55.984539]  bch_cached_dev_run+0xc5/0x270
          [   55.986010]  register_bcache.cold+0x122/0x179
          [   55.987628]  kernfs_fop_write+0xbc/0x1a0
          [   55.988416]  vfs_write+0xb1/0x1a0
          [   55.989134]  ksys_write+0x5a/0xd0
          [   55.989825]  do_syscall_64+0x43/0x140
          [   55.990563]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
          [   55.991519] RIP: 0033:0x7f7d60ba3154
          ...
      
      After:
      
          # uname -r
          5.7.0.bcachelbspgsz
      
          [   31.672460] bcache: bcache_device_init() bcache0: sb/logical block size (8192) greater than page size (4096) falling back to device logical block size (512)
          [   31.675133] bcache: register_bdev() registered backing device loop0
      
          # grep ^ /sys/block/bcache0/queue/*_block_size
          /sys/block/bcache0/queue/logical_block_size:512
          /sys/block/bcache0/queue/physical_block_size:8192
      Reported-by: Ryan Finnie <ryan@finnie.org>
      Reported-by: Sebastian Marsching <sebastian@marsching.com>
      Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: fix potential deadlock problem in btree_gc_coalesce · be23e837
      Committed by Zhiqiang Liu
      coccicheck reports:
        drivers/md//bcache/btree.c:1538:1-7: preceding lock on line 1417
      
      In the btree_gc_coalesce function, if the coalescing process fails, we
      jump directly to the out_nocoalesce tag without releasing
      new_nodes[i]->write_lock. This then causes a deadlock when trying to
      acquire new_nodes[i]->write_lock in order to free new_nodes[i] before
      returning.
      
      btree_gc_coalesce function details are as follows:
      	if alloc new_nodes[i] fails:
      		goto out_nocoalesce;
      	// obtain new_nodes[i]->write_lock
      	mutex_lock(&new_nodes[i]->write_lock)
      	// main coalescing process
      	for (i = nodes - 1; i > 0; --i)
      		[snipped]
      		if coalescing process fails:
      			// Here, directly goto out_nocoalesce
      			 // tag will cause a deadlock
      			goto out_nocoalesce;
      		[snipped]
      	// release new_nodes[i]->write_lock
      	mutex_unlock(&new_nodes[i]->write_lock)
      	// coalescing succeeded, return
      	return;
      out_nocoalesce:
      	btree_node_free(new_nodes[i])	// free new_nodes[i]
      	// obtain new_nodes[i]->write_lock
      	mutex_lock(&new_nodes[i]->write_lock);
      	// set flag for reuse
      	clear_bit(BTREE_NODE_dirty, &new_nodes[i]->flags);
      	// release new_nodes[i]->write_lock
      	mutex_unlock(&new_nodes[i]->write_lock);
      
      To fix the problem, we add a new tag 'out_unlock_nocoalesce' that
      releases new_nodes[i]->write_lock before the out_nocoalesce tag. If
      the coalescing process fails, we go to the out_unlock_nocoalesce tag
      to release new_nodes[i]->write_lock before freeing new_nodes[i] under
      the out_nocoalesce tag.
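
      A condensed sketch of the fixed control flow (the failure condition is
      a placeholder, not the literal diff):

          for (i = nodes - 1; i > 0; --i) {
                  /* ... main coalescing work ... */
                  if (coalescing_failed)                  /* placeholder */
                          goto out_unlock_nocoalesce;     /* new tag */
          }

          for (i = 0; i < nodes; i++)
                  mutex_unlock(&new_nodes[i]->write_lock);
          /* ... coalescing succeeded ... */
          return;

      out_unlock_nocoalesce:
          for (i = 0; i < nodes; i++)
                  mutex_unlock(&new_nodes[i]->write_lock);

      out_nocoalesce:
          /* btree_node_free() and the lock/clear_bit/unlock sequence shown
           * above can now run without self-deadlocking on write_lock */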
      
      (Coly Li helps to clean up commit log format.)
      
      Fixes: 2a285686 ("bcache: btree locking rework")
      Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 27 May 2020, 6 commits
    • bcache: use bio_{start,end}_io_acct · 85750aeb
      Committed by Christoph Hellwig
      Switch bcache to use the nicer bio accounting helpers, and call the
      routines where we also sample the start time to give coherent accounting
      results.
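
      A hedged sketch of the helper pattern (the bcache-specific context is
      simplified; only the two calls are the point here):

          /* at submission: let the block core start accounting and record
           * the start time it returns */
          s->start_time = bio_start_io_acct(bio);

          /* at completion: finish accounting against that start time */
          bio_end_io_acct(bio, s->start_time);
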
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: configure the asynchronous registration to be experimental · 0c8d3fce
      Committed by Coly Li
      In order to avoid the experimental async registration interface being
      treated as a new kernel ABI by common users, this patch puts it behind
      an experimental kernel config option, BCACHE_ASYNC_REGISTRAION.
      
      This interface is for the extremely-large-cached-data situation, to
      make sure the bcache device can always be created without the udev
      timeout issue. For normal users, async vs. sync registration makes no
      difference.
      
      In the future, when we decide to use asynchronous registration as the
      default behavior, this experimental interface may be removed.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: asynchronous devices registration · 9e23ccf8
      Committed by Coly Li
      When there is a lot of data cached on the cache device, the bcache
      internal btree can take a very long time to validate during backing
      device and cache device registration. In my test, it may take 55+
      minutes to check all the internal btree nodes.
      
      The problem is that the registration is invoked by udev rules and
      udevd has a 180 second timeout by default. If the btree node checking
      time is longer than the udevd timeout, the registering process will be
      killed by udevd with SIGKILL. If the registering process has a pending
      signal, creating kthreads for bcache will fail and the device
      registration will fail. The result is that, for a bcache device which
      cached a lot of data on the cache device, the device node
      /dev/bcache<N> may never be created due to the very long btree
      checking time.
      
      A solution to avoid the udevd 180 second timeout is to register devices
      in an asynchronous way. That is, after writing the cache or backing
      device path into /sys/fs/bcache/register_async, the kernel code creates
      a kworker and moves all the btree node checking (for a cache device) or
      dirty data counting (for a cached device) into the kworker context.
      Then the kworker is scheduled on system_wq and the registration code
      just returns to the user-space udev rule task. With this asynchronous
      approach, the udev task for the bcache rule will complete in seconds;
      no matter how much time is spent in the kworker context, it won't be
      killed by udevd for a timeout.
      
      After all the checking and counting are done asynchronously in the
      kworker, the bcache device will eventually be created successfully.
      
      This patch does the above change and adds a register sysfs file
      /sys/fs/bcache/register_async. Writing the registering device path into
      this sysfs file will do the asynchronous registration, as sketched
      below.
      
      The register_async interface is for a very rare condition and won't be
      used by common users. In the future I plan to make asynchronous
      registration the default behavior, which depends on the feedback for
      this patch.
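
      A hedged sketch of the mechanism (struct and function names are
      assumptions based on this message, not the exact diff):

          struct async_reg_args {
                  struct work_struct reg_work;
                  char *path;
                  /* ... parsed superblock / opened bdev ... */
          };

          static void register_device_worker(struct work_struct *work)
          {
                  struct async_reg_args *args =
                          container_of(work, struct async_reg_args, reg_work);

                  /* the long btree node checking / dirty data counting runs
                   * here, outside the udev-invoked sysfs write */
                  kfree(args);
          }

          /* in the register_async store handler: hand off and return */
          INIT_WORK(&args->reg_work, register_device_worker);
          queue_work(system_wq, &args->reg_work);
          return size;    /* udev's write(2) completes in seconds */
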
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: fix refcount underflow in bcache_device_free() · 86da9f73
      Committed by Coly Li
      The problematic code piece in bcache_device_free() is,
      
       785 static void bcache_device_free(struct bcache_device *d)
       786 {
       787     struct gendisk *disk = d->disk;
       [snipped]
       799     if (disk) {
       800             if (disk->flags & GENHD_FL_UP)
       801                     del_gendisk(disk);
       802
       803             if (disk->queue)
       804                     blk_cleanup_queue(disk->queue);
       805
       806             ida_simple_remove(&bcache_device_idx,
       807                               first_minor_to_idx(disk->first_minor));
       808             put_disk(disk);
       809         }
       [snipped]
       816 }
      
      At line 808, put_disk(disk) may underflow the kobject refcount of
      'disk'.
      
      Here is how to reproduce the issue:
      - Attach the backing device to a cache device and do random writes to
        make the cache dirty.
      - Stop the bcache device while the cache device still holds dirty data
        of the backing device.
      - Register only the backing device back, NOT the cache device.
      - The bcache device node /dev/bcache0 won't show up, because the
        backing device waits for the cache device, which holds the missing
        dirty data, to show up.
      - Now echo 1 into /sys/fs/bcache/pendings_cleanup to stop the pending
        backing device.
      - After the pending backing device is stopped, use 'dmesg' to check the
        kernel messages; a use-after-free warning from KASAN reports that the
        refcount of the kobject linked to 'disk' has underflowed.
      
      The refcount dropped at line 808 in the above code piece is the one
      added by add_disk(d->disk) in bch_cached_dev_run(). But in the above
      condition the cache device is not registered, bch_cached_dev_run() has
      no chance to be called, and the refcount is never added. Calling
      put_disk() on a gendisk kobject whose refcount was never added
      triggers an underflow warning.
      
      This patch checks whether GENHD_FL_UP is set in disk->flags; if it is
      not set then the bcache device was never added, so don't call
      put_disk() and the underflow issue is avoided.
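
      A hedged sketch of the guarded teardown (line numbers refer to the
      excerpt above; this is illustrative, not the literal diff):

          if (disk) {
                  bool disk_added = (disk->flags & GENHD_FL_UP) != 0;

                  if (disk_added)
                          del_gendisk(disk);
                  if (disk->queue)
                          blk_cleanup_queue(disk->queue);

                  ida_simple_remove(&bcache_device_idx,
                                    first_minor_to_idx(disk->first_minor));

                  /* only drop the reference taken by add_disk(); if the disk
                   * was never added, put_disk() would underflow the kobject
                   * refcount */
                  if (disk_added)
                          put_disk(disk);
          }
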
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: Convert pr_<level> uses to a more typical style · 46f5aa88
      Committed by Joe Perches
      Remove the trailing newline from the define of pr_fmt and add newlines
      to the uses.
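
      A hedged sketch of the pattern (the prefix shown is an assumption about
      bcache's pr_fmt, not a quote of the actual header):

          /* before: pr_fmt() appended "\n", so callers omitted it */
          #define pr_fmt(fmt) "bcache: %s() " fmt "\n", __func__

          /* after: no newline in pr_fmt(); each use supplies its own */
          #define pr_fmt(fmt) "bcache: %s() " fmt, __func__

          pr_info("invalidating existing data\n");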
      
      Miscellanea:
      
      o Convert bch_bkey_dump from multiple uses of pr_err to pr_cont,
        as the earlier conversion was done inappropriately, causing multiple
        lines to be emitted where only a single output line was desired
      o Use vsprintf extension %pV in bch_cache_set_error to avoid multiple
        line output where only a single line output was desired
      o Coalesce formats
      
      Fixes: 6ae63e35 ("bcache: replace printk() by pr_*() routines")
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: remove redundant variables i and n · 3b5b7b1f
      Committed by Colin Ian King
      Variables i and n are being assigned but are never used. They are
      redundant and can be removed.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Addresses-Coverity: ("Unused value")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 25 April 2020, 1 commit
  6. 28 March 2020, 2 commits
  7. 25 March 2020, 1 commit
  8. 23 March 2020, 7 commits
    • bcache: optimize barrier usage for atomic operations · eb9b6666
      Committed by Coly Li
      The idea of this patch is from Davidlohr Bueso; he posted a patch
      for bcache to optimize barrier usage for read-modify-write atomic
      bitops. Indeed such an optimization can also be applied to other
      locations where smp_mb() is used before or after an atomic operation.
      
      This patch replaces smp_mb() with smp_mb__before_atomic() or
      smp_mb__after_atomic() in btree.c and writeback.c, where the barrier
      only needs to order memory accesses around the atomic operation for
      other cores. Although these locations are not on a hot code path, it
      never hurts to make things a little better.
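
      A hedged sketch of the substitution (the flag used here is illustrative,
      not a list of the actual call sites):

          /* before: a full barrier even on architectures where the atomic
           * RMW operation is already ordered strongly enough (e.g. x86) */
          smp_mb();
          set_bit(CACHE_SET_IO_DISABLE, &c->flags);

          /* after: pair the barrier with the atomic op, letting such
           * architectures turn it into a no-op */
          smp_mb__before_atomic();
          set_bit(CACHE_SET_IO_DISABLE, &c->flags);
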
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: optimize barrier usage for Rmw atomic bitops · b004aa86
      Committed by Davidlohr Bueso
      We can avoid the unnecessary barrier on non-LL/SC architectures,
      such as x86. Instead, use smp_mb__after_atomic().
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: Use scnprintf() for avoiding potential buffer overflow · 9876e386
      Committed by Takashi Iwai
      Since snprintf() returns the would-be-output size instead of the
      actual output size, the succeeding calls may go beyond the given
      buffer limit.  Fix it by replacing with scnprintf().
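
      A hedged sketch of the pitfall and the fix (names are illustrative, not
      bcache's actual sysfs code):

          /* snprintf() returns the would-be length, so 'len' could grow past
           * 'size' and the next call would write at a bogus offset;
           * scnprintf() returns what was actually written, so 'len' stays
           * inside the buffer. */
          static ssize_t show_two_counters(char *buf, size_t size, int a, int b)
          {
                  ssize_t len = 0;

                  len += scnprintf(buf + len, size - len, "first: %d\n", a);
                  len += scnprintf(buf + len, size - len, "second: %d\n", b);
                  return len;
          }
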
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: make bch_sectors_dirty_init() to be multithreaded · b144e45f
      Committed by Coly Li
      When attaching a cached device (a.k.a backing device) to a cache
      device, bch_sectors_dirty_init() is called to count dirty sectors
      and stripes (see what bcache_dev_sectors_dirty_add() does) on the
      cache device.
      
      The counting is done by a single-threaded recursive function
      bch_btree_map_keys() that iterates all the bcache btree nodes.
      If the btree has a huge number of nodes, bch_sectors_dirty_init() will
      take quite a long time. In my testing, if the registering cache set has
      an existing UUID which matches an already registered cached device, the
      automatic attachment during the registration may take more than
      55 minutes. This is too long to wait for bcache to become usable in a
      real deployment.
      
      Fortunately, when bch_sectors_dirty_init() is called, no other thread
      accesses the btree yet, so it is safe to do a read-only, parallelized
      dirty sector count with multiple threads.
      
      This patch creates multiple threads; each thread fetches a key from the
      btree root node and counts the dirty sectors in the sub-tree indexed by
      that key. After its sub-tree is counted, the counting thread fetches
      another root node key, until the fetched key is NULL. The number of
      parallel threads depends on the number of keys in the btree root node
      and the number of online CPU cores: it is the smaller of the two, but
      no more than BCH_DIRTY_INIT_THRD_MAX. If there are only 2 keys in the
      root node, this patch can only make it 2x faster; but if there are 10
      keys in the root node, it can be 10x faster.
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: make bch_btree_check() to be multithreaded · 8e710227
      Committed by Coly Li
      When registering a cache device, bch_btree_check() is called to check
      all btree nodes, to make sure the btree is consistent and not
      corrupted.
      
      bch_btree_check() is executed recursively in a single thread; when there
      is a lot of cached data and the btree is huge, it may take a very long
      time to check all the btree nodes. In my testing, I observed it took
      around 50 minutes to finish bch_btree_check().
      
      When checking the bcache btree nodes, the cache set is not running yet
      and the whole tree is effectively in a read-only state, so it is safe
      to create multiple threads to check the btree in parallel.
      
      This patch creates multiple threads; each thread checks, one by one,
      the sub-trees indexed by keys from the btree root node. The parallel
      thread count depends on how many keys are in the btree root node. At
      most BCH_BTR_CHKTHREAD_MAX (64) threads can be created, but in
      practice it should be min(cpu-number/2, root-node-keys-number).
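
      A hedged sketch of the thread setup (helper names and the bookkeeping
      are assumptions based on this message):

          int nr_keys = btree_root_node_keys(c);          /* assumed helper */
          int nr_threads = min_t(int, nr_keys, num_online_cpus() / 2);

          nr_threads = min_t(int, nr_threads, BCH_BTR_CHKTHREAD_MAX);

          for (i = 0; i < nr_threads; i++) {
                  /* each worker repeatedly grabs the next unchecked root-node
                   * key and checks the sub-tree below it */
                  infos[i].thread = kthread_run(bch_btree_check_thread,
                                                &infos[i], "bch_btrchk[%d]", i);
          }
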
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: add bcache_ prefix to btree_root() and btree() macros · feac1a70
      Committed by Coly Li
      This patch renames the macros btree_root() and btree() to
      bcache_btree_root() and bcache_btree(), to avoid a potential generic
      name clash in the future.
      
      NOTE: for product kernel maintainers, this patch can be skipped if
      you feel the renaming introduces inconvenience for patch backporting.
      Suggested-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: move macro btree() and btree_root() into btree.h · 253a99d9
      Committed by Coly Li
      In order to accelerate bcache registration speed, the macros btree()
      and btree_root() will be referenced outside of btree.c. This patch
      moves them from btree.c into btree.h, along with the related function
      declarations, for the following changes.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  9. 03 March 2020, 1 commit
  10. 13 February 2020, 3 commits
    • bcache: remove macro nr_to_fifo_front() · 4ec31cb6
      Committed by Coly Li
      The macro nr_to_fifo_front() is only used once, in btree_flush_write(),
      so it is indeed unnecessary. This patch removes the macro and does
      the calculation directly in place.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: Revert "bcache: shrink btree node cache after bch_btree_check()" · 309cc719
      Committed by Coly Li
      This reverts commit 1df3877f.
      
      In my testing, even when all the cached btree nodes are freed, creating
      the gc and allocator kernel threads may still fail. It finally turns
      out that kthread_run() may fail if there is a pending signal for the
      current task, and the pending signal is sent by the OOM killer, which
      is triggered by the memory consumption of bch_btree_check().
      
      Therefore explicitly shrinking the bcache btree node cache here does
      not help; now that the shrinker callback is improved and pending
      signals are ignored before creating kernel threads, such an operation
      is unnecessary.
      
      This patch reverts commit 1df3877f ("bcache: shrink btree node
      cache after bch_btree_check()") because we have a better improvement
      now.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: ignore pending signals when creating gc and allocator thread · 0b96da63
      Committed by Coly Li
      When running a cache set, all the bcache btree nodes of this cache set
      will be checked by bch_btree_check(). If the bcache btree is very
      large, iterating all the btree nodes will occupy too much system memory
      and the bcache registering process might be selected and killed by the
      system OOM killer. kthread_run() will fail if the current process has a
      pending signal, therefore the kthread creation in run_cache_set() for
      the gc and allocator kernel threads will very probably fail for a very
      large bcache btree.
      
      Indeed such an OOM-killer signal is safe to ignore here, and the
      registering process will exit after the registration is done.
      Therefore this patch flushes pending signals during the cache set
      start up, specifically in bch_cache_allocator_start() and
      bch_gc_thread_start(), to make sure run_cache_set() won't fail for a
      large cached data set.
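
      A hedged sketch of the approach (whether the actual patch uses exactly
      this helper and placement is an assumption):

          /* an OOM-killer SIGKILL aimed at the registering process would make
           * kthread_run() fail; the signal is harmless to registration, so
           * drop it before creating the gc thread */
          if (signal_pending(current)) {
                  pr_warn("signal pending, flushing it before kthread_run()\n");
                  flush_signals(current);
          }

          c->gc_thread = kthread_run(bch_gc_thread, c, "bcache_gc");
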
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  11. 01 February 2020, 3 commits
    • bcache: check return value of prio_read() · 49d08d59
      Committed by Coly Li
      Now if prio_read() fails while starting a cache set, we can print
      an error message in run_cache_set() and handle the failure properly.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: fix incorrect data type usage in btree_flush_write() · d1c3cc34
      Committed by Coly Li
      Dan Carpenter points out that since commit 2aa8c529 ("bcache: avoid
      unnecessary btree nodes flushing in btree_flush_write()"), there is an
      incorrect data type usage which leads to the following static checker
      warning:
      	drivers/md/bcache/journal.c:444 btree_flush_write()
      	warn: 'ref_nr' unsigned <= 0
      
      drivers/md/bcache/journal.c
         422  static void btree_flush_write(struct cache_set *c)
         423  {
         424          struct btree *b, *t, *btree_nodes[BTREE_FLUSH_NR];
         425          unsigned int i, nr, ref_nr;
                                          ^^^^^^
      
         426          atomic_t *fifo_front_p, *now_fifo_front_p;
         427          size_t mask;
         428
         429          if (c->journal.btree_flushing)
         430                  return;
         431
         432          spin_lock(&c->journal.flush_write_lock);
         433          if (c->journal.btree_flushing) {
         434                  spin_unlock(&c->journal.flush_write_lock);
         435                  return;
         436          }
         437          c->journal.btree_flushing = true;
         438          spin_unlock(&c->journal.flush_write_lock);
         439
         440          /* get the oldest journal entry and check its refcount */
         441          spin_lock(&c->journal.lock);
         442          fifo_front_p = &fifo_front(&c->journal.pin);
         443          ref_nr = atomic_read(fifo_front_p);
         444          if (ref_nr <= 0) {
                          ^^^^^^^^^^^
      Unsigned can't be less than zero.
      
         445                  /*
         446                   * do nothing if no btree node references
         447                   * the oldest journal entry
         448                   */
         449                  spin_unlock(&c->journal.lock);
         450                  goto out;
         451          }
         452          spin_unlock(&c->journal.lock);
      
      As the warning indicates, the local variable ref_nr declared as
      unsigned int is wrong: it does not match atomic_read() and the "<= 0"
      check.
      
      This patch fixes the above error by defining the local variable ref_nr
      as int.
      
      Fixes: 2aa8c529 ("bcache: avoid unnecessary btree nodes flushing in btree_flush_write()")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: add readahead cache policy options via sysfs interface · 038ba8cc
      Committed by Coly Li
      In 2007 high performance SSDs were still expensive; in order to
      save more space for real workload data or metadata, readahead I/Os
      for non-metadata were bypassed and not cached on SSD.
      
      Nowadays SSD prices have dropped a lot and people can find larger
      SSDs at a comfortable price, so it is unnecessary to always bypass
      normal readahead I/Os to save SSD space.
      
      This patch adds options for the readahead data cache policy via the
      sysfs file /sys/block/bcache<N>/readahead_cache_policy. The options
      are:
      - "all": cache all readahead data I/Os.
      - "meta-only": only cache meta data, and bypass other regular I/Os.
      
      If users want bcache to keep caching only readahead requests for
      metadata and bypassing regular data readahead, please set "meta-only"
      in this sysfs file. By default, bcache now goes back to caching all
      readahead requests.
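
      A hedged sketch of how the policy might be consulted on the bypass
      path (the enum, field, and flag combination are assumptions based on
      this message, not the literal diff):

          /* "meta-only": bypass readahead bios that are not metadata */
          if ((bio->bi_opf & REQ_RAHEAD) &&
              !(bio->bi_opf & (REQ_META | REQ_PRIO)) &&
              dc->cache_readahead_policy != BCH_CACHE_READA_ALL)
                  goto skip;      /* do not cache this readahead I/O */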
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Coly Li <colyli@suse.de>
      Acked-by: Eric Wheeler <bcache@linux.ewheeler.net>
      Cc: Michael Lyle <mlyle@lyle.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>