1. 27 July 2018, 8 commits
    • bcache: do not assign in if condition register_bcache() · a56489d4
      By Florian Schmaus
      Fixes an error reported by checkpatch.pl that was caused by assigning a
      variable inside an if condition (a sketch of the preferred style follows
      this entry).
      Signed-off-by: Florian Schmaus <flo@geekplace.eu>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      a56489d4
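      A minimal sketch of the style change checkpatch.pl asks for; kstrndup()
      and the surrounding function are purely illustrative, not the actual
      register_bcache() code:

          #include <linux/slab.h>
          #include <linux/string.h>

          /* Hypothetical helper used only to show the two styles. */
          static int example_copy_path(const char *buffer, size_t size, char **out)
          {
                  char *path;

                  /* checkpatch.pl flags this form (assignment in if condition):
                   *
                   *      if (!(path = kstrndup(buffer, size, GFP_KERNEL)))
                   *              return -ENOMEM;
                   */

                  /* Preferred: assign first, then test the result. */
                  path = kstrndup(buffer, size, GFP_KERNEL);
                  if (!path)
                          return -ENOMEM;

                  *out = path;
                  return 0;
          }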
    • bcache: fix I/O significant decline while backend devices registering · 94f71c16
      By Tang Junhui
      I attached several backend devices to the same cache set and produced a
      lot of dirty data by running small random write I/O for a long time. I
      then kept I/O running on the other cached devices, stopped one cached
      device, and after a while registered the stopped device again. The
      running I/O on the other cached devices dropped significantly, sometimes
      even falling to zero.
      
      In the current code, bcache traverses every key and btree node to count
      the dirty data while holding the read lock, so the writer threads cannot
      take the btree write lock. When the registering device has many keys and
      btree nodes this can last several seconds, so the write I/O on the other
      cached devices is blocked and declines significantly.
      
      With this patch, when a device registers to a cache set that already has
      other cached devices with running I/O, the amount of dirty data on the
      registering device is counted incrementally, so the other cached devices
      are not blocked the whole time (a sketch of the idea follows this entry).
      
      Patch v2: Renamed some variables and macro names as Coly suggested.
      Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      94f71c16
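      A rough sketch of the incremental counting idea described above; the
      iterator and lock helpers are stand-ins, not real bcache symbols:

          /* Count the registering device's dirty sectors in bounded batches,
           * dropping the btree read lock between batches so writers on the
           * other cached devices can make progress. */
          #define DIRTY_INIT_BATCH        250     /* keys per batch (illustrative) */

          static void sectors_dirty_init_incremental(struct cached_dev *dc)
          {
                  struct bkey *k = first_dirty_key(dc);                /* hypothetical */
                  unsigned int counted = 0;

                  while (k) {
                          add_dirty_sectors(dc, key_dirty_sectors(k)); /* hypothetical */
                          k = next_dirty_key(dc, k);                   /* hypothetical */

                          if (++counted >= DIRTY_INIT_BATCH) {
                                  counted = 0;
                                  drop_btree_read_lock(dc);    /* let writers in */
                                  cond_resched();
                                  take_btree_read_lock(dc);
                          }
                  }
          }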
    • bcache: calculate the number of incremental GC nodes according to the total of btree nodes · 7f4a59de
      By Tang Junhui
      This patch builds on "[PATCH] bcache: finish incremental GC".
      
      Incremental GC pauses for 100 ms whenever front-side I/O arrives, so if
      GC only processes a constant number (100) of nodes per pass and there are
      many btree nodes, GC lasts a long time, the front-side I/O runs out of
      buckets (no new bucket can be allocated during GC), and the I/O is
      blocked again.
      
      GC should therefore not process a constant number of nodes, but a number
      that varies with the total btree node count. With this patch, GC is
      divided into a constant number (100) of passes, so when there are many
      btree nodes GC processes more nodes per pass, and otherwise fewer (but
      never less than MIN_GC_NODES); see the sketch after this entry.
      Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      7f4a59de
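      A sketch of the per-pass node budget this implies; the constants follow
      the description above, the function itself is illustrative:

          #define MAX_GC_TIMES    100     /* GC is split into at most this many passes */
          #define MIN_GC_NODES    100     /* never process fewer nodes than this       */

          static size_t gc_nodes_per_pass(size_t total_btree_nodes)
          {
                  size_t nodes = total_btree_nodes / MAX_GC_TIMES;

                  if (nodes < MIN_GC_NODES)
                          nodes = MIN_GC_NODES;   /* floor for small btrees */

                  return nodes;   /* e.g. 1,000,000 nodes -> 10,000 nodes per pass */
          }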
    • bcache: finish incremental GC · 5c25c4fc
      By Tang Junhui
      In the GC thread we record the latest GC key in gc_done, which is meant
      to be used for incremental GC, but the current code never actually does
      so. While GC runs, front-side I/O is blocked until GC finishes, which can
      take a long time when there are many btree nodes.
      
      This patch implements incremental GC. The main idea is that, when
      front-side I/O is present, GC stops after processing some nodes (100),
      releases the btree node lock, services the front-side I/O for a while
      (100 ms), and then goes back to GC (a sketch of this loop follows this
      entry).
      
      With this patch, I/O is no longer blocked for the whole duration of GC,
      and the obvious drops of I/O to zero no longer occur.
      
      Patch v2: Renamed some variables and macro names as Coly suggested.
      Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      5c25c4fc
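      The control flow described above might look roughly like this; every
      helper name here (gc_has_more_nodes(), gc_one_node(), front_io_pending(),
      the lock calls) is a stand-in, not the actual btree GC code:

          #include <linux/delay.h>

          #define GC_SLEEP_MS     100     /* how long to yield to front-side I/O */

          static void incremental_gc_sketch(struct cache_set *c, size_t nodes_per_pass)
          {
                  size_t done = 0;

                  while (gc_has_more_nodes(c)) {
                          gc_one_node(c);         /* records progress in c->gc_done */

                          if (++done >= nodes_per_pass && front_io_pending(c)) {
                                  done = 0;
                                  release_btree_lock(c);  /* let writers proceed */
                                  msleep(GC_SLEEP_MS);
                                  retake_btree_lock(c);
                                  /* the next iteration resumes from c->gc_done */
                          }
                  }
          }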
    • bcache: simplify the calculation of the total amount of flash dirty data · 99a27d59
      By Tang Junhui
      Currently we calculate the total amount of dirty data on flash-only
      devices by adding up the dirty data of each flash-only device while
      holding the registration lock, which is very inefficient.
      
      This patch adds a flash_dev_dirty_sectors member to struct cache_set that
      records the total amount of flash-only dirty data in real time, so the
      total no longer has to be recalculated (a sketch follows this entry).
      Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      99a27d59
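      A sketch of maintaining such a running total with an atomic counter; the
      structure and function names are illustrative, following the description
      above:

          #include <linux/atomic.h>

          struct cache_set_example {
                  /* ... existing members ... */
                  atomic_long_t flash_dev_dirty_sectors;  /* running total, in sectors */
          };

          /* Called whenever a flash-only volume's dirty count changes by `delta`. */
          static void flash_dev_dirty_add(struct cache_set_example *c, long delta)
          {
                  atomic_long_add(delta, &c->flash_dev_dirty_sectors);
          }

          /* Reading the total is now O(1) instead of a walk over every device
           * under the registration lock. */
          static long flash_dev_dirty_total(struct cache_set_example *c)
          {
                  return atomic_long_read(&c->flash_dev_dirty_sectors);
          }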
    • readahead: stricter check for bdi io_pages · dc30b96a
      By Markus Stockhausen
      ondemand_readahead() checks bdi->io_pages to cap the maximum number of
      pages to process. This works until the readit section: if we do an
      async-only readahead (async size == sync size) and the target is at the
      beginning of the window, we expand the request by another
      get_next_ra_size() pages. btrace on large reads shows that the kernel
      always issues a doubled-size read at the beginning of processing. Add an
      additional check against io_pages in the lower part of the function (a
      sketch follows this entry). The fix helps devices that hard-limit bio
      pages and rely on proper handling of max_hw_read_sectors (e.g. older
      FusionIO cards); for that reason it could qualify for stable.
      
      Fixes: 9491ae4a ("mm: don't cap request size based on read-ahead setting")
      Cc: stable@vger.kernel.org
      Signed-off-by: Markus Stockhausen stockhausen@collogia.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      dc30b96a
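      A sketch of the kind of extra cap described, assuming max_pages has
      already been clamped to bdi->io_pages earlier in ondemand_readahead();
      the exact patch may differ in detail:

          /* readit path (sketch): when the request would immediately hit its own
           * readahead marker (async size == sync size, offset at window start),
           * the window is grown by get_next_ra_size() pages, but never beyond
           * max_pages, which reflects bdi->io_pages. */
          if (offset == ra->start && ra->size == ra->async_size) {
                  unsigned long add_pages = get_next_ra_size(ra, max_pages);

                  if (ra->size + add_pages <= max_pages) {
                          ra->async_size = add_pages;
                          ra->size += add_pages;
                  } else {
                          ra->size = max_pages;           /* hard cap */
                          ra->async_size = max_pages >> 1;
                  }
          }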
    • scsi: virtio_scsi: fix pi_bytes{out,in} on 4 KiB block size devices · cdcdcaae
      By Greg Edwards
      When the underlying device is a 4 KiB logical block size device with a
      protection interval exponent of 0, i.e. 4096 bytes data + 8 bytes PI, the
      driver miscalculates the pi_bytes{out,in} by a factor of 8x (64 bytes).
      
      This leads to errors on all reads and writes on 4 KiB logical block size
      devices when CONFIG_BLK_DEV_INTEGRITY is enabled and the
      VIRTIO_SCSI_F_T10_PI feature bit has been negotiated (a worked example
      follows this entry).
      
      Fixes: e6dc783a ("virtio-scsi: Enable DIF/DIX modes in SCSI host LLD")
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Edwards <gedwards@ddn.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      cdcdcaae
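      A worked example of why the old calculation is 8x too large on a 4 KiB
      logical block device (a sketch, not the driver code itself):

          /* 1 MiB write on a 4096-byte logical block device with 8-byte PI tuples:
           *
           *   blk_rq_sectors(rq)           = 2048  (512-byte sectors)
           *   4 KiB protection intervals   =  256  (2048 >> 3)
           *
           *   wrong:   2048 * 8 bytes = 16384 bytes of PI  (8x too much)
           *   correct:  256 * 8 bytes =  2048 bytes of PI
           *
           * bio_integrity_bytes() converts the 512-byte sector count into the
           * device's protection-interval count before multiplying by the tuple
           * size; it is the helper the next commit exposes to drivers. */
          unsigned int pi_bytes = bio_integrity_bytes(bi, blk_rq_sectors(rq));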
    • block: move bio_integrity_{intervals,bytes} into blkdev.h · 359f6427
      By Greg Edwards
      This allows bio_integrity_bytes() to be called from drivers instead of
      being open-coded there; the helpers are sketched after this entry.
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Edwards <gedwards@ddn.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      359f6427
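      For reference, the two helpers being moved are small inline conversions
      along these lines (interval_exp is log2 of the protection interval size
      in bytes, so 12 means 4 KiB intervals):

          static inline unsigned int bio_integrity_intervals(struct blk_integrity *bi,
                                                             unsigned int sectors)
          {
                  /* convert 512-byte sectors into protection intervals */
                  return sectors >> (bi->interval_exp - 9);
          }

          static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
                                                         unsigned int sectors)
          {
                  /* integrity bytes carried for that many sectors */
                  return bio_integrity_intervals(bi, sectors) * bi->tuple_size;
          }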
  2. 25 July 2018, 10 commits
  3. 24 July 2018, 10 commits
  4. 23 July 2018, 12 commits