From 37b44dc84bc891fa816eefcfd02e13c5ae6f58b1 Mon Sep 17 00:00:00 2001
From: Mike Snitzer
Date: Thu, 14 Feb 2019 19:30:45 +0800
Subject: [PATCH] dm: fix redundant IO accounting for bios that need splitting

mainline inclusion
from mainline-5.0-rc4
commit a1e1cb72d96491277ede8d257ce6b48a381dd336
category: bugfix
bugzilla: 7221
CVE: NA

---------------------------

The risk of redundant IO accounting was not taken into consideration when
commit 18a25da84354 ("dm: ensure bio submission follows a depth-first
tree walk") introduced IO splitting in terms of recursion via
generic_make_request().

Fix this by subtracting the split bio's payload from the IO stats that
were already accounted for by start_io_acct() upon dm_make_request()
entry. This repeat oscillation of the IO accounting, up then down,
isn't ideal but refactoring DM core's IO splitting to pre-split bios
_before_ they are accounted turned out to be an excessive amount of
change that will need a full development cycle to refine and verify.

Before this fix:

  /dev/mapper/stripe_dev is a 4-way stripe using a 32k chunksize, so
  bios are split on 32k boundaries.

  # fio --name=16M --filename=/dev/mapper/stripe_dev --rw=write --bs=64k --size=16M \
        --iodepth=1 --ioengine=libaio --direct=1 --refill_buffers

  with debugging added:
  [103898.310264] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=0 len=128
  [103898.318704] device-mapper: core: __split_and_process_bio: recursing for following split bio:
  [103898.329136] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=64 len=64
  ...

  16M written yet 136M (278528 * 512b) accounted:
  # cat /sys/block/dm-2/stat | awk '{ print $7 }'
  278528

After this fix:

  16M written and 16M (32768 * 512b) accounted:
  # cat /sys/block/dm-2/stat | awk '{ print $7 }'
  32768

Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.16+
Reported-by: Bryan Gurney
Reviewed-by: Ming Lei
Signed-off-by: Mike Snitzer
[Conflict:
 drivers/md/dm.c
 Since patch 1226b8dd("block: switch to per-cpu in-flight counters")
 has not been included.
]
Signed-off-by: yangerkun@huawei.com
Reviewed-by: Hou Tao
Signed-off-by: Yang Yingliang
---
 drivers/md/dm.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 07d2949a8746..7b3d048a2856 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1622,9 +1622,23 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 				 * the usage of io->orig_bio in dm_remap_zone_report()
 				 * won't be affected by this reassignment.
 				 */
+				int cpu;
 				struct bio *b = bio_split(bio, bio_sectors(bio) - ci.sector_count,
 							  GFP_NOIO, &md->queue->bio_split);
 				ci.io->orig_bio = b;
+
+				/*
+				 * Adjust IO stats for each split, otherwise upon queue
+				 * reentry there will be redundant IO accounting.
+				 * NOTE: this is a stop-gap fix, a proper fix involves
+				 * significant refactoring of DM core's bio splitting
+				 * (by eliminating DM's splitting and just using bio_split)
+				 */
+				cpu = part_stat_lock();
+				__part_stat_add(cpu, &dm_disk(md)->part0,
+						sectors[op_stat_group(bio_op(bio))], -(ci.sector_count));
+				part_stat_unlock();
+
 				bio_chain(b, bio);
 				ret = generic_make_request(bio);
 				break;
-- 
GitLab
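
Note on the accounting arithmetic: the "up then down" adjustment described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the function name accounted_sectors() and its parameters are hypothetical, and the model assumes that every queue entry accounts the full length of the bio it receives (as the two start_io_acct debug lines suggest) and that the fix subtracts the unprocessed remainder (ci.sector_count) at each split. It mirrors only the single 128-sector bio shown in the debug output and does not attempt to reproduce the 278528-sector total from the fio run.

```c
#include <assert.h>

/*
 * Simplified userspace model of the sector accounting discussed above.
 * Hypothetical helper, not part of the kernel or of this patch.
 *
 * sector:   starting sector of the submitted bio
 * len:      bio length in 512-byte sectors
 * boundary: split boundary in sectors (64 for a 32k chunk size)
 * with_fix: non-zero to subtract the remainder at each split
 *
 * Returns the number of sectors that end up in the IO stats.
 */
static long accounted_sectors(long sector, long len, long boundary, int with_fix)
{
	long stat = 0;

	assert(boundary > 0 && sector >= 0 && len >= 0);

	while (len > 0) {
		/* Sectors that fit before the next split boundary. */
		long chunk = boundary - (sector % boundary);
		long remainder;

		if (chunk > len)
			chunk = len;
		remainder = len - chunk;

		/* Queue entry: the whole bio is accounted (models start_io_acct()). */
		stat += len;

		/*
		 * Models the fix: the remainder will be accounted again when it
		 * re-enters the queue, so take it back out now.
		 */
		if (with_fix)
			stat -= remainder;

		/* The remainder is resubmitted and re-enters the loop. */
		sector += chunk;
		len = remainder;
	}

	return stat;
}
```

Under these assumptions, accounted_sectors(0, 128, 64, 0) yields 192 (128 on entry plus 64 for the resubmitted split), while accounted_sectors(0, 128, 64, 1) yields 128, the number of sectors actually written.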