From 177f8bd063b45ccb4604836287a73c8a37922a8f Mon Sep 17 00:00:00 2001
From: Andrew Kryczka
Date: Tue, 1 Sep 2020 19:32:59 -0700
Subject: [PATCH] Bound L0->Lbase fanout in dynamic leveled compaction (#7325)

Summary:
L0 score is based on size target and number of files. The size target used is `max_bytes_for_level_base`. However, the base level's size can dynamically expand in write burst mode. In fact, it can expand so much that L0->Lbase becomes the highest fanout in target sizes. This doesn't make sense from an efficiency perspective, so this PR bounds the L0->Lbase fanout to the smoothed level multiplier. The L0 scoring based on file count remains unchanged.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7325

Test Plan:
contrived benchmark that exhibits the problem:

```
$ TEST_TMPDIR=/data/users/andrewkr/ ./db_bench -benchmarks=filluniquerandom,readrandom -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -level0_file_num_compaction_trigger=4 -level_compaction_dynamic_level_bytes=true -compression_type=none -max_background_jobs=12 -rate_limiter_bytes_per_sec=104857600 -benchmark_write_rate_limit=10485760 -num=100000000
```

Results:

- "Burst W-Amp" is the write-amp near the end of the fillrandom benchmark
- "Total W-Amp" is the write-amp after readrandom has run a while and all levels no longer need compaction

Branch | Burst W-Amp | Total W-Amp | fillrandom (MB/s)
-- | -- | -- | --
master | 20.2 | 21.5 | 4.7
dynamic-l0-score | 12.6 | 14.1 | 7.2

Reviewed By: siying

Differential Revision: D23412935

Pulled By: ajkr

fbshipit-source-id: f91f2067188e432dd39deab02f1c56f195057a0e
---
 HISTORY.md        |  1 +
 db/version_set.cc | 18 +++++++++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/HISTORY.md b/HISTORY.md
index 70c70e801..c41f9b6bf 100644
--- a/HISTORY.md
+++ b/HISTORY.md
@@ -15,6 +15,7 @@
 
 ### Performance Improvements
 * Reduce thread number for multiple DB instances by re-using one global thread for statistics dumping and persisting.
+* Reduce write-amp in heavy write bursts in `kCompactionStyleLevel` compaction style with `level_compaction_dynamic_level_bytes` set.
 
 ### Public API Change
 * Expose kTypeDeleteWithTimestamp in EntryType and update GetEntryType() accordingly.
diff --git a/db/version_set.cc b/db/version_set.cc
index 1a0793e2d..4530b689a 100644
--- a/db/version_set.cc
+++ b/db/version_set.cc
@@ -2477,9 +2477,21 @@ void VersionStorageInfo::ComputeCompactionScore(
           // Level-based involves L0->L0 compactions that can lead to oversized
           // L0 files. Take into account size as well to avoid later giant
           // compactions to the base level.
-          score = std::max(
-              score, static_cast<double>(total_size) /
-                         mutable_cf_options.max_bytes_for_level_base);
+          uint64_t l0_target_size = mutable_cf_options.max_bytes_for_level_base;
+          if (immutable_cf_options.level_compaction_dynamic_level_bytes &&
+              level_multiplier_ != 0.0) {
+            // Prevent L0 to Lbase fanout from growing larger than
+            // `level_multiplier_`. This prevents us from getting stuck picking
+            // L0 forever even when it is hurting write-amp. That could happen
+            // in dynamic level compaction's write-burst mode where the base
+            // level's target size can grow to be enormous.
+            l0_target_size =
+                std::max(l0_target_size,
+                         static_cast<uint64_t>(level_max_bytes_[base_level_] /
+                                               level_multiplier_));
+          }
+          score =
+              std::max(score, static_cast<double>(total_size) / l0_target_size);
         }
       }
     } else {
-- 
GitLab
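
For readers who want to try the arithmetic outside RocksDB, the size-based part of the L0 score in this patch can be sketched as a pair of free functions. This is only an illustration under stated assumptions: `L0TargetSize` and `L0Score` are hypothetical names, not RocksDB APIs, and their parameters stand in for `max_bytes_for_level_base`, `level_max_bytes_[base_level_]`, `level_multiplier_`, and `level_compaction_dynamic_level_bytes` from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a RocksDB API): the L0 size target after the patch.
// With dynamic level bytes enabled, the target is raised so that the
// L0->Lbase fanout does not exceed `level_multiplier`.
uint64_t L0TargetSize(uint64_t max_bytes_for_level_base,
                      uint64_t base_level_max_bytes, double level_multiplier,
                      bool dynamic_level_bytes) {
  uint64_t l0_target_size = max_bytes_for_level_base;
  if (dynamic_level_bytes && level_multiplier != 0.0) {
    l0_target_size = std::max(
        l0_target_size,
        static_cast<uint64_t>(base_level_max_bytes / level_multiplier));
  }
  return l0_target_size;
}

// Hypothetical helper: combine the (unchanged) file-count score with the
// size-based score, taking the larger of the two as the patch does.
double L0Score(double file_count_score, uint64_t total_l0_bytes,
               uint64_t l0_target_size) {
  return std::max(file_count_score,
                  static_cast<double>(total_l0_bytes) / l0_target_size);
}
```

As an illustrative case (the numbers are made up, not taken from the benchmark): with a 4 MiB `max_bytes_for_level_base`, a base-level target that has grown to 400 MiB, and a multiplier of 10, the L0 size target becomes 40 MiB instead of 4 MiB, so 80 MiB of L0 data scores 2.0 on size rather than 20.0.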