提交 17606375 编写于 作者: S sdong 提交者: Facebook GitHub Bot

CompactRange() refit level should confirm destination level is not empty (#7261)

Summary:
There is potential data race related CompactRange() with level refitting. After the compaction step and refitting step, some automatic compaction could put data to the destination level and cause the DB to be corrupted. Fix the bug by checking the target level to be empty.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7261

Test Plan: Add a unit test, which would fail with "Corruption: L1 have overlapping ranges '666F6F' seq:6, type:1 vs. '626172' seq:2, type:1", and now it succeeds.

Reviewed By: ajkr

Differential Revision: D23142269

fbshipit-source-id: 28bc14d5ac934c192260b23a4ce3f10a95e3ee91
上级 500eeb6f
......@@ -3,6 +3,7 @@
### Bug fixes
* Fix a performance regression introduced in 6.4 that makes a upper bound check for every Next() even if keys are within a data block that is within the upper bound.
* Fix a possible corruption to the LSM state (overlapping files within a level) when a `CompactRange()` for refitting levels (`CompactRangeOptions::change_level == true`) and another manual compaction are executed in parallel.
* Fix a bug where a level refitting in CompactRange() might race with an automatic compaction that puts the data to the target level of the refitting. The bug has been there for years.
### New Features
* A new option `std::shared_ptr<FileChecksumGenFactory> file_checksum_gen_factory` is added to `BackupableDBOptions`. The default value for this option is `nullptr`. If this option is null, the default backup engine checksum function (crc32c) will be used for creating, verifying, or restoring backups. If it is not null and is set to the DB custom checksum factory, the custom checksum function used in DB will also be used for creating, verifying, or restoring backups, in addition to the default checksum function (crc32c). If it is not null and is set to a custom checksum factory different than the DB custom checksum factory (which may be null), BackupEngine will return `Status::InvalidArgument()`.
......
......@@ -52,6 +52,16 @@ class DBCompactionDirectIOTest : public DBCompactionTest,
DBCompactionDirectIOTest() : DBCompactionTest() {}
};
// Param = true : target level is non-empty
// Param = false: level between target level and source level
// is not empty.
class ChangeLevelConflictsWithAuto
: public DBCompactionTest,
public ::testing::WithParamInterface<bool> {
public:
ChangeLevelConflictsWithAuto() : DBCompactionTest() {}
};
namespace {
class FlushedFileCollector : public EventListener {
......@@ -5442,6 +5452,73 @@ TEST_F(DBCompactionTest, UpdateUniversalSubCompactionTest) {
ASSERT_TRUE(has_compaction);
}
TEST_P(ChangeLevelConflictsWithAuto, TestConflict) {
// A `CompactRange()` may race with an automatic compaction, we'll need
// to make sure it doesn't corrupte the data.
Options options = CurrentOptions();
options.level0_file_num_compaction_trigger = 2;
Reopen(options);
ASSERT_OK(Put("foo", "v1"));
ASSERT_OK(Put("bar", "v1"));
ASSERT_OK(Flush());
ASSERT_OK(dbfull()->TEST_WaitForFlushMemTable());
{
CompactRangeOptions cro;
cro.change_level = true;
cro.target_level = 2;
ASSERT_OK(dbfull()->CompactRange(cro, nullptr, nullptr));
}
ASSERT_EQ("0,0,1", FilesPerLevel(0));
// Run a qury to refitting to level 1 while another thread writing to
// the same level.
SyncPoint::GetInstance()->LoadDependency({
// The first two dependencies ensure the foreground creates an L0 file
// between the background compaction's L0->L1 and its L1->L2.
{
"DBImpl::CompactRange:BeforeRefit:1",
"AutoCompactionFinished1",
},
{
"AutoCompactionFinished2",
"DBImpl::CompactRange:BeforeRefit:2",
},
});
SyncPoint::GetInstance()->EnableProcessing();
std::thread auto_comp([&] {
TEST_SYNC_POINT("AutoCompactionFinished1");
ASSERT_OK(Put("bar", "v2"));
ASSERT_OK(Put("foo", "v2"));
ASSERT_OK(Flush());
ASSERT_OK(Put("bar", "v3"));
ASSERT_OK(Put("foo", "v3"));
ASSERT_OK(Flush());
dbfull()->TEST_WaitForCompact();
TEST_SYNC_POINT("AutoCompactionFinished2");
});
{
CompactRangeOptions cro;
cro.change_level = true;
cro.target_level = GetParam() ? 1 : 0;
// This should return non-OK, but it's more important for the test to
// make sure that the DB is not corrupted.
dbfull()->CompactRange(cro, nullptr, nullptr);
}
auto_comp.join();
// Refitting didn't happen.
SyncPoint::GetInstance()->DisableProcessing();
// Write something to DB just make sure that consistency check didn't
// fail and make the DB readable.
}
INSTANTIATE_TEST_CASE_P(ChangeLevelConflictsWithAuto,
ChangeLevelConflictsWithAuto, testing::Bool());
TEST_F(DBCompactionTest, ChangeLevelCompactRangeConflictsWithManual) {
// A `CompactRange()` with `change_level == true` needs to execute its final
// step, `ReFitLevel()`, in isolation. Previously there was a bug where
......
......@@ -845,6 +845,9 @@ Status DBImpl::CompactRange(const CompactRangeOptions& options,
}
if (options.change_level) {
TEST_SYNC_POINT("DBImpl::CompactRange:BeforeRefit:1");
TEST_SYNC_POINT("DBImpl::CompactRange:BeforeRefit:2");
ROCKS_LOG_INFO(immutable_db_options_.info_log,
"[RefitLevel] waiting for background threads to stop");
DisableManualCompaction();
......@@ -1280,20 +1283,29 @@ Status DBImpl::ReFitLevel(ColumnFamilyData* cfd, int level, int target_level) {
}
auto* vstorage = cfd->current()->storage_info();
if (to_level > level) {
if (level == 0) {
return Status::NotSupported(
"Cannot change from level 0 to other levels.");
}
// Check levels are empty for a trivial move
for (int l = level + 1; l <= to_level; l++) {
if (vstorage->NumLevelFiles(l) > 0) {
if (to_level != level) {
if (to_level > level) {
if (level == 0) {
return Status::NotSupported(
"Levels between source and target are not empty for a move.");
"Cannot change from level 0 to other levels.");
}
// Check levels are empty for a trivial move
for (int l = level + 1; l <= to_level; l++) {
if (vstorage->NumLevelFiles(l) > 0) {
return Status::NotSupported(
"Levels between source and target are not empty for a move.");
}
}
} else {
// to_level < level
// Check levels are empty for a trivial move
for (int l = to_level; l < level; l++) {
if (vstorage->NumLevelFiles(l) > 0) {
return Status::NotSupported(
"Levels between source and target are not empty for a move.");
}
}
}
}
if (to_level != level) {
ROCKS_LOG_DEBUG(immutable_db_options_.info_log,
"[%s] Before refitting:\n%s", cfd->GetName().c_str(),
cfd->current()->DebugString().data());
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册