提交 b78ed046 编写于 作者: A Andrew Kryczka 提交者: Facebook Github Bot

fix ReadaheadRandomAccessFile/iterator prefetch bug

Summary:
`ReadaheadRandomAccessFile` is used by iterators for file reads in several cases, like in compaction when `compaction_readahead_size > 0` or `use_direct_io_for_flush_and_compaction == true`, or in user iterator when `ReadOptions::readahead_size > 0`. `ReadaheadRandomAccessFile` maintains an internal buffer for readahead data. It assumes that, if the buffer's length is less than `ReadaheadRandomAccessFile::readahead_size_`, which is fixed in the constructor, then EOF has been reached so it doesn't try reading further.

Recently, d938226a started calling `RandomAccessFile::Prefetch` with various lengths: 8KB, 16KB, etc. When the `RandomAccessFile` is a `ReadaheadRandomAccessFile`, it triggers the above condition and incorrectly determines EOF. If a block is partially in the readahead buffer and EOF is incorrectly decided, the result is a truncated data block.

The problem is reproducible:

```
TEST_TMPDIR=/data/compaction_bench ./db_bench -benchmarks=fillrandom -write_buffer_size=1048576 -target_file_size_base=1048576 -block_size=18384 -use_direct_io_for_flush_and_compaction=true
...
put error: Corruption: truncated block read from /data/compaction_bench/dbbench/000014.sst offset 20245, expected 10143 bytes, got 8427
```
Closes https://github.com/facebook/rocksdb/pull/3454

Differential Revision: D6869405

Pulled By: ajkr

fbshipit-source-id: 87001c299e7600a37c0dcccbd0368e0954c929cf
上级 813719e9
......@@ -541,6 +541,11 @@ class ReadaheadRandomAccessFile : public RandomAccessFile {
}
virtual Status Prefetch(uint64_t offset, size_t n) override {
if (n < readahead_size_) {
// Don't allow smaller prefetches than the configured `readahead_size_`.
// `Read()` assumes a smaller prefetch buffer indicates EOF was reached.
return Status::OK();
}
size_t prefetch_offset = TruncateToPageBoundary(alignment_, offset);
if (prefetch_offset == buffer_offset_) {
return Status::OK();
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册