1. 30 Apr 2013 (1 commit)
  2. 26 Apr 2013 (1 commit)
  3. 23 Apr 2013 (2 commits)
  4. 21 Apr 2013 (3 commits)
    • [RocksDB] CompactionFilter cleanup · b4243e5a
      Committed by Haobo Xu
      Summary:
      - Removed compaction_filter_value from the callback interface, restricting the compaction filter to purging values (see the sketch below).
      - Modified some comments to reflect the current status.
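      A minimal sketch of the post-cleanup idea, with an illustrative class name and hook (the exact 2013 callback signature may differ): the filter can only decide whether an entry is purged; it can no longer rewrite the value.

          #include "leveldb/slice.h"

          // Illustrative only: purge-or-keep is the whole interface now.
          class PurgeEmptyValuesFilter {
           public:
            // Return true to drop the key/value pair during compaction.
            bool Filter(int level, const leveldb::Slice& key,
                        const leveldb::Slice& existing_value) const {
              return existing_value.size() == 0;  // e.g. purge empty values
            }
          };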
      
      Test Plan: make check
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10335
    • Add --writes_per_second rate limit, print p99.99 in histogram · b1ff9ac9
      Committed by Mark Callaghan
      Summary:
      Adds the --writes_per_second rate limit for the readwhilewriting test.
      The purpose is to optionally avoid saturating storage with writes and compaction,
      and to test read response time while some writes are being done.
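      A rough sketch of how a writes-per-second cap can be enforced in a writer loop (illustrative only, not the actual db_bench code): after each write, compare elapsed time with the target rate and sleep off any surplus.

          #include <chrono>
          #include <cstdint>
          #include <thread>

          void DoOneWrite();  // hypothetical: one Put() against the db

          // Illustrative limiter: hold a loop to roughly writes_per_second.
          void RateLimitedWrites(int64_t writes_per_second, int64_t total_writes) {
            using Clock = std::chrono::steady_clock;
            const auto start = Clock::now();
            for (int64_t done = 0; done < total_writes; ++done) {
              DoOneWrite();
              // When the (done+1)-th write should finish at the target rate.
              auto target = start + std::chrono::microseconds(
                  (done + 1) * 1000000 / writes_per_second);
              if (Clock::now() < target) std::this_thread::sleep_until(target);
            }
          }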
      
      Changes the histogram code to also print the p99.99 value
      
      Test Plan:
      make check, ran db_bench with it
      
      
      Reviewers: haobo
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10305
    • [RocksDB] Add stacktrace signal handler · 1255dcd4
      Committed by Haobo Xu
      Summary:
      This diff provides the ability to print out a stacktrace when the process receives certain signals.
      Currently, we enable this for the following signals (program error related):
      SIGILL SIGSEGV SIGBUS SIGABRT
      An application that needs the handler simply #includes "util/stack_trace.h" and calls leveldb::InstallStackTraceHandler() during initialization. This is not done automatically when opening the db, because installing signal handlers is the application's (process's) responsibility, and some applications might already have their own (like fbcode).
      
      Sample output:
      Received signal 11 (Segmentation fault)
      #0  0x408ff0 ./signal_test() [0x408ff0] /home/haobo/rocksdb/util/signal_test.cc:4
      #1  0x40827d ./signal_test() [0x40827d] /home/haobo/rocksdb/util/signal_test.cc:24
      #2  0x7f8bb183172e /usr/local/fbcode/gcc-4.7.1-glibc-2.14.1/lib/libc.so.6(__libc_start_main+0x10e) [0x7f8bb183172e] ??:0
      #3  0x408ebc ./signal_test() [0x408ebc] /home/engshare/third-party/src/glibc/glibc-2.14.1/glibc-2.14.1/csu/../sysdeps/x86_64/elf/start.S:113
      Segmentation fault (core dumped)
      
      For each frame, we print the raw pointer, the symbol provided by backtrace_symbols (still not good enough), and the source file/line. Note that address translation is done by shelling out directly to addr2line; ??:0 means addr2line failed to do the translation. Hacky, but I think it's good for now.
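      The core mechanism can be sketched in a few lines (illustrative, not the actual util/stack_trace.cc): catch the fatal signal, dump the frames with backtrace(3), then re-raise with the default handler so the process still dies (and dumps core) normally.

          #include <execinfo.h>
          #include <initializer_list>
          #include <signal.h>
          #include <stdio.h>
          #include <unistd.h>

          static void StackTraceHandler(int sig) {
            void* frames[64];
            int n = backtrace(frames, 64);
            fprintf(stderr, "Received signal %d\n", sig);
            backtrace_symbols_fd(frames, n, STDERR_FILENO);  // signal-safe dump
            signal(sig, SIG_DFL);  // restore the default handler ...
            raise(sig);            // ... and re-raise so the crash proceeds
          }

          void InstallStackTraceHandler() {
            for (int sig : {SIGILL, SIGSEGV, SIGBUS, SIGABRT}) {
              signal(sig, StackTraceHandler);
            }
          }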
      
      Test Plan: signal_test.cc
      
      Reviewers: dhruba, MarkCallaghan
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10173
  5. 16 Apr 2013 (2 commits)
  6. 13 Apr 2013 (1 commit)
    • [RocksDB] [Performance] Speed up FindObsoleteFiles · 013e9ebb
      Committed by Haobo Xu
      Summary:
      FindObsoleteFiles was slow and held the single big lock, which resulted in bad p99 behavior.
      Didn't profile anything, but several things could be improved:
      1. VersionSet::AddLiveFiles works with std::set, which is by itself slow (a tree),
         and you don't know how many dynamic allocations occur just to build up the tree.
         Switched to std::vector, and added logic to pre-calculate the total size and do just one allocation.
      2. There is no reason env_->GetChildren() needs to be mutex protected; moved it to PurgeObsoleteFiles,
         where the mutex can be unlocked.
      3. Switched std::set to std::unordered_set; the conversion from vector also happens inside PurgeObsoleteFiles (see the sketch below).
      I have a feeling this should pretty much fix it.
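      A minimal sketch of points 1 and 3, with illustrative names rather than the real VersionSet code: size one vector up front under the lock, then build the unordered_set outside it.

          #include <cstdint>
          #include <unordered_set>
          #include <vector>

          // Illustrative: collect live file numbers with a single allocation
          // instead of a std::set (one node allocation per insert).
          std::vector<uint64_t> CollectLiveFiles(
              const std::vector<std::vector<uint64_t>>& files_by_level) {
            size_t total = 0;
            for (const auto& level : files_by_level) total += level.size();
            std::vector<uint64_t> live;
            live.reserve(total);  // pre-calculated size: exactly one allocation
            for (const auto& level : files_by_level)
              live.insert(live.end(), level.begin(), level.end());
            return live;
          }

          // Later, outside the big lock, convert once for O(1) lookups:
          //   std::unordered_set<uint64_t> live_set(live.begin(), live.end());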
      
      Test Plan: make check;  db_stress
      
      Reviewers: dhruba, heyongqiang, MarkCallaghan
      
      Reviewed By: dhruba
      
      CC: leveldb, zshao
      
      Differential Revision: https://reviews.facebook.net/D10197
  7. 12 Apr 2013 (1 commit)
  8. 11 Apr 2013 (1 commit)
    • Prevent segfault in OpenCompactionOutputFile · 77305871
      Committed by Dhruba Borthakur
      Summary:
      The segfault happened because the program was unable to open a new
      sst file (as part of the compaction) once the process ran out of
      file descriptors.
      
      The fix is to check the return status of the file creation before taking
      any other action.
      
      Program received signal SIGSEGV, Segmentation fault.
      [Switching to Thread 0x7fabf03f9700 (LWP 29904)]
      leveldb::DBImpl::OpenCompactionOutputFile (this=this@entry=0x7fabf9011400, compact=compact@entry=0x7fabf741a2b0) at db/db_impl.cc:1399
      1399    db/db_impl.cc: No such file or directory.
      (gdb) where
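      In sketch form (a fragment with hypothetical variable names; the real code is in OpenCompactionOutputFile): propagate the error instead of using an unopened file.

          // Illustrative: check the status of file creation before anything else.
          WritableFile* outfile = nullptr;
          Status s = env_->NewWritableFile(fname, &outfile);
          if (!s.ok()) return s;  // e.g. "Too many open files"
          // Only now is it safe to build the table writer on top of outfile.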
      
      Test Plan: make check
      
      Reviewers: MarkCallaghan, sheki
      
      Reviewed By: MarkCallaghan
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10101
  9. 09 Apr 2013 (1 commit)
    • [RocksDB][Bug] Look at all the files, not just the first file in TransactionLogIter as BatchWrites can leave it in Limbo · 574b76f7
      Committed by Abhishek Kona
      
      Summary:
      The Transaction Log Iterator did not move to the next file in the series if there was a write batch at the end of the current file.
      The solution: if the last seq no. of the current file is < RequestedSeqNo, assume the first seq no. of the next file will satisfy the request (see the sketch below).
      
      Also did major refactoring around the code: moved opening the log reader into a separate function and got rid of the goto.
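      The advance rule, sketched with illustrative names (not the exact iterator code):

          // If the current file ends before the sequence number we were asked
          // for, the requested batch must start in the next file of the series.
          if (last_seq_in_current_file < requested_seq_no && HasNextFile()) {
            current_file_index_++;
            OpenLogReader(current_file_index_);  // hypothetical helper from the refactor
          }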
      
      Test Plan: added a unit test for it.
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: leveldb, emayanke
      
      Differential Revision: https://reviews.facebook.net/D10029
  10. 06 Apr 2013 (1 commit)
    • [Rocksdb] Remove useless struct TableAndFile · 0e40185a
      Committed by Abhishek Kona
      Summary:
      TableAndFile was a struct used earlier to delete the file, since we did not have std::unique_ptr in the codebase.
      With Chip introducing C++11 hotness like std::unique_ptr, we can do away with the struct (sketched below).
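      Roughly the change, with illustrative member names:

          // Before: a wrapper struct existed only so both raw pointers could
          // be deleted together when the table cache entry was evicted.
          struct TableAndFile {
            Table* table;
            RandomAccessFile* file;
          };

          // After: unique_ptr members delete themselves; no helper struct.
          std::unique_ptr<Table> table;
          std::unique_ptr<RandomAccessFile> file;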
      
      Test Plan: make all check
      
      Reviewers: haobo, heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D9975
  11. 03 Apr 2013 (2 commits)
  12. 29 Mar 2013 (4 commits)
    • Let's get rid of delete as much as possible; here are some examples. · 645ff8f2
      Committed by Haobo Xu
      Summary:
      If a class owns an object:
       - If the object can be null => use a unique_ptr; no delete.
       - If the object cannot be null => you don't even need new, let alone delete.
       - For a runtime-sized array => use a vector; no delete. (See the sketch below.)
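      All three rules, applied to one illustrative class:

          #include <cstddef>
          #include <memory>
          #include <vector>

          struct Widget {};  // placeholder type for the sketch

          class Owner {
            std::unique_ptr<Widget> optional_;  // may be null: unique_ptr, no delete
            Widget always_;                     // never null: plain member, no new
            std::vector<char> buffer_;          // runtime-sized array: vector
           public:
            explicit Owner(size_t n) : buffer_(n) {}
          };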
      
      Test Plan: make check
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: leveldb, zshao, sheki, emayanke, MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D9783
    • [RocksDB] Fix binary search while finding probable wal files · 3b51605b
      Committed by Abhishek Kona
      Summary:
      In GetUpdatesSince, RocksDB does a binary search over the files which might contain the requested sequence number.
      There was a bug in the binary search: when the file pointed to by the middle index was empty/corrupt, the search needs to resize the vector and update the indexes.
      This now fixes that (see the sketch below).
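      The fixed loop, roughly (illustrative helper and return convention, not the real GetUpdatesSince code):

          #include <cstdint>
          #include <vector>

          bool ReadFirstSequence(uint64_t file, uint64_t* first_seq);  // hypothetical

          // Returns the index of the last file whose first seq no. <= target,
          // dropping empty/corrupt files from the candidate list as it goes.
          int64_t FindStartFile(std::vector<uint64_t>& files, uint64_t target_seq) {
            int64_t lo = 0, hi = static_cast<int64_t>(files.size()) - 1;
            while (lo <= hi) {
              int64_t mid = lo + (hi - lo) / 2;
              uint64_t first_seq;
              if (!ReadFirstSequence(files[mid], &first_seq)) {
                files.erase(files.begin() + mid);             // resize the vector ...
                hi = static_cast<int64_t>(files.size()) - 1;  // ... update the indexes
                continue;
              }
              if (first_seq <= target_seq) lo = mid + 1; else hi = mid - 1;
            }
            return hi;
          }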
      
      Test Plan: existing unit tests pass.
      
      Reviewers: heyongqiang, dhruba
      
      Reviewed By: heyongqiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9777
    • [Rocksdb] Fix Crash on finding a db with no log files. Error out instead · 8e9c781a
      Committed by Abhishek Kona
      Summary:
      If the vector returned by GetUpdatesSince was empty, it was still returned to the
      user, which caused a std::range error to be thrown.
      The probable file list is now checked, and an IOError status is returned instead of OK (see the sketch below).
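      The guard, in sketch form (illustrative variable and message):

          if (wal_files.empty()) {
            // Fail loudly instead of handing the caller an empty list.
            return Status::IOError("no wal files contain the requested seq no.");
          }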
      
      Test Plan: added a unit test.
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9771
    • Use non-mmapd files for Write-Ahead Files · 7fdd5f5b
      Committed by Abhishek Kona
      Summary:
      Use non-mmapd files for the Write-Ahead log.
      The earlier use of mmapd files made the log iterator read ahead and miss records.
      Now the reader and writer point to the same physical location.
      
      There is no perf regression:
      ./db_bench --benchmarks=fillseq --db=/dev/shm/mmap_test --num=$(million 20) --use_existing_db=0 --threads=2
      with this diff:
      fillseq      :      10.756 micros/op 185281 ops/sec;   20.5 MB/s
      without this diff:
      fillseq      :      11.085 micros/op 179676 ops/sec;   19.9 MB/s
      
      Test Plan: unit test included
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9741
  13. 28 Mar 2013 (1 commit)
    • Memory-manage statistics · 63f216ee
      Committed by Abhishek Kona
      Summary:
      Earlier, the Statistics object was a raw pointer, which meant the user had to clean up
      the Statistics object after creating the database. In most use cases the database is created in a function and the statistics pointer goes out of scope, so the Statistics object would never be deleted.
      Now a shared_ptr manages this (see the sketch below).
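      Usage then looks roughly like this (a sketch; the factory name CreateDBStatistics is an assumption about the API of that era):

          leveldb::Options options;
          // The shared_ptr keeps the Statistics alive for as long as the
          // options (and the db) reference it; no manual delete is needed
          // when the creating function returns.
          options.statistics = leveldb::CreateDBStatistics();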
      
      Want this in before the next release.
      
      Test Plan: make all check.
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9735
  14. 27 Mar 2013 (1 commit)
  15. 22 Mar 2013 (2 commits)
  16. 21 Mar 2013 (2 commits)
    • Run compactions even if workload is readonly or read-mostly. · d0798f67
      Committed by Dhruba Borthakur
      Summary:
      The events that trigger compaction:
      * opening the database
      * Get -> only if seek compaction is not disabled and other checks are true
      * MakeRoomForWrite -> when memtable is full
      * BackgroundCall ->
        If the background thread is about to do a compaction run, it schedules
        a new background task to trigger a possible compaction. This will cause
        additional background threads to find and process other compactions that
        can run concurrently (sketched below).
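      An illustrative sketch of the BackgroundCall idea (BGWork and NeedsCompaction are assumed names, not the exact code):

          void DBImpl::BackgroundCall() {
            MutexLock l(&mutex_);
            if (versions_->NeedsCompaction()) {
              // Queue another check so a second background thread can pick up
              // a different compaction and run it concurrently with this one.
              env_->Schedule(&DBImpl::BGWork, this);
            }
            BackgroundCompaction();
          }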
      
      Test Plan: ran db_bench with overwrite and readonly alternatively.
      
      Reviewers: sheki, MarkCallaghan
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9579
    • Ability to configure bufferedio-reads, filesystem-readaheads and mmap-read-write per database. · ad96563b
      Committed by Dhruba Borthakur
      Summary:
      This patch allows an application to specify, per database, whether to use bufferedio,
      reads-via-mmaps, and writes-via-mmaps. Earlier, a global static variable
      was used to configure this functionality.
      
      The default setting remains the same (and is backward compatible):
       1. use bufferedio
       2. do not use mmaps for reads
       3. use mmap for writes
       4. use readaheads for reads needed for compaction
      
      I also added a parameter to db_bench to be able to explicitly specify
      whether to do readaheads for compactions or not.
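      In option form, the defaults look roughly like this (the field names follow early RocksDB conventions but are assumptions here):

          leveldb::Options options;
          options.allow_os_buffer = true;    // 1. use bufferedio
          options.allow_mmap_reads = false;  // 2. do not use mmaps for reads
          options.allow_mmap_writes = true;  // 3. use mmap for writes
          // 4. compaction readahead stays on by default; db_bench gained a
          //    flag to toggle it explicitly.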
      
      Test Plan: make check
      
      Reviewers: sheki, heyongqiang, MarkCallaghan
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9429
  17. 20 Mar 2013 (4 commits)
  18. 19 Mar 2013 (1 commit)
  19. 15 Mar 2013 (1 commit)
    • Enhance db_bench · 5a8c8845
      Committed by Mark Callaghan
      Summary:
      Add --benchmarks=updaterandom for read-modify-write workloads. This is different
      from --benchmarks=readrandomwriterandom in a few ways. First, an "operation" is the
      combined time to do the read and write, rather than treating them as two ops. Second,
      the same key is used for the read and write (sketched below).
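      One updaterandom operation is thus roughly (a sketch, not the db_bench source):

          // Read-modify-write on the SAME key, timed as a single operation.
          std::string value;
          db->Get(leveldb::ReadOptions(), key, &value);   // read
          value.append("delta");                          // modify
          db->Put(leveldb::WriteOptions(), key, value);   // write
          // hist.Add(elapsed_micros);  // one combined measurement, not two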
      
      Change RandomGenerator to support rows larger than 1M. It was using "assert"
      to fail, and assert is compiled away when -DNDEBUG is used.
      
      Add more options to db_bench
      --duration - sets the number of seconds for tests to run. When not set the
      operation count continues to be the limit. This is used by random operation
      tests.
      
      --use_snapshot - when set GetSnapshot() is called prior to each random read.
      This is to measure the overhead from using snapshots.
      
      --get_approx - when set GetApproximateSizes() is called prior to each random
      read. This is to measure the overhead for a query optimizer.
      
      Test Plan:
      run db_bench
      
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D9267
  20. 13 Mar 2013 (1 commit)
  21. 12 Mar 2013 (1 commit)
    • Prevent segfault because SizeUnderCompaction was called without any locks. · ebf16f57
      Committed by Dhruba Borthakur
      Summary:
      SizeBeingCompacted was called without any lock protection. This caused
      crashes, especially when running db_bench with value_size=128K.
      The fix is to compute SizeUnderCompaction while holding the mutex and
      to pass these values into the call to Finalize (sketched after the stack trace below).
      
      (gdb) where
      #4  leveldb::VersionSet::SizeBeingCompacted (this=this@entry=0x7f0b490931c0, level=level@entry=4) at db/version_set.cc:1827
      #5  0x000000000043a3c8 in leveldb::VersionSet::Finalize (this=this@entry=0x7f0b490931c0, v=v@entry=0x7f0b3b86b480) at db/version_set.cc:1420
      #6  0x00000000004418d1 in leveldb::VersionSet::LogAndApply (this=0x7f0b490931c0, edit=0x7f0b3dc8c200, mu=0x7f0b490835b0, new_descriptor_log=<optimized out>) at db/version_set.cc:1016
      #7  0x00000000004222b2 in leveldb::DBImpl::InstallCompactionResults (this=this@entry=0x7f0b49083400, compact=compact@entry=0x7f0b2b8330f0) at db/db_impl.cc:1473
      #8  0x0000000000426027 in leveldb::DBImpl::DoCompactionWork (this=this@entry=0x7f0b49083400, compact=compact@entry=0x7f0b2b8330f0) at db/db_impl.cc:1757
      #9  0x0000000000426690 in leveldb::DBImpl::BackgroundCompaction (this=this@entry=0x7f0b49083400, madeProgress=madeProgress@entry=0x7f0b41bf2d1e, deletion_state=...) at db/db_impl.cc:1268
      #10 0x0000000000428f42 in leveldb::DBImpl::BackgroundCall (this=0x7f0b49083400) at db/db_impl.cc:1170
      #11 0x000000000045348e in BGThread (this=0x7f0b49023100) at util/env_posix.cc:941
      #12 leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper (arg=0x7f0b49023100) at util/env_posix.cc:874
      #13 0x00007f0b4a7cf10d in start_thread (arg=0x7f0b41bf3700) at pthread_create.c:301
      #14 0x00007f0b49b4b11d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
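      Roughly, with hypothetical signatures:

          // Compute per-level bytes-being-compacted while the db mutex is
          // held, then hand the precomputed vector to Finalize instead of
          // letting Finalize call back into VersionSet without the lock.
          std::vector<uint64_t> size_being_compacted(NumberLevels() - 1);
          mutex_.Lock();
          versions_->SizeBeingCompacted(&size_being_compacted);
          versions_->Finalize(v, size_being_compacted);
          mutex_.Unlock();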
      
      Test Plan:
      make check
      
      I am running db_bench with a value size of 128K to see if the segfault is fixed.
      
      Reviewers: MarkCallaghan, sheki, emayanke
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9279
  22. 08 Mar 2013 (1 commit)
    • A mechanism to detect manifest file write errors and put db in readonly mode. · 6d812b6a
      Committed by Dhruba Borthakur
      Summary:
      If there is an error while writing an edit to the manifest file, the manifest
      file is closed and reopened to check whether the edit made it in. If the
      re-opening of the manifest is unsuccessful and options.paranoid_checks is set
      to true, then the db refuses to accept new puts, effectively putting the db
      in readonly mode (see the sketch below).
      
      In a future diff, I would like to make the default value of paranoid_checks
      true.
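      The logic, in sketch form (illustrative; the helper name is hypothetical and the real code sits in the manifest write path):

          Status s = descriptor_log_->AddRecord(record);  // write the version edit
          if (!s.ok()) {
            s = ReopenManifestAndVerify();  // hypothetical: did the edit land?
            if (!s.ok() && options_.paranoid_checks) {
              bg_error_ = s;  // sticky error: future puts fail => read-only mode
            }
          }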
      
      Test Plan: make check
      
      Reviewers: sheki
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9201
  23. 07 Mar 2013 (2 commits)
    • Do not allow Transaction Log Iterator to fall ahead when writer is writing the same file · d68880a1
      Committed by Abhishek Kona
      Summary:
      Store the last flushed seq no. in db_impl and check against it in the
      transaction log iterator. Do not attempt to read ahead if we do not know
      whether the data is flushed completely (see the sketch below).
      This does not work if flush is disabled; any ideas on fixing that?
      Minor change: iter->Next is now called automatically the first time.
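      The guard is roughly (the accessor name is hypothetical):

          // Stop before data that may not be durably flushed yet: the writer
          // could still be appending to this wal file.
          if (next_record_seq > db_->GetLastFlushedSequence()) {
            return;  // do not read ahead past the flushed prefix
          }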
      
      Test Plan:
      existing tests pass.
      More ideas on testing this?
      Planning to run some stress tests.
      
      Reviewers: dhruba, heyongqiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9087
    • Fix db_stress crash by copying keys before changing sequence num to zero. · afed6093
      Committed by Dhruba Borthakur
      Summary:
      The compaction process zeros out sequence numbers if the output is
      part of the bottommost level.
      A Slice is supposed to refer to an immutable data buffer, and the
      merger that implements the priority queue while reading kvs as
      the input of a compaction run relies on this fact. The bug was that we
      were updating the sequence number of a record in place, which caused
      succeeding invocations of the merger to return kvs in
      arbitrary order of sequence numbers.
      The fix is to copy the key to a local memory buffer before setting
      its seqno to 0 (see the sketch below).
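      In sketch form (the helper name and key layout are simplified):

          // The input Slice must stay immutable for the merging iterator, so
          // copy the key into a local buffer before zeroing the seq number.
          std::string current_key(key.data(), key.size());  // local copy
          ZeroSequenceNumber(&current_key);                 // hypothetical helper
          key = leveldb::Slice(current_key);                // used for the output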
      
      Test Plan:
      Set Options.purge_redundant_kvs_while_flush = false and then run
      db_stress --ops_per_thread=1000 --max_key=320
      
      Reviewers: emayanke, sheki
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9147
  24. 05 Mar 2013 (1 commit)
  25. 04 Mar 2013 (2 commits)
    • Add rate_delay_limit_milliseconds · 993543d1
      Committed by Mark Callaghan
      Summary:
      This adds the rate_delay_limit_milliseconds option to make the delay in
      MakeRoomForWrite configurable when the max compaction score is too high.
      This delay is called the Ln slowdown. The change also counts the Ln slowdown
      per level, to make it possible to see where the stalls occur (sketched below).
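      Illustrative fragment (the variable names are assumed):

          // Cap the write stall at the configured limit and attribute it to
          // the level whose compaction score triggered it.
          int delay_ms = std::min(computed_delay_ms,
                                  options_.rate_delay_limit_milliseconds);
          env_->SleepForMicroseconds(delay_ms * 1000);
          stall_leveln_slowdown_ms_[max_score_level] += delay_ms;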
      
      From IO-bound performance testing, the Level N stalls occur:
      * with compression -> at the largest uncompressed level. This makes sense
                            because compaction for compressed levels is much
                            slower. When Lx is uncompressed and Lx+1 is compressed
                            then files pile up at Lx because the (Lx,Lx+1)->Lx+1
                            compaction process is the first to be slowed by
                            compression.
      * without compression -> at level 1
      
      Task ID: #1832108
      
      Test Plan:
      run with real data, added test
      
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D9045
    • Ability for rocksdb to compact when flushing the in-memory memtable to a file in L0. · 806e2643
      Committed by Dhruba Borthakur
      Summary:
      Rocks accumulates recent writes and deletes in the in-memory memtable.
      When the memtable is full, it writes the contents of the memtable to
      a file in L0.
      
      This patch removes redundant records at the time of the flush. If there
      are multiple versions of the same key in the memtable, then only the
      most recent one is dumped into the output file. The purging of
      redundant records occurs only if the most recent snapshot is earlier
      than the earliest record in the memtable (see the sketch below).
      
      Should we switch this feature on by default, or keep it turned off in
      the default settings?
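      The purge condition and flush loop, in sketch form (helper names are illustrative; memtable iteration yields the newest version of a key first):

          // Safe to purge only when no live snapshot can still observe the
          // old versions, i.e. every memtable record is newer than the most
          // recent snapshot.
          bool can_purge = most_recent_snapshot_seq < earliest_seq_in_memtable;

          std::string last_user_key;
          for (iter->SeekToFirst(); iter->Valid(); iter->Next()) {
            leveldb::Slice user_key = ExtractUserKey(iter->key());
            if (can_purge && user_key == leveldb::Slice(last_user_key)) {
              continue;  // older version of a key already written: drop it
            }
            builder->Add(iter->key(), iter->value());
            last_user_key.assign(user_key.data(), user_key.size());
          }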
      
      Test Plan: Added test case to db_test.cc
      
      Reviewers: sheki, vamsi, emayanke, heyongqiang
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8991