提交 · fe3713961ec573a80e742a334747571549ad4369 · kvdb / rocksdb

15 10月, 2013 3 次提交

Features in Transaction log iterator · fe371396

由 Mayank Agarwal 提交于 10月 13, 2013

Summary:
* Logstore requests a valid change of reutrning an empty iterator and not an error in case of no log files.
* Changed the code to return the writebatch containing the sequence number requested from GetupdatesSince even if it lies in the middle. Earlier we used to return the next writebatch,. This also allows me oto guarantee that no files played upon by the iterator are redundant. I mean the starting log file has at least a sequence number >= the sequence number requested form GetupdatesSince.
* Cleaned up redundant logic in Iterator::Next and made a new function SeekToStartSequence for greater readability and maintainibilty.
* Modified a test in db_test accordingly
Please check the logic carefully and suggest improvements. I have a separate patch out for more improvements like restricting reader to read till written sequences.

Test Plan:
* transaction log iterator tests in db_test,
* db_repl_stress.
* rocks_log_iterator_test in fbcode/wormhole/rocksdb/test - 2 tests thriving on hacks till now can get simplified
* testing on the shadow setup for sigma with replication

Reviewers: dhruba, haobo, kailiu, sdong

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13437

fe371396

Add statistics to sst file · 86ef6c3f

由 Kai Liu 提交于 10月 10, 2013

Summary:
So far we only have key/value pairs as well as bloom filter stored in the
sst file.  It will be great if we are able to store more metadata about
this table itself, for example, the entry size, bloom filter name, etc.

This diff is the first step of this effort. It allows table to keep the
basic statistics mentioned in http://fburl.com/14995441, as well as
allowing writing user-collected stats to stats block.

After this diff, we will figure out the interface of how to allow user to collect their interested statistics.

Test Plan:
1. Added several unit tests.
2. Ran `make check` to ensure it doesn't break other tests.

Reviewers: dhruba, haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13419

86ef6c3f

Change Function names from Compaction->Flush When they really mean Flush · 88f2f890

由 Siying Dong 提交于 10月 14, 2013

Summary: When I debug the unit test failures when enabling background flush thread, I feel the function names can be made clearer for people to understand. Also, if the names are fixed, in many places, some tests' bugs are obvious (and some of those tests are failing). This patch is to clean it up for future maintenance.

Test Plan: Run test suites.

Reviewers: haobo, dhruba, xjin

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13431

88f2f890

12 10月, 2013 1 次提交

LRUCache to try to clean entries not referenced first. · f8509653

由 sdong 提交于 10月 09, 2013

Summary:
With this patch, when LRUCache.Insert() is called and the cache is full, it will first try to free up entries whose reference counter is 1 (would become 0 after remo\
ving from the cache). We do it in two passes, in the first pass, we only try to release those unreferenced entries. If we cannot free enough space after traversing t\
he first remove_scan_cnt_ entries, we start from the beginning again and remove those entries being used.

Test Plan: add two unit tests to cover the codes

Reviewers: dhruba, haobo, emayanke

Reviewed By: emayanke

CC: leveldb, emayanke, xjin

Differential Revision: https://reviews.facebook.net/D13377

f8509653

11 10月, 2013 3 次提交

Bad nfs file checked in a long time back. · c0ce562c

由 Dhruba Borthakur 提交于 10月 10, 2013

Summary:
Bad nfs file checked in a long time back.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:

c0ce562c

Fixing error in ParseFileName causing DestroyDB to fail on archive directory · a8b4a69d

由 Mayank Agarwal 提交于 10月 10, 2013

Summary:
This careless error was causing ASSERT_OK(DestroyDB) to fail in db_test.
Basically .. was being returned as a child of db/archive and ParseFileName returned false on that,
but 'type' was set to LogFile from earlier and not reset. The return of ParseFileName was not being checked to delete the log file or not.

Test Plan: make all check

Reviewers: dhruba, haobo, xjin, kailiu, nkg-

Reviewed By: nkg-

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13413

a8b4a69d

Minor: Fix a lint error in cache_test.cc · 40a1e31f

由 Siying Dong 提交于 10月 10, 2013

Summary:
As title. Fix an lint error:

Lint: CppLint Error
Single-argument constructor 'Value(int v)' may inadvertently be used as a type conversion constructor. Prefix the function with the 'explicit' keyword to avoid this, or add an /* implicit */ comment to suppress this warning.

Test Plan: N/A

Reviewers: emayanke, haobo, dhruba

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13401

40a1e31f

10 10月, 2013 4 次提交

Fixing build failure · d2ca2bd1

由 Igor Canadi 提交于 10月 10, 2013

Summary: virtual NewRandomRWFile is not implemented on EnvHdfs, causing build failure.

Test Plan: make clean; make all check

Reviewers: dhruba, haobo, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13383

d2ca2bd1

Env class that can randomly read and write · d0beadd4

由 Igor Canadi 提交于 10月 10, 2013

Summary: I have implemented basic simple use case that I need for External Value Store I'm working on. There is a potential for making this prettier by refactoring/combining WritableFile and RandomAccessFile, avoiding some copypasta. However, I decided to implement just the basic functionality, so I can continue working on the other diff.

Test Plan: Added a unittest

Reviewers: dhruba, haobo, kailiu

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13365

d0beadd4

Add draft logo. · 7ac3c796

由 Dhruba Borthakur 提交于 10月 09, 2013

Summary:
Add draft logo in jpg format.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:

7ac3c796

A bare-bones rocksdb logo. · 6d5f6a4b

由 Dhruba Borthakur 提交于 10月 09, 2013

Summary:
A hand-crafted rocksdb logo.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:

6d5f6a4b

09 10月, 2013 3 次提交

Remove obsolete namespace mappings. · 3c37955a

由 Dhruba Borthakur 提交于 10月 08, 2013

Summary:
The previous release 2.4 had a mapping to alias the older
namespace to rocksdb. This mapping is not needed in the new
release.

Test Plan:
make check
make release

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13359

3c37955a

Add option for storing transaction logs in a separate dir · cbf4a064

由 Naman Gupta 提交于 10月 01, 2013

Summary: In some cases, you might not want to store the data log (write ahead log) files in the same dir as the sst files. An example use case is leaf, which stores sst files in tmpfs. And would like to save the log files in a separate dir (disk) to save memory.

Test Plan: make all. Ran db_test test. A few test failing. P2785018. If you guys don't see an obvious problem with the code, maybe somebody from the rocksdb team could help me debug the issue here. Running this on leaf worked well. I could see logs stored on disk, and deleted appropriately after compactions. Obviously this is only one set of options. The unit tests cover different options. Seems like I'm missing some edge cases.

Reviewers: dhruba, haobo, leveldb

CC: xinyaohu, sumeet

Differential Revision: https://reviews.facebook.net/D13239

cbf4a064

Make db_test more robust · 11607141

由 Naman Gupta 提交于 10月 04, 2013

Summary: While working on D13239, I noticed that the same options are not used for opening and destroying at db. So adding that. Also added asserts for successful DestroyDB calls.

Test Plan: Ran unit tests. Atleast 1 unit test is failing. They failures are a result of some past logic change. I'm not really planning to fix those. But I would like to check this in. And hopefully the respective unit test owners can fix the broken tests

Reviewers: leveldb, haobo

CC: xinyaohu, sumeet, dhruba

Differential Revision: https://reviews.facebook.net/D13329

11607141

08 10月, 2013 2 次提交

Fix a bug in table builder · 1f8ade6b

由 Kai Liu 提交于 10月 07, 2013

Summary:
In talbe.cc, when reading the metablock, it uses BytewiseComparator();
However in table_builder.cc, we use r->options.comparator. After tracing
the creation of r->options.comparator, I found this comparator is an
InternalKeyComparator, which wraps the user defined comparator(details
can be found in DBImpl::SanitizeOptions().

I encountered this problem when adding metadata about "bloom filter"
before. With different comparator, we may fail to do the binary sort.

Current code works well since there is only one entry in meta block.

Test Plan:
make all check

I've also tested this change in https://reviews.facebook.net/D8283 before.

Reviewers: dhruba, haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13335

1f8ade6b

Move delete and free outside of crtical section · fa46ddb4

由 Igor Canadi 提交于 10月 07, 2013

Summary: Split Unref into two parts -> cheap and expensive. Try to call expensive Unref outside of critical section to decrease lock contention.

Test Plan: unittests

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb, kailiu

Differential Revision: https://reviews.facebook.net/D13299

fa46ddb4

06 10月, 2013 3 次提交

Unit test failure in DBTest.NumImmutableMemTable. · 1a8c1b08

由 Dhruba Borthakur 提交于 10月 06, 2013

Summary:
Previous patch introduced a unit test failure in
DBTest.NumImmutableMemTable because of change in property names.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:

1a8c1b08

Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. · 4463b11c

由 Dhruba Borthakur 提交于 10月 04, 2013

Summary: Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix.

Test Plan: make check

Reviewers: emayanke, haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13311

4463b11c

[RocksDB] Added a property "leveldb.num-immutable-mem-table" so that Flush can... · bf89edf7

由 Haobo Xu 提交于 10月 04, 2013

[RocksDB] Added a property "leveldb.num-immutable-mem-table" so that Flush can be called without blocking, and application still has a way to check when it's done also without blocking.

Summary: as title

Test Plan: DBTest.NumImmutableMemTable

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13305

bf89edf7

05 10月, 2013 5 次提交

Removed scribe, thrift and java modules. · 0a9f873f

由 Dhruba Borthakur 提交于 10月 04, 2013

Summary: Removed scribe, thrift and java modules.

Test Plan:
make release
make check

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13293

0a9f873f

M
Updating README.fb to have newest verison 2.4 · aad21108
由 Mayank Agarwal 提交于 10月 04, 2013
```
Summary:

Test Plan: visual
```
aad21108

Change namespace from leveldb to rocksdb · a143ef9b

由 Dhruba Borthakur 提交于 10月 03, 2013

Summary:
Change namespace from leveldb to rocksdb. This allows a single
application to link in open-source leveldb code as well as
rocksdb code into the same process.

Test Plan: compile rocksdb

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13287

a143ef9b

Add a statistic to count the number of calls to GetUpdatesSince · b3ed0812

由 Mayank Agarwal 提交于 10月 04, 2013

Summary: This is useful to keep track of refreshes in transaction log iterator

Test Plan: make; db_stress --statistics=1 shows it

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13281

b3ed0812

Add backward compatible option in GetLiveFiles to choose whether to not Flush first · 854d2363

由 Mayank Agarwal 提交于 10月 03, 2013

Summary:
As explained in comments in GetLiveFiles in db.h, this option will cause flush to be skipped in GetLiveFiles because some use-cases use GetSortedWalFiles after GetLiveFiles to generate more complete snapshots.
Using GetSortedWalFiles after GetLiveFiles allows us to not Flush in GetLiveFiles first because wals have everything.
Note: file deletions will be disabled before calling GLF or GSWF so live logs will not move to archive logs or get delted.
Note: Manifest file is truncated to a proper value in GLF, so it will always reply from the proper wal files on a restart

Test Plan: make

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13257

854d2363

04 10月, 2013 2 次提交

[RocksDB] Still honor DisableFileDeletions when purge_log_after_memtable_flush is on · 200c05a2

由 Haobo Xu 提交于 10月 03, 2013

Summary: as title

Test Plan: make check

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13263

200c05a2

[Rocksdb] Submit mem table flush job in a different thread pool · fa798e9e

由 Haobo Xu 提交于 9月 13, 2013

Summary: As title. This is just a quick hack and not ready for commit. fails a lot of unit test. I will test/debug it directly in ViewState shadow .

Test Plan: Try it in shadow test.

Reviewers: dhruba, xjin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12933

fa798e9e

03 10月, 2013 3 次提交

Fix SIGSEGV issue in universal compaction · 658a3ce2

由 Xing Jin 提交于 10月 02, 2013

Summary:
We saw SIGSEGV when set options.num_levels=1 in universal compaction
style. Dug into this issue for a while, and finally found the root cause (thank Haobo for discussion).

Test Plan: Add new unit test. It throws SIGSEGV without this change. Also run "make all check".

Reviewers: haobo, dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13251

658a3ce2

Triggering verify for gets also · 6b34021f

由 Mayank Agarwal 提交于 10月 01, 2013

Summary: Will use iterators to verify keys in the db for half of its keys and Gets for the other half.

Test Plan: ./db_stress --max_key=1000 --ops_per_thread=100

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13227

6b34021f

[RocksDB] Added perf counters to track skipped internal keys during iteration · 71046971

由 Haobo Xu 提交于 10月 02, 2013

Summary: as title. unit test not polished. this is for a quick live test

Test Plan: live

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13221

71046971

02 10月, 2013 1 次提交

Remove the hard-coded enum value in statistics.h · 861f6e48

由 Kai Liu 提交于 10月 01, 2013

Summary:
I am planning to add more to statistics classes but found current way of using enum is very verbose and unnecessarily increase the
difficulity of adding new statistics.

In this diff I removed the code that explicitly specifies the value of each enum entry. This will help us easily add new statistic
items more conveniently without manually adding the value of other enum entries by one.

Test Plan: make; make check;

Reviewers: haobo, dhruba, xjin, emayanke, vamsi

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13197

861f6e48

01 10月, 2013 1 次提交

Phase 2 of iterator stress test · 7edb92b8

由 Natalie Hildebrandt 提交于 9月 30, 2013

Summary: Using an iterator instead of the Get method, each thread goes through a portion of the database and verifies values by comparing to the shared state.

Test Plan:
./db_stress --db=/tmp/tmppp --max_key=10000 --ops_per_thread=10000

To test some basic cases, the following lines can be added (each set in turn) to the verifyDb method with the following expected results:

    // Should abort with "Unexpected value found"
    shared.Delete(start);

    // Should abort with "Value not found"
    WriteOptions write_opts;
    db_->Delete(write_opts, Key(start));

    // Should succeed
    WriteOptions write_opts;
    shared.Delete(start);
     db_->Delete(write_opts, Key(start));

    // Should abort with "Value not found"
    WriteOptions write_opts;
    db_->Delete(write_opts, Key(start + (end-start)/2));

    // Should abort with "Value not found"
    db_->Delete(write_opts, Key(end-1));

    // Should abort with "Unexpected value"
    shared.Delete(end-1);

    // Should abort with "Unexpected value"
    shared.Delete(start + (end-start)/2);

    // Should abort with "Value not found"
    db_->Delete(write_opts, Key(start));
    shared.Delete(start);
    db_->Delete(write_opts, Key(end-1));
    db_->Delete(write_opts, Key(end-2));

To test the out of range abort, change the key in the for loop to Key(i+1), so that the key defined by the index i is now outside of the supposed range of the database.

Reviewers: emayanke

Reviewed By: emayanke

CC: dhruba, xjin

Differential Revision: https://reviews.facebook.net/D13071

7edb92b8

29 9月, 2013 2 次提交

[RocksDB] print the name of options.memtable_factory in LOG so we know · 22bb7c75

由 Haobo Xu 提交于 9月 26, 2013

Summary: as title

Test Plan: make check

Reviewers: dhruba, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13179

22bb7c75

New unit test for iterator with snapshot · 8eb552bf

由 Xing Jin 提交于 9月 28, 2013

Summary:
I played with the reported bug about iterator with snapshot:
https://code.google.com/p/leveldb/issues/detail?id=200.

I turned the original test program
(https://code.google.com/p/leveldb/issues/attachmentText?id=200&aid=2000000000&name=test.cc&token=7uOUQW-HFlbAFMUm7EqtaAEy7Tw%3A1378320724136)
into a new unit test, but I cannot reproduce the problem. Notice lines
31-34 in above link. I have ran the new test with and without such Put()
operations. Both succeed.

So this diff simply adds the test, without changing any source codes.

Test Plan: run new test.

Reviewers: dhruba, haobo, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12735

8eb552bf

27 9月, 2013 2 次提交

[RocksDB] Move last_sequence and last_flushed_sequence_ update back into lock protected area · 0c404068

由 Haobo Xu 提交于 9月 26, 2013

Summary: A previous diff moved these outside of lock protected area. Moved back in now. Also moved tmp_batch_ update outside of lock protected area, as only the single write thread can access it.

Test Plan: make check

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13137

0c404068

[RocksDB] Fix skiplist sequential insertion optimization · 08740b15

由 Haobo Xu 提交于 9月 26, 2013

Summary: The original optimization missed updating links other than the lowest level.

Test Plan: make check; perf_context_test

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb, adsharma

Differential Revision: https://reviews.facebook.net/D13119

08740b15

26 9月, 2013 2 次提交

[RocbsDB] Add an option to enable set based memtable for perf_context_test · e0aa19a9

由 Haobo Xu 提交于 9月 25, 2013

Summary:
as title.
Some result:

-- Sequential insertion of 1M key/value with stock skip list (all in on memtable)
time ./perf_context_test  --total_keys=1000000  --use_set_based_memetable=0
Inserting 1000000 key/value pairs
...
Put uesr key comparison:
Count: 1000000  Average: 8.0179  StdDev: 176.34
Min: 0.0000  Median: 2.5555  Max: 88933.0000
Percentiles: P50: 2.56 P75: 2.83 P99: 58.21 P99.9: 133.62 P99.99: 987.50
Get uesr key comparison:
Count: 1000000  Average: 43.4465  StdDev: 379.03
Min: 2.0000  Median: 36.0195  Max: 88939.0000
Percentiles: P50: 36.02 P75: 43.66 P99: 112.98 P99.9: 824.84 P99.99: 7615.38
real	0m21.345s
user	0m14.723s
sys	0m5.677s

-- Sequential insertion of 1M key/value with set based memtable (all in on memtable)
time ./perf_context_test  --total_keys=1000000  --use_set_based_memetable=1
Inserting 1000000 key/value pairs
...
Put uesr key comparison:
Count: 1000000  Average: 61.5022  StdDev: 6.49
Min: 0.0000  Median: 62.4295  Max: 71.0000
Percentiles: P50: 62.43 P75: 66.61 P99: 71.00 P99.9: 71.00 P99.99: 71.00
Get uesr key comparison:
Count: 1000000  Average: 29.3810  StdDev: 3.20
Min: 1.0000  Median: 29.1801  Max: 34.0000
Percentiles: P50: 29.18 P75: 32.06 P99: 34.00 P99.9: 34.00 P99.99: 34.00
real	0m28.875s
user	0m21.699s
sys	0m5.749s

Worst case comparison for a Put is 88933 (skiplist) vs 71 (set based memetable)

Of course, there's other in-efficiency in set based memtable implementation, which lead to the overall worst performance. However, P99 behavior advantage is very very obvious.

Test Plan: ./perf_context_test and viewstate shadow testing

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13095

e0aa19a9

The vector rep implementation was segfaulting because of incorrect initialization of vector. · f1a60e5c

由 Dhruba Borthakur 提交于 9月 24, 2013

Summary:
The constructor for Vector memtable has a parameter called 'count'
that specifies the capacity of the vector to be reserved at allocation
time. It was incorrectly used to initialize the size of the vector.

Test Plan: Enhanced db_test.

Reviewers: haobo, xjin, emayanke

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13083

f1a60e5c

24 9月, 2013 1 次提交

Implement apis in the Environment to clear out pages in the OS cache. · 87d6eb2f

由 Dhruba Borthakur 提交于 9月 20, 2013

Summary:
Added a new api to the Environment that allows clearing out not-needed
pages from the OS cache. This will be helpful when the compressed
block cache replaces the OS cache.

Test Plan: EnvPosixTest.InvalidateCache

Reviewers: haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13041

87d6eb2f

21 9月, 2013 1 次提交

Fixing crashing tests to include iterpercent param · 9262061b

由 Natalie Hildebrandt 提交于 9月 20, 2013

Summary: Adding in the iterpercent flag to tests.

Test Plan: make crash_test

Reviewers: emayanke

Reviewed By: emayanke

Differential Revision: https://reviews.facebook.net/D13035

9262061b

20 9月, 2013 1 次提交

Better locking in vectorrep that increases throughput to match speed of storage. · 5e9f3a9a

由 Dhruba Borthakur 提交于 9月 17, 2013

Summary:
There is a use-case where we want to insert data into rocksdb as
fast as possible. Vector rep is used for this purpose.

The background flush thread needs to flush the vectorrep to
storage. It acquires the dblock then sorts the vector, releases
the dblock and then writes the sorted vector to storage. This is
suboptimal because the lock is held during the sort, which
prevents new writes for occuring.

This patch moves the sorting of the vector rep to outside the
db mutex. Performance is now as fastas the underlying storage
system. If you are doing buffered writes to rocksdb files, then
you can observe throughput upwards of 200 MB/sec writes.

This is an early draft and not yet ready to be reviewed.

Test Plan:
make check

Task ID: #

Blame Rev:

Reviewers: haobo

Reviewed By: haobo

CC: leveldb, haobo

Differential Revision: https://reviews.facebook.net/D12987

5e9f3a9a

kvdb / rocksdb 11 个月 前同步成功

kvdb / rocksdb
11 个月前同步成功