1. 02 Nov, 2013 3 commits
    • Implement a compressed block cache. · b4ad5e89
      Dhruba Borthakur committed
      Summary:
      RocksDB can now support an uncompressed block cache, a compressed
      block cache, or both. A lookup first checks the uncompressed cache;
      only if the block is not found there is it looked up in the
      compressed cache. If it is found in the compressed cache, it is
      uncompressed and inserted into the uncompressed cache.
      
      It is possible that the same block resides in the compressed cache
      as well as the uncompressed cache at the same time. Both caches
      have their own individual LRU policy.
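
The lookup order described above can be sketched as follows. This is a minimal illustration, not RocksDB's implementation: `LRUCache`, `read_block` and `read_from_storage` are hypothetical names, zlib stands in for whatever compression the table uses, capacity is counted in entries rather than bytes, and inserting a freshly read block into the compressed cache is an assumption.

```python
import zlib
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache; capacity is counted in entries, not bytes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used


def read_block(key, uncompressed, compressed, read_from_storage):
    """Return the uncompressed block, checking the uncompressed cache first."""
    block = uncompressed.get(key)
    if block is not None:
        return block
    raw = compressed.get(key)
    if raw is None:
        # Assumption: a miss in both caches reads compressed bytes from
        # storage and remembers them in the compressed cache.
        raw = read_from_storage(key)
        compressed.put(key, raw)
    block = zlib.decompress(raw)
    uncompressed.put(key, block)  # the block may now live in both caches
    return block
```

Because each cache evicts independently, a block pushed out of the small uncompressed cache can still be served from the compressed one without touching storage.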
      
      Test Plan: Unit test case attached.
      
      Reviewers: kailiu, sdong, haobo, leveldb
      
      Reviewed By: haobo
      
      CC: xjin, haobo
      
      Differential Revision: https://reviews.facebook.net/D12675
    • Task #3071144 Enhance ldb (db dump tool for leveldb) to report row counters for each row type · 1e4375d2
      Piyush Garg committed
      Summary: Added an option --count_delim=<char> which takes the given character as delimiter ('.' by default) and reports count of each row type found in the db
      
      Test Plan:
      1. Created a test (for DBDumperCommand) in rocksdb/tools/ldb_test.py which puts various key-value pairs in the db and checks the output using dump --count_delim, --count_delim="." and --count_delim=",".
      2. Created a test (for InternalDumperCommand) in rocksdb/tools/ldb_test.py which puts various key-value pairs in the db and checks the output using dump --count_delim, --count_delim="." and --count_delim=",".
      3. Manually created a database with several keys of several types and verified by running the commands
         ./ldb db=<path> dump --count_delim="<char>"
         ./ldb db=<path> idump --count_delim="<char>"
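
As a rough illustration of what counting rows per type by delimiter could look like, the sketch below treats the text before the first delimiter as the row type. The function name `count_row_types` is hypothetical, and counting a key without any delimiter under the whole key is an assumption; ldb's actual output format is not reproduced here.

```python
from collections import Counter


def count_row_types(keys, delim="."):
    """Count keys per row type, where the type is the text before the first delim."""
    counts = Counter()
    for key in keys:
        # partition() returns the whole key when delim is absent (an assumption
        # about how such keys should be grouped).
        row_type, _, _ = key.partition(delim)
        counts[row_type] += 1
    return dict(counts)
```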
      
      Reviewers: vamsi, dhruba, emayanke, kailiu
      
      Reviewed By: vamsi
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13815
    • Move I/O outside of lock · beeb74be
      Igor Canadi committed
      Summary:
      I'm figuring out how Version[Set, Edit, ] classes work and I stumbled on this.
      
      The comment no longer seems accurate. From what I read, a new file is created whenever the manifest grows too big, not only when LogAndApply is called for the first time.
      
      Test Plan: make check (currently running)
      
      Reviewers: dhruba, haobo, kailiu, emayanke
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13839
  2. 01 Nov, 2013 7 commits
    • Flush Log every 5 seconds · b572e81f
      Igor Canadi committed
      Summary: This might help with p99 performance, but does not solve the real problem. More discussion on #2947135
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13809
    • Fix a bug in table_reader_bench · 82b7e37f
      Siying Dong committed
      Summary: The iterator benchmark case is timed incorrectly. Fix it.
      
      Test Plan: Run the benchmark
      
      Reviewers: haobo, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13845
    • A very simple benchmark to measure a Table implementation's Get() and Iterator performance · 7caadf2e
      Siying Dong committed
      Summary: It is a very simple benchmark to measure a Table implementation's Get() and iterator performance if all the data is in memory.
      
      Test Plan: N/A
      
      Reviewers: dhruba, haobo, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13743
    • [RocksDB] Add OnCompactionStart to CompactionFilter class · 8cbe5bb5
      Haobo Xu committed
      Summary: This gives the application's compaction filter a chance to access context information about a specific compaction run. For example, the application could behave differently depending on whether a compaction goes through all data files.
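
A minimal sketch of the idea, assuming a start-of-compaction hook that receives a context object. `CompactionContext`, its `is_full_compaction` field, `ExpiredEntryFilter` and `run_compaction` are hypothetical names for illustration and do not mirror the C++ signatures.

```python
class CompactionContext:
    """Hypothetical context passed to the filter at the start of a compaction."""

    def __init__(self, is_full_compaction):
        self.is_full_compaction = is_full_compaction


class ExpiredEntryFilter:
    """Drops entries marked expired, but only during a full compaction."""

    def __init__(self):
        self.drop_expired = False

    def on_compaction_start(self, context):
        # Decide per compaction run how the filter should behave.
        self.drop_expired = context.is_full_compaction

    def filter(self, key, value):
        """Return True to drop the entry from the compaction output."""
        return self.drop_expired and value == b"expired"


def run_compaction(entries, compaction_filter, is_full_compaction):
    """Call the start hook once, then filter every entry."""
    compaction_filter.on_compaction_start(CompactionContext(is_full_compaction))
    return [(k, v) for k, v in entries if not compaction_filter.filter(k, v)]
```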
      
      Test Plan: make check
      
      Reviewers: dhruba, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13683
    • b4fab3be
    • Fix make release · 138a8eee
      Igor Canadi committed
      Summary: Don't define if already defined.
      
      Test Plan: Running make release in parallel, will not commit if it fails.
      
      Reviewers: emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13833
    • In-place updates for equal keys and similar sized values · fe250702
      Naman Gupta committed
      Summary:
      Currently, for each put, fresh memory is allocated and a new entry is added to the memtable with a new sequence number, irrespective of whether the key already exists in the memtable. This diff is an attempt to update the value in place for existing keys. It currently handles a very simple case:
      1. The key already exists in the current memtable. Values in an immutable memtable or a snapshot are not updated in place.
      2. The latest value type is a 'put', i.e. kTypeValue.
      3. The new value size is less than the existing value size, to avoid reallocating memory.
      
      TODO: For a put of an existing key, deallocate the memory taken by values of other value types until a kTypeValue is found, i.e. remove kTypeMerge.
      TODO: Update the transaction log, to allow consistent reload of the memtable.
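
The three conditions above can be sketched with a toy memtable. The `MemTable` class and its layout are hypothetical, and a `bytearray` stands in for the preallocated value buffer; unlike a real allocation it shrinks to the new length after an in-place write.

```python
TYPE_VALUE = "put"    # stands in for kTypeValue
TYPE_MERGE = "merge"  # stands in for kTypeMerge


class MemTable:
    def __init__(self):
        self.entries = {}  # key -> list of (seq, type, bytearray), newest last
        self.next_seq = 1

    def put(self, key, value):
        """Return True if the value was updated in place, False if appended."""
        versions = self.entries.setdefault(key, [])
        if versions:
            _, vtype, buf = versions[-1]
            # In place only if the latest entry is a put and the new value
            # is smaller than the existing one.
            if vtype == TYPE_VALUE and len(value) < len(buf):
                buf[:] = value  # reuse the existing buffer, keep the old sequence number
                return True
        versions.append((self.next_seq, TYPE_VALUE, bytearray(value)))
        self.next_seq += 1
        return False

    def get(self, key):
        versions = self.entries.get(key)
        return bytes(versions[-1][2]) if versions else None
```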
      
      Test Plan: Added a unit test verifying the in-place update. Some other unit tests are broken due to invalid sequence number checks. Will fix them next.
      
      Reviewers: xinyaohu, sumeet, haobo, dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D12423
      
      Automatic commit by arc
  3. 31 Oct, 2013 1 commit
    • Follow-up Cleanup After D13521 · f03b2df0
      Siying Dong committed
      Summary:
      This patch is to address @haobo's comments on D13521:
      1. Rename Table to TableReader and its factory function to GetTableReader.
      2. Move the compression type selection logic out of TableBuilder and into the compaction logic.
      3. More accurate comments.
      4. Move stat name constants into the BlockBasedTable implementation.
      5. Remove some leftover code in simple_table_db_test.
      
      Test Plan: pass test suites.
      
      Reviewers: haobo, dhruba, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13785
  4. 30 Oct, 2013 1 commit
  5. 29 Oct, 2013 8 commits
    • Change a typo in method signature · 79d8dad3
      Kai Liu committed
    • Make "Table" pluggable · d4eec30e
      Siying Dong committed
      Summary: This patch makes Table and TableBuilder abstract classes and moves the implementation of the current table into BlockBasedTable and BlockBasedTableBuilder.
      
      Test Plan: Make db_test.cc work with the block-based table. Add a new test, simple_table_db_test.cc, in which a different, simple table format is implemented.
      
      Reviewers: dhruba, haobo, kailiu, emayanke, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13521
    • Run benchmark with no debug · 8ace6b0f
      Igor Canadi committed
      Summary: assert(Overlap) significantly slows down the benchmark. Ignore assertions when executing blob_store_bench.
      
      Test Plan: Ran the benchmark
      
      Reviewers: dhruba, haobo, kailiu
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13737
    • Fix data race in BlobStore benchmark · 17991cd5
      Igor Canadi committed
      Summary: Apparently C++ doesn't like it if you copy around its atomic<> variables. When running the benchmark for a longer time, it used to stall. Changed WorkerThread in config to WorkerThread*. It works now.
      
      Test Plan: Ran benchmark
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13731
    • Support user-defined table stats collector · 994575c1
      Kai Liu committed
      Summary:
      1. Added a new option that supports user-defined table stats collection.
      2. Added a deleted key stats collector in `utilities`.
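
A sketch of how a user-defined collector might plug into table building. The interface here (`add` per entry, `finish` returning a dict of stats) and the names `DeletedKeysCollector` and `build_table_stats` are assumptions for illustration, not the option's real C++ interface.

```python
class DeletedKeysCollector:
    """Hypothetical collector that counts deletion entries written to a table."""

    def __init__(self):
        self.deleted = 0

    def add(self, key, value, is_deletion):
        if is_deletion:
            self.deleted += 1

    def finish(self):
        return {"deleted_keys": self.deleted}


def build_table_stats(entries, collectors):
    """Feed every entry to every collector, then merge their final stats."""
    for key, value, is_deletion in entries:
        for collector in collectors:
            collector.add(key, value, is_deletion)
    stats = {}
    for collector in collectors:
        stats.update(collector.finish())
    return stats
```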
      
      Test Plan:
      Added a unit test for newly added code.
      Also ran make check to make sure other tests are not broken.
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13491
    • Fix a valgrind warning · 7e91b86f
      Kai Liu committed
      Summary:
      A recent valgrind run found that a recently added unit test leaks memory
      because the DB is not closed at the end of the test.
      
      Test Plan: Re-run valgrind locally and make sure there is no memory leak any more.
      
      Reviewers: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13725
    • If a Put fails, fail all other puts · 100fa8e0
      Igor Canadi committed
      Summary:
      When a Put fails, it can leave the database in a messy state. We don't want to pretend that everything is OK when it may not be. We fail every write following the failed one.
      
      I added checks for corruption to DBImpl::Write(). Is there anywhere else I need to add them?
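
The "sticky failure" behaviour can be sketched as below. `StickyErrorDB`, `WriteError` and the injected `storage_write` callable are hypothetical; the real checks live in DBImpl::Write() and are not reproduced here.

```python
class WriteError(Exception):
    pass


class StickyErrorDB:
    def __init__(self, storage_write):
        self._storage_write = storage_write  # hypothetical callable; may raise OSError
        self._failed_with = None             # remembers the first failure

    def put(self, key, value):
        # Once any write has failed, refuse all later writes.
        if self._failed_with is not None:
            raise WriteError(f"earlier write failed: {self._failed_with}")
        try:
            self._storage_write(key, value)
        except OSError as exc:
            self._failed_with = exc
            raise WriteError(str(exc)) from exc
```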
      
      Test Plan: Corruption unit test.
      
      Reviewers: dhruba, haobo, kailiu
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13671
    • Fix a bug that index block's restart_block_interval is not 1 · 1ca86f03
      Kai Liu committed
      Summary:
      
      This bug may affect the seek performance.
      
      Test Plan:
      
      make
      make check
      
      Also gdb into some index block builder to make sure the restart_block_interval is `1`.
  6. 28 Oct, 2013 2 commits
  7. 25 Oct, 2013 2 commits
    • Unify DeleteFile and DeleteWalFiles · 56305221
      Mayank Agarwal committed
      Summary:
      This is to simplify rocksdb public APIs and improve the code quality.
      Created an additional parameter to ParseFileName for log sub type and improved the code for deleting a wal file.
      Wrote exhaustive unit-tests in delete_file_test
      Unification of other redundant APIs can be taken up in a separate diff
      
      Test Plan: Expanded delete_file test
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13647
    • Fix the log number bug when updating MANIFEST file · c17607a2
      Kai Liu committed
      Summary:
      A crash may occur during the flushes of more than two memtables.
      
      As the info log suggested, even when both were successfully flushed,
      the recovery process still picks up one of the memtables' logs for recovery.
      
      This diff fixes the problem by setting the correct "log number" in MANIFEST.
      
      Test Plan: make test; deployed to leaf4 and make sure it doesn't result in crashes of this type.
      
      Reviewers: haobo, dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13659
  8. 24 Oct, 2013 5 commits
  9. 23 Oct, 2013 5 commits
    • Improve the comment for the shared library in Makefile · b37fda84
      Kai Liu committed
    • Enable blobs to be fragmented · 30f1b97a
      Igor Canadi committed
      Summary:
      I have implemented a FreeList version that supports fragmented blob chunks. Each block gets allocated and freed in FIFO order. Since the idea is for the blocks to be big, we will not take a big hit from non-sequential IO. The free list is also faster, taking only O(k) time in both free and allocate instead of O(N) as before.
      
      See more info on the task: https://our.intern.facebook.com/intern/tasks/?t=2990558
      
      Also, I'm taking Slice instead of const char * and size in Put function.
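
A simplified sketch of a FIFO free list that hands out possibly non-contiguous blocks. It works at single-block granularity with a hypothetical `FreeList` class, whereas the commit talks about chunks, so treat it only as an illustration of why allocate and free cost O(k) for a blob of k blocks.

```python
from collections import deque


class FreeList:
    def __init__(self, total_blocks):
        self.free = deque(range(total_blocks))  # FIFO queue of free block ids

    def allocate(self, k):
        """Take k blocks from the front; they need not be contiguous."""
        if k > len(self.free):
            raise MemoryError("not enough free blocks")
        return [self.free.popleft() for _ in range(k)]

    def release(self, blocks):
        """Return blocks to the back of the queue."""
        self.free.extend(blocks)
```

After some blobs are freed and others allocated, a new blob can end up with blocks such as `[3, 0, 1]`, which is the fragmentation the commit allows.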
      
      Test Plan: unittests
      
      Reviewers: haobo, kailiu, dhruba, emayanke
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13569
    • Update the latest rocksdb version · 70e87f78
      Kai Liu committed
    • Dbid feature · 9b50106f
      Mayank Agarwal committed
      Summary:
      Create a new type of file, called DBID, on startup if it doesn't already exist.
      It stores a unique number generated with the Boost library's uuid header.
      The use case is to identify a db that has lost all its data and come back up either empty or from an image (a backup or a live replica's recovery).
      The key point to note is that DBID is not stored in a backup or db snapshot.
      It's preferable to use Boost for uuid because:
      1) A non-standard way of generating a uuid is not good.
      2) /proc/sys/kernel/random/uuid generates a uuid, but only on Linux environments, and the solution would not be clean.
      3) C++ doesn't have any direct way to get a uuid.
      4) Boost is a very good library that is already linked into rocksdb from third-party.
      Note: I had to update the TOOLCHAIN_REV in the build files to get the latest version of Boost from third-party, as the older version had a bug.
      I had to put Wno-uninitialized in the Makefile because boost-1.51 has an uninitialized variable and rocksdb would not compile otherwise. The latest open-source Boost is 1.54, but it is not in third-party. I have notified the concerned people in fbcode about it.
      @kailiu : While releasing to third-party, an additional dependency will need to be created for Boost in the TARGETS file. I can help identify it.
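
The create-if-missing behaviour can be sketched as follows. Python's `uuid` module stands in for Boost's uuid generator, and the function name `load_or_create_db_id` and the default file name are illustrative assumptions.

```python
import os
import uuid


def load_or_create_db_id(db_dir, filename="DBID"):
    """Return the id stored in db_dir, creating a fresh one if the file is missing."""
    path = os.path.join(db_dir, filename)
    if os.path.exists(path):
        with open(path) as f:
            return f.read().strip()
    db_id = str(uuid.uuid4())  # random uuid; stands in for the Boost generator
    with open(path, "w") as f:
        f.write(db_id)
    return db_id
```

Reopening with the file present returns the same id, while deleting the file yields a different id on the next open, so a db that lost its data is distinguishable from the original.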
      
      Test Plan:
      Expand db_test to test 2 cases
      1) Restarting db with Id file present - verify that no change to Id
      2) Restarting db with Id file deleted - verify that a different Id is there after reopen
      Also run make all check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13587
    • Disallow transaction log iterator to skip sequences · ae8e0770
      Mayank Agarwal committed
      Summary:
      This is expected to solve the "gaps in transaction log iterator" problem.
      * After many observations of the gaps on the sigmafio machines, I found that they are always due to a race between the log reader and writer.
      * So when we drop the wormhole subscription and refresh the iterator, the gaps are not there.
      * It is NOT due to some boundary or corner case left unattended in the iterator logic, because I checked many instances of the gaps against their log files with ldb. The log files are NOT corrupted either.
      * The solution is to not allow the iterator to read incompletely written sequences, and to have it detect gaps itself and invalidate itself, which causes the application to refresh the iterator normally and seek to the required sequence properly.
      * Thus, the iterator can at least guarantee that it will not return any gaps.
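
The gap check can be illustrated with a small generator. It assumes each batch carries its first sequence number and a record count; `read_batches` and `SequenceGapError` are hypothetical names, and raising an exception stands in for the iterator invalidating itself.

```python
class SequenceGapError(Exception):
    pass


def read_batches(batches, start_seq):
    """Yield (seq, count, payload) batches in order; stop with an error on a gap."""
    expected = start_seq
    for seq, count, payload in batches:
        if seq != expected:
            # The caller is expected to refresh the iterator and seek again.
            raise SequenceGapError(f"expected sequence {expected}, got {seq}")
        yield seq, count, payload
        expected = seq + count  # next batch must start right after this one
```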
      
      Test Plan:
      * db_test based log iterator tests
      * db_repl_stress
      * testing on sigmafio setup to see gaps go away
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13593
  10. 22 Oct, 2013 1 commit
  11. 21 Oct, 2013 1 commit
  12. 18 Oct, 2013 4 commits
    • tmpfs does not support fallocate · bcc85579
      Igor Canadi committed
      Summary: This caused Siying's unit test to fail.
      
      Test Plan: Unittest
      
      Reviewers: dhruba, kailiu, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13539
    • Fix Bug: iterator.Prev() or iterator.SeekToLast() might return the first... · 65428b0c
      Siying Dong committed
      Fix Bug: iterator.Prev() or iterator.SeekToLast() might return the first element instead of the correct one
      
      Summary:
      Recent patch https://reviews.facebook.net/D11865 introduced a regression bug:
      
      DBIter::FindPrevUserEntry(), which is called by DBIter::Prev() (and also implicitly when calling iterator.SeekToLast()), might issue a seek after skipping too many entries. If the skipped entry just before the seek() is a delete, the saved key is erased, so it seeks to the front and Prev() returns the first element.
      
      This patch fixes the bug by not doing seek() in DBIter::FindNextUserEntry() if the saved key has been erased.
      
      Test Plan: Add a test DBTest.IterPrevMaxSkip which would fail without the patch and would pass with the change.
      
      Reviewers: dhruba, xjin, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13557
    • Universal Compaction to Have a Size Percentage Threshold To Decide Whether to Compress · 9edda370
      Siying Dong committed
      Summary:
      This patch adds an option for universal compaction that lets us compress output files only if the files compacted previously have not yet reached a specified ratio, to save CPU costs in some cases.
      
      Compression is always skipped for flushing. This is because the size information is not easy to evaluate for flushing case. We can improve it later.
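
One possible reading of this rule, sketched under assumptions: the output is compressed only while the previously compacted (older) data makes up less than the threshold percentage of the total size, and flushes are never compressed. The function `should_compress_output` and its parameters are hypothetical and do not mirror the real option.

```python
def should_compress_output(older_bytes, total_bytes, threshold_percent, is_flush=False):
    """Decide whether a universal-compaction output file should be compressed."""
    if is_flush:
        return False  # the commit always skips compression when flushing
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Integer comparison of older_bytes / total_bytes against threshold_percent / 100.
    return older_bytes * 100 < total_bytes * threshold_percent
```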
      
      Test Plan:
      add test
      DBTest.UniversalCompactionCompressRatio1 and DBTest.UniversalCompactionCompressRatio12
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13467
    • Add bloom filter to predefined table stats · aac44226
      Kai Liu committed
      Summary: As title.
      
      Test Plan: Updated the unit tests to make sure the new statistic is correctly written/read.
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13497