1. 13 11月, 2013 1 次提交
  2. 12 11月, 2013 2 次提交
  3. 09 11月, 2013 1 次提交
    • I
      Speed up FindObsoleteFiles · 1510339e
      Igor Canadi 提交于
      Summary:
      Here's one solution we discussed on speeding up FindObsoleteFiles. Keep a set of all files in DBImpl and update the set every time we create a file. I probably missed few other spots where we create a file.
      
      It might speed things up a bit, but makes code uglier. I don't really like it.
      
      Much better approach would be to abstract all file handling to a separate class. Think of it as layer between DBImpl and Env. Having a separate class deal with file namings and deletion would benefit both code cleanliness (especially with huge DBImpl) and speed things up. It will take a huge effort to do this, though.
      
      Let's discuss offline today.
      
      Test Plan: Ran ./db_stress, verified that files are getting deleted
      
      Reviewers: dhruba, haobo, kailiu, emayanke
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D13827
      1510339e
  4. 08 11月, 2013 2 次提交
    • K
      Provide mechanism to configure when to flush the block · fd075d6e
      Kai Liu 提交于
      Summary: Allow block based table to configure the way flushing the blocks. This feature will allow us to add support for prefix-aligned block.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, sdong, igor
      
      Reviewed By: sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13875
      fd075d6e
    • I
      Flush the log outside of lock · 444cf88a
      Igor Canadi 提交于
      Summary:
      Added a new call LogFlush() that flushes the log contents to the OS buffers. We never call it with lock held.
      
      We call it once for every Read/Write and often in compaction/flush process so the frequency should not be a problem.
      
      Test Plan: db_test
      
      Reviewers: dhruba, haobo, kailiu, emayanke
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13935
      444cf88a
  5. 07 11月, 2013 2 次提交
    • H
      [RocksDB] Generalize prefix-aware iterator to be used for more than one Seek · fd204488
      Haobo Xu 提交于
      Summary: Added a prefix_seek flag in ReadOptions to indicate that Seek is prefix aware(might not return data with different prefix), and also not bound to a specific prefix. Multiple Seeks and range scans can be invoked on the same iterator. If a specific prefix is specified, this flag will be ignored. Just a quick prototype that works for PrefixHashRep, the new lockless memtable could be easily extended with this support too.
      
      Test Plan: test it on Leaf
      
      Reviewers: dhruba, kailiu, sdong, igor
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13929
      fd204488
    • S
      WAL log retention policy based on archive size. · c2be2cba
      shamdor 提交于
      Summary:
      Archive cleaning will still happen every WAL_ttl seconds
      but archived logs will be deleted only if archive size
      is greater then a WAL_size_limit value.
      Empty archived logs will be deleted evety WAL_ttl.
      
      Test Plan:
      1. Unit tests pass.
      2. Benchmark.
      
      Reviewers: emayanke, dhruba, haobo, sdong, kailiu, igor
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13869
      c2be2cba
  6. 05 11月, 2013 1 次提交
    • M
      Making the transaction log iterator more robust · f837f5b1
      Mayank Agarwal 提交于
      Summary:
      strict essentially means that we MUST find the startsequence. Thus we should return if starteSequence is not found in the first file in case strict is set. This will take care of ending the iterator in case of permanent gaps due to corruptions in the log files
      Also created NextImpl function that will have internal variable to distinguish whether Next is being called from StartSequence or by application.
      Set NotFoudn::gaps status to give an indication of gaps happeneing.
      Polished the inline documentation at various places
      
      Test Plan:
      * db_repl_stress test
      * db_test relating to transaction log iterator
      * fbcode/wormhole/rocksdb/rocks_log_iterator
      * sigma production machine sigmafio032.prn1
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13689
      f837f5b1
  7. 02 11月, 2013 1 次提交
    • D
      Implement a compressed block cache. · b4ad5e89
      Dhruba Borthakur 提交于
      Summary:
      Rocksdb can now support a uncompressed block cache, or a compressed
      block cache or both. Lookups first look for a block in the
      uncompressed cache, if it is not found only then it is looked up
      in the compressed cache. If it is found in the compressed cache,
      then it is uncompressed and inserted into the uncompressed cache.
      
      It is possible that the same block resides in the compressed cache
      as well as the uncompressed cache at the same time. Both caches
      have their own individual LRU policy.
      
      Test Plan: Unit test case attached.
      
      Reviewers: kailiu, sdong, haobo, leveldb
      
      Reviewed By: haobo
      
      CC: xjin, haobo
      
      Differential Revision: https://reviews.facebook.net/D12675
      b4ad5e89
  8. 01 11月, 2013 1 次提交
    • H
      [RocksDB] Add OnCompactionStart to CompactionFilter class · 8cbe5bb5
      Haobo Xu 提交于
      Summary: This is to give application compaction filter a chance to access context information of a specific compaction run. For example, depending on whether a compaction goes through all data files, the application could do things differently.
      
      Test Plan: make check
      
      Reviewers: dhruba, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13683
      8cbe5bb5
  9. 31 10月, 2013 1 次提交
    • S
      Follow-up Cleaning-up After D13521 · f03b2df0
      Siying Dong 提交于
      Summary:
      This patch is to address @haobo's comments on D13521:
      1. rename Table to be TableReader and make its factory function to be GetTableReader
      2. move the compression type selection logic out of TableBuilder but to compaction logic
      3. more accurate comments
      4. Move stat name constants into BlockBasedTable implementation.
      5. remove some uncleaned codes in simple_table_db_test
      
      Test Plan: pass test suites.
      
      Reviewers: haobo, dhruba, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13785
      f03b2df0
  10. 29 10月, 2013 3 次提交
    • S
      Make "Table" pluggable · d4eec30e
      Siying Dong 提交于
      Summary: This patch makes Table and TableBuilder a abstract class and make all the implementation of the current table into BlockedBasedTable and BlockedBasedTable Builder.
      
      Test Plan: Make db_test.cc to work with block based table. Add a new test simple_table_db_test.cc where a different simple table format is implemented.
      
      Reviewers: dhruba, haobo, kailiu, emayanke, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13521
      d4eec30e
    • K
      Support user-defined table stats collector · 994575c1
      Kai Liu 提交于
      Summary:
      1. Added a new option that support user-defined table stats collection.
      2. Added a deleted key stats collector in `utilities`
      
      Test Plan:
      Added a unit test for newly added code.
      Also ran make check to make sure other tests are not broken.
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13491
      994575c1
    • I
      If a Put fails, fail all other puts · 100fa8e0
      Igor Canadi 提交于
      Summary:
      When a Put fails, it can leave database in a messy state. We don't want to pretend that everything is OK when it may not be. We fail every write following the failed one.
      
      I added checks for corruption to DBImpl::Write(). Is there anywhere else I need to add them?
      
      Test Plan: Corruption unit test.
      
      Reviewers: dhruba, haobo, kailiu
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13671
      100fa8e0
  11. 25 10月, 2013 2 次提交
    • M
      Unify DeleteFile and DeleteWalFiles · 56305221
      Mayank Agarwal 提交于
      Summary:
      This is to simplify rocksdb public APIs and improve the code quality.
      Created an additional parameter to ParseFileName for log sub type and improved the code for deleting a wal file.
      Wrote exhaustive unit-tests in delete_file_test
      Unification of other redundant APIs can be taken up in a separate diff
      
      Test Plan: Expanded delete_file test
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13647
      56305221
    • K
      Fix the log number bug when updating MANIFEST file · c17607a2
      Kai Liu 提交于
      Summary:
      Crash may occur during the flushes of more than two mem tables.
      
      As the info log suggested, even when both were successfully flushed,
      the recovery process still pick up one of the memtable's log for recovery.
      
      This diff fix the problem by setting the correct "log number" in MANIFEST.
      
      Test Plan: make test; deployed to leaf4 and make sure it doesn't result in crashes of this type.
      
      Reviewers: haobo, dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13659
      c17607a2
  12. 24 10月, 2013 1 次提交
  13. 23 10月, 2013 1 次提交
    • M
      Dbid feature · 9b50106f
      Mayank Agarwal 提交于
      Summary:
      Create a new type of file on startup if it doesn't already exist called DBID.
      This will store a unique number generated from boost library's uuid header file.
      The use-case is to identify the case of a db losing all its data and coming back up either empty or from an image(backup/live replica's recovery)
      the key point to note is that DBID is not stored in a backup or db snapshot
      It's preferable to use Boost for uuid because:
      1) A non-standard way of generating uuid is not good
      2) /proc/sys/kernel/random/uuid generates a uuid but only on linux environments and the solution would not be clean
      3) c++ doesn't have any direct way to get a uuid
      4) Boost is a very good library that was already having linkage in rocksdb from third-party
      Note: I had to update the TOOLCHAIN_REV in build files to get latest verison of boost from third-party as the older version had a bug.
      I had to put Wno-uninitialized in Makefile because boost-1.51 has an unitialized variable and rocksdb would not comiple otherwise. Latet open-source for boost is 1.54 but is not there in third-party. I have notified the concerned people in fbcode about it.
      @kailiu : While releasing to third-party, an additional dependency will need to be created for boost in TARGETS file. I can help identify.
      
      Test Plan:
      Expand db_test to test 2 cases
      1) Restarting db with Id file present - verify that no change to Id
      2)Restarting db with Id file deleted - verify that a different Id is there after reopen
      Also run make all check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13587
      9b50106f
  14. 18 10月, 2013 1 次提交
    • S
      Universal Compaction to Have a Size Percentage Threshold To Decide Whether to Compress · 9edda370
      Siying Dong 提交于
      Summary:
      This patch adds a option for universal compaction to allow us to only compress output files if the files compacted previously did not yet reach a specified ratio, to save CPU costs in some cases.
      
      Compression is always skipped for flushing. This is because the size information is not easy to evaluate for flushing case. We can improve it later.
      
      Test Plan:
      add test
      DBTest.UniversalCompactionCompressRatio1 and DBTest.UniversalCompactionCompressRatio12
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13467
      9edda370
  15. 17 10月, 2013 2 次提交
    • D
      Add appropriate LICENSE and Copyright message. · 9cd22109
      Dhruba Borthakur 提交于
      Summary:
      Add appropriate LICENSE and Copyright message.
      
      Test Plan:
      make check
      
      Reviewers:
      
      CC:
      
      Task ID: #
      
      Blame Rev:
      9cd22109
    • S
      Enable background flush thread by default and fix issues related to it · 073cbfc8
      Siying Dong 提交于
      Summary:
      Enable background flush thread in this patch and fix unit tests with:
      (1) After background flush, schedule a background compaction if condition satisfied;
      (2) Fix a bug that if universal compaction is enabled and number of levels are set to be 0, compaction will not be automatically triggered
      (3) Fix unit tests to wait for compaction to finish instead of flush, before checking the compaction results.
      
      Test Plan: pass all unit tests
      
      Reviewers: haobo, xjin, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13461
      073cbfc8
  16. 15 10月, 2013 3 次提交
    • M
      Features in Transaction log iterator · fe371396
      Mayank Agarwal 提交于
      Summary:
      * Logstore requests a valid change of reutrning an empty iterator and not an error in case of no log files.
      * Changed the code to return the writebatch containing the sequence number requested from GetupdatesSince even if it lies in the middle. Earlier we used to return the next writebatch,. This also allows me oto guarantee that no files played upon by the iterator are redundant. I mean the starting log file has at least a sequence number >= the sequence number requested form GetupdatesSince.
      * Cleaned up redundant logic in Iterator::Next and made a new function SeekToStartSequence for greater readability and maintainibilty.
      * Modified a test in db_test accordingly
      Please check the logic carefully and suggest improvements. I have a separate patch out for more improvements like restricting reader to read till written sequences.
      
      Test Plan:
      * transaction log iterator tests in db_test,
      * db_repl_stress.
      * rocks_log_iterator_test in fbcode/wormhole/rocksdb/test - 2 tests thriving on hacks till now can get simplified
      * testing on the shadow setup for sigma with replication
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13437
      fe371396
    • K
      Add statistics to sst file · 86ef6c3f
      Kai Liu 提交于
      Summary:
      So far we only have key/value pairs as well as bloom filter stored in the
      sst file.  It will be great if we are able to store more metadata about
      this table itself, for example, the entry size, bloom filter name, etc.
      
      This diff is the first step of this effort. It allows table to keep the
      basic statistics mentioned in http://fburl.com/14995441, as well as
      allowing writing user-collected stats to stats block.
      
      After this diff, we will figure out the interface of how to allow user to collect their interested statistics.
      
      Test Plan:
      1. Added several unit tests.
      2. Ran `make check` to ensure it doesn't break other tests.
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13419
      86ef6c3f
    • S
      Change Function names from Compaction->Flush When they really mean Flush · 88f2f890
      Siying Dong 提交于
      Summary: When I debug the unit test failures when enabling background flush thread, I feel the function names can be made clearer for people to understand. Also, if the names are fixed, in many places, some tests' bugs are obvious (and some of those tests are failing). This patch is to clean it up for future maintenance.
      
      Test Plan: Run test suites.
      
      Reviewers: haobo, dhruba, xjin
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13431
      88f2f890
  17. 11 10月, 2013 1 次提交
  18. 09 10月, 2013 1 次提交
    • N
      Add option for storing transaction logs in a separate dir · cbf4a064
      Naman Gupta 提交于
      Summary: In some cases, you might not want to store the data log (write ahead log) files in the same dir as the sst files. An example use case is leaf, which stores sst files in tmpfs. And would like to save the log files in a separate dir (disk) to save memory.
      
      Test Plan: make all. Ran db_test test. A few test failing. P2785018. If you guys don't see an obvious problem with the code, maybe somebody from the rocksdb team could help me debug the issue here. Running this on leaf worked well. I could see logs stored on disk, and deleted appropriately after compactions. Obviously this is only one set of options. The unit tests cover different options. Seems like I'm missing some edge cases.
      
      Reviewers: dhruba, haobo, leveldb
      
      CC: xinyaohu, sumeet
      
      Differential Revision: https://reviews.facebook.net/D13239
      cbf4a064
  19. 06 10月, 2013 2 次提交
  20. 05 10月, 2013 3 次提交
  21. 04 10月, 2013 2 次提交
  22. 03 10月, 2013 1 次提交
    • X
      Fix SIGSEGV issue in universal compaction · 658a3ce2
      Xing Jin 提交于
      Summary:
      We saw SIGSEGV when set options.num_levels=1 in universal compaction
      style. Dug into this issue for a while, and finally found the root cause (thank Haobo for discussion).
      
      Test Plan: Add new unit test. It throws SIGSEGV without this change. Also run "make all check".
      
      Reviewers: haobo, dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13251
      658a3ce2
  23. 29 9月, 2013 1 次提交
  24. 27 9月, 2013 1 次提交
  25. 13 9月, 2013 1 次提交
    • H
      [RocksDB] Remove Log file immediately after memtable flush · 0e422308
      Haobo Xu 提交于
      Summary: As title. The DB log file life cycle is tied up with the memtable it backs. Once the memtable is flushed to sst and committed, we should be able to delete the log file, without holding the mutex. This is part of the bigger change to avoid FindObsoleteFiles at runtime. It deals with log files. sst files will be dealt with later.
      
      Test Plan: make check; db_bench
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11709
      0e422308
  26. 07 9月, 2013 1 次提交
    • D
      Flush was hanging because the configured options specified that more than 1... · 32c965d4
      Dhruba Borthakur 提交于
      Flush was hanging because the configured options specified that more than 1 memtable need to be merged.
      
      Summary:
      There is an config option called Options.min_write_buffer_number_to_merge
      that specifies the minimum number of write buffers to merge in memory
      before flushing to a file in L0. But in the the case when the db is
      being closed, we should not be using this config, instead we should
      flush whatever write buffers were available at that time.
      
      Test Plan: Unit test attached.
      
      Reviewers: haobo, emayanke
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D12717
      32c965d4
  27. 05 9月, 2013 1 次提交