1. 17 Jul, 2015 · 1 commit
    • D
      Ensure Windows build w/o port/port.h in public headers · d1a45718
      Dmitri Smirnov committed
       - Remove make file defines from public headers and use _WIN32 because it is compiler defined
       - use __GNUC__ and __clang__ to guard non-portable attributes
       - add #include "port/port.h" to some new .cc files.
       - minor changes in CMakeLists to reflect recent changes
      d1a45718
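A minimal sketch of the guarding pattern this commit describes, assuming hypothetical names (`EXAMPLE_PRINTF_FORMAT`, `ExampleFormat`, `ExamplePlatform`) that are not taken from the RocksDB headers. It relies only on macros the compiler predefines (`_WIN32`, `__GNUC__`, `__clang__`), so a public header needs no makefile-provided defines:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical macro: expands to a GCC/Clang-only attribute, or to nothing
// on compilers that do not understand it (for example MSVC).
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_PRINTF_FORMAT(fmt_idx, arg_idx) \
  __attribute__((__format__(__printf__, fmt_idx, arg_idx)))
#else
#define EXAMPLE_PRINTF_FORMAT(fmt_idx, arg_idx)
#endif

// The attribute lets GCC/Clang type-check the variadic arguments.
std::string ExampleFormat(const char* format, ...) EXAMPLE_PRINTF_FORMAT(1, 2);

std::string ExampleFormat(const char* format, ...) {
  assert(format != nullptr);
  char buf[256];
  va_list ap;
  va_start(ap, format);
  std::vsnprintf(buf, sizeof(buf), format, ap);
  va_end(ap);
  return std::string(buf);
}

// _WIN32 is predefined by compilers targeting Windows, so no build-system
// define is required to select the platform branch.
const char* ExamplePlatform() {
#ifdef _WIN32
  return "windows";
#else
  return "non-windows";
#endif
}
```

On a compiler that defines neither `__GNUC__` nor `__clang__`, the macro expands to nothing and the declaration still compiles.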
  2. 16 Jul, 2015 · 2 commits
    • A
      move convenience.h out of utilities · 81d07262
      agiardullo committed
      Summary: Moved convenience.h out of utilities to remove a dependency on utilities in db.
      
      Test Plan: unit tests.  Also compiled a link to the old location to verify the _Pragma works.
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42201
      81d07262
    • P
      Fixing delete files in Trivial move of universal compaction · beb19ad0
      Poornima Chozhiyath Raman committed
      Summary:
      Trivial move in universal compaction was failing when trying to move files from levels other than 0.
      This was because the DeleteFile while trivially moving, was only deleting files of level 0 which caused duplication of same file in different levels.
      This is fixed by passing the right level as argument in the call of DeleteFile while doing trivial move.
      
      Test Plan: ./db_test ran successfully with the new test cases.
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42135
      beb19ad0
  3. 15 Jul, 2015 · 3 commits
    • L
      Replace std::priority_queue in MergingIterator with custom heap, take 2 · e1c99e10
      lovro committed
      Summary: Repeat of b6655a67 (reverted in b7a2369f) with a proper fix for the issue that 57d216ea was trying to fix.
      
      Test Plan:
      make check
      
      for i in $(seq 100); do ./db_stress --test_batches_snapshots=1 --threads=32 --write_buffer_size=4194304 --destroy_db_initially=0 --reopen=20 --readpercent=45 --prefixpercent=5 --writepercent=35 --delpercent=5 --iterpercent=10 --db=/tmp/rocksdb_crashtest_KdCI5F --max_key=100000000 --mmap_read=0 --block_size=16384 --cache_size=1048576 --open_files=500000 --verify_checksum=1 --sync=0 --progress_reports=0 --disable_wal=0 --disable_data_sync=1 --target_file_size_base=2097152 --target_file_size_multiplier=2 --max_write_buffer_number=3 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --filter_deletes=0 --memtablerep=prefix_hash --prefix_size=7 --ops_per_thread=200 || break; done
      
      Reviewers: anthony, sdong, igor, yhchiang
      
      Reviewed By: igor, yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D41391
      e1c99e10
    • Y
      Move TransactionLogIterator related tests from db_test.cc to db_log_iter_test.cc · ce829c77
      Yueh-Hsuan Chiang committed
      Summary: Move TransactionLogIterator related tests from db_test.cc to db_log_iter_test.cc
      
      Test Plan:
      db_test
      db_log_iter_test
      
      Reviewers: sdong, IslamAbdelRahman, igor, anthony
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42045
      ce829c77
    • Y
      Block SyncPoint in util/db_test_util.h in Windows release mode. · 0936362a
      Yueh-Hsuan Chiang committed
      Summary: Block SyncPoint in util/db_test_util.h in Windows release mode.
      
      Test Plan: db_test
      
      Reviewers: igor, anthony, sdong, IslamAbdelRahman
      
      Reviewed By: sdong, IslamAbdelRahman
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42213
      0936362a
  4. 14 Jul, 2015 · 6 commits
    • I
      Deprecate purge_redundant_kvs_while_flush · a9c51095
      Igor Canadi committed
      Summary: This option is guarding the feature implemented 2 and a half years ago: D8991. The feature was enabled by default back then and has been running without issues. There is no reason why any client would turn this feature off. I found no reference in fbcode.
      
      Test Plan: none
      
      Reviewers: sdong, yhchiang, anthony, dhruba
      
      Reviewed By: dhruba
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42063
      a9c51095
    • I
      Deprecate WriteOptions::timeout_hint_us · 5aea98dd
      Igor Canadi committed
      Summary:
      In one of our recent meetings, we discussed deprecating features that are not being actively used. One of those features, at least within Facebook, is timeout_hint. The feature is really nicely implemented, but if nobody needs it, we should remove it from our code-base (until we get a valid use-case). Some arguments:
      * Less code == better icache hit rate, smaller builds, simpler code
      * The motivation for adding timeout_hint_us was to work around RocksDB's stall issue. However, we're currently addressing the stall issue itself (see @sdong's recent work on stall write_rate), so we should never see sharp lock-ups in the future.
      * Nobody is using the feature within Facebook's code-base. Googling for `timeout_hint_us` also doesn't yield any users.
      
      Test Plan: make check
      
      Reviewers: anthony, kradhakrishnan, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: sdong, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41937
      5aea98dd
    • Y
      Move global static functions in db_test_util to DBTestBase · 49f42ad0
      Yueh-Hsuan Chiang committed
      Summary:
      Move global static functions in db_test_util to DBTestBase.
      This is to prevent unused function warning when decoupling
      db_test.cc into multiple files.
      
      Test Plan: db_test
      
      Reviewers: igor, sdong, anthony, IslamAbdelRahman, kradhakrishnan
      
      Reviewed By: kradhakrishnan
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D42009
      49f42ad0
    • Y
      Move reusable part of db_test.cc to util/db_test_util.h · 625467a0
      Yueh-Hsuan Chiang committed
      Summary:
      Move reusable part of db_test.cc to util/db_test_util.h.
      This makes it easier to partition db_test.cc into
      multiple smaller test files.
      
      Also, fixed many old lint errors in db_test.
      
      Test Plan: db_test
      
      Reviewers: igor, anthony, IslamAbdelRahman, sdong, kradhakrishnan
      
      Reviewed By: sdong, kradhakrishnan
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41973
      625467a0
    • A
      Add tombstone information in CompactionJobStats · 8bca83e5
      Ari Ekmekji committed
      Summary:
      Added new statistics in CompactionJobStats to keep track of
      deletion entries and the expiration of those entries. Updated these
      fields in compaction_job.cc as compaction took place and wrote a new
      test in compaction_job_stats_test.cc to verify accuracy.
      
      Test Plan:
      Wrote new test DeletionStatsTest in
      compaction_job_stats_test.cc to verify
      
      Reviewers: sdong, igor, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D41355
      8bca83e5
    • S
      "make format" against last 10 commits · f9728640
      sdong committed
      Summary: This helps the Windows port format their changes, as discussed. It might have formatted some other code too, because the last 10 commits include more.
      
      Test Plan: Build it.
      
      Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D41961
      f9728640
  5. 12 Jul, 2015 · 1 commit
  6. 11 Jul, 2015 · 3 commits
  7. 10 Jul, 2015 · 1 commit
  8. 08 Jul, 2015 · 2 commits
    • D
      Commit both PR and internal code review changes · ef4b87f1
      Dmitri Smirnov committed
      ef4b87f1
    • Y
      Revert "Replace std::priority_queue in MergingIterator with custom heap" · b7a2369f
      Yueh-Hsuan Chiang committed
      Summary:
      This patch reverts "Replace std::priority_queue in MergingIterator
      with custom heap" (commit b6655a67)
      as it causes db_stress failure.
      
      Test Plan: ./db_stress --test_batches_snapshots=1 --threads=32 --write_buffer_size=4194304 --destroy_db_initially=0 --reopen=20 --readpercent=45 --prefixpercent=5 --writepercent=35 --delpercent=5 --iterpercent=10 --db=/tmp/rocksdb_crashtest_KdCI5F --max_key=100000000 --mmap_read=0 --block_size=16384 --cache_size=1048576 --open_files=500000 --verify_checksum=1 --sync=0 --progress_reports=0 --disable_wal=0 --disable_data_sync=1 --target_file_size_base=2097152 --target_file_size_multiplier=2 --max_write_buffer_number=3 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --filter_deletes=0 --memtablerep=prefix_hash --prefix_size=7 --ops_per_thread=200 --kill_random_test=97
      
      Reviewers: igor, anthony, lovro, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41343
      b7a2369f
  9. 06 Jul, 2015 · 1 commit
    • L
      Replace std::priority_queue in MergingIterator with custom heap · b6655a67
      lovro committed
      Summary:
      While profiling compaction in our service I noticed a lot of CPU (~15% of compaction) being spent in MergingIterator and key comparison.  Looking at the code I found MergingIterator was (understandably) using std::priority_queue for the multiway merge.
      
      Keys in our dataset include sequence numbers that increase with time.  Adjacent keys in an L0 file are very likely to be adjacent in the full database.  Consequently, compaction will often pick a chunk of rows from the same L0 file before switching to another one.  It would be great to avoid the O(log K) operation per row while compacting.
      
      This diff replaces std::priority_queue with a custom binary heap implementation.  It has a "replace top" operation that is cheap when the new top is the same as the old one (i.e. the priority of the top entry is decreased but it still stays on top).
      
      Test Plan:
      make check
      
      To test the effect on performance, I generated databases with data patterns that mimic what I describe in the summary (rows have a mostly increasing sequence number).  I see a 10-15% CPU decrease for compaction (and a matching throughput improvement on tmpfs).  The exact improvement depends on the number of L0 files and the amount of locality.  Performance on randomly distributed keys seems on par with the old code.
      
      Reviewers: kailiu, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: yoshinorim, dhruba, tnovak
      
      Differential Revision: https://reviews.facebook.net/D29133
      b6655a67
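The "replace top" idea from this commit can be sketched as follows. This is an illustrative min-heap, not the RocksDB implementation; the class name `ExampleMinHeap` and its method names are assumptions. When the replacement value still belongs on top, `replace_top` costs at most two comparisons and no swaps, whereas a pop followed by a push always pays for two full sift passes:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical binary min-heap with a cheap replace_top() operation.
template <typename T, typename Less = std::less<T>>
class ExampleMinHeap {
 public:
  bool empty() const { return data_.empty(); }

  const T& top() const {
    assert(!data_.empty());
    return data_.front();
  }

  void push(T value) {
    data_.push_back(std::move(value));
    SiftUp(data_.size() - 1);
  }

  void pop() {
    assert(!data_.empty());
    if (data_.size() > 1) data_.front() = std::move(data_.back());
    data_.pop_back();
    if (!data_.empty()) SiftDown(0);
  }

  // Overwrites the top element in place, then restores heap order.
  // If the new value is still the smallest, the loop exits immediately.
  void replace_top(T value) {
    assert(!data_.empty());
    data_.front() = std::move(value);
    SiftDown(0);
  }

 private:
  void SiftUp(std::size_t i) {
    while (i > 0) {
      std::size_t parent = (i - 1) / 2;
      if (!less_(data_[i], data_[parent])) break;
      std::swap(data_[i], data_[parent]);
      i = parent;
    }
  }

  void SiftDown(std::size_t i) {
    const std::size_t n = data_.size();
    for (;;) {
      std::size_t smallest = i;
      const std::size_t left = 2 * i + 1;
      const std::size_t right = 2 * i + 2;
      if (left < n && less_(data_[left], data_[smallest])) smallest = left;
      if (right < n && less_(data_[right], data_[smallest])) smallest = right;
      if (smallest == i) return;
      std::swap(data_[i], data_[smallest]);
      i = smallest;
    }
  }

  std::vector<T> data_;
  Less less_;
};
```

In a multiway merge, the caller would advance the child iterator that is currently on top and call `replace_top` with its new key, which is the cheap path when consecutive keys come from the same input file.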
  10. 03 Jul, 2015 · 4 commits
    • D
      Arena needs mman header for mmap · e25ee32e
      Dmitri Smirnov committed
      e25ee32e
    • D
      Merge the latest changes from github/master · d2f0912b
      Dmitri Smirnov committed
      d2f0912b
    • A
      Introduce InfoLogLevel::HEADER_LEVEL · 35cd75c3
      Ari Ekmekji committed
      Summary:
       Introduced a new category in the enum InfoLogLevel in env.h.
       Modified Log() in env.cc to use Header()
       when the InfoLogLevel == HEADER_LEVEL.
       Updated tests in auto_roll_logger_test to ensure
       the header is handled properly in these cases.
      
      Test Plan: Augment existing tests in auto_roll_logger_test
      
      Reviewers: igor, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41067
      35cd75c3
    • A
      Multithreaded backup and restore in BackupEngineImpl · a69bc91e
      Aaron Feldman committed
      Summary:
      Add a new field: BackupableDBOptions.max_background_copies.
      CreateNewBackup() and RestoreDBFromBackup() will use this number of threads to perform copies.
      If there is a backup rate limit, then max_background_copies must be 1.
      Update backupable_db_test.cc to test multi-threaded backup and restore.
      Update backupable_db_test.cc to test backups when the backup environment is not the same as the database environment.
      
      Test Plan:
      Run ./backupable_db_test
      Run valgrind ./backupable_db_test
      Run with TSAN and ASAN
      
      Reviewers: yhchiang, rven, anthony, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: yhchiang, anthony, sdong, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D40725
      a69bc91e
  11. 02 Jul, 2015 · 3 commits
    • D
    • D
      Address GCC compilation issues · ca2fe2c1
      Dmitri Smirnov committed
       invalid suffix on literal
       no return statement in function returning non-void CuckooStep::operator=
       extra qualification ‘rocksdb::spatial::Variant::
       dereferencing type-punned pointer will break strict-aliasing rules
      ca2fe2c1
    • D
      Windows Port from Microsoft · 18285c1e
      Dmitri Smirnov committed
       Summary: Make RocksDB build and run on Windows to be functionally
       complete and performant. All existing test cases run with no
       regressions. Performance numbers are in the pull-request.
      
       Test plan: make all of the existing unit tests pass, obtain perf numbers.
      
       Co-authored-by: Praveen Rao praveensinghrao@outlook.com
       Co-authored-by: Sherlock Huang baihan.huang@gmail.com
       Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
       Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
      18285c1e
  12. 24 Jun, 2015 · 1 commit
    • G
      Implement a table-level row cache · 782a1590
      Giuseppe Ottaviano committed
      Summary:
      Implementation of a table-level row cache.
      It only caches point queries done through the `DB::Get` interface, queries done through the `Iterator` interface will completely skip the cache.
      
      Supports snapshots and merge operations.
      
      Test Plan: Ran `make valgrind_check commit-prereq`
      
      Reviewers: igor, philipp, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D39849
      782a1590
  13. 23 Jun, 2015 · 2 commits
    • K
      Introduce WAL recovery consistency levels · de85e4ca
      krad committed
      Summary:
      The "one size fits all" approach with WAL recovery will only introduce inconvenience for our varied clients as we go forward. The current recovery is a bit heuristic. We introduce the following levels of consistency while replaying the WAL.
      
      1. RecoverAfterRestart (kTolerateCorruptedTailRecords)
      
      This mimics the current recovery mode.

      2. RecoverAfterCleanShutdown (kAbsoluteConsistency)

      This is ideal for unit tests and for cases where the store is shut down cleanly. We tolerate no corruption or incomplete writes.

      3. RecoverPointInTime (kPointInTimeRecovery)

      This is ideal when using devices with a controller cache or file systems that can lose data on restart. We recover up to the point where there is no corruption or incomplete write.

      4. RecoverAfterDisaster (kSkipAnyCorruptRecord)

      This is the ideal mode for recovering data. We tolerate corruption and incomplete writes, and we skip over the sections we cannot make sense of, salvaging as many records as possible.
      
      Test Plan:
      (1) Run added unit test to cover all levels.
      (2) Run make check.
      
      Reviewers: leveldb, sdong, igor
      
      Subscribers: yoshinorim, dhruba
      
      Differential Revision: https://reviews.facebook.net/D38487
      de85e4ca
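The four levels above can be illustrated with a toy replay loop. This is a sketch under stated assumptions, not the RocksDB log reader: the types `ExampleLogRecord` and `ExampleReplayResult` and the function `ExampleReplay` are hypothetical, a record is simply flagged as corrupt or not, and "tail" is modelled as "no valid record follows". Only the enumerator names are taken from the commit message:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Enumerator names follow the commit message; the enum itself is a sketch.
enum class ExampleWalRecoveryMode {
  kTolerateCorruptedTailRecords,
  kAbsoluteConsistency,
  kPointInTimeRecovery,
  kSkipAnyCorruptRecord,
};

struct ExampleLogRecord {
  int payload;
  bool corrupt;  // stands in for a checksum or length failure
};

struct ExampleReplayResult {
  bool ok;                     // false means recovery reports an error
  std::vector<int> recovered;  // payloads applied before stopping
};

ExampleReplayResult ExampleReplay(const std::vector<ExampleLogRecord>& log,
                                  ExampleWalRecoveryMode mode) {
  ExampleReplayResult result{true, {}};
  for (std::size_t i = 0; i < log.size(); ++i) {
    if (!log[i].corrupt) {
      result.recovered.push_back(log[i].payload);
      continue;
    }
    switch (mode) {
      case ExampleWalRecoveryMode::kAbsoluteConsistency:
        result.ok = false;  // any corruption is fatal
        return result;
      case ExampleWalRecoveryMode::kPointInTimeRecovery:
        return result;  // keep everything before the first bad record
      case ExampleWalRecoveryMode::kSkipAnyCorruptRecord:
        break;  // drop this record and keep scanning
      case ExampleWalRecoveryMode::kTolerateCorruptedTailRecords: {
        // Assumption: corruption is acceptable only if nothing valid follows.
        for (std::size_t j = i + 1; j < log.size(); ++j) {
          if (!log[j].corrupt) result.ok = false;
        }
        return result;
      }
    }
  }
  return result;
}
```

With a log whose middle record is corrupt, the sketch recovers only the first record in point-in-time mode, the first and last records in skip mode, and reports an error in the other two modes.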
    • K
      Add read_nanos to IOStatsContext. · 7015fd81
      krad committed
      Summary: MyRocks needs a mechanism to track read outliers. We need to expose this stat.
      
      Test Plan: None
      
      Reviewers: sdong
      
      CC: leveldb
      
      Task ID: #7152512
      
      Blame Rev:
      7015fd81
  14. 20 Jun, 2015 · 1 commit
  15. 19 Jun, 2015 · 4 commits
    • Y
      Make autovector_test runnable in ROCKSDB_LITE · df719d49
      Yueh-Hsuan Chiang committed
      Summary: Make autovector_test runnable in ROCKSDB_LITE
      
      Test Plan: autovector_test
      
      Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D40245
      df719d49
    • I
      Fail DB::Open() when the requested compression is not available · 760e9a94
      Igor Canadi committed
      Summary:
      Currently RocksDB silently ignores this issue and doesn't compress the data. Based on discussion, we agree that this is pretty bad because it can cause confusion for our users.
      
      This patch fails DB::Open() if we don't support the compression that is specified in the options.
      
      Test Plan: make check with LZ4 not present. If Snappy is not present all tests will just fail because Snappy is our default library. We should make Snappy the requirement, since without it our default DB::Open() fails.
      
      Reviewers: sdong, MarkCallaghan, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D39687
      760e9a94
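A minimal sketch of the fail-fast behaviour described in this commit, assuming hypothetical names throughout (`ExampleCompression`, `ExampleStatus`, `ExampleOpen`); it does not reproduce the real `DB::Open()` signature. The point is that an unsupported compression type produces an error at open time instead of silently writing uncompressed data:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-ins for the option value and the status object.
enum class ExampleCompression { kNone, kSnappy, kLZ4 };

struct ExampleStatus {
  bool ok;
  std::string message;
};

// `linked` models the set of compression libraries compiled into the binary.
bool ExampleCompressionSupported(ExampleCompression type,
                                 const std::set<ExampleCompression>& linked) {
  return type == ExampleCompression::kNone || linked.count(type) > 0;
}

// Validates the requested compression before doing any other work.
ExampleStatus ExampleOpen(ExampleCompression requested,
                          const std::set<ExampleCompression>& linked) {
  if (!ExampleCompressionSupported(requested, linked)) {
    return {false, "requested compression is not linked into this binary"};
  }
  return {true, ""};
}
```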
    • A
      Add Cache.GetPinnedUsage() · 69bb210d
      Aaron Feldman committed
      Summary:
        Add the function Cache.GetPinnedUsage() to return the memory size of entries
        that are in use by the system (that is, all the entries not in the LRU list).
      
      Test Plan:
        Run ./cache_test and examine PinnedUsageTest.
      
      Reviewers: tnovak, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D40305
      69bb210d
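The definition in this summary (pinned usage is the size of all entries that are not in the LRU list) can be sketched with two counters. The class `ExampleCache` and its `Insert`/`Pin`/`Unpin` methods are hypothetical and greatly simplified; only `GetPinnedUsage` is named in the commit:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy cache that tracks total charge and the charge of unreferenced entries.
class ExampleCache {
 public:
  // Adds an unpinned entry; conceptually it starts on the LRU list.
  void Insert(const std::string& key, std::size_t charge) {
    assert(entries_.count(key) == 0);
    entries_[key] = Entry{charge, 0};
    usage_ += charge;
    lru_usage_ += charge;
  }

  // The first reference takes the entry off the LRU list.
  void Pin(const std::string& key) {
    Entry& e = entries_.at(key);
    if (e.refs++ == 0) lru_usage_ -= e.charge;
  }

  // Dropping the last reference puts the entry back on the LRU list.
  void Unpin(const std::string& key) {
    Entry& e = entries_.at(key);
    assert(e.refs > 0);
    if (--e.refs == 0) lru_usage_ += e.charge;
  }

  std::size_t GetUsage() const { return usage_; }

  // Everything that is not on the LRU list is in use by some caller.
  std::size_t GetPinnedUsage() const { return usage_ - lru_usage_; }

 private:
  struct Entry {
    std::size_t charge;
    int refs;
  };
  std::unordered_map<std::string, Entry> entries_;
  std::size_t usage_ = 0;
  std::size_t lru_usage_ = 0;
};
```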
    • I
      Don't dump DBOptions for each column family · 4b8bb62f
      Igor Canadi committed
      Summary: Currently we dump DBOptions for each column family options we dump. This leads to duplicate lines in our LOG file. This diff fixes that.
      
      Test Plan: Check out the LOG
      
      Reviewers: sdong, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: IslamAbdelRahman, yoshinorim, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D39729
      4b8bb62f
  16. 18 Jun, 2015 · 2 commits
    • I
      Use CompactRangeOptions for CompactRange · 12e030a9
      Islam AbdelRahman committed
      Summary:
      This diff updates DB::CompactRange to use CompactRangeOptions instead of multiple parameters.
      The old CompactRange is still available but deprecated.
      
      Test Plan:
      make all check
      make rocksdbjava
      USE_CLANG=1 make all
      OPT=-DROCKSDB_LITE make release
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D40209
      12e030a9
    • Y
      Only initialize the ThreadStatusData when necessary. · 1369f015
      Yueh-Hsuan Chiang committed
      Summary:
      Before this patch, any function call to ThreadStatusUtil might automatically initialize and register the thread status data.  However, if it is the user-thread making this call, the allocated thread-status-data will never be released as such threads are not managed by rocksdb.
      
      In this patch, I remove the automatic-initialization part.  Thread-status data is only initialized and uninitialized in Env during the thread creation and destruction.
      
      Test Plan:
      db_test
      thread_list_test
      listener_test
      
      Reviewers: igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D40017
      1369f015
  17. 17 Jun, 2015 · 1 commit
  18. 13 Jun, 2015 · 1 commit
  19. 12 Jun, 2015 · 1 commit
    • S
      Slow down writes by bytes written · 7842920b
      sdong committed
      Summary:
      With this patch, we slow down writes into the database to the rate given by options.delayed_write_rate (a new option).

      The thread synchronization approach I take is to keep synchronizing the write controller with the DB mutex, so GetDelay() is called while holding the DB mutex. I try to minimize how often GetDelay() reads the time. I verified it through db_bench and it seems to work.
      
      hard_rate_limit is deprecated.
      
      options.delayed_write_rate is still not dynamically changeable. Need to work on it as a follow-up.
      
      Test Plan: Add new unit tests in db_test
      
      Reviewers: yhchiang, rven, kradhakrishnan, anthony, MarkCallaghan, igor
      
      Reviewed By: igor
      
      Subscribers: ikabiljo, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D36351
      7842920b
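As a rough illustration of throttling by bytes written, the delay for a single write can be derived from its size and the target rate. This is a simplified sketch, not the logic of the write controller's GetDelay(): the function name `ExampleWriteDelayMicros` is hypothetical, and it ignores the time-sampling and mutex concerns mentioned in the summary:

```cpp
#include <cassert>
#include <cstdint>

// Returns how long (in microseconds) a write of `num_bytes` should be stalled
// so that sustained throughput stays near `delayed_write_rate` bytes/second.
// Assumes num_bytes is small enough that the multiplication cannot overflow.
std::uint64_t ExampleWriteDelayMicros(std::uint64_t num_bytes,
                                      std::uint64_t delayed_write_rate) {
  assert(delayed_write_rate > 0);
  return num_bytes * 1000000 / delayed_write_rate;
}
```

For example, a 1 MiB write at a rate of 2 MiB/s would be delayed by half a second under this model.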