1. 17 Dec, 2016 1 commit
  2. 14 Dec, 2016 1 commit
  3. 22 Nov, 2016 2 commits
    • M
      Add WriteOptions.no_slowdown · 182b940e
      Maysam Yabandeh committed
      Summary:
      If the WriteOptions.no_slowdown flag is set AND we need to wait or sleep for
      the write request, then fail immediately with Status::Incomplete().
      Closes https://github.com/facebook/rocksdb/pull/1527
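
      The fail-fast behavior described above can be sketched roughly as follows. This is a toy model, not RocksDB code: `ToyStatus`, `ToyWriteOptions`, `ToyWrite` and the `must_stall` flag are hypothetical names introduced only for illustration.

      ```cpp
      #include <cassert>

      // Hypothetical stand-ins; not RocksDB's real Status or WriteOptions types.
      enum class ToyStatus { kOk, kIncomplete };

      struct ToyWriteOptions {
        bool no_slowdown = false;
      };

      // `must_stall` models "the write request would need to wait or sleep".
      // `*stalled` records whether the toy writer actually blocked.
      ToyStatus ToyWrite(const ToyWriteOptions& opts, bool must_stall,
                         bool* stalled) {
        assert(stalled != nullptr);
        *stalled = false;
        if (must_stall) {
          if (opts.no_slowdown) {
            return ToyStatus::kIncomplete;  // fail immediately instead of blocking
          }
          *stalled = true;  // a real engine would wait or sleep here
        }
        return ToyStatus::kOk;
      }
      ```

      A caller that sets the flag would presumably treat the incomplete status as "retry later" rather than as a hard error.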
      
      Differential Revision: D4191405
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 7f3ce3f
    • A
      Range deletion microoptimizations · fd43ee09
      Andrew Kryczka committed
      Summary:
      - Made RangeDelAggregator's InternalKeyComparator member a reference-to-const so we don't need to copy-construct it. Also added InternalKeyComparator to ImmutableCFOptions so we don't need to construct one for each DBIter.
      - Made MemTable::NewRangeTombstoneIterator and the table readers' NewRangeTombstoneIterator() functions return nullptr instead of NewEmptyInternalIterator to avoid the allocation. Updated callers accordingly.
      Closes https://github.com/facebook/rocksdb/pull/1548
      
      Differential Revision: D4208169
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2fd65cf
  4. 17 Nov, 2016 1 commit
  5. 05 Nov, 2016 1 commit
    • A
      DeleteRange user iterator support · 9e7cf346
      Andrew Kryczka committed
      Summary:
      Note: reviewed in  https://reviews.facebook.net/D65115
      
      - DBIter maintains a range tombstone accumulator. We don't clean up obsolete tombstones yet, so if the user seeks back and forth, the same tombstones are added to the accumulator multiple times.
      - DBImpl::NewInternalIterator() (used to make DBIter's underlying iterator) adds memtable/L0 range tombstones; L1+ range tombstones are added on demand during NewSecondaryIterator() (see D62205)
      - DBIter uses ShouldDelete() when advancing to check whether keys are covered by range tombstones
      Closes https://github.com/facebook/rocksdb/pull/1464
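
      A minimal sketch of the coverage check an iterator could perform while advancing. The types below are hypothetical, and the rule used here (a key is covered when it falls in `[start, end)` of a tombstone with a higher sequence number) is an assumption for illustration; the real accumulator is more involved.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <string>
      #include <utility>
      #include <vector>

      // Hypothetical toy types, not RocksDB's RangeDelAggregator.
      struct ToyRangeTombstone {
        std::string start;  // inclusive
        std::string end;    // exclusive
        uint64_t seqno;     // sequence number at which the range was deleted
      };

      class ToyTombstoneAccumulator {
       public:
        void Add(ToyRangeTombstone t) {
          assert(t.start < t.end);
          tombstones_.push_back(std::move(t));
        }

        // True when some tombstone written after the key spans the key.
        bool ShouldDelete(const std::string& user_key, uint64_t key_seqno) const {
          for (const auto& t : tombstones_) {
            if (t.start <= user_key && user_key < t.end && key_seqno < t.seqno) {
              return true;
            }
          }
          return false;
        }

       private:
        std::vector<ToyRangeTombstone> tombstones_;
      };
      ```

      Adding the same tombstone twice, as happens when seeking back and forth, costs memory and scan time in this sketch but does not change the answer.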
      
      Differential Revision: D4131753
      
      Pulled By: ajkr
      
      fbshipit-source-id: be86559
  6. 21 Oct, 2016 1 commit
    • I
      Support IngestExternalFile (remove AddFile restrictions) · 869ae5d7
      Islam AbdelRahman committed
      Summary:
      Changes in the diff
      
      API changes:
      - Introduce IngestExternalFile to replace AddFile (I think this makes the API clearer)
      - Introduce IngestExternalFileOptions (This struct will encapsulate the options for ingesting the external file)
      - Deprecate AddFile() API
      
      Logic changes:
      - If our file overlaps with the memtable, we will flush the memtable
      - We will find the first level in the LSM tree whose keys overlap with our file's key range
      - We will find the lowest level in the LSM tree above the level found in the previous step that our file can fit in, and ingest our file into it
      - We will assign a global sequence number to our new file
      - Remove AddFile restrictions by using global sequence numbers
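
      The level-picking steps above can be sketched as a small function. `PickIngestLevel`, `overlaps` and `can_fit` are hypothetical names; what "can fit" means is left abstract, and falling back to level 0 is an assumption of this sketch.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <vector>

      // overlaps[i]: the file's key range overlaps keys already in level i.
      // can_fit[i]:  the file could be placed in level i (criteria left abstract).
      // Level 0 is the top of the tree; larger indexes are lower levels.
      std::size_t PickIngestLevel(const std::vector<bool>& overlaps,
                                  const std::vector<bool>& can_fit) {
        assert(!overlaps.empty() && overlaps.size() == can_fit.size());
        // First level, scanning from the top, that overlaps the file.
        std::size_t first_overlap = overlaps.size();
        for (std::size_t i = 0; i < overlaps.size(); ++i) {
          if (overlaps[i]) {
            first_overlap = i;
            break;
          }
        }
        // Lowest level strictly above first_overlap that can hold the file.
        std::size_t target = 0;  // assumption: fall back to level 0
        for (std::size_t i = 0; i < first_overlap; ++i) {
          if (can_fit[i]) target = i;
        }
        return target;
      }
      ```

      With four levels and an overlap only at level 2, the sketch returns level 1 when every level can hold the file.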
      
      Other changes:
      - Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob
      
      Test Plan:
      unit tests (still need to add more)
      addfile_stress (https://reviews.facebook.net/D65037)
      
      Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: jkedgar, hcz, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65061
  7. 19 Oct, 2016 1 commit
    • I
      Support SST files with Global sequence numbers [reland] · b88f8e87
      Islam AbdelRahman committed
      Summary:
      reland https://reviews.facebook.net/D62523
      
      - Update SstFileWriter to include a property for a global sequence number in the SST file `rocksdb.external_sst_file.global_seqno`
      - Update TableProperties to be aware of the offset of each property in the file
      - Update BlockBasedTableReader and Block to honor the sequence number in the `rocksdb.external_sst_file.global_seqno` property and use it to overwrite all sequence numbers in the file

      Something worth mentioning is that we don't update the seqno in the index block or when doing a binary search. The reason is that SST files with a global seqno are guaranteed to have only one entry per user_key, and each key has seqno=0 encoded in it. This means that this key is greater than any other key with seqno > 0, so we can keep the current logic for these blocks
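
      To illustrate the overwrite, here is a toy model of an internal key's 8-byte trailer, which packs `(seqno << 8) | type`. The helper names are hypothetical; the sketch only shows a global sequence number replacing the stored 0 while the value type is preserved.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Pack a 56-bit sequence number and an 8-bit value type into one trailer.
      inline uint64_t PackSeqAndType(uint64_t seqno, uint8_t type) {
        assert(seqno < (uint64_t{1} << 56));
        return (seqno << 8) | type;
      }

      inline uint64_t SeqnoOf(uint64_t packed) { return packed >> 8; }

      inline uint8_t TypeOf(uint64_t packed) {
        return static_cast<uint8_t>(packed & 0xff);
      }

      // Keys in an externally written file carry seqno 0. On read, the global
      // seqno from the file's properties replaces it; the type is kept as-is.
      inline uint64_t ApplyGlobalSeqno(uint64_t packed, uint64_t global_seqno) {
        return PackSeqAndType(global_seqno, TypeOf(packed));
      }
      ```

      The bytes on disk stay untouched in this model; only the value seen by the reader changes.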
      
      Test Plan: unit tests
      
      Reviewers: sdong, yhchiang
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65211
  8. 12 Aug, 2016 1 commit
    • I
      Eliminate memcpy from ForwardIterator · d11c09d9
      Islam AbdelRahman committed
      Summary:
      This diff updates ForwardIterator to support pinning keys and values, which allows DBIter to eliminate memcpy when executing merge operators
      This diff is stacked on D61305
      
      Test Plan:
      existing tests (updated them to test tailing iterator)
      new test
      
      Reviewers: andrewkr, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D60009
  9. 11 Aug, 2016 2 commits
  10. 21 Jul, 2016 1 commit
    • I
      Introduce FullMergeV2 (eliminate memcpy from merge operators) · 68a8e6b8
      Islam AbdelRahman committed
      Summary:
      This diff updates the code to pin the merge operator operands while the merge operation is done, so that we can eliminate the memcpy cost. To do that we need a new public API for FullMerge that replaces the std::deque<std::string> with std::vector<Slice>
      
      This diff is stacked on top of D56493 and D56511
      
      In this diff we
      - Update FullMergeV2 arguments to be encapsulated in MergeOperationInput and MergeOperationOutput which will make it easier to add new arguments in the future
      - Replace std::deque<std::string> with std::vector<Slice> to pass operands
      - Replace MergeContext std::deque with std::vector (based on a simple benchmark I ran https://gist.github.com/IslamAbdelRahman/78fc86c9ab9f52b1df791e58943fb187)
      - Allow FullMergeV2 output to be an existing operand
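
      A rough sketch of why letting the output be an existing operand avoids copying, using a "max" merge. The structs are hypothetical toys loosely modelled on the input/output split described above; plain `const std::string*` pointers stand in for pinned slices.

      ```cpp
      #include <cassert>
      #include <string>
      #include <vector>

      // Hypothetical toy structs, not RocksDB's MergeOperationInput/Output.
      struct ToyMergeInput {
        const std::string* existing_value;  // nullptr when the key has no base value
        const std::vector<const std::string*>& operands;  // pinned, not copied
      };

      struct ToyMergeOutput {
        std::string new_value;  // filled only when a fresh value must be built
        const std::string* existing_operand = nullptr;  // result is one of the inputs
      };

      // "max" merge: the result is always one of the inputs, so no bytes move.
      bool ToyMaxMerge(const ToyMergeInput& in, ToyMergeOutput* out) {
        assert(out != nullptr);
        const std::string* best = in.existing_value;
        for (const std::string* op : in.operands) {
          assert(op != nullptr);
          if (best == nullptr || *op > *best) best = op;
        }
        if (best == nullptr) return false;  // nothing to merge
        out->existing_operand = best;       // hand back a pointer, no memcpy
        return true;
      }
      ```

      With 10 KB operands, as in the benchmarks below, returning a pointer rather than filling `new_value` skips a 10 KB copy per merge in this model.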
      
      ```
      [Everything in Memtable | 10K operands | 10 KB each | 1 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=10000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000
      
      [FullMergeV2]
      readseq      :       0.607 micros/op 1648235 ops/sec; 16121.2 MB/s
      readseq      :       0.478 micros/op 2091546 ops/sec; 20457.2 MB/s
      readseq      :       0.252 micros/op 3972081 ops/sec; 38850.5 MB/s
      readseq      :       0.237 micros/op 4218328 ops/sec; 41259.0 MB/s
      readseq      :       0.247 micros/op 4043927 ops/sec; 39553.2 MB/s
      
      [master]
      readseq      :       3.935 micros/op 254140 ops/sec; 2485.7 MB/s
      readseq      :       3.722 micros/op 268657 ops/sec; 2627.7 MB/s
      readseq      :       3.149 micros/op 317605 ops/sec; 3106.5 MB/s
      readseq      :       3.125 micros/op 320024 ops/sec; 3130.1 MB/s
      readseq      :       4.075 micros/op 245374 ops/sec; 2400.0 MB/s
      ```
      
      ```
      [Everything in Memtable | 10K operands | 10 KB each | 10 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=1000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000
      
      [FullMergeV2]
      readseq      :       3.472 micros/op 288018 ops/sec; 2817.1 MB/s
      readseq      :       2.304 micros/op 434027 ops/sec; 4245.2 MB/s
      readseq      :       1.163 micros/op 859845 ops/sec; 8410.0 MB/s
      readseq      :       1.192 micros/op 838926 ops/sec; 8205.4 MB/s
      readseq      :       1.250 micros/op 800000 ops/sec; 7824.7 MB/s
      
      [master]
      readseq      :      24.025 micros/op 41623 ops/sec;  407.1 MB/s
      readseq      :      18.489 micros/op 54086 ops/sec;  529.0 MB/s
      readseq      :      18.693 micros/op 53495 ops/sec;  523.2 MB/s
      readseq      :      23.621 micros/op 42335 ops/sec;  414.1 MB/s
      readseq      :      18.775 micros/op 53262 ops/sec;  521.0 MB/s
      
      ```
      
      ```
      [Everything in Block cache | 10K operands | 10 KB each | 1 operand per key]
      
      [FullMergeV2]
      $ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions
      readseq      :      14.741 micros/op 67837 ops/sec;  663.5 MB/s
      readseq      :       1.029 micros/op 971446 ops/sec; 9501.6 MB/s
      readseq      :       0.974 micros/op 1026229 ops/sec; 10037.4 MB/s
      readseq      :       0.965 micros/op 1036080 ops/sec; 10133.8 MB/s
      readseq      :       0.943 micros/op 1060657 ops/sec; 10374.2 MB/s
      
      [master]
      readseq      :      16.735 micros/op 59755 ops/sec;  584.5 MB/s
      readseq      :       3.029 micros/op 330151 ops/sec; 3229.2 MB/s
      readseq      :       3.136 micros/op 318883 ops/sec; 3119.0 MB/s
      readseq      :       3.065 micros/op 326245 ops/sec; 3191.0 MB/s
      readseq      :       3.014 micros/op 331813 ops/sec; 3245.4 MB/s
      ```
      
      ```
      [Everything in Block cache | 10K operands | 10 KB each | 10 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10-operands-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions
      
      [FullMergeV2]
      readseq      :      24.325 micros/op 41109 ops/sec;  402.1 MB/s
      readseq      :       1.470 micros/op 680272 ops/sec; 6653.7 MB/s
      readseq      :       1.231 micros/op 812347 ops/sec; 7945.5 MB/s
      readseq      :       1.091 micros/op 916590 ops/sec; 8965.1 MB/s
      readseq      :       1.109 micros/op 901713 ops/sec; 8819.6 MB/s
      
      [master]
      readseq      :      27.257 micros/op 36687 ops/sec;  358.8 MB/s
      readseq      :       4.443 micros/op 225073 ops/sec; 2201.4 MB/s
      readseq      :       5.830 micros/op 171526 ops/sec; 1677.7 MB/s
      readseq      :       4.173 micros/op 239635 ops/sec; 2343.8 MB/s
      readseq      :       4.150 micros/op 240963 ops/sec; 2356.8 MB/s
      ```
      
      Test Plan: COMPILE_WITH_ASAN=1 make check -j64
      
      Reviewers: yhchiang, andrewkr, sdong
      
      Reviewed By: sdong
      
      Subscribers: lovro, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D57075
  11. 14 Jul, 2016 1 commit
  12. 12 Jul, 2016 1 commit
    • A
      update DB::AddFile to ingest list of sst files · 8e6b38d8
      Aaron Gao committed
      Summary:
      The DB::AddFile(std::string file_path) API allows users to ingest an SST file created using SstFileWriter.
      We want to update this interface to accept a list of files to ingest: DB::AddFile(std::vector<std::string> file_path_list).
      
      Test Plan:
      Add test case `AddExternalSstFileList` in `DBSSTTest`. To make sure:
      1. the files' key ranges do not overlap with each other
      2. each file's key range doesn't overlap with the DB key range
      3. no snapshots are held
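
      The first two conditions amount to a range-overlap check, which might look like the sketch below. `ToyKeyRange` and `CanIngestAll` are hypothetical names, and both range bounds are assumed inclusive.

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstddef>
      #include <string>
      #include <vector>

      struct ToyKeyRange {
        std::string smallest;  // inclusive
        std::string largest;   // inclusive
      };

      inline bool Overlaps(const ToyKeyRange& a, const ToyKeyRange& b) {
        return a.smallest <= b.largest && b.smallest <= a.largest;
      }

      // True when no two files overlap and no file overlaps db_range.
      // Pass nullptr for db_range when the DB holds no keys.
      bool CanIngestAll(std::vector<ToyKeyRange> files, const ToyKeyRange* db_range) {
        for (const auto& f : files) {
          assert(f.smallest <= f.largest);
          if (db_range != nullptr && Overlaps(f, *db_range)) return false;
        }
        // After sorting by smallest key, only neighbours need comparing.
        std::sort(files.begin(), files.end(),
                  [](const ToyKeyRange& a, const ToyKeyRange& b) {
                    return a.smallest < b.smallest;
                  });
        for (std::size_t i = 1; i < files.size(); ++i) {
          if (files[i - 1].largest >= files[i].smallest) return false;
        }
        return true;
      }
      ```

      Sorting first keeps the pairwise check linear after an O(n log n) sort instead of comparing every pair of files.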
      
      Reviewers: andrewkr, sdong, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D58587
  13. 22 Jun, 2016 1 commit
  14. 18 Jun, 2016 1 commit
    • S
      Deprecate filter_deletes · 7b79238b
      sdong committed
      Summary: filter_deletes is not a frequently used feature. Remove it.
      
      Test Plan: Run all test suites.
      
      Reviewers: igor, yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D59427
  15. 29 Apr, 2016 1 commit
    • S
      Change several option defaults · 6a14f7a9
      sdong committed
      Summary:
      Changing several option defaults:
       options.max_open_files changes from 5000 to -1
       options.base_background_compactions changes from max_background_compactions to 1
       options.wal_recovery_mode changes from kTolerateCorruptedTailRecords to kTolerateCorruptedTailRecords
       options.compaction_pri changes from kByCompensatedSize to kByCompensatedSize
      
      Test Plan: Write unit tests to see OldDefaults() works as expected.
      
      Reviewers: IslamAbdelRahman, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: MarkCallaghan, yiwu, kradhakrishnan, leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D56427
  16. 21 Apr, 2016 1 commit
    • A
      Add per-level compression ratio property · 73a847ef
      Andrew Kryczka committed
      Summary:
      This is needed so we can measure compression ratio improvements
      achieved by D52287.
      
      The property compares raw data size against the total file size for a given
      level. If the level is empty it should return 0.0.
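
      As a sketch of the computation, the function below sums per-file sizes for one level and divides raw bytes by on-disk bytes. `ToyFileMeta` and `ToyLevelCompressionRatio` are hypothetical names, and the direction of the ratio (raw over file size) is an assumption; the text only says the two quantities are compared.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <vector>

      struct ToyFileMeta {
        uint64_t raw_bytes;   // uncompressed data size
        uint64_t file_bytes;  // size of the file on disk
      };

      // Returns 0.0 for an empty level, as the summary describes.
      double ToyLevelCompressionRatio(const std::vector<ToyFileMeta>& files) {
        uint64_t raw = 0;
        uint64_t on_disk = 0;
        for (const auto& f : files) {
          assert(f.file_bytes > 0);  // assumption: every file has some bytes
          raw += f.raw_bytes;
          on_disk += f.file_bytes;
        }
        if (on_disk == 0) return 0.0;
        return static_cast<double>(raw) / static_cast<double>(on_disk);
      }
      ```

      Under this convention a level holding 400 raw bytes in 200 bytes of files reports 2.0.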
      
      Test Plan: new unit test
      
      Reviewers: IslamAbdelRahman, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D56967
  17. 19 Apr, 2016 1 commit
    • Y
      Split db_test.cc · 792762c4
      Yi Wu committed
      Summary: Split db_test.cc into several files. Moving several helper functions into DBTestBase.
      
      Test Plan: make check
      
      Reviewers: sdong, yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: dhruba, andrewkr, kradhakrishnan, yhchiang, leveldb, sdong
      
      Differential Revision: https://reviews.facebook.net/D56715
  18. 16 Apr, 2016 1 commit
  19. 01 Apr, 2016 1 commit
    • S
      Change some RocksDB default options · 2feafa3d
      sdong committed
      Summary: Change some RocksDB default options to make it more friendly to server workloads.
      
      Test Plan: Run all existing tests
      
      Reviewers: yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: sumeet, muthu, benj, MarkCallaghan, igor, leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D55941
  20. 18 Feb, 2016 1 commit
  21. 10 Feb, 2016 1 commit
  22. 06 Feb, 2016 1 commit
  23. 02 Feb, 2016 1 commit
  24. 29 Jan, 2016 1 commit
  25. 26 Jan, 2016 1 commit
    • S
      Parameterize DBTest.Randomized · da33dfe1
      sdong committed
      Summary: Break down DBTest.Randomized to multiple gtest tests based on config type
      
      Test Plan: Run the test and all tests. Make sure configurations are correctly set
      
      Reviewers: yhchiang, IslamAbdelRahman, rven, kradhakrishnan, andrewkr, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D53247
  26. 05 Dec, 2015 1 commit
    • K
      Build break fix. · d3bb572d
      krad committed
      Summary: Skip list now cannot estimate memory across allocators
      consistently and hence triggers flush at different time. This breaks certain
      unit tests.
      
      The fix is to adopt key count instead of size for flush.
      
      Test Plan: Ran test on dev box and mac (where it used to fail)
      
      Reviewers: sdong
      
      CC: leveldb@
      
      Task ID: #9273334
      
      Blame Rev:
  27. 01 Dec, 2015 1 commit
    • S
      Fix DBTest.SuggestCompactRangeTest for disable jemalloc case · ef8ed368
      sdong committed
      Summary: DBTest.SuggestCompactRangeTest fails when jemalloc is disabled, including in ASAN and valgrind builds. The failure is caused by the skip list improvement, which allocates nodes of different sizes for new records. Fix it by using a special memtable that triggers a flush by number of entries, so that the behavior is consistent across all allocators.
      
      Test Plan: Run the test with both of DISABLE_JEMALLOC=1 and 0
      
      Reviewers: anthony, rven, yhchiang, kradhakrishnan, igor, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D51423
  28. 24 Nov, 2015 1 commit
    • V
      Enable C4267 warning · 41b32c60
      Vasili Svirski committed
      * Fix conversions from 'size_t' to 'type' by adding static_cast

      Tested:
      * built the solution on Windows and Linux locally
      * ran tests
      * CI build succeeded
  29. 18 Nov, 2015 1 commit
  30. 11 Nov, 2015 1 commit
    • Y
      Enable RocksDB to persist Options file. · e114f0ab
      Yueh-Hsuan Chiang committed
      Summary:
      This patch allows rocksdb to persist options into a file on
      DB::Open, SetOptions, and Create / Drop ColumnFamily.
      Options files are created under the same directory as the rocksdb
      instance.
      
      In addition, this patch also adds a fail_if_missing_options_file in DBOptions
      that makes any function call return non-ok status when it is not able to
      persist options properly.
      
        // If true, then DB::Open / CreateColumnFamily / DropColumnFamily
        // / SetOptions will fail if options file is not detected or properly
        // persisted.
        //
        // DEFAULT: false
        bool fail_if_missing_options_file;
      
      Options file names are formatted as OPTIONS-<number>, and RocksDB
      will always keep the latest two options files.
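
      The "keep the latest two" rule could be sketched as below. The helper names are hypothetical, and the sketch assumes that a larger number in `OPTIONS-<number>` means a newer file.

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstddef>
      #include <cstdint>
      #include <string>
      #include <utility>
      #include <vector>

      // Parses "OPTIONS-<number>"; returns false for any other file name.
      bool ParseOptionsFileNumber(const std::string& name, uint64_t* number) {
        assert(number != nullptr);
        const std::string prefix = "OPTIONS-";
        if (name.size() <= prefix.size() ||
            name.compare(0, prefix.size(), prefix) != 0) {
          return false;
        }
        uint64_t n = 0;
        for (std::size_t i = prefix.size(); i < name.size(); ++i) {
          if (name[i] < '0' || name[i] > '9') return false;
          n = n * 10 + static_cast<uint64_t>(name[i] - '0');
        }
        *number = n;
        return true;
      }

      // Returns the options files to delete so only the `keep` newest remain.
      std::vector<std::string> ObsoleteOptionsFiles(
          const std::vector<std::string>& names, std::size_t keep = 2) {
        std::vector<std::pair<uint64_t, std::string>> found;
        for (const auto& name : names) {
          uint64_t n = 0;
          if (ParseOptionsFileNumber(name, &n)) found.emplace_back(n, name);
        }
        std::sort(found.begin(), found.end());  // ascending by file number
        std::vector<std::string> obsolete;
        for (std::size_t i = 0; i + keep < found.size(); ++i) {
          obsolete.push_back(found[i].second);
        }
        return obsolete;
      }
      ```

      Files that do not match the naming pattern are ignored rather than deleted, which keeps the sketch from touching unrelated files in the DB directory.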
      
      Test Plan:
      Add options_file_test.
      
      options_test
      column_family_test
      
      Reviewers: igor, IslamAbdelRahman, sdong, anthony
      
      Reviewed By: anthony
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48285
  31. 10 Nov, 2015 1 commit
    • N
      Switch to thread-local random for skiplist · b81b4309
      Nathan Bronson committed
      Summary:
      Using a TLS random instance for skiplist makes it smaller
      (useful for hash_skiplist_rep) and prepares skiplist for concurrent
      adds.  This diff also modifies the branching factor math to avoid an
      unnecessary division.
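
      A minimal sketch of both ideas, assuming a branching factor of 4 and a maximum height of 12 (both hypothetical, as are all names below). The generator lives in thread-local storage rather than inside each skip list, and the height loop compares against a threshold computed once instead of dividing on every draw.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <random>

      // Hypothetical constants for illustration.
      constexpr int kMaxHeight = 12;
      constexpr uint32_t kBranching = 4;
      // Draws are reduced to [0, 2^31); a draw below this threshold occurs with
      // probability 1 / kBranching. The division happens once, at compile time.
      constexpr uint32_t kRandRange = 1u << 31;
      constexpr uint32_t kScaledInverseBranching = kRandRange / kBranching;

      // One generator per thread: no RNG state inside each skip list and no
      // generator shared between concurrent writers.
      inline std::mt19937& ThreadLocalRng() {
        thread_local std::mt19937 rng(301);
        return rng;
      }

      inline int RandomHeight() {
        int height = 1;
        while (height < kMaxHeight &&
               (ThreadLocalRng()() >> 1) < kScaledInverseBranching) {
          ++height;
        }
        assert(height >= 1 && height <= kMaxHeight);
        return height;
      }
      ```

      Because the sequence of heights depends on the generator's state, swapping generators changes the list's exact shape, which is the test brittleness discussed next.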
      
      This diff has the effect of changing the sequence of skip list node
      height choices made by tests, so it has the potential to cause unit
      test failures for tests that implicitly rely on the exact structure
      of the skip list.  Tests that try to exactly trigger a compaction are
      likely suspects for this problem (these tests have always been brittle to
      changes in the skiplist details).  I've minimized this risk by reseeding
      the main thread's Random at the beginning of each test, increasing the
      universal compaction size_ratio limit from 101% to 105% for some tests,
      and verifying that the tests pass many times.
      
      Test Plan: for i in `seq 0 9`; do make check; done
      
      Reviewers: sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D50439
  32. 23 Oct, 2015 1 commit
  33. 19 Oct, 2015 1 commit
  34. 16 Oct, 2015 1 commit
  35. 14 Oct, 2015 2 commits
    • I
      Make db_test_util compile under ROCKSDB_LITE · f55d3009
      Islam AbdelRahman committed
      Summary: db_test_util is used in multiple test files but it doesn't compile under ROCKSDB_LITE
      
      Test Plan:
      make check
      make static_lib
      OPT=-DROCKSDB_LITE make db_wal_test
      
      Reviewers: igor, yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48579
    • S
      Separate InternalIterator from Iterator · 35ad531b
      sdong committed
      Summary:
      Separate a new class, InternalIterator, from class Iterator for look-ups done internally, which means it operates on keys with sequence ID and type.
      
      This change will enable potential future optimizations but for now InternalIterator's functions are still the same as Iterator's.
      At the same time, separate the cleanup function to a separate class and let both of InternalIterator and Iterator inherit from it.
      
      Test Plan: Run all existing tests.
      
      Reviewers: igor, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, rven
      
      Reviewed By: rven
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48549
  36. 13 Oct, 2015 2 commits