1. 24 2月, 2017 1 次提交
  2. 23 2月, 2017 2 次提交
  3. 07 2月, 2017 1 次提交
    • M
      Two-level Indexes · 69d5262c
      Maysam Yabandeh 提交于
      Summary:
      Partition Index blocks and use a Partition-index as a 2nd level index.
      
      The two-level index can be used by setting
      BlockBasedTableOptions::kTwoLevelIndexSearch as the index type and
      configuring BlockBasedTableOptions::index_per_partition
      
      t15539501
      Closes https://github.com/facebook/rocksdb/pull/1814
      
      Differential Revision: D4473535
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: bffb87e
      69d5262c
  4. 04 1月, 2017 1 次提交
  5. 23 12月, 2016 1 次提交
    • A
      direct io write support · 972f96b3
      Aaron Gao 提交于
      Summary:
      rocksdb direct io support
      
      ```
      [gzh@dev11575.prn2 ~/rocksdb] ./db_bench -benchmarks=fillseq --num=1000000
      Initializing RocksDB Options from the specified file
      Initializing RocksDB Options from command-line flags
      RocksDB:    version 5.0
      Date:       Wed Nov 23 13:17:43 2016
      CPU:        40 * Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
      CPUCache:   25600 KB
      Keys:       16 bytes each
      Values:     100 bytes each (50 bytes after compression)
      Entries:    1000000
      Prefix:    0 bytes
      Keys per prefix:    0
      RawSize:    110.6 MB (estimated)
      FileSize:   62.9 MB (estimated)
      Write rate: 0 bytes/second
      Compression: Snappy
      Memtablerep: skip_list
      Perf Level: 1
      WARNING: Assertions are enabled; benchmarks unnecessarily slow
      ------------------------------------------------
      Initializing RocksDB Options from the specified file
      Initializing RocksDB Options from command-line flags
      DB path: [/tmp/rocksdbtest-112628/dbbench]
      fillseq      :       4.393 micros/op 227639 ops/sec;   25.2 MB/s
      
      [gzh@dev11575.prn2 ~/roc
      Closes https://github.com/facebook/rocksdb/pull/1564
      
      Differential Revision: D4241093
      
      Pulled By: lightmark
      
      fbshipit-source-id: 98c29e3
      972f96b3
  6. 17 12月, 2016 1 次提交
  7. 22 11月, 2016 1 次提交
  8. 17 11月, 2016 1 次提交
  9. 21 10月, 2016 1 次提交
    • I
      Support IngestExternalFile (remove AddFile restrictions) · 869ae5d7
      Islam AbdelRahman 提交于
      Summary:
      Changes in the diff
      
      API changes:
      - Introduce IngestExternalFile to replace AddFile (I think this make the API more clear)
      - Introduce IngestExternalFileOptions (This struct will encapsulate the options for ingesting the external file)
      - Deprecate AddFile() API
      
      Logic changes:
      - If our file overlap with the memtable we will flush the memtable
      - We will find the first level in the LSM tree that our file key range overlap with the keys in it
      - We will find the lowest level in the LSM tree above the the level we found in step 2 that our file can fit in and ingest our file in it
      - We will assign a global sequence number to our new file
      - Remove AddFile restrictions by using global sequence numbers
      
      Other changes:
      - Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob
      
      Test Plan:
      unit tests (still need to add more)
      addfile_stress (https://reviews.facebook.net/D65037)
      
      Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: jkedgar, hcz, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65061
      869ae5d7
  10. 19 10月, 2016 1 次提交
    • I
      Support SST files with Global sequence numbers [reland] · b88f8e87
      Islam AbdelRahman 提交于
      Summary:
      reland https://reviews.facebook.net/D62523
      
      - Update SstFileWriter to include a property for a global sequence number in the SST file `rocksdb.external_sst_file.global_seqno`
      - Update TableProperties to be aware of the offset of each property in the file
      - Update BlockBasedTableReader and Block to be able to honor the sequence number in `rocksdb.external_sst_file.global_seqno` property and use it to overwrite all sequence number in the file
      
      Something worth mentioning is that we don't update the seqno in the index block since and when doing a binary search, the reason for that is that it's guaranteed that SST files with global seqno will have only one user_key and each key will have seqno=0 encoded in it, This mean that this key is greater than any other key with seqno> 0. That mean that we can actually keep the current logic for these blocks
      
      Test Plan: unit tests
      
      Reviewers: sdong, yhchiang
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65211
      b88f8e87
  11. 20 9月, 2016 1 次提交
  12. 08 9月, 2016 1 次提交
  13. 12 8月, 2016 1 次提交
    • I
      Eliminate memcpy from ForwardIterator · d11c09d9
      Islam AbdelRahman 提交于
      Summary:
      This diff update ForwardIterator to support pinning keys and values, which will allow DBIter to take advantage of that and eliminate memcpy when executing merge operators
      This diff is stacked on D61305
      
      Test Plan:
      existing tests (updated them to test tailing iterator)
      new test
      
      Reviewers: andrewkr, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D60009
      d11c09d9
  14. 11 8月, 2016 1 次提交
    • S
      read_options.background_purge_on_iterator_cleanup to cover forward iterator... · 56dd0341
      sdong 提交于
      read_options.background_purge_on_iterator_cleanup to cover forward iterator and log file closing too.
      
      Summary: With read_options.background_purge_on_iterator_cleanup=true, File deletion and closing can still happen in forward iterator, or WAL file closing. Cover those cases too.
      
      Test Plan: I am adding unit tests.
      
      Reviewers: andrewkr, IslamAbdelRahman, yiwu
      
      Reviewed By: yiwu
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D61503
      56dd0341
  15. 21 7月, 2016 1 次提交
    • I
      Introduce FullMergeV2 (eliminate memcpy from merge operators) · 68a8e6b8
      Islam AbdelRahman 提交于
      Summary:
      This diff update the code to pin the merge operator operands while the merge operation is done, so that we can eliminate the memcpy cost, to do that we need a new public API for FullMerge that replace the std::deque<std::string> with std::vector<Slice>
      
      This diff is stacked on top of D56493 and D56511
      
      In this diff we
      - Update FullMergeV2 arguments to be encapsulated in MergeOperationInput and MergeOperationOutput which will make it easier to add new arguments in the future
      - Replace std::deque<std::string> with std::vector<Slice> to pass operands
      - Replace MergeContext std::deque with std::vector (based on a simple benchmark I ran https://gist.github.com/IslamAbdelRahman/78fc86c9ab9f52b1df791e58943fb187)
      - Allow FullMergeV2 output to be an existing operand
      
      ```
      [Everything in Memtable | 10K operands | 10 KB each | 1 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=10000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000
      
      [FullMergeV2]
      readseq      :       0.607 micros/op 1648235 ops/sec; 16121.2 MB/s
      readseq      :       0.478 micros/op 2091546 ops/sec; 20457.2 MB/s
      readseq      :       0.252 micros/op 3972081 ops/sec; 38850.5 MB/s
      readseq      :       0.237 micros/op 4218328 ops/sec; 41259.0 MB/s
      readseq      :       0.247 micros/op 4043927 ops/sec; 39553.2 MB/s
      
      [master]
      readseq      :       3.935 micros/op 254140 ops/sec; 2485.7 MB/s
      readseq      :       3.722 micros/op 268657 ops/sec; 2627.7 MB/s
      readseq      :       3.149 micros/op 317605 ops/sec; 3106.5 MB/s
      readseq      :       3.125 micros/op 320024 ops/sec; 3130.1 MB/s
      readseq      :       4.075 micros/op 245374 ops/sec; 2400.0 MB/s
      ```
      
      ```
      [Everything in Memtable | 10K operands | 10 KB each | 10 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=1000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000
      
      [FullMergeV2]
      readseq      :       3.472 micros/op 288018 ops/sec; 2817.1 MB/s
      readseq      :       2.304 micros/op 434027 ops/sec; 4245.2 MB/s
      readseq      :       1.163 micros/op 859845 ops/sec; 8410.0 MB/s
      readseq      :       1.192 micros/op 838926 ops/sec; 8205.4 MB/s
      readseq      :       1.250 micros/op 800000 ops/sec; 7824.7 MB/s
      
      [master]
      readseq      :      24.025 micros/op 41623 ops/sec;  407.1 MB/s
      readseq      :      18.489 micros/op 54086 ops/sec;  529.0 MB/s
      readseq      :      18.693 micros/op 53495 ops/sec;  523.2 MB/s
      readseq      :      23.621 micros/op 42335 ops/sec;  414.1 MB/s
      readseq      :      18.775 micros/op 53262 ops/sec;  521.0 MB/s
      
      ```
      
      ```
      [Everything in Block cache | 10K operands | 10 KB each | 1 operand per key]
      
      [FullMergeV2]
      $ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions
      readseq      :      14.741 micros/op 67837 ops/sec;  663.5 MB/s
      readseq      :       1.029 micros/op 971446 ops/sec; 9501.6 MB/s
      readseq      :       0.974 micros/op 1026229 ops/sec; 10037.4 MB/s
      readseq      :       0.965 micros/op 1036080 ops/sec; 10133.8 MB/s
      readseq      :       0.943 micros/op 1060657 ops/sec; 10374.2 MB/s
      
      [master]
      readseq      :      16.735 micros/op 59755 ops/sec;  584.5 MB/s
      readseq      :       3.029 micros/op 330151 ops/sec; 3229.2 MB/s
      readseq      :       3.136 micros/op 318883 ops/sec; 3119.0 MB/s
      readseq      :       3.065 micros/op 326245 ops/sec; 3191.0 MB/s
      readseq      :       3.014 micros/op 331813 ops/sec; 3245.4 MB/s
      ```
      
      ```
      [Everything in Block cache | 10K operands | 10 KB each | 10 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10-operands-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions
      
      [FullMergeV2]
      readseq      :      24.325 micros/op 41109 ops/sec;  402.1 MB/s
      readseq      :       1.470 micros/op 680272 ops/sec; 6653.7 MB/s
      readseq      :       1.231 micros/op 812347 ops/sec; 7945.5 MB/s
      readseq      :       1.091 micros/op 916590 ops/sec; 8965.1 MB/s
      readseq      :       1.109 micros/op 901713 ops/sec; 8819.6 MB/s
      
      [master]
      readseq      :      27.257 micros/op 36687 ops/sec;  358.8 MB/s
      readseq      :       4.443 micros/op 225073 ops/sec; 2201.4 MB/s
      readseq      :       5.830 micros/op 171526 ops/sec; 1677.7 MB/s
      readseq      :       4.173 micros/op 239635 ops/sec; 2343.8 MB/s
      readseq      :       4.150 micros/op 240963 ops/sec; 2356.8 MB/s
      ```
      
      Test Plan: COMPILE_WITH_ASAN=1 make check -j64
      
      Reviewers: yhchiang, andrewkr, sdong
      
      Reviewed By: sdong
      
      Subscribers: lovro, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D57075
      68a8e6b8
  16. 22 6月, 2016 1 次提交
  17. 18 6月, 2016 1 次提交
    • S
      Deprectate filter_deletes · 7b79238b
      sdong 提交于
      Summary: filter_deltes is not a frequently used feature. Remove it.
      
      Test Plan: Run all test suites.
      
      Reviewers: igor, yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D59427
      7b79238b
  18. 10 5月, 2016 1 次提交
    • Y
      Fix win build · 730f7e2e
      Yi Wu 提交于
      Summary: Fixing error with win build where we compare int64_t with size_t.
      
      Test Plan: make check
      
      Reviewers: andrewkr
      
      Reviewed By: andrewkr
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D57885
      730f7e2e
  19. 05 5月, 2016 1 次提交
    • Y
      Enable configurable readahead for iterators · 24a24f01
      Yi Wu 提交于
      Summary:
      Add an option `iterator_readahead_size` to `ReadOptions` to enable
      configurable readahead for iterators similar to the corresponding
      option for compaction.
      
      Test Plan:
      ```
      make commit_prereq
      ```
      
      Reviewers: kumar.rangarajan, ott, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: yiwu, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D55419
      24a24f01
  20. 21 4月, 2016 1 次提交
    • A
      Add per-level compression ratio property · 73a847ef
      Andrew Kryczka 提交于
      Summary:
      This is needed so we can measure compression ratio improvements
      achieved by D52287.
      
      The property compares raw data size against the total file size for a given
      level. If the level is empty it should return 0.0.
      
      Test Plan: new unit test
      
      Reviewers: IslamAbdelRahman, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D56967
      73a847ef
  21. 19 4月, 2016 2 次提交
    • Y
      Fix lite build · 290883d9
      Yi Wu 提交于
      Summary: Fix rocksdb lite build after D56715.
      
      Test Plan:
        make -j40 'OPT=-g -DROCKSDB_LITE'
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D56895
      290883d9
    • Y
      Split db_test.cc · 792762c4
      Yi Wu 提交于
      Summary: Split db_test.cc into several files. Moving several helper functions into DBTestBase.
      
      Test Plan: make check
      
      Reviewers: sdong, yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: dhruba, andrewkr, kradhakrishnan, yhchiang, leveldb, sdong
      
      Differential Revision: https://reviews.facebook.net/D56715
      792762c4
  22. 18 2月, 2016 1 次提交
  23. 10 2月, 2016 1 次提交
  24. 06 2月, 2016 1 次提交
  25. 04 2月, 2016 1 次提交
    • N
      always invalidate sequential-insertion cache for concurrent skiplist adds · 2c1db5ea
      Nathan Bronson 提交于
      Summary:
      InlineSkipList::InsertConcurrently should invalidate the
      sequential-insertion cache prev_[] for all inserts of multi-level nodes,
      not just those that increase the height of the skip list.  The invariant
      for prev_ is that prev_[i] (i > 0) is supposed to be the predecessor of
      prev_[0] at level i.  Before this diff InsertConcurrently could violate
      this constraint when inserting a multi-level node after prev_[i] but
      before prev_[0].
      
      This diff also reenables kConcurrentSkipList as db_test's
      MultiThreaded/MultiThreadedDBTest.MultiThreaded/29.
      
      Test Plan:
      1. unit tests
      2. temporarily hack kConcurrentSkipList timing so that it is fast but has a 1.5% failure rate on my dev box (1ms stagger on thread launch, 1s test duration, failure rate baseline over 1000 runs)
      3. observe 1000 passes post-fix
      
      Reviewers: igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: MarkCallaghan, dhruba
      
      Differential Revision: https://reviews.facebook.net/D53751
      2c1db5ea
  26. 03 2月, 2016 1 次提交
  27. 02 2月, 2016 1 次提交
  28. 29 1月, 2016 1 次提交
  29. 26 1月, 2016 1 次提交
    • S
      Parameterize DBTest.Randomized · da33dfe1
      sdong 提交于
      Summary: Break down DBTest.Randomized to multiple gtest tests based on config type
      
      Test Plan: Run the test and all tests. Make sure configurations are correctly set
      
      Reviewers: yhchiang, IslamAbdelRahman, rven, kradhakrishnan, andrewkr, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D53247
      da33dfe1
  30. 21 1月, 2016 1 次提交
    • A
      Split db_test.cc (part 1: properties) · eceb5cb1
      Andrew Kryczka 提交于
      Summary:
      Moved all the tests that verify property correctness into a separate
      file. The goal is to reduce compile time and complexity of db_test. I didn't
      add parallelism for db_properties_test, even though these tests were
      parallelized in db_test, since the file is small enough that it won't matter.
      
      Some of these moves may be controversial since it's hard to say whether the
      test is "verifying property correctness," or "using properties to verify
      rocksdb's correctness." I'm interested in any opinions.
      
      Test Plan: ran db_properties_test, also waiting on "make commit-prereq -j32"
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D52995
      eceb5cb1
  31. 05 12月, 2015 1 次提交
    • K
      Build break fix. · d3bb572d
      krad 提交于
      Summary: Skip list now cannot estimate memory across allocators
      consistently and hence triggers flush at different time. This breaks certain
      unit tests.
      
      The fix is to adopt key count instead of size for flush.
      
      Test Plan: Ran test on dev box and mac (where it used to fail)
      
      Reviewers: sdong
      
      CC: leveldb@
      
      Task ID: #9273334
      
      Blame Rev:
      d3bb572d
  32. 01 12月, 2015 2 次提交
    • A
      Fix CLANG build · 481f9edb
      agiardullo 提交于
      Summary: fix clang build
      
      Test Plan: build
      
      Reviewers: IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D51453
      481f9edb
    • S
      Fix DBTest.SuggestCompactRangeTest for disable jemalloc case · ef8ed368
      sdong 提交于
      Summary: DBTest.SuggestCompactRangeTest fails for the case when jemalloc is disabled, including ASAN and valgrind builds. It is caused by the improvement of skip list, which allocates different size of nodes for a new records. Fix it by using a special mem table that triggers a flush by number of entries. In that way the behavior will be consistent for all allocators.
      
      Test Plan: Run the test with both of DISABLE_JEMALLOC=1 and 0
      
      Reviewers: anthony, rven, yhchiang, kradhakrishnan, igor, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D51423
      ef8ed368
  33. 24 11月, 2015 1 次提交
    • V
      Enable C4267 warning · 41b32c60
      Vasili Svirski 提交于
      * conversion from 'size_t' to 'type', by add static_cast
      
      Tested:
      * by build solution on Windows, Linux locally,
      * run tests
      * build CI system successful
      41b32c60
  34. 18 11月, 2015 1 次提交
    • S
      DBTest.MergeTestTime to only use fake time to be determinstic · d5540e18
      sdong 提交于
      Summary: DBTest.MergeTestTime is a test verifying timing counters. Depending on real time may cause non-determinstic results. Change to fake time to be determinsitic.
      
      Test Plan: Run the test and make sure it passes
      
      Reviewers: yhchiang, anthony, rven, kradhakrishnan, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D50883
      d5540e18
  35. 19 10月, 2015 1 次提交
  36. 17 10月, 2015 1 次提交
  37. 14 10月, 2015 1 次提交