1. 17 Nov, 2016 1 commit
    • Y
      Remove Ticker::SEQUENCE_NUMBER · 36e4762c
      Yi Wu committed
      Summary:
      Remove the ticker count because:
      * Having to reset the ticker count in WriteImpl is inefficient;
      * It doesn't make sense to have it as a ticker count if multiple DB
        instances share a statistics object.
      Closes https://github.com/facebook/rocksdb/pull/1531
      
      Differential Revision: D4194442
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: e2110a9
      36e4762c
  2. 16 Nov, 2016 1 commit
  3. 15 Nov, 2016 1 commit
  4. 13 Nov, 2016 1 commit
  5. 12 Nov, 2016 1 commit
  6. 11 Nov, 2016 1 commit
    • R
      Fix 2PC Recovery SeqId Miscount · 1ca5f6d1
      Reid Horuff committed
      Summary:
      Originally, sequence ids were calculated during recovery based on the first seqid found in the first log recovered. The working seqid was then incremented from that value for every insertion that took place. This was faulty because of the potential for missing log files or inserts that skipped the WAL. The current recovery scheme takes the sequence id from the batch currently being recovered and uses MemTableInserter to track how many inserts actually take place. This works for 2PC batches as well as for scenarios where some logs are missing or some inserts skipped the WAL.
      Closes https://github.com/facebook/rocksdb/pull/1486
      
      Differential Revision: D4156064
      
      Pulled By: reidHoruff
      
      fbshipit-source-id: a6da8d9
      1ca5f6d1
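The difference between the two recovery schemes described in this commit can be illustrated with a toy model. This is a sketch under stated assumptions, not RocksDB code: `RecoveredBatch`, `NextSeqOldScheme` and `NextSeqNewScheme` are hypothetical names, and each batch is assumed to carry its own starting sequence id plus a count of inserts that were actually applied.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a batch read back from the WAL during recovery.
struct RecoveredBatch {
  uint64_t start_seq;    // sequence id recorded with the batch
  uint64_t num_inserts;  // inserts actually applied to the memtable
};

// Old scheme (as described in the summary): anchor on the first seqid seen,
// then count up by one for every insert. Returns the next seqid to assign.
uint64_t NextSeqOldScheme(const std::vector<RecoveredBatch>& batches) {
  if (batches.empty()) return 0;
  uint64_t seq = batches.front().start_seq;
  for (const auto& b : batches) seq += b.num_inserts;
  return seq;
}

// New scheme: re-anchor on the sequence id of each batch being recovered and
// advance by the number of inserts that really took place.
uint64_t NextSeqNewScheme(const std::vector<RecoveredBatch>& batches) {
  uint64_t seq = 0;
  for (const auto& b : batches) seq = b.start_seq + b.num_inserts;
  return seq;
}
```

With batches `{10, 3}` and `{20, 2}` (sequence ids 13 to 19 were consumed by writes that skipped the WAL), the old scheme ends at 15 and would reuse ids, while the new scheme ends at 22.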
  7. 10 Nov, 2016 3 commits
  8. 05 Nov, 2016 1 commit
    • A
      DeleteRange user iterator support · 9e7cf346
      Andrew Kryczka committed
      Summary:
      Note: reviewed in https://reviews.facebook.net/D65115
      
      - DBIter maintains a range tombstone accumulator. We don't clean up obsolete tombstones yet, so if the user seeks back and forth, the same tombstones will be added to the accumulator multiple times.
      - DBImpl::NewInternalIterator() (used to make DBIter's underlying iterator) adds memtable/L0 range tombstones; L1+ range tombstones are added on demand during NewSecondaryIterator() (see D62205).
      - DBIter uses ShouldDelete() when advancing to check whether keys are covered by range tombstones.
      Closes https://github.com/facebook/rocksdb/pull/1464
      
      Differential Revision: D4131753
      
      Pulled By: ajkr
      
      fbshipit-source-id: be86559
      9e7cf346
  9. 04 Nov, 2016 1 commit
    • A
      DeleteRange Get support · f998c979
      Andrew Kryczka committed
      Summary:
      During Get()/MultiGet(), build up a RangeDelAggregator with range
      tombstones as we search through live memtable, immutable memtables, and
      SST files. This aggregator is then used by memtable.cc's SaveValue() and
      GetContext::SaveValue() to check whether keys are covered.
      
      Added tests for Get on memtables/files; the end-to-end tests are mainly in https://reviews.facebook.net/D64761
      Closes https://github.com/facebook/rocksdb/pull/1456
      
      Differential Revision: D4111271
      
      Pulled By: ajkr
      
      fbshipit-source-id: 6e388d4
      f998c979
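The "is this key covered?" check that the aggregator performs during Get() can be sketched with a small stand-alone model. This is only an illustration: `ToyRangeDelAggregator` and `RangeTombstone` are hypothetical types, not the RocksDB classes, and the linear scan ignores snapshots and every optimization. It assumes a tombstone covers the half-open key range [start, end) and hides only entries with a smaller sequence number.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical range tombstone: deletes keys in [start, end) written before seq.
struct RangeTombstone {
  std::string start;
  std::string end;
  uint64_t seq;
};

// Toy aggregator: collects tombstones while the lookup walks memtables and
// files, then answers whether a given key version is covered.
class ToyRangeDelAggregator {
 public:
  void AddTombstone(RangeTombstone t) { tombstones_.push_back(std::move(t)); }

  bool ShouldDelete(const std::string& key, uint64_t key_seq) const {
    for (const auto& t : tombstones_) {
      const bool in_range = key >= t.start && key < t.end;
      if (in_range && key_seq < t.seq) return true;  // older than the tombstone
    }
    return false;
  }

 private:
  std::vector<RangeTombstone> tombstones_;
};
```

A key version written after the tombstone (larger sequence number) stays visible, which is why the check compares sequence numbers as well as key bounds.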
  10. 03 Nov, 2016 1 commit
  11. 01 Nov, 2016 2 commits
  12. 30 Oct, 2016 1 commit
  13. 29 Oct, 2016 1 commit
  14. 25 Oct, 2016 1 commit
  15. 22 Oct, 2016 1 commit
  16. 21 Oct, 2016 1 commit
    • I
      Support IngestExternalFile (remove AddFile restrictions) · 869ae5d7
      Islam AbdelRahman committed
      Summary:
      Changes in this diff:
      
      API changes:
      - Introduce IngestExternalFile to replace AddFile (I think this makes the API clearer)
      - Introduce IngestExternalFileOptions (this struct encapsulates the options for ingesting the external file)
      - Deprecate the AddFile() API
      
      Logic changes:
      - If our file overlaps with the memtable, we will flush the memtable
      - We will find the first level in the LSM tree whose keys overlap with our file's key range
      - We will find the lowest level in the LSM tree above the level found in step 2 that our file can fit in, and ingest our file into it
      - We will assign a global sequence number to our new file
      - Remove AddFile restrictions by using global sequence numbers
      
      Other changes:
      - Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob
      
      Test Plan:
      unit tests (still need to add more)
      addfile_stress (https://reviews.facebook.net/D65037)
      
      Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: jkedgar, hcz, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65061
      869ae5d7
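The level-selection steps listed under "Logic changes" can be sketched as a toy function. This is a simplified model, not the ExternalSstFileIngestionJob logic: `KeyRange` and `PickIngestLevel` are hypothetical names, key ranges are inclusive, and "the file can fit" is assumed to mean only "no key overlap with files in that level".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical inclusive key range of an SST file.
struct KeyRange {
  std::string smallest;
  std::string largest;
};

bool Overlaps(const KeyRange& a, const KeyRange& b) {
  return a.smallest <= b.largest && b.smallest <= a.largest;
}

// levels[0] is L0, levels[1] is L1, and so on. Walk down from L0 and stop at
// the first level that overlaps the incoming file; the file goes into the
// deepest level above that one. If L0 itself overlaps, the toy returns 0.
int PickIngestLevel(const std::vector<std::vector<KeyRange>>& levels,
                    const KeyRange& file) {
  int target = 0;
  for (std::size_t lvl = 0; lvl < levels.size(); ++lvl) {
    bool overlap = false;
    for (const auto& f : levels[lvl]) {
      if (Overlaps(f, file)) {
        overlap = true;
        break;
      }
    }
    if (overlap) break;             // first overlapping level found: stop
    target = static_cast<int>(lvl); // deepest non-overlapping level so far
  }
  return target;
}
```

Placing the file above the first overlapping level keeps newer data above older data for the same keys, which is the ordering the global sequence number then relies on.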
  17. 15 Oct, 2016 2 commits
    • A
      Handle WAL deletion when using avoid_flush_during_recovery · f4705401
      Andrew Kryczka committed
      Summary:
      Previously the WAL files that were avoided during recovery would never
      be considered for deletion. That was because alive_log_files_ was only
      populated when log files are created. This diff further populates
      alive_log_files_ with existing log files that aren't flushed during recovery,
      such that FindObsoleteFiles() can find them later.
      
      Depends on D64053.
      
      Test Plan: new unit test, verifies it fails before this change and passes after
      
      Reviewers: sdong, IslamAbdelRahman, yiwu
      
      Reviewed By: yiwu
      
      Subscribers: leveldb, dhruba, andrewkr
      
      Differential Revision: https://reviews.facebook.net/D64059
      f4705401
    • Y
      Make max_background_compactions and base_background_compactions dynamic changeable · e29d3b67
      Yi Wu committed
      Summary:
      Add DB::SetDBOptions to dynamically change max_background_compactions and base_background_compactions.
      I'll add more dynamically changeable options soon.
      
      Test Plan: unit test.
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64749
      e29d3b67
  18. 14 Oct, 2016 1 commit
    • I
      Fix compaction conflict with running compaction · 5691a1d8
      Islam AbdelRahman committed
      Summary:
      Issue scenario:
      (1) We have 3 files in L1 and we issue a compaction that will compact them into 1 file in L2
      (2) While compaction (1) is running, we flush a file into L0 and trigger another compaction that decides to move this file to L1 and then move it again to L2 (this file doesn't overlap with any other files)
      (3) Compaction (1) finishes and installs the file it generated in L2, but this file overlaps with the file we generated in (2), so we break LSM consistency
      
      It looks like this issue can be triggered by using non-exclusive manual compaction or AddFile()
      
      Test Plan: unit tests
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: hermanlee4, jkedgar, andrewkr, dhruba, yoshinorim
      
      Differential Revision: https://reviews.facebook.net/D64947
      5691a1d8
  19. 13 Oct, 2016 2 commits
  20. 12 Oct, 2016 1 commit
    • A
      new Prev() prefix support using SeekForPrev() · 447f1712
      Aaron Gao committed
      Summary:
      1) The previous solution for Prev() prefix support was not clean.
      Since I added the SeekForPrev() API, Prev() can now be symmetric to Next(),
      and we no longer need to call SeekToLast() in Prev().
      
      Also, Next() will Seek(prefix_seek_key_) to solve the problem of possible inconsistency between db_iter and merge_iter when
      there is a merge_operator. prefix_seek_key_ is only refreshed when the direction changes to forward.
      
      2) This diff also fixes a bug in Iterator::SeekToLast() when iterate_upper_bound_ is used together with a prefix extractor.
      
      Added test cases for the two cases above.
      
      There are some tests for SeekToLast() in Prev(); I will clean them up later.
      
      Test Plan: make all check
      
      Reviewers: IslamAbdelRahman, andrewkr, yiwu, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D63933
      447f1712
  21. 08 Oct, 2016 1 commit
    • R
      Add facility to write only a portion of WriteBatch to WAL · 2c1f9529
      Reid Horuff committed
      Summary:
      When constructing a write batch, a client may now call MarkWalTerminationPoint() on that batch. No batch operations after this call will be written to the WAL, but they will still be inserted into the memtable. This facility is used to remove one of the three WriteImpl calls in 2PC transactions. This produces a ~1% perf improvement.
      
      ```
      RocksDB - unoptimized 2pc, sync_binlog=1, disable_2pc=off
      INFO 2016-08-31 14:30:38,814 [main]: REQUEST PHASE COMPLETED. 75000000 requests done in 2619 seconds. Requests/second = 28628
      
      RocksDB - optimized 2pc , sync_binlog=1, disable_2pc=off
      INFO 2016-08-31 16:26:59,442 [main]: REQUEST PHASE COMPLETED. 75000000 requests done in 2581 seconds. Requests/second = 29054
      ```
      
      Test Plan: Two unit tests added.
      
      Reviewers: sdong, yiwu, IslamAbdelRahman
      
      Reviewed By: yiwu
      
      Subscribers: hermanlee4, dhruba, andrewkr
      
      Differential Revision: https://reviews.facebook.net/D64599
      2c1f9529
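The effect of a WAL termination point on a batch can be modeled with a small stand-alone class. This is a sketch only: `ToyWriteBatch`, `WalOps()` and `MemtableOps()` are hypothetical and do not mirror the internals of RocksDB's WriteBatch; only the method name MarkWalTerminationPoint() is taken from the commit message.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy batch: every operation goes to the memtable, but only the operations
// recorded before the termination point are written to the WAL.
class ToyWriteBatch {
 public:
  void Put(std::string key) { ops_.push_back(std::move(key)); }

  // Remember how many operations exist right now; later ones skip the WAL.
  void MarkWalTerminationPoint() {
    wal_end_ = ops_.size();
    marked_ = true;
  }

  // Operations that would be logged to the WAL.
  std::vector<std::string> WalOps() const {
    const std::size_t n = marked_ ? wal_end_ : ops_.size();
    return std::vector<std::string>(
        ops_.begin(), ops_.begin() + static_cast<std::ptrdiff_t>(n));
  }

  // Operations that would be inserted into the memtable (always all of them).
  const std::vector<std::string>& MemtableOps() const { return ops_; }

 private:
  std::vector<std::string> ops_;
  std::size_t wal_end_ = 0;
  bool marked_ = false;
};
```

If the mark is never set, the toy logs the whole batch, which matches the ordinary behavior the commit leaves unchanged.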
  22. 29 Sep, 2016 1 commit
    • I
      Fix conflict between AddFile() and CompactRange() · 87dfc1d2
      Islam AbdelRahman committed
      Summary:
      Fix the conflict bug between AddFile() and CompactRange() by:
      - Making sure that no AddFile calls are running when asking CompactionPicker to pick a compaction for manual compaction
      - If AddFile() runs after we pick the compaction for the manual compaction, it will be aware of it, since we add the manual compaction to running_compactions_ after picking it
      
      This solves these 2 scenarios:
      - If AddFile() is running, we will wait for it to finish before we pick a compaction for the manual compaction
      - If we have already picked a manual compaction and then AddFile() starts, we ensure that it never ingests a file into a level that will overlap with the manual compaction
      
      Test Plan: unit tests
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, yoshinorim, jkedgar, dhruba
      
      Differential Revision: https://reviews.facebook.net/D64449
      87dfc1d2
  23. 28 Sep, 2016 1 commit
  24. 27 Sep, 2016 1 commit
    • I
      Fix AddFile() conflict with compaction output [WaitForAddFile()] · 5c64fb67
      Islam AbdelRahman committed
      Summary:
      Since AddFile unlocks/locks the mutex inside LogAndApply(), we need to ensure that other compactions cannot run during this period, because such compactions are not aware of the file we are ingesting and could create a compaction that overlaps with this file.
      
      This diff adds:
      - A WaitForAddFile() call that ensures that no AddFile() calls are being processed right now
      - Calls to `WaitForAddFile()` in 3 locations
      -- When doing manual compaction
      -- When starting automatic compaction
      -- When doing CompactFiles()
      
      Test Plan: unit test
      
      Reviewers: lightmark, yiwu, andrewkr, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, yoshinorim, jkedgar, dhruba
      
      Differential Revision: https://reviews.facebook.net/D64383
      5c64fb67
  25. 24 Sep, 2016 2 commits
    • Y
      Split DBOptions into ImmutableDBOptions and MutableDBOptions · 9ed928e7
      Yi Wu committed
      Summary: Use ImmutableDBOptions/MutableDBOptions internally and DBOptions only for user-facing APIs. MutableDBOptions is barely a placeholder for now. I'll start to move options to MutableDBOptions in following diffs.
      
      Test Plan:
        make all check
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64065
      9ed928e7
    • Y
      Recover same sequence id from WAL (#1350) · 4bc8c88e
      yiwu-arbug committed
      Summary:
      Revert the behavior where we don't read the sequence id from the WAL but increase it as we replay the log. We still keep that behavior for 2PC for now but will fix it later.
      
      This change fixes github issue 1339, where some writes come with WAL disabled and we may recover records with wrong sequence id.
      
      Test Plan: Added unit test.
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D64275
      4bc8c88e
  26. 22 Sep, 2016 1 commit
  27. 21 Sep, 2016 1 commit
  28. 20 Sep, 2016 2 commits
    • S
      DBImpl::GetWalPreallocateBlockSize() should return size_t · d78a4401
      sdong committed
      Summary: WritableFile::SetPreallocationBlockSize() takes its parameter as size_t, and the options used in DBImpl::GetWalPreallocateBlockSize() are all size_t. DBImpl::GetWalPreallocateBlockSize() should return size_t to avoid a build break if size_t is not uint64_t.
      
      Test Plan: Run existing tests.
      
      Reviewers: andrewkr, IslamAbdelRahman, yiwu
      
      Reviewed By: yiwu
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D64137
      d78a4401
    • S
      Consider more factors when determining preallocation size of WAL files · b666f854
      sdong committed
      Summary: Currently the WAL file preallocation size is 1.1 * write_buffer_size. This, however, will be over-estimated if options.db_write_buffer_size or options.max_total_wal_size is set and is much smaller.
      
      Test Plan: Add a unit test.
      
      Reviewers: andrewkr, yiwu
      
      Reviewed By: yiwu
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D63957
      b666f854
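One way to picture the adjusted preallocation size is the sketch below. It is an assumption-laden illustration, not the code from this commit: the function name `ToyWalPreallocateBlockSize` is hypothetical, and the summary does not say how the limits are combined, so capping the 1.1 * write_buffer_size estimate by each nonzero limit is a guess at the intent.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Start from roughly 1.1 * write_buffer_size (integer arithmetic), then cap
// the estimate by max_total_wal_size and db_write_buffer_size when they are
// set (a value of 0 is treated as "not set" in this toy).
std::size_t ToyWalPreallocateBlockSize(uint64_t write_buffer_size,
                                       uint64_t max_total_wal_size,
                                       uint64_t db_write_buffer_size) {
  uint64_t bsize = write_buffer_size + write_buffer_size / 10;
  if (max_total_wal_size > 0) {
    bsize = std::min(bsize, max_total_wal_size);
  }
  if (db_write_buffer_size > 0) {
    bsize = std::min(bsize, db_write_buffer_size);
  }
  return static_cast<std::size_t>(bsize);
}
```

With a write buffer of 1000 bytes and no other limits the toy returns 1100; setting max_total_wal_size to 500 lowers it to 500.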
  29. 17 Sep, 2016 1 commit
    • Y
      Remove ColumnFamilyData::options() · 0a88f38b
      Yi Wu committed
      Summary: One more small refactor before I split DBOptions into mutable and immutable parts.
      
      Test Plan: existing unit tests.
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64047
      0a88f38b
  30. 16 Sep, 2016 1 commit
    • A
      Fix recovery for WALs without data for all CFs · 06b4785f
      Andrew Kryczka committed
      Summary:
      If one or more CFs had no data in the WAL, the log number that's used
      by FindObsoleteFiles() wasn't updated. We need to treat this case the same as
      if the data for that WAL had been flushed.
      
      Test Plan: new unit test
      
      Reviewers: IslamAbdelRahman, yiwu, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D63963
      06b4785f
  31. 15 Sep, 2016 1 commit
  32. 14 Sep, 2016 1 commit
    • Y
      Refactor MutableCFOptions · 81747f1b
      Yi Wu committed
      Summary:
      * Change constructor of MutableCFOptions to depend only on ColumnFamilyOptions.
      * Move `max_subcompactions`, `compaction_options_fifo` and `compaction_pri` to ImmutableCFOptions to make it clear that they are immutable.
      
      Test Plan: existing unit tests.
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D63945
      81747f1b
  33. 13 Sep, 2016 1 commit
    • S
      Summary: (#1313) · 9e4aa798
      somnathr committed
      If log recycling is enabled in RocksDB (recycle_log_file_num=16),
       db->Writebatch errors out with keynotfound after ~5-6 hours of running
       (1M seq, but I guess it can happen with any workload). See my detailed bug
       report here (https://github.com/facebook/rocksdb/issues/1303).
       This commit fixes it: a check has been added so that the log file is
       not deleted if it is already in the recycle list.
      
      Test Plan:
       Unit tested it and ran a similar profile. The issue no longer reproduces.
      9e4aa798