1. 27 Apr, 2017 1 commit
  2. 25 Apr, 2017 1 commit
    • A
      Reunite checkpoint and backup core logic · e5e545a0
      Andrew Kryczka committed
      Summary:
      These code paths forked when checkpoint was introduced by copy/pasting the core backup logic. Over time they diverged and bug fixes were sometimes applied to one but not the other (like fix to include all relevant WALs for 2PC), or it required extra effort to fix both (like fix to forge CURRENT file). This diff reunites the code paths by extracting the core logic into a function, CreateCustomCheckpoint(), that is customizable via callbacks to implement both checkpoint and backup.
      
      Related changes:
      
      - flush_before_backup is now forcibly enabled when 2PC is enabled
      - Extracted CheckpointImpl class definition into a header file. This is so the function, CreateCustomCheckpoint(), can be called by internal rocksdb code but not exposed to users.
      - Implemented more functions in DummyDB/DummyLogFile (in backupable_db_test.cc) that are used by CreateCustomCheckpoint().
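
      The callback-based extraction described above can be sketched as follows. This is a minimal illustration, not the RocksDB implementation: `LiveFile`, `FileCallback`, and `CreateCustomCheckpointSketch` are hypothetical names, and the rule that immutable files are linked while mutable ones are copied is an assumption made for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one live file in the DB directory.
struct LiveFile {
  std::string name;
  bool immutable;  // assumption: immutable files can be shared via hard link
};

// Hypothetical callback type: receives a file name, returns true on success.
using FileCallback = std::function<bool(const std::string& fname)>;

// Shared core logic: walk the live files once and let the caller decide how
// each file is materialized. Checkpoint and backup differ only in the
// callbacks they pass in.
bool CreateCustomCheckpointSketch(const std::vector<LiveFile>& live_files,
                                  const FileCallback& link_file_cb,
                                  const FileCallback& copy_file_cb) {
  for (const LiveFile& file : live_files) {
    const FileCallback& cb = file.immutable ? link_file_cb : copy_file_cb;
    if (!cb(file.name)) {
      return false;  // stop at the first failure
    }
  }
  return true;
}
```

      A checkpoint-style caller would pass a callback that hard-links files, while a backup-style caller could pass a copying callback for both arguments; a fix made inside the shared loop then reaches both paths.
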
      Closes https://github.com/facebook/rocksdb/pull/1932
      
      Differential Revision: D4622986
      
      Pulled By: ajkr
      
      fbshipit-source-id: 157723884236ee3999a682673b64f7457a7a0d87
  3. 07 Apr, 2017 3 commits
    • S
      Refactor compaction picker code · ff972870
      Siying Dong committed
      Summary:
      1. Move universal compaction picker to separate files compaction_picker_universal.cc and compaction_picker_universal.h.
      2. Rename some functions to make the code easier to understand.
      3. Move leveled compaction picking code to a dedicated class, so that we don't need to pass some common variables around when calling functions. It also allowed us to break down LevelCompactionPicker::PickCompaction() into smaller functions.
      Closes https://github.com/facebook/rocksdb/pull/2100
      
      Differential Revision: D4845948
      
      Pulled By: siying
      
      fbshipit-source-id: efa0ab4
    • S
      Move various string utility functions into string_util · 343b59d6
      Sagar Vemuri committed
      Summary:
      This is an effort to club all string related utility functions into one common place, in string_util, so that it is easier for everyone to know what string processing functions are available. Right now they seem to be spread out across multiple modules, like logging and options_helper.
      
      Check the sub-commits for easier reviewing.
      Closes https://github.com/facebook/rocksdb/pull/2094
      
      Differential Revision: D4837730
      
      Pulled By: sagar0
      
      fbshipit-source-id: 344278a
    • Y
      Move memtable related files into memtable directory · df6f5a37
      Yi Wu committed
      Summary:
      Move memtable related files into memtable directory.
      Closes https://github.com/facebook/rocksdb/pull/2087
      
      Differential Revision: D4829242
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: ca70ab6
  4. 06 Apr, 2017 2 commits
  5. 05 Apr, 2017 1 commit
  6. 04 Apr, 2017 1 commit
  7. 31 Mar, 2017 1 commit
  8. 04 Mar, 2017 1 commit
  9. 01 Mar, 2017 1 commit
  10. 28 Feb, 2017 1 commit
  11. 24 Feb, 2017 1 commit
  12. 03 Feb, 2017 1 commit
  13. 26 Jan, 2017 1 commit
    • A
      Generalize Env registration framework · 17c11806
      Andrew Kryczka committed
      Summary:
      The Env registration framework supports registering client Envs and selecting which one to instantiate according to a text field. This enabled things like adding the -env_uri argument to db_bench, so the same binary could be reused with different Envs just by changing CLI config.
      
      Now this problem has come up again in a non-Env context, as I want to instantiate a client Statistics implementation from db_bench, which is configured entirely via text parameters. Also, in the future we may wish to use it for deserializing client objects when loading OPTIONS file.
      
      This diff generalizes the Env registration logic to work with arbitrary types.
      
      - Generalized registration and instantiation code by templating them
      - The entire implementation is in a header file as that's Google style guide's recommendation for template definitions
      - Pattern match with std::regex_match rather than checking prefix, which was the previous behavior
      - Rename functions/files to be non-Env-specific
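
      A templated registry that selects a factory with std::regex_match might look like the sketch below. All names here (`RegistrySketch`, `StatisticsSketch`, the `mem://` pattern) are hypothetical and chosen for illustration; the actual RocksDB interface may differ. The code stays in one header-style unit because the registry is a template.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Generic registry: maps a regular expression to a factory for type T.
template <typename T>
class RegistrySketch {
 public:
  using Factory = std::function<std::unique_ptr<T>(const std::string& uri)>;

  void Register(const std::string& pattern, Factory factory) {
    entries_.emplace_back(std::regex(pattern), std::move(factory));
  }

  // Returns an object from the first factory whose pattern matches the
  // whole text, or nullptr when nothing matches.
  std::unique_ptr<T> Create(const std::string& uri) const {
    for (const auto& entry : entries_) {
      if (std::regex_match(uri, entry.first)) {
        return entry.second(uri);
      }
    }
    return nullptr;
  }

 private:
  std::vector<std::pair<std::regex, Factory>> entries_;
};

// Hypothetical client type that is not an Env.
struct StatisticsSketch {
  virtual ~StatisticsSketch() = default;
  virtual std::string Name() const = 0;
};

struct InMemoryStatisticsSketch : StatisticsSketch {
  std::string Name() const override { return "in-memory"; }
};
```

      Because std::regex_match requires the entire string to match, a text such as `prefix-mem://x` is rejected by the pattern `mem://.*`. The old prefix check did not need a wildcard; with regex matching the `.*` has to be spelled out in the pattern.
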
      Closes https://github.com/facebook/rocksdb/pull/1776
      
      Differential Revision: D4421933
      
      Pulled By: ajkr
      
      fbshipit-source-id: 34647d1
  14. 17 Dec, 2016 1 commit
  15. 02 Dec, 2016 1 commit
  16. 30 Nov, 2016 1 commit
  17. 17 Nov, 2016 1 commit
  18. 21 Oct, 2016 1 commit
    • I
      Support IngestExternalFile (remove AddFile restrictions) · 869ae5d7
      Islam AbdelRahman committed
      Summary:
      Changes in the diff
      
      API changes:
      - Introduce IngestExternalFile to replace AddFile (I think this makes the API clearer)
      - Introduce IngestExternalFileOptions (This struct will encapsulate the options for ingesting the external file)
      - Deprecate AddFile() API
      
      Logic changes:
      - If our file overlaps with the memtable we will flush the memtable
      - We will find the first level in the LSM tree whose keys overlap with our file's key range
      - We will find the lowest level in the LSM tree above the level we found in step 2 that our file can fit in, and ingest our file into it
      - We will assign a global sequence number to our new file
      - Remove AddFile restrictions by using global sequence numbers
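
      The level-selection steps above can be sketched with a simplified model. The assumptions here are mine: each level is just a list of inclusive key ranges, "fit" means "does not overlap any file in that level", and the memtable flush and global sequence number assignment are left out. `KeyRange` and `PickIngestLevel` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct KeyRange {
  std::string smallest;  // inclusive
  std::string largest;   // inclusive
};

bool Overlaps(const KeyRange& a, const KeyRange& b) {
  return a.smallest <= b.largest && b.smallest <= a.largest;
}

// levels[0] is the top of the LSM tree. Scans top-down and returns the
// deepest level above the first overlapping level; returns level 0 when the
// overlap is already in level 0, and the last level when nothing overlaps.
int PickIngestLevel(const std::vector<std::vector<KeyRange>>& levels,
                    const KeyRange& file) {
  int target = 0;
  for (size_t level = 0; level < levels.size(); ++level) {
    for (const KeyRange& existing : levels[level]) {
      if (Overlaps(existing, file)) {
        return target;
      }
    }
    target = static_cast<int>(level);  // this level is free of overlap
  }
  return target;
}
```

      With an empty level 0, a file covering a..c in level 1, and a file covering m..p in level 2, an ingested file covering n..o lands in level 1, since level 2 is the first level that overlaps it.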
      
      Other changes:
      - Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob
      
      Test Plan:
      unit tests (still need to add more)
      addfile_stress (https://reviews.facebook.net/D65037)
      
      Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: jkedgar, hcz, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65061
  19. 19 Oct, 2016 1 commit
    • A
      Compaction Support for Range Deletion · 6fbe96ba
      Andrew Kryczka committed
      Summary:
      This diff introduces RangeDelAggregator, which takes ownership of iterators
      provided to it via AddTombstones(). The tombstones are organized in a two-level
      map (snapshot stripe -> begin key -> tombstone). Tombstone creation avoids data
      copy by holding Slices returned by the iterator, which remain valid thanks to pinning.
      
      For compaction, we create a hierarchical range tombstone iterator with structure
      matching the iterator over compaction input data. An aggregator based on that
      iterator is used by CompactionIterator to determine which keys are covered by
      range tombstones. In case of merge operand, the same aggregator is used by
      MergeHelper. Upon finishing each file in the compaction, relevant range tombstones
      are added to the output file's range tombstone metablock and file boundaries are
      updated accordingly.
      
      To check whether a key is covered by range tombstone, RangeDelAggregator::ShouldDelete()
      considers tombstones in the key's snapshot stripe. When this function is used outside of
      compaction, it also checks newer stripes, which can contain covering tombstones. Currently
      the intra-stripe check involves a linear scan; however, in the future we plan to collapse ranges
      within a stripe such that binary search can be used.
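
      The two-level map and the linear intra-stripe check can be illustrated with the sketch below. It is a simplified model under my own assumptions: stripes are keyed by their upper-bound snapshot sequence number, a multimap is used so several tombstones may share a begin key, and keys are compared as plain strings. `RangeTombstoneSketch` and `RangeDelAggregatorSketch` are hypothetical names, not the RocksDB classes.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct RangeTombstoneSketch {
  std::string begin;  // inclusive
  std::string end;    // exclusive
  uint64_t seq;
};

class RangeDelAggregatorSketch {
 public:
  // One stripe per snapshot, plus a final stripe for everything newer.
  explicit RangeDelAggregatorSketch(const std::vector<uint64_t>& snapshots) {
    for (uint64_t snapshot : snapshots) {
      stripes_[snapshot];
    }
    stripes_[std::numeric_limits<uint64_t>::max()];
  }

  void AddTombstone(RangeTombstoneSketch tombstone) {
    auto stripe = stripes_.lower_bound(tombstone.seq);
    std::string begin = tombstone.begin;
    stripe->second.emplace(std::move(begin), std::move(tombstone));
  }

  // A key is covered when a newer tombstone's range contains it. During
  // compaction only the key's own stripe is checked; otherwise newer
  // stripes are checked as well.
  bool ShouldDelete(const std::string& key, uint64_t seq,
                    bool check_newer_stripes) const {
    for (auto stripe = stripes_.lower_bound(seq); stripe != stripes_.end();
         ++stripe) {
      for (const auto& entry : stripe->second) {  // linear scan
        const RangeTombstoneSketch& t = entry.second;
        if (t.seq > seq && t.begin <= key && key < t.end) {
          return true;
        }
      }
      if (!check_newer_stripes) {
        break;
      }
    }
    return false;
  }

 private:
  using Stripe = std::multimap<std::string, RangeTombstoneSketch>;
  std::map<uint64_t, Stripe> stripes_;  // stripe upper bound -> tombstones
};
```

      With a snapshot at sequence number 10 and a tombstone over [a, c) at sequence number 15, a key `b` written at sequence number 5 is kept by the compaction-style check, because the snapshot still needs to see it, but is reported as deleted when newer stripes are also consulted.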
      
      RangeDelAggregator::AddToBuilder() adds all range tombstones in the table's key-range
      to a new table's range tombstone meta-block. Since range tombstones may fall in the gap
      between files, we may need to extend some files' key-ranges. The strategy is (1) first file
      extends as far left as possible and other files do not extend left, (2) all files extend right
      until either the start of the next file or the end of the last range tombstone in the gap,
      whichever comes first.
      
      One other notable change is adding release/move semantics to ScopedArenaIterator
      such that it can be used to transfer ownership of an arena-allocated iterator, similar to
      how unique_ptr is used for malloc'd data.
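
      Release/move semantics for an arena-allocated object can be sketched as a move-only wrapper that runs the destructor but never frees memory, since the arena owns the storage. `ScopedArenaPtrSketch` and `DemoIterator` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical iterator type that counts live instances so that
// destruction can be observed.
struct DemoIterator {
  static int live;
  DemoIterator() { ++live; }
  ~DemoIterator() { --live; }
};
int DemoIterator::live = 0;

// Move-only owner of an object placed in arena memory. On destruction it
// calls the destructor explicitly and leaves the memory to the arena.
template <typename T>
class ScopedArenaPtrSketch {
 public:
  explicit ScopedArenaPtrSketch(T* ptr = nullptr) : ptr_(ptr) {}
  ScopedArenaPtrSketch(const ScopedArenaPtrSketch&) = delete;
  ScopedArenaPtrSketch& operator=(const ScopedArenaPtrSketch&) = delete;

  ScopedArenaPtrSketch(ScopedArenaPtrSketch&& other) noexcept
      : ptr_(other.release()) {}
  ScopedArenaPtrSketch& operator=(ScopedArenaPtrSketch&& other) noexcept {
    if (this != &other) {
      reset(other.release());
    }
    return *this;
  }

  ~ScopedArenaPtrSketch() { reset(); }

  T* get() const { return ptr_; }

  // Gives up ownership without destroying the object.
  T* release() {
    T* released = ptr_;
    ptr_ = nullptr;
    return released;
  }

  // Destroys the current object (if any) and takes ownership of ptr.
  void reset(T* ptr = nullptr) {
    if (ptr_ != nullptr) {
      ptr_->~T();
    }
    ptr_ = ptr;
  }

 private:
  T* ptr_;
};
```

      Moving the wrapper transfers ownership and leaves the source empty, so the object is destroyed exactly once, by whichever wrapper holds it last.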
      
      Depends on D61473
      
      Test Plan: compaction_iterator_test, mock_table, end-to-end tests in D63927
      
      Reviewers: sdong, IslamAbdelRahman, wanning, yhchiang, lightmark
      
      Reviewed By: lightmark
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62205
  20. 30 Sep, 2016 1 commit
  21. 24 Sep, 2016 1 commit
    • Y
      Split DBOptions into ImmutableDBOptions and MutableDBOptions · 9ed928e7
      Yi Wu committed
      Summary: Use ImmutableDBOptions/MutableDBOptions internally and DBOptions only for user-facing APIs. MutableDBOptions is merely a placeholder for now. I'll start to move options to MutableDBOptions in following diffs.
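
      The split between a user-facing options struct and internal immutable/mutable views can be sketched as below. The field names `data_dir` and `background_threads`, and every type name ending in `Sketch`, are hypothetical; which real options end up mutable is not stated in this commit.

```cpp
#include <cassert>
#include <string>

// User-facing options: a single flat struct.
struct UserOptionsSketch {
  std::string data_dir = "/tmp/db";  // hypothetical, fixed after open
  int background_threads = 1;        // hypothetical, changeable at runtime
};

// Internal view of options that never change after open.
struct ImmutableOptionsSketch {
  explicit ImmutableOptionsSketch(const UserOptionsSketch& options)
      : data_dir(options.data_dir) {}
  const std::string data_dir;
};

// Internal view of options that may change while the DB is running.
struct MutableOptionsSketch {
  explicit MutableOptionsSketch(const UserOptionsSketch& options)
      : background_threads(options.background_threads) {}
  int background_threads;
};

class EngineSketch {
 public:
  explicit EngineSketch(const UserOptionsSketch& options)
      : immutable_(options), mutable_(options) {}

  void SetBackgroundThreads(int n) { mutable_.background_threads = n; }

  // Rebuilds the user-facing struct from the two internal views.
  UserOptionsSketch GetOptions() const {
    UserOptionsSketch options;
    options.data_dir = immutable_.data_dir;
    options.background_threads = mutable_.background_threads;
    return options;
  }

 private:
  const ImmutableOptionsSketch immutable_;
  MutableOptionsSketch mutable_;
};
```

      Keeping the immutable view `const` lets internal code read it without synchronization, while updates are confined to the mutable view.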
      
      Test Plan:
        make all check
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64065
  22. 08 Sep, 2016 1 commit
  23. 06 Sep, 2016 1 commit
  24. 03 Sep, 2016 1 commit
  25. 27 Aug, 2016 1 commit
    • I
      Expose ThreadPool under include/rocksdb/threadpool.h · e9b2af87
      Islam AbdelRahman committed
      Summary:
      This diff splits ThreadPool into
      - ThreadPool (abstract interface exposed in include/rocksdb/threadpool.h)
      - ThreadPoolImpl (actual implementation in util/threadpool_imp.h)
      
      This allows us to expose ThreadPool to the user so we can use it as an option later
      
      Test Plan: existing unit tests
      
      Reviewers: andrewkr, yiwu, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62085
  26. 23 Aug, 2016 1 commit
  27. 20 Aug, 2016 1 commit
    • Y
      Introduce ClockCache · 4cc37f59
      Yi Wu committed
      Summary:
      Clock-based cache implementation that aims to have better concurrency than
      the default LRU cache. See inline comments for implementation details.
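
      For readers unfamiliar with the idea, the sketch below shows the textbook CLOCK eviction scheme, not the implementation in this commit. It is single-threaded, new entries start with their reference flag cleared (one of several common variants), and `ClockCacheSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

class ClockCacheSketch {
 public:
  explicit ClockCacheSketch(size_t capacity) : slots_(capacity) {}

  // A hit only sets a flag; no list node is moved as in an LRU list.
  const std::string* Lookup(const std::string& key) {
    auto it = index_.find(key);
    if (it == index_.end()) {
      return nullptr;
    }
    slots_[it->second].referenced = true;
    return &slots_[it->second].value;
  }

  void Insert(const std::string& key, const std::string& value) {
    if (slots_.empty()) {
      return;
    }
    auto it = index_.find(key);
    if (it != index_.end()) {
      slots_[it->second].value = value;
      slots_[it->second].referenced = true;
      return;
    }
    // Sweep the hand: recently used entries get a second chance.
    while (slots_[hand_].occupied && slots_[hand_].referenced) {
      slots_[hand_].referenced = false;
      hand_ = (hand_ + 1) % slots_.size();
    }
    Slot& victim = slots_[hand_];
    if (victim.occupied) {
      index_.erase(victim.key);
    }
    victim.key = key;
    victim.value = value;
    victim.occupied = true;
    victim.referenced = false;
    index_[key] = hand_;
    hand_ = (hand_ + 1) % slots_.size();
  }

 private:
  struct Slot {
    std::string key;
    std::string value;
    bool occupied = false;
    bool referenced = false;
  };

  std::vector<Slot> slots_;
  std::unordered_map<std::string, size_t> index_;  // key -> slot position
  size_t hand_ = 0;
};
```

      Since a lookup only flips a flag instead of relinking a list node, schemes of this kind tend to need less exclusive locking than an LRU list, which is the usual argument for better concurrency.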
      
      Test Plan:
      Update cache_test to run on both LRUCache and ClockCache. Adding some
      new tests to catch some of the bugs that I fixed while implementing the
      cache.
      
      Reviewers: kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61647
  28. 11 Aug, 2016 1 commit
    • S
      [Proof-Of-Concept] RocksDB Blob Storage with a blob log file. · 8b79422b
      sdong committed
      Summary:
      This is a proof of concept of a RocksDB blob log file. The actual value of the Put() is appended to a blob log using normal data block format, and the handle of the block is written as the value of the key in RocksDB.
      
      The prototype only supports Put() and Get(). It doesn't support DB restart, garbage collection, Write() call, iterator, snapshots, etc.
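
      The key-to-handle indirection can be sketched in memory as follows. This is an illustration under stated assumptions: a std::string stands in for the blob log file, a std::map stands in for the LSM tree, the handle is a plain offset/size pair, and the data block format mentioned above is omitted. `BlobHandleSketch` and `BlobDbSketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Locates a value inside the blob log.
struct BlobHandleSketch {
  size_t offset;
  size_t size;
};

class BlobDbSketch {
 public:
  // Appends the value to the log and stores only its handle under the key.
  void Put(const std::string& key, const std::string& value) {
    BlobHandleSketch handle{blob_log_.size(), value.size()};
    blob_log_.append(value);
    index_[key] = handle;
  }

  // Looks up the handle, then reads the value back from the log.
  bool Get(const std::string& key, std::string* value) const {
    auto it = index_.find(key);
    if (it == index_.end()) {
      return false;
    }
    value->assign(blob_log_, it->second.offset, it->second.size);
    return true;
  }

  size_t LogSize() const { return blob_log_.size(); }

 private:
  std::string blob_log_;                           // append-only blob log
  std::map<std::string, BlobHandleSketch> index_;  // key -> handle
};
```

      Overwriting a key leaves the old value in the log, which is why a real design needs the garbage collection that the prototype does not yet provide.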
      
      Test Plan: Add unit tests.
      
      Reviewers: arahut
      
      Reviewed By: arahut
      
      Subscribers: kradhakrishnan, leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D61485
  29. 06 Aug, 2016 2 commits
    • O
      Add time series database (resubmitted) · 44f5cc57
      omegaga committed
      Summary: Implement a time series database that supports DateTieredCompactionStrategy. It wraps a db object and separates SST files into different column families (time windows).
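
      Routing writes into per-window containers can be sketched as below. The assumptions are mine: timestamps are non-negative seconds, windows have a fixed width, and a nested std::map stands in for one column family per window. `DateTieredSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

class DateTieredSketch {
 public:
  explicit DateTieredSketch(int64_t window_seconds)
      : window_seconds_(window_seconds) {}

  // Start of the fixed-width window that contains the timestamp.
  int64_t WindowStart(int64_t timestamp) const {
    return timestamp - timestamp % window_seconds_;
  }

  // Each window has its own key space, standing in for a column family.
  void Put(int64_t timestamp, const std::string& key,
           const std::string& value) {
    windows_[WindowStart(timestamp)][key] = value;
  }

  size_t NumWindows() const { return windows_.size(); }

  size_t NumKeysInWindow(int64_t timestamp) const {
    auto it = windows_.find(WindowStart(timestamp));
    return it == windows_.end() ? 0 : it->second.size();
  }

 private:
  int64_t window_seconds_;
  std::map<int64_t, std::map<std::string, std::string>> windows_;
};
```

      With a one-hour window, writes at timestamps 10 and 3599 share a window, while a write at 3600 opens a new one.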
      
      Test Plan: Add `date_tiered_test`.
      
      Reviewers: dhruba, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61653
    • S
      A utility function to help users migrate DB after options change · 7c4615cf
      sdong committed
      Summary: Add a utility function that triggers the necessary full compaction and puts the output at the correct level by comparing the new options with the old options.
      
      Test Plan: Add unit tests for it.
      
      Reviewers: andrewkr, igor, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: muthu, sumeet, leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D60783
  30. 02 Aug, 2016 2 commits
    • O
      Experiments on column-aware encodings · d51dc96a
      omegaga committed
      Summary:
      Experiments on column-aware encodings. Supported features: 1) extract data blocks from SST file and encode with specified encodings; 2) Decode encoded data back into row format; 3) Directly extract data blocks and write in row format (without prefix encoding); 4) Get column distribution statistics for column format; 5) Dump data blocks separated by columns in human-readable format.
      
      There is still on-going work on this diff. More refactoring is necessary.
      
      Test Plan: Wrote tests in `column_aware_encoding_test.cc`. More tests should be added.
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: arahut, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D60027
    • K
      Persistent Read Cache (part 6) Block Cache Tier Implementation · c116b478
      krad committed
      Summary:
      The patch is a continuation of part 5. It glues together the abstractions for
      file layout and metadata, and fleshes out the implementation of the API. It
      adds unit tests for the implementation.
      
      Test Plan: Run unit tests
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D57549
  31. 26 Jul, 2016 1 commit
  32. 22 Jul, 2016 1 commit
    • Y
      Fix flush not being committed while writing manifest · 32604e66
      Yi Wu committed
      Summary:
      Fix flush not being committed while writing manifest, which is a recent bug introduced by D60075.
      
      The issue:
      # Options.max_background_flushes > 1
      # Background thread A picks up a flush job, flushes, then commits to the manifest. (Note that the mutex is released before writing the manifest.)
      # Background thread B picks up another flush job and flushes. When it gets to `MemTableList::InstallMemtableFlushResults`, it notices another thread is committing, so it quits.
      # After the first commit, thread A doesn't double-check whether there are more flush results that need to be committed, leaving the second flush uncommitted.
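
      The re-check that closes this gap can be sketched with a single-threaded simulation. It is an illustration only: the optional hook stands in for "another thread arrives while the mutex is released", and `FlushResultInstallerSketch` is a hypothetical name rather than the RocksDB code.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class FlushResultInstallerSketch {
 public:
  // while_unlocked simulates work done by another thread during the
  // manifest write, when the mutex is not held.
  void Install(int memtable_id,
               std::function<void()> while_unlocked = nullptr) {
    pending_.push_back(memtable_id);
    if (commit_in_progress_) {
      return;  // the committing thread is expected to pick this up
    }
    commit_in_progress_ = true;
    // Keep looping until no results are pending, so results queued during
    // a manifest write are not left behind.
    while (!pending_.empty()) {
      std::vector<int> batch;
      batch.swap(pending_);
      if (while_unlocked) {
        std::function<void()> hook = std::move(while_unlocked);
        while_unlocked = nullptr;  // run the simulated interleaving once
        hook();
      }
      committed_.insert(committed_.end(), batch.begin(), batch.end());
    }
    commit_in_progress_ = false;
  }

  const std::vector<int>& committed() const { return committed_; }

 private:
  std::vector<int> pending_;
  std::vector<int> committed_;
  bool commit_in_progress_ = false;
};
```

      If the `while` loop were a single pass, the result queued by the second caller would stay in `pending_` forever, which mirrors the uncommitted second flush described above.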
      
      Test Plan: run the test. Also verify that the new test hits a deadlock without the fix.
      
      Reviewers: sdong, igor, lightmark
      
      Reviewed By: lightmark
      
      Subscribers: andrewkr, omegaga, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D60969
  33. 20 Jul, 2016 1 commit
    • K
      Persistent Read Cache (6) Persistent cache tier implementation - File layout · d9cfaa2b
      krad committed
      Summary:
      Persistent cache tier is the tier abstraction that can work for any block
      device mounted on a file system. The design/implementation can
      handle any generic block device.
      
      Support for any generic block device is achieved by generalizing the access pattern as
      {io-size, q-depth, direct-io/buffered}.
      
      We have specifically tested and adapted the IO path for NVM and SSD.
      
      Persistent cache tier consists of three parts:
      
      1) File layout
      
      Provides the implementation for handling IO path for reading and writing data
      (key/value pair).
      
      2) Meta-data
      Provides the implementation for handling the index for persistent read cache.
      
      3) Implementation
      It binds (1) and (2) and fleshes out the PersistentCacheTier interface
      
      This patch provides the implementation for (1) and (2). A follow-up patch will provide (3)
      and tests.
      
      Test Plan: Compile and run check
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D57117
  34. 16 Jul, 2016 1 commit
    • Y
      Refactor cache.cc · 4b952535
      Yi Wu committed
      Summary: Refactor cache.cc so that I can plug in clock cache (D55581). Mainly move `ShardedCache` to a separate file, move `LRUHandle` back to cache.cc, and rename cache.cc to lru_cache.cc.
      
      Test Plan:
          make check -j64
      
      Reviewers: lightmark, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D59655
  35. 13 Jul, 2016 1 commit
    • Y
      Fix deadlock when trying to update options when write stalls · 6ea41f85
      Yi Wu committed
      Summary:
      When writes stall because auto compaction is disabled or the stop-write trigger is reached,
      the user may change these two options to unblock writes. Unfortunately, we had an issue where the write
      thread would block the attempt to persist the options, thus creating a deadlock. This diff
      fixes the issue and adds two test cases to detect such a deadlock.
      
      Test Plan:
      Run unit tests.
      
      Also, revert db_impl.cc to master (but don't revert `DBImpl::BackgroundCompaction:Finish` sync point) and run db_options_test. Both tests should hit deadlock.
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D60627