1. 28 Jul, 2018 1 commit
    • F
      Add DataBlockIndexType option in BlockBasedTableOptions (#4150) · a11df583
      Fenggang Wu committed
      Summary:
      Added DataBlockIndexType option in BlockBasedTableOptions.
      ```
      enum DataBlockIndexType : char {
          kDataBlockBinarySearch = 0, // traditional block type
          kDataBlockHashIndex = 1, // additional hash index appended to the end.
      };
      ```
      The default type is the traditional binary seek option: `kDataBlockBinarySearch`.
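
      A minimal sketch of how an option-string value could be mapped onto this enum. `ParseDataBlockIndexType` is a hypothetical helper written for illustration, not a function quoted from the commit; falling back to the default for unknown names is also an assumption.

      ```cpp
      #include <string>
      #include <unordered_map>

      // Mirrors the enum from the commit message.
      enum DataBlockIndexType : char {
        kDataBlockBinarySearch = 0,  // traditional block type
        kDataBlockHashIndex = 1,     // additional hash index appended to the end
      };

      // Hypothetical helper: resolve a name to the enum value, returning the
      // default (binary search) when the name is not recognized.
      inline DataBlockIndexType ParseDataBlockIndexType(const std::string& name) {
        static const std::unordered_map<std::string, DataBlockIndexType> kByName = {
            {"kDataBlockBinarySearch", kDataBlockBinarySearch},
            {"kDataBlockHashIndex", kDataBlockHashIndex},
        };
        auto it = kByName.find(name);
        return it == kByName.end() ? kDataBlockBinarySearch : it->second;
      }
      ```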
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4150
      
      Differential Revision: D8895958
      
      Pulled By: fgwu
      
      fbshipit-source-id: 480adef48104cf11d30db3bad9a73f98b4a80c10
  2. 17 Jul, 2018 1 commit
  3. 28 Jun, 2018 1 commit
  4. 27 Jun, 2018 1 commit
  5. 22 May, 2018 1 commit
    • Z
      Move prefix_extractor to MutableCFOptions · c3ebc758
      Zhongyi Xie committed
      Summary:
      Currently it is not possible to change the bloom filter config without restarting the DB, which causes a lot of operational complexity for users.
      This PR aims to make it possible to dynamically change bloom filter config.
      Closes https://github.com/facebook/rocksdb/pull/3601
      
      Differential Revision: D7253114
      
      Pulled By: miasantreble
      
      fbshipit-source-id: f22595437d3e0b86c95918c484502de2ceca120c
  6. 15 May, 2018 1 commit
  7. 09 May, 2018 1 commit
  8. 11 Apr, 2018 1 commit
    • A
      fix calling SetOptions on deprecated options · 019d7894
      Andrew Kryczka committed
      Summary:
      In `cf_options_type_info`, the deprecated options are all considered to have offset zero in the `MutableCFOptions` struct. Previously, `GetMutableOptionsFromStrings` did not check whether the provided option was deprecated and simply wrote the provided value to the offset specified by `cf_options_type_info`. That meant setting any deprecated option would overwrite the first element in the struct, which is `write_buffer_size`. `db_stress` hit this often since it calls `SetOptions` with `soft_rate_limit=0` and `hard_rate_limit=0`. Both are deprecated, so `write_buffer_size` was set to zero, which made it crash on the following assertion:
      
      ```
      db_stress: db/memtable.cc:106: rocksdb::MemTable::MemTable(const rocksdb::InternalKeyComparator&, const rocksdb::ImmutableCFOptions&, const rocksdb::MutableCFOptions&, rocksdb::WriteBufferManager*, rocksdb::SequenceNumber, uint32_t): Assertion `!ShouldScheduleFlush()' failed.
      ```
      
      We fix it by skipping deprecated options (and logging a warning) when users provide them to `SetOptions`. I didn't want to fail the call for compatibility reasons.
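
      The failure mode and the fix can be illustrated with a simplified, self-contained sketch. `MutableOptionsSketch`, `OptionInfo`, `kTypeInfoSketch` and `SetOptionSketch` are hypothetical stand-ins, not RocksDB's actual types; only the idea of an offset table whose deprecated entries sit at offset zero is taken from the summary above.

      ```cpp
      #include <cstddef>
      #include <cstdint>
      #include <cstring>
      #include <string>
      #include <unordered_map>

      // Simplified stand-in for MutableCFOptions; only the field order matters.
      struct MutableOptionsSketch {
        uint64_t write_buffer_size = 64 << 20;  // first member, i.e. offset zero
        uint64_t max_bytes_for_level_base = 256 << 20;
      };

      struct OptionInfo {
        size_t offset;    // byte offset of the field inside MutableOptionsSketch
        bool deprecated;  // deprecated entries carry a meaningless offset of zero
      };

      static const std::unordered_map<std::string, OptionInfo> kTypeInfoSketch = {
          {"write_buffer_size",
           {offsetof(MutableOptionsSketch, write_buffer_size), false}},
          {"max_bytes_for_level_base",
           {offsetof(MutableOptionsSketch, max_bytes_for_level_base), false}},
          {"soft_rate_limit", {0, true}},
          {"hard_rate_limit", {0, true}},
      };

      // Returns false for unknown names. Deprecated names are accepted but
      // skipped, so they can no longer clobber the field at offset zero.
      inline bool SetOptionSketch(MutableOptionsSketch* opts,
                                  const std::string& name, uint64_t value) {
        auto it = kTypeInfoSketch.find(name);
        if (it == kTypeInfoSketch.end()) return false;
        if (it->second.deprecated) return true;  // a real fix would also log a warning
        std::memcpy(reinterpret_cast<char*>(opts) + it->second.offset, &value,
                    sizeof(value));
        return true;
      }
      ```

      Without the `deprecated` check, setting `soft_rate_limit` to 0 in this sketch would zero `write_buffer_size`, which is the bug described above.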
      Closes https://github.com/facebook/rocksdb/pull/3700
      
      Differential Revision: D7572596
      
      Pulled By: ajkr
      
      fbshipit-source-id: bd5d84e14c0c39f30c5d4c6df7c1503d2c28ecf1
  9. 06 Apr, 2018 1 commit
    • P
      Support for Column family specific paths. · 446b32cf
      Phani Shekhar Mantripragada committed
      Summary:
      In this change, an option to set different paths for different column families is added.
      This option is set via the cf_paths setting of ColumnFamilyOptions and works in a similar fashion to the db_paths setting. cf_paths is a vector of DbPath values, each of which contains an absolute path and a target size. Multiple levels in a column family can go to different paths if cf_paths has more than one path.
      To maintain backward compatibility, if cf_paths is not specified for a column family, db_paths setting will be used. Note that, if db_paths setting is also not specified, RocksDB already has code to use db_name as the only path.
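
      The fallback order described above can be sketched as follows. `DbPathSketch` and `EffectiveCfPaths` are hypothetical names used only for illustration, and the unlimited target size given to the db_name fallback is an assumption.

      ```cpp
      #include <cstdint>
      #include <limits>
      #include <string>
      #include <vector>

      // Hypothetical stand-in for a (path, target size) pair.
      struct DbPathSketch {
        std::string path;
        uint64_t target_size;
      };

      // Picks cf_paths if set, otherwise db_paths, otherwise db_name alone.
      inline std::vector<DbPathSketch> EffectiveCfPaths(
          const std::vector<DbPathSketch>& cf_paths,
          const std::vector<DbPathSketch>& db_paths, const std::string& db_name) {
        if (!cf_paths.empty()) return cf_paths;
        if (!db_paths.empty()) return db_paths;
        std::vector<DbPathSketch> fallback;
        // Assumption: the single fallback path has no size limit.
        fallback.push_back({db_name, std::numeric_limits<uint64_t>::max()});
        return fallback;
      }
      ```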
      
      Changes :
      1) A new member "cf_paths" is added to ImmutableCFOptions. It is set based on the cf_paths setting of ColumnFamilyOptions and the db_paths setting of ImmutableDBOptions. This member is used to identify the path information whenever files are accessed.
      2) Validation checks are added for cf_paths setting based on existing checks for db_paths setting.
      3) DestroyDB, PurgeObsoleteFiles etc. are edited to support multiple cf_paths.
      4) Unit tests are added appropriately.
      Closes https://github.com/facebook/rocksdb/pull/3102
      
      Differential Revision: D6951697
      
      Pulled By: ajkr
      
      fbshipit-source-id: 60d2262862b0a8fd6605b09ccb0da32bb331787d
  10. 03 Apr, 2018 1 commit
    • S
      Level Compaction with TTL · 04c11b86
      Sagar Vemuri committed
      Summary:
      Level Compaction with TTL.
      
      As of today, a file could exist in the LSM tree without going through the compaction process for a really long time if there are no updates to the data in the file's key range. For example, in certain use cases, the keys are not actually "deleted"; instead they are just set to empty values. There might not be any more writes to this "deleted" key range, and if so, such data could remain in the LSM for a really long time resulting in wasted space.
      
      Introducing a TTL could solve this problem. Files (and, in turn, data) older than TTL will be scheduled for compaction when there is no other background work. This will make the data go through the regular compaction process and get rid of old unwanted data.
      This also has the (good) side-effect of all the data in the non-bottommost level being newer than ttl, and all data in the bottommost level older than ttl. It could lead to more writes while reducing space.
      
      This functionality can be controlled by the newly introduced column family option -- ttl.
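
      As a rough illustration of the selection rule, the sketch below marks files older than the TTL as compaction candidates. `FileMetaSketch` and `FilesExpiredByTtl` are hypothetical names, and treating `ttl == 0` as "disabled" is an assumption rather than something stated in this summary.

      ```cpp
      #include <cstdint>
      #include <vector>

      // Hypothetical per-file metadata: file number and creation time in seconds.
      struct FileMetaSketch {
        uint64_t number;
        uint64_t creation_time;
      };

      // Returns the numbers of files whose creation time is older than now - ttl.
      inline std::vector<uint64_t> FilesExpiredByTtl(
          const std::vector<FileMetaSketch>& files, uint64_t now, uint64_t ttl) {
        std::vector<uint64_t> expired;
        // Assumption: ttl == 0 disables the feature. Also guard the subtraction.
        if (ttl == 0 || now < ttl) return expired;
        const uint64_t cutoff = now - ttl;
        for (const auto& f : files) {
          if (f.creation_time < cutoff) expired.push_back(f.number);
        }
        return expired;
      }
      ```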
      
      TODO for later:
      - Make ttl mutable
      - Extend TTL to Universal compaction as well? (TTL is already supported in FIFO)
      - Maybe deprecate CompactionOptionsFIFO.ttl in favor of this new ttl option.
      Closes https://github.com/facebook/rocksdb/pull/3591
      
      Differential Revision: D7275442
      
      Pulled By: sagar0
      
      fbshipit-source-id: dcba484717341200d419b0953dafcdf9eb2f0267
  11. 23 Mar, 2018 1 commit
  12. 14 Mar, 2018 1 commit
  13. 24 Jan, 2018 2 commits
  14. 12 Dec, 2017 1 commit
  15. 29 Nov, 2017 1 commit
    • P
      Support for block_cache num_shards and other config via option string. · 4b65cfc7
      Phani Shekhar Mantripragada committed
      Summary:
      Problem: Option string accepts only cache_size as parameter for block_cache which is specified as "block_cache=1M".
      It doesn't accept other parameters like num_shards etc.
      
      Changes :
      1) ParseBlockBasedTableOption in block_based_table_factory is edited to accept cache options in the format "block_cache=<cache_size>:<num_shard_bits>:<strict_capacity_limit>:<high_pri_pool_ratio>".
      Options other than cache_size are optional to maintain backward compatibility. The changes are valid for block_cache_compressed as well.
      For example, "block_cache=1M:6:true:0.5", "block_cache=1M:6:true", "block_cache=1M:6" and "block_cache=1M" are all valid option strings.
      
      2) Corresponding unit tests are added.
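
      The colon-separated format in 1) could be parsed along these lines. This is a standalone sketch, not the code from `ParseBlockBasedTableOption`: `CacheSpecSketch`, `ParseSizeSketch` and `ParseBlockCacheSketch` are hypothetical names, the defaults for omitted fields are assumptions, and malformed numbers simply throw from the standard library.

      ```cpp
      #include <cstddef>
      #include <cstdint>
      #include <sstream>
      #include <string>
      #include <vector>

      struct CacheSpecSketch {
        uint64_t capacity = 0;
        int num_shard_bits = -1;             // assumed default when omitted
        bool strict_capacity_limit = false;  // assumed default when omitted
        double high_pri_pool_ratio = 0.0;    // assumed default when omitted
      };

      // Parses sizes such as "1M"; only K/M/G suffixes are handled here.
      inline uint64_t ParseSizeSketch(const std::string& s) {
        size_t pos = 0;
        uint64_t n = std::stoull(s, &pos);
        if (pos < s.size()) {
          switch (s[pos]) {
            case 'k': case 'K': n <<= 10; break;
            case 'm': case 'M': n <<= 20; break;
            case 'g': case 'G': n <<= 30; break;
            default: break;
          }
        }
        return n;
      }

      // Splits "<cache_size>:<num_shard_bits>:<strict_capacity_limit>:
      // <high_pri_pool_ratio>" on ':'; every field after the first is optional.
      inline CacheSpecSketch ParseBlockCacheSketch(const std::string& value) {
        std::vector<std::string> parts;
        std::stringstream ss(value);
        for (std::string tok; std::getline(ss, tok, ':');) parts.push_back(tok);
        CacheSpecSketch spec;
        if (parts.size() > 0) spec.capacity = ParseSizeSketch(parts[0]);
        if (parts.size() > 1) spec.num_shard_bits = std::stoi(parts[1]);
        if (parts.size() > 2) spec.strict_capacity_limit = (parts[2] == "true");
        if (parts.size() > 3) spec.high_pri_pool_ratio = std::stod(parts[3]);
        return spec;
      }
      ```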
      Closes https://github.com/facebook/rocksdb/pull/3108
      
      Differential Revision: D6420997
      
      Pulled By: sagar0
      
      fbshipit-source-id: cdea8b785688d2802907974af27225ccc1c0cd43
  16. 18 Nov, 2017 1 commit
  17. 17 Nov, 2017 1 commit
  18. 02 Nov, 2017 1 commit
    • M
      Added support for differential snapshots · 7fe3b328
      Mikhail Antonov committed
      Summary:
      The motivation for this PR is to add support to RocksDB for differential (incremental) snapshots, that is, a snapshot of the DB changes between two points in time (one can think of it as a diff between two sequence numbers, or the diff D, which can be thought of as an SST file or just a set of KVs that can be applied at sequence number S1 to get the database to its state at sequence number S2).
      
      This feature would be useful for various distributed storage layers built on top of RocksDB, as it should help reduce the resources (time and network bandwidth) needed to recover and rebuild DB instances as replicas in the context of distributed storage.
      
      From the API standpoint, that would look like a client app requesting an iterator between (start seqnum) and the current DB state, and reading the "diff".
      
      This is a very early draft PR for initial review and discussion of the approach; I'm going to rework some parts and keep updating the PR.
      
      For now, what's done here according to initial discussions:
      
      Preserving deletes:
       - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot, it wouldn't get dropped by a compaction. This is done by adding a new param to Options (preserve deletes flag) and a new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion.
       - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; I assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp <--> seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum.
       - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum.
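
      The cutoff rule from the bullets above reduces to a small predicate. `CanDropTombstoneSketch` is a hypothetical function written for illustration; it is not the code in compaction_iterator.cc, and the parameter names are assumptions.

      ```cpp
      #include <cstdint>

      // Decides whether a compaction may drop a delete marker.
      //   otherwise_droppable: the tombstone is eligible under the normal rules
      //   preserve_deletes:    the new option flag is enabled
      //   seqnum / cutoff:     tombstones with seqnum > cutoff must be kept
      inline bool CanDropTombstoneSketch(bool otherwise_droppable,
                                         bool preserve_deletes, uint64_t seqnum,
                                         uint64_t cutoff) {
        if (!otherwise_droppable) return false;
        if (preserve_deletes && seqnum > cutoff) return false;
        return true;
      }
      ```

      Advancing the cutoff (the new API call mentioned above) makes older tombstones droppable again without the DB having to track timestamps itself.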
      
      Iterator changes:
       - A couple of params were added to ReadOptions to optionally allow the client to request internal keys instead of user keys (so that the client can get the latest value of a key, be it a delete marker or a put), as well as a min timestamp and a min seqnum.
      
      TableCache changes:
       - I modified the table_cache code to be able to quickly exclude SST files from the iterator's heap if creation_time on the file is less than iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with a more or less long iterator time span.
      
      What's left:
      
       - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type.
      Closes https://github.com/facebook/rocksdb/pull/2999
      
      Differential Revision: D6175602
      
      Pulled By: mikhail-antonov
      
      fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900
  19. 01 Nov, 2017 1 commit
  20. 20 Oct, 2017 1 commit
    • S
      Make FIFO compaction options dynamically configurable · f0804db7
      Sagar Vemuri committed
      Summary:
      ColumnFamilyOptions::compaction_options_fifo and all its sub-fields can be set dynamically now.
      
      Some of the ways in which the fifo compaction options can be set are:
      - `SetOptions({{"compaction_options_fifo", "{max_table_files_size=1024}"}})`
      - `SetOptions({{"compaction_options_fifo", "{ttl=600;}"}})`
      - `SetOptions({{"compaction_options_fifo", "{max_table_files_size=1024;ttl=600;}"}})`
      - `SetOptions({{"compaction_options_fifo", "{max_table_files_size=51;ttl=49;allow_compaction=true;}"}})`
      
      Most of the code has been made generic enough so that it could be reused later to make universal options (and other such nested defined-types) dynamic with very few lines of parsing/serializing code changes.
      Introduced a few new functions like `ParseStruct`, `SerializeStruct` and `GetStringFromStruct`.
      The duplicate code in `GetStringFromDBOptions` and `GetStringFromColumnFamilyOptions` has been moved into `GetStringFromStruct`. So they become just simple wrappers now.
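
      To make the nested-struct string format concrete, here is a standalone sketch that splits a value such as `{max_table_files_size=1024;ttl=600;}` into name/value pairs. `ParseStructSketch` is a hypothetical function, not the `ParseStruct` introduced by this commit; it does no whitespace trimming or type conversion.

      ```cpp
      #include <cstddef>
      #include <map>
      #include <string>

      // Parses "{k1=v1;k2=v2;}" into a map. Returns an empty map when the
      // surrounding braces are missing; fields without '=' are ignored.
      inline std::map<std::string, std::string> ParseStructSketch(std::string s) {
        std::map<std::string, std::string> out;
        if (s.size() < 2 || s.front() != '{' || s.back() != '}') return out;
        s = s.substr(1, s.size() - 2);  // strip the braces
        size_t start = 0;
        while (start < s.size()) {
          size_t end = s.find(';', start);
          if (end == std::string::npos) end = s.size();  // last field, no ';'
          const std::string field = s.substr(start, end - start);
          const size_t eq = field.find('=');
          if (eq != std::string::npos && eq > 0) {
            out[field.substr(0, eq)] = field.substr(eq + 1);
          }
          start = end + 1;
        }
        return out;
      }
      ```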
      Closes https://github.com/facebook/rocksdb/pull/3006
      
      Differential Revision: D6058619
      
      Pulled By: sagar0
      
      fbshipit-source-id: 1e8f78b3374ca5249bb4f3be8a6d3bb4cbc52f92
  21. 05 Oct, 2017 1 commit
    • M
      Allow upgrades from nullptr to some merge operator · 88ed1f6e
      Manuel Ung committed
      Summary:
      Currently, RocksDB does not allow a preexisting DB that was created with no merge operator to be reopened with a merge operator defined. This means that if a DB ever wants to add a merge operator, there is currently no way to do so.
      
      Fix this by adding a new verification type `kByNameAllowFromNull` which will allow old values to be nullptr, and new values to be non-nullptr.
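
      A minimal sketch of the relaxed check, assuming options are compared by their serialized names and that an unset pointer option is stored as the string "nullptr". `VerificationSketch` and `NamesCompatible` are hypothetical; only the `kByNameAllowFromNull` name comes from the summary above.

      ```cpp
      #include <string>

      enum class VerificationSketch { kByName, kByNameAllowFromNull };

      // persisted: name stored with the existing DB; current: name being opened.
      inline bool NamesCompatible(VerificationSketch v, const std::string& persisted,
                                  const std::string& current) {
        if (persisted == current) return true;
        // Only the upgrade direction (nullptr -> something) is tolerated.
        return v == VerificationSketch::kByNameAllowFromNull &&
               persisted == "nullptr";
      }
      ```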
      Closes https://github.com/facebook/rocksdb/pull/2958
      
      Differential Revision: D5961131
      
      Pulled By: lth
      
      fbshipit-source-id: 06179bebd0d90db3d43690b5eb7345e2d5bab1eb
  22. 28 Sep, 2017 1 commit
    • Q
      Make bytes_per_sync and wal_bytes_per_sync mutable · 6a541afc
      Quinn Jarrell committed
      Summary:
      SUMMARY
      Moves the bytes_per_sync and wal_bytes_per_sync options from immutable options to mutable options. Also, if wal_bytes_per_sync is changed, the WAL file and memtables are flushed.
      TEST PLAN
      ran make check
      all passed
      
      Two new tests SetBytesPerSync, SetWalBytesPerSync check that after issuing setoptions with a new value for the var, the db options have the new value.
      Closes https://github.com/facebook/rocksdb/pull/2893
      
      Reviewed By: yiwu-arbug
      
      Differential Revision: D5845814
      
      Pulled By: TheRushingWookie
      
      fbshipit-source-id: 93b52d779ce623691b546679dcd984a06d2ad1bd
  23. 29 Jul, 2017 1 commit
    • S
      Replace dynamic_cast<> · 21696ba5
      Siying Dong committed
      Summary:
      Replace dynamic_cast<> so that users can choose to build with RTTI off, saving several bytes per object and freeing up a tiny bit more memory.
      Some nontrivial changes:
      1. Add Comparator::GetRootComparator() to get around the internal comparator hack
      2. Add the two experimental functions to DB
      3. Add TableFactory::GetOptionString() to avoid unnecessary casting to get the option string
      4. Since 3 is done, move the parsing option functions for table factory to table factory files too, to be symmetric.
      Closes https://github.com/facebook/rocksdb/pull/2645
      
      Differential Revision: D5502723
      
      Pulled By: siying
      
      fbshipit-source-id: fd13cec5601cf68a554d87bfcf056f2ffa5fbf7c
  24. 22 Jul, 2017 2 commits
  25. 16 Jul, 2017 1 commit
  26. 14 Jun, 2017 1 commit
    • S
      Allow ignoring unknown options when loading options from a file · 89ad9f3a
      Sagar Vemuri committed
      Summary:
      Added a flag, `ignore_unknown_options`, to skip unknown options when loading an options file (using `LoadLatestOptions`/`LoadOptionsFromFile`) or while verifying options (using `CheckOptionsCompatibility`). This will help in downgrading the db to an older version.
      
      Also added `--ignore_unknown_options` flag to ldb
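
      The effect of the flag can be sketched with a toy loader over name/value pairs. `LoadResultSketch` and `LoadOptionsSketch` are hypothetical and unrelated to the real `LoadLatestOptions` signature; the error text is only loosely modeled on the ldb output shown in the test plan.

      ```cpp
      #include <map>
      #include <set>
      #include <string>

      struct LoadResultSketch {
        bool ok = true;
        std::string error;
        std::map<std::string, std::string> applied;  // options actually loaded
      };

      // Applies every known option. An unknown option either fails the load or,
      // when ignore_unknown_options is true, is silently skipped.
      inline LoadResultSketch LoadOptionsSketch(
          const std::map<std::string, std::string>& file_options,
          const std::set<std::string>& known, bool ignore_unknown_options) {
        LoadResultSketch result;
        for (const auto& kv : file_options) {
          if (known.count(kv.first) == 0) {
            if (ignore_unknown_options) continue;
            result.ok = false;
            result.error = "Unrecognized option: " + kv.first;
            return result;
          }
          result.applied[kv.first] = kv.second;
        }
        return result;
      }
      ```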
      
      **Example Use case:**
      In MyRocks, when copying from a newer version to an older version, it is often impossible to start because of new RocksDB options that don't exist in the older version, even though the data format is compatible.
      MyRocks uses these load and verify functions in [ha_rocksdb.cc::check_rocksdb_options_compatibility](https://github.com/facebook/mysql-5.6/blob/e004fd9f416821d043ccc8ad4a345c33ac9953f0/storage/rocksdb/ha_rocksdb.cc#L3348-L3401).
      
      **Test Plan:**
      Updated the unit tests.
      `make check`
      
      ldb:
      $ ./ldb --db=/tmp/test_db --create_if_missing put a1 b1
      OK
      
      Now edit /tmp/test_db/<OPTIONS-file> and add an unknown option.
      
      Try loading the options now, and it fails:
      $ ./ldb --db=/tmp/test_db --try_load_options get a1
      Failed: Invalid argument: Unrecognized option DBOptions:: abcd
      
      Passes with the new --ignore_unknown_options flag
      $ ./ldb --db=/tmp/test_db --try_load_options --ignore_unknown_options get a1
      b1
      Closes https://github.com/facebook/rocksdb/pull/2423
      
      Differential Revision: D5212091
      
      Pulled By: sagar0
      
      fbshipit-source-id: 2ec17636feb47dc0351b53a77e5f15ef7cbf2ca7
  27. 25 May, 2017 1 commit
    • A
      Introduce max_background_jobs mutable option · bb01c188
      Andrew Kryczka committed
      Summary:
      - `max_background_flushes` and `max_background_compactions` are still supported for backwards compatibility
      - `base_background_compactions` is completely deprecated. Now we just throttle to one background compaction when there's no pressure.
      - `max_background_jobs` is added to automatically partition the concurrent background jobs into flushes vs compactions. Currently it's very simple as we just allocate one-fourth of the jobs to flushes, and the remaining can be used for compactions.
      - The test cases that set `base_background_compactions > 1` needed to be updated. I just grab the pressure token such that the desired number of compactions can be scheduled.
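
      The one-fourth split can be written down as a short function. `BgJobLimitsSketch` and `PartitionBackgroundJobs` are hypothetical names, and keeping at least one slot for each job type is an assumption added here so that small values still make progress.

      ```cpp
      #include <algorithm>

      struct BgJobLimitsSketch {
        int max_flushes;
        int max_compactions;
      };

      // One-fourth of the jobs go to flushes; the remainder go to compactions.
      inline BgJobLimitsSketch PartitionBackgroundJobs(int max_background_jobs) {
        BgJobLimitsSketch limits;
        // Assumption: never let either kind of job drop to zero slots.
        limits.max_flushes = std::max(1, max_background_jobs / 4);
        limits.max_compactions =
            std::max(1, max_background_jobs - limits.max_flushes);
        return limits;
      }
      ```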
      Closes https://github.com/facebook/rocksdb/pull/2205
      
      Differential Revision: D4937461
      
      Pulled By: ajkr
      
      fbshipit-source-id: df52cbbd497e13bbc9a60560a5ac2a2526b3f1f9
  28. 18 May, 2017 1 commit
    • M
      Support ingest_behind for IngestExternalFile · ba685a47
      Mikhail Antonov committed
      Summary:
      First cut for early review; there are a few conceptual points to answer and some code structure issues.
      
      For conceptual points -
      
       - restriction-wise, we're going to disallow ingest_behind if (use_seqno_zero_out=true || disable_auto_compaction=false); the user is responsible for properly opening and closing the DB with the required params
       - we wanted to ingest into a reserved bottommost level. Should we fail fast if the bottom level isn't empty, or should we attempt to ingest if the file fits there key-range-wise?
       - Modifying AssignLevelForIngestedFile seems to be the place where we'd handle that.
      
      On code structure - going to refactor GenerateAndAddExternalFile call in the test class to allow passing instance of IngestionOptions, that's just going to incur lots of changes at callsites.
      Closes https://github.com/facebook/rocksdb/pull/2144
      
      Differential Revision: D4873732
      
      Pulled By: lightmark
      
      fbshipit-source-id: 81cb698106b68ef8797f564453651d50900e153a
  29. 04 May, 2017 1 commit
    • L
      Max open files mutable · e7ae4a3a
      Leonidas Galanis committed
      Summary:
      Makes the max_open_files DB option dynamically settable via SetDBOptions. During the SetDBOptions call we call SetCapacity on the table cache, which is an LRUCache.
      Closes https://github.com/facebook/rocksdb/pull/2185
      
      Differential Revision: D4979189
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: ca7e8dc5e3619c79434f579be4847c0f7e56afda
  30. 28 Apr, 2017 1 commit
  31. 22 Apr, 2017 2 commits
  32. 14 Apr, 2017 1 commit
    • A
      change use_direct_writes to use_direct_io_for_flush_and_compaction · 44fa8ece
      Aaron Gao committed
      Summary:
      Replace Options::use_direct_writes with Options::use_direct_io_for_flush_and_compaction
      Now if Options::use_direct_io_for_flush_and_compaction = true, we will enable direct I/O for both reads and writes in flush and compaction jobs, whereas Options::use_direct_reads controls user reads such as iterators and Get().
      Closes https://github.com/facebook/rocksdb/pull/2117
      
      Differential Revision: D4860912
      
      Pulled By: lightmark
      
      fbshipit-source-id: d93575a8a5e780cf7e40797287edc425ee648c19
  33. 07 Apr, 2017 1 commit
    • S
      Move various string utility functions into string_util · 343b59d6
      Sagar Vemuri committed
      Summary:
      This is an effort to club all string related utility functions into one common place, in string_util, so that it is easier for everyone to know what string processing functions are available. Right now they seem to be spread out across multiple modules, like logging and options_helper.
      
      Check the sub-commits for easier reviewing.
      Closes https://github.com/facebook/rocksdb/pull/2094
      
      Differential Revision: D4837730
      
      Pulled By: sagar0
      
      fbshipit-source-id: 344278a
  34. 06 Apr, 2017 1 commit
  35. 21 Mar, 2017 1 commit
  36. 03 Mar, 2017 1 commit
  37. 24 Feb, 2017 1 commit