1. 30 Apr, 2016 1 commit
    • Added EventListener::OnTableFileCreationStarted() callback · a92049e3
      Yi Wu committed
      Summary: Added EventListener::OnTableFileCreationStarted. EventListener::OnTableFileCreated will also be called when file creation fails. Users can check the creation status via TableFileCreationInfo::status.
      
      Test Plan: unit test.
      
      Reviewers: dhruba, yhchiang, ott, sdong
      
      Reviewed By: sdong
      
      Subscribers: sdong, kradhakrishnan, IslamAbdelRahman, andrewkr, yhchiang, leveldb, ott, dhruba
      
      Differential Revision: https://reviews.facebook.net/D56337
  2. 28 Apr, 2016 1 commit
    • Shared dictionary compression using reference block · 843d2e31
      Andrew Kryczka committed
      Summary:
      This adds a new metablock containing a shared dictionary that is used
      to compress all data blocks in the SST file. The size of the shared dictionary
      is configurable in CompressionOptions and defaults to 0. It's currently only
      used for zlib/lz4/lz4hc, but the block will be stored in the SST regardless of
      the compression type if the user chooses a nonzero dictionary size.
      
      During compaction, computes the dictionary by randomly sampling the first
      output file in each subcompaction. It pre-computes the intervals to sample
      by assuming the output file will have the maximum allowable length. In case
      the file is smaller, some of the pre-computed sampling intervals can be beyond
      end-of-file, in which case we skip over those samples and the dictionary will
      be a bit smaller. After the dictionary is generated using the first file in a
      subcompaction, it is loaded into the compression library before writing each
      block in each subsequent file of that subcompaction.
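
The interval pre-computation and end-of-file handling described above can be sketched roughly as follows. This is a standalone illustration, not RocksDB's code: the helper names `PickSampleOffsets` and `BuildDictionary`, the fixed sample length, and the use of `std::mt19937_64` are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: pick sorted sample start offsets up front, assuming the
// output file will grow to max_file_size bytes.
std::vector<uint64_t> PickSampleOffsets(uint64_t max_file_size,
                                        std::size_t dict_size,
                                        std::size_t sample_len,
                                        uint64_t seed) {
  assert(sample_len > 0 && max_file_size >= sample_len);
  std::mt19937_64 rng(seed);
  std::uniform_int_distribution<uint64_t> dist(0, max_file_size - sample_len);
  std::vector<uint64_t> offsets(dict_size / sample_len);
  for (uint64_t& off : offsets) off = dist(rng);
  std::sort(offsets.begin(), offsets.end());
  return offsets;
}

// Hypothetical helper: concatenate the samples that actually exist in the
// file. Offsets that would run past end-of-file are skipped, so a short file
// yields a smaller dictionary.
std::string BuildDictionary(const std::string& file_data,
                            const std::vector<uint64_t>& offsets,
                            std::size_t sample_len) {
  std::string dict;
  for (uint64_t off : offsets) {
    if (off + sample_len > file_data.size()) continue;
    dict.append(file_data, static_cast<std::size_t>(off), sample_len);
  }
  return dict;
}
```

Sorting the offsets lets a writer consume them in one forward pass over the file as it is produced; samples may overlap in this simplified version.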
      
      On the read path, gets the dictionary from the metablock, if it exists. Then,
      loads that dictionary into the compression library before reading each block.
      
      Test Plan: new unit test
      
      Reviewers: yhchiang, IslamAbdelRahman, cyan, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, yoshinorim, kradhakrishnan, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D52287
  3. 21 Apr, 2016 2 commits
  4. 19 Apr, 2016 1 commit
    • Delete deprecated *BackupableDB interface for backups · 40b840f2
      Andrew Kryczka committed
      Summary:
      This interface is redundant and has been deprecated for a while.
      It's also unused internally. Let's delete it.
      
      I moved the comments to the corresponding functions in BackupEngine/
      BackupEngineReadOnly. This caused the diff tool to not work cleanly.
      
      Test Plan:
      unit tests
      
        $ ./backupable_db_test
      
      Reviewers: yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D56331
  5. 16 Apr, 2016 1 commit
  6. 14 Apr, 2016 1 commit
  7. 01 Apr, 2016 1 commit
    • Change some RocksDB default options · 2feafa3d
      sdong committed
      Summary: Change some RocksDB default options to make it more friendly to server workloads.
      
      Test Plan: Run all existing tests
      
      Reviewers: yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: sumeet, muthu, benj, MarkCallaghan, igor, leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D55941
  8. 13 Mar, 2016 1 commit
  9. 12 Mar, 2016 1 commit
    • Aggregate hot Iterator counters in LocalStatistics (DBIter::Next perf regression) · 580fede3
      Islam AbdelRahman committed
      Summary:
      This patch bumps the counters on the hot code path DBIter::Next() / DBIter::Prev() in local data members and sends them to Statistics when the iterator is destroyed.
      A better solution would be a thread_local implementation of Statistics.
      
      New performance
      ```
      readseq      :       0.035 micros/op 28597881 ops/sec; 3163.7 MB/s
           1,851,568,819      stalled-cycles-frontend   #   31.29% frontend cycles idle    [49.86%]
             884,929,823      stalled-cycles-backend    #   14.95% backend  cycles idle    [50.21%]
      readreverse  :       0.071 micros/op 14077393 ops/sec; 1557.3 MB/s
           3,239,575,993      stalled-cycles-frontend   #   27.36% frontend cycles idle    [49.96%]
           1,558,253,983      stalled-cycles-backend    #   13.16% backend  cycles idle    [50.14%]
      
      ```
      
      Existing performance
      
      ```
      readreverse  :       0.174 micros/op 5732342 ops/sec;  634.1 MB/s
          20,570,209,389      stalled-cycles-frontend   #   70.71% frontend cycles idle    [50.01%]
          18,422,816,837      stalled-cycles-backend    #   63.33% backend  cycles idle    [50.04%]
      
      readseq      :       0.119 micros/op 8400537 ops/sec;  929.3 MB/s
          15,634,225,844      stalled-cycles-frontend   #   79.07% frontend cycles idle    [49.96%]
          14,227,427,453      stalled-cycles-backend    #   71.95% backend  cycles idle    [50.09%]
      ```
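
The local-counter approach from this commit's summary can be sketched as below. It is a minimal standalone model, not the actual DBIter code: `SharedStats` and `CountingIter` are hypothetical stand-ins for the Statistics object and the iterator.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Stand-in for a shared Statistics object; every update is an atomic
// read-modify-write that can contend across threads.
struct SharedStats {
  std::atomic<uint64_t> next_count{0};
  std::atomic<uint64_t> bytes_read{0};
};

// Hypothetical iterator that keeps its hot-path counters in plain members and
// publishes them to the shared object once, on destruction.
class CountingIter {
 public:
  explicit CountingIter(SharedStats* stats) : stats_(stats) {
    assert(stats != nullptr);
  }
  CountingIter(const CountingIter&) = delete;
  CountingIter& operator=(const CountingIter&) = delete;

  ~CountingIter() {
    stats_->next_count.fetch_add(local_next_, std::memory_order_relaxed);
    stats_->bytes_read.fetch_add(local_bytes_, std::memory_order_relaxed);
  }

  // Hot path: only touches non-atomic, iterator-local state.
  void Next(uint64_t entry_bytes) {
    ++local_next_;
    local_bytes_ += entry_bytes;
  }

 private:
  SharedStats* stats_;
  uint64_t local_next_ = 0;
  uint64_t local_bytes_ = 0;
};
```

The trade-off in this sketch is that the shared counters lag behind until the iterator goes away, in exchange for no atomic traffic per `Next()` call.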
      
      Test Plan: unit tests
      
      Reviewers: yhchiang, sdong, igor
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D55107
  10. 11 Mar, 2016 1 commit
    • Cache to have an option to fail Cache::Insert() when full · f71fc77b
      Yi Wu committed
      Summary:
      Add an option for Cache to fail Cache::Insert() when the cache is full. Update call sites to check the status and handle the error.
      
      I am not sure what the correct behavior is for each call site when it encounters an error. Please let me know if you see something wrong or if more unit tests are needed.
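
As a rough illustration of an insert that can fail when the cache is full, here is a toy cache. It is not the rocksdb::Cache interface: the class `TinyCache`, the `fail_when_full` flag, and the boolean return value are assumptions for this sketch, and eviction is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical miniature cache that charges each entry by its value size.
class TinyCache {
 public:
  TinyCache(std::size_t capacity, bool fail_when_full)
      : capacity_(capacity), fail_when_full_(fail_when_full) {
    assert(capacity > 0);
  }

  // Returns false, leaving the cache unchanged, when the entry would push
  // usage over capacity and fail_when_full is set.
  bool Insert(const std::string& key, const std::string& value) {
    auto it = entries_.find(key);
    const std::size_t old_charge =
        (it == entries_.end()) ? 0 : it->second.size();
    const std::size_t new_usage = usage_ - old_charge + value.size();
    if (fail_when_full_ && new_usage > capacity_) return false;
    entries_[key] = value;
    usage_ = new_usage;
    return true;
  }

  std::size_t usage() const { return usage_; }

 private:
  std::size_t capacity_;
  bool fail_when_full_;
  std::size_t usage_ = 0;
  std::unordered_map<std::string, std::string> entries_;
};
```

With such an option, every call site has to check the result of `Insert()` and decide what to do on failure, for example serving the value without caching it.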
      
      Test Plan: make check -j32, see tests pass.
      
      Reviewers: anthony, yhchiang, andrewkr, IslamAbdelRahman, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D54705
  11. 02 Mar, 2016 1 commit
  12. 01 Mar, 2016 1 commit
    • Introduce Iterator::GetProperty() and replace Iterator::IsKeyPinned() · 1f595414
      sdong committed
      Summary:
      Add Iterator::GetProperty(), a way for users to communicate with the iterator, and replace Iterator::IsKeyPinned() with it.
      As a follow-up, I'll add a property for the version number attached to the iterator.
      
      Test Plan: Rerun existing tests and add a negative test case.
      
      Reviewers: yhchiang, andrewkr, kradhakrishnan, anthony, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D54783
  13. 12 Feb, 2016 1 commit
    • Add a new compaction priority that picks file whose overlapping ratio is smallest · 92a9ccf1
      sdong committed
      Summary:
      Add a new compaction priority as follows:
      For every file, we calculate the total size of the files in the next level that overlap with it, divided by the file's own size. The file with the smallest ratio is picked first.
      My "db_bench --fillrandom" run shows about 5% less compaction than kOldestSmallestSeqFirst when a --hard_pending_compaction_bytes_limit value is set to keep the LSM tree in shape. Without limiting hard_pending_compaction_bytes_limit, the improvement is only 1% or 2%.
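
The ratio-based choice described in this summary can be sketched as follows. The `FileInfo` struct and `PickMinOverlapRatio` function are hypothetical names for illustration; the real picker works on RocksDB's internal file metadata.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct FileInfo {
  uint64_t size;               // bytes in this file
  uint64_t overlapping_bytes;  // total bytes of overlapping next-level files
};

// Returns the index of the file with the smallest
// overlapping_bytes / size ratio (first one wins on ties).
std::size_t PickMinOverlapRatio(const std::vector<FileInfo>& files) {
  assert(!files.empty());
  std::size_t best = 0;
  double best_ratio = 0.0;
  for (std::size_t i = 0; i < files.size(); ++i) {
    assert(files[i].size > 0);
    const double ratio = static_cast<double>(files[i].overlapping_bytes) /
                         static_cast<double>(files[i].size);
    if (i == 0 || ratio < best_ratio) {
      best = i;
      best_ratio = ratio;
    }
  }
  return best;
}
```

A small ratio means that compacting the file rewrites relatively little next-level data per byte moved down, which is why it is attractive to pick first.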
      
      Test Plan: Add a unit test
      
      Reviewers: andrewkr, kradhakrishnan, anthony, IslamAbdelRahman, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D54075
  14. 10 Feb, 2016 1 commit
    • Allows Get and MultiGet to read directly from SST files. · 4a8cbf4e
      Yueh-Hsuan Chiang committed
      Summary:
      Add kSstFileTier to ReadTier, which allows Get and MultiGet to
      read only directly from SST files and skip mem-tables.
      
          kSstFileTier = 0x2      // data in SST files.
                                // Note that this ReadTier currently only supports
                                // Get and MultiGet and does not support iterators.
      
      Test Plan: add new test in db_test.
      
      Reviewers: anthony, IslamAbdelRahman, rven, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: igor, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D53511
  15. 06 Feb, 2016 1 commit
    • Update version to 4.5 · 73a9b0f4
      sdong committed
      Summary: Time to cut branch for release 4.5. Change the versions.
      
      Test Plan: Not needed
      
      Reviewers: IslamAbdelRahman, yhchiang, kradhakrishnan, andrewkr, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D53883
  16. 29 Jan, 2016 1 commit
  17. 27 Jan, 2016 1 commit
    • Disable stats about mutex duration by default · d20915d5
      sdong committed
      Summary: Measuring mutex duration means measuring time while inside the DB mutex, which goes against our best practice. Add a stats level to the Statistics class. By default, measuring mutex operations is disabled.
      
      Test Plan: Add a unit test to make sure it is off by default.
      
      Reviewers: rven, anthony, IslamAbdelRahman, kradhakrishnan, andrewkr, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D53367
  18. 26 Jan, 2016 2 commits
    • ldb to support --column_family option · 38e1d7fe
      sdong committed
      Summary:
      Add a --column_family option, so that users can query or update a specific column family.
      Also add a create column family parameter to make unit testing easier.
      Unit tests still need to be added.
      
      Test Plan: Will add a test case in ldb python test.
      
      Reviewers: yhchiang, rven, andrewkr, IslamAbdelRahman, kradhakrishnan, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D53265
    • Add a perf context level that doesn't measure time for mutex operations · fb9811ee
      sdong committed
      Summary: Timing mutex operations can impact the scalability of the system. Add a new perf context level that measures all time counters except those for mutex operations.
      
      Test Plan: Add a new unit test case to make sure it is not set.
      
      Reviewers: IslamAbdelRahman, rven, kradhakrishnan, yhchiang, anthony
      
      Reviewed By: anthony
      
      Subscribers: MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D53199
  19. 15 Jan, 2016 1 commit
  20. 07 Jan, 2016 1 commit
    • Make ldb automagically determine the file type and use the correct dumping function · b1a3b4c0
      Gunnar Kudrjavets committed
      Summary:
      This set of changes implements the following design: `ldb` accepts a `--path` parameter that specifies a file name. The tool then applies a heuristic to determine how to output the data properly. The design decision is not to probe the file content, but to use the file name to determine which dumping function to call.
      
      Usage examples:
      
      Understands that path points to a manifest file and dumps it.
      `./ldb --path=/tmp/test_db/MANIFEST-000023 dump`
      
      Understands that path points to a WAL file and dumps it.
      `./ldb --path=/tmp/test_db/000024.log dump --header`
      
      Understands that path points to a SST file and dumps it.
      `./ldb --path=/tmp/test_db/000007.sst dump`
      
      Figures out that none of the supported file types are applicable and outputs
      an appropriate error message.
      `./ldb --path=/tmp/cron.log dump`
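
A name-only heuristic consistent with the four usage examples above might look like the sketch below. `DumpKind` and `GuessDumpKind` are hypothetical names, and the exact rules (a `MANIFEST-<digits>` name, or a numeric stem with a `.log` or `.sst` extension) are assumptions inferred from those examples rather than ldb's actual implementation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class DumpKind { kManifest, kWal, kSst, kUnknown };

// True when s is non-empty and consists only of decimal digits.
static bool AllDigits(const std::string& s) {
  if (s.empty()) return false;
  for (unsigned char c : s) {
    if (!std::isdigit(c)) return false;
  }
  return true;
}

// Hypothetical name-based classifier; the file content is never opened.
DumpKind GuessDumpKind(const std::string& path) {
  const std::size_t slash = path.find_last_of('/');
  const std::string name =
      (slash == std::string::npos) ? path : path.substr(slash + 1);

  const std::string kManifestPrefix = "MANIFEST-";
  if (name.compare(0, kManifestPrefix.size(), kManifestPrefix) == 0 &&
      AllDigits(name.substr(kManifestPrefix.size()))) {
    return DumpKind::kManifest;
  }

  // WAL and SST files are expected to look like <digits>.<extension>.
  const std::size_t dot = name.find_last_of('.');
  if (dot == std::string::npos || !AllDigits(name.substr(0, dot))) {
    return DumpKind::kUnknown;
  }
  const std::string ext = name.substr(dot + 1);
  if (ext == "log") return DumpKind::kWal;
  if (ext == "sst") return DumpKind::kSst;
  return DumpKind::kUnknown;
}
```

Requiring a numeric stem is what makes `/tmp/cron.log` fall through to the unknown case even though it ends in `.log`.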
      
      Test Plan:
      Basics:
      
      git diff
      make clean
      make -j 32 commit-prereq
      arc lint
      
      More specific testing (done as part of commit-prereq, but can be iterated separately when making isolated changes):
      
      make clean
      make ldb
      python tools/ldb_test.py
      make rocksdb_dump
      make rocksdb_undump
      sh tools/rocksdb_dump_test.sh
      
      Reviewers: rven, IslamAbdelRahman, yhchiang, kradhakrishnan, anthony, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D52269
  21. 24 Dec, 2015 2 commits
    • Change default options.delayed_write_rate · 15b89022
      sdong committed
      Summary: We now have a mechanism to further slow down writes. Double the default options.delayed_write_rate to keep the default behavior closer to what it used to be.
      
      Test Plan: Run all tests.
      
      Reviewers: IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: yhchiang, kradhakrishnan, rven, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D52281
    • When slowdown is triggered, reduce the write rate · b9f77ba1
      sdong committed
      Summary: It's usually hard for users to set a value of options.delayed_write_rate. With this diff, after slowdown condition triggers, we greedily reduce write rate if estimated pending compaction bytes increase. If estimated compaction pending bytes drop, we increase the write rate.
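
The greedy adjustment described in this summary can be modeled with a small controller like the one below. The class name, the 4/5 and 5/4 step factors, and the explicit minimum rate are assumptions picked for the sketch; the commit message does not state the actual constants.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical controller: lowers the write rate while the estimated pending
// compaction bytes keep growing, and raises it again when they shrink.
class WriteRateController {
 public:
  WriteRateController(uint64_t max_rate, uint64_t min_rate,
                      uint64_t initial_pending_bytes)
      : max_rate_(max_rate),
        min_rate_(min_rate),
        rate_(max_rate),
        prev_pending_bytes_(initial_pending_bytes) {
    assert(min_rate > 0 && min_rate <= max_rate);
  }

  // Call each time the slowdown condition is re-evaluated; returns the write
  // rate (bytes per second) to apply next.
  uint64_t Update(uint64_t pending_bytes) {
    if (pending_bytes > prev_pending_bytes_) {
      rate_ = std::max(min_rate_, rate_ / 5 * 4);  // backlog grew: slow down
    } else if (pending_bytes < prev_pending_bytes_) {
      rate_ = std::min(max_rate_, rate_ / 4 * 5);  // backlog shrank: speed up
    }
    prev_pending_bytes_ = pending_bytes;
    return rate_;
  }

 private:
  uint64_t max_rate_;
  uint64_t min_rate_;
  uint64_t rate_;
  uint64_t prev_pending_bytes_;
};
```

In this model `max_rate` plays the role of the user-supplied options.delayed_write_rate, so a value that is too generous is corrected automatically instead of leading to a write stop.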
      
      Test Plan:
      Add a unit test
      Test with db_bench setting:
      TEST_TMPDIR=/dev/shm/ ./db_bench --benchmarks=fillrandom -num=10000000 --soft_pending_compaction_bytes_limit=1000000000 --hard_pending_compaction_bytes_limit=3000000000 --delayed_write_rate=100000000
      
      and make sure without the commit, write stop will happen, but with the commit, it will not happen.
      
      Reviewers: igor, anthony, rven, yhchiang, kradhakrishnan, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D52131
  22. 23 Dec, 2015 1 commit
  23. 18 Dec, 2015 1 commit
    • Slowdown when writing to the last write buffer · d72b3177
      sdong committed
      Summary: Currently, if inserting into the memtable is much faster than writing to files, there is no mechanism users can rely on to avoid write stops caused by reaching options.max_write_buffer_number. With this commit, if more than four maximum write buffers are configured, we slow down to the rate of options.delayed_write_rate when we reach the last one.
      
      Test Plan:
      1. Add a new unit test.
      2. Run db_bench with
      
      ./db_bench --benchmarks=fillrandom --num=10000000 --max_background_flushes=6 --batch_size=32 -max_write_buffer_number=4 --delayed_write_rate=500000 --statistics
      
      based on hard drive and see stopping is avoided with the commit.
      
      Reviewers: yhchiang, IslamAbdelRahman, anthony, rven, kradhakrishnan, igor
      
      Reviewed By: igor
      
      Subscribers: MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D52047
  24. 11 Dec, 2015 1 commit
  25. 10 Dec, 2015 2 commits
    • Deprecate options.soft_rate_limit and add options.soft_pending_compaction_bytes_limit · 56e77f09
      sdong committed
      Summary: Deprecate options.soft_rate_limit, which is hard to tune, in favor of options.soft_pending_compaction_bytes_limit, which triggers the slowdown if the estimated pending compaction bytes exceed the threshold. The hope is to make tuning more straightforward.
      
      Test Plan: Modify DBTest.SoftLimit to cover options.soft_pending_compaction_bytes_limit instead; run all unit tests.
      
      Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, igor, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D51117
    • A new compaction picking priority that optimizes for write amplification for random updates. · d6e1035a
      sdong committed
      Summary: Introduce a compaction picking priority that picks the files containing the oldest rows to compact. This mode slightly improves write amplification for random update cases.
      
      Test Plan: Add a unit test and run it in valgrind too.
      
      Reviewers: yhchiang, anthony, IslamAbdelRahman, rven, kradhakrishnan, MarkCallaghan, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D51459
  26. 09 Dec, 2015 1 commit
    • Updating HISTORY.md · 188170fb
      krad committed
      Summary: Added 4.3.0 version
      
      Test Plan:
      
      Reviewers:
      
      CC: leveldb@
      
      Task ID: #9298965
      
      Blame Rev:
  27. 01 Dec, 2015 1 commit
    • DB to only flush the column family with the largest memtable while... · db320b1b
      sdong committed
      DB to only flush the column family with the largest memtable while option.db_write_buffer_size is hit
      
      Summary: When option.db_write_buffer_size is hit, we currently flush all column families. Change this to flush only the column family with the largest active memtable instead. This way, we can avoid creating too many small files in some cases.
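
The selection rule can be illustrated with a small standalone function. `ColumnFamilyState` and `PickColumnFamilyToFlush` are hypothetical names, and treating the limit as "hit" once the summed active memtable sizes reach it is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct ColumnFamilyState {
  std::string name;
  uint64_t active_memtable_bytes;
};

// Returns the column family with the largest active memtable once the total
// across all column families reaches db_write_buffer_size, or nullptr while
// the total is still below the limit.
const ColumnFamilyState* PickColumnFamilyToFlush(
    const std::vector<ColumnFamilyState>& cfs, uint64_t db_write_buffer_size) {
  uint64_t total = 0;
  const ColumnFamilyState* largest = nullptr;
  for (const ColumnFamilyState& cf : cfs) {
    total += cf.active_memtable_bytes;
    if (largest == nullptr ||
        cf.active_memtable_bytes > largest->active_memtable_bytes) {
      largest = &cf;
    }
  }
  return total >= db_write_buffer_size ? largest : nullptr;
}
```

Flushing only the largest memtable frees the most memory per flush and leaves the small memtables to keep growing, which is how the small-file problem is avoided.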
      
      Test Plan: Modify test DBTest.SharedWriteBuffer to work with the updated behavior
      
      Reviewers: kradhakrishnan, yhchiang, rven, anthony, IslamAbdelRahman, igor
      
      Reviewed By: igor
      
      Subscribers: march, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D51291
  28. 21 Nov, 2015 1 commit
  29. 17 Nov, 2015 1 commit
  30. 13 Nov, 2015 1 commit
    • Add CheckOptionsCompatibility() API to options_util · d781da81
      Yueh-Hsuan Chiang committed
      Summary:
      Add CheckOptionsCompatibility() API to options_util that returns
      Status::OK if the input DBOptions and ColumnFamilyDescriptors
      are compatible with the latest options stored in the specified DB path.
      
      Test Plan: Added tests in options_util_test
      
      Reviewers: igor, anthony, IslamAbdelRahman, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D50649
  31. 12 Nov, 2015 1 commit
    • Add OptionsUtil::LoadOptionsFromFile() API · e11f676e
      Yueh-Hsuan Chiang committed
      Summary:
      This patch adds OptionsUtil::LoadOptionsFromFile() and
      OptionsUtil::LoadLatestOptionsFromDB(), which allow developers
      to construct DBOptions and ColumnFamilyOptions from a RocksDB
      options file.  Note that most pointer-typed options such as
      merge_operator will not be constructed.
      
      With this API, developers no longer need to remember all the
      options in order to reopen an existing rocksdb instance like
      the following:
      
        DBOptions db_options;
        std::vector<std::string> cf_names;
        std::vector<ColumnFamilyOptions> cf_opts;
      
        // Load primitive-typed options from an existing DB
        OptionsUtil::LoadLatestOptionsFromDB(
            dbname, &db_options, &cf_names, &cf_opts);
      
        // Initialize necessary pointer-typed options
        cf_opts[0].merge_operator.reset(new MyMergeOperator());
        ...
      
        // Construct the vector of ColumnFamilyDescriptor
        std::vector<ColumnFamilyDescriptor> cf_descs;
        for (size_t i = 0; i < cf_opts.size(); ++i) {
          cf_descs.emplace_back(cf_names[i], cf_opts[i]);
        }
      
        // Open the DB
        DB* db = nullptr;
        std::vector<ColumnFamilyHandle*> cf_handles;
        auto s = DB::Open(db_options, dbname, cf_descs,
                          &cf_handles, &db);
      
      Test Plan:
      Augment existing tests in column_family_test
      options_test
      db_test
      
      Reviewers: igor, IslamAbdelRahman, sdong, anthony
      
      Reviewed By: anthony
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D49095
  32. 11 Nov, 2015 1 commit
    • Enable RocksDB to persist Options file. · e114f0ab
      Yueh-Hsuan Chiang committed
      Summary:
      This patch allows rocksdb to persist options into a file on
      DB::Open, SetOptions, and Create / Drop ColumnFamily.
      Options files are created under the same directory as the rocksdb
      instance.
      
      In addition, this patch also adds a fail_if_missing_options_file in DBOptions
      that makes any function call return non-ok status when it is not able to
      persist options properly.
      
        // If true, then DB::Open / CreateColumnFamily / DropColumnFamily
        // / SetOptions will fail if options file is not detected or properly
        // persisted.
        //
        // DEFAULT: false
        bool fail_if_missing_options_file;
      
      Options file names are formatted as OPTIONS-<number>, and RocksDB
      will always keep the latest two options files.
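
The "keep the latest two" retention rule for OPTIONS-<number> files could be implemented along these lines. The function names and the idea of working from a plain list of directory entries are assumptions for this sketch, not the code in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Parses "OPTIONS-<number>"; returns false for any other file name.
static bool ParseOptionsFileName(const std::string& name, uint64_t* number) {
  const std::string kPrefix = "OPTIONS-";
  if (name.size() <= kPrefix.size() ||
      name.compare(0, kPrefix.size(), kPrefix) != 0) {
    return false;
  }
  uint64_t n = 0;
  for (std::size_t i = kPrefix.size(); i < name.size(); ++i) {
    if (name[i] < '0' || name[i] > '9') return false;
    n = n * 10 + static_cast<uint64_t>(name[i] - '0');
  }
  *number = n;
  return true;
}

// Hypothetical cleanup step: returns the options files that are older than
// the `keep` most recent ones, oldest first.
std::vector<std::string> ObsoleteOptionsFiles(
    const std::vector<std::string>& dir_entries, std::size_t keep = 2) {
  std::vector<std::pair<uint64_t, std::string>> found;
  for (const std::string& name : dir_entries) {
    uint64_t number = 0;
    if (ParseOptionsFileName(name, &number)) found.emplace_back(number, name);
  }
  std::sort(found.begin(), found.end());
  std::vector<std::string> obsolete;
  for (std::size_t i = 0; i + keep < found.size(); ++i) {
    obsolete.push_back(found[i].second);
  }
  return obsolete;
}
```

Keeping two files rather than one means a valid older options file still exists if the newest one is only partially written.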
      
      Test Plan:
      Add options_file_test.
      
      options_test
      column_family_test
      
      Reviewers: igor, IslamAbdelRahman, sdong, anthony
      
      Reviewed By: anthony
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48285
  33. 05 Nov, 2015 1 commit
  34. 04 Nov, 2015 2 commits
    • Add Memory Insight support to utilities · 7d7ee2b6
      Yueh-Hsuan Chiang committed
      Summary:
      This patch introduces utilities/memory, which currently includes
      GetApproximateMemoryUsageByType that reports different types of
      rocksdb memory usage given a list of input DBs.
      
      The API also takes care of the case where Cache could be shared
      across multiple column families / multiple db instances.
      
      Currently, it reports memory usage of memtable, table-readers
      and cache.
      
      Test Plan: utilities/memory/memory_test.cc
      
      Reviewers: igor, anthony, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D49257
    • Add GetAggregatedIntProperty(): returns the aggregated value from all CFs · 3ecbab00
      Yueh-Hsuan Chiang committed
      Summary:
      This patch adds GetAggregatedIntProperty() that returns the aggregated
      value from all CFs
      
      Test Plan: Added a test in db_test
      
      Reviewers: igor, sdong, anthony, IslamAbdelRahman, rven
      
      Reviewed By: rven
      
      Subscribers: rven, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D49497
  35. 30 Oct, 2015 1 commit
    • Clean and expose CreateLoggerFromOptions · 2872e0c8
      Islam AbdelRahman committed
      Summary:
      CreateLoggerFromOptions has some parameters, such as db_log_dir and env, that are redundant since they already exist in DBOptions.
      
      This patch removes the redundant parameters and exposes CreateLoggerFromOptions to users.
      
      Test Plan: make check
      
      Reviewers: igor, anthony, yhchiang, rven, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, hermanlee4
      
      Differential Revision: https://reviews.facebook.net/D49713