1. 08 Jan 2014, 3 commits
    • Don't always compress L0 files written by memtable flush · 50994bf6
      Committed by Mark Callaghan
      Summary:
      The code always compressed L0 files written by a memtable flush
      when compression was enabled. Now the L0 output of a flush is compressed
      only when min_level_to_compress=0 for leveled compaction, or when
      universal_compaction_size_percent=-1 for universal compaction.
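
      For context, a minimal sketch of the public knob involved on the leveled
      side: db_bench's min_level_to_compress flag translates, roughly and as an
      illustration only (this is not code from the diff), into
      Options::compression_per_level:

         int min_level_to_compress = 0;  // 0 = compress every level, incl. L0
         rocksdb::Options options;
         options.compression = rocksdb::kSnappyCompression;
         // Levels below min_level_to_compress stay uncompressed; the rest use
         // the configured compression type.
         options.compression_per_level.resize(options.num_levels);
         for (int level = 0; level < options.num_levels; ++level) {
           options.compression_per_level[level] =
               (level < min_level_to_compress) ? rocksdb::kNoCompression
                                               : rocksdb::kSnappyCompression;
         }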
      
      Task ID: #3416472
      
      Test Plan:
      ran db_bench with compression options
      
      Reviewers: dhruba, igor, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14757
    • [column families] Implement DB::OpenWithColumnFamilies() · 72918eff
      Committed by Igor Canadi
      Summary:
      In addition to implementing OpenWithColumnFamilies, this diff also includes some minor changes:
      * Changed all column family names from Slice() to std::string. The performance of column family name handling is not critical, and it's more convenient and cleaner to have names as std::strings
      * Implemented ColumnFamilyOptions(const Options&) and DBOptions(const Options&)
      * Added ColumnFamilyOptions to VersionSet::ColumnFamilyData. ColumnFamilyOptions are specified on OpenWithColumnFamilies() and CreateColumnFamily()
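
      As a usage sketch (only the function name comes from this diff; the
      descriptor and handle types and the exact signature are my assumptions
      for illustration):

         std::vector<rocksdb::ColumnFamilyDescriptor> column_families;
         column_families.push_back({"default", rocksdb::ColumnFamilyOptions()});
         column_families.push_back({"metadata", rocksdb::ColumnFamilyOptions()});
         std::vector<rocksdb::ColumnFamilyHandle*> handles;
         rocksdb::DB* db;
         rocksdb::Status s = rocksdb::DB::OpenWithColumnFamilies(
             rocksdb::DBOptions(), "/tmp/db", column_families, &handles, &db);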
      
      I will keep the diff on Phabricator for a day or two and then push it to the branch. Feel free to comment even after the diff has been pushed.
      
      Test Plan: Added a simple unit test
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15033
    • Fix a deadlock in CompactRange() · 9f690ec6
      Committed by Tomislav Novak
      Summary:
      The way DBImpl::TEST_CompactRange() throttles down the number of bg compactions
      can cause it to deadlock when CompactRange() is called concurrently from
      multiple threads. Imagine the following scenario with only two threads
      (max_background_compactions is 10 and bg_compaction_scheduled_ is initially 0):
      
         1. Thread #1 increments bg_compaction_scheduled_ (to LargeNumber), sets
            bg_compaction_scheduled_ to 9 (newvalue), schedules the compaction
            (bg_compaction_scheduled_ is now 10) and waits for it to complete.
         2. Thread #2 calls TEST_CompactRange(), increments bg_compaction_scheduled_
            (now LargeNumber + 10) and waits on a cv for bg_compaction_scheduled_ to
            drop to LargeNumber.
         3. BG thread completes the first manual compaction, decrements
            bg_compaction_scheduled_ and wakes up all threads waiting on bg_cv_.
            Thread #1 runs, increments bg_compaction_scheduled_ by LargeNumber again
            (now 2*LargeNumber + 9). Since that's more than LargeNumber + newvalue,
            thread #2 also goes to sleep (waiting on bg_cv_), without resetting
            bg_compaction_scheduled_.
      
      This diff attempts to address the problem by introducing a new counter
      bg_manual_only_ (when positive, MaybeScheduleFlushOrCompaction() will only
      schedule manual compactions).
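
      A condensed sketch of that gating (simplified from the description above;
      HasPendingManualCompaction() is a hypothetical stand-in for the actual
      check):

         void DBImpl::MaybeScheduleFlushOrCompaction() {
           mutex_.AssertHeld();
           if (bg_compaction_scheduled_ >= options_.max_background_compactions) {
             return;  // thread pool already saturated
           }
           // While a manual compaction is waiting (bg_manual_only_ > 0), refuse
           // to schedule automatic work, so the waiting thread cannot be starved
           // or double-counted by the LargeNumber trick.
           if (bg_manual_only_ > 0 && !HasPendingManualCompaction()) {
             return;
           }
           bg_compaction_scheduled_++;
           env_->Schedule(&DBImpl::BGWorkCompaction, this);
         }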
      
      Test Plan:
      I could pretty much consistently reproduce the deadlock with a program that
      calls CompactRange(nullptr, nullptr) immediately after Write() from multiple
      threads. This no longer happens with this patch.
      
      Tests (make check) pass.
      
      Reviewers: dhruba, igor, sdong, haobo
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14799
2. 03 Jan 2014, 1 commit
3. 02 Jan 2014, 1 commit
    • Support multi-threaded DisableFileDeletions() and EnableFileDeletions() · b60c14f6
      Committed by Igor Canadi
      Summary:
      We don't want two threads to clash if they concurrently call DisableFileDeletions() and EnableFileDeletions(). I'm adding a counter that will enable file deletions only after all DisableFileDeletions() calls have been negated with EnableFileDeletions().
      
      However, we also don't want to break the old behavior, so I added a parameter force to EnableFileDeletions(). If force is true, we will still enable file deletions after every call to EnableFileDeletions(), which is what is happening now.
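
      A rough sketch of the counter scheme (illustrative only, not the exact
      code from this diff):

         void DBImpl::DisableFileDeletions() {
           MutexLock l(&mutex_);
           ++disable_delete_obsolete_files_;  // one nesting level per caller
         }

         void DBImpl::EnableFileDeletions(bool force) {
           MutexLock l(&mutex_);
           if (force) {
             disable_delete_obsolete_files_ = 0;            // old behavior
           } else if (disable_delete_obsolete_files_ > 0) {
             --disable_delete_obsolete_files_;
           }
           if (disable_delete_obsolete_files_ == 0) {
             // every DisableFileDeletions() has been negated;
             // obsolete files may be purged again
           }
         }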
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, sanketh
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14781
4. 21 Dec 2013, 1 commit
    • [RocksDB] Optimize locking for Get · 1fdb3f7d
      Committed by Igor Canadi
      Summary:
      Instead of locking and saving a DB state, we can cache a DB state and update it only when it changes. This change reduces lock contention and speeds up read operations on the DB.
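
      As a hedged sketch of the pattern (SuperVersion is the name the cached
      state ended up with; the details here are illustrative):

         // Get() pins the cached state under the mutex (a cheap refcount bump),
         // then reads the memtables and current Version without the mutex.
         SuperVersion* sv;
         {
           MutexLock l(&mutex_);
           sv = super_version_;  // replaced only when the DB state changes
           sv->Ref();
         }
         // ... look up the key in sv->mem, sv->imm and sv->current ...
         if (sv->Unref()) {
           MutexLock l(&mutex_);
           sv->Cleanup();        // last reader frees the stale state
         }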
      
      Performance improvements are substantial, although there is some cost in no-read workloads. I ran the regression tests on my devserver and here are the numbers:
      
        overwrite                    56345  ->   63001
        fillseq                      193730 ->  185296
        readrandom                   771301 -> 1219803 (58% improvement!)
        readrandom_smallblockcache   677609 ->  862850
        readrandom_memtable_sst      710440 -> 1109223
        readrandom_fillunique_random 221589 ->  247869
        memtablefillrandom           105286 ->   92643
        memtablereadrandom           763033 -> 1288862
      
      Test Plan:
      make asan_check
      I am also running db_stress
      
      Reviewers: dhruba, haobo, sdong, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14679
5. 19 Dec 2013, 1 commit
    • [RocksDB] [Column Family] Interface proposal · 9385a524
      Committed by Igor Canadi
      Summary:
      <This diff is for Column Family branch>
      
      Sharing some of the work I've done so far. This diff compiles and passes the tests.
      
      The biggest change is in options.h - I broke down Options into two parts - DBOptions and ColumnFamilyOptions. DBOptions is DB-specific (env, create_if_missing, block_cache, etc.) and ColumnFamilyOptions is column family-specific (all compaction options, compression options, etc.). Note that this does not break backwards compatibility at all.
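
      For illustration, one way the split stays backward compatible (this
      mirrors where the public API eventually landed, but treat it as a
      sketch):

         struct DBOptions {
           Env* env = Env::Default();
           bool create_if_missing = false;
           // ... DB-wide settings ...
         };

         struct ColumnFamilyOptions {
           size_t write_buffer_size = 4 << 20;
           CompressionType compression = kSnappyCompression;
           // ... per-column-family compaction/compression settings ...
         };

         // The old monolithic Options is both halves glued together, so
         // existing callers keep compiling unchanged.
         struct Options : public DBOptions, public ColumnFamilyOptions {};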
      
      Further, I created DBWithColumnFamily, which inherits the DB interface and adds new functions with column family support. Clients can transparently switch to DBWithColumnFamily without breaking backwards compatibility.
      There are a few methods worth checking out: ListColumnFamilies(), MultiNewIterator(), MultiGet() and GetSnapshot(). [GetSnapshot() returns the snapshot across all column families for now - I think that's what we agreed on]
      
      Finally, I made small changes to WriteBatch so we are able to atomically insert data across column families.
      
      Please provide feedback.
      
      Test Plan: make check works, the code is backward compatible
      
      Reviewers: dhruba, haobo, sdong, kailiu, emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14445
6. 13 Dec 2013, 1 commit
    • Add monitoring for universal compaction and add counters for compaction IO · e9e6b00d
      Committed by Mark Callaghan
      Summary:
      Adds these counters
      { WAL_FILE_SYNCED, "rocksdb.wal.synced" }
        number of writes that request a WAL sync
      { WAL_FILE_BYTES, "rocksdb.wal.bytes" },
        number of bytes written to the WAL
      { WRITE_DONE_BY_SELF, "rocksdb.write.self" },
        number of writes processed by the calling thread
      { WRITE_DONE_BY_OTHER, "rocksdb.write.other" },
        number of writes not processed by the calling thread. Instead these were
        processed by the current holder of the write lock
      { WRITE_WITH_WAL, "rocksdb.write.wal" },
        number of writes that request WAL logging
      { COMPACT_READ_BYTES, "rocksdb.compact.read.bytes" },
        number of bytes read during compaction
      { COMPACT_WRITE_BYTES, "rocksdb.compact.write.bytes" },
        number of bytes written during compaction
      
      Per-interval stats output was updated with WAL stats and correct stats for universal compaction
      including a correct value for write-amplification. It now looks like:
                                     Compactions
      Level  Files Size(MB) Score Time(sec)  Read(MB) Write(MB)    Rn(MB)  Rnp1(MB)  Wnew(MB) RW-Amplify Read(MB/s) Write(MB/s)      Rn     Rnp1     Wnp1     NewW    Count  Ln-stall Stall-cnt
      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        0        7      464  46.4       281      3411      3875      3411         0      3875        2.1      12.1        13.8      621        0      240      240      628       0.0         0
      Uptime(secs): 310.8 total, 2.0 interval
      Writes cumulative: 9999999 total, 9999999 batches, 1.0 per batch, 1.22 ingest GB
      WAL cumulative: 9999999 WAL writes, 9999999 WAL syncs, 1.00 writes per sync, 1.22 GB written
      Compaction IO cumulative (GB): 1.22 new, 3.33 read, 3.78 write, 7.12 read+write
      Compaction IO cumulative (MB/sec): 4.0 new, 11.0 read, 12.5 write, 23.4 read+write
      Amplification cumulative: 4.1 write, 6.8 compaction
      Writes interval: 100000 total, 100000 batches, 1.0 per batch, 12.5 ingest MB
      WAL interval: 100000 WAL writes, 100000 WAL syncs, 1.00 writes per sync, 0.01 MB written
      Compaction IO interval (MB): 12.49 new, 14.98 read, 21.50 write, 36.48 read+write
      Compaction IO interval (MB/sec): 6.4 new, 7.6 read, 11.0 write, 18.6 read+write
      Amplification interval: 101.7 write, 102.9 compaction
      Stalls(secs): 142.924 level0_slowdown, 0.000 level0_numfiles, 0.805 memtable_compaction, 0.000 leveln_slowdown
      Stalls(count): 132461 level0_slowdown, 0 level0_numfiles, 3 memtable_compaction, 0 leveln_slowdown
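
      For reference, these counters are ordinary tickers, so an application can
      read them through the standard statistics object (sketch; ticker names
      from the list above):

         options.statistics = rocksdb::CreateDBStatistics();
         // ... run the workload ...
         uint64_t wal_syncs =
             options.statistics->getTickerCount(rocksdb::WAL_FILE_SYNCED);
         uint64_t compact_read =
             options.statistics->getTickerCount(rocksdb::COMPACT_READ_BYTES);
         uint64_t compact_write =
             options.statistics->getTickerCount(rocksdb::COMPACT_WRITE_BYTES);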
      
      Task ID: #3329644, #3301695
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14583
7. 10 Dec 2013, 1 commit
    • [RocksDB] BackupableDB · fb9fce4f
      Committed by Igor Canadi
      Summary:
      In this diff I present you BackupableDB v1. You can easily use it to back up your DB and it will do incremental snapshots for you.
      Let's first describe how you would use BackupableDB. It inherits the StackableDB interface, so you can easily construct it with your DB object -- it will add a method RollTheSnapshot() to the DB object. When you call RollTheSnapshot(), the current snapshot of the DB will be stored in the backup dir. To restore, you can just call RestoreDBFromBackup() on a BackupableDB (which is a static method) and it will restore all files from the backup dir. In the next version, it will even support automatic backups every X minutes.
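
      A usage sketch assembled from the names above (the constructor and
      restore arguments are my assumptions, not the reviewed API):

         rocksdb::BackupableDBOptions backup_options("/path/to/backup_dir");
         rocksdb::BackupableDB backupable_db(db, backup_options);
         backupable_db.RollTheSnapshot();  // incremental snapshot of current state
         // ... later, possibly on another machine ...
         rocksdb::BackupableDB::RestoreDBFromBackup(backup_options, "/path/to/db");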
      
      There are multiple things you can configure:
      1. backup_env and db_env can be different, which is awesome because then you can easily backup to HDFS or wherever you feel like.
      2. sync - if true, it *guarantees* backup consistency on machine reboot
      3. number of snapshots to keep - this will keep the last N snapshots around if you want, for some reason, to be able to restore from an earlier snapshot. All backups are done in an incremental fashion - if we already have 00010.sst, we will not copy it again. *IMPORTANT* -- This is based on the assumption that 00010.sst never changes - two files named 00010.sst from the same DB will always be exactly the same. Is this true? I always copy the manifest, current and log files.
      4. You can decide if you want to flush the memtables before you back up, or you're fine with backing up the log files -- either way, you get a complete and consistent view of the database at the time of backup.
      5. More things you can find in BackupableDBOptions
      
      Here is the directory structure I use:
      
         backup_dir/CURRENT_SNAPSHOT - just 4 bytes holding the latest snapshot
                    0, 1, 2, ...     - files containing a serialized version of each snapshot, each holding a list of files
                    files/*.sst      - sst files shared between snapshots - if one snapshot references 00010.sst and another one needs to back it up from the DB, it will just reference the same file
                    files/0/, 1/, 2/, ... - snapshot directories containing private snapshot files - current, manifest and log files
      
      All the files are ref counted and deleted immediately when they go out of scope.
      
      Some other stuff in this diff:
      1. Added a GetEnv() method to the DB. Discussed with @haobo and we agreed that it seems the right thing to do.
      2. Fixed StackableDB interface. The way it was set up before, I was not able to implement BackupableDB.
      
      Test Plan:
      I have a unittest, but please don't look at this yet. I just hacked it up to help me with debugging. I will write a lot of good tests and update the diff.
      
      Also, `make asan_check`
      
      Reviewers: dhruba, haobo, emayanke
      
      Reviewed By: dhruba
      
      CC: leveldb, haobo
      
      Differential Revision: https://reviews.facebook.net/D14295
8. 05 Dec 2013, 1 commit
9. 04 Dec 2013, 1 commit
    • Get rid of some shared_ptrs · 043fc14c
      Committed by Igor Canadi
      Summary:
      I went through all remaining shared_ptrs and removed the ones I found unnecessary. Only GenerateCachePrefix() is called fairly often, so don't expect big perf wins.
      
      The ones that are left are accessed infrequently and I think we're fine with keeping them.
      
      Test Plan: make asan_check
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14427
10. 29 Nov 2013, 1 commit
11. 26 Nov 2013, 2 commits
12. 15 Nov 2013, 1 commit
    • PurgeObsoleteFiles() unittest · a0ce3fd0
      Committed by Igor Canadi
      Summary:
      Created a unittest that verifies that automatic deletion performed by PurgeObsoleteFiles() works correctly.
      
      Also, a few small fixes on the logic part -- call version_set_->GetObsoleteFiles() in FindObsoleteFiles() instead of at arbitrary points.
      
      Test Plan: Created a unit test
      
      Reviewers: dhruba, haobo, nkg-
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14079
13. 13 Nov 2013, 1 commit
    • Small changes in Deleting obsolete files · 9bc4a26f
      Committed by Igor Canadi
      Summary:
      @haobo's suggestions from https://reviews.facebook.net/D13827
      
      Renaming some variables, deprecating purge_log_after_flush, changing a for loop into an auto for loop.

      I have not implemented deleting objects outside of the mutex yet, because it would require a big code change - we would delete the object in db_impl, which currently does not know anything about the object, because it's defined in version_edit.h (FileMetaData). We should do it at some point, though.
      
      Test Plan: Ran deletefile_test
      
      Reviewers: haobo
      
      Reviewed By: haobo
      
      CC: leveldb, haobo
      
      Differential Revision: https://reviews.facebook.net/D14025
14. 12 Nov 2013, 1 commit
15. 09 Nov 2013, 1 commit
    • Speed up FindObsoleteFiles · 1510339e
      Committed by Igor Canadi
      Summary:
      Here's one solution we discussed for speeding up FindObsoleteFiles: keep a set of all files in DBImpl and update the set every time we create a file. I probably missed a few other spots where we create a file.
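
      A minimal sketch of the bookkeeping (illustrative; the names are mine,
      not the diff's):

         // DBImpl-side registry of every file the DB has created, so
         // FindObsoleteFiles can consult it instead of re-listing the
         // directory.
         class LiveFileSet {
          public:
           void Add(uint64_t number)    { MutexLock l(&mu_); live_.insert(number); }
           void Remove(uint64_t number) { MutexLock l(&mu_); live_.erase(number); }
           bool Contains(uint64_t number) {
             MutexLock l(&mu_);
             return live_.count(number) != 0;
           }
          private:
           port::Mutex mu_;
           std::set<uint64_t> live_;
         };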
      
      It might speed things up a bit, but it makes the code uglier. I don't really like it.

      A much better approach would be to abstract all file handling into a separate class. Think of it as a layer between DBImpl and Env. Having a separate class deal with file naming and deletion would benefit both code cleanliness (especially with the huge DBImpl) and speed. It will take a huge effort to do this, though.
      
      Let's discuss offline today.
      
      Test Plan: Ran ./db_stress, verified that files are getting deleted
      
      Reviewers: dhruba, haobo, kailiu, emayanke
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D13827
16. 07 Nov 2013, 1 commit
    • WAL log retention policy based on archive size. · c2be2cba
      Committed by shamdor
      Summary:
      Archive cleaning will still happen every WAL_ttl seconds,
      but archived logs will be deleted only if the archive size
      is greater than a WAL_size_limit value.
      Empty archived logs will be deleted every WAL_ttl seconds.
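
      In options terms this pairs the existing TTL knob with the new size knob
      (option names as they appear in DBOptions; values illustrative):

         options.WAL_ttl_seconds = 3600;     // sweep the archive every hour
         options.WAL_size_limit_MB = 10240;  // but trim old archived logs only
                                             // once the archive exceeds 10 GB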
      
      Test Plan:
      1. Unit tests pass.
      2. Benchmark.
      
      Reviewers: emayanke, dhruba, haobo, sdong, kailiu, igor
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13869
17. 05 Nov 2013, 1 commit
    • Making the transaction log iterator more robust · f837f5b1
      Committed by Mayank Agarwal
      Summary:
      strict essentially means that we MUST find the startSequence. Thus we should return if startSequence is not found in the first file when strict is set. This will take care of ending the iterator in case of permanent gaps due to corruption in the log files.
      Also created a NextImpl function that has an internal variable to distinguish whether Next is being called from StartSequence or by the application.
      Set the NotFound::gaps status to give an indication of gaps happening.
      Polished the inline documentation in various places.
      
      Test Plan:
      * db_repl_stress test
      * db_test relating to transaction log iterator
      * fbcode/wormhole/rocksdb/rocks_log_iterator
      * sigma production machine sigmafio032.prn1
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13689
18. 02 Nov 2013, 1 commit
    • Implement a compressed block cache. · b4ad5e89
      Committed by Dhruba Borthakur
      Summary:
      RocksDB can now support an uncompressed block cache, a compressed
      block cache, or both. Lookups first look for a block in the
      uncompressed cache; only if it is not found there is the block looked up
      in the compressed cache. If it is found in the compressed cache,
      it is uncompressed and inserted into the uncompressed cache.
      
      It is possible that the same block resides in the compressed cache
      as well as the uncompressed cache at the same time. Both caches
      have their own individual LRU policy.
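
      Configuring both caches is one line each (public API sketch; the sizes
      are arbitrary):

         options.block_cache = rocksdb::NewLRUCache(128 << 20);             // uncompressed
         options.block_cache_compressed = rocksdb::NewLRUCache(512 << 20);  // compressed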
      
      Test Plan: Unit test case attached.
      
      Reviewers: kailiu, sdong, haobo, leveldb
      
      Reviewed By: haobo
      
      CC: xjin, haobo
      
      Differential Revision: https://reviews.facebook.net/D12675
19. 31 Oct 2013, 1 commit
    • Follow-up Cleaning-up After D13521 · f03b2df0
      Committed by Siying Dong
      Summary:
      This patch is to address @haobo's comments on D13521:
      1. rename Table to TableReader and make its factory function GetTableReader
      2. move the compression type selection logic out of TableBuilder and into the compaction logic
      3. more accurate comments
      4. move the stat name constants into the BlockBasedTable implementation
      5. remove some leftover code in simple_table_db_test
      
      Test Plan: pass test suites.
      
      Reviewers: haobo, dhruba, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13785
20. 25 Oct 2013, 1 commit
    • Unify DeleteFile and DeleteWalFiles · 56305221
      Committed by Mayank Agarwal
      Summary:
      This is to simplify the rocksdb public APIs and improve the code quality.
      Created an additional parameter to ParseFileName for the log sub type and improved the code for deleting a wal file.
      Wrote exhaustive unit tests in delete_file_test.
      Unification of other redundant APIs can be taken up in a separate diff.
      
      Test Plan: Expanded delete_file test
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13647
21. 18 Oct 2013, 1 commit
    • Universal Compaction to Have a Size Percentage Threshold To Decide Whether to Compress · 9edda370
      Committed by Siying Dong
      Summary:
      This patch adds an option for universal compaction that allows us to compress output files only if the files previously compacted have not yet reached a specified ratio, to save CPU cost in some cases.

      Compression is always skipped for flushing. This is because the size information is not easy to evaluate in the flush case. We can improve it later.
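
      The knob lives in the universal compaction options (shown with the name
      it carries in the public API; -1, the default, keeps the old
      always-compress behavior):

         // Aim for roughly this percentage of data to end up compressed;
         // -1 means compress every output file as before.
         options.compaction_options_universal.compression_size_percent = 80;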
      
      Test Plan:
      add test
      DBTest.UniversalCompactionCompressRatio1 and DBTest.UniversalCompactionCompressRatio12
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13467
22. 17 Oct 2013, 2 commits
    • Add appropriate LICENSE and Copyright message. · 9cd22109
      Committed by Dhruba Borthakur
      Summary:
      Add appropriate LICENSE and Copyright message.
      
      Test Plan:
      make check
      
    • Enable background flush thread by default and fix issues related to it · 073cbfc8
      Committed by Siying Dong
      Summary:
      Enable the background flush thread in this patch and fix unit tests with:
      (1) After a background flush, schedule a background compaction if the condition is satisfied;
      (2) Fix a bug where, if universal compaction is enabled and the number of levels is set to 0, compaction is not automatically triggered;
      (3) Fix unit tests to wait for compaction to finish instead of flush, before checking the compaction results.
      
      Test Plan: pass all unit tests
      
      Reviewers: haobo, xjin, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13461
23. 15 Oct 2013, 1 commit
    • Change Function names from Compaction->Flush When they really mean Flush · 88f2f890
      Committed by Siying Dong
      Summary: While debugging the unit test failures that show up when the background flush thread is enabled, I felt the function names could be made clearer for people to understand. Also, once the names are fixed, the bugs in many tests become obvious (and some of those tests are failing). This patch cleans this up for future maintenance.
      
      Test Plan: Run test suites.
      
      Reviewers: haobo, dhruba, xjin
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13431
24. 06 Oct 2013, 1 commit
25. 05 Oct 2013, 3 commits
    • Removed scribe, thrift and java modules. · 0a9f873f
      Committed by Dhruba Borthakur
      Summary: Removed scribe, thrift and java modules.
      
      Test Plan:
      make release
      make check
      
      Reviewers: emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13293
    • Change namespace from leveldb to rocksdb · a143ef9b
      Committed by Dhruba Borthakur
      Summary:
      Change namespace from leveldb to rocksdb. This allows a single
      application to link in open-source leveldb code as well as
      rocksdb code into the same process.
      
      Test Plan: compile rocksdb
      
      Reviewers: emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13287
    • Add backward compatible option in GetLiveFiles to choose whether to not Flush first · 854d2363
      Committed by Mayank Agarwal
      Summary:
      As explained in the comments in GetLiveFiles in db.h, this option will cause flush to be skipped in GetLiveFiles, because some use-cases use GetSortedWalFiles after GetLiveFiles to generate more complete snapshots.
      Using GetSortedWalFiles after GetLiveFiles allows us to not flush in GetLiveFiles first, because the WALs have everything.
      Note: file deletions will be disabled before calling GLF or GSWF, so live logs will not move to archive logs or get deleted.
      Note: the manifest file is truncated to a proper value in GLF, so the DB will always replay from the proper wal files on a restart.
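
      Putting the pieces together, the snapshot recipe this enables looks
      roughly like this (error handling omitted; the false argument is the new
      flush-skipping option):

         db->DisableFileDeletions();
         std::vector<std::string> live_files;
         uint64_t manifest_size;
         db->GetLiveFiles(live_files, &manifest_size, /*flush_memtable=*/false);
         rocksdb::VectorLogPtr wal_files;
         db->GetSortedWalFiles(wal_files);
         // ... copy live_files (truncating the MANIFEST to manifest_size)
         //     and then the WAL files ...
         db->EnableFileDeletions();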
      
      Test Plan: make
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13257
26. 04 Oct 2013, 1 commit
27. 13 Sep 2013, 1 commit
    • [RocksDB] Remove Log file immediately after memtable flush · 0e422308
      Committed by Haobo Xu
      Summary: As the title says. The DB log file's life cycle is tied to the memtable it backs. Once the memtable is flushed to an sst file and committed, we should be able to delete the log file without holding the mutex. This is part of the bigger change to avoid FindObsoleteFiles at runtime. It deals with log files; sst files will be dealt with later.
      
      Test Plan: make check; db_bench
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11709
28. 05 Sep 2013, 1 commit
    • New ldb command to convert compaction style · 42c109cc
      Committed by Xing Jin
      Summary:
      Add new command "change_compaction_style" to ldb tool. For
      universal->level, it shows "nothing to do". For level->universal, it
      compacts all files into a single one and moves the file to level 0.
      
      Also add check for number of files at level 1+ when opening db with
      universal compaction style.
      
      Test Plan:
      'make all check'. New unit test for the internal conversion function. Also manually tested various
      commands like:
      
      ./ldb change_compaction_style --old_compaction_style=0
      --new_compaction_style=1 --db=/tmp/leveldbtest-3088/db_test
      
      Reviewers: haobo, dhruba
      
      Reviewed By: haobo
      
      CC: vamsi, emayanke
      
      Differential Revision: https://reviews.facebook.net/D12603
29. 29 Aug 2013, 1 commit
    • Introduced a new flag non_blocking_io in ReadOptions. · fc0c399d
      Committed by Dhruba Borthakur
      Summary:
      If ReadOptions.non_blocking_io is set to true, then KeyMayExist
      and Iterators will return data that is cached in RAM.
      If the Iterator needs to do IO from storage to serve the data,
      then Iterator.status() will return Status::IsRetry().
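
      A sketch of the intended usage per the summary above (flag name and
      status semantics as described in this commit):

         rocksdb::ReadOptions ro;
         ro.non_blocking_io = true;  // never touch storage
         std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(ro));
         for (it->SeekToFirst(); it->Valid(); it->Next()) {
           // only data already cached in RAM is visible here
         }
         if (!it->status().ok()) {
           // per this diff the status signals a retry: re-run the scan with
           // non_blocking_io = false to allow storage IO
         }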
      
      Test Plan:
      Enhanced the unit test DBTest.KeyMayExist to detect whether any IOs were
      issued against storage. Added DBTest.NonBlockingIteration to verify
      nonblocking iteration.
      
      Reviewers: emayanke, haobo
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Maniphest Tasks: T63
      
      Differential Revision: https://reviews.facebook.net/D12531
30. 24 Aug 2013, 1 commit
31. 23 Aug 2013, 1 commit
    • Add APIs to query SST file metadata and to delete specific SST files · 60bf2b7d
      Committed by Simha Venkataramaiah
      Summary: An API to query the level, key ranges, size, etc. for each SST file, and an API to delete a specific file from the db along with all associated state in the bookkeeping data structures.
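
      A sketch of how the pair of APIs is used (method and field names match
      what these calls became in the public header; treat the details as
      illustrative):

         std::vector<rocksdb::LiveFileMetaData> metadata;
         db->GetLiveFilesMetaData(&metadata);
         for (const auto& f : metadata) {
           // f.name, f.level, f.size, f.smallestkey, f.largestkey
         }
         // delete one specific SST and its bookkeeping state
         db->DeleteFile(metadata.front().name);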
      
      Notes: Editing the manifest version does not release the obsolete files right away. However, deleting the file directly will mess up the iterator. We may need a more aggressive/timely file deletion api.

      I have used std::unique_ptr - will switch to boost:: since this is external. Thoughts?

      The unit test is fragile right now as it expects the compaction at certain levels.
      
      Test Plan: unittest
      
      Reviewers: dhruba, vamsi, emayanke
      
      CC: zshao, leveldb, haobo
      
32. 20 Aug 2013, 1 commit
33. 06 Aug 2013, 1 commit
    • Add soft and hard rate limit support · 1036537c
      Committed by Jim Paton
      Summary:
      This diff adds support for both soft and hard rate limiting. The following changes are included:
      
      1) Options.rate_limit is renamed to Options.hard_rate_limit.
      2) Options.rate_limit_delay_milliseconds is renamed to Options.rate_limit_delay_max_milliseconds.
      3) Options.soft_rate_limit is added.
      4) If the maximum compaction score is > hard_rate_limit and rate_limit_delay_max_milliseconds == 0, then writes are delayed by 1 ms at a time until the max compaction score falls below hard_rate_limit.
      5) If the max compaction score is > soft_rate_limit but <= hard_rate_limit, then writes are delayed by 0-1 ms depending on how close we are to hard_rate_limit.
      6) Users can disable 4 by setting hard_rate_limit = 0. They can add a limit to the maximum amount of time waited by setting rate_limit_delay_max_milliseconds > 0. Thus, the old behavior can be preserved by setting soft_rate_limit = 0, which is the default.
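
      In options terms (values illustrative; option names from points 1-3
      above):

         options.soft_rate_limit = 2.0;  // score in (2.0, 4.0]: delay writes 0-1 ms
         options.hard_rate_limit = 4.0;  // score above 4.0: delay 1 ms at a time
         options.rate_limit_delay_max_milliseconds = 1000;  // cap the total wait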
      
      Test Plan:
      make -j32 check
      ./db_stress
      
      Reviewers: dhruba, haobo, MarkCallaghan
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D12003
34. 02 Aug 2013, 1 commit
    • Expand KeyMayExist to return the proper value if it can be found in memory and... · 59d0b02f
      Committed by Mayank Agarwal
      Expand KeyMayExist to return the proper value if it can be found in memory and also check block_cache
      
      Summary: Removed KeyMayExistImpl because KeyMayExist now demands Get-like semantics. Removed no_io from memtable and imm because we need the proper value now and shouldn't just stop when we see a Merge in the memtable. Added checks of the block_cache. Updated documentation and the unit test.
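
      A sketch of the resulting call pattern (standard public signature; the
      value_found out-parameter says whether the returned value can be
      trusted):

         std::string value;
         bool value_found = false;
         bool may_exist = db->KeyMayExist(rocksdb::ReadOptions(), key,
                                          &value, &value_found);
         if (!may_exist) {
           // key is definitely absent
         } else if (value_found) {
           // value was recovered from memtable/block cache without extra IO
         } else {
           // inconclusive: fall back to a full Get()
         }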
      
      Test Plan: make all check;db_stress for 1 hour
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11853