1. 08 Jan, 2014 1 commit
    • Fix a deadlock in CompactRange() · 9f690ec6
      Tomislav Novak committed
      Summary:
      The way DBImpl::TEST_CompactRange() throttles down the number of bg compactions
      can cause it to deadlock when CompactRange() is called concurrently from
      multiple threads. Imagine the following scenario with only two threads
      (max_background_compactions is 10 and bg_compaction_scheduled_ is initially 0):
      
         1. Thread #1 increments bg_compaction_scheduled_ (to LargeNumber), sets
            bg_compaction_scheduled_ to 9 (newvalue), schedules the compaction
            (bg_compaction_scheduled_ is now 10) and waits for it to complete.
         2. Thread #2 calls TEST_CompactRange(), increments bg_compaction_scheduled_
            (now LargeNumber + 10) and waits on a cv for bg_compaction_scheduled_ to
            drop to LargeNumber.
         3. BG thread completes the first manual compaction, decrements
            bg_compaction_scheduled_ and wakes up all threads waiting on bg_cv_.
            Thread #1 runs, increments bg_compaction_scheduled_ by LargeNumber again
            (now 2*LargeNumber + 9). Since that's more than LargeNumber + newvalue,
            thread #2 also goes to sleep (waiting on bg_cv_), without resetting
            bg_compaction_scheduled_.
      
      This diff attempts to address the problem by introducing a new counter
      bg_manual_only_ (when positive, MaybeScheduleFlushOrCompaction() will only
      schedule manual compactions).
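
A minimal sketch of the idea, not the actual DBImpl code: the number of scheduled jobs is never overloaded with a sentinel value, and a separate counter records how many manual compactions are in flight. `MiniScheduler`, `BeginManual`, `EndManual` and `JobDone` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical, simplified scheduler. It only illustrates the
// "manual only" counter; it is not RocksDB's DBImpl.
class MiniScheduler {
 public:
  explicit MiniScheduler(int max_bg) : max_bg_(max_bg) {}

  // Each manual compaction registers itself for its whole duration.
  void BeginManual() { std::lock_guard<std::mutex> l(mu_); ++manual_only_; }
  void EndManual()   { std::lock_guard<std::mutex> l(mu_); --manual_only_; }

  // Returns true if a background job was scheduled.
  bool MaybeSchedule(bool is_manual) {
    std::lock_guard<std::mutex> l(mu_);
    if (scheduled_ >= max_bg_) return false;           // no free slot
    if (manual_only_ > 0 && !is_manual) return false;  // automatic work is held back
    ++scheduled_;
    return true;
  }

  void JobDone() { std::lock_guard<std::mutex> l(mu_); --scheduled_; }

  int scheduled() const {
    std::lock_guard<std::mutex> l(mu_);
    return scheduled_;
  }

 private:
  mutable std::mutex mu_;
  const int max_bg_;
  int scheduled_ = 0;    // real number of scheduled jobs, never a sentinel
  int manual_only_ = 0;  // > 0 while any manual compaction is in flight
};
```

Because two concurrent callers simply raise the counter to 2, neither of them has to wait for the other to restore a shared value.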
      
      Test Plan:
      I could pretty much consistently reproduce the deadlock with a program that
      calls CompactRange(nullptr, nullptr) immediately after Write() from multiple
      threads. This no longer happens with this patch.
      
      Tests (make check) pass.
      
      Reviewers: dhruba, igor, sdong, haobo
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14799
  2. 07 Jan, 2014 3 commits
  3. 06 Jan, 2014 1 commit
  4. 05 Jan, 2014 1 commit
  5. 03 Jan, 2014 3 commits
    • Add clang-format rules · 463086bc
      Kai Liu committed
      Summary:
      The rule file is forked from the one in Facebook's repo.
      
      I'll add the format file for now, and team members can tune the rules later.
      
      In this patch, I made only two changes, to stay consistent with the existing coding style:
      
      `SpacesBeforeTrailingComments: 2`
      
      `ColumnLimit:     80`
      
      Test Plan: N/A
      
      Reviewers: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15015
    • Automate the preparation step for a new release · 46950597
      Kai Liu committed
      Summary: Added a script that prepares the repo for Facebook's new rocksdb release. It automatically does the work needed to make sure this repo is ready for a third-party release.
      
      Test Plan:
      Ran this script and observed:
      
      * a new version was created as a git tag (in both the local and remote repos).
      * build_version.cc was updated.
      * build_detect_platform was changed so that it won't create any new change.
      
      Reviewers: haobo, dhruba, sdong, igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15003
    • Hotfix the bug in table cache's GetSliceForFileNumber · 9281a826
      kailiu committed
      Forgot to fix this problem in the master branch. It was already fixed in the performance branch.
  6. 02 Jan, 2014 4 commits
  7. 31 Dec, 2013 3 commits
  8. 30 Dec, 2013 2 commits
  9. 27 Dec, 2013 8 commits
    • fix build bug from recent... · 9d4dc0da
      dyu committed
      fix build bug from recent commit: https://github.com/facebook/rocksdb/commit/43c386b72ee834c88a1a22500ce1fc36a8208277
    • TableCache.FindTable() to avoid the mem copy of file number · a094f3b3
      Siying Dong committed
      Summary: I'm not sure what the purpose is of encoding the file number into a new buffer for the table cache lookup. It seems unnecessary to me. With this patch, the lookup key points directly at the address of the int64 that holds the file number.
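
A rough sketch of the technique, assuming a C++17 compiler: the lookup key is a view over the bytes of the integer itself, so no temporary buffer is filled on the read path. `MiniTableCache`, `FileNumberKey`, `Insert` and `Find` are hypothetical stand-ins, not the real TableCache or cache API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <string_view>

// View the bytes of file_number in place. The view is only valid
// while the referenced integer is alive.
inline std::string_view FileNumberKey(const uint64_t& file_number) {
  return std::string_view(reinterpret_cast<const char*>(&file_number),
                          sizeof(file_number));
}

// Hypothetical stand-in for a cache keyed by the raw file-number bytes.
// std::less<> enables lookups by string_view without building a std::string.
using MiniTableCache = std::map<std::string, int, std::less<>>;

inline void Insert(MiniTableCache& cache, uint64_t file_number, int handle) {
  cache.emplace(std::string(FileNumberKey(file_number)), handle);
}

inline const int* Find(const MiniTableCache& cache,
                       const uint64_t& file_number) {
  auto it = cache.find(FileNumberKey(file_number));  // no encode step, no copy
  return it == cache.end() ? nullptr : &it->second;
}
```

The key bytes are in native byte order here, which is fine as long as inserts and lookups happen in the same process and use the same representation.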
      
      Test Plan: make all check
      
      Reviewers: dhruba, haobo, igor, kailiu
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14811
    • Avoid malloc in NotFound key status if no message is given. · 18df47b7
      Siying Dong committed
      Summary:
      In some places a NotFound status is created with an empty message, but that still costs a malloc. With this patch, the malloc is avoided in that case.
      
      The motivation is that in the db_bench readrandom test, when none of the keys exist, about 4% of the total running time is spent on malloc for Status, plus a similar amount of CPU on freeing them, none of which is necessary.
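
To illustrate the general pattern (a sketch under assumptions, not the real Status class): the message lives behind a pointer that stays null when there is nothing to store, so the common "not found, no message" path never touches the heap. `MiniStatus` and `has_heap_message` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified status type; not RocksDB's Status.
class MiniStatus {
 public:
  static MiniStatus OK() { return MiniStatus(kOk, nullptr); }

  // No message: no heap allocation at all.
  static MiniStatus NotFound() { return MiniStatus(kNotFound, nullptr); }

  // A message is copied to the heap only when it is non-empty.
  static MiniStatus NotFound(const std::string& msg) {
    std::unique_ptr<std::string> copy;
    if (!msg.empty()) copy = std::make_unique<std::string>(msg);
    return MiniStatus(kNotFound, std::move(copy));
  }

  bool ok() const { return code_ == kOk; }
  bool IsNotFound() const { return code_ == kNotFound; }
  bool has_heap_message() const { return msg_ != nullptr; }

  std::string ToString() const {
    std::string s = (code_ == kOk) ? "OK" : "NotFound";
    if (msg_) s += ": " + *msg_;
    return s;
  }

 private:
  enum Code { kOk, kNotFound };
  MiniStatus(Code c, std::unique_ptr<std::string> m)
      : code_(c), msg_(std::move(m)) {}

  Code code_;
  std::unique_ptr<std::string> msg_;  // null when there is no message
};
```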
      
      Test Plan: make all check
      
      Reviewers: dhruba, haobo, igor
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14691
    • Fix all the comparison issue in fb dev servers · b40c052b
      Kai Liu committed
    • Fix [-Werror=sign-compare] in autovector_test · 113a08c9
      kailiu committed
    • Fix the unused variable warning message in mac os · 079a21ba
      kailiu committed
    • Implement autovector · c01676e4
      kailiu committed
      Summary:
      A vector that uses a pre-allocated, stack-based array to achieve better
      performance for arrays with a small number of items.
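
The following is a minimal sketch of that design, not the autovector class added by this commit: the first kSize elements go into an embedded array, and only later elements spill into a heap-backed std::vector. `MiniAutoVector` and `only_inline` are hypothetical names, and the element type is assumed to be default constructible.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified small-size-optimized vector.
template <class T, std::size_t kSize = 8>
class MiniAutoVector {
 public:
  void push_back(const T& v) {
    if (num_inline_ < kSize) {
      inline_[num_inline_++] = v;  // stays in the embedded array
    } else {
      overflow_.push_back(v);      // heap is touched only past kSize
    }
  }

  std::size_t size() const { return num_inline_ + overflow_.size(); }
  bool only_inline() const { return overflow_.empty(); }

  T& operator[](std::size_t i) {
    return i < kSize ? inline_[i] : overflow_[i - kSize];
  }

 private:
  std::size_t num_inline_ = 0;
  std::array<T, kSize> inline_{};  // lives wherever the object lives
  std::vector<T> overflow_;        // empty until the array is full
};
```

With kSize set to 4, the first four insertions never allocate, which matches the case the benchmark below calls "kSize elements inserted".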
      
      Test Plan:
      Added tests for both correctness and performance
      
      Here is the performance benchmark between vector and autovector
      
      Please note that in the "Creation and Insertion Test", the test cases were designed with the following motivation:
      
      * no element inserted: the internal array of std::vector may not actually get
        initialized.
      * one element inserted: the internal array of std::vector must have been
        initialized.
      * kSize elements inserted: this shows the most time we'll spend if we
        keep everything on the stack.
      * 2 * kSize elements inserted: the internal vector of
        autovector must have been initialized.
      
      Note: kSize is the capacity of autovector
      
        =====================================================
        Creation and Insertion Test
        =====================================================
        created 100000 vectors:
        	each was inserted with 0 elements
        	total time elapsed: 128000 (ns)
        created 100000 autovectors:
        	each was inserted with 0 elements
        	total time elapsed: 3641000 (ns)
        created 100000 VectorWithReserveSizes:
        	each was inserted with 0 elements
        	total time elapsed: 9896000 (ns)
        -----------------------------------
        created 100000 vectors:
        	each was inserted with 1 elements
        	total time elapsed: 11089000 (ns)
        created 100000 autovectors:
        	each was inserted with 1 elements
        	total time elapsed: 5008000 (ns)
        created 100000 VectorWithReserveSizes:
        	each was inserted with 1 elements
        	total time elapsed: 24271000 (ns)
        -----------------------------------
        created 100000 vectors:
        	each was inserted with 4 elements
        	total time elapsed: 39369000 (ns)
        created 100000 autovectors:
        	each was inserted with 4 elements
        	total time elapsed: 10121000 (ns)
        created 100000 VectorWithReserveSizes:
        	each was inserted with 4 elements
        	total time elapsed: 28473000 (ns)
        -----------------------------------
        created 100000 vectors:
        	each was inserted with 8 elements
        	total time elapsed: 75013000 (ns)
        created 100000 autovectors:
        	each was inserted with 8 elements
        	total time elapsed: 18237000 (ns)
        created 100000 VectorWithReserveSizes:
        	each was inserted with 8 elements
        	total time elapsed: 42464000 (ns)
        -----------------------------------
        created 100000 vectors:
        	each was inserted with 16 elements
        	total time elapsed: 102319000 (ns)
        created 100000 autovectors:
        	each was inserted with 16 elements
        	total time elapsed: 76724000 (ns)
        created 100000 VectorWithReserveSizes:
        	each was inserted with 16 elements
        	total time elapsed: 68285000 (ns)
        -----------------------------------
        =====================================================
        Sequence Access Test
        =====================================================
        performed 100000 sequence access against vector
        	size: 4
        	total time elapsed: 198000 (ns)
        performed 100000 sequence access against autovector
        	size: 4
        	total time elapsed: 306000 (ns)
        -----------------------------------
        performed 100000 sequence access against vector
        	size: 8
        	total time elapsed: 565000 (ns)
        performed 100000 sequence access against autovector
        	size: 8
        	total time elapsed: 512000 (ns)
        -----------------------------------
        performed 100000 sequence access against vector
        	size: 16
        	total time elapsed: 1076000 (ns)
        performed 100000 sequence access against autovector
        	size: 16
        	total time elapsed: 1070000 (ns)
        -----------------------------------
      
      Reviewers: dhruba, haobo, sdong, chip
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14655
    • Merge pull request #32 from jamesgolick/master · 5643ae1a
      Kai Liu committed
      Only try to use fallocate if it's actually present on the system.
  10. 24 Dec, 2013 1 commit
  11. 21 Dec, 2013 2 commits
    • Initialize sequence number in BatchResult - issue #39 · b26dc956
      Igor Canadi committed
    • [RocksDB] Optimize locking for Get · 1fdb3f7d
      Igor Canadi committed
      Summary:
      Instead of locking and saving a DB state, we can cache a DB state and update it only when it changes. This change reduces lock contention and speeds up read operations on the DB.
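
The commit text does not spell out the mechanism, so the following is only one common way to get this effect, shown as a sketch: writers publish an immutable snapshot, and readers copy a single pointer under a very short lock and then read without it. `DbState`, `StateCache`, `Acquire` and `AddFile` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical immutable snapshot of what a read needs.
struct DbState {
  std::vector<std::string> files;
};

class StateCache {
 public:
  StateCache() : current_(std::make_shared<const DbState>()) {}

  // Read path: the lock covers one pointer copy, nothing else.
  std::shared_ptr<const DbState> Acquire() const {
    std::lock_guard<std::mutex> l(mu_);
    return current_;
  }

  // Write path: a new snapshot is built only when the state changes.
  void AddFile(const std::string& name) {
    std::lock_guard<std::mutex> l(mu_);
    auto next = std::make_shared<DbState>(*current_);
    next->files.push_back(name);
    current_ = std::move(next);
  }

 private:
  mutable std::mutex mu_;
  std::shared_ptr<const DbState> current_;
};
```

A reader that acquired the snapshot before a change keeps seeing the old, consistent state until it releases its pointer.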
      
      Performance improvements are substantial, although there is some cost in no-read workloads. I ran the regression tests on my devserver and here are the numbers:
      
        overwrite                    56345  ->   63001
        fillseq                      193730 ->  185296
        readrandom                   771301 -> 1219803 (58% improvement!)
        readrandom_smallblockcache   677609 ->  862850
        readrandom_memtable_sst      710440 -> 1109223
        readrandom_fillunique_random 221589 ->  247869
        memtablefillrandom           105286 ->   92643
        memtablereadrandom           763033 -> 1288862
      
      Test Plan:
      make asan_check
      I am also running db_stress
      
      Reviewers: dhruba, haobo, sdong, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14679
  12. 20 Dec, 2013 1 commit
  13. 19 Dec, 2013 4 commits
    • Add 'readtocache' test · ca92068b
      Mark Callaghan committed
      Summary:
      For some tests I want to cache the database before running other tests in the same invocation
      of db_bench. The readtocache test ignores --threads and --reads, so those can be used by other tests,
      and it will still do a full read of --num rows with one thread. It might be invoked like:
        db_bench --benchmarks=readtocache,readrandom --reads 100 --num 10000 --threads 8
      
      Test Plan:
      run db_bench
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14739
    • Reorder tests · e914b649
      Igor Canadi committed
      Summary:
      db_test should be the first to execute because it finds the most bugs.
      
      Also, when third parties report issues, we would rather see a db_test error message than an ldb error message. For example, see this thread: https://github.com/facebook/rocksdb/issues/25
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14715
    • Merge pull request #35 from zizkovrb/rm-ds_store · cbb8da6f
      Igor Canadi committed
      Remove utilities/.DS_Store file.
    • Merge pull request #37 from mlin/more-c-bindings · 3b50b621
      Igor Canadi committed
      C bindings: add a bunch of the newer options
  14. 18 Dec, 2013 2 commits
  15. 17 Dec, 2013 1 commit
  16. 16 Dec, 2013 1 commit
  17. 13 Dec, 2013 2 commits
    • [backupable db] Delete db_dir children when restoring backup · 417b453f
      Igor Canadi committed
      Summary:
      I realized that the manifest will get deleted by PurgeObsoleteFiles in DBImpl, but it is still cleaner to delete
      files before we restore the backup.
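
As a rough illustration of the "delete the children first" step, here is a sketch using C++17 std::filesystem. It is an assumption for illustration only: the backup code in this commit works through RocksDB's own Env abstraction, and `DeleteChildren` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <vector>

namespace fs = std::filesystem;

// Removes everything inside dir but keeps dir itself.
// Returns the number of top-level entries that were removed.
inline std::size_t DeleteChildren(const fs::path& dir) {
  // Collect the entries first so the directory is not modified
  // while it is being iterated.
  std::vector<fs::path> children;
  for (const auto& entry : fs::directory_iterator(dir)) {
    children.push_back(entry.path());
  }
  for (const auto& child : children) {
    fs::remove_all(child);  // handles both files and subdirectories
  }
  return children.size();
}
```

A restore would call this on the target directory before copying the backup files in, so that no stale file from the previous database survives.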
      
      Test Plan: backupable_db_test
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14619
    • Add monitoring for universal compaction and add counters for compaction IO · e9e6b00d
      Mark Callaghan committed
      Summary:
      Adds these counters:
      { WAL_FILE_SYNCED, "rocksdb.wal.synced" }
        number of writes that request a WAL sync
      { WAL_FILE_BYTES, "rocksdb.wal.bytes" },
        number of bytes written to the WAL
      { WRITE_DONE_BY_SELF, "rocksdb.write.self" },
        number of writes processed by the calling thread
      { WRITE_DONE_BY_OTHER, "rocksdb.write.other" },
        number of writes not processed by the calling thread. Instead these were
        processed by the current holder of the write lock
      { WRITE_WITH_WAL, "rocksdb.write.wal" },
        number of writes that request WAL logging
      { COMPACT_READ_BYTES, "rocksdb.compact.read.bytes" },
        number of bytes read during compaction
      { COMPACT_WRITE_BYTES, "rocksdb.compact.write.bytes" },
        number of bytes written during compaction
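
The counter list above can be sketched as an enum plus a name table backed by atomic counters. The enum values and name strings are the ones listed in this commit; `TICKER_MAX`, `kTickerNames`, `MiniStatistics`, `Record` and `Get` are hypothetical names for illustration, not the real Statistics interface.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

enum Ticker : uint32_t {
  WAL_FILE_SYNCED,
  WAL_FILE_BYTES,
  WRITE_DONE_BY_SELF,
  WRITE_DONE_BY_OTHER,
  WRITE_WITH_WAL,
  COMPACT_READ_BYTES,
  COMPACT_WRITE_BYTES,
  TICKER_MAX  // hypothetical sentinel: number of tickers in this sketch
};

// Same order as the enum, so a ticker can index the table directly.
const std::pair<Ticker, const char*> kTickerNames[] = {
    {WAL_FILE_SYNCED, "rocksdb.wal.synced"},
    {WAL_FILE_BYTES, "rocksdb.wal.bytes"},
    {WRITE_DONE_BY_SELF, "rocksdb.write.self"},
    {WRITE_DONE_BY_OTHER, "rocksdb.write.other"},
    {WRITE_WITH_WAL, "rocksdb.write.wal"},
    {COMPACT_READ_BYTES, "rocksdb.compact.read.bytes"},
    {COMPACT_WRITE_BYTES, "rocksdb.compact.write.bytes"},
};

// Hypothetical minimal statistics object with one atomic per ticker.
class MiniStatistics {
 public:
  void Record(Ticker t, uint64_t n = 1) {
    counts_[t].fetch_add(n, std::memory_order_relaxed);
  }
  uint64_t Get(Ticker t) const {
    return counts_[t].load(std::memory_order_relaxed);
  }

 private:
  std::array<std::atomic<uint64_t>, TICKER_MAX> counts_{};
};
```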
      
      Per-interval stats output was updated with WAL stats and correct stats for universal compaction
      including a correct value for write-amplification. It now looks like:
                                     Compactions
      Level  Files Size(MB) Score Time(sec)  Read(MB) Write(MB)    Rn(MB)  Rnp1(MB)  Wnew(MB) RW-Amplify Read(MB/s) Write(MB/s)      Rn     Rnp1     Wnp1     NewW    Count  Ln-stall Stall-cnt
      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        0        7      464  46.4       281      3411      3875      3411         0      3875        2.1      12.1        13.8      621        0      240      240      628       0.0         0
      Uptime(secs): 310.8 total, 2.0 interval
      Writes cumulative: 9999999 total, 9999999 batches, 1.0 per batch, 1.22 ingest GB
      WAL cumulative: 9999999 WAL writes, 9999999 WAL syncs, 1.00 writes per sync, 1.22 GB written
      Compaction IO cumulative (GB): 1.22 new, 3.33 read, 3.78 write, 7.12 read+write
      Compaction IO cumulative (MB/sec): 4.0 new, 11.0 read, 12.5 write, 23.4 read+write
      Amplification cumulative: 4.1 write, 6.8 compaction
      Writes interval: 100000 total, 100000 batches, 1.0 per batch, 12.5 ingest MB
      WAL interval: 100000 WAL writes, 100000 WAL syncs, 1.00 writes per sync, 0.01 MB written
      Compaction IO interval (MB): 12.49 new, 14.98 read, 21.50 write, 36.48 read+write
      Compaction IO interval (MB/sec): 6.4 new, 7.6 read, 11.0 write, 18.6 read+write
      Amplification interval: 101.7 write, 102.9 compaction
      Stalls(secs): 142.924 level0_slowdown, 0.000 level0_numfiles, 0.805 memtable_compaction, 0.000 leveln_slowdown
      Stalls(count): 132461 level0_slowdown, 0 level0_numfiles, 3 memtable_compaction, 0 leveln_slowdown
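
One reading that reproduces the cumulative amplification figures in the sample output is sketched here. It is an inference from the printed numbers (1.22 GB new, 3.33 GB read, 3.78 GB written, 1.22 GB of WAL), not a formula stated in the commit, and it is checked only against the cumulative line. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Assumed: write amplification counts WAL bytes plus compaction writes,
// relative to newly ingested bytes.
inline double WriteAmp(double wal_gb, double compact_write_gb, double new_gb) {
  return (wal_gb + compact_write_gb) / new_gb;
}

// Assumed: compaction amplification additionally counts compaction reads.
inline double CompactionAmp(double wal_gb, double compact_read_gb,
                            double compact_write_gb, double new_gb) {
  return (wal_gb + compact_read_gb + compact_write_gb) / new_gb;
}

// Round to one decimal place, as in the stats output.
inline double Round1(double x) { return std::round(x * 10.0) / 10.0; }
```

With the cumulative values, (1.22 + 3.78) / 1.22 rounds to 4.1 and (1.22 + 3.33 + 3.78) / 1.22 rounds to 6.8, matching "Amplification cumulative: 4.1 write, 6.8 compaction".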
      
      Task ID: #3329644, #3301695
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14583