1. 30 August 2014 (3 commits)
    • Improve Cuckoo Table Reader performance. Inlined hash function and number of buckets a power of two. · d20b8cfa
      Authored by Radheshyam Balasundaram
      
      Summary:
      Use inlined hash functions instead of a function pointer. Make the number of buckets a power of two and use a bitwise AND instead of mod.
      After these changes, we get almost a 50% improvement in performance.
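
      A minimal sketch of the two ideas, with illustrative names only (this is not the actual CuckooTableReader code): an inline hash lets the compiler avoid the indirect call a function pointer would cost, and a power-of-two bucket count turns the modulo into a bitwise AND with a precomputed mask.

          #include <cstdint>
          #include <string>

          // Inline hash: the compiler can fold this into the lookup loop instead of
          // dispatching through a function pointer. The mixing constants are only
          // for demonstration.
          inline uint64_t IllustrativeHash(const std::string& key, uint32_t index) {
            uint64_t h = 14695981039346656037ULL + index;
            for (char c : key) {
              h ^= static_cast<unsigned char>(c);
              h *= 1099511628211ULL;
            }
            return h;
          }

          // With num_buckets == 1 << k, `hash % num_buckets` is equivalent to
          // `hash & (num_buckets - 1)`, a single AND instruction.
          struct IllustrativeBucketIndexer {
            uint64_t bucket_mask;  // num_buckets - 1
            uint64_t BucketOf(const std::string& key, uint32_t hash_index) const {
              return IllustrativeHash(key, hash_index) & bucket_mask;
            }
          };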
      
      Results:
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.231us (4.3 Mqps) with batch size of 0
      Time taken per op is 0.229us (4.4 Mqps) with batch size of 0
      Time taken per op is 0.185us (5.4 Mqps) with batch size of 0
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.108us (9.3 Mqps) with batch size of 10
      Time taken per op is 0.100us (10.0 Mqps) with batch size of 10
      Time taken per op is 0.103us (9.7 Mqps) with batch size of 10
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.101us (9.9 Mqps) with batch size of 25
      Time taken per op is 0.098us (10.2 Mqps) with batch size of 25
      Time taken per op is 0.097us (10.3 Mqps) with batch size of 25
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.100us (10.0 Mqps) with batch size of 50
      Time taken per op is 0.097us (10.3 Mqps) with batch size of 50
      Time taken per op is 0.097us (10.3 Mqps) with batch size of 50
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.102us (9.8 Mqps) with batch size of 100
      Time taken per op is 0.098us (10.2 Mqps) with batch size of 100
      Time taken per op is 0.115us (8.7 Mqps) with batch size of 100
      
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.201us (5.0 Mqps) with batch size of 0
      Time taken per op is 0.155us (6.5 Mqps) with batch size of 0
      Time taken per op is 0.152us (6.6 Mqps) with batch size of 0
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.089us (11.3 Mqps) with batch size of 10
      Time taken per op is 0.084us (11.9 Mqps) with batch size of 10
      Time taken per op is 0.086us (11.6 Mqps) with batch size of 10
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.087us (11.5 Mqps) with batch size of 25
      Time taken per op is 0.085us (11.7 Mqps) with batch size of 25
      Time taken per op is 0.093us (10.8 Mqps) with batch size of 25
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.094us (10.6 Mqps) with batch size of 50
      Time taken per op is 0.094us (10.7 Mqps) with batch size of 50
      Time taken per op is 0.093us (10.8 Mqps) with batch size of 50
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.092us (10.9 Mqps) with batch size of 100
      Time taken per op is 0.089us (11.2 Mqps) with batch size of 100
      Time taken per op is 0.088us (11.3 Mqps) with batch size of 100
      
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.154us (6.5 Mqps) with batch size of 0
      Time taken per op is 0.168us (6.0 Mqps) with batch size of 0
      Time taken per op is 0.190us (5.3 Mqps) with batch size of 0
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.081us (12.4 Mqps) with batch size of 10
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 10
      Time taken per op is 0.083us (12.1 Mqps) with batch size of 10
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 25
      Time taken per op is 0.073us (13.7 Mqps) with batch size of 25
      Time taken per op is 0.073us (13.7 Mqps) with batch size of 25
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.076us (13.1 Mqps) with batch size of 50
      Time taken per op is 0.072us (13.8 Mqps) with batch size of 50
      Time taken per op is 0.072us (13.8 Mqps) with batch size of 50
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 100
      Time taken per op is 0.074us (13.6 Mqps) with batch size of 100
      Time taken per op is 0.073us (13.6 Mqps) with batch size of 100
      
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.190us (5.3 Mqps) with batch size of 0
      Time taken per op is 0.186us (5.4 Mqps) with batch size of 0
      Time taken per op is 0.184us (5.4 Mqps) with batch size of 0
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.079us (12.7 Mqps) with batch size of 10
      Time taken per op is 0.070us (14.2 Mqps) with batch size of 10
      Time taken per op is 0.072us (14.0 Mqps) with batch size of 10
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.080us (12.5 Mqps) with batch size of 25
      Time taken per op is 0.072us (14.0 Mqps) with batch size of 25
      Time taken per op is 0.071us (14.1 Mqps) with batch size of 25
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.082us (12.1 Mqps) with batch size of 50
      Time taken per op is 0.071us (14.1 Mqps) with batch size of 50
      Time taken per op is 0.073us (13.6 Mqps) with batch size of 50
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.080us (12.5 Mqps) with batch size of 100
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 100
      Time taken per op is 0.078us (12.8 Mqps) with batch size of 100
      
      Test Plan:
      make check all
      make valgrind_check
      make asan_check
      
      Reviewers: sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22539
    • ForwardIterator: reset incomplete iterators on Seek() · 0f9c43ea
      Authored by Tomislav Novak
      Summary:
      When reading from kBlockCacheTier, ForwardIterator's internal child iterators
      may end up in the incomplete state (read was unable to complete without doing
      disk I/O). `ForwardIterator::status()` will correctly report that; however, the
      iterator may be stuck in that state until all sub-iterators are rebuilt:
      
        * `NeedToSeekImmutable()` may return false even if some sub-iterators are
          incomplete
        * one of the child iterators may be an empty iterator with no state other
          than the kIncomplete status (created using `NewErrorIterator()`); seeking
          on any such iterator has no effect -- we need to construct it again
      
      Akin to rebuilding iterators after a superversion bump, this diff makes the
      forward iterator reset all incomplete child iterators when `Seek()` or `Next()`
      is called.
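
      A rough sketch of that reset step, using placeholder names (the real ForwardIterator keeps its children in level-specific members and rebuilds them from the current SuperVersion):

          #include <memory>
          #include <vector>

          #include "rocksdb/iterator.h"

          // Before running the normal Seek() logic, drop any child iterator whose
          // status is Incomplete; an iterator made by NewErrorIterator() carries no
          // other state, so seeking it again would be a no-op.
          void ResetIncompleteChildren(
              std::vector<std::unique_ptr<rocksdb::Iterator>>* children) {
            for (auto& child : *children) {
              if (child != nullptr && child->status().IsIncomplete()) {
                child.reset();  // a real implementation would rebuild it here
              }
            }
          }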
      
      Test Plan: TEST_TMPDIR=/dev/shm/rocksdbtest ROCKSDB_TESTS=TailingIterator ./db_test
      
      Reviewers: igor, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: lovro, march, leveldb
      
      Differential Revision: https://reviews.facebook.net/D22575
    • Reduce recordTick overhead in compaction loop · 722d80c3
      Authored by Lei Jin
      Summary: It is too expensive to bump the ticker for every key/value pair.
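
      A hedged sketch of the general pattern (an illustrative function, not the actual compaction code): accumulate a local count inside the loop and bump the ticker once, instead of calling into Statistics for every key/value pair.

          #include <cstdint>

          #include "rocksdb/statistics.h"

          // `ticker` is whichever ticker the loop tracks; the point is that the
          // Statistics object is touched once per loop, not once per entry.
          void CountAndRecordOnce(rocksdb::Statistics* stats, uint32_t ticker,
                                  uint64_t num_entries) {
            uint64_t processed = 0;
            for (uint64_t i = 0; i < num_entries; ++i) {
              // ... process one key/value pair here ...
              ++processed;
            }
            if (stats != nullptr) {
              stats->recordTick(ticker, processed);  // single, batched ticker bump
            }
          }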
      
      Test Plan: make release
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22527
  2. 29 August 2014 (8 commits)
  3. 28 August 2014 (4 commits)
  4. 27 August 2014 (9 commits)
  5. 26 August 2014 (5 commits)
  6. 24 August 2014 (2 commits)
  7. 23 August 2014 (1 commit)
    • Fix concurrency issue in CompactionPicker · 42ea7952
      Authored by Igor Canadi
      Summary:
      I am currently working on a project that uses RocksDB. While debugging some perf issues, I came across an interesting compaction concurrency issue. Namely, I had 15 idle threads and a good compaction to do, but CompactionPicker returned "Compaction nothing to do". Here's how the internal stats looked:
      
          2014/08/22-08:08:04.551982 7fc7fc3f5700 ------- DUMPING STATS -------
          2014/08/22-08:08:04.552000 7fc7fc3f5700
          ** Compaction Stats [default] **
          Level   Files   Size(MB) Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) RW-Amp W-Amp Rd(MB/s) Wr(MB/s)  Rn(cnt) Rnp1(cnt) Wnp1(cnt) Wnew(cnt)  Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms)
          ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            L0     7/5        353   1.0      0.0     0.0      0.0       2.3      2.3    0.0   0.0      0.0      9.4        0         0         0         0        247        46    5.359       8.53          1 8526.25
            L1     2/2         86   1.3      2.6     1.9      0.7       2.6      1.9    2.7   1.3     24.3     24.0       39        19        71        52        109        11    9.938       0.00          0    0.00
            L2    26/0        833   1.3      5.7     1.7      4.0       5.2      1.2    6.3   3.0     15.6     14.2       47       112       147        35        373        44    8.468       0.00          0    0.00
            L3    12/0        505   0.1      0.0     0.0      0.0       0.0      0.0    0.0   0.0      0.0      0.0        0         0         0         0          0         0    0.000       0.00          0    0.00
           Sum    47/7       1778   0.0      8.3     3.6      4.6      10.0      5.4    8.1   4.4     11.6     14.1       86       131       218        87        728       101    7.212       8.53          1 8526.25
           Int     0/0          0   0.0      2.4     0.8      1.6       2.7      1.2   11.5   6.1     12.0     13.6       20        43        63        20        203        23    8.845       0.00          0    0.00
          Flush(GB): accumulative 2.266, interval 0.444
          Stalls(secs): 0.000 level0_slowdown, 0.000 level0_numfiles, 8.526 memtable_compaction, 0.000 leveln_slowdown_soft, 0.000 leveln_slowdown_hard
          Stalls(count): 0 level0_slowdown, 0 level0_numfiles, 1 memtable_compaction, 0 leveln_slowdown_soft, 0 leveln_slowdown_hard
      
          ** DB Stats **
          Uptime(secs): 336.8 total, 60.4 interval
          Cumulative writes: 61584000 writes, 6480589 batches, 9.5 writes per batch, 1.39 GB user ingest
          Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 GB written
          Interval writes: 11235257 writes, 1175050 batches, 9.6 writes per batch, 259.9 MB user ingest
          Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 MB written
      
      To see what happened, go here: https://github.com/facebook/rocksdb/blob/47b452cfcf9b1487d41f886a98bc0d6f95587e90/db/compaction_picker.cc#L430
      * The for loop started with level 1, because it had the worst score.
      * PickCompactionBySize on L429 returned nullptr because all of that level's files were being compacted.
      * ExpandWhileOverlapping(c) returned true (because that's what it does when it gets a nullptr!?).
      * The for loop then broke out, never trying compactions for level 2 :( :(

      This bug has been present since at least January. I have no idea how we didn't find it sooner.
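
      A simplified sketch of the intended control flow (the type and function stubs below are placeholders, not the real CompactionPicker code): the essential point is that a level whose files are all already being compacted should be skipped, not treated as the end of the search.

          #include <cstddef>
          #include <vector>

          struct Compaction {};  // stand-in for rocksdb's Compaction

          // Placeholder declarations mirroring the calls discussed above.
          Compaction* PickCompactionBySize(int level, double score);  // may return nullptr
          bool ExpandWhileOverlapping(Compaction* c);                  // false => unusable

          Compaction* PickLevelCompactionSketch(const std::vector<double>& scores,
                                                const std::vector<int>& levels) {
            Compaction* c = nullptr;
            for (size_t i = 0; i < scores.size(); ++i) {
              if (scores[i] < 1) break;  // remaining levels are below the threshold
              c = PickCompactionBySize(levels[i], scores[i]);
              if (c == nullptr || !ExpandWhileOverlapping(c)) {
                delete c;
                c = nullptr;
                continue;  // try the next level instead of giving up entirely
              }
              break;  // found a usable compaction
            }
            return c;
          }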
      
      Test Plan:
      Unit testing the compaction picker is hard. I tested this by running my service and observing L0->L1 and L2->L3 compactions in parallel. However, for the long term, I opened task #4968469. @yhchiang is currently refactoring CompactionPicker; hopefully the new version will be unit-testable ;)
      
      Here's how my compactions look like after the patch:
      
          2014/08/22-08:50:02.166699 7f3400ffb700 ------- DUMPING STATS -------
          2014/08/22-08:50:02.166722 7f3400ffb700
          ** Compaction Stats [default] **
          Level   Files   Size(MB) Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) RW-Amp W-Amp Rd(MB/s) Wr(MB/s)  Rn(cnt) Rnp1(cnt) Wnp1(cnt) Wnew(cnt)  Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms)
          ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            L0     8/5        404   1.5      0.0     0.0      0.0       4.3      4.3    0.0   0.0      0.0      9.6        0         0         0         0        463        88    5.260       0.00          0    0.00
            L1     2/2         60   0.9      4.8     3.9      0.8       4.7      3.9    2.4   1.2     23.9     23.6       80        23       131       108        204        19   10.747       0.00          0    0.00
            L2    23/3        697   1.0     11.6     3.5      8.1      10.9      2.8    6.4   3.1     17.7     16.6       95       242       317        75        669        92    7.268       0.00          0    0.00
            L3    58/14      2207   0.3      6.2     1.6      4.6       5.9      1.3    7.4   3.6     14.6     13.9       43       121       159        38        436        36   12.106       0.00          0    0.00
           Sum    91/24      3368   0.0     22.5     9.1     13.5      25.8     12.4   11.2   6.0     13.0     14.9      218       386       607       221       1772       235    7.538       0.00          0    0.00
           Int     0/0          0   0.0      3.2     0.9      2.3       3.6      1.3   15.3   8.0     12.4     13.7       24        66        89        23        266        27    9.838       0.00          0    0.00
          Flush(GB): accumulative 4.336, interval 0.444
          Stalls(secs): 0.000 level0_slowdown, 0.000 level0_numfiles, 0.000 memtable_compaction, 0.000 leveln_slowdown_soft, 0.000 leveln_slowdown_hard
          Stalls(count): 0 level0_slowdown, 0 level0_numfiles, 0 memtable_compaction, 0 leveln_slowdown_soft, 0 leveln_slowdown_hard
      
          ** DB Stats **
          Uptime(secs): 577.7 total, 60.1 interval
          Cumulative writes: 116960736 writes, 11966220 batches, 9.8 writes per batch, 2.64 GB user ingest
          Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 GB written
          Interval writes: 11643735 writes, 1206136 batches, 9.7 writes per batch, 269.2 MB user ingest
          Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 MB written
      
      Yay for concurrent L0->L1 and L2->L3 compactions!
      
      Reviewers: sdong, yhchiang, ljin
      
      Reviewed By: yhchiang
      
      Subscribers: yhchiang, leveldb
      
      Differential Revision: https://reviews.facebook.net/D22305
  8. 22 August 2014 (2 commits)
  9. 21 August 2014 (6 commits)
    • Implement Prepare method in CuckooTableReader · 08be7f52
      Authored by Radheshyam Balasundaram
      Summary:
      - Implement the Prepare method (a rough sketch of the idea follows this list).
      - Rewrite the performance tests in cuckoo_table_reader_test to write a new file only if one doesn't already exist.
      - Add performance tests for batch lookup along with prefetching.
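
      A minimal sketch of what a batched-lookup Prepare step typically does (field and function names here are assumptions, not the actual reader): compute each candidate bucket for the key and issue a prefetch hint, so the subsequent lookup finds the memory already in flight.

          #include <cstddef>
          #include <cstdint>

          struct IllustrativeCuckooReader {
            const char* file_data;   // mmapped table contents
            uint64_t bucket_length;  // bytes per bucket
            uint64_t bucket_mask;    // num_buckets - 1 (power of two)
            uint32_t num_hash_func;

            // Placeholder hash; the real reader uses its own hash functions.
            uint64_t HashOf(const char* key, size_t len, uint32_t idx) const;

            void Prepare(const char* key, size_t len) const {
              for (uint32_t i = 0; i < num_hash_func; ++i) {
                uint64_t bucket = HashOf(key, len, i) & bucket_mask;
                // GCC/Clang prefetch hint: read access, high temporal locality.
                __builtin_prefetch(file_data + bucket * bucket_length, 0, 3);
              }
            }
          };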
      
      Test Plan:
      ./cuckoo_table_reader_test --enable_perf
      Results (we would get better results if we used an int64 comparator instead of the string comparator; TBD in future diffs):
      With 100000000 items and hash table ratio 0.500000, number of hash functions used: 2.
      Time taken per op is 0.208us (4.8 Mqps) with batch size of 0
      With 100000000 items and hash table ratio 0.500000, number of hash functions used: 2.
      Time taken per op is 0.182us (5.5 Mqps) with batch size of 10
      With 100000000 items and hash table ratio 0.500000, number of hash functions used: 2.
      Time taken per op is 0.161us (6.2 Mqps) with batch size of 25
      With 100000000 items and hash table ratio 0.500000, number of hash functions used: 2.
      Time taken per op is 0.161us (6.2 Mqps) with batch size of 50
      With 100000000 items and hash table ratio 0.500000, number of hash functions used: 2.
      Time taken per op is 0.163us (6.1 Mqps) with batch size of 100
      
      With 100000000 items and hash table ratio 0.600000, number of hash functions used: 3.
      Time taken per op is 0.252us (4.0 Mqps) with batch size of 0
      With 100000000 items and hash table ratio 0.600000, number of hash functions used: 3.
      Time taken per op is 0.192us (5.2 Mqps) with batch size of 10
      With 100000000 items and hash table ratio 0.600000, number of hash functions used: 3.
      Time taken per op is 0.195us (5.1 Mqps) with batch size of 25
      With 100000000 items and hash table ratio 0.600000, number of hash functions used: 3.
      Time taken per op is 0.191us (5.2 Mqps) with batch size of 50
      With 100000000 items and hash table ratio 0.600000, number of hash functions used: 3.
      Time taken per op is 0.194us (5.1 Mqps) with batch size of 100
      
      With 100000000 items and hash table ratio 0.750000, number of hash functions used: 3.
      Time taken per op is 0.228us (4.4 Mqps) with batch size of 0
      With 100000000 items and hash table ratio 0.750000, number of hash functions used: 3.
      Time taken per op is 0.185us (5.4 Mqps) with batch size of 10
      With 100000000 items and hash table ratio 0.750000, number of hash functions used: 3.
      Time taken per op is 0.186us (5.4 Mqps) with batch size of 25
      With 100000000 items and hash table ratio 0.750000, number of hash functions used: 3.
      Time taken per op is 0.189us (5.3 Mqps) with batch size of 50
      With 100000000 items and hash table ratio 0.750000, number of hash functions used: 3.
      Time taken per op is 0.188us (5.3 Mqps) with batch size of 100
      
      With 100000000 items and hash table ratio 0.900000, number of hash functions used: 3.
      Time taken per op is 0.325us (3.1 Mqps) with batch size of 0
      With 100000000 items and hash table ratio 0.900000, number of hash functions used: 3.
      Time taken per op is 0.196us (5.1 Mqps) with batch size of 10
      With 100000000 items and hash table ratio 0.900000, number of hash functions used: 3.
      Time taken per op is 0.199us (5.0 Mqps) with batch size of 25
      With 100000000 items and hash table ratio 0.900000, number of hash functions used: 3.
      Time taken per op is 0.196us (5.1 Mqps) with batch size of 50
      With 100000000 items and hash table ratio 0.900000, number of hash functions used: 3.
      Time taken per op is 0.209us (4.8 Mqps) with batch size of 100
      
      Reviewers: sdong, yhchiang, igor, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22167
    • Fix the error in c_test.c · 47b452cf
      Authored by Yueh-Hsuan Chiang
      Summary:
      Fix the error in c_test.c
      
      Test Plan:
      make c_test
      ./c_test
    • Add missing implementation of SanitizeDBOptions in simple_table_db_test.cc · 562b7a1f
      Authored by Yueh-Hsuan Chiang
      Summary:
      Add the missing implementation of SanitizeDBOptions in simple_table_db_test.cc
      
      Test Plan:
      make simple_table_db_test
    • Improve Options sanitization and add MmapReadRequired() to TableFactory · 63a2215c
      Authored by Yueh-Hsuan Chiang
      Summary:
      Currently, PlainTable requires mmap reads.  When PlainTable is used but
      allow_mmap_reads is not set, RocksDB fails during flush.
      
      This diff improves Options sanitization and adds MmapReadRequired() to
      TableFactory.
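
      A sketch of the sanitization idea (the checking function below is hypothetical; only MmapReadRequired() on TableFactory comes from this diff): reject the incompatible combination up front instead of failing later in flush.

          #include "rocksdb/options.h"
          #include "rocksdb/status.h"
          #include "rocksdb/table.h"

          // Hypothetical helper illustrating the check; the actual sanitization is
          // wired into RocksDB's option handling rather than a free function.
          rocksdb::Status CheckMmapCompatibility(const rocksdb::Options& options) {
            if (options.table_factory != nullptr &&
                options.table_factory->MmapReadRequired() &&  // added by this diff
                !options.allow_mmap_reads) {
              return rocksdb::Status::NotSupported(
                  "The table factory requires mmap reads, but allow_mmap_reads is off");
            }
            return rocksdb::Status::OK();
          }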
      
      Test Plan:
      export ROCKSDB_TESTS=PlainTableOptionsSanitizeTest
      make db_test -j32
      ./db_test
      
      Reviewers: sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: you, leveldb
      
      Differential Revision: https://reviews.facebook.net/D21939
    • Eliminate VersionSet memory leak · e173bf9c
      Authored by Jonah Cohen
      Summary:
      ManifestDumpCommand::DoCommand was allocating a VersionSet and never
      freeing it.
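
      One common way to guarantee that cleanup (a sketch of the ownership pattern under that assumption, not necessarily the literal patch, which may simply delete the object explicitly):

          #include <memory>

          struct VersionSetLike {};  // stand-in for rocksdb's VersionSet

          void DoCommandSketch() {
            // A scoped owner releases the VersionSet on every exit path, so an early
            // return no longer leaks the allocation the way a bare `new` did.
            std::unique_ptr<VersionSetLike> versions(new VersionSetLike());
            // ... read and dump the MANIFEST using versions.get() ...
          }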
      
      Test Plan: make
      
      Reviewers: igor
      
      Reviewed By: igor
      
      Differential Revision: https://reviews.facebook.net/D22221
    • Revert the unintended change that made DestroyDB() stop cleaning up info logs · 10720a55
      Authored by sdong
      Summary: A previous change unintentionally made DestroyDB() keep info logs under the DB directory. Revert that unintended change.
      
      Test Plan: Add a unit test case to verify it.
      
      Reviewers: ljin, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22209