1. 18 Jun, 2013 3 commits
  2. 15 Jun, 2013 4 commits
  3. 14 Jun, 2013 2 commits
  4. 13 Jun, 2013 3 commits
    • H
      [RocksDB] Sync file to disk incrementally · 778e1790
      Haobo Xu committed
      Summary:
      During compaction, we sync the output files only after they are fully written out. This causes unnecessary blocking of the compaction thread and bursty write traffic.
      This diff simply asks the OS to sync data incrementally, in the background, as it is written. The hope is that, at the final sync, most of the data is already on disk and we block less on the sync call. Thus, each compaction runs faster and we can use fewer compaction threads to saturate IO.
      In addition, the write traffic will be smoothed out, hopefully reducing the IO P99 latency too.
      
      Some quick tests show a 10~20% improvement in per-thread compaction throughput. Combined with posix advice on compaction reads, just 5 threads are enough to almost saturate the udb flash bandwidth for the 800-byte write-only benchmark.
      What's more promising is that, with saturated IO, iostat shows the average wait time is actually smoother and much smaller.
      For the 800-byte write-only test:
      Before the change: await oscillates between 3ms and 10ms
      After the change: await ranges from 1ms to 3ms
      
      Will test against a read-modify-write workload too, to see whether the high read P99 latency can be resolved.
      
      Will introduce a parameter to control the sync interval in a follow up diff after cleaning up EnvOptions.
      
      Test Plan: make check; db_bench; db_stress
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11115
      778e1790
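The incremental-sync idea in the commit above can be illustrated with a small sketch. It assumes a Linux system and uses `sync_file_range` as the background hint; the commit message does not say which system call the diff uses, so that choice, the helper name `WriteWithIncrementalSync`, and its parameters are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper, not RocksDB code. Writes `data` to a freshly opened
// file (starting at offset 0) in chunks. After every `sync_interval` bytes
// it asks the kernel to start writeback of the new range, then performs one
// blocking sync at the end. Returns true if all writes and the final sync
// succeeded.
bool WriteWithIncrementalSync(int fd, const std::string& data,
                              size_t chunk_size, size_t sync_interval) {
  assert(chunk_size > 0 && sync_interval > 0);
  size_t written = 0;      // bytes written so far
  size_t last_synced = 0;  // end of the last range handed to the kernel
  while (written < data.size()) {
    const size_t n = std::min(chunk_size, data.size() - written);
    const ssize_t r = write(fd, data.data() + written, n);
    if (r <= 0) return false;
    written += static_cast<size_t>(r);
    if (written - last_synced >= sync_interval) {
#ifdef __linux__
      // Linux-only hint: start asynchronous writeback of the new range.
      // The result is ignored because the final fsync() below is what
      // actually guarantees durability.
      (void)sync_file_range(fd, static_cast<off_t>(last_synced),
                            static_cast<off_t>(written - last_synced),
                            SYNC_FILE_RANGE_WRITE);
#endif
      last_synced = written;
    }
  }
  return fsync(fd) == 0;
}
```

A caller would pass the descriptor of a newly created output file, for example `WriteWithIncrementalSync(fd, contents, 4096, 1 << 20)`. The interval is a tuning knob: smaller values smooth the write traffic more but issue more system calls.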
    • D
      [Rocksdb] [Multiget] Introduced multiget into db_bench · 4985a9f7
      Deon Nicholas committed
      Summary:
      Preliminary! Introduced the --use_multiget=1 and --keys_per_multiget=n
      flags for db_bench. Also updated and tested the ReadRandom() method
      to include an option to use multiget. By default,
      keys_per_multiget=100.
      
      Preliminary tests suggest that multiget is at least 1.25x faster per
      key than regular get.
      
      Will continue adding Multiget for ReadMissing, ReadHot,
      RandomWithVerify, and ReadRandomWriteRandom soon. Will also think
      about ways to better verify benchmarks.
      
      Test Plan:
      1. make db_bench
      2. ./db_bench --benchmarks=fillrandom
      3. ./db_bench --benchmarks=readrandom --use_existing_db=1
      	      --use_multiget=1 --threads=4 --keys_per_multiget=100
      4. ./db_bench --benchmarks=readrandom --use_existing_db=1
      	      --threads=4
      5. Verify ops/sec (and 1000000 of 1000000 keys found)
      
      Reviewers: haobo, MarkCallaghan, dhruba
      
      Reviewed By: MarkCallaghan
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11127
      4985a9f7
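To make the batching behind `--keys_per_multiget` concrete, here is a minimal sketch of a read loop that issues lookups in groups. It uses an in-memory `std::unordered_map` as a stand-in for the database; `Store`, `MultiGetSketch`, and `ReadInBatches` are hypothetical names and do not reflect the actual db_bench or RocksDB code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// In-memory stand-in for the database; this is not the rocksdb::DB API.
using Store = std::unordered_map<std::string, std::string>;

// Stand-in for a batched lookup: one call resolves keys[begin, end) and
// returns how many of them were found.
size_t MultiGetSketch(const Store& db, const std::vector<std::string>& keys,
                      size_t begin, size_t end) {
  size_t found = 0;
  for (size_t i = begin; i < end; ++i) found += db.count(keys[i]);
  return found;
}

// Reads `keys` in groups of `keys_per_multiget` and returns the number of
// keys found. The number of batched calls is reported through `calls`.
size_t ReadInBatches(const Store& db, const std::vector<std::string>& keys,
                     size_t keys_per_multiget, size_t* calls) {
  assert(keys_per_multiget > 0 && calls != nullptr);
  size_t found = 0;
  *calls = 0;
  for (size_t begin = 0; begin < keys.size(); begin += keys_per_multiget) {
    const size_t end = std::min(begin + keys_per_multiget, keys.size());
    found += MultiGetSketch(db, keys, begin, end);
    ++*calls;
  }
  return found;
}
```

With 12 keys and a batch size of 5, the loop makes 3 calls instead of 12, which is where a per-key saving would come from if each call has a fixed overhead.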
    • H
      [RocksDB] cleanup EnvOptions · bdf10859
      Haobo Xu committed
      Summary:
      This diff simplifies EnvOptions by treating it as POD, similar to Options.
      - virtual functions are removed and member fields are accessed directly.
      - StorageOptions is removed.
      - Options.allow_readahead and Options.allow_readahead_compactions are deprecated.
      - Unused global variables are removed: useOsBuffer, useFsReadAhead, useMmapRead, useMmapWrite
      
      Test Plan: make check; db_stress
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11175
      bdf10859
  5. 12 Jun, 2013 1 commit
    • D
      Completed the implementation and test cases for Redis API. · 5679107b
      Deon Nicholas committed
      Summary:
      Completed the implementation for the Redis API for Lists.
      The Redis API uses rocksdb as a backend to persistently
      store maps from key->list. It supports basic operations
      for appending, inserting, pushing, popping, and accessing
      a list, given its key.
      
      Test Plan:
        - Compile with: make redis_test
        - Test with: ./redis_test
        - Run all unit tests (for all rocksdb) with: make all check
        - To use an interactive REDIS client use: ./redis_test -m
        - To clean the database before use:       ./redis_test -m -d
      
      Reviewers: haobo, dhruba, zshao
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10833
      5679107b
  6. 11 Jun, 2013 6 commits
    • D
      Do not submit multiple simultaneous seek-compaction requests. · e673d5d2
      Dhruba Borthakur committed
      Summary:
      The code was such that if multi-threaded compactions as well
      as seek compaction are enabled, then it submits multiple
      compaction requests for the same range of keys. This causes
      extraneous sst files to accumulate at various levels.
      
      Test Plan:
      I am not able to write a very good unit test for this one,
      but I can easily reproduce this bug with 'db_stress' using the
      following options.
      
      batch=1;maxk=100000000;ops=100000000;ro=0;fm=2;bpl=10485760;of=500000; wbn=3; mbc=20; mb=2097152; wbs=4194304; dds=1; sync=0;  t=32; bs=16384; cs=1048576; of=500000; ./db_stress --disable_seek_compaction=0 --mmap_read=0 --threads=$t --block_size=$bs --cache_size=$cs --open_files=$of --verify_checksum=1 --db=/data/mysql/leveldb/dbstress.dir --sync=$sync --disable_wal=1 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --target_file_size_multiplier=$fm --max_write_buffer_number=$wbn --max_background_compactions=$mbc --max_bytes_for_level_base=$bpl --reopen=$ro --ops_per_thread=$ops --max_key=$maxk --test_batches_snapshots=$batch
      
      Reviewers: leveldb, emayanke
      
      Reviewed By: emayanke
      
      Differential Revision: https://reviews.facebook.net/D11055
      e673d5d2
    • M
      Make Write API work for TTL databases · 3c35eda9
      Mayank Agarwal committed
      Summary: Added logic to make another WriteBatch with Timestamps during the Write function execution in TTL class. Also expanded the ttl_test to test for it. Have done nothing for Merge for now.
      
      Test Plan: make ttl_test;./ttl_test
      
      Reviewers: haobo, vamsi, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10827
      3c35eda9
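The step of building a second batch with timestamps can be pictured roughly as follows. This is only a sketch: the `Batch` type is a simplified stand-in for a write batch, and appending a 4-byte native-endian timestamp to each value is an assumption made for illustration, since the commit message does not describe the encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a write batch: a list of (key, value) puts.
using Batch = std::vector<std::pair<std::string, std::string>>;

// Builds a second batch in which every value carries a timestamp suffix.
// `now` is taken to be seconds since the epoch. The 4-byte native-endian
// suffix is an illustrative assumption, not a documented format.
Batch WithTimestamps(const Batch& updates, int32_t now) {
  assert(now >= 0);
  char ts[sizeof(now)];
  std::memcpy(ts, &now, sizeof(now));
  Batch stamped;
  stamped.reserve(updates.size());
  for (const auto& kv : updates) {
    stamped.emplace_back(kv.first, kv.second + std::string(ts, sizeof(ts)));
  }
  return stamped;
}
```

The original batch is left untouched, which matches the "make another WriteBatch" wording: the caller's data is not modified, and only the stamped copy is written.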
    • D
      Fix reference to freed memory in earlier commit. · 1b69f1e5
      Dhruba Borthakur committed
      Summary: Fix the reference to freed memory introduced by the earlier commit https://reviews.facebook.net/D11181
      
      Test Plan: make check
      
      Reviewers: haobo, sheki
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11193
      1b69f1e5
    • A
      [Rocksdb] fix wrong assert · 4a8554d5
      Abhishek Kona committed
      Summary: The assert in D11145 was wrong and broke the build.
      
      Test Plan: make db_bench; run it
      
      Reviewers: dhruba, haobo, emayanke
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11187
      4a8554d5
    • D
      Print name of user comparator in LOG. · c5de1b93
      Dhruba Borthakur committed
      Summary:
      The current code prints the name of the InternalKeyComparator
      in the log file. We would also like to print the name of the
      user-specified comparator for easier debugging.
      
      Test Plan: make check
      
      Reviewers: sheki
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11181
      c5de1b93
    • A
      [rocksdb] names for all metrics provided in statistics.h · a4913c51
      Abhishek Kona committed
      Summary: Provide a map from histograms and tickers to their string names. Fb303 libraries can use this to provide the mapping, so we will not have to duplicate the code during release.
      
      Test Plan: db_bench with statistics=1
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11145
      a4913c51
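A single shared name table of the kind this commit describes could look like the sketch below. The enum values, the table name `kTickerNames`, and the name strings are illustrative assumptions, not the contents of statistics.h.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative subset of counters; values and strings are assumptions.
enum Ticker { kBlockCacheMiss, kBlockCacheHit, kTickerMax };

// One table shared by every consumer, so each name is defined only once.
const std::vector<std::pair<Ticker, std::string>> kTickerNames = {
    {kBlockCacheMiss, "rocksdb.block.cache.miss"},
    {kBlockCacheHit, "rocksdb.block.cache.hit"},
};

// Returns the exported name for `t`, or an empty string if it is unmapped.
std::string TickerName(Ticker t) {
  for (const auto& entry : kTickerNames) {
    if (entry.first == t) return entry.second;
  }
  return "";
}
```

An exporter can iterate over the table instead of hard-coding names, and a check that the table size equals `kTickerMax` catches a counter added without a name.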
  7. 10 Jun, 2013 2 commits
    • M
      Max_mem_compaction_level can have maximum value of num_levels-1 · 184343a0
      Mayank Agarwal committed
      Summary:
      Without this, files could be written out to a level greater than the maximum possible level, which is the source of the segfaults that wormhole was getting. The sequence of steps was:
      1. WriteLevel0Table was called when the memtable was to be flushed to a file.
      2. PickLevelForMemTableOutput was called to determine the level to which this file should be pushed.
      3. PickLevelForMemTableOutput returned a wrong result because max_mem_compaction_level was equal to 2 even when num_levels was equal to 0.
      The fix, which re-initializes max_mem_compaction_level based on the num_levels passed in, seems correct.
      
      Test Plan: make all check; also made a dummy file to mimic the wormhole-file behaviour that was causing the segfaults, and found that the same segfault occurs without this change but not with it.
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11157
      184343a0
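The rule in the commit title, that max_mem_compaction_level can be at most num_levels-1, amounts to a clamp. The following is a minimal sketch of that rule; the helper name `ClampMaxMemCompactionLevel` is hypothetical, and flooring the result at 0 for very small `num_levels` is an assumption of this sketch rather than something the commit states.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: limits the configured level to the highest level
// that exists (num_levels - 1). The result is floored at 0, which is an
// assumption made here so that the return value is never negative.
int ClampMaxMemCompactionLevel(int max_mem_compaction_level, int num_levels) {
  assert(max_mem_compaction_level >= 0);
  return std::max(0, std::min(max_mem_compaction_level, num_levels - 1));
}
```

For example, a configured value of 2 stays 2 with seven levels, becomes 1 with two levels, and becomes 0 with a single level, so a flushed file can never be assigned to a level that does not exist.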
    • M
      Modifying options to db_stress when it is run with db_crashtest · 7a6bd8e9
      Mayank Agarwal committed
      Summary: These extra options caught some bugs. Will be run via Jenkins now with the crash_test
      
      Test Plan: ./make crashtest
      
      Reviewers: dhruba, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11151
      7a6bd8e9
  8. 08 Jun, 2013 3 commits
  9. 07 Jun, 2013 1 commit
  10. 06 Jun, 2013 4 commits
  11. 05 Jun, 2013 1 commit
  12. 04 Jun, 2013 2 commits
    • M
      Improve output for GetProperty('leveldb.stats') · d9f538e1
      Mark Callaghan committed
      Summary:
      Display separate values for read, write & total compaction IO.
      Display compaction amplification and write amplification.
      Add similar values for the period since the last call to GetProperty. Results since the server started
      are reported as "cumulative" stats. Results since the last call to GetProperty are reported as
      "interval" stats.
      
      Level  Files Size(MB) Time(sec)  Read(MB) Write(MB)    Rn(MB)  Rnp1(MB)  Wnew(MB) Amplify Read(MB/s) Write(MB/s)      Rn     Rnp1     Wnp1     NewW    Count  Ln-stall
      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
        0        7       13        21         0       211         0         0       211     0.0       0.0        10.1        0        0        0        0      113       0.0
        1       79      157        88       993       989       198       795       194     9.0      11.3        11.2      106      405      502       97       14       0.0
        2       19       36         5        63        63        37        27        36     2.4      12.3        12.2       19       14       32       18       12       0.0
      >>>>>>>>>>>>>>>>>>>>>>>>> text below is new and/or reformatted
      Uptime(secs): 122.2 total, 0.9 interval
      Compaction IO cumulative (GB): 0.21 new, 1.03 read, 1.23 write, 2.26 read+write
      Compaction IO cumulative (MB/sec): 1.7 new, 8.6 read, 10.3 write, 19.0 read+write
      Amplification cumulative: 6.0 write, 11.0 compaction
      Compaction IO interval (MB): 5.59 new, 0.00 read, 5.59 write, 5.59 read+write
      Compaction IO interval (MB/sec): 6.5 new, 0.0 read, 6.5 write, 6.5 read+write
      Amplification interval: 1.0 write, 1.0 compaction
      >>>>>>>>>>>>>>>>>>>>>>>> text above is new and/or reformatted
      Stalls(secs): 90.574 level0_slowdown, 0.000 level0_numfiles, 10.165 memtable_compaction, 0.000 leveln_slowdown
      
      Test Plan:
      make check, run db_bench
      
      Reviewers: haobo
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11049
      d9f538e1
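The amplification figures in the sample output above can be reproduced from the IO totals. The formulas in this sketch are inferred from the sample numbers, not taken from the RocksDB source: write amplification as bytes written divided by new bytes, and compaction amplification as bytes read plus bytes written divided by new bytes. The type and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Compaction IO totals in MB for one reporting period.
struct CompactionIO {
  double new_mb;    // newly ingested data
  double read_mb;   // bytes read by compaction
  double write_mb;  // bytes written by flushes and compaction
};

// Inferred formula: total bytes written per byte of new data.
double WriteAmplification(const CompactionIO& io) {
  assert(io.new_mb > 0);
  return io.write_mb / io.new_mb;
}

// Inferred formula: total compaction IO per byte of new data.
double CompactionAmplification(const CompactionIO& io) {
  assert(io.new_mb > 0);
  return (io.read_mb + io.write_mb) / io.new_mb;
}
```

Summing the per-level columns of the table gives 1056 MB read and 1263 MB written against 211 MB of new data, so these formulas yield about 6.0 and 11.0, matching the cumulative line. The interval line (5.59 new, 0.00 read, 5.59 write) gives 1.0 for both.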
    • H
      [RocksDB] Add score column to leveldb.stats · 2b1fb5b0
      Haobo Xu committed
      Summary: Added the 'score' column to the compaction stats output, which shows the level's total size divided by its target size. This could be useful when monitoring compaction decisions.
      
      Test Plan: make check; db_bench
      
      Reviewers: dhruba
      
      CC: leveldb, MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D11025
      2b1fb5b0
  13. 02 Jun, 2013 1 commit
    • H
      [RocksDB] Introduce Fast Mutex option · d897d33b
      Haobo Xu committed
      Summary:
      This diff adds an option to specify whether PTHREAD_MUTEX_ADAPTIVE_NP will be enabled for the rocksdb single big kernel lock. db_bench also has this option now.
      Quickly tested an 8-thread, CPU-bound, 100-byte random read workload.
      No fast mutex: ~750k ops/s
      With fast mutex: ~880k ops/s
      
      Test Plan: make check; db_bench; db_stress
      
      Reviewers: dhruba
      
      CC: MarkCallaghan, leveldb
      
      Differential Revision: https://reviews.facebook.net/D11031
      d897d33b
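For readers unfamiliar with PTHREAD_MUTEX_ADAPTIVE_NP, the sketch below shows how a mutex of that type can be requested through a mutex attribute. It assumes glibc, where this non-portable type is available; the helper name `InitMutex` and the `adaptive` flag are hypothetical and are not the option this diff adds.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical helper: initializes `mu`, requesting the adaptive mutex type
// when `adaptive` is true and the C library is glibc (the only library this
// sketch assumes provides PTHREAD_MUTEX_ADAPTIVE_NP). Returns 0 on success,
// otherwise the pthread error code.
int InitMutex(pthread_mutex_t* mu, bool adaptive) {
  assert(mu != nullptr);
  pthread_mutexattr_t attr;
  int rc = pthread_mutexattr_init(&attr);
  if (rc != 0) return rc;
#ifdef __GLIBC__
  if (adaptive) {
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
  }
#else
  (void)adaptive;  // no adaptive type available: fall back to the default
#endif
  if (rc == 0) rc = pthread_mutex_init(mu, &attr);
  pthread_mutexattr_destroy(&attr);
  return rc;
}
```

An adaptive mutex spins briefly before sleeping when contended, which tends to help short critical sections on multi-core machines. That is consistent with the CPU-bound read test reported in the commit, though the gain depends on the workload.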
  14. 31 May, 2013 1 commit
    • H
      [RocksDB] [Performance] Allow different posix advice to be applied to the same table file · ab8d2f6a
      Haobo Xu committed
      Summary:
      The current posix advice implementation ties the access pattern hint to the creation of a file.
      It is not possible to apply different advice for different access patterns (random get vs compaction read)
      without keeping two open files for the same table. This patch extends the RandomAccessFile interface
      to accept a new access hint at any time. In particular, we are able to set different access hints on the same
      table file based on when and how the file is used.
      Two options are added to set the access hint: one applied after the file is first opened and one applied when the file is being
      compacted.
      
      Test Plan: make check; db_stress; db_bench
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: MarkCallaghan, leveldb
      
      Differential Revision: https://reviews.facebook.net/D10905
      ab8d2f6a
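Changing the advice on an already-open file can be sketched with `posix_fadvise`, which accepts a new hint on the same descriptor at any time. This assumes a platform that provides `posix_fadvise` (for example Linux); the `AccessHint` enum and `SetAccessHint` helper are hypothetical names, not the interface added by this patch.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical hint names for this sketch.
enum class AccessHint { kNormal, kRandom, kSequential };

// Applies a new access-pattern hint to an already-open descriptor.
// Returns 0 on success or an error number, as posix_fadvise does.
int SetAccessHint(int fd, AccessHint hint) {
  assert(fd >= 0);
  int advice = POSIX_FADV_NORMAL;
  switch (hint) {
    case AccessHint::kNormal:     advice = POSIX_FADV_NORMAL;     break;
    case AccessHint::kRandom:     advice = POSIX_FADV_RANDOM;     break;
    case AccessHint::kSequential: advice = POSIX_FADV_SEQUENTIAL; break;
  }
  // Offset 0 with length 0 applies the advice to the whole file.
  return posix_fadvise(fd, 0, 0, advice);
}
```

A table file could be opened with `kRandom` for point lookups and switched to `kSequential` when a compaction starts reading it, without opening the file a second time.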
  15. 30 May, 2013 1 commit
  16. 29 May, 2013 2 commits
  17. 25 May, 2013 3 commits