1. 06 Jan 2015, 5 commits
    • L
      benchmark.sh won't run through all tests properly if one specifies wal_dir to... · 9d5bd411
      Committed by Leonidas Galanis
      benchmark.sh won't run through all tests properly if one specifies wal_dir to be different from the db directory.
      
      Summary:
      A command line like this to run all the tests:
      source benchmark.config.sh && nohup ./benchmark.sh 'bulkload,fillseq,overwrite,filluniquerandom,readrandom,readwhilewriting'
      where
      benchmark.config.sh is:
      export DB_DIR=/data/mysql/rocksdata
      export WAL_DIR=/txlogs/rockswal
      export OUTPUT_DIR=/root/rocks_benchmarking/output
      
      will fail for the tests that need a new DB.
      
      Also: 1) set disable_data_sync=0, and 2) add a debug mode to run through all the tests more quickly.
      
      Test Plan: run ./benchmark.sh 'debug,bulkload,fillseq,overwrite,filluniquerandom,readrandom,readwhilewriting' and verify that there are no complaints about WAL dir not being empty.
      
      Reviewers: sdong, yhchiang, rven, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D30909
    • I
      Deprecating skip_log_error_on_recovery · 62ad0a9b
      Committed by Igor Canadi
      Summary:
      Since https://reviews.facebook.net/D16119, we ignore partial tailing writes. Because of that, we no longer need skip_log_error_on_recovery.
      
      The documentation says "Skip log corruption error on recovery (If client is ok with losing most recent changes)", while the option actually ignores any corruption of the WAL (not just the most recent changes). This is very dangerous and can lead to DB inconsistencies. The option was originally set up to ignore partial tailing writes, which we now do automatically (after D16119). I dug up old task t2416297, which confirms my findings.
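
The danger can be made concrete with a toy replay loop. This is only an illustrative sketch (`WalRecord` and both function names are hypothetical, not RocksDB code): point-in-time recovery stops at the first corrupt record and loses only the tail, while the behavior behind skip_log_error_on_recovery skips a corrupt record in the middle of the log and keeps replaying later writes, leaving a hole.

```cpp
#include <vector>

// Hypothetical WAL record: a sequence number plus a corruption flag.
struct WalRecord {
  int seq;
  bool corrupt;
};

// Safe behavior: replay until the first corrupt record, then stop.
// Only the most recent (tail) changes can be lost.
std::vector<int> RecoverStopAtCorruption(const std::vector<WalRecord>& wal) {
  std::vector<int> applied;
  for (const WalRecord& r : wal) {
    if (r.corrupt) break;  // everything after this point is discarded
    applied.push_back(r.seq);
  }
  return applied;
}

// What skip_log_error_on_recovery effectively did: skip any corrupt
// record and keep going. Records *after* the corruption are replayed
// even though an earlier update was dropped, so the recovered state
// can be inconsistent.
std::vector<int> RecoverSkipCorruption(const std::vector<WalRecord>& wal) {
  std::vector<int> applied;
  for (const WalRecord& r : wal) {
    if (r.corrupt) continue;  // dangerous: silently drops a middle record
    applied.push_back(r.seq);
  }
  return applied;
}
```

On a log with records 1, 2, 3 where record 2 is corrupt, the first policy applies only record 1, while the second applies 1 and 3, skipping the middle write.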
      
      Test Plan: There were actually no tests that verified correct behavior of skip_log_error_on_recovery.
      
      Reviewers: yhchiang, rven, dhruba, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D30603
    • I
      Fail DB::Open() on WAL corruption · d7b4bb62
      Committed by Igor Canadi
      Summary:
      This is a serious bug. If paranoid_checks == true and the WAL is corrupted, we don't fail DB::Open(). I went digging through the history, and it seems we've been doing this for a very long time.
      
      I found this when investigating t5852041.
      
      Test Plan: Added unit test to verify correct behavior.
      
      Reviewers: yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D30597
    • I
      Merge pull request #449 from robertabcd/improve-backupable · 9619081d
      Committed by Igor Canadi
      Improve backupable db performance on loading BackupMeta
  2. 05 Jan 2015, 1 commit
  3. 04 Jan 2015, 2 commits
  4. 31 Dec 2014, 2 commits
    • S
      Fix CLANG build for db_bench · e9ca3581
      Committed by sdong
      Summary: The CLANG build was broken by a recent change in db_bench. Fix it.
      
      Test Plan: Build db_bench using CLANG.
      
      Reviewers: rven, igor, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D30801
    • Y
      Add structures for exposing thread events and operations. · bf287b76
      Committed by Yueh-Hsuan Chiang
      Summary:
      Add structures for exposing events and operations. An event describes
      a high-level action of a thread, such as doing a compaction or a
      flush, while an operation describes a lower-level action of a thread,
      such as reading or writing an SST table or waiting on a mutex.
      Events and operations are designed to be independent; one thread is
      typically involved in one event and one operation at a time.
      
      Code instrumentation will be in a separate diff.
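
The event/operation split described above can be pictured with two independent enums plus a per-thread status record. This is a sketch with illustrative names only, not the structures from the diff:

```cpp
#include <cstdint>
#include <string>

// High-level event: what job the thread is working on.
enum class ThreadEvent { kUnknown, kCompaction, kFlush };

// Lower-level operation: what the thread is doing right now.
enum class ThreadOperation { kUnknown, kReadSstTable, kWriteSstTable, kWaitMutex };

// Events and operations are independent: one thread is typically
// involved in one event and one operation at a time.
struct ThreadStatus {
  uint64_t thread_id = 0;
  ThreadEvent event = ThreadEvent::kUnknown;
  ThreadOperation operation = ThreadOperation::kUnknown;
};

// Human-readable name for an event, e.g. for a thread-list dump.
std::string DescribeEvent(ThreadEvent e) {
  switch (e) {
    case ThreadEvent::kCompaction: return "compaction";
    case ThreadEvent::kFlush: return "flush";
    default: return "unknown";
  }
}
```

Because the two enums are independent, a flush thread can report `kFlush` as its event while cycling through `kWriteSstTable` and `kWaitMutex` as its operation.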
      
      Test Plan:
      Add unit-tests in thread_list_test
      make dbg -j32
      ./thread_list_test
      export ROCKSDB_TESTS=ThreadList
      ./db_test
      
      Reviewers: ljin, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: rven, jonahcohen, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D29781
  5. 25 Dec 2014, 1 commit
  6. 24 Dec 2014, 6 commits
  7. 23 Dec 2014, 6 commits
  8. 22 Dec 2014, 4 commits
    • I
      Only execute flush from compaction if max_background_flushes = 0 · 4fd26f28
      Committed by Igor Canadi
      Summary: As title. We shouldn't need to execute flush from compaction if there are dedicated threads doing flushes.
      
      Test Plan: make check
      
      Reviewers: rven, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D30579
    • I
      Speed up FindObsoleteFiles() · 0acc7388
      Committed by Igor Canadi
      Summary:
      There are two versions of FindObsoleteFiles():
      * full scan, which is executed every 6 hours (and it's terribly slow)
      * no full scan, which is executed every time a background process finishes and iterator is deleted
      
      This diff is optimizing the second case (no full scan). Here's what we do before the diff:
      * Get the list of obsolete files (files with ref==0). Some files in the obsolete_files set might actually be live.
      * Get the list of live files to avoid deleting files that are live.
      * Delete files that are in obsolete_files and not in live_files.
      
      After this diff:
      * The only files with ref==0 that are still live are files that have been part of move compaction. Don't include moved files in obsolete_files.
      * Get the list of obsolete files (which exclude moved files).
      * No need to get the list of live files, since all files in obsolete_files need to be deleted.
      
      I'll post the benchmark results, but you can get the feel of it here: https://reviews.facebook.net/D30123
      
      This depends on D30123.
      
      P.S. We should do full scan only in failure scenarios, not every 6 hours. I'll do this in a follow-up diff.
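
The before/after set arithmetic can be sketched with plain containers. This is a simplified illustration with a hypothetical FileInfo record, not the actual code, which works on FileMetaData and the version set:

```cpp
#include <set>
#include <vector>

// Hypothetical file record: file number, ref count, and whether the
// file was part of a move (trivial) compaction.
struct FileInfo {
  int number;
  int refs;
  bool moved;
};

// Before the diff: collect every file with ref == 0, then subtract the
// live set before deleting (a second pass over live files is required,
// because some ref == 0 files are still live).
std::set<int> ObsoleteOld(const std::vector<FileInfo>& files,
                          const std::set<int>& live) {
  std::set<int> obsolete;
  for (const FileInfo& f : files) {
    if (f.refs == 0) obsolete.insert(f.number);
  }
  std::set<int> to_delete;
  for (int n : obsolete) {
    if (live.count(n) == 0) to_delete.insert(n);
  }
  return to_delete;
}

// After the diff: moved files are the only ref == 0 files that can
// still be live, so excluding them up front makes the live-file pass
// unnecessary -- everything collected here is safe to delete.
std::set<int> ObsoleteNew(const std::vector<FileInfo>& files) {
  std::set<int> to_delete;
  for (const FileInfo& f : files) {
    if (f.refs == 0 && !f.moved) to_delete.insert(f.number);
  }
  return to_delete;
}
```

For a file set where file 2 has ref == 0 only because it was moved, both versions delete just file 1, but the new version never has to materialize the live-file list.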
      
      Test Plan:
      One new unit test. Made sure that unit test fails if we don't have a `if (!f->moved)` safeguard in ~Version.
      
      make check
      
      Big number of compactions and flushes:
      
        ./db_stress --threads=30 --ops_per_thread=20000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0  --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000
      
      Reviewers: yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D30249
    • I
      Merge pull request #442 from alabid/alabid/fix-example-typo · d8c4ce6b
      Committed by Igor Canadi
      fix really trivial typo in column families example
    • A
      fix really trivial typo · 949bd71f
      Committed by alabid
  9. 21 Dec 2014, 1 commit
    • I
      Fix a SIGSEGV in BackgroundFlush · f8999fcf
      Committed by Igor Canadi
      Summary:
      This one wasn't easy to find :)
      
      What happens is we go through all cfds on flush_queue_ and find no cfds to flush, *but* the cfd variable is still set to the last CF we looped through, and the following code assumes we want it flushed.
      
      BTW @sdong do you think we should also make BackgroundFlush() only check a single cfd for flushing instead of doing this `while (!flush_queue_.empty())`?
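
The failure mode is a classic stale-loop-variable bug. A minimal sketch of the broken shape (simplified to a queue of flags; the real loop walks flush_queue_ of column-family data under the DB mutex):

```cpp
#include <deque>

// Simplified model: the queue holds one flag per queued column family,
// saying whether that CF actually needs a flush by the time we get to it.
// Returns the position of the first CF that needs flushing, or -1 if
// none does.
int PickFlushCandidate(std::deque<bool> flush_queue) {
  int picked = -1;
  int pos = 0;
  while (!flush_queue.empty()) {
    bool needs_flush = flush_queue.front();
    flush_queue.pop_front();
    if (needs_flush) {
      picked = pos;
      break;
    }
    // The bug: the broken code effectively remembered the last CF it
    // examined even when needs_flush was false, so the caller went on
    // to flush a CF that should not be flushed. The fix is to report
    // "nothing to do" (here, -1; in the real code, a null cfd) when the
    // loop drains the queue without finding a candidate.
    ++pos;
  }
  return picked;
}
```

With an all-false queue the function must report "nothing to do" rather than the last entry it looked at.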
      
      Test Plan: regression test no longer fails
      
      Reviewers: sdong, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: sdong, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D30591
  10. 20 Dec 2014, 3 commits
    • I
      MultiGet for DBWithTTL · ade4034a
      Committed by Igor Canadi
      Summary: This is a feature request from a RocksDB user. I didn't even realize we don't support MultiGet on TTL DB :)
      
      Test Plan: added a unit test
      
      Reviewers: yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D30561
    • I
      Rewritten system for scheduling background work · fdb6be4e
      Committed by Igor Canadi
      Summary:
      When scaling to higher number of column families, the worst bottleneck was MaybeScheduleFlushOrCompaction(), which did a for loop over all column families while holding a mutex. This patch addresses the issue.
      
      The approach is similar to our earlier efforts: instead of a pull model, where we do something for every column family, we use a push model -- when we detect that a column family is ready to be flushed or compacted, we add it to flush_queue_/compaction_queue_. That way we don't need to loop over every column family in MaybeScheduleFlushOrCompaction().
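
The pull-to-push change can be sketched like this (illustrative only; the real queues live inside DBImpl and are manipulated under the DB mutex):

```cpp
#include <deque>
#include <vector>

// Pull model: every scheduling call scans all column families to find
// work, while holding the mutex -- O(#column families) per call.
int CountFlushCandidatesPull(const std::vector<bool>& needs_flush) {
  int n = 0;
  for (bool b : needs_flush) {
    if (b) ++n;
  }
  return n;
}

// Push model: when a CF becomes ready, push its id once; the scheduler
// only pops from the queue and never loops over every CF.
struct FlushScheduler {
  std::deque<int> flush_queue_;
  void MarkReady(int cf_id) { flush_queue_.push_back(cf_id); }
  int PopNext() {  // returns -1 when nothing is queued
    if (flush_queue_.empty()) return -1;
    int id = flush_queue_.front();
    flush_queue_.pop_front();
    return id;
  }
};

// Tiny demo: two CFs become ready; the scheduler services them in FIFO
// order without ever scanning the full CF list.
int DemoPushModel() {
  FlushScheduler s;
  s.MarkReady(7);
  s.MarkReady(3);
  return s.PopNext();
}
```

With 5000 column families (as in the benchmark below), the pull model pays the full scan on every call, while the push model's per-call cost is independent of the CF count.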
      
      Here are the performance results:
      
      Command:
      
          ./db_bench --write_buffer_size=268435456 --db_write_buffer_size=268435456 --db=/fast-rocksdb-tmp/rocks_lots_of_cf --use_existing_db=0 --open_files=55000 --statistics=1 --histogram=1 --disable_data_sync=1 --max_write_buffer_number=2 --sync=0 --benchmarks=fillrandom --threads=16 --num_column_families=5000  --disable_wal=1 --max_background_flushes=16 --max_background_compactions=16 --level0_file_num_compaction_trigger=2 --level0_slowdown_writes_trigger=2 --level0_stop_writes_trigger=3 --hard_rate_limit=1 --num=33333333 --writes=33333333
      
      Before the patch:
      
           fillrandom   :      26.950 micros/op 37105 ops/sec;    4.1 MB/s
      
      After the patch:
      
            fillrandom   :      17.404 micros/op 57456 ops/sec;    6.4 MB/s
      
      Next bottleneck is VersionSet::AddLiveFiles, which is painfully slow when we have a lot of files. This is coming in the next patch, but when I removed that code, here's what I got:
      
            fillrandom   :       7.590 micros/op 131758 ops/sec;   14.6 MB/s
      
      Test Plan:
      make check
      
      two stress tests:
      
      Big number of compactions and flushes:
      
          ./db_stress --threads=30 --ops_per_thread=20000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0  --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000
      
      max_background_flushes=0, to verify that this case also works correctly
      
          ./db_stress --threads=30 --ops_per_thread=2000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0  --reopen=3 --max_background_compactions=3 --max_background_flushes=0 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000
      
      Reviewers: ljin, rven, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D30123
    • I
      Remove -mtune=native because it's redundant · a3001b1d
      Committed by Igor Canadi
  11. 19 Dec 2014, 8 commits
  12. 18 Dec 2014, 1 commit