1. 19 Oct, 2016 · 6 commits
    •
      Do not split files in compaction on level 0 · 52c9808c
      Aaron Gao committed
      Summary: We should not split files on level 0 during compaction, because doing so fails the subsequent verification of seqno order on level 0.
      
      Test Plan: check with filldeterministic in db_bench
      
      Reviewers: yhchiang, andrewkr
      
      Reviewed By: andrewkr
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D65193
      52c9808c
    •
      Fix db_stress assertion failure · 5e0d6b4c
      Aaron Gao committed
      Summary: In rocksdb::DBIter::FindValueForCurrentKey(), last_not_merge_type could also be SingleDelete(), a case that was omitted.
      
      Test Plan: db_iter_test
      
      Reviewers: yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D65187
      5e0d6b4c
    •
      Bump RocksDB version to 4.13 (#1405) · ab539983
      Yueh-Hsuan Chiang committed
      Summary:
      Bump RocksDB version to 4.13
      
      Test Plan:
      unit tests
      
      Reviewers: sdong, IslamAbdelRahman, andrewkr, lightmark
      
      Subscribers: leveldb
      ab539983
    •
      SamePrefixTest.InDomainTest to clear the test directory before testing · b4d07123
      sdong committed
      Summary: SamePrefixTest.InDomainTest may fail if a previous run of some test cases in prefix_test failed.
      
      Test Plan: Run the test
      
      Reviewers: lightmark, yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65163
      b4d07123
    •
      Avoid calling GetDBOptions() inside GetFromBatchAndDB() · aa09d033
      Islam AbdelRahman committed
      Summary:
      MyRocks hit a regression; @mung generated perf reports showing that the cause is the cost of calling `GetDBOptions()` inside `GetFromBatchAndDB()`.
      This diff avoids calling `GetDBOptions()` and uses the `ImmutableDBOptions` instead.
      
      Test Plan: make check -j64
      
      Reviewers: sdong, yiwu
      
      Reviewed By: yiwu
      
      Subscribers: andrewkr, dhruba, mung
      
      Differential Revision: https://reviews.facebook.net/D65151
      aa09d033
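
The cost described in this commit can be illustrated with a minimal sketch. `FakeDBOptions`, `FakeDB`, and both helper functions are hypothetical names invented for illustration, not RocksDB types. The sketch assumes the expensive accessor returns the options struct by value, so every hot-path call copies strings and vectors, whereas an immutable struct can be read through a const reference.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a large options struct (not RocksDB's DBOptions).
struct FakeDBOptions {
  std::string db_log_dir = "/tmp/log";
  std::vector<std::string> db_paths = std::vector<std::string>(8, "/tmp/db");
  std::size_t max_open_files = 5000;
};

class FakeDB {
 public:
  // Expensive on a hot path: each call copies every string and vector.
  FakeDBOptions GetOptionsCopy() const { return options_; }

  // Cheap: returns a reference to state that does not change after open.
  const FakeDBOptions& immutable_options() const { return options_; }

 private:
  FakeDBOptions options_;
};

// Two ways to read a single field on a hot path.
std::size_t MaxOpenFilesSlow(const FakeDB& db) {
  return db.GetOptionsCopy().max_open_files;  // full copy, then discard
}

std::size_t MaxOpenFilesFast(const FakeDB& db) {
  return db.immutable_options().max_open_files;  // no copy
}
```

Both helpers return the same value; only the amount of work per call differs.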
    •
      Compaction Support for Range Deletion · 6fbe96ba
      Andrew Kryczka committed
      Summary:
      This diff introduces RangeDelAggregator, which takes ownership of iterators
      provided to it via AddTombstones(). The tombstones are organized in a two-level
      map (snapshot stripe -> begin key -> tombstone). Tombstone creation avoids data
      copy by holding Slices returned by the iterator, which remain valid thanks to pinning.
      
      For compaction, we create a hierarchical range tombstone iterator with structure
      matching the iterator over compaction input data. An aggregator based on that
      iterator is used by CompactionIterator to determine which keys are covered by
      range tombstones. In case of merge operand, the same aggregator is used by
      MergeHelper. Upon finishing each file in the compaction, relevant range tombstones
      are added to the output file's range tombstone metablock and file boundaries are
      updated accordingly.
      
      To check whether a key is covered by range tombstone, RangeDelAggregator::ShouldDelete()
      considers tombstones in the key's snapshot stripe. When this function is used outside of
      compaction, it also checks newer stripes, which can contain covering tombstones. Currently
      the intra-stripe check involves a linear scan; however, in the future we plan to collapse ranges
      within a stripe such that binary search can be used.
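
The coverage check described above can be sketched roughly as follows. This is a simplified illustration, not the RangeDelAggregator code: `TombstoneIndex`, `RangeTombstone`, and the `in_compaction` flag are hypothetical names, keys are plain strings, and each stripe is assumed to be keyed by the snapshot sequence number that bounds it from above.

```cpp
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <vector>

struct RangeTombstone {
  std::string begin;  // inclusive
  std::string end;    // exclusive
  std::uint64_t seq;  // sequence number of the range deletion
};

class TombstoneIndex {
 public:
  // One stripe per snapshot, plus a final stripe for everything newer.
  explicit TombstoneIndex(const std::vector<std::uint64_t>& snapshots) {
    for (std::uint64_t s : snapshots) stripes_[s];
    stripes_[std::numeric_limits<std::uint64_t>::max()];
  }

  // Two-level map: snapshot stripe -> begin key -> tombstone.
  void Add(const RangeTombstone& t) {
    stripes_.lower_bound(t.seq)->second.emplace(t.begin, t);
  }

  bool ShouldDelete(const std::string& key, std::uint64_t key_seq,
                    bool in_compaction) const {
    for (auto stripe = stripes_.lower_bound(key_seq);
         stripe != stripes_.end(); ++stripe) {
      // Linear scan within the stripe, as in the current description.
      for (const auto& entry : stripe->second) {
        const RangeTombstone& t = entry.second;
        if (t.seq > key_seq && t.begin <= key && key < t.end) return true;
      }
      // During compaction only the key's own stripe is considered.
      if (in_compaction) break;
    }
    return false;
  }

 private:
  using Stripe = std::multimap<std::string, RangeTombstone>;
  std::map<std::uint64_t, Stripe> stripes_;
};
```

With a snapshot at sequence number 10, a key written at 5 is not dropped during compaction by a tombstone written at 15, because the snapshot must still see the key. A read outside compaction also checks the newer stripe and treats the key as deleted.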
      
      RangeDelAggregator::AddToBuilder() adds all range tombstones in the table's key-range
      to a new table's range tombstone meta-block. Since range tombstones may fall in the gap
      between files, we may need to extend some files' key-ranges. The strategy is (1) first file
      extends as far left as possible and other files do not extend left, (2) all files extend right
      until either the start of the next file or the end of the last range tombstone in the gap,
      whichever comes first.
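
The two extension rules can be sketched on integer keys as follows. This is only an illustration under stated assumptions: `KeyRange` and `ExtendFileRanges` are hypothetical names, both files and tombstones are modelled as half-open ranges, `files` is assumed sorted and non-overlapping, and "as far left as possible" is taken to mean the earliest tombstone start.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct KeyRange {
  int begin;  // inclusive
  int end;    // exclusive
};

// Assumes `files` is sorted by begin and the ranges do not overlap.
void ExtendFileRanges(std::vector<KeyRange>& files,
                      const std::vector<KeyRange>& tombstones) {
  if (files.empty()) return;

  // Rule (1): only the first file extends to the left.
  for (const KeyRange& t : tombstones) {
    files.front().begin = std::min(files.front().begin, t.begin);
  }

  // Rule (2): every file extends right, up to the end of the last tombstone
  // reaching into the gap or the start of the next file, whichever is first.
  for (std::size_t i = 0; i < files.size(); ++i) {
    const bool has_next = i + 1 < files.size();
    int new_end = files[i].end;
    for (const KeyRange& t : tombstones) {
      const bool starts_before_next =
          !has_next || t.begin < files[i + 1].begin;
      if (starts_before_next) new_end = std::max(new_end, t.end);
    }
    if (has_next) new_end = std::min(new_end, files[i + 1].begin);
    files[i].end = std::max(files[i].end, new_end);
  }
}
```

For files [10, 20) and [30, 40) with tombstones [5, 12), [18, 25), and [38, 50), the first file becomes [5, 25) and the second becomes [30, 50); the second file keeps its left boundary.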
      
      One other notable change is adding release/move semantics to ScopedArenaIterator
      such that it can be used to transfer ownership of an arena-allocated iterator, similar to
      how unique_ptr is used for malloc'd data.
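
A minimal sketch of what release/move semantics for an arena-allocated object could look like is shown below. `ScopedArenaPtr` is a hypothetical template written for illustration and is not the ScopedArenaIterator implementation. The assumption is that the arena owns the memory, so the wrapper runs the destructor but never frees storage.

```cpp
#include <new>
#include <utility>

// Move-only owner of an object that lives in arena memory.
// It destroys the object but leaves deallocation to the arena.
template <typename T>
class ScopedArenaPtr {
 public:
  explicit ScopedArenaPtr(T* ptr = nullptr) : ptr_(ptr) {}
  ScopedArenaPtr(const ScopedArenaPtr&) = delete;
  ScopedArenaPtr& operator=(const ScopedArenaPtr&) = delete;

  // Moving transfers ownership and leaves the source empty.
  ScopedArenaPtr(ScopedArenaPtr&& other) noexcept : ptr_(other.release()) {}
  ScopedArenaPtr& operator=(ScopedArenaPtr&& other) noexcept {
    if (this != &other) reset(other.release());
    return *this;
  }

  ~ScopedArenaPtr() { reset(); }

  T* get() const { return ptr_; }

  // Gives up ownership without destroying the object.
  T* release() {
    T* p = ptr_;
    ptr_ = nullptr;
    return p;
  }

  // Destroys the current object (destructor only, no free) and adopts ptr.
  void reset(T* ptr = nullptr) {
    if (ptr_ != nullptr) ptr_->~T();
    ptr_ = ptr;
  }

 private:
  T* ptr_;
};
```

The design mirrors `std::unique_ptr` with a custom deleter that only calls the destructor, which is why ownership can be handed from one scope to another without double destruction.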
      
      Depends on D61473
      
      Test Plan: compaction_iterator_test, mock_table, end-to-end tests in D63927
      
      Reviewers: sdong, IslamAbdelRahman, wanning, yhchiang, lightmark
      
      Reviewed By: lightmark
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62205
      6fbe96ba
  2. 18 Oct, 2016 · 3 commits
  3. 15 Oct, 2016 · 4 commits
  4. 14 Oct, 2016 · 7 commits
    •
      Fix assertion failure in Prev() · 21e8dace
      Aaron Gao committed
      Summary:
      Fix an assertion failure in db_stress.
      It happens because the prefix seek key is larger than the merge iterator key when they have the same user key.
      
      Test Plan: ./db_stress --max_background_compactions=1 --max_write_buffer_number=3 --sync=0 --reopen=20 --write_buffer_size=33554432 --delpercent=5 --log2_keys_per_lock=10 --block_size=16384 --allow_concurrent_memtable_write=0 --test_batches_snapshots=0 --max_bytes_for_level_base=67108864 --progress_reports=0 --mmap_read=0 --writepercent=35 --disable_data_sync=0 --readpercent=50 --subcompactions=4 --ops_per_thread=20000000 --memtablerep=skip_list --prefix_size=0 --target_file_size_multiplier=1 --column_families=1 --threads=32 --disable_wal=0 --open_files=500000 --destroy_db_initially=0 --target_file_size_base=16777216 --nooverwritepercent=1 --iterpercent=10 --max_key=100000000 --prefixpercent=0 --use_clock_cache=false --kill_random_test=888887 --cache_size=1048576 --verify_checksum=1
      
      Reviewers: sdong, andrewkr, yiwu, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D65025
      21e8dace
    •
      b9311aa6
    •
      check_format_compatible.sh to use branches that can run with GCC 4.8 (#1393) · a249a0b7
      Siying Dong committed
      Summary:
      Some older tags don't run with GCC 4.8 under the FB internal settings. Fixed them, created branches, and changed the format-compatibility script accordingly.
      
      Also add more releases to check format compatibility.
      a249a0b7
    •
      Remove an assertion for single-delete in MergeHelper::MergeUntil · 040328a3
      Yueh-Hsuan Chiang committed
      Summary:
      Previously we had an assertion that triggered when we issued Merges
      after a single delete.  However, merges after a single delete are
      unrelated to that single delete.  Thus this behavior should be
      allowed.

      This will address a flakiness in db_stress.
      
      Test Plan: db_stress
      
      Reviewers: IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64923
      040328a3
    •
      Relax the acceptable bias of the RateLimiterTest::Rate test to 25% · 8cbe3e10
      Yueh-Hsuan Chiang committed
      Summary:
      In the current implementation of RateLimiter, the difference
      between the configured rate and the actual rate might be more
      than 20%, while our test only allows a 15% difference.  This diff
      relaxes the acceptable bias of the RateLimiterTest::Rate test to 25%
      to make the test less flaky.
      
      Test Plan: rate_limiter_test
      
      Reviewers: IslamAbdelRahman, andrewkr, yiwu, lightmark, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64941
      8cbe3e10
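
The arithmetic behind the relaxed threshold can be sketched as follows. `WithinBias` is a hypothetical helper and the real test may express the check differently; the sketch only assumes that bias is measured relative to the configured rate.

```cpp
#include <cmath>

// True when actual_rate deviates from configured_rate by at most
// max_bias, expressed as a fraction of the configured rate.
bool WithinBias(double configured_rate, double actual_rate, double max_bias) {
  return std::fabs(actual_rate - configured_rate) <=
         max_bias * configured_rate;
}
```

For a configured rate of 1000 and an observed rate of 1210 (a 21% deviation), the check fails with a 15% allowance and passes with a 25% allowance.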
    •
      Log successful AddFile · f26a139d
      Islam AbdelRahman committed
      Summary: Log successful AddFile
      
      Test Plan: visually check LOG file
      
      Reviewers: yiwu, andrewkr, lightmark, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, jkedgar, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65019
      f26a139d
    •
      Fix compaction conflict with running compaction · 5691a1d8
      Islam AbdelRahman committed
      Summary:
      Issue scenario:
      (1) We have 3 files in L1 and we issue a compaction that will compact them into 1 file in L2
      (2) While compaction (1) is running, we flush a file into L0 and trigger another compaction that decides to move this file to L1 and then move it again to L2 (this file doesn't overlap with any other files)
      (3) Compaction (1) finishes and installs the file it generated in L2, but this file overlaps with the file we generated in (2), so we break the LSM consistency

      It looks like this issue can be triggered by using non-exclusive manual compaction or AddFile()
      
      Test Plan: unit tests
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: hermanlee4, jkedgar, andrewkr, dhruba, yoshinorim
      
      Differential Revision: https://reviews.facebook.net/D64947
      5691a1d8
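
The scenario above suggests one kind of guard: before moving a file into a level, compare its key range with the output ranges of compactions that are still running. The sketch below is an illustration of that idea only and is not claimed to be the fix in this diff. `OutputRange` and `ConflictsWithRunning` are hypothetical names, and keys are modelled as integers with inclusive bounds.

```cpp
#include <vector>

// Key range that a compaction will write into a given level.
struct OutputRange {
  int level;
  int smallest;  // inclusive
  int largest;   // inclusive
};

// True if the candidate would land in a level and key range that a
// still-running compaction is going to write to.
bool ConflictsWithRunning(const OutputRange& candidate,
                          const std::vector<OutputRange>& running) {
  for (const OutputRange& r : running) {
    const bool same_level = r.level == candidate.level;
    const bool overlaps =
        candidate.smallest <= r.largest && r.smallest <= candidate.largest;
    if (same_level && overlaps) return true;
  }
  return false;
}
```

In the scenario, compaction (1) would be registered as writing some range into L2, so moving the flushed file into L2 inside that range would be reported as a conflict.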
  5. 13 Oct, 2016 · 8 commits
  6. 12 Oct, 2016 · 5 commits
    •
      Fix log_write_bench -bytes_per_sync option. (#1375) · f8d8cf53
      Peter (Stig) Edwards committed
      Hello and thanks for RocksDB,
       
      When log_write_bench is run with the -bytes_per_sync option, the option does not influence any *sync* behaviour.
       
      > strace -e trace=write,sync_file_range ./log_write_bench -record_interval 0 -record_size 1048576 -num_records 11 -bytes_per_sync 2097152 2>&1 | egrep '^(sync|write.*XXXX)'
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
       
      I suspect that this is because the bytes_per_sync option now needs to be using a `WritableFileWriter` and not a `WritableFile`.
       
      With the diff below applied, it changes to:
       
      > strace -e trace=write,sync_file_range ./log_write_bench -record_interval 0 -record_size 1048576 -num_records 11 -bytes_per_sync 2097152 2>&1 | egrep '^(sync|write.*XXXX)'
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      sync_file_range(0x3, 0, 0x200000, 0x2)  = 0
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      sync_file_range(0x3, 0x200000, 0x200000, 0x2) = 0
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      sync_file_range(0x3, 0x400000, 0x200000, 0x2) = 0
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      sync_file_range(0x3, 0x600000, 0x200000, 0x2) = 0
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
      sync_file_range(0x3, 0x800000, 0x200000, 0x2) = 0
       
      ( Note that the first 1MB is not synced as mentioned in util/file_reader_writer.cc::WritableFileWriter::Flush() )
       
      This diff also includes the fix from https://github.com/facebook/rocksdb/pull/1373
       
      > diff -du util/log_write_bench.cc.orig util/log_write_bench.cc
      --- util/log_write_bench.cc.orig        2016-10-04 12:06:29.115122580 -0400
      +++ util/log_write_bench.cc     2016-10-05 07:24:09.677037576 -0400
      @@ -14,6 +14,7 @@
       #include <gflags/gflags.h>
      
       #include "rocksdb/env.h"
      +#include "util/file_reader_writer.h"
       #include "util/histogram.h"
       #include "util/testharness.h"
       #include "util/testutil.h"
      @@ -38,19 +39,21 @@
         env_options.bytes_per_sync = FLAGS_bytes_per_sync;
         unique_ptr<WritableFile> file;
         env->NewWritableFile(file_name, &file, env_options);
      +  unique_ptr<WritableFileWriter> writer;
      +  writer.reset(new WritableFileWriter(std::move(file), env_options));
      
         std::string record;
      -  record.assign('X', FLAGS_record_size);
      +  record.assign(FLAGS_record_size, 'X');
      
         HistogramImpl hist;
      
         uint64_t start_time = env->NowMicros();
         for (int i = 0; i < FLAGS_num_records; i++) {
           uint64_t start_nanos = env->NowNanos();
      -    file->Append(record);
      -    file->Flush();
      +    writer->Append(record);
      +    writer->Flush();
           if (FLAGS_enable_sync) {
      -      file->Sync();
      +      writer->Sync(false);
           }
           hist.Add(env->NowNanos() - start_nanos);
      f8d8cf53
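
The one-line `record.assign` change in the diff above swaps the argument order. A small sketch of why that matters: `std::string::assign(count, ch)` takes the count first, so passing `'X'` first silently converts the character to a count and the size to a character. `BuggyRecord` and `FixedRecord` are hypothetical helper names used only for this illustration, and the stated length assumes an ASCII execution character set.

```cpp
#include <cstddef>
#include <string>

// Arguments reversed: resolves to assign(size_type count, char ch), so the
// result has static_cast<size_t>('X') characters (88 in ASCII), each equal
// to record_size converted to char.
std::string BuggyRecord(int record_size) {
  std::string record;
  record.assign('X', record_size);
  return record;
}

// Intended call: record_size copies of 'X'.
std::string FixedRecord(int record_size) {
  std::string record;
  record.assign(record_size, 'X');
  return record;
}
```

The reversed call compiles without error because both arguments convert implicitly between integer types, which makes this mistake easy to miss in review.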
    •
      Make txn->GetState() const · 02b3e398
      Reid Horuff committed
      Summary: makes Transaction::GetState() a const function.
      
      Test Plan: compiles.
      
      Reviewers: mung
      
      Reviewed By: mung
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D64929
      02b3e398
    •
      New Prev() prefix support using SeekForPrev() · 447f1712
      Aaron Gao committed
      Summary:
      1) The previous solution for Prev() prefix support was not clean.
      Since I added the SeekForPrev() API, Prev() can now be symmetric to Next(),
      and we no longer need to call SeekToLast() in Prev().

      Also, Next() will Seek(prefix_seek_key_) to solve the problem of possible inconsistency between db_iter and merge_iter when
      there is a merge_operator. prefix_seek_key_ is only refreshed when the direction changes to forward.

      2) This diff also fixes a bug in Iterator::SeekToLast() when iterate_upper_bound_ is used with a prefix extractor.

      Added test cases for the above two cases.

      There are some tests for the SeekToLast() in Prev(); I will clean them up later.
      
      Test Plan: make all check
      
      Reviewers: IslamAbdelRahman, andrewkr, yiwu, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D63933
      447f1712
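
For readers unfamiliar with SeekForPrev(), its positioning rule (land on the last key less than or equal to the target, mirroring Seek(), which lands on the first key greater than or equal to it) can be sketched on an ordered map. This is an illustration on `std::map`, not RocksDB iterator code; `Table` and the free function `SeekForPrev` are names chosen for the example.

```cpp
#include <iterator>
#include <map>
#include <string>

using Table = std::map<std::string, std::string>;

// Returns an iterator to the last entry whose key is <= target,
// or table.end() if every key is greater than target.
Table::const_iterator SeekForPrev(const Table& table,
                                  const std::string& target) {
  auto it = table.upper_bound(target);  // first key strictly > target
  if (it == table.begin()) return table.end();
  return std::prev(it);
}
```

With keys "a", "c", and "e", seeking for "d" lands on "c", seeking for "c" lands on "c" itself, and seeking for a key smaller than "a" yields no position.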
    •
      More block cache tickers · 991b585e
      Yi Wu committed
      Summary: Adding several missing block cache tickers.
      
      Test Plan:
        make all check
      
      Reviewers: IslamAbdelRahman, yhchiang, lightmark
      
      Reviewed By: lightmark
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64881
      991b585e
    •
      Add Statistics::getAndResetTickerCount(). · d6ae6dec
      Yi Wu committed
      Summary: A convenience method to atomically get and reset a ticker count. I want to use it in a thin wrapper around the statistics object to export ticker counts to ODS for LogDevice (since they don't even use fb303).
      
      Test Plan:
      test in LogDevice shadow cluster.
      https://fburl.com/461868822
      
      Reviewers: andrewkr, yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64869
      d6ae6dec
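
An atomic get-and-reset can be sketched with `std::atomic::exchange`. `TickerCounter` is a hypothetical single-counter class for illustration, and the Statistics implementation in this diff may be structured differently.

```cpp
#include <atomic>
#include <cstdint>

class TickerCounter {
 public:
  void Record(std::uint64_t count) {
    value_.fetch_add(count, std::memory_order_relaxed);
  }

  std::uint64_t Get() const {
    return value_.load(std::memory_order_relaxed);
  }

  // Reads the current value and sets it to zero in one atomic step, so
  // increments that race with the reset are never lost or double counted.
  std::uint64_t GetAndReset() {
    return value_.exchange(0, std::memory_order_relaxed);
  }

 private:
  std::atomic<std::uint64_t> value_{0};
};
```

A separate Get() followed by a reset would leave a window in which concurrent increments disappear; doing both in a single exchange avoids that, which matters when an exporter periodically drains counters.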
  7. 11 Oct, 2016 · 1 commit
  8. 08 Oct, 2016 · 6 commits