1. Aug 26, 2016 (8 commits)
    • Mitigate regression bug of options.max_successive_merges hit during DB Recovery · dade61ac
      Committed by sdong
      Summary:
      After 1b8a2e8f, a DB pointer is passed to WriteBatchInternal::InsertInto() during DB recovery. This can cause a deadlock when options.max_successive_merges is hit: in that case DB::Get() is called, and Get() tries to acquire the DB mutex, which is already held by DB::Open(), producing a deadlock.
      
      This commit mitigates the problem by not passing the DB pointer unless 2PC is allowed.
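
      The deadlock pattern and the mitigation can be illustrated with a small self-contained sketch (hypothetical types and function shapes, not RocksDB's actual recovery code):

      ```cpp
      #include <mutex>

      // Hypothetical stand-ins for the real classes; only the locking shape matters.
      struct ToyDB {
        std::mutex mu;                 // the "DB mutex" held during DB::Open()
        void Get() { std::lock_guard<std::mutex> l(mu); /* lookup */ }
      };

      // Stand-in for WriteBatchInternal::InsertInto(): it only calls back into the
      // DB when it was handed a non-null pointer, so passing nullptr during recovery
      // keeps the merge path from re-acquiring the already-held mutex.
      void InsertIntoSketch(ToyDB* db, bool max_successive_merges_hit) {
        if (db != nullptr && max_successive_merges_hit) {
          db->Get();                   // would self-deadlock if the caller holds db->mu
        }
      }

      void RecoverSketch(ToyDB& db, bool allow_2pc) {
        std::lock_guard<std::mutex> open_lock(db.mu);  // recovery runs under the DB mutex
        // The mitigation: hand over the DB pointer only when 2PC actually needs it.
        InsertIntoSketch(allow_2pc ? &db : nullptr, /*max_successive_merges_hit=*/true);
      }
      ```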
      
      Test Plan: Add a new test and run it.
      
      Reviewers: IslamAbdelRahman, andrewkr, kradhakrishnan, horuff
      
      Reviewed By: kradhakrishnan
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62625
    • [db_bench] Support single benchmark arguments (Repeat for X times, Warm up for... · cce702a6
      Committed by Islam AbdelRahman
      [db_bench] Support single benchmark arguments (Repeat for X times, Warm up for X times), Support CombinedStats (AVG / MEDIAN)
      
      Summary:
      This diff allows us to run a single benchmark X times, warming it up Y times first, and report the AVG and MEDIAN throughput of those X runs.
      For example:
      
      ```
      $ ./db_bench --benchmarks="fillseq,readseq[X5-W2]"
      Initializing RocksDB Options from the specified file
      Initializing RocksDB Options from command-line flags
      RocksDB:    version 4.12
      Date:       Wed Aug 24 10:45:26 2016
      CPU:        32 * Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
      CPUCache:   20480 KB
      Keys:       16 bytes each
      Values:     100 bytes each (50 bytes after compression)
      Entries:    1000000
      Prefix:    0 bytes
      Keys per prefix:    0
      RawSize:    110.6 MB (estimated)
      FileSize:   62.9 MB (estimated)
      Write rate: 0 bytes/second
      Compression: Snappy
      Memtablerep: skip_list
      Perf Level: 1
      WARNING: Assertions are enabled; benchmarks unnecessarily slow
      ------------------------------------------------
      Initializing RocksDB Options from the specified file
      Initializing RocksDB Options from command-line flags
      DB path: [/tmp/rocksdbtest-8616/dbbench]
      fillseq      :       4.695 micros/op 212971 ops/sec;   23.6 MB/s
      DB path: [/tmp/rocksdbtest-8616/dbbench]
      Warming up benchmark by running 2 times
      readseq      :       0.214 micros/op 4677005 ops/sec;  517.4 MB/s
      readseq      :       0.212 micros/op 4706834 ops/sec;  520.7 MB/s
      Running benchmark for 5 times
      readseq      :       0.218 micros/op 4588187 ops/sec;  507.6 MB/s
      readseq      :       0.208 micros/op 4816538 ops/sec;  532.8 MB/s
      readseq      :       0.213 micros/op 4685376 ops/sec;  518.3 MB/s
      readseq      :       0.214 micros/op 4676787 ops/sec;  517.4 MB/s
      readseq      :       0.217 micros/op 4618532 ops/sec;  510.9 MB/s
      readseq [AVG    5 runs] : 4677084 ops/sec;  517.4 MB/sec
      readseq [MEDIAN 5 runs] : 4676787 ops/sec;  517.4 MB/sec
      ```
      
      Test Plan: run db_bench
      
      Reviewers: sdong, andrewkr, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62235
    • cat test logs sorted by exit code · 3586901f
      Committed by Islam AbdelRahman
      Summary:
      Instead of doing a cat of all the log files, we first sort them by exit code and cat the failing tests at the end.
      This makes it easier to debug failing tests, since we only need to look at the end of the logs instead of searching through them.
      
      Test Plan: run it locally
      
      Reviewers: sdong, yiwu, lightmark, kradhakrishnan, yhchiang, andrewkr
      
      Reviewed By: andrewkr
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62211
    • Persist data during user-initiated shutdown · b2ce5953
      Committed by Justin Gibbs
      Summary:
      Move the manual memtable flush for databases containing data that has
      bypassed the WAL from DBImpl's destructor to CancelAllBackgroundWork().
      
      CancelAllBackgroundWork() is a publicly exposed API which allows
      async operations performed by background threads to be disabled on a
      database. In effect, this places the database into a "shutdown" state
      in advance of calling the database object's destructor. No compactions
      or flushing of SST files can occur once a call to this API completes.
      
      When writes are issued to a database with WriteOptions::disableWAL
      set to true, DBImpl::has_unpersisted_data_ is set so that
      memtables can be flushed when the database object is destroyed. If
      CancelAllBackgroundWork() has been called prior to DBImpl's destructor,
      this flush operation is not possible and is skipped, causing unnecessary
      loss of data.
      
      Since CancelAllBackgroundWork() is already invoked by DBImpl's destructor
      in order to perform the thread join portion of its cleanup processing,
      moving the manual memtable flush to CancelAllBackgroundWork() ensures
      data is persisted regardless of client behavior.
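
      A hedged usage sketch of the pattern this change protects (the DB path and option values are made up for illustration; WriteOptions::disableWAL and the public CancelAllBackgroundWork() helper are the relevant pieces):

      ```cpp
      #include <cassert>
      #include "rocksdb/convenience.h"  // rocksdb::CancelAllBackgroundWork()
      #include "rocksdb/db.h"

      int main() {
        rocksdb::DB* db = nullptr;
        rocksdb::Options options;
        options.create_if_missing = true;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/persist_example", &db);
        assert(s.ok());

        rocksdb::WriteOptions wo;
        wo.disableWAL = true;              // data lives only in the memtable
        db->Put(wo, "key", "value");

        // "Shut down" the DB before destroying it; with this change the memtable
        // holding the un-WALed write is flushed here rather than silently dropped.
        rocksdb::CancelAllBackgroundWork(db, /*wait=*/true);
        delete db;
        return 0;
      }
      ```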
      
      Test Plan:
      Write an amount of data that will not cause a memtable flush to a rocksdb
      database with all writes marked with WriteOptions::disableWAL. Properly
      "close" the database. Reopen database and verify that the data was
      persisted.
      
      Reviewers: IslamAbdelRahman, yiwu, yoshinorim, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62277
    • Fix parallel valgrind (valgrind_check) · 4b3438d2
      Committed by Islam AbdelRahman
      Summary:
      I just realized that when we run parallel valgrind we don't actually run the parallel tests under valgrind (we run them normally).
      This patch makes sure that we run both parallel and non-parallel tests with valgrind.
      
      Test Plan: DISABLE_JEMALLOC=1 make valgrind_check -j64
      
      Reviewers: andrewkr, yiwu, lightmark, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62469
    • Relax consistency for thread-local ticker stats · a081f798
      Committed by Andrew Kryczka
      Summary: see discussion in D62337
      
      Test Plan: unit tests
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62577
    • Fix the Windows build of RocksDB Java. Similar to... · f85f99bf
      Committed by Adam Retter
      Fix the Windows build of RocksDB Java. Similar to https://github.com/facebook/rocksdb/issues/1220 (#1284)
      
  2. Aug 25, 2016 (5 commits)
    • Fix a crash when compaction fails to open a file · 7b810951
      Committed by Mike Kolupaev
      Summary:
      We've got a crash with this stack trace:
      
        Program terminated with signal SIGTRAP, Trace/breakpoint trap.
      
        #0  0x00007fc85f2f4009 in raise () from /usr/local/fbcode/gcc-4.9-glibc-2.20-fb/lib/libpthread.so.0
        #1  0x00000000005c8f61 in facebook::logdevice::handle_sigsegv(int) () at logdevice/server/sigsegv.cpp:159
        #2  0x00007fc85f2f4150 in <signal handler called> () at /usr/local/fbcode/gcc-4.9-glibc-2.20-fb/lib/libpthread.so.0
        #3  0x00000000031ed80c in rocksdb::NewReadaheadRandomAccessFile() at util/file_reader_writer.cc:383
        #4  0x00000000031ed80c in rocksdb::NewReadaheadRandomAccessFile() at util/file_reader_writer.cc:472
        #5  0x00000000031558e7 in rocksdb::TableCache::GetTableReader() at db/table_cache.cc:99
        #6  0x0000000003156329 in rocksdb::TableCache::NewIterator() at db/table_cache.cc:198
        #7  0x0000000003166568 in rocksdb::VersionSet::MakeInputIterator() at db/version_set.cc:3345
        #8  0x000000000324a94f in rocksdb::CompactionJob::ProcessKeyValueCompaction(rocksdb::CompactionJob::SubcompactionState*) () at db/compaction_job.cc:650
        #9  0x000000000324c2f6 in rocksdb::CompactionJob::Run() () at db/compaction_job.cc:530
        #10 0x00000000030f5ae5 in rocksdb::DBImpl::BackgroundCompaction() at db/db_impl.cc:3269
        #11 0x0000000003108d36 in rocksdb::DBImpl::BackgroundCallCompaction(void*) () at db/db_impl.cc:2970
        #12 0x00000000029a2a9a in facebook::logdevice::RocksDBEnv::callback(void*) () at logdevice/server/locallogstore/RocksDBEnv.cpp:26
        #13 0x00000000029a2a9a in facebook::logdevice::RocksDBEnv::callback(void*) () at logdevice/server/locallogstore/RocksDBEnv.cpp:30
        #14 0x00000000031e7521 in rocksdb::ThreadPool::BGThread() at util/threadpool.cc:230
        #15 0x00000000031e7663 in rocksdb::BGThreadWrapper(void*) () at util/threadpool.cc:254
        #16 0x00007fc85f2ea7f1 in start_thread () at /usr/local/fbcode/gcc-4.9-glibc-2.20-fb/lib/libpthread.so.0
        #17 0x00007fc85e8fb46d in clone () at /usr/local/fbcode/gcc-4.9-glibc-2.20-fb/lib/libc.so.6
      
      From looking at the code, probably what happened is this:
       - `TableCache::GetTableReader()` called `Env::NewRandomAccessFile()`, which dispatched to `PosixEnv::NewRandomAccessFile()`, where an `open()` call probably failed, so `NewRandomAccessFile()` left a nullptr in the resulting file;
       - `TableCache::GetTableReader()` then called `NewReadaheadRandomAccessFile()` with that `nullptr` file;
       - it tried to call a method on the null file and crashed.
      
      This diff is a trivial fix to this crash.
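
      A hedged sketch of the shape of the fix (function and parameter names are simplified, not the exact table_cache.cc code): check the open status before wrapping the file, and propagate the error instead of dereferencing a null file pointer.

      ```cpp
      #include <memory>
      #include <string>
      #include "rocksdb/env.h"

      rocksdb::Status OpenForCompactionSketch(
          rocksdb::Env* env, const std::string& fname,
          std::unique_ptr<rocksdb::RandomAccessFile>* file) {
        rocksdb::Status s =
            env->NewRandomAccessFile(fname, file, rocksdb::EnvOptions());
        if (!s.ok()) {
          return s;  // the crash happened because a failure here went unchecked
        }
        // Only now is it safe to wrap *file with a readahead reader
        // (in RocksDB that wrapper is NewReadaheadRandomAccessFile()).
        return s;
      }
      ```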
      
      Test Plan: `make -j check`
      
      Reviewers: sdong, andrewkr, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62451
    • Thread-specific ticker statistics · 7c958683
      Committed by Andrew Kryczka
      Summary:
      The global atomics we previously used for tickers had poor cache performance
      since they were typically updated from different threads, causing frequent
      invalidations. In this diff,
      
      - recordTick() updates a local ticker value specific to the thread in which it was called
      - When a thread exits, its local ticker value is added into merged_sum
      - getTickerCount() returns the sum of all threads' local ticker values and the merged_sum
      - setTickerCount() resets all threads' local ticker values and sets merged_sum to the value provided by the caller.
      
      In a next diff I will make a similar change for histogram stats.
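
      A minimal, self-contained sketch of that scheme (illustrative only, not RocksDB's StatisticsImpl; it assumes a single ticker instance that outlives its worker threads):

      ```cpp
      #include <atomic>
      #include <cstdint>
      #include <mutex>
      #include <unordered_set>

      class ThreadLocalTicker {
       public:
        // Hot path: bump only this thread's private counter.
        void RecordTick(uint64_t n) { LocalCell().value.fetch_add(n); }

        // Sum of all live threads' counters plus counters of exited threads.
        uint64_t GetTickerCount() {
          std::lock_guard<std::mutex> l(mu_);
          uint64_t sum = merged_sum_;
          for (const Cell* c : live_cells_) sum += c->value.load();
          return sum;
        }

       private:
        struct Cell {
          explicit Cell(ThreadLocalTicker* o) : owner(o) {
            std::lock_guard<std::mutex> l(owner->mu_);
            owner->live_cells_.insert(this);
          }
          ~Cell() {  // thread exit: fold this thread's count into merged_sum_
            std::lock_guard<std::mutex> l(owner->mu_);
            owner->merged_sum_ += value.load();
            owner->live_cells_.erase(this);
          }
          ThreadLocalTicker* owner;
          std::atomic<uint64_t> value{0};
        };

        Cell& LocalCell() {
          thread_local Cell cell(this);  // one cell per thread (single-instance assumption)
          return cell;
        }

        std::mutex mu_;
        uint64_t merged_sum_ = 0;
        std::unordered_set<Cell*> live_cells_;
      };
      ```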
      
      Test Plan:
      before:
      
        $ TEST_TMPDIR=/dev/shm/ perf record -g ./db_bench --benchmarks=readwhilewriting --statistics --num=1000000 --use_existing_db --threads=64 --cache_size=250000000 --compression_type=lz4
        $ perf report -g --stdio | grep recordTick
        7.59%  db_bench     db_bench             [.] rocksdb::StatisticsImpl::recordTick
        ...
      
      after:
      
        $ TEST_TMPDIR=/dev/shm/ perf record -g ./db_bench --benchmarks=readwhilewriting --statistics --num=1000000 --use_existing_db --threads=64 --cache_size=250000000 --compression_type=lz4
        $ perf report -g --stdio | grep recordTick
        1.46%  db_bench     db_bench             [.] rocksdb::StatisticsImpl::recordTick
        ...
      
      Reviewers: kradhakrishnan, MarkCallaghan, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: yiwu, andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62337
    • Add initial GitHub pages infra for RocksDB documentation move and update. (#1294) · ea9e0757
      Committed by Joel Marcey
      This is the initial commit with the templates necessary to have our RocksDB user documentation hosted on GitHub pages.
      
      Ensure you meet requirements here: https://help.github.com/articles/setting-up-your-github-pages-site-locally-with-jekyll/#requirements
      
      Then you can run this right now by doing the following:
      
      ```
      % bundle install
      % bundle exec jekyll serve --config=_config.yml,_config_local_dev.yml
      ```
      
      Then go to: http://127.0.0.1:4000/
      
      Obviously, this is just the skeleton. Moving forward we will do these things in separate pull requests:
      
      - Replace logos with RocksDB logos
      - Update the color schemes
      - Add current information on rocksdb.org to markdown in this infra
      - Migrate the current WordPress blog to Jekyll and Disqus comments
      - Etc.
    • [Flaky Test] Disable DBPropertiesTest.GetProperty · 2a9c9710
      Committed by Islam AbdelRahman
      Summary: Disable flaky test
      
      Test Plan: run it
      
      Reviewers: yiwu, andrewkr, kradhakrishnan, yhchiang, lightmark, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62487
    • Disable ClockCache db_crashtest · d76ddf32
      Committed by Yi Wu
      Summary: Temporarily disable clock cache in db_crashtest while we investigate a data race issue with clock cache.
      
      Test Plan:
          python ./tools/db_crashtest.py blackbox
      
      Reviewers: sdong, lightmark, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62481
  3. Aug 24, 2016 (6 commits)
  4. Aug 23, 2016 (4 commits)
    • Fold function for thread-local data · 6584cec8
      Committed by Andrew Kryczka
      Summary:
      This function allows the user to provide a custom function to fold all
      threads' local data. It will be used in my next diff for aggregating statistics
      stored in thread-local data. Note the test case uses atomics as thread-local
      values due to the synchronization requirement (documented in code).
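
      A rough sketch of what such a fold can look like (hypothetical names; the real ThreadLocalPtr interface may differ): the caller supplies a function that is applied to every thread's local entry to accumulate one result.

      ```cpp
      #include <atomic>
      #include <cstdint>
      #include <functional>
      #include <vector>

      class PerThreadSlotsSketch {
       public:
        using FoldFunc = std::function<void(void* entry, void* result)>;

        // Apply func to each registered thread's local entry.
        void Fold(const FoldFunc& func, void* result) const {
          for (void* entry : slots_) {
            if (entry != nullptr) func(entry, result);
          }
        }

        void Register(void* entry) { slots_.push_back(entry); }

       private:
        std::vector<void*> slots_;  // one slot per thread
      };

      // Example fold: sum per-thread counters. Atomics are used for the local
      // values because other threads may still be updating them during the fold.
      inline uint64_t SumCounters(const PerThreadSlotsSketch& slots) {
        uint64_t total = 0;
        slots.Fold(
            [](void* entry, void* result) {
              *static_cast<uint64_t*>(result) +=
                  static_cast<std::atomic<uint64_t>*>(entry)->load();
            },
            &total);
        return total;
      }
      ```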
      
      Test Plan: unit test
      
      Reviewers: yhchiang, sdong, kradhakrishnan
      
      Reviewed By: kradhakrishnan
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62049
    • Add singleDelete to RocksJava (#1275) · 817eeb29
      Committed by Adam Retter
      * Rename RocksDB#remove -> RocksDB#delete to match the C++ API; add deprecated versions of RocksDB#remove for backwards compatibility.
      
      * Add missing experimental feature RocksDB#singleDelete
    • Add Status to RocksDBException so that meaningful function result Status from... · ffdf6eee
      Committed by Adam Retter
      Add Status to RocksDBException so that meaningful function result Status from the C++ API isn't lost (#1273)
      
    • Fix bug in printing values for block-based table · ecf90038
      Committed by Andrew Kryczka
      Summary: the value is not an InternalKey, so we do not need to decode it
      
      Test Plan:
      setup:
      
        $ ldb put --create_if_missing=true k v
        $ ldb put --db=./tmp --create_if_missing k v
        $ ldb compact --db=./tmp
      
      before:
      
        $ sst_dump --command=raw --file=./tmp/000004.sst
        ...
        terminate called after throwing an instance of 'std::length_error'
      
      after:
      
        $ ./sst_dump --command=raw --file=./tmp/000004.sst
        $ cat tmp/000004_dump.txt
        ...
        ASCII  k : v
        ...
      
      Reviewers: sdong, yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62301
  5. Aug 20, 2016 (4 commits)
    • LRU cache mid-point insertion · 72f8cc70
      Committed by Yi Wu
      Summary:
      Add mid-point insertion functionality to LRU cache. Caller of `Cache::Insert()` can set an additional parameter to make a cache entry have higher priority. The LRU cache will reserve at most `capacity * high_pri_pool_pct` bytes for high-pri cache entries. If `high_pri_pool_pct` is zero, the cache degenerates to normal LRU cache.
      
      Context: If we put index and filter blocks into the RocksDB block cache, they can be swapped out too early. We want to add an option to RocksDB that reserves some capacity in the block cache just for index/filter blocks, to mitigate the issue.
      
      In later diffs I'll update the block-based table reader to use this interface to cache index/filter blocks at high priority, expose the option in `DBOptions`, and make it dynamically changeable.
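
      A hedged usage sketch (the exact overloads and enum names below are assumptions based on this summary, not a confirmed API of this revision): reserve part of an LRU cache for high-priority entries, then flag an insert as high priority so it lands above the midpoint.

      ```cpp
      #include <memory>
      #include "rocksdb/cache.h"

      std::shared_ptr<rocksdb::Cache> MakeMidpointCache() {
        // Reserve roughly 20% of a 1 GB LRU cache for high-priority entries
        // (the high_pri_pool_pct described above; parameter name assumed).
        return rocksdb::NewLRUCache(static_cast<size_t>(1) << 30 /*capacity*/,
                                    6 /*num_shard_bits*/,
                                    false /*strict_capacity_limit*/,
                                    0.2 /*high_pri_pool_ratio*/);
      }

      // Inserting with the additional priority argument (assumed signature):
      // cache->Insert(key, value, charge, deleter, &handle,
      //               rocksdb::Cache::Priority::HIGH);
      ```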
      
      Test Plan: unit test.
      
      Reviewers: IslamAbdelRahman, sdong, lightmark
      
      Reviewed By: lightmark
      
      Subscribers: andrewkr, dhruba, march, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61977
    • Add TablePropertiesCollector support in SstFileWriter · 6a17b07c
      Committed by Islam AbdelRahman
      Summary: Update SstFileWriter to use user TablePropertiesCollectors that are passed in Options
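
      A hedged usage sketch (the SstFileWriter constructor shown is an assumption for this revision): a collector factory registered on Options is consulted while the SST file is built, so the finished file carries the custom table properties.

      ```cpp
      #include <memory>
      #include "rocksdb/options.h"
      #include "rocksdb/sst_file_writer.h"
      #include "rocksdb/table_properties.h"

      void WriteSstWithCollectorSketch(
          std::shared_ptr<rocksdb::TablePropertiesCollectorFactory> factory) {
        rocksdb::Options options;
        // The factory passed in Options is what SstFileWriter now honors.
        options.table_properties_collector_factories.push_back(factory);
        rocksdb::SstFileWriter writer(rocksdb::EnvOptions(), options,
                                      options.comparator);
        // writer.Open("/tmp/example.sst"); writer.Add(...); writer.Finish();
      }
      ```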
      
      Test Plan: unittests
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: jkedgar, andrewkr, hermanlee4, dhruba, yoshinorim
      
      Differential Revision: https://reviews.facebook.net/D62253
    • TableBuilder / TableReader support for range deletion · 78837f5d
      Committed by Wanning Jiang
      Summary:
      1. Range deletion tombstone structure (sketched below)
      2. Modify Add() in table_builder to make it usable for adding range del tombstones
      3. Expose NewTombstoneIterator() API in table_reader
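
      A hedged sketch of the tombstone shape implied by the summary (field names are assumptions, not the exact RocksDB struct): a range deletion covers [start_key, end_key) at a given sequence number and is written through the table builder much like a normal entry.

      ```cpp
      #include <cstdint>
      #include <string>

      struct RangeTombstoneSketch {
        std::string start_key;  // first user key covered (inclusive)
        std::string end_key;    // first user key not covered (exclusive)
        uint64_t sequence;      // applies to entries with smaller sequence numbers
      };
      ```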
      
      Test Plan: table_test.cc (BlockBasedTableBuilder::Add() now only accepts InternalKey, so table_test is changed to pass only InternalKey to BlockBasedTableBuilder. Also tests writing/reading range deletion tombstones in table_test.)
      
      Reviewers: sdong, IslamAbdelRahman, lightmark, andrewkr
      
      Reviewed By: andrewkr
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61473
    • Introduce ClockCache · 4cc37f59
      Committed by Yi Wu
      Summary:
      The clock-based cache implementation aims to have better concurrency than
      the default LRU cache. See inline comments for implementation details.
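
      For readers unfamiliar with the name, here is a self-contained sketch of the classic CLOCK ("second chance") eviction policy the cache is modeled on; RocksDB's ClockCache is lock-free and considerably more involved:

      ```cpp
      #include <cstddef>
      #include <vector>

      struct ClockEntry {
        bool usage_bit = false;  // set on every access to the entry
        bool occupied = false;   // whether the slot currently holds an entry
      };

      // Sweep the clock hand until a victim is found: a set usage bit earns the
      // entry a second chance (the bit is cleared); an unset bit marks the victim.
      // Assumes at least one slot is occupied.
      std::size_t EvictOne(std::vector<ClockEntry>& entries, std::size_t& hand) {
        for (;;) {
          ClockEntry& e = entries[hand];
          std::size_t current = hand;
          hand = (hand + 1) % entries.size();
          if (!e.occupied) continue;       // skip empty slots
          if (e.usage_bit) {
            e.usage_bit = false;           // second chance; keep sweeping
          } else {
            e.occupied = false;            // evict this entry
            return current;
          }
        }
      }
      ```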
      
      Test Plan:
      Update cache_test to run on both LRUCache and ClockCache. Adding some
      new tests to catch some of the bugs that I fixed while implementing the
      cache.
      
      Reviewers: kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61647
  6. Aug 19, 2016 (1 commit)
    • Adding TBB as dependency. · ff17a2ab
      Committed by Yi Wu
      Summary: Splitting the makefile part of D55581.
      
      Test Plan:
        make all check -j32
        ROCKSDB_FBCODE_BUILD_WITH_481=1 make all check -j32
        ROCKSDB_NO_FBCODE=1 make all check -j32
      
        export TBB_BASE=/mnt/gvfs/third-party2/tbb/afa54b33cfcf93f1d90a3160cdb894d6d63d5dca/4.0_update2/gcc-4.9-glibc-2.20/e9936bf;
        ROCKSDB_NO_FBCODE=1 CFLAGS="-I $TBB_BASE/include" LDFLAGS="-L $TBB_BASE/lib -Wl,-rpath=$TBB_BASE/lib" make all check -j32
      
      Reviewers: IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: kradhakrishnan, yhchiang, IslamAbdelRahman, andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D56979
  7. Aug 18, 2016 (2 commits)
  8. Aug 17, 2016 (2 commits)
    • Small nits (#1280) · 3981345b
      Committed by Dmitri Smirnov
      * Create rate limiter using factory function in the test.
      
      * Convert function-local statics in the option helper to a C array
        that does not perform dynamic memory allocation. This is helpful
        when you try to memory-isolate different DB instances (see the sketch below).
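
      An illustrative sketch of the technique with a made-up table (the real change touches the option-helper string tables):

      ```cpp
      #include <cstddef>

      // Before: a function-local static container allocates on the heap the
      // first time the accessor runs, e.g.
      //
      //   static const std::vector<std::string>& CompressionNames() {
      //     static std::vector<std::string> names = {"snappy", "zlib", "lz4"};
      //     return names;
      //   }
      //
      // After: a plain C array of string literals uses only static storage,
      // so no dynamic allocation happens at all.
      static const char* const kCompressionNames[] = {"snappy", "zlib", "lz4"};
      static const std::size_t kNumCompressionNames =
          sizeof(kCompressionNames) / sizeof(kCompressionNames[0]);
      ```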
    • Move LRUCache structs to lru_cache.h header · 2a2ebb6f
      Committed by Yi Wu
      Summary: ... so that I can include the header and create LRUCache specific tests for D61977
      
      Test Plan:
         make check
      
      Reviewers: lightmark, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62145
  9. Aug 16, 2016 (8 commits)