1. 04 Aug, 2017 1 commit
    • M
      Fix the overflow bug in AwaitState · 58410aee
      Maysam Yabandeh committed
      Summary:
      https://github.com/facebook/rocksdb/issues/2559 reports an overflow in AwaitState. nbronson has debugged the issue and presented the fix, which is applied to this patch. Moreover this patch adds more comments to clarify the logic in AwaitState.
      
      I tried with both 16 and 64 threads on the update benchmark. The fix lowers CPU usage by 1.6% but also lowers the throughput by 1.6% and 2%, respectively. Apparently the bug had favored spinning more often.
      
      Benchmarks:
      TEST_TMPDIR=/dev/shm/tmpdb time ./db_bench --benchmarks="fillrandom" --threads=16 --num=2000000
      TEST_TMPDIR=/dev/shm/tmpdb time ./db_bench --use_existing_db=1 --benchmarks="updaterandom[X3]" --threads=16 --num=2000000
      TEST_TMPDIR=/dev/shm/tmpdb time ./db_bench --use_existing_db=1 --benchmarks="updaterandom[X3]" --threads=64 --num=200000
      
      Results
      $ cat update-16t-bug.txt | tail -4
      updaterandom [AVG    3 runs] : 234117 ops/sec;   51.8 MB/sec
      updaterandom [MEDIAN 3 runs] : 233581 ops/sec;   51.7 MB/sec
      3896.42user 1539.12system 6:50.61elapsed 1323%CPU (0avgtext+0avgdata 331308maxresident)k
      0inputs+0outputs (0major+1281001minor)pagefaults 0swaps
      $ cat update-16t-fixed.txt | tail -4
      updaterandom [AVG    3 runs] : 230364 ops/sec;   51.0 MB/sec
      updaterandom [MEDIAN 3 runs] : 226169 ops/sec;   50.0 MB/sec
      3865.46user 1568.32system 6:57.63elapsed 1301%CPU (0avgtext+0avgdata 315012maxresident)k
      0inputs+0outputs (0major+1342568minor)pagefaults 0swaps
      
      $ cat update-64t-bug.txt | tail -4
      updaterandom [AVG    3 runs] : 261878 ops/sec;   57.9 MB/sec
      updaterandom [MEDIAN 3 runs] : 262859 ops/sec;   58.2 MB/sec
      926.27user 578.06system 2:27.46elapsed 1020%CPU (0avgtext+0avgdata 475480maxresident)k
      0inputs+0outputs (0major+1058728minor)pagefaults 0swaps
      $ cat update-64t-fixed.txt | tail -4
      updaterandom [AVG    3 runs] : 256699 ops/sec;   56.8 MB/sec
      updaterandom [MEDIAN 3 runs] : 256380 ops/sec;   56.7 MB/sec
      933.47user 575.37system 2:30.41elapsed 1003%CPU (0avgtext+0avgdata 482340maxresident)k
      0inputs+0outputs (0major+1078557minor)pagefaults 0swaps
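The percentages quoted in the summary can be re-derived from the AVG rows and the %CPU figures above. The helper below is a minimal sketch of that arithmetic; `PercentChange` is a name made up for illustration and is not part of db_bench.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative change from `before` to `after`, in percent.
// A negative result means the value dropped.
double PercentChange(double before, double after) {
  assert(before != 0.0);
  return (after - before) / before * 100.0;
}
```

Applied to the AVG throughput rows, `PercentChange(234117, 230364)` is about -1.6 (16 threads) and `PercentChange(261878, 256699)` is about -2.0 (64 threads). The %CPU figures (1323 to 1301, 1020 to 1003) each drop by roughly 1.6 to 1.7.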
      Closes https://github.com/facebook/rocksdb/pull/2679
      
      Differential Revision: D5553732
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 98b72dc3a8e0f22ea29d4f7c7790af10c369c5bb
  2. 03 Aug, 2017 2 commits
    • M
      Refactor TransactionImpl · c3d5c4d3
      Maysam Yabandeh committed
      Summary:
      This patch refactors TransactionImpl by separating the logic for pessimistic concurrency control from the implementation of how to write the data to rocksdb. The existing implementation is named WriteCommittedTxnImpl as it writes committed data to the db. A template named WritePreparedTxnImpl is also added, which will be completed later to provide an alternative implementation.
      Closes https://github.com/facebook/rocksdb/pull/2676
      
      Differential Revision: D5549998
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 16298e86b43ca4849324c1f35c731913c6d17bec
    • A
      support multiple CFs with OPTIONS file · 060ccd4f
      Andrew Kryczka committed
      Summary:
      Move an option necessary for running db_bench on multiple CFs into the general initialization area, so it works with both flag-based init and OPTIONS-based init.
      Closes https://github.com/facebook/rocksdb/pull/2675
      
      Differential Revision: D5541378
      
      Pulled By: ajkr
      
      fbshipit-source-id: 169926cb4ae95c17974f744faf7cc794d41e5c0a
  3. 02 Aug, 2017 2 commits
    • S
      Fix statistics in RocksJava sample · 34538706
      Sagar Vemuri committed
      Summary:
      While running `make jtest`, I observed that the Java sample was broken due to the changes in #2551.
      Closes https://github.com/facebook/rocksdb/pull/2674
      
      Differential Revision: D5539807
      
      Pulled By: sagar0
      
      fbshipit-source-id: 2c7e9d84778099dfa1c611996b444efe3c9fd466
    • Y
      Dump Blob DB options to info log · 1900771b
      Yi Wu committed
      Summary:
      * Dump blob db options to info log
      * Remove BlobDBOptionsImpl to disallow dynamic casts from *BlobDBOptions to *BlobDBOptionsImpl. Move the options there to be constants or into BlobDBOptions. The dynamic cast has been broken since #2645.
      * Change some of the default options
      * Remove blob_db_options.min_blob_size, which is unimplemented. Will implement it soon.
      Closes https://github.com/facebook/rocksdb/pull/2671
      
      Differential Revision: D5529912
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: dcd58ca981db5bcc7f123b65a0d6f6ae0dc703c7
  4. 01 Aug, 2017 3 commits
  5. 29 Jul, 2017 6 commits
    • S
      Replace dynamic_cast<> · 21696ba5
      Siying Dong committed
      Summary:
      Replace dynamic_cast<> so that users can choose to build with RTTI off, which saves several bytes per object and leaves slightly more memory available.
      Some nontrivial changes:
      1. Add Comparator::GetRootComparator() to get around the internal comparator hack
      2. Add the two experimental functions to DB
      3. Add TableFactory::GetOptionString() to avoid unnecessary casting to get the option string
      4. Since 3 is done, move the option parsing functions for table factory to the table factory files too, to be symmetric.
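The idea behind item 1 can be sketched as a virtual accessor that a wrapper overrides, so callers can reach the underlying object without `dynamic_cast<>` and therefore without RTTI. The classes below are simplified stand-ins invented for illustration; they are not RocksDB's `Comparator` hierarchy.

```cpp
#include <cassert>

// Hypothetical base class: by default an object is its own root.
class ToyComparator {
 public:
  virtual ~ToyComparator() {}
  virtual const char* Name() const = 0;
  virtual const ToyComparator* GetRootComparator() const { return this; }
};

class BytewiseToy : public ToyComparator {
 public:
  const char* Name() const override { return "bytewise"; }
};

// A wrapper forwards the question to the comparator it wraps, so the caller
// never needs to know (or cast to) the wrapper's concrete type.
class WrappingToy : public ToyComparator {
 public:
  explicit WrappingToy(const ToyComparator* inner) : inner_(inner) {
    assert(inner != nullptr);
  }
  const char* Name() const override { return "wrapper"; }
  const ToyComparator* GetRootComparator() const override {
    return inner_->GetRootComparator();
  }

 private:
  const ToyComparator* inner_;
};
```

With this shape, code that previously cast a comparator to a known wrapper type can call `GetRootComparator()` instead, which also works when the library is built with `-fno-rtti`.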
      Closes https://github.com/facebook/rocksdb/pull/2645
      
      Differential Revision: D5502723
      
      Pulled By: siying
      
      fbshipit-source-id: fd13cec5601cf68a554d87bfcf056f2ffa5fbf7c
    • M
      Prevent empty memtables from using a lot of memory · e85f2c64
      Mike Kolupaev committed
      Summary:
      This fixes OOMs that we (logdevice) are currently having in production.
      
      SkipListRep constructor does a couple of small allocations from ConcurrentArena (see InlineSkipList constructor). ConcurrentArena would sometimes allocate an entire block for that, which is a few megabytes (we use Options::arena_block_size = 4 MB). So an empty memtable can take 4 MB of memory. We have ~40k column families (spread across 15 DB instances), so 4 MB per empty memtable easily OOMs a machine for us.
      
      This PR makes ConcurrentArena always allocate from Arena's inline block when possible. So as long as InlineSkipList's initial allocations are below 2 KB there would be no blocks allocated for empty memtables.
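The effect described above can be illustrated with a toy arena that serves small requests from an inline buffer before it ever allocates a heap block. This is a minimal sketch under stated assumptions: `ToyArena`, `kInlineSize` and `HeapBlocks` are hypothetical names, alignment and thread safety are ignored, and it is not RocksDB's `Arena` or `ConcurrentArena`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Assumed inline capacity, matching the 2 KB figure mentioned in the summary.
constexpr size_t kInlineSize = 2048;

class ToyArena {
 public:
  explicit ToyArena(size_t block_size) : block_size_(block_size) {}

  char* Allocate(size_t bytes) {
    assert(bytes > 0);
    // Prefer the inline buffer so tiny allocations never trigger a heap block.
    if (bytes <= kInlineSize - inline_used_) {
      char* p = inline_buf_ + inline_used_;
      inline_used_ += bytes;
      return p;
    }
    // Otherwise fall back to heap blocks of block_size_ (or larger if needed).
    if (blocks_.empty() || bytes > current_capacity_ - block_used_) {
      current_capacity_ = bytes > block_size_ ? bytes : block_size_;
      blocks_.emplace_back(new char[current_capacity_]);
      block_used_ = 0;
    }
    char* p = blocks_.back().get() + block_used_;
    block_used_ += bytes;
    return p;
  }

  // Number of heap blocks allocated so far.
  size_t HeapBlocks() const { return blocks_.size(); }

 private:
  size_t block_size_;
  char inline_buf_[kInlineSize];
  size_t inline_used_ = 0;
  std::vector<std::unique_ptr<char[]>> blocks_;
  size_t current_capacity_ = 0;
  size_t block_used_ = 0;
};
```

In this sketch, an arena that has only seen a few small allocations reports `HeapBlocks() == 0`, which is the property that keeps an empty memtable from costing a full 4 MB block.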
      Closes https://github.com/facebook/rocksdb/pull/2569
      
      Differential Revision: D5404029
      
      Pulled By: al13n321
      
      fbshipit-source-id: 568ec22a3fd1a485c06123f6b2dfc5e9ef67cd23
    • S
      Fix FIFO Compaction with TTL tests · ac748c57
      Sagar Vemuri committed
      Summary:
      - FIFOCompactionWithTTLTest was flaky when run in parallel and had therefore been disabled. It is now fixed.
      - Also, the tests now fake sleep instead of really sleeping, which makes them more realistic by allowing TTLs like 1 hour and 1 day.
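Faking sleep generally means the test owns the clock: "sleeping" advances a counter and returns immediately, so a one-hour TTL can be exercised in microseconds of real time. The sketch below assumes a hypothetical `FakeClock` and `IsExpired`; RocksDB's tests use their own Env wrappers, which are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fake clock: time only moves when the test says so.
class FakeClock {
 public:
  uint64_t NowMicros() const { return now_micros_; }
  // "Sleeping" just advances the fake time and returns immediately.
  void SleepForMicroseconds(uint64_t micros) { now_micros_ += micros; }

 private:
  uint64_t now_micros_ = 0;
};

// True once something created at `creation_micros` is older than `ttl_micros`.
bool IsExpired(const FakeClock& clock, uint64_t creation_micros,
               uint64_t ttl_micros) {
  assert(clock.NowMicros() >= creation_micros);
  return clock.NowMicros() - creation_micros > ttl_micros;
}
```

A test can then record a creation time, call `SleepForMicroseconds` with one hour plus a microsecond, and check that `IsExpired` flips from false to true without any real waiting.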
      Closes https://github.com/facebook/rocksdb/pull/2650
      
      Differential Revision: D5506038
      
      Pulled By: sagar0
      
      fbshipit-source-id: deb429a527f045e3e2c5138b547c3e8ac8586aa2
    • Y
      Move blob_db/ttl_extractor.h into blob_db/blob_db.h · aaf42fe7
      Yi Wu committed
      Summary:
      Move blob_db/ttl_extractor.h into blob_db/blob_db.h
      Also exclude TTLExtractor from LITE build.
      Closes https://github.com/facebook/rocksdb/pull/2665
      
      Differential Revision: D5520009
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 4813dcc272c7cc4bf2cdac285256d9a17d78c7b7
    • S
      Fix license headers in Cassandra related files · aace4651
      Sagar Vemuri committed
      Summary:
      I might have missed these while doing some recent cassandra code reviews.
      Closes https://github.com/facebook/rocksdb/pull/2663
      
      Differential Revision: D5520138
      
      Pulled By: sagar0
      
      fbshipit-source-id: 340930afe9efe03c75f535a1da1f89bd3e53c1f9
    • I
      CacheActivityLogger, component to log cache activity into a file · 50a96913
      Islam AbdelRahman committed
      Summary:
      Simple component that will add a new entry in a log file every time we lookup/insert a key in SimCache.
      API:
      ```
      SimCache::StartActivityLogging(<file_name>, <env>, <optional_max_size>)
      SimCache::StopActivityLogging()
      ```
      
      Sending for review; I still need to add more comments.
      
      I was thinking about a better approach, but I ended up deciding I will use a mutex to sync the writes to the file, since this feature should not be heavily used and only used to collect info that will be analyzed offline. I think it's okay to hold the mutex every time we lookup/add to the SimCache.
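The trade-off described above (one mutex around every write, accepted because the feature is for offline analysis) can be sketched as follows. `ActivityLog`, its method names and its line format are hypothetical and are not the actual CacheActivityLogger implementation.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical logger: every lookup/add appends one line to the output stream.
class ActivityLog {
 public:
  explicit ActivityLog(std::ostream* out) : out_(out) {
    assert(out != nullptr);
  }

  void ReportLookup(const std::string& key) { Write("LOOKUP", key); }
  void ReportAdd(const std::string& key) { Write("ADD", key); }

 private:
  void Write(const char* op, const std::string& key) {
    // A single mutex serializes writers, so lines from different threads
    // never interleave. Simple, at the cost of contention on hot paths.
    std::lock_guard<std::mutex> guard(mu_);
    *out_ << op << ' ' << key << '\n';
  }

  std::mutex mu_;
  std::ostream* out_;
};
```

Holding the lock for the whole write keeps the code short and the log well-formed; a lock-free or buffered design would only be worth the complexity if logging were enabled on a latency-sensitive path.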
      Closes https://github.com/facebook/rocksdb/pull/2295
      
      Differential Revision: D5063826
      
      Pulled By: IslamAbdelRahman
      
      fbshipit-source-id: f3b5daed8b201987c9a071146ddd5c5740a2dd8c
  6. 28 Jul, 2017 8 commits
    • Y
      Blob DB TTL extractor · 6083bc79
      Yi Wu committed
      Summary:
      Introducing blob_db::TTLExtractor to replace extract_ttl_fn. The TTL
      extractor can be used to extract TTL from keys inserted with Put or
      WriteBatch. Changes over the existing extract_ttl_fn are:
      * If the value is changed, it will be returned via std::string* (rather than Slice*). With Slice* the new value has to be part of the existing value. With std::string* the limitation is removed.
      * It can optionally return TTL or expiration.
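The difference between returning the new value through `Slice*` and through `std::string*` can be sketched with a toy extractor. The interface below is invented for illustration (names, parameters and the `"<ttl>:<payload>"` value format are all assumptions) and is not the actual `blob_db::TTLExtractor` signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical interface: returns true if a TTL (in seconds) was found. When
// the stored value should differ from `value`, the extractor writes it to
// *new_value and sets *value_changed.
class ToyTTLExtractor {
 public:
  virtual ~ToyTTLExtractor() {}
  virtual bool ExtractTTL(const std::string& key, const std::string& value,
                          uint64_t* ttl, std::string* new_value,
                          bool* value_changed) = 0;
};

// Example extractor for values shaped like "<ttl_seconds>:<payload>".
class PrefixTTLExtractor : public ToyTTLExtractor {
 public:
  bool ExtractTTL(const std::string& /*key*/, const std::string& value,
                  uint64_t* ttl, std::string* new_value,
                  bool* value_changed) override {
    assert(ttl != nullptr && new_value != nullptr && value_changed != nullptr);
    const size_t pos = value.find(':');
    if (pos == std::string::npos || pos == 0) return false;
    uint64_t parsed = 0;
    for (size_t i = 0; i < pos; ++i) {
      if (value[i] < '0' || value[i] > '9') return false;
      parsed = parsed * 10 + static_cast<uint64_t>(value[i] - '0');
    }
    *ttl = parsed;
    // The new value is freshly built and is not a sub-range of `value`,
    // which an out-parameter pointing into the input could not express.
    *new_value = "data:" + value.substr(pos + 1);
    *value_changed = true;
    return true;
  }
};
```

Because the output owns its bytes, the extractor is free to re-encode or rebuild the value rather than only trimming it.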
      
      Other changes in this PR:
      * replace `std::chrono::system_clock` with `Env::NowMicros` so that I can mock time in tests.
      * add several TTL tests.
      * other minor naming changes.
      Closes https://github.com/facebook/rocksdb/pull/2659
      
      Differential Revision: D5512627
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 0dfcb00d74d060b8534c6130c808e4d5d0a54440
    • A
      fix asan/valgrind for TableCache cleanup · 710411ae
      Andrew Kryczka committed
      Summary:
      Breaking commit: d12691b8
      
      In the above commit, I moved the `TableCache` cleanup logic from `Version` destructor into `PurgeObsoleteFiles`. I missed cleaning up `TableCache` entries for the current `Version` during DB destruction.
      
      This PR adds that logic to `VersionSet` destructor. One unfortunate side effect is now we're potentially deleting `TableReader`s after `column_family_set_.reset()`, which means we can't call `BlockBasedTableReader::Close` a second time as the block cache might already be destroyed.
      Closes https://github.com/facebook/rocksdb/pull/2662
      
      Differential Revision: D5515108
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2cb820e19aa813e0d258d17f76b2d7b6b7ee0b18
    • Y
      TARGETS file not setting sse explicitly · 3a3fb00b
      Yi Wu committed
      Summary:
      We don't need to set them explicitly.
      Closes https://github.com/facebook/rocksdb/pull/2660
      
      Differential Revision: D5514141
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 10edebfc3cfe0afc00a34519f87fcea4d65069ae
    • S
      Build fewer tests in Travis platform_dependent tests · fca4d6da
      Siying Dong committed
      Summary:
      platform_dependent tests in Travis currently build all tests, which is not needed. Only build the tests we need to run.
      Closes https://github.com/facebook/rocksdb/pull/2647
      
      Differential Revision: D5513954
      
      Pulled By: siying
      
      fbshipit-source-id: 4d540b146124e70dd25586c47939d19f93655b0a
    • A
      remove unnecessary internal_comparator param in newIterator · 8f553d3c
      Aaron Gao committed
      Summary:
      solved https://github.com/facebook/rocksdb/issues/2604
      Closes https://github.com/facebook/rocksdb/pull/2648
      
      Differential Revision: D5504875
      
      Pulled By: lightmark
      
      fbshipit-source-id: c14bb62ccbdc9e7bda9cd914cae4ea0765d882ee
    • S
      "ccache -C" in Travis · 7f6d012d
      Siying Dong committed
      Summary:
      This is to work around the problem of build error:
      
      util/threadpool_imp.o: file not recognized: File truncated
      
      Just to make the build go through. We should remove it later if we find the real long-term solution.
      Closes https://github.com/facebook/rocksdb/pull/2657
      
      Differential Revision: D5511034
      
      Pulled By: siying
      
      fbshipit-source-id: 229f024bd78ee96799017d4a89be74253058ec30
    • A
      move TableCache::EraseHandle outside of db mutex · d12691b8
      Andrew Kryczka committed
      Summary:
      Post-compaction work holds onto the db mutex for the longest time (found by tracing lock acquires/releases with LTTng and correlating timestamps with our info log). Further experimentation showed `TableCache::EraseHandle` is responsible for ~86% of the time the mutex is held. We can just release the handle outside the db mutex.
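The general pattern is to do only the bookkeeping under the lock and defer the expensive release until the lock is dropped. The sketch below assumes a hypothetical `ToyCleaner` with a stand-in `Handle`; it does not reproduce RocksDB's `TableCache` or `PurgeObsoleteFiles` code.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a cache handle that is expensive to release.
struct Handle {
  int file_number;
};

class ToyCleaner {
 public:
  void AddObsolete(int file_number) {
    std::lock_guard<std::mutex> guard(mu_);
    obsolete_.push_back(Handle{file_number});
  }

  // Releases all pending handles and returns how many were released.
  size_t Purge() {
    std::vector<Handle> to_release;
    {
      // Under the mutex: only a cheap swap of vector internals.
      std::lock_guard<std::mutex> guard(mu_);
      to_release.swap(obsolete_);
    }
    // Outside the mutex: the slow per-handle work, so other threads
    // waiting on mu_ are not blocked by it.
    for (const Handle& h : to_release) {
      Release(h);
    }
    return to_release.size();
  }

  size_t released() const { return released_; }

 private:
  void Release(const Handle&) { ++released_; }  // placeholder for slow work

  std::mutex mu_;
  std::vector<Handle> obsolete_;
  size_t released_ = 0;
};
```

The critical section shrinks to a constant-time swap regardless of how many handles are pending, which is the kind of reduction the ~86% figure suggests is available.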
      Closes https://github.com/facebook/rocksdb/pull/2654
      
      Differential Revision: D5507126
      
      Pulled By: ajkr
      
      fbshipit-source-id: 703c01ddf2aea16bc0f9e33c08935d78aa6b781d
    • A
      fix db_bench argument type · f33f1136
      Andrew Kryczka committed
      Summary:
      it should be a bool
      Closes https://github.com/facebook/rocksdb/pull/2653
      
      Differential Revision: D5506148
      
      Pulled By: ajkr
      
      fbshipit-source-id: f142f0f3aa8b678c68adef12e5ac6e1e163306f3
  7. 27 Jul, 2017 5 commits
  8. 26 Jul, 2017 6 commits
    • M
      Remove the orphan assert on !need_log_sync · 30b58cf7
      Maysam Yabandeh committed
      Summary:
      We initially had disabled support for write_options.sync when concurrent_prepare_ is set. We later added this support but the statement that asserts this combination is not used was left there. This patch cleans it up.
      Closes https://github.com/facebook/rocksdb/pull/2642
      
      Differential Revision: D5496101
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: becbc503446f2a51bee24cc861958c090c724ec2
    • Y
      Fix flaky write_callback_test · fe1a5559
      Yi Wu committed
      Summary:
      The test is failing occasionally on the assert: `ASSERT_TRUE(writer->state == WriteThread::State::STATE_INIT)`. This is because the test doesn't make the leader wait long enough before updating state for its followers. The patch moves the update to `threads_waiting` to the end of the `WriteThread::JoinBatchGroup:Wait` callback to avoid this happening.
      
      Also adding `WriteThread::JoinBatchGroup:Start` and have each thread wait there while another thread is linking to the linked-list. This is to make the check of `is_leader` more deterministic.
      
      Also changing two while-loops of `compare_exchange_strong` to plain `fetch_add`, to make it look cleaner.
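For a plain increment, a `compare_exchange_strong` retry loop and a single `fetch_add` have the same observable effect; the latter is simply shorter. The two functions below are a minimal sketch with made-up names and are not the test's actual code.

```cpp
#include <atomic>
#include <cassert>

// Increment via a CAS retry loop. On failure, compare_exchange_strong
// reloads the current value into old_value, so the loop retries with it.
int IncrementWithCas(std::atomic<int>& counter) {
  int old_value = counter.load();
  while (!counter.compare_exchange_strong(old_value, old_value + 1)) {
  }
  return old_value;
}

// The same increment as one atomic read-modify-write.
// Both functions return the value the counter held before the increment.
int IncrementWithFetchAdd(std::atomic<int>& counter) {
  return counter.fetch_add(1);
}
```

A CAS loop is still the right tool when the new value is not a simple function like "plus one", for example when the update must be skipped under some condition.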
      Closes https://github.com/facebook/rocksdb/pull/2640
      
      Differential Revision: D5491525
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 6e897f122082bd6f98e6d51b31a25e5fd0a3fb82
    • Y
      5.6.1 release blog post · addbd279
      Yi Wu committed
      Summary:
      5.6.1 release blog post
      Closes https://github.com/facebook/rocksdb/pull/2638
      
      Differential Revision: D5491168
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 14e3a92a03684afa4bd19bfb3ffb053cc09f5d4a
    • A
      buckification: remove explicit `-msse*` compiler flags · 30edff30
      Andrew Gallagher committed
      Summary: These are implied by default platform flags, in particular, `-march=corei7`.
      
      Reviewed By: pixelb
      
      Differential Revision: D5485414
      
      fbshipit-source-id: 85f1329c71fa81a604760844187cc73877fb40e9
    • M
      Lower num of iterations in DeadlockCycle test · 2b259c9d
      Maysam Yabandeh committed
      Summary:
      Currently this test times out with tsan. This is likely due to decreased speed with tsan. By lowering the number of iterations we can still catch a bug, as the test is run regularly and multiple runs of the test are equivalent to running the test with more iterations.
      Closes https://github.com/facebook/rocksdb/pull/2639
      
      Differential Revision: D5490549
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: bd69c42a9728d337ac95a06a401088384e51731a
    • M
      Release note for partitioned index/filters · 277f6f23
      Maysam Yabandeh committed
      Summary: Closes https://github.com/facebook/rocksdb/pull/2637
      
      Differential Revision: D5489751
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 0298f8960d4f86ce67959616615beee4d802c2e4
  9. 25 Jul, 2017 7 commits