1. 18 Dec, 2018 5 commits
  2. 15 Dec, 2018 3 commits
  3. 14 Dec, 2018 6 commits
    • Refine db_stress params for atomic flush (#4781) · 8d2b74d2
      Andrew Kryczka committed
      Summary:
      Separate the flag that enables the option from the flag that enables the dedicated atomic flush stress test. I have found that setting the former without the latter can detect different problems.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4781
      
      Differential Revision: D13463211
      
      Pulled By: ajkr
      
      fbshipit-source-id: 054f777885b2dc7d5ea99faafa21d6537eee45fd
      8d2b74d2
    • Fix race condition on options_file_number_ (#4780) · 34954233
      Maysam Yabandeh committed
      Summary:
      options_file_number_ must be written under db::mutex_ since its read is protected by mutex_ in ::GetLiveFiles(). Currently, however, it is written in ::RenameTempFileToOptionsFile(), which according to its contract must be called without holding db::mutex_. The patch fixes the race condition by acquiring mutex_ before writing options_file_number_, and it does so only if the rename of the options file succeeds.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4780
      
      Differential Revision: D13461411
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 2d5bae96a1f3e969ef2505b737cf2d7ae749787b
      34954233
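The locking pattern described in the commit above can be sketched with the standard library alone. This is a simplified model, not the RocksDB implementation: `SketchDB`, the `rename_ok` parameter, and `GetOptionsFileNumber` are hypothetical names introduced for illustration, and the real rename step is replaced by a boolean.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the DB object; not the RocksDB class.
class SketchDB {
 public:
  // Called WITHOUT holding mutex_, mirroring the contract described above.
  // `rename_ok` stands in for the result of the actual file rename.
  bool RenameTempFileToOptionsFile(bool rename_ok, uint64_t new_number) {
    if (!rename_ok) {
      return false;  // rename failed: leave options_file_number_ untouched
    }
    // Take the lock only around the write of the shared field.
    std::lock_guard<std::mutex> guard(mutex_);
    options_file_number_ = new_number;
    return true;
  }

  // Reader side: also takes mutex_, so it cannot race with the writer.
  uint64_t GetOptionsFileNumber() {
    std::lock_guard<std::mutex> guard(mutex_);
    return options_file_number_;
  }

 private:
  std::mutex mutex_;
  uint64_t options_file_number_ = 0;
};
```

Keeping the lock scope limited to the field update preserves the "call without holding the mutex" contract for the slow rename itself.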
    • Improve flushing multiple column families (#4708) · 4fce44fc
      Yanqin Jin committed
      Summary:
      If one column family is dropped, we should simply skip it and continue to flush
      other active ones.
      Currently we use Status::ShutdownInProgress to notify caller of column families
      being dropped. In the future, we should consider using a different Status code.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4708
      
      Differential Revision: D13378954
      
      Pulled By: riversand963
      
      fbshipit-source-id: 42f248cdf2d32d4c0f677cd39012694b8f1328ca
      4fce44fc
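A minimal sketch of the "skip dropped column families and keep flushing the rest" behavior described above. The types and the function name are hypothetical, and the flush itself is reduced to recording a name; the real code path and its Status handling are not modeled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified view of a column family.
struct SketchColumnFamily {
  std::string name;
  bool dropped;
};

// Returns the names of the column families that were "flushed".
// A dropped column family is skipped instead of aborting the whole loop.
std::vector<std::string> FlushActiveColumnFamilies(
    const std::vector<SketchColumnFamily>& cfs) {
  std::vector<std::string> flushed;
  for (const auto& cf : cfs) {
    if (cf.dropped) {
      continue;  // nothing to flush for a dropped column family
    }
    flushed.push_back(cf.name);  // stand-in for the actual flush work
  }
  return flushed;
}
```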
    • Reduce runtime of compact_on_deletion_collector_test (#4779) · 67e5b542
      Maysam Yabandeh committed
      Summary:
      It sometimes times out when it is run with TSAN. The patch reduces the iteration count from 50 to 30. This reduces the normal runtime from 5.2 to 3.1 seconds and should similarly address the TSAN timeout problem.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4779
      
      Differential Revision: D13456862
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: fdc0ad7d781b1c33b771d2415ff5fa2f1b5e2537
      67e5b542
    • Get `CompactionJobInfo` from CompactFiles · 2670fe8c
      DorianZheng committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4716
      
      Differential Revision: D13207677
      
      Pulled By: ajkr
      
      fbshipit-source-id: d0ccf5a66df6cbb07288b0c5ebad81fd9df3926b
      2670fe8c
    • Concurrent task limiter for compaction thread control (#4332) · a8b9891f
      Burton Li committed
      Summary:
      This PR aims to resolve the issue described at:
      https://github.com/facebook/rocksdb/issues/3972#issue-330771918
      
      We have a RocksDB instance created with leveled compaction and multiple column families (CFs); some CFs use HDD to store large, less frequently accessed data, and others use SSD.
      When there is continuous write traffic to all CFs, the compaction thread pool is mostly occupied by the slow HDD compactions, which prevents full use of the SSD bandwidth.
      Since atomic writes and transactions are needed across CFs, splitting the database into multiple RocksDB instances is not an option for us.
      
      With compaction thread control, we got a 30%+ HDD write throughput gain and much smoother SSD writes, since fewer write stalls happen.
      
      A ConcurrentTaskLimiter can be shared by multiple CFs across RocksDB instances, so the feature works not only for multi-CF scenarios but also for multi-RocksDB scenarios that need disk IO resource control per tenant.
      
      The usage is straightforward, e.g.:
      
      //
      // Enable compaction thread limiter thru ColumnFamilyOptions
      //
      std::shared_ptr<ConcurrentTaskLimiter> ctl(NewConcurrentTaskLimiter("foo_limiter", 4));
      Options options;
      ColumnFamilyOptions cf_opt(options);
      cf_opt.compaction_thread_limiter = ctl;
      ...
      
      //
      // Compaction thread limiter can be tuned or disabled on-the-fly
      //
      ctl->SetMaxOutstandingTask(12); // enlarge to 12 tasks
      ...
      ctl->ResetMaxOutstandingTask(); // disable (bypass) thread limiter
      ctl->SetMaxOutstandingTask(-1); // Same as above
      ...
      ctl->SetMaxOutstandingTask(0);  // fully throttled (0 tasks allowed)
      
      //
      // Sharing compaction thread limiter among CFs (to resolve multiple storage perf issue)
      //
      std::shared_ptr<ConcurrentTaskLimiter> ctl_ssd(NewConcurrentTaskLimiter("ssd_limiter", 8));
      std::shared_ptr<ConcurrentTaskLimiter> ctl_hdd(NewConcurrentTaskLimiter("hdd_limiter", 4));
      Options options;
      ColumnFamilyOptions cf_opt_ssd1(options);
      ColumnFamilyOptions cf_opt_ssd2(options);
      ColumnFamilyOptions cf_opt_hdd1(options);
      ColumnFamilyOptions cf_opt_hdd2(options);
      ColumnFamilyOptions cf_opt_hdd3(options);
      
      // SSD CFs
      cf_opt_ssd1.compaction_thread_limiter = ctl_ssd;
      cf_opt_ssd2.compaction_thread_limiter = ctl_ssd;
      
      // HDD CFs
      cf_opt_hdd1.compaction_thread_limiter = ctl_hdd;
      cf_opt_hdd2.compaction_thread_limiter = ctl_hdd;
      cf_opt_hdd3.compaction_thread_limiter = ctl_hdd;
      
      ...
      
      //
      // The limiter is disabled by default (or set to nullptr explicitly)
      //
      Options options;
      ColumnFamilyOptions cf_opt(options);
      cf_opt.compaction_thread_limiter = nullptr;
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4332
      
      Differential Revision: D13226590
      
      Pulled By: siying
      
      fbshipit-source-id: 14307aec55b8bd59c8223d04aa6db3c03d1b0c1d
      a8b9891f
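To make the limit semantics in the usage example above concrete (a negative limit disables the limiter, 0 admits no tasks), here is a stdlib-only sketch of a shared counter with a tunable cap. It is an illustration under stated assumptions, not the RocksDB implementation: the class `SketchTaskLimiter` and the methods `TryAcquire`, `Release`, and `Outstanding` are hypothetical; only `SetMaxOutstandingTask` and `ResetMaxOutstandingTask` borrow their names from the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical limiter model: counts outstanding tasks against a tunable cap.
class SketchTaskLimiter {
 public:
  explicit SketchTaskLimiter(int32_t limit)
      : max_outstanding_(limit), outstanding_(0) {}

  // Names taken from the commit message; the behavior here is a simplified guess.
  void SetMaxOutstandingTask(int32_t limit) { max_outstanding_.store(limit); }
  void ResetMaxOutstandingTask() { max_outstanding_.store(-1); }  // -1: no limit

  // Hypothetical: reserve one slot; returns false when the cap is reached.
  bool TryAcquire() {
    int32_t current = outstanding_.load();
    for (;;) {
      const int32_t limit = max_outstanding_.load();
      if (limit >= 0 && current >= limit) {
        return false;  // at or above the cap (a cap of 0 admits nothing)
      }
      // On failure, compare_exchange_weak reloads `current` and we retry.
      if (outstanding_.compare_exchange_weak(current, current + 1)) {
        return true;
      }
    }
  }

  // Hypothetical: give a slot back when the task finishes.
  void Release() { outstanding_.fetch_sub(1); }

  int32_t Outstanding() const { return outstanding_.load(); }

 private:
  std::atomic<int32_t> max_outstanding_;
  std::atomic<int32_t> outstanding_;
};
```

Because the cap is read on every acquisition attempt, changing it while tasks are running only affects new tasks; tasks already admitted are not interrupted in this model.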
  4. 13 Dec, 2018 1 commit
    • Fix flaky test DBCompactionTest::DeleteFileRange (#4776) · 0aa17c10
      Maysam Yabandeh committed
      Summary:
      The test has been failing sporadically, probably because the configured compaction options were actually unused. Verified with the following:
      ```
      ~/gtest-parallel/gtest-parallel ./db_compaction_test --gtest_filter=DBCompactionTest.DeleteFileRange --repeat=1000
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4776
      
      Differential Revision: D13441052
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d35075b9e6cef9b9c9d0d571f9cd72ade8eda55d
      0aa17c10
  5. 12 Dec, 2018 7 commits
  6. 11 Dec, 2018 4 commits
    • Promote CompactionFilter* accessors to ColumnFamilyOptionsInterface (#3461) · 8261e002
      Ben Clay committed
      Summary:
      When adding CompactionFilter and CompactionFilterFactory settings to the Java layer, ColumnFamilyOptions was modified directly instead of ColumnFamilyOptionsInterface. This meant that the old-style Options monolith was left behind.
      
      This patch fixes that, by:
      - promoting the CompactionFilter + CompactionFilterFactory setters from ColumnFamilyOptions -> ColumnFamilyOptionsInterface
      - adding getters in ColumnFamilyOptionsInterface
      - implementing setters in Options
      - implementing getters in both ColumnFamilyOptions and Options
      - adding testcases
      - reusing a test CompactionFilterFactory by moving it to a common location
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/3461
      
      Differential Revision: D13278788
      
      Pulled By: sagar0
      
      fbshipit-source-id: 72602c6eb97dc80734e718abb5e2e9958d3c753b
      8261e002
    • Properly set smallest key of subcompaction output (#4723) · 64aabc91
      Abhishek Madan committed
      Summary:
      It is possible to see a situation like the following when
      subcompactions are enabled:
      1. A subcompaction boundary is set to `[b, e)`.
      2. The first output file in a subcompaction has `c@20` as its smallest key.
      3. The range tombstone `[a, d)@30` is encountered.
      4. The tombstone is written to the range-del meta block and the new
         smallest key is set to `b@0` (since no keys in this subcompaction's
         output can be smaller than `b`).
      5. A key `b@10` in a lower level will now reappear, since it is not
         covered by the truncated start key `b@0`.
      
      In general, unless the smallest data key in a file has a seqnum of 0, it
      is not safe to truncate a tombstone at the start key to have a seqnum of
      0, since it can expose keys with a seqnum greater than 0 but less than
      the tombstone's actual seqnum.
      
      To fix this, when the lower bound of a file is from the subcompaction
      boundaries, we now set the seqnum of an artificially extended smallest
      key to the tombstone's seqnum. This is safe because subcompactions
      operate over disjoint sets of keys, and the subcompactions that can
      experience this problem are not the first subcompaction (which is
      unbounded on the left).
      
      Furthermore, there is now an assertion to detect the described anomalous
      case.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4723
      
      Differential Revision: D13236188
      
      Pulled By: abhimadan
      
      fbshipit-source-id: a6da6a113f2de1e2ff307ca72e055300c8fe5692
      64aabc91
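The reappearing-key scenario above follows from how internal keys sort. The sketch below assumes the usual ordering (user key ascending, then sequence number descending, so newer entries come first) and uses hypothetical names throughout. It checks whether `b@10` lies at or after a file's smallest key, once with the truncated key `b@0` and once with the tombstone's seqnum, `b@30`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified internal key: user key plus sequence number.
struct SketchInternalKey {
  std::string user_key;
  uint64_t seqnum;
};

// Assumed ordering: user key ascending, then seqnum DESCENDING,
// so for the same user key a newer entry sorts before an older one.
bool InternalKeyLess(const SketchInternalKey& a, const SketchInternalKey& b) {
  if (a.user_key != b.user_key) {
    return a.user_key < b.user_key;
  }
  return a.seqnum > b.seqnum;
}

// True if `key` sorts at or after the file's smallest key, i.e. it is not
// cut off by the left boundary that truncates the tombstone.
bool AtOrAfterSmallest(const SketchInternalKey& smallest,
                       const SketchInternalKey& key) {
  return !InternalKeyLess(key, smallest);
}
```

With smallest key `b@0`, the key `b@10` sorts before the boundary and falls outside the truncated tombstone, which is the bug described. With smallest key `b@30`, `b@10` sorts after the boundary and stays covered.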
    • Reduce javadoc warnings (#4764) · 10e7de77
      Adam Singer committed
      Summary:
      Compile logs have a bit of noise due to missing javadoc annotations. Updating docs to reduce.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4764
      
      Differential Revision: D13400193
      
      Pulled By: sagar0
      
      fbshipit-source-id: 65c7efb70747cc3bb35a336a6881ea6536ae5ff4
      10e7de77
    • Fix inline comments for assumed_tracked (#4762) · 21fca397
      Maysam Yabandeh committed
      Summary:
      Fix the definition of assumed_tracked in Transaction that was introduced in #4680
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4762
      
      Differential Revision: D13399150
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 2a30fe49e3c44adacd7e45cd48eae95023ca9dca
      21fca397
  7. 08 Dec, 2018 5 commits
  8. 07 Dec, 2018 1 commit
    • Extend Transaction::GetForUpdate with do_validate (#4680) · b878f93c
      Maysam Yabandeh committed
      Summary:
      Transaction::GetForUpdate is extended with a do_validate parameter with a default value of true. If false, it skips validating the snapshot (if there is any) before doing the read. After the read it also returns the latest value (it expects ReadOptions::snapshot to be nullptr). This allows RocksDB applications to use GetForUpdate similarly to how InnoDB does. Similarly, ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_exclusive_tracked with a default value of false. If true, it indicates that the call is assumed to come after a ::GetForUpdate(do_validate=false).
      The Java APIs are accordingly updated.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4680
      
      Differential Revision: D13068508
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: f0b59db28f7f6a078b60844d902057140765e67d
      b878f93c
  9. 06 Dec, 2018 4 commits
    • Update HISTORY.md (#4753) · 1d679e35
      Yanqin Jin committed
      Summary:
      As titled. Update history to include a recent bug fix in
      9be3e6b4.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4753
      
      Differential Revision: D13350286
      
      Pulled By: riversand963
      
      fbshipit-source-id: b6324780dee4cb1757bc2209403a08531c150c08
      1d679e35
    • Allow file-ingest-triggered flush to skip waiting for write-stall clear (#4751) · 9be3e6b4
      Yanqin Jin committed
      Summary:
      When a write stall has already been triggered because the number of L0 files
      reached the threshold, file ingestion must proceed with its flush without
      waiting for the write stall condition to be cleared by the compaction, because
      the compaction can wait for ingestion to finish (circular wait).
      
      In order to avoid this wait, we can set `FlushOptions.allow_write_stall` to be
      true (default is false). Setting it to false can cause deadlock.
      
      This can happen when the number of compaction threads is low.
      
      Consider the following:
      ```
      Time  compaction_thread                        ingestion_thread
       |                                             num_running_ingest_file_++
       |    while(num_running_ingest_file_>0){wait}
       |                                             flush
       V
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4751
      
      Differential Revision: D13343037
      
      Pulled By: riversand963
      
      fbshipit-source-id: d3b95938814af46ec4c463feff0b50c70bd8b23f
      9be3e6b4
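The decision described in the commit above can be reduced to one predicate. This is a toy model under stated assumptions: `SketchFlushOptions` and `MustWaitForStallToClear` are hypothetical names, and only the `allow_write_stall` field (default false) is taken from the commit message.

```cpp
#include <cassert>

// Hypothetical mirror of the one option discussed above.
struct SketchFlushOptions {
  bool allow_write_stall = false;  // default is false, per the commit message
};

// Returns true if the flush would first wait for the write stall to clear.
// An ingestion-triggered flush sets allow_write_stall so it never waits,
// which breaks the circular wait with the compaction thread.
bool MustWaitForStallToClear(const SketchFlushOptions& opts,
                             bool write_stall_active) {
  return write_stall_active && !opts.allow_write_stall;
}
```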
    • Move a function to critical section (#4752) · b96fccb1
      Yanqin Jin committed
      Summary:
      Test plan
      ```
      $ make clean && make -j32 all check
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4752
      
      Differential Revision: D13344705
      
      Pulled By: riversand963
      
      fbshipit-source-id: fc3a43174d09d70ccc2b09decd78e1da1b6ba9d1
      b96fccb1
    • Fix buck dev mode fbcode builds (#4747) · e58d7695
      anand76 committed
      Summary:
      Don't enable ROCKSDB_JEMALLOC unless the build mode is opt and the default
      allocator is jemalloc. In dev mode, this is causing compile/link errors such as:
      ```
      stderr: buck-out/dev/gen/rocksdb/src/rocksdb_lib#compile-pic-malloc_stats.cc.o4768b59e,gcc-5-glibc-2.23-clang/db/malloc_stats.cc.o:malloc_stats.cc:function rocksdb::DumpMallocStats(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*): error: undefined reference to 'malloc_stats_print'
      clang-7.0: error: linker command failed with exit code 1
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4747
      
      Differential Revision: D13324840
      
      Pulled By: anand1976
      
      fbshipit-source-id: 45ffbd4f63fe4d9e8a0473d8f066155e4ef64a14
      e58d7695
  10. 04 Dec, 2018 1 commit
  11. 01 Dec, 2018 3 commits