1. 11 Apr, 2020 7 commits
  2. 10 Apr, 2020 4 commits
    • L
      Provide an allocator for new memory type to be used with RocksDB block cache (#6214) · 66a95f0f
      Luca Giacchino authored
      Summary:
      New memory technologies are being developed by various hardware vendors (Intel DCPMM is one such technology currently available). These new memory types require different libraries for allocation and management (such as PMDK and memkind). The high capacities available make it possible to provision large caches (up to several TBs in size), beyond what is achievable with DRAM.
      The new allocator provided in this PR uses the memkind library to allocate memory on different media.
      
      **Performance**
      
      We tested the new allocator using db_bench.
      - For each test, we vary the size of the block cache (relative to the size of the uncompressed data in the database).
      - The database is filled sequentially. Throughput is then measured with a readrandom benchmark.
      - We use a uniform distribution as a worst-case scenario.
      
      The plot shows throughput (ops/s) relative to a configuration with no block cache and default allocator.
      For all tests, p99 latency is below 500 us.
      
      ![image](https://user-images.githubusercontent.com/26400080/71108594-42479100-2178-11ea-8231-8a775bbc92db.png)
      
      **Changes**
      
      - Add MemkindKmemAllocator
      - Add --use_cache_memkind_kmem_allocator db_bench option (to create an LRU block cache with the new allocator)
      - Add detection of memkind library with KMEM DAX support
      - Add test for MemkindKmemAllocator
      
      **Minimum Requirements**
      
      - kernel 5.3.12
      - ndctl v67 - https://github.com/pmem/ndctl
      - memkind v1.10.0 - https://github.com/memkind/memkind
      
      **Memory Configuration**
      
      The allocator uses the MEMKIND_DAX_KMEM memory kind. Follow the instructions on [memkind’s GitHub page](https://github.com/memkind/memkind) to set up NVDIMM memory accordingly.
      
      Note on memory allocation when NVDIMM memory is exposed as system memory:
      - The MemkindKmemAllocator allocates only from NVDIMM memory (using memkind_malloc with the MEMKIND_DAX_KMEM kind).
      - The default allocator is not restricted to RAM. Based on NUMA node latency, the kernel should prefer local RAM, but the choice is the kernel’s. Use numactl --preferred or --membind to allocate preferentially or exclusively from the local RAM node.
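
      As a rough illustration of how such an allocator is shaped, here is a minimal sketch. `SketchAllocator` and `SketchKmemAllocator` are hypothetical stand-ins, not RocksDB's real classes, and `std::malloc`/`std::free` are used so the code builds without memkind. The comments mark where `memkind_malloc` and `memkind_free` with the `MEMKIND_DAX_KMEM` kind would presumably go.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <cstdlib>
      #include <cstring>
      #include <new>

      // Hypothetical allocator interface for illustration only.
      class SketchAllocator {
       public:
        virtual ~SketchAllocator() = default;
        virtual const char* Name() const = 0;
        virtual void* Allocate(std::size_t size) = 0;
        virtual void Deallocate(void* p) = 0;
      };

      // Stand-in for a memkind-backed allocator (assumption: malloc replaces memkind).
      class SketchKmemAllocator : public SketchAllocator {
       public:
        const char* Name() const override { return "SketchKmemAllocator"; }

        void* Allocate(std::size_t size) override {
          // With memkind available: memkind_malloc(MEMKIND_DAX_KMEM, size)
          void* p = std::malloc(size);
          if (p == nullptr) throw std::bad_alloc();
          return p;
        }

        void Deallocate(void* p) override {
          // With memkind available: memkind_free(MEMKIND_DAX_KMEM, p)
          std::free(p);
        }
      };

      // Copies `n` bytes into memory from `alloc` and checks that they read back intact.
      bool RoundTrip(SketchAllocator& alloc, const char* data, std::size_t n) {
        void* buf = alloc.Allocate(n);
        std::memcpy(buf, data, n);
        const bool same = std::memcmp(buf, data, n) == 0;
        alloc.Deallocate(buf);
        return same;
      }
      ```

      Keeping the allocation calls behind a small interface is what lets a cache switch memory media without changing its own logic.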
      
      **Usage**
      
      When creating an LRU cache, pass a MemkindKmemAllocator object as argument.
      For example (replace capacity with the desired value in bytes):
      
      ```
      #include "rocksdb/cache.h"
      #include "memory/memkind_kmem_allocator.h"
      
      NewLRUCache(
          capacity /*size_t*/,
          6 /*cache_numshardbits*/,
          false /*strict_capacity_limit*/,
          0.0 /*cache_high_pri_pool_ratio (a double, not a bool)*/,
          std::make_shared<MemkindKmemAllocator>());
      ```
      
      Refer to [RocksDB’s block cache documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache) to assign the LRU cache as block cache for a database.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6214
      
      Reviewed By: cheng-chang
      
      Differential Revision: D19292435
      
      fbshipit-source-id: 7202f47b769e7722b539c86c2ffd669f64d7b4e1
    • P
      Temporarily disable ppc64le unit tests in PRs (#6682) · 9d6974d3
      Peter Dillinger authored
      Summary:
      Until Travis gets its act together (https://github.com/facebook/rocksdb/issues/6653)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6682
      
      Test Plan: CI
      
      Reviewed By: riversand963
      
      Differential Revision: D20948865
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 215de523c91a83d2a159f466b853e700c925ba4f
    • S
      Fix memory corruption caused by new test in options_settable_test (#6676) · e860f884
      sdong authored
      Summary:
      https://github.com/facebook/rocksdb/pull/6668 added some new test code that carries a risk of memory corruption. This change fixes it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6676
      
      Test Plan: Run the test under ASAN and see it passes.
      
      Reviewed By: ajkr
      
      Differential Revision: D20937108
      
      fbshipit-source-id: 22cc96bb02030df0a37a02e67a2cc37ca31ba22d
    • C
      Add two more optimization improvements to HISTORY (#6679) · 6e6f8079
      Cheng Chang authored
      Summary:
      Although these optimizations are not user-facing, it still seems valuable to call them out in HISTORY.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6679
      
      Test Plan: no need
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D20945916
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: f3e790c07f3bcc4a8a74246c4fa232800ddd4438
  3. 09 Apr, 2020 6 commits
  4. 08 Apr, 2020 6 commits
  5. 07 Apr, 2020 2 commits
  6. 05 Apr, 2020 1 commit
    • P
      Add some timestamps in CI build+test output (#6643) · a67fb4c9
      Peter Dillinger authored
      Summary:
      When Travis times out, it's hard to determine whether
      the last executing thing took an excessively long time or the
      sum of all the work just exceeded the time limit. This
      change inserts some timestamps in the output that should
      make this easier to determine.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6643
      
      Test Plan: CI (Travis mostly)
      
      Reviewed By: anand1976
      
      Differential Revision: D20843901
      
      Pulled By: pdillinger
      
      fbshipit-source-id: e7aae5434b0c609931feddf238ce4355964488b7
  7. 04 Apr, 2020 5 commits
    • S
      Fix clang analyze warning caused by #6262 (#6641) · 00f8016b
      sdong authored
      Summary:
      https://github.com/facebook/rocksdb/pull/6262 causes clang analyze to complain. Add an assertion to suppress the warning.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6641
      
      Test Plan: Run "clang analyze" and make sure it passes.
      
      Reviewed By: anand1976
      
      Differential Revision: D20841722
      
      fbshipit-source-id: 5fa6e0c5cfe7a822214c9b898a408df59d4fd2cd
    • A
      fix compiler errors with -DNPERF_CONTEXT (#6642) · e60ea7fe
      Andrew Kryczka authored
      Summary:
      as titled
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6642
      
      Test Plan:
      ```
      $ EXTRA_CXXFLAGS="-DNPERF_CONTEXT" DEBUG_LEVEL=0 make -j48 db_bench
      ```
      
      Reviewed By: riversand963
      
      Differential Revision: D20842313
      
      Pulled By: ajkr
      
      fbshipit-source-id: a830cad312ca681591f06749242279503b101df2
    • M
      Move the OptionTypeMap code closer to home (#6198) · 259b6ec8
      mrambacher authored
      Summary:
      This is a predecessor to the Configurable PR.  This change moves the OptionTypeInfo maps closer to where they will be used.
      
      When the Configurable changes are adopted, these values will become static and not associated with the OptionsHelper.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6198
      
      Reviewed By: siying
      
      Differential Revision: D20778108
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: a9f85fc73bc53503656e1958ecc1e764052fd1aa
    • P
      Revamp cache_bench to resemble a real workload (#6629) · 079e77ff
      Peter Dillinger authored
      Summary:
      I suspect LRUCache could use some optimization, and to support
      such an effort, a good benchmarking tool is needed. The existing
      cache_bench was heavily skewed toward insertion and lookup misses, and
      did not saturate memory with other work. This change should improve
      those things to better resemble a real workload.
      
      (All below using clang compiler, for some consistency, but not
      necessarily same version and settings.)
      
      The real workload is from production MySQL on RocksDB, filtering stacks
      containing "LRU", "ShardedCache" or "CacheShard."
      Lookup inclusive: 66%
      Insert inclusive: 17%
      Release inclusive: 15%
      
      An alternate simulated workload is MySQL running a LinkBench read test:
      Lookup inclusive: 54%
      Insert inclusive: 24%
      Release inclusive: 21%
      
      cache_bench default settings, prior to this change:
      Lookup inclusive: 35.8%
      Insert inclusive: 63.6%
      Release inclusive: 0%
      
      cache_bench after this change (intended as somewhat "tighter" workload
      than average production, more like LinkBench):
      Lookup inclusive: 52%
      Insert inclusive: 20%
      Release inclusive: 26%
      
      And top exclusive stacks (portion of stack samples as filtered above):
      Production MySQL:
      LRUHandleTable::FindPointer: 25.3%
      rocksdb::operator==: 15.1%  <-- Slice ==
      LRUCacheShard::LRU_Remove: 13.8%
      ShardedCache::Lookup: 8.9%
      __pthread_mutex_lock: 7.1%
      LRUCacheShard::LRU_Insert: 6.3%
      MurmurHash64A: 4.8%  <-- Since upgraded to XXH3p
      ...
      
      Old cache_bench:
      LRUHandleTable::FindPointer: 23.6%
      __pthread_mutex_lock: 15.0%
      __pthread_mutex_unlock_usercnt: 11.7%
      __lll_lock_wait: 8.6%
      __lll_unlock_wake: 6.8%
      LRUCacheShard::LRU_Insert: 6.0%
      ShardedCache::Lookup: 4.4%
      LRUCacheShard::LRU_Remove: 2.8%
      ...
      rocksdb::operator==: 0.2%  <-- Slice ==
      ...
      
      New cache_bench:
      LRUHandleTable::FindPointer: 22.8%
      __pthread_mutex_unlock_usercnt: 14.3%
      rocksdb::operator==: 10.5%  <-- Slice ==
      LRUCacheShard::LRU_Insert: 9.0%
      __pthread_mutex_lock: 5.9%
      LRUCacheShard::LRU_Remove: 5.0%
      ...
      ShardedCache::Lookup: 2.9%
      ...
      
      So there's a bit more lock contention in the benchmark than in
      production, but otherwise looks similar enough to me. At least it's a
      big improvement over the existing code.
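
      One generic way to steer a benchmark toward a target operation mix is to draw each operation from a weighted distribution. The sketch below shows only that idea and is not how cache_bench is implemented. `SampleOps` and its parameters are hypothetical, and the weights are operation-count ratios picked by whoever tunes the benchmark. The percentages above are shares of stack samples, so they are something to measure against rather than values to plug in directly.

      ```cpp
      #include <array>
      #include <cassert>
      #include <cstddef>
      #include <random>

      // Draws `n` operations from a weighted mix and counts how often each is chosen.
      // Index 0 = lookup, 1 = insert, 2 = release (hypothetical labels).
      std::array<std::size_t, 3> SampleOps(std::size_t n, double lookup_w,
                                           double insert_w, double release_w,
                                           unsigned seed) {
        assert(lookup_w >= 0 && insert_w >= 0 && release_w >= 0);
        std::mt19937 rng(seed);
        std::discrete_distribution<int> pick({lookup_w, insert_w, release_w});
        std::array<std::size_t, 3> counts{};  // zero-initialized
        for (std::size_t i = 0; i < n; ++i) {
          ++counts[static_cast<std::size_t>(pick(rng))];
        }
        return counts;
      }
      ```

      In a real benchmark the loop body would perform the chosen cache operation instead of counting it, and the weights would be adjusted until the measured profile resembles the target.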
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6629
      
      Test Plan: No production code changes, ran cache_bench with ASAN
      
      Reviewed By: ltamasi
      
      Differential Revision: D20824318
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 6f8dc5891ead0f87edbed3a615ecd5289d9abe12
    • B
      Fix msvc debug test failures (#6579) · df62cd5b
      Burton Li authored
      Summary:
      1. stats_history_test: one slice of stats history is 12526 bytes, which is larger than the original assumption.
      ![image](https://user-images.githubusercontent.com/17753898/77381970-5a611a80-6d3c-11ea-9d64-59d2e3c04f79.png)
      2. table_test: in the VerifyBlockAccessTrace function, release the trace reader before deleting the trace file.
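
      The second fix follows a common pattern: close a file handle before deleting the file, since on Windows deleting a file that is still open typically fails. The sketch below is a generic illustration, not the code from table_test. `WriteReadDelete` and the file content are hypothetical.

      ```cpp
      #include <cassert>
      #include <cstdio>
      #include <fstream>
      #include <string>

      // Writes a file, reads it back in a nested scope, then deletes it.
      // Returns true if the content round-tripped and the delete succeeded.
      bool WriteReadDelete(const std::string& path) {
        {
          std::ofstream out(path);
          out << "trace-record\n";
        }  // writer is closed here

        std::string line;
        {
          std::ifstream reader(path);
          std::getline(reader, line);
        }  // reader is closed here, before the delete below

        // std::remove returns 0 on success.
        return line == "trace-record" && std::remove(path.c_str()) == 0;
      }
      ```

      Scoping the reader in its own block makes the release order explicit instead of relying on where the variable happens to go out of scope.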
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6579
      
      Reviewed By: siying
      
      Differential Revision: D20767373
      
      Pulled By: pdillinger
      
      fbshipit-source-id: e8647d665cbe83a3f5429639c6219b50c0912124
  8. 03 Apr, 2020 6 commits
  9. 02 Apr, 2020 3 commits
    • Y
      Add counters in perf_context to measure cipher time (#6596) · 2b02ea25
      Yi Wu authored
      Summary:
      Add `encrypt_data_time` and `decrypt_data_time` perf_context counters to measure encryption/decryption time when `EnvEncryption` is enabled.
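
      Counters like these are often filled in by a scoped timer that adds its own lifetime to a counter when it is destroyed. The sketch below shows that general technique under stated assumptions. `SketchPerfCounters`, `ScopedNanoTimer`, and `FakeEncrypt` are hypothetical and are not RocksDB's PerfContext or its timing macros. Only the two field names come from the summary.

      ```cpp
      #include <cassert>
      #include <chrono>
      #include <cstdint>
      #include <thread>

      // Hypothetical counter struct; field names echo the summary above.
      struct SketchPerfCounters {
        std::uint64_t encrypt_data_time = 0;  // nanoseconds
        std::uint64_t decrypt_data_time = 0;  // nanoseconds
      };

      // Adds the lifetime of this object, in nanoseconds, to the given counter.
      class ScopedNanoTimer {
       public:
        explicit ScopedNanoTimer(std::uint64_t* counter)
            : counter_(counter), start_(std::chrono::steady_clock::now()) {}

        ~ScopedNanoTimer() {
          const auto elapsed = std::chrono::steady_clock::now() - start_;
          *counter_ += static_cast<std::uint64_t>(
              std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count());
        }

        ScopedNanoTimer(const ScopedNanoTimer&) = delete;
        ScopedNanoTimer& operator=(const ScopedNanoTimer&) = delete;

       private:
        std::uint64_t* counter_;
        std::chrono::steady_clock::time_point start_;
      };

      // Stand-in for a cipher call: sleeps briefly instead of encrypting.
      void FakeEncrypt(SketchPerfCounters* counters) {
        ScopedNanoTimer timer(&counters->encrypt_data_time);
        std::this_thread::sleep_for(std::chrono::milliseconds(2));
      }
      ```

      Because the timer records on destruction, early returns and error paths inside the timed function are still counted.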
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6596
      
      Test Plan: CI
      
      Reviewed By: anand1976
      
      Differential Revision: D20678617
      
      fbshipit-source-id: 7b57536143aa38509cde011f704de33382169e07
    • Z
      Add pipelined & parallel compression optimization (#6262) · 03a781a9
      Ziyue Yang authored
      Summary:
      This PR adds support for pipelined & parallel compression optimization for `BlockBasedTableBuilder`. This optimization makes block building, block compression and block appending a pipeline, and uses multiple threads to accelerate block compression. Users can set `CompressionOptions::parallel_threads` greater than 1 to enable compression parallelism.
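
      The pipeline idea can be sketched with the standard library alone: hand each built block to a concurrent compression task, keep the pending results in submission order, and append them in that same order. This is an illustration under stated assumptions, not the `BlockBasedTableBuilder` implementation. `FakeCompress` and `BuildFile` are hypothetical, and the parameter merely borrows the name `parallel_threads` from the option mentioned above.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <deque>
      #include <future>
      #include <string>
      #include <vector>

      // Stand-in for a real compressor: a trivial, easy-to-check transformation.
      std::string FakeCompress(const std::string& block) {
        return "[" + std::to_string(block.size()) + ":" + block + "]";
      }

      // Compresses blocks on concurrent tasks, with at most `parallel_threads`
      // results pending at once, and appends the results in original block order.
      std::string BuildFile(const std::vector<std::string>& blocks,
                            std::size_t parallel_threads) {
        assert(parallel_threads >= 1);
        std::deque<std::future<std::string>> in_flight;  // oldest task at the front
        std::string file;
        for (const std::string& block : blocks) {
          // Compression stage: std::async copies the block into the new task.
          in_flight.push_back(std::async(std::launch::async, FakeCompress, block));
          // Append stage: once the window is full, wait for the oldest result.
          if (in_flight.size() >= parallel_threads) {
            file += in_flight.front().get();
            in_flight.pop_front();
          }
        }
        // Drain whatever is still pending, preserving order.
        while (!in_flight.empty()) {
          file += in_flight.front().get();
          in_flight.pop_front();
        }
        return file;
      }
      ```

      With `parallel_threads` set to 1 the sketch degenerates to serial compression, and larger values let later blocks compress while earlier ones wait to be appended.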
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6262
      
      Reviewed By: ajkr
      
      Differential Revision: D20651306
      
      fbshipit-source-id: 62125590a9c15b6d9071def9dc72589c1696a4cb
    • S
      Add dependency of gtest on pthread (#6572) · 719c0f91
      Sylvain Oliver authored
      Summary:
      Compilation of RocksDB fails because gtest needs the -lpthread flag, and the flag must appear after libgtest.a on the link line.
      
      **Before modification** :
      /usr/bin/c++   -W -Wextra -Wall -Wsign-compare -Wshadow -Wno-unused-parameter -Wno-unused-variable -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers -Wno-strict-aliasing -std=c++11 -march=native -Werror -fno-builtin-memcmp -g -DROCKSDB_USE_RTTI   CMakeFiles/table_reader_bench.dir/table/table_reader_bench.cc.o  -o table_reader_bench -Wl,-rpath,/develop/src/rocksdb/build librocksdb.so.6.8.0 libtestharness.a /usr/lib/x86_64-linux-gnu/libgflags.so -lpthread third-party/gtest-1.8.1/fused-src/gtest/libgtest.a
      
      **After modification** :
      /usr/bin/c++   -W -Wextra -Wall -Wsign-compare -Wshadow -Wno-unused-parameter -Wno-unused-variable -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers -Wno-strict-aliasing -std=c++11 -march=native -Werror -fno-builtin-memcmp -g -DROCKSDB_USE_RTTI   CMakeFiles/table_reader_bench.dir/table/table_reader_bench.cc.o  -o table_reader_bench -Wl,-rpath,/develop/src/rocksdb/build librocksdb.so.6.8.0 libtestharness.a /usr/lib/x86_64-linux-gnu/libgflags.so third-party/gtest-1.8.1/fused-src/gtest/libgtest.a -lpthread
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6572
      
      Reviewed By: anand1976
      
      Differential Revision: D20789059
      
      Pulled By: ajkr
      
      fbshipit-source-id: 97329f14b9044b12c8a415da3d5f27b256ff8ff7