1. 16 Apr, 2020 3 commits
    • M
      Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621) · e45673de
      Mike Kolupaev committed
      Summary:
      Context: Index type `kBinarySearchWithFirstKey` added the ability for the sst file iterator to sometimes report a key from the index without reading the corresponding data block. This is useful when sst blocks are cut at some meaningful boundaries (e.g. one block per key prefix), and many seeks land between blocks (e.g. for each prefix, the ranges of keys in different sst files are nearly disjoint, so a typical seek needs to read a data block from only one file even if all files have the prefix). But this added a new error condition, which RocksDB code was really not equipped to deal with: `InternalIterator::value()` may fail with an IO error or Status::Incomplete, but it's just a method returning a Slice, with no way to report an error. Before this PR, this type of error wasn't handled at all (an empty slice was returned), and the kBinarySearchWithFirstKey implementation was considered a prototype.
      
      Now that we (LogDevice) have experimented with kBinarySearchWithFirstKey for a while and confirmed that it's really useful, this PR is adding the missing error handling.
      
      It's a pretty inconvenient situation implementation-wise. The error needs to be reported from InternalIterator when trying to access value. But there are ~700 call sites of `InternalIterator::value()`, most of which either can't hit the error condition (because the iterator is reading from memtable or from index or something) or wouldn't benefit from the deferred loading of the value (e.g. compaction iterator that reads all values anyway). Adding error handling to all these call sites would needlessly bloat the code. So instead I made the deferred value loading optional: only the call sites that may use deferred loading have to call the new method `PrepareValue()` before calling `value()`. The feature is enabled with a new bool argument `allow_unprepared_value` to a bunch of methods that create iterators (it wouldn't make sense to put it in ReadOptions because it's completely internal to iterators, with virtually no user-visible effect). Lmk if you have better ideas.
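
The calling convention described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyIterator`, `Block`, `io_error()` and `block_reads()` are hypothetical names invented for illustration, not RocksDB classes; only the names `PrepareValue()`, `value()` and `allow_unprepared_value` come from the description. It assumes that a failed `PrepareValue()` invalidates the iterator and that, when deferral is disabled, the value is loaded eagerly on every move.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for one data block: the index knows first_key, while the
// value lives in the data block and may fail to load.
struct Block {
  std::string first_key;
  std::string value;
  bool readable;  // false simulates an IO error when reading the data block
};

// Hypothetical iterator illustrating the "prepare before value()" contract.
class ToyIterator {
 public:
  ToyIterator(std::vector<Block> blocks, bool allow_unprepared_value)
      : blocks_(std::move(blocks)),
        allow_unprepared_value_(allow_unprepared_value) {}

  void SeekToFirst() { pos_ = 0; AfterMove(); }
  void Next() { ++pos_; AfterMove(); }

  bool Valid() const { return !io_error_ && pos_ < blocks_.size(); }

  // The key comes from the index, so it is available without a block read.
  const std::string& key() const { return blocks_[pos_].first_key; }

  // Loads the value if it has not been loaded yet. Returns false on failure,
  // after which Valid() is false and io_error() reports the reason.
  bool PrepareValue() {
    if (!Valid()) return false;
    if (prepared_) return true;
    if (!blocks_[pos_].readable) {
      io_error_ = true;
      return false;
    }
    ++block_reads_;
    prepared_ = true;
    return true;
  }

  // Only legal after a successful PrepareValue() (or with eager loading).
  const std::string& value() const {
    assert(prepared_);
    return blocks_[pos_].value;
  }

  bool io_error() const { return io_error_; }
  int block_reads() const { return block_reads_; }

 private:
  void AfterMove() {
    prepared_ = false;
    // Without deferral, behave like a classic iterator: load eagerly.
    if (!allow_unprepared_value_) PrepareValue();
  }

  std::vector<Block> blocks_;
  bool allow_unprepared_value_;
  std::size_t pos_ = 0;
  bool prepared_ = false;
  bool io_error_ = false;
  int block_reads_ = 0;
};
```

In this sketch a caller that passes `allow_unprepared_value = true` can inspect `key()` without triggering a block read, and must check the result of `PrepareValue()` before touching `value()`. A caller that passes `false` keeps the old contract and never needs to call `PrepareValue()`.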
      
      Note that the deferred value loading only happens for *internal* iterators. The user-visible iterator (DBIter) always prepares the value before returning from Seek/Next/etc. We could go further and add an API to defer that value loading too, but that's most likely not useful for LogDevice, so it doesn't seem worth the complexity for now.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6621
      
      Test Plan: make -j5 check. Will also deploy to some LogDevice test clusters and look at stats.
      
      Reviewed By: siying
      
      Differential Revision: D20786930
      
      Pulled By: al13n321
      
      fbshipit-source-id: 6da77d918bad3780522e918f17f4d5513d3e99ee
    • A
      Remove a printf from db_stress that's not useful info (#6705) · 610a09cc
      anand76 committed
      Summary:
      This was causing db_crashtest.py to wrongly assume an error by parsing the output. Hopefully this will stabilize the crash tests.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6705
      
      Test Plan: make blackbox_crash_test
      
      Reviewed By: ltamasi
      
      Differential Revision: D21043335
      
      Pulled By: anand1976
      
      fbshipit-source-id: 5cddd112b124d4e2ebd11724a17d4ef0f50c1cf8
    • S
      Two Improvements to tools/check_format_compatible.sh (#6702) · 165560fb
      sdong committed
      Summary:
      Improve it in two ways:
      1. tools/check_format_compatible.sh is not friendly to run outside the FB environment. Remove the hard-coded http proxy setting and move it to the Legocastle configuration instead.
      2. Always disable warnings as errors, so that older builds are more likely to pass.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6702
      
      Test Plan: Run the test and make sure at least it doesn't break.
      
      Reviewed By: riversand963
      
      Differential Revision: D21033329
      
      fbshipit-source-id: 88b4ec1ec49547b772790050a165466bdc4a62a0
  2. 15 Apr, 2020 2 commits
  3. 14 Apr, 2020 6 commits
  4. 12 Apr, 2020 1 commit
    • Y
      Fix release build (#6690) · eeb3cf3f
      Yanqin Jin committed
      Summary:
      Fix the release build failure caused by a variable that was defined but unused.
      
      Test plan (devserver)
      ```
      make release
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6690
      
      Reviewed By: cheng-chang
      
      Differential Revision: D20980571
      
      Pulled By: riversand963
      
      fbshipit-source-id: c3f3b13f81dce4bdb19876dc2e710d5902ff8a02
  5. 11 Apr, 2020 8 commits
  6. 10 Apr, 2020 4 commits
    • L
      Provide an allocator for new memory type to be used with RocksDB block cache (#6214) · 66a95f0f
      Luca Giacchino committed
      Summary:
      New memory technologies are being developed by various hardware vendors (Intel DCPMM is one such technology currently available). These new memory types require different libraries for allocation and management (such as PMDK and memkind). The high capacities available make it possible to provision large caches (up to several TBs in size), beyond what is achievable with DRAM.
      The new allocator provided in this PR uses the memkind library to allocate memory on different media.
      
      **Performance**
      
      We tested the new allocator using db_bench.
      - For each test, we vary the size of the block cache (relative to the size of the uncompressed data in the database).
      - The database is filled sequentially. Throughput is then measured with a readrandom benchmark.
      - We use a uniform distribution as a worst-case scenario.
      
      The plot shows throughput (ops/s) relative to a configuration with no block cache and default allocator.
      For all tests, p99 latency is below 500 us.
      
      ![image](https://user-images.githubusercontent.com/26400080/71108594-42479100-2178-11ea-8231-8a775bbc92db.png)
      
      **Changes**
      
      - Add MemkindKmemAllocator
      - Add --use_cache_memkind_kmem_allocator db_bench option (to create an LRU block cache with the new allocator)
      - Add detection of memkind library with KMEM DAX support
      - Add test for MemkindKmemAllocator
      
      **Minimum Requirements**
      
      - kernel 5.3.12
      - ndctl v67 - https://github.com/pmem/ndctl
      - memkind v1.10.0 - https://github.com/memkind/memkind
      
      **Memory Configuration**
      
      The allocator uses the MEMKIND_DAX_KMEM memory kind. Follow the instructions on [memkind’s GitHub page](https://github.com/memkind/memkind) to set up NVDIMM memory accordingly.
      
      Note on memory allocation with NVDIMM memory exposed as system memory:
      - The MemkindKmemAllocator will only allocate from NVDIMM memory (using memkind_malloc with MEMKIND_DAX_KMEM kind).
      - The default allocator is not restricted to RAM. Based on NUMA node latency, the kernel should allocate from local RAM preferentially, but that is the kernel’s decision. numactl --preferred/--membind can be used to allocate preferentially/exclusively from the local RAM node.
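
To make the allocator idea concrete, here is a minimal sketch of a pluggable allocator and a buffer that returns its memory to the allocator it came from. All names in it (`ToyAllocator`, `CountingMallocAllocator`, `BlockBuffer`, `AllocateBlock`) are hypothetical and invented for illustration; they are not RocksDB's interface. The stand-in uses `malloc`/`free` so it runs anywhere. The assumption is that a memkind-backed variant would call `memkind_malloc(MEMKIND_DAX_KMEM, size)` and `memkind_free(MEMKIND_DAX_KMEM, p)` at the two marked places.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical allocator interface for cache blocks.
class ToyAllocator {
 public:
  virtual ~ToyAllocator() = default;
  virtual const char* Name() const = 0;
  virtual void* Allocate(std::size_t size) = 0;
  virtual void Deallocate(void* p) = 0;
};

// Stand-in implementation backed by malloc/free that counts live allocations.
class CountingMallocAllocator : public ToyAllocator {
 public:
  const char* Name() const override { return "CountingMallocAllocator"; }

  void* Allocate(std::size_t size) override {
    // Assumed memkind equivalent: memkind_malloc(MEMKIND_DAX_KMEM, size)
    void* p = std::malloc(size);
    if (p != nullptr) ++live_;
    return p;
  }

  void Deallocate(void* p) override {
    if (p == nullptr) return;
    // Assumed memkind equivalent: memkind_free(MEMKIND_DAX_KMEM, p)
    std::free(p);
    --live_;
  }

  int live_allocations() const { return live_; }

 private:
  int live_ = 0;
};

// Deleter that hands the buffer back to the allocator that produced it.
struct BlockDeleter {
  ToyAllocator* allocator;
  void operator()(char* p) const { allocator->Deallocate(p); }
};

using BlockBuffer = std::unique_ptr<char[], BlockDeleter>;

// Allocates a block of `size` bytes from `allocator`. The allocator must
// outlive the returned buffer.
inline BlockBuffer AllocateBlock(ToyAllocator* allocator, std::size_t size) {
  return BlockBuffer(static_cast<char*>(allocator->Allocate(size)),
                     BlockDeleter{allocator});
}
```

Keeping the allocator pointer in the deleter means every cached block is freed by the same allocator that created it, which matters when two memory kinds coexist in one process.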
      
      **Usage**
      
      When creating an LRU cache, pass a MemkindKmemAllocator object as argument.
      For example (replace capacity with the desired value in bytes):
      
      ```
      #include "rocksdb/cache.h"
      #include "memory/memkind_kmem_allocator.h"
      
      NewLRUCache(
          capacity /*size_t*/,
          6 /*cache_numshardbits*/,
          false /*strict_capacity_limit*/,
          0.0 /*high_pri_pool_ratio (a double, not a bool)*/,
          std::make_shared<MemkindKmemAllocator>());
      ```
      
      Refer to [RocksDB’s block cache documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache) to assign the LRU cache as block cache for a database.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6214
      
      Reviewed By: cheng-chang
      
      Differential Revision: D19292435
      
      fbshipit-source-id: 7202f47b769e7722b539c86c2ffd669f64d7b4e1
    • P
      Temporarily disable ppc64le unit tests in PRs (#6682) · 9d6974d3
      Peter Dillinger committed
      Summary:
      Until Travis gets its act together (https://github.com/facebook/rocksdb/issues/6653)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6682
      
      Test Plan: CI
      
      Reviewed By: riversand963
      
      Differential Revision: D20948865
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 215de523c91a83d2a159f466b853e700c925ba4f
    • S
      Fix memory corruption caused by new test in options_settable_test (#6676) · e860f884
      sdong committed
      Summary:
      https://github.com/facebook/rocksdb/pull/6668 added some new test code, but it carries a risk of memory corruption. Fix it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6676
      
      Test Plan: Run the test under ASAN and see it passes.
      
      Reviewed By: ajkr
      
      Differential Revision: D20937108
      
      fbshipit-source-id: 22cc96bb02030df0a37a02e67a2cc37ca31ba22d
    • C
      Add two more optimization improvements to HISTORY (#6679) · 6e6f8079
      Cheng Chang committed
      Summary:
      Although these optimizations are not user-facing, it still seems valuable to call them out in HISTORY.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6679
      
      Test Plan: no need
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D20945916
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: f3e790c07f3bcc4a8a74246c4fa232800ddd4438
  7. 09 Apr, 2020 6 commits
  8. 08 Apr, 2020 6 commits
  9. 07 Apr, 2020 2 commits
  10. 05 Apr, 2020 1 commit
    • P
      Add some timestamps in CI build+test output (#6643) · a67fb4c9
      Peter Dillinger committed
      Summary:
      When Travis times out, it's hard to determine whether
      the last executing thing took an excessively long time or the
      sum of all the work just exceeded the time limit. This
      change inserts some timestamps in the output that should
      make this easier to determine.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6643
      
      Test Plan: CI (Travis mostly)
      
      Reviewed By: anand1976
      
      Differential Revision: D20843901
      
      Pulled By: pdillinger
      
      fbshipit-source-id: e7aae5434b0c609931feddf238ce4355964488b7
  11. 04 Apr, 2020 1 commit