  1. 02 Mar, 2022 1 commit
  2. 18 Dec, 2021 1 commit
  3. 30 Nov, 2021 1 commit
  4. 08 Sep, 2021 1 commit
  5. 17 Aug, 2021 1 commit
    • Add a stat to count secondary cache hits (#8666) · add68bd2
      anand76 committed
      Summary:
      Add a stat for secondary cache hits. The `Cache::Lookup` API had an unused `stats` parameter. This PR uses it to pass a pointer to the `Statistics` object that `LRUCache` uses to record the stat.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8666
      
      Test Plan: Update a unit test in lru_cache_test
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D30353816
      
      Pulled By: anand1976
      
      fbshipit-source-id: 2046f78b460428877a26ffdd2bb914ae47dfbe77
  6. 20 May, 2021 1 commit
    • Use deleters to label cache entries and collect stats (#8297) · 311a544c
      Peter Dillinger committed
      Summary:
      This change gathers and publishes statistics about the
      kinds of items in block cache. This is especially important for
      profiling relative usage of cache by index vs. filter vs. data blocks.
      It works by iterating over the cache during periodic stats dump
      (InternalStats, stats_dump_period_sec) or on demand when
      DB::Get(Map)Property(kBlockCacheEntryStats) is called, except that for
      efficiency and sharing among column families, saved data from
      the last scan is used when the data is not considered too old.
      
      The new information can be seen in info LOG, for example:
      
          Block cache LRUCache@0x7fca62229330 capacity: 95.37 MB collections: 8 last_copies: 0 last_secs: 0.00178 secs_since: 0
          Block cache entry stats(count,size,portion): DataBlock(7092,28.24 MB,29.6136%) FilterBlock(215,867.90 KB,0.888728%) FilterMetaBlock(2,5.31 KB,0.00544%) IndexBlock(217,180.11 KB,0.184432%) WriteBuffer(1,256.00 KB,0.262144%) Misc(1,0.00 KB,0%)
      
      And also through DB::GetProperty and GetMapProperty (here using
      ldb just for demonstration):
      
          $ ./ldb --db=/dev/shm/dbbench/ get_property rocksdb.block-cache-entry-stats
          rocksdb.block-cache-entry-stats.bytes.data-block: 0
          rocksdb.block-cache-entry-stats.bytes.deprecated-filter-block: 0
          rocksdb.block-cache-entry-stats.bytes.filter-block: 0
          rocksdb.block-cache-entry-stats.bytes.filter-meta-block: 0
          rocksdb.block-cache-entry-stats.bytes.index-block: 178992
          rocksdb.block-cache-entry-stats.bytes.misc: 0
          rocksdb.block-cache-entry-stats.bytes.other-block: 0
          rocksdb.block-cache-entry-stats.bytes.write-buffer: 0
          rocksdb.block-cache-entry-stats.capacity: 8388608
          rocksdb.block-cache-entry-stats.count.data-block: 0
          rocksdb.block-cache-entry-stats.count.deprecated-filter-block: 0
          rocksdb.block-cache-entry-stats.count.filter-block: 0
          rocksdb.block-cache-entry-stats.count.filter-meta-block: 0
          rocksdb.block-cache-entry-stats.count.index-block: 215
          rocksdb.block-cache-entry-stats.count.misc: 1
          rocksdb.block-cache-entry-stats.count.other-block: 0
          rocksdb.block-cache-entry-stats.count.write-buffer: 0
          rocksdb.block-cache-entry-stats.id: LRUCache@0x7f3636661290
          rocksdb.block-cache-entry-stats.percent.data-block: 0.000000
          rocksdb.block-cache-entry-stats.percent.deprecated-filter-block: 0.000000
          rocksdb.block-cache-entry-stats.percent.filter-block: 0.000000
          rocksdb.block-cache-entry-stats.percent.filter-meta-block: 0.000000
          rocksdb.block-cache-entry-stats.percent.index-block: 2.133751
          rocksdb.block-cache-entry-stats.percent.misc: 0.000000
          rocksdb.block-cache-entry-stats.percent.other-block: 0.000000
          rocksdb.block-cache-entry-stats.percent.write-buffer: 0.000000
          rocksdb.block-cache-entry-stats.secs_for_last_collection: 0.000052
          rocksdb.block-cache-entry-stats.secs_since_last_collection: 0
      
      Solution detail - We need some way to flag what kind of blocks each
      entry belongs to, preferably without changing the Cache API.
      One of the complications is that Cache is a general interface that could
      have other users that don't adhere to whichever convention we decide
      on for keys and values. Or we would pay for an extra field in the Handle
      that would only be used for this purpose.
      
      This change uses a back-door approach, the deleter, to indicate the
      "role" of a Cache entry (in addition to the value type, implicitly).
      This has the added benefit of ensuring proper code origin whenever we
      recognize a particular role for a cache entry; if the entry came from
      some other part of the code, it will use an unrecognized deleter, which
      we simply attribute to the "Misc" role.
      
      An internal API makes for simple instantiation and automatic
      registration of Cache deleters for a given value type and "role".
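
      As a rough illustration of the deleter-as-label idea described above, the
      sketch below instantiates one deleter function per (value type, role) pair,
      registers its address, and attributes unrecognized deleters to Misc. All
      names here (`Role`, `GetDeleter`, `RoleOf`, `Registry`) are hypothetical
      and are not the internal API added by this PR.

      ```cpp
      #include <unordered_map>

      // Hypothetical roles; the real set of roles is defined by RocksDB.
      enum class Role { kDataBlock, kFilterBlock, kIndexBlock, kMisc };

      using Deleter = void (*)(void* value);

      // Maps a deleter's address to the role it was registered with.
      inline std::unordered_map<Deleter, Role>& Registry() {
        static std::unordered_map<Deleter, Role> registry;
        return registry;
      }

      // One distinct function exists per (T, R) pair, so its address
      // identifies both the value type and the role.
      template <typename T, Role R>
      void DeleteAs(void* value) {
        delete static_cast<T*>(value);
      }

      // Returns the deleter for (T, R), registering it on first use.
      template <typename T, Role R>
      Deleter GetDeleter() {
        static const bool registered = [] {
          Deleter d = &DeleteAs<T, R>;
          Registry()[d] = R;
          return true;
        }();
        (void)registered;
        return &DeleteAs<T, R>;
      }

      // Deleters that were never registered are attributed to kMisc.
      inline Role RoleOf(Deleter d) {
        auto it = Registry().find(d);
        return it == Registry().end() ? Role::kMisc : it->second;
      }
      ```

      A stats scan can then call `RoleOf(deleter)` for each entry it visits and
      accumulate count and charge per role, without any extra field in the handle.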
      
      Another internal API, CacheEntryStatsCollector, solves the problem of
      caching the results of a scan and sharing them, to ensure scans are
      neither excessive nor redundant so as not to harm Cache performance.
      
      Because code is added to BlocklikeTraits, it is pulled out of
      block_based_table_reader.cc into its own file.
      
      This is a reformulation of https://github.com/facebook/rocksdb/issues/8276, without the type checking option
      (could still be added), and with actual stat gathering.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8297
      
      Test Plan: manual testing with db_bench, and a couple of basic unit tests
      
      Reviewed By: ltamasi
      
      Differential Revision: D28488721
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 472f524a9691b5afb107934be2d41d84f2b129fb
  7. 14 May, 2021 1 commit
    • Initial support for secondary cache in LRUCache (#8271) · feb06e83
      anand76 committed
      Summary:
      Defined the abstract interface for a secondary cache in include/rocksdb/secondary_cache.h, and updated LRUCacheOptions to take a std::shared_ptr<SecondaryCache>. An item is initially inserted into the LRU (primary) cache. When it ages out and is evicted from memory, it is inserted into the secondary cache. On an LRU cache miss followed by a successful lookup in the secondary cache, the item is promoted to the LRU cache. Only synchronous lookup is supported currently. The secondary cache would be used to implement a persistent (flash) cache or a compressed cache.
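
      The demote-on-eviction and promote-on-hit flow can be sketched as a toy
      two-tier cache. This is a minimal sketch, not the SecondaryCache interface:
      the class `TwoTierCache`, its entry-count capacity, and the choice to leave
      a promoted item in the secondary tier are all assumptions made for
      illustration. It also keeps a counter of secondary hits.

      ```cpp
      #include <cstddef>
      #include <list>
      #include <string>
      #include <unordered_map>
      #include <utility>

      class TwoTierCache {
       public:
        explicit TwoTierCache(size_t primary_capacity)
            : capacity_(primary_capacity) {}

        void Insert(const std::string& key, const std::string& value) {
          auto it = index_.find(key);
          if (it != index_.end()) {  // replace an existing primary entry
            lru_.erase(it->second);
            index_.erase(it);
          }
          lru_.emplace_front(key, value);
          index_[key] = lru_.begin();
          while (lru_.size() > capacity_) {
            // Demote the least recently used entry to the secondary tier.
            auto& victim = lru_.back();
            index_.erase(victim.first);
            secondary_[victim.first] = std::move(victim.second);
            lru_.pop_back();
          }
        }

        // Returns true and fills *value on a hit in either tier.
        bool Lookup(const std::string& key, std::string* value) {
          auto it = index_.find(key);
          if (it != index_.end()) {
            lru_.splice(lru_.begin(), lru_, it->second);  // mark most recent
            *value = it->second->second;
            return true;
          }
          auto sec = secondary_.find(key);
          if (sec == secondary_.end()) return false;
          ++secondary_hits_;
          *value = sec->second;  // copy before Insert() may touch secondary_
          Insert(key, *value);   // promote to the primary tier
          return true;
        }

        size_t secondary_hits() const { return secondary_hits_; }

       private:
        using Item = std::pair<std::string, std::string>;
        size_t capacity_;
        size_t secondary_hits_ = 0;
        std::list<Item> lru_;  // front = most recently used
        std::unordered_map<std::string, std::list<Item>::iterator> index_;
        std::unordered_map<std::string, std::string> secondary_;
      };
      ```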
      
      Tests:
      Results from cache_bench and db_bench don't show any regression due to these changes.
      
      cache_bench results before and after this change -
      Command
      ```./cache_bench -ops_per_thread=10000000 -threads=1```
      Before
      ```Complete in 40.688 s; QPS = 245774```
      ```Complete in 40.486 s; QPS = 246996```
      ```Complete in 42.019 s; QPS = 237989```
      After
      ```Complete in 40.672 s; QPS = 245869```
      ```Complete in 44.622 s; QPS = 224107```
      ```Complete in 42.445 s; QPS = 235599```
      
      db_bench results before this change, and with this change + https://github.com/facebook/rocksdb/issues/8213 and https://github.com/facebook/rocksdb/issues/8191 -
      Commands
      ```./db_bench  --benchmarks="fillseq,compact" -num=30000000 -key_size=32 -value_size=256 -use_direct_io_for_flush_and_compaction=true -db=/home/anand76/nvm_cache/db -partition_index_and_filters=true```
      
      ```./db_bench -db=/home/anand76/nvm_cache/db -use_existing_db=true -benchmarks=readrandom -num=30000000 -key_size=32 -value_size=256 -use_direct_reads=true -cache_size=1073741824 -cache_numshardbits=6 -cache_index_and_filter_blocks=true -read_random_exp_range=17 -statistics -partition_index_and_filters=true -threads=16 -duration=300```
      Before
      ```
      DB path: [/home/anand76/nvm_cache/db]
      readrandom   :      80.702 micros/op 198104 ops/sec;   54.4 MB/s (3708999 of 3708999 found)
      ```
      ```
      DB path: [/home/anand76/nvm_cache/db]
      readrandom   :      87.124 micros/op 183625 ops/sec;   50.4 MB/s (3439999 of 3439999 found)
      ```
      After
      ```
      DB path: [/home/anand76/nvm_cache/db]
      readrandom   :      77.653 micros/op 206025 ops/sec;   56.6 MB/s (3866999 of 3866999 found)
      ```
      ```
      DB path: [/home/anand76/nvm_cache/db]
      readrandom   :      84.962 micros/op 188299 ops/sec;   51.7 MB/s (3535999 of 3535999 found)
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8271
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D28357511
      
      Pulled By: anand1976
      
      fbshipit-source-id: d1cfa236f00e649a18c53328be10a8062a4b6da2
  8. 13 May, 2021 1 commit
    • Fix a minor clang release build failure (#8290) · a6e425dc
      Jay Zhuang committed
      Summary:
      Error message:
      ```
      cache/clock_cache.cc:434:14: error: implicit conversion loses integer precision: 'size_t' (aka 'unsigned long') to 'uint32_t' (aka 'unsigned int') [-Werror,-Wshorten-64-to-32]
          *state = end_idx;
                 ~ ^~~~~~~
      ```
      Make CircleCI cover this case by installing TBB.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8290
      
      Test Plan: `USE_CLANG=1 make -j1 release`
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D28374672
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: e8c3ee46f2a008e8a599413292e5a4b5151365df
  9. 12 May, 2021 1 commit
    • New Cache API for gathering statistics (#8225) · 78a309bf
      Peter Dillinger committed
      Summary:
      Adds a new Cache::ApplyToAllEntries API that we expect to use
      (in follow-up PRs) for efficiently gathering block cache statistics.
      Notable features vs. old ApplyToAllCacheEntries:
      
      * Includes key and deleter (in addition to value and charge). We could
      have passed in a Handle but then more virtual function calls would be
      needed to get the "fields" of each entry. We expect to use the 'deleter'
      to identify the origin of entries, perhaps even more.
      * Heavily tuned to minimize latency impact on operating cache. It
      does this by iterating over small sections of each cache shard while
      cycling through the shards.
      * Supports tuning roughly how many entries to operate on for each
      lock acquire and release, to control the impact on the latency of other
      operations without excessive lock acquire & release. The right balance
      can depend on the cost of the callback. Good default seems to be
      around 256.
      * There should be no need to disable thread safety. (I would expect
      uncontended locks to be sufficiently fast.)
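
      The chunked iteration described in the list above might look roughly like
      the following. This is a simplified sketch under stated assumptions: each
      shard is a plain vector guarded by a mutex, and the names `Shard`, `Entry`,
      and `ApplyToAllEntriesSketch` are hypothetical. It does not handle entries
      moving during a hash table resize, which the real change addresses by
      re-arranging hash bits.

      ```cpp
      #include <algorithm>
      #include <cstddef>
      #include <functional>
      #include <mutex>
      #include <string>
      #include <vector>

      struct Entry {
        std::string key;
        size_t charge;
      };

      struct Shard {
        std::mutex mu;
        std::vector<Entry> entries;
      };

      // Visits every entry while holding each shard lock for at most
      // `entries_per_lock` callbacks, cycling through shards between chunks.
      inline void ApplyToAllEntriesSketch(
          std::vector<Shard>& shards,
          const std::function<void(const Entry&)>& callback,
          size_t entries_per_lock = 256) {
        entries_per_lock = std::max<size_t>(entries_per_lock, 1);
        std::vector<size_t> next(shards.size(), 0);  // resume point per shard
        bool more = true;
        while (more) {
          more = false;
          for (size_t s = 0; s < shards.size(); ++s) {
            std::lock_guard<std::mutex> lock(shards[s].mu);
            const std::vector<Entry>& entries = shards[s].entries;
            const size_t end =
                std::min(entries.size(), next[s] + entries_per_lock);
            for (size_t i = next[s]; i < end; ++i) {
              callback(entries[i]);
            }
            next[s] = end;
            if (end < entries.size()) more = true;
          }  // lock released here, so other operations can interleave
        }
      }
      ```

      A smaller `entries_per_lock` shortens each critical section at the cost of
      more lock acquisitions; a cheap callback tolerates a larger value.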
      
      I have enhanced cache_bench to validate this approach:
      
      * Reports a histogram of ns per operation, so we can look at the
      distribution of times, not just throughput (average).
      * Can add a thread for simulated "gather stats" which calls
      ApplyToAllEntries at a specified interval. We also generate a histogram
      of time to run ApplyToAllEntries.
      
      To make the iteration over some entries of each shard work as cleanly as
      possible, even with resize between next set of entries, I have
      re-arranged which hash bits are used for sharding and which for indexing
      within a shard.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8225
      
      Test Plan:
      A couple of unit tests are added, but primary validation is manual, as
      the primary risk is to performance.
      
      The primary validation is using cache_bench to ensure that neither
      the minor hashing changes nor the simulated stats gathering
      significantly impact QPS or latency distribution. Note that adding op
      latency histogram seriously impacts the benchmark QPS, so for a
      fair baseline, we need the cache_bench changes (except remove simulated
      stat gathering to make it compile). In short, we don't see any
      reproducible difference in ops/sec or op latency unless we are gathering
      stats nearly continuously. Test uses 10GB block cache with
      8KB values to be somewhat realistic in the number of items to iterate
      over.
      
      Baseline typical output:
      
      ```
      Complete in 92.017 s; Rough parallel ops/sec = 869401
      Thread ops/sec = 54662
      
      Operation latency (ns):
      Count: 80000000 Average: 11223.9494  StdDev: 29.61
      Min: 0  Median: 7759.3973  Max: 9620500
      Percentiles: P50: 7759.40 P75: 14190.73 P99: 46922.75 P99.9: 77509.84 P99.99: 217030.58
      ------------------------------------------------------
      [       0,       1 ]       68   0.000%   0.000%
      (    2900,    4400 ]       89   0.000%   0.000%
      (    4400,    6600 ] 33630240  42.038%  42.038% ########
      (    6600,    9900 ] 18129842  22.662%  64.700% #####
      (    9900,   14000 ]  7877533   9.847%  74.547% ##
      (   14000,   22000 ] 15193238  18.992%  93.539% ####
      (   22000,   33000 ]  3037061   3.796%  97.335% #
      (   33000,   50000 ]  1626316   2.033%  99.368%
      (   50000,   75000 ]   421532   0.527%  99.895%
      (   75000,  110000 ]    56910   0.071%  99.966%
      (  110000,  170000 ]    16134   0.020%  99.986%
      (  170000,  250000 ]     5166   0.006%  99.993%
      (  250000,  380000 ]     3017   0.004%  99.996%
      (  380000,  570000 ]     1337   0.002%  99.998%
      (  570000,  860000 ]      805   0.001%  99.999%
      (  860000, 1200000 ]      319   0.000% 100.000%
      ( 1200000, 1900000 ]      231   0.000% 100.000%
      ( 1900000, 2900000 ]      100   0.000% 100.000%
      ( 2900000, 4300000 ]       39   0.000% 100.000%
      ( 4300000, 6500000 ]       16   0.000% 100.000%
      ( 6500000, 9800000 ]        7   0.000% 100.000%
      ```
      
      New, gather_stats=false. Median thread ops/sec of 5 runs:
      
      ```
      Complete in 92.030 s; Rough parallel ops/sec = 869285
      Thread ops/sec = 54458
      
      Operation latency (ns):
      Count: 80000000 Average: 11298.1027  StdDev: 42.18
      Min: 0  Median: 7722.0822  Max: 6398720
      Percentiles: P50: 7722.08 P75: 14294.68 P99: 47522.95 P99.9: 85292.16 P99.99: 228077.78
      ------------------------------------------------------
      [       0,       1 ]      109   0.000%   0.000%
      (    2900,    4400 ]      793   0.001%   0.001%
      (    4400,    6600 ] 34054563  42.568%  42.569% #########
      (    6600,    9900 ] 17482646  21.853%  64.423% ####
      (    9900,   14000 ]  7908180   9.885%  74.308% ##
      (   14000,   22000 ] 15032072  18.790%  93.098% ####
      (   22000,   33000 ]  3237834   4.047%  97.145% #
      (   33000,   50000 ]  1736882   2.171%  99.316%
      (   50000,   75000 ]   446851   0.559%  99.875%
      (   75000,  110000 ]    68251   0.085%  99.960%
      (  110000,  170000 ]    18592   0.023%  99.983%
      (  170000,  250000 ]     7200   0.009%  99.992%
      (  250000,  380000 ]     3334   0.004%  99.997%
      (  380000,  570000 ]     1393   0.002%  99.998%
      (  570000,  860000 ]      700   0.001%  99.999%
      (  860000, 1200000 ]      293   0.000% 100.000%
      ( 1200000, 1900000 ]      196   0.000% 100.000%
      ( 1900000, 2900000 ]       69   0.000% 100.000%
      ( 2900000, 4300000 ]       32   0.000% 100.000%
      ( 4300000, 6500000 ]       10   0.000% 100.000%
      ```
      
      New, gather_stats=true, 1 second delay between scans. Scans take about
      1 second here so it's spending about 50% time scanning. Still the effect on
      ops/sec and latency seems to be in the noise. Median thread ops/sec of 5 runs:
      
      ```
      Complete in 91.890 s; Rough parallel ops/sec = 870608
      Thread ops/sec = 54551
      
      Operation latency (ns):
      Count: 80000000 Average: 11311.2629  StdDev: 45.28
      Min: 0  Median: 7686.5458  Max: 10018340
      Percentiles: P50: 7686.55 P75: 14481.95 P99: 47232.60 P99.9: 79230.18 P99.99: 232998.86
      ------------------------------------------------------
      [       0,       1 ]       71   0.000%   0.000%
      (    2900,    4400 ]      291   0.000%   0.000%
      (    4400,    6600 ] 34492060  43.115%  43.116% #########
      (    6600,    9900 ] 16727328  20.909%  64.025% ####
      (    9900,   14000 ]  7845828   9.807%  73.832% ##
      (   14000,   22000 ] 15510654  19.388%  93.220% ####
      (   22000,   33000 ]  3216533   4.021%  97.241% #
      (   33000,   50000 ]  1680859   2.101%  99.342%
      (   50000,   75000 ]   439059   0.549%  99.891%
      (   75000,  110000 ]    60540   0.076%  99.967%
      (  110000,  170000 ]    14649   0.018%  99.985%
      (  170000,  250000 ]     5242   0.007%  99.991%
      (  250000,  380000 ]     3260   0.004%  99.995%
      (  380000,  570000 ]     1599   0.002%  99.997%
      (  570000,  860000 ]     1043   0.001%  99.999%
      (  860000, 1200000 ]      471   0.001%  99.999%
      ( 1200000, 1900000 ]      275   0.000% 100.000%
      ( 1900000, 2900000 ]      143   0.000% 100.000%
      ( 2900000, 4300000 ]       60   0.000% 100.000%
      ( 4300000, 6500000 ]       27   0.000% 100.000%
      ( 6500000, 9800000 ]        7   0.000% 100.000%
      ( 9800000, 14000000 ]        1   0.000% 100.000%
      
      Gather stats latency (us):
      Count: 46 Average: 980387.5870  StdDev: 60911.18
      Min: 879155  Median: 1033777.7778  Max: 1261431
      Percentiles: P50: 1033777.78 P75: 1120666.67 P99: 1261431.00 P99.9: 1261431.00 P99.99: 1261431.00
      ------------------------------------------------------
      (  860000, 1200000 ]       45  97.826%  97.826% ####################
      ( 1200000, 1900000 ]        1   2.174% 100.000%
      
      Most recent cache entry stats:
      Number of entries: 1295133
      Total charge: 9.88 GB
      Average key size: 23.4982
      Average charge: 8.00 KB
      Unique deleters: 3
      ```
      
      Reviewed By: mrambacher
      
      Differential Revision: D28295742
      
      Pulled By: pdillinger
      
      fbshipit-source-id: bbc4a552f91ba0fe10e5cc025c42cef5a81f2b95
  10. 05 May, 2021 1 commit
    • Fix use-after-free threading bug in ClockCache (#8261) · 3b981eaa
      Peter Dillinger committed
      Summary:
      In testing for https://github.com/facebook/rocksdb/issues/8225 I found cache_bench would crash with
      -use_clock_cache, as well as db_bench -use_clock_cache, but not
      single-threaded. Smaller cache size hits failure much faster. ASAN
      reported the failure as calling malloc_usable_size on the `key` pointer
      of a ClockCache handle after it was reportedly freed. On detailed
      inspection I found this bad sequence of operations for a cache entry:
      
      state=InCache=1,refs=1
      [thread 1] Start ClockCacheShard::Unref (from Release, no mutex)
      [thread 1] Decrement ref count
      state=InCache=1,refs=0
      [thread 1] Suspend before CalcTotalCharge (no mutex)
      
      [thread 2] Start UnsetInCache (from Insert, mutex held)
      [thread 2] clear InCache bit
      state=InCache=0,refs=0
      [thread 2] Calls RecycleHandle (based on pre-updated state)
      [thread 2] Returns to Insert which calls Cleanup which deletes `key`
      
      [thread 1] Resume ClockCacheShard::Unref
      [thread 1] Read `key` in CalcTotalCharge
      
      To fix this, I've added a field to the handle to store the metadata
      charge so that we can efficiently remember everything we need from
      the handle in Unref. We must not read from the handle again if we
      decrement the count to zero with InCache=1, which means we don't own
      the entry and someone else could eject/overwrite it immediately.
      
      Note before this change, on amd64 sizeof(Handle) == 56 even though there
      are only 48 bytes of data. Grouping together the uint32_t fields would
      cut it down to 48, but I've added another uint32_t, which takes it
      back up to 56. Not a big deal.
      
      Also fixed DisownData to cooperate with ASAN as in LRUCache.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8261
      
      Test Plan:
      Manual + adding use_clock_cache to db_crashtest.py
      
      Base performance
      ./cache_bench -use_clock_cache
      Complete in 17.060 s; QPS = 2458513
      New performance
      ./cache_bench -use_clock_cache
      Complete in 17.052 s; QPS = 2459695
      
      Any difference is easily buried in small noise.
      
      Crash test shows still more bug(s) in ClockCache, so I'm expecting to
      disable ClockCache from production code in a follow-up PR (if we
      can't find and fix the bug(s))
      
      Reviewed By: mrambacher
      
      Differential Revision: D28207358
      
      Pulled By: pdillinger
      
      fbshipit-source-id: aa7a9322afc6f18f30e462c75dbbe4a1206eb294
  11. 12 May, 2020 1 commit
  12. 28 Apr, 2020 1 commit
    • Stats for redundant insertions into block cache (#6681) · 249eff0f
      Peter Dillinger committed
      Summary:
      Since read threads do not coordinate on loading data into block
      cache, two threads between Lookup and Insert can end up loading and
      inserting the same data. This is particularly concerning with
      cache_index_and_filter_blocks since those are hot and more likely to
      be race targets if ejected from (or not pre-populated in) the cache.
      
      Particularly with moves toward disaggregated / network storage, the cost
      of redundant retrieval might be high, and we should at least have some
      hard statistics from which we can estimate impact.
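
      One way to picture the new counters: an insert that finds the key already
      present is counted as redundant. The sketch below uses hypothetical names
      (`AddStats`, `CountingCache`) and assumes that a redundant insert is also
      counted as a regular add; it is not the block cache code itself.

      ```cpp
      #include <cstdint>
      #include <mutex>
      #include <string>
      #include <unordered_map>

      struct AddStats {
        uint64_t add = 0;            // every insertion
        uint64_t add_redundant = 0;  // insertions of an already-cached key
      };

      class CountingCache {
       public:
        bool Lookup(const std::string& key, std::string* value) const {
          std::lock_guard<std::mutex> lock(mu_);
          auto it = table_.find(key);
          if (it == table_.end()) return false;
          *value = it->second;
          return true;
        }

        // Two readers that both missed in Lookup may both call Insert for
        // the same key; the second call is the redundant one.
        void Insert(const std::string& key, const std::string& value) {
          std::lock_guard<std::mutex> lock(mu_);
          auto result = table_.emplace(key, value);
          ++stats_.add;
          if (!result.second) {
            ++stats_.add_redundant;
            result.first->second = value;
          }
        }

        AddStats stats() const {
          std::lock_guard<std::mutex> lock(mu_);
          return stats_;
        }

       private:
        mutable std::mutex mu_;
        std::unordered_map<std::string, std::string> table_;
        AddStats stats_;
      };
      ```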
      
      Example with full filter thrashing "cliff":
      
          $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10
          ...
          $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((130 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
          rocksdb.block.cache.add COUNT : 14181
          rocksdb.block.cache.add.failures COUNT : 0
          rocksdb.block.cache.add.redundant COUNT : 476
          rocksdb.block.cache.data.add COUNT : 12749
          rocksdb.block.cache.data.add.redundant COUNT : 18
          rocksdb.block.cache.filter.add COUNT : 1003
          rocksdb.block.cache.filter.add.redundant COUNT : 217
          rocksdb.block.cache.index.add COUNT : 429
          rocksdb.block.cache.index.add.redundant COUNT : 241
          $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((120 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
          rocksdb.block.cache.add COUNT : 1182223
          rocksdb.block.cache.add.failures COUNT : 0
          rocksdb.block.cache.add.redundant COUNT : 302728
          rocksdb.block.cache.data.add COUNT : 31425
          rocksdb.block.cache.data.add.redundant COUNT : 12
          rocksdb.block.cache.filter.add COUNT : 795455
          rocksdb.block.cache.filter.add.redundant COUNT : 130238
          rocksdb.block.cache.index.add COUNT : 355343
          rocksdb.block.cache.index.add.redundant COUNT : 172478
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6681
      
      Test Plan: Some manual testing (above) and unit test covering key metrics is included
      
      Reviewed By: ltamasi
      
      Differential Revision: D21134113
      
      Pulled By: pdillinger
      
      fbshipit-source-id: c11497b5f00f4ffdfe919823904e52d0a1a91d87
  13. 01 Apr, 2020 1 commit
  14. 27 Mar, 2020 1 commit
    • Use function objects as deleters in the block cache (#6545) · 6301dbe7
      Levi Tamasi committed
      Summary:
      As the first step of reintroducing eviction statistics for the block
      cache, the patch switches from using simple function pointers as deleters
      to function objects implementing an interface. This will enable using
      deleters that have state, like a smart pointer to the statistics object
      that is to be updated when an entry is removed from the cache. For now,
      the patch adds a deleter template class `SimpleDeleter`, which simply
      casts the `value` pointer to its original type and calls `delete` or
      `delete[]` on it as appropriate. Note: to prevent object lifecycle
      issues, deleters must outlive the cache entries referring to them;
      `SimpleDeleter` ensures this by using the ("leaky") Meyers singleton
      pattern.
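
      A minimal sketch of a stateless deleter object with a leaky Meyers
      singleton is shown here. The `Deleter` interface and its call signature
      are assumptions for illustration and may differ from the interface
      introduced by the patch.

      ```cpp
      #include <string>

      // Hypothetical deleter interface: invoked when an entry is removed.
      class Deleter {
       public:
        virtual ~Deleter() = default;
        virtual void operator()(const std::string& key, void* value) = 0;
      };

      // Casts `value` back to T* and deletes it.
      template <typename T>
      class SimpleDeleter : public Deleter {
       public:
        static SimpleDeleter* GetInstance() {
          // Leaky Meyers singleton: constructed on first use and never
          // destroyed, so it outlives every cache entry that refers to it.
          static SimpleDeleter* instance = new SimpleDeleter;
          return instance;
        }

        void operator()(const std::string& /*key*/, void* value) override {
          delete static_cast<T*>(value);
        }

       private:
        SimpleDeleter() = default;
      };

      // Array variant: uses delete[] instead of delete.
      template <typename T>
      class SimpleDeleter<T[]> : public Deleter {
       public:
        static SimpleDeleter* GetInstance() {
          static SimpleDeleter* instance = new SimpleDeleter;
          return instance;
        }

        void operator()(const std::string& /*key*/, void* value) override {
          delete[] static_cast<T*>(value);
        }

       private:
        SimpleDeleter() = default;
      };
      ```

      Because the interface is a class rather than a bare function pointer, a
      later deleter could carry state such as a pointer to a statistics object.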
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6545
      
      Test Plan: `make asan_check`
      
      Reviewed By: siying
      
      Differential Revision: D20475823
      
      Pulled By: ltamasi
      
      fbshipit-source-id: fe354c33dd96d9bafc094605462352305449a22a
  15. 21 Feb, 2020 1 commit
    • Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) · fdf882de
      sdong committed
      Summary:
      When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To give users a tool to solve the problem, the RocksDB namespace is changed to a flag that can be overridden at build time.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
      
      Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
      
      Differential Revision: D19977691
      
      fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
  16. 17 Sep, 2019 1 commit
    • Charge block cache for cache internal usage (#5797) · 638d2395
      Maysam Yabandeh committed
      Summary:
      For our default block cache, each additional entry has extra memory overhead. It includes the LRUHandle (72 bytes currently) and the cache key (two varint64, file id and offset). The usage is not negligible. For example, for block_size=4k, the overhead accounts for an extra 2% memory usage for the cache. The patch charges the cache for the extra usage, reducing untracked memory usage outside the block cache. The feature is enabled by default and can be disabled by passing kDontChargeCacheMetadata to the cache constructor.
      This PR builds upon https://github.com/facebook/rocksdb/issues/4258
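
      The 2% figure can be checked with quick arithmetic. The helper below is a
      hypothetical illustration; it assumes each varint64 occupies at most 10
      bytes, so the key is at most 20 bytes.

      ```cpp
      #include <cstddef>

      // Per-entry metadata (handle + key) as a percentage of the block size.
      constexpr double MetadataOverheadPercent(std::size_t handle_bytes,
                                               std::size_t key_bytes,
                                               std::size_t block_bytes) {
        return 100.0 * static_cast<double>(handle_bytes + key_bytes) /
               static_cast<double>(block_bytes);
      }
      ```

      With a 72-byte handle, a 20-byte key, and a 4096-byte block this gives
      92 / 4096, or roughly 2.2%, in line with the estimate above.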
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5797
      
      Test Plan:
      - Existing tests are updated to either disable the feature when the test has too much dependency on the old way of accounting the usage or increasing the cache capacity to account for the additional charge of metadata.
      - The Usage tests in cache_test.cc are augmented to test the cache usage under kFullChargeCacheMetadata.
      
      Differential Revision: D17396833
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 7684ccb9f8a40ca595e4f5efcdb03623afea0c6f
  17. 06 Apr, 2019 1 commit
  18. 15 Feb, 2019 1 commit
    • Apply modernize-use-override (2nd iteration) · ca89ac2b
      Michael Liu committed
      Summary:
      Use C++11’s override and remove virtual where applicable.
      Changes are automatically generated.
      
      Reviewed By: Orvid
      
      Differential Revision: D14090024
      
      fbshipit-source-id: 1e9432e87d2657e1ff0028e15370a85d1739ba2a
  19. 13 Apr, 2018 1 commit
  20. 06 Apr, 2018 1 commit
  21. 06 Mar, 2018 1 commit
  22. 23 Feb, 2018 2 commits
  23. 29 Jul, 2017 1 commit
    • Replace dynamic_cast<> · 21696ba5
      Siying Dong committed
      Summary:
      Replace dynamic_cast<> so that users can choose to build with RTTI off, saving several bytes per object and making slightly more memory available.
      Some nontrivial changes:
      1. Add Comparator::GetRootComparator() to get around the internal comparator hack
      2. Add the two experimental functions to DB
      3. Add TableFactory::GetOptionString() to avoid unnecessary casting to get the option string
      4. Since 3 is done, move the parsing option functions for table factory to table factory files too, to be symmetric.
      Closes https://github.com/facebook/rocksdb/pull/2645
      
      Differential Revision: D5502723
      
      Pulled By: siying
      
      fbshipit-source-id: fd13cec5601cf68a554d87bfcf056f2ffa5fbf7c
  24. 22 Jul, 2017 2 commits
  25. 16 Jul, 2017 1 commit
  26. 28 Apr, 2017 1 commit
  27. 25 Apr, 2017 1 commit
    • Add erase option to release cache · 4c9447d8
      Maysam Yabandeh committed
      Summary:
      This is useful when we put entries in the block cache for accounting
      purposes and do not expect them to be used after they are released. If the
      cache does not erase the item in such cases, not only is cache performance
      negatively affected, but the item's destructor not being called at the
      time of release might also violate assumptions about the lifetime of the
      object.
      
      The new change adds a force_erase option to the Release method and
      returns a boolean to indicate whether the item is successfully deleted.
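
      The intended behaviour of a release with forced erasure can be sketched as
      follows. This is a toy model with hypothetical names (`Handle`,
      `ErasableCache`), it assumes keys are inserted only once, and it does not
      reproduce the eviction rules of the real cache.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <memory>
      #include <string>
      #include <unordered_map>

      struct Handle {
        std::string key;
        int value;
        uint32_t refs;
      };

      class ErasableCache {
       public:
        // Inserts an entry and returns a handle holding one reference.
        Handle* Insert(const std::string& key, int value) {
          std::unique_ptr<Handle>& slot = table_[key];
          slot.reset(new Handle{key, value, 1});
          return slot.get();
        }

        // Drops one reference. With force_erase, the entry is removed as
        // soon as the last reference goes away, and true is returned.
        bool Release(Handle* handle, bool force_erase = false) {
          assert(handle->refs > 0);
          --handle->refs;
          if (handle->refs == 0 && force_erase) {
            const std::string key = handle->key;  // copy: erase frees handle
            table_.erase(key);
            return true;
          }
          return false;  // entry stays cached for later lookups
        }

        bool Contains(const std::string& key) const {
          return table_.count(key) != 0;
        }

       private:
        std::unordered_map<std::string, std::unique_ptr<Handle>> table_;
      };
      ```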
      Closes https://github.com/facebook/rocksdb/pull/2180
      
      Differential Revision: D4916032
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 94409a346069923cac9de8e57adc313b4ed46f28
  28. 06 Apr, 2017 1 commit
  29. 27 Jan, 2017 1 commit
  30. 11 Jan, 2017 1 commit
    • Allow incrementing refcount on cache handles · fe395fb6
      Andrew Kryczka committed
      Summary:
      Previously the only way to increment a handle's refcount was to invoke Lookup(), which (1) did hash table lookup to get cache handle, (2) incremented that handle's refcount. For a future DeleteRange optimization, I added a function, Ref(), for when the caller already has a cache handle and only needs to do (2).
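
      The difference between the two paths can be sketched with a toy cache.
      The names `Handle` and `RefCache` are hypothetical and the real handle has
      more state; the point is only that Ref() skips the hash table lookup.

      ```cpp
      #include <cstdint>
      #include <string>
      #include <unordered_map>

      struct Handle {
        int value = 0;
        uint32_t refs = 0;
      };

      class RefCache {
       public:
        void Insert(const std::string& key, int value) {
          table_[key].value = value;
        }

        // Step (1): hash table lookup, then step (2): refcount increment.
        Handle* Lookup(const std::string& key) {
          auto it = table_.find(key);
          if (it == table_.end()) return nullptr;
          ++it->second.refs;
          return &it->second;
        }

        // Only step (2): the caller already holds a valid handle.
        void Ref(Handle* handle) { ++handle->refs; }

       private:
        // Pointers to unordered_map elements stay valid across rehashing.
        std::unordered_map<std::string, Handle> table_;
      };
      ```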
      Closes https://github.com/facebook/rocksdb/pull/1761
      
      Differential Revision: D4397114
      
      Pulled By: ajkr
      
      fbshipit-source-id: 9addbe5
  31. 31 Aug, 2016 1 commit
    • Fix ClockCache memory leak · de47e2bd
      Yi Wu committed
      Summary:
      Fix ClockCache memory leak found by valgrind:
      # Add destructor to cleanup cached values.
      # Delete key with cache handle immediately after handle is recycled, and erase table entry immediately if duplicated cache entry is inserted.
      
      Test Plan:
          make DISABLE_JEMALLOC=1 valgrind_check
      
      Reviewers: IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62973
  32. 24 Aug, 2016 1 commit
  33. 20 Aug, 2016 2 commits
    • LRU cache mid-point insertion · 72f8cc70
      Yi Wu committed
      Summary:
      Add mid-point insertion functionality to LRU cache. A caller of `Cache::Insert()` can set an additional parameter to give a cache entry higher priority. The LRU cache will reserve at most `capacity * high_pri_pool_pct` bytes for high-pri cache entries. If `high_pri_pool_pct` is zero, the cache degenerates to a normal LRU cache.
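
      The policy can be pictured with two lists, one per priority. The sketch
      below is an assumption-laden toy (`MidpointLru` is a hypothetical name,
      keys are assumed unique, the ratio is assumed to lie in [0, 1], and
      lookups are omitted); the actual change may organize the list differently.

      ```cpp
      #include <cstddef>
      #include <iterator>
      #include <list>
      #include <string>
      #include <vector>

      class MidpointLru {
       public:
        MidpointLru(size_t capacity, double high_pri_pool_ratio)
            : capacity_(capacity),
              high_pool_capacity_(
                  static_cast<size_t>(capacity * high_pri_pool_ratio)) {}

        // Inserts an entry and returns the keys evicted to make room.
        std::vector<std::string> Insert(const std::string& key, size_t charge,
                                        bool high_pri) {
          if (high_pri) {
            high_.push_front({key, charge});
            high_usage_ += charge;
            // Overflow: the oldest high-pri entries spill into the low-pri
            // list at its head, which is the "mid-point" of the whole LRU.
            while (high_usage_ > high_pool_capacity_) {
              high_usage_ -= high_.back().charge;
              low_.splice(low_.begin(), high_, std::prev(high_.end()));
            }
          } else {
            low_.push_front({key, charge});
          }
          usage_ += charge;
          std::vector<std::string> evicted;
          // Evict from the cold end of the low-pri list only.
          while (usage_ > capacity_ && !low_.empty()) {
            usage_ -= low_.back().charge;
            evicted.push_back(low_.back().key);
            low_.pop_back();
          }
          return evicted;
        }

       private:
        struct Item {
          std::string key;
          size_t charge;
        };
        size_t capacity_;
        size_t high_pool_capacity_;
        size_t usage_ = 0;
        size_t high_usage_ = 0;
        std::list<Item> high_;  // front = most recent high-pri entry
        std::list<Item> low_;   // front = mid-point, back = next to evict
      };
      ```

      With a ratio of zero every high-pri entry spills immediately into the
      low-pri list, so the structure behaves like a single LRU list.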
      
      Context: If we are to put index and filter blocks into the RocksDB block cache, index/filter blocks can be swapped out too early. We want to add an option to RocksDB to reserve some capacity in the block cache just for index/filter blocks, to mitigate the issue.
      
      In later diffs I'll update the block-based table reader to use the interface to cache index/filter blocks at high priority, expose the option in `DBOptions`, and make it dynamically changeable.
      
      Test Plan: unit test.
      
      Reviewers: IslamAbdelRahman, sdong, lightmark
      
      Reviewed By: lightmark
      
      Subscribers: andrewkr, dhruba, march, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61977
    • Introduce ClockCache · 4cc37f59
      Yi Wu committed
      Summary:
      Clock-based cache implementation aiming for better concurrency than
      the default LRU cache. See inline comments for implementation details.
      
      Test Plan:
      Update cache_test to run on both LRUCache and ClockCache. Adding some
      new tests to catch some of the bugs that I fixed while implementing the
      cache.
      
      Reviewers: kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61647