    New Cache API for gathering statistics (#8225) · 78a309bf
    Committed by Peter Dillinger
    Summary:
    Adds a new Cache::ApplyToAllEntries API that we expect to use
    (in follow-up PRs) for efficiently gathering block cache statistics.
    Notable features vs. the old ApplyToAllCacheEntries (a usage sketch follows this list):
    
    * Includes key and deleter (in addition to value and charge). We could
    have passed in a Handle but then more virtual function calls would be
    needed to get the "fields" of each entry. We expect to use the 'deleter'
    to identify the origin of entries, perhaps even more.
    * Heavily tuned to minimize latency impact on operating cache. It
    does this by iterating over small sections of each cache shard while
    cycling through the shards.
    * Supports tuning roughly how many entries to operate on per lock
    acquire/release, to control the impact on the latency of other
    operations without excessive locking overhead. The right balance
    can depend on the cost of the callback. A good default seems to be
    around 256.
    * There should be no need to disable thread safety. (I would expect
    uncontended locks to be sufficiently fast.)
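    
    To make the intended usage concrete, here is a minimal sketch of calling
    the new API (see include/rocksdb/cache.h in the PR for the authoritative
    signature; the nested ApplyToAllEntriesOptions and DeleterFn names
    reflect my reading of it):
    
    ```
    #include <cstddef>
    #include <map>
    
    #include "rocksdb/cache.h"
    
    // Count cache entries by their deleter, which identifies each entry's
    // origin (e.g. which block type created it).
    void CountEntriesByDeleter(rocksdb::Cache* cache) {
      std::map<rocksdb::Cache::DeleterFn, size_t> count_by_deleter;
      rocksdb::Cache::ApplyToAllEntriesOptions opts;
      opts.average_entries_per_lock = 256;  // entries visited per lock hold
      cache->ApplyToAllEntries(
          [&](const rocksdb::Slice& /*key*/, void* /*value*/,
              size_t /*charge*/, rocksdb::Cache::DeleterFn deleter) {
            ++count_by_deleter[deleter];
          },
          opts);
    }
    ```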
    
    I have enhanced cache_bench to validate this approach:
    
    * Reports a histogram of ns per operation, so we can look at the
    distribution of times, not just throughput (average).
    * Can add a thread for simulated "gather stats" which calls
    ApplyToAllEntries at a specified interval. We also generate a histogram
    of time to run ApplyToAllEntries.
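    
    Roughly, the simulated gather-stats thread does something like the
    following (an illustrative sketch, not the exact cache_bench code; the
    aggregates mirror the "cache entry stats" report shown below):
    
    ```
    #include <atomic>
    #include <chrono>
    #include <cstddef>
    #include <set>
    #include <thread>
    
    #include "rocksdb/cache.h"
    
    // Periodically scan the whole cache, timing each scan for a histogram.
    void GatherStatsLoop(rocksdb::Cache* cache, std::atomic<bool>& done) {
      while (!done.load(std::memory_order_relaxed)) {
        auto start = std::chrono::steady_clock::now();
        size_t entries = 0, total_charge = 0, total_key_size = 0;
        std::set<rocksdb::Cache::DeleterFn> deleters;
        cache->ApplyToAllEntries(
            [&](const rocksdb::Slice& key, void* /*value*/, size_t charge,
                rocksdb::Cache::DeleterFn deleter) {
              ++entries;
              total_charge += charge;
              total_key_size += key.size();
              deleters.insert(deleter);  // "Unique deleters" in the report
            },
            {} /* default ApplyToAllEntriesOptions */);
        auto elapsed = std::chrono::steady_clock::now() - start;
        (void)elapsed;  // ... record in the "Gather stats latency" histogram,
                        // report entries, total_charge, averages, and
                        // deleters.size() ...
        std::this_thread::sleep_for(std::chrono::seconds(1));
      }
    }
    ```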
    
    To make the iteration over some entries of each shard work as cleanly as
    possible, even with a resize between one set of entries and the next, I
    have re-arranged which hash bits are used for sharding and which for
    indexing within a shard.
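    
    One arrangement with the needed property (a sketch of the idea; the PR's
    actual bit assignment may differ) is to pick the shard from the low bits
    and the in-shard table index from the high bits:
    
    ```
    #include <cstdint>
    
    // Hypothetical illustration: sharding and in-shard indexing read
    // disjoint parts of the hash, so resizing a shard's table never moves
    // an entry to a different shard.
    inline uint32_t ShardOf(uint32_t hash, uint32_t shard_mask) {
      return hash & shard_mask;  // low bits pick the shard
    }
    inline uint32_t IndexInTable(uint32_t hash, int length_bits) {
      // High bits index within the shard's table (assumes
      // 1 <= length_bits <= 31). When the table doubles, bucket i splits
      // into buckets 2i and 2i+1, so entries in already-visited buckets
      // [0, i) land in [0, 2i); an in-progress iteration can simply scale
      // its cursor instead of revisiting or skipping entries.
      return hash >> (32 - length_bits);
    }
    ```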
    
    Pull Request resolved: https://github.com/facebook/rocksdb/pull/8225
    
    Test Plan:
    A couple of unit tests are added, but the primary validation is manual,
    as the main risk is to performance.
    
    The primary validation uses cache_bench to ensure that neither
    the minor hashing changes nor the simulated stats gathering
    significantly impacts QPS or latency distribution. Note that adding the
    op latency histogram seriously impacts the benchmark QPS, so for a
    fair baseline we need the cache_bench changes (with the simulated
    stat gathering removed so the baseline compiles). In short, we don't see
    any reproducible difference in ops/sec or op latency unless we are
    gathering stats nearly continuously. The test uses a 10GB block cache
    with 8KB values to be somewhat realistic about the number of items to
    iterate over.
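    
    For example, something along these lines (the flag names are assumptions
    about cache_bench, not verified, except gather_stats, which the runs
    below reference; 16 threads x 5M ops each matches the 80M-sample
    histograms):
    
    ```
    ./cache_bench -cache_size=10000000000 -value_bytes=8192 \
        -threads=16 -ops_per_thread=5000000 -gather_stats=true
    ```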
    
    Baseline typical output:
    
    ```
    Complete in 92.017 s; Rough parallel ops/sec = 869401
    Thread ops/sec = 54662
    
    Operation latency (ns):
    Count: 80000000 Average: 11223.9494  StdDev: 29.61
    Min: 0  Median: 7759.3973  Max: 9620500
    Percentiles: P50: 7759.40 P75: 14190.73 P99: 46922.75 P99.9: 77509.84 P99.99: 217030.58
    ------------------------------------------------------
    [       0,       1 ]       68   0.000%   0.000%
    (    2900,    4400 ]       89   0.000%   0.000%
    (    4400,    6600 ] 33630240  42.038%  42.038% ########
    (    6600,    9900 ] 18129842  22.662%  64.700% #####
    (    9900,   14000 ]  7877533   9.847%  74.547% ##
    (   14000,   22000 ] 15193238  18.992%  93.539% ####
    (   22000,   33000 ]  3037061   3.796%  97.335% #
    (   33000,   50000 ]  1626316   2.033%  99.368%
    (   50000,   75000 ]   421532   0.527%  99.895%
    (   75000,  110000 ]    56910   0.071%  99.966%
    (  110000,  170000 ]    16134   0.020%  99.986%
    (  170000,  250000 ]     5166   0.006%  99.993%
    (  250000,  380000 ]     3017   0.004%  99.996%
    (  380000,  570000 ]     1337   0.002%  99.998%
    (  570000,  860000 ]      805   0.001%  99.999%
    (  860000, 1200000 ]      319   0.000% 100.000%
    ( 1200000, 1900000 ]      231   0.000% 100.000%
    ( 1900000, 2900000 ]      100   0.000% 100.000%
    ( 2900000, 4300000 ]       39   0.000% 100.000%
    ( 4300000, 6500000 ]       16   0.000% 100.000%
    ( 6500000, 9800000 ]        7   0.000% 100.000%
    ```
    
    New, gather_stats=false. Median thread ops/sec of 5 runs:
    
    ```
    Complete in 92.030 s; Rough parallel ops/sec = 869285
    Thread ops/sec = 54458
    
    Operation latency (ns):
    Count: 80000000 Average: 11298.1027  StdDev: 42.18
    Min: 0  Median: 7722.0822  Max: 6398720
    Percentiles: P50: 7722.08 P75: 14294.68 P99: 47522.95 P99.9: 85292.16 P99.99: 228077.78
    ------------------------------------------------------
    [       0,       1 ]      109   0.000%   0.000%
    (    2900,    4400 ]      793   0.001%   0.001%
    (    4400,    6600 ] 34054563  42.568%  42.569% #########
    (    6600,    9900 ] 17482646  21.853%  64.423% ####
    (    9900,   14000 ]  7908180   9.885%  74.308% ##
    (   14000,   22000 ] 15032072  18.790%  93.098% ####
    (   22000,   33000 ]  3237834   4.047%  97.145% #
    (   33000,   50000 ]  1736882   2.171%  99.316%
    (   50000,   75000 ]   446851   0.559%  99.875%
    (   75000,  110000 ]    68251   0.085%  99.960%
    (  110000,  170000 ]    18592   0.023%  99.983%
    (  170000,  250000 ]     7200   0.009%  99.992%
    (  250000,  380000 ]     3334   0.004%  99.997%
    (  380000,  570000 ]     1393   0.002%  99.998%
    (  570000,  860000 ]      700   0.001%  99.999%
    (  860000, 1200000 ]      293   0.000% 100.000%
    ( 1200000, 1900000 ]      196   0.000% 100.000%
    ( 1900000, 2900000 ]       69   0.000% 100.000%
    ( 2900000, 4300000 ]       32   0.000% 100.000%
    ( 4300000, 6500000 ]       10   0.000% 100.000%
    ```
    
    New, gather_stats=true, 1 second delay between scans. Scans take about
    1 second here, so it's spending about 50% of the time scanning. Still,
    the effect on ops/sec and latency seems to be in the noise. Median
    thread ops/sec of 5 runs:
    
    ```
    Complete in 91.890 s; Rough parallel ops/sec = 870608
    Thread ops/sec = 54551
    
    Operation latency (ns):
    Count: 80000000 Average: 11311.2629  StdDev: 45.28
    Min: 0  Median: 7686.5458  Max: 10018340
    Percentiles: P50: 7686.55 P75: 14481.95 P99: 47232.60 P99.9: 79230.18 P99.99: 232998.86
    ------------------------------------------------------
    [       0,       1 ]       71   0.000%   0.000%
    (    2900,    4400 ]      291   0.000%   0.000%
    (    4400,    6600 ] 34492060  43.115%  43.116% #########
    (    6600,    9900 ] 16727328  20.909%  64.025% ####
    (    9900,   14000 ]  7845828   9.807%  73.832% ##
    (   14000,   22000 ] 15510654  19.388%  93.220% ####
    (   22000,   33000 ]  3216533   4.021%  97.241% #
    (   33000,   50000 ]  1680859   2.101%  99.342%
    (   50000,   75000 ]   439059   0.549%  99.891%
    (   75000,  110000 ]    60540   0.076%  99.967%
    (  110000,  170000 ]    14649   0.018%  99.985%
    (  170000,  250000 ]     5242   0.007%  99.991%
    (  250000,  380000 ]     3260   0.004%  99.995%
    (  380000,  570000 ]     1599   0.002%  99.997%
    (  570000,  860000 ]     1043   0.001%  99.999%
    (  860000, 1200000 ]      471   0.001%  99.999%
    ( 1200000, 1900000 ]      275   0.000% 100.000%
    ( 1900000, 2900000 ]      143   0.000% 100.000%
    ( 2900000, 4300000 ]       60   0.000% 100.000%
    ( 4300000, 6500000 ]       27   0.000% 100.000%
    ( 6500000, 9800000 ]        7   0.000% 100.000%
    ( 9800000, 14000000 ]        1   0.000% 100.000%
    
    Gather stats latency (us):
    Count: 46 Average: 980387.5870  StdDev: 60911.18
    Min: 879155  Median: 1033777.7778  Max: 1261431
    Percentiles: P50: 1033777.78 P75: 1120666.67 P99: 1261431.00 P99.9: 1261431.00 P99.99: 1261431.00
    ------------------------------------------------------
    (  860000, 1200000 ]       45  97.826%  97.826% ####################
    ( 1200000, 1900000 ]        1   2.174% 100.000%
    
    Most recent cache entry stats:
    Number of entries: 1295133
    Total charge: 9.88 GB
    Average key size: 23.4982
    Average charge: 8.00 KB
    Unique deleters: 3
    ```
    
    Reviewed By: mrambacher
    
    Differential Revision: D28295742
    
    Pulled By: pdillinger
    
    fbshipit-source-id: bbc4a552f91ba0fe10e5cc025c42cef5a81f2b95