1. 02 Jul 2019 · 1 commit
    • Add secondary instance to stress test (#5479) · c3606757
      Committed by Yanqin Jin
      Summary:
      This PR allows users to run stress tests on a secondary instance.
      
      Test plan (on devserver)
      ```
      ./db_stress -ops_per_thread=100000 -enable_secondary=true -threads=32 -secondary_catch_up_one_in=10000 -clear_column_family_one_in=1000 -reopen=100
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5479
      
      Differential Revision: D16074325
      
      Pulled By: riversand963
      
      fbshipit-source-id: c0ed959e7b6c7cda3efd0b3070ab379de3b29f1c
  2. 01 Jul 2019 · 2 commits
  3. 29 Jun 2019 · 1 commit
  4. 28 Jun 2019 · 2 commits
  5. 27 Jun 2019 · 2 commits
    • Add C binding for secondary instance (#5505) · c08c0ae7
      Committed by Yanqin Jin
      Summary:
      Add a C binding for the secondary instance, as well as a unit test.
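
      For illustration, a minimal sketch of using the new binding, assuming the entry points follow the c.h naming of the C++ OpenAsSecondary/TryCatchUpWithPrimary API (`rocksdb_open_as_secondary`, `rocksdb_try_catch_up_with_primary`); paths and the key are placeholders:
      ```
      #include <stdio.h>
      #include <stdlib.h>
      #include "rocksdb/c.h"

      int main() {
        rocksdb_options_t* options = rocksdb_options_create();
        // Secondary instances require unlimited open files.
        rocksdb_options_set_max_open_files(options, -1);
        char* err = NULL;

        // Open a read-only secondary that trails the primary at /tmp/primary,
        // keeping its own info log and OPTIONS under /tmp/secondary.
        rocksdb_t* db = rocksdb_open_as_secondary(options, "/tmp/primary",
                                                  "/tmp/secondary", &err);
        if (err != NULL) { fprintf(stderr, "open: %s\n", err); return 1; }

        // Replay the primary's newly written MANIFEST/WAL entries on demand.
        rocksdb_try_catch_up_with_primary(db, &err);
        if (err != NULL) { fprintf(stderr, "catch up: %s\n", err); return 1; }

        size_t vlen = 0;
        rocksdb_readoptions_t* ropts = rocksdb_readoptions_create();
        char* value = rocksdb_get(db, ropts, "key", 3, &vlen, &err);
        if (err == NULL && value != NULL) {
          printf("key -> %.*s\n", (int)vlen, value);
          free(value);
        }

        rocksdb_readoptions_destroy(ropts);
        rocksdb_close(db);
        rocksdb_options_destroy(options);
        return 0;
      }
      ```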
      
      Test plan (on devserver)
      ```
      $make clean && COMPILE_WITH_ASAN=1 make -j20 all
      $./c_test
      $make check
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5505
      
      Differential Revision: D16000043
      
      Pulled By: riversand963
      
      fbshipit-source-id: 3361ef6bfdf4ce12438cee7290a0ac203b5250bd
    • Block cache tracer: Do not populate block cache trace record when tracing is disabled. (#5510) · a8975b62
      Committed by haoyuhuang
      Summary:
      This PR makes sure that the trace record is not populated when tracing is disabled.
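
      The shape of the fix, as a self-contained sketch (class and member names are illustrative, not the PR's actual code): check the cheap enabled flag before constructing the record at all.
      ```
      #include <atomic>
      #include <chrono>
      #include <cstdint>
      #include <iostream>
      #include <string>

      struct TraceRecord {
        std::string block_key;
        uint64_t block_size;
        uint64_t access_timestamp_us;
      };

      class BlockTracerSketch {
       public:
        void StartTrace() { tracing_enabled_.store(true, std::memory_order_release); }
        void EndTrace() { tracing_enabled_.store(false, std::memory_order_release); }

        // The fix's shape: bail out before building the record, so a disabled
        // tracer costs one atomic load instead of string copies plus a clock read.
        void OnBlockAccess(const std::string& block_key, uint64_t block_size) {
          if (!tracing_enabled_.load(std::memory_order_acquire)) return;
          TraceRecord record{block_key, block_size, NowMicros()};
          Write(record);
        }

       private:
        static uint64_t NowMicros() {
          using namespace std::chrono;
          return duration_cast<microseconds>(
                     steady_clock::now().time_since_epoch()).count();
        }
        void Write(const TraceRecord& r) {
          std::cout << r.block_key << " " << r.block_size << "\n";
        }
        std::atomic<bool> tracing_enabled_{false};
      };

      int main() {
        BlockTracerSketch tracer;
        tracer.OnBlockAccess("blk1", 4096);  // no-op: tracing disabled
        tracer.StartTrace();
        tracer.OnBlockAccess("blk2", 4096);  // traced
      }
      ```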
      
      Before this PR:
      DB path: [/data/mysql/rocks_regression_tests/OPTIONS-myrocks-40-33-10000000/2019-06-26-13-04-41/db]
      readwhilewriting :       9.803 micros/op 1550408 ops/sec;  107.9 MB/s (5000000 of 5000000 found)
      Microseconds per read:
      Count: 80000000 Average: 9.8045  StdDev: 12.64
      Min: 1  Median: 7.5246  Max: 25343
      Percentiles: P50: 7.52 P75: 12.10 P99: 37.44 P99.9: 75.07 P99.99: 133.60
      
      After this PR:
      DB path: [/data/mysql/rocks_regression_tests/OPTIONS-myrocks-40-33-10000000/2019-06-26-14-08-21/db]
      readwhilewriting :       8.723 micros/op 1662882 ops/sec;  115.8 MB/s (5000000 of 5000000 found)
      Microseconds per read:
      Count: 80000000 Average: 8.7236  StdDev: 12.19
      Min: 1  Median: 6.7262  Max: 25229
      Percentiles: P50: 6.73 P75: 10.50 P99: 31.54 P99.9: 74.81 P99.99: 132.82
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5510
      
      Differential Revision: D16016428
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: 3b3d11e6accf207d18ec2545b802aa01ee65901f
  6. 26 Jun 2019 · 1 commit
  7. 25 Jun 2019 · 6 commits
    • Add an option to put first key of each sst block in the index (#5289) · b4d72094
      Committed by Mike Kolupaev
      Summary:
      The first key is used to defer reading the data block until this file gets to the top of the merging iterator's heap. For short range scans, most files never make it to the top of the heap, so this change can sometimes reduce read amplification substantially.
      
      Consider the following workload. There are a few data streams (we'll call them "logs"), each consisting of a sequence of blobs (we'll call them "records"). Each record is identified by a log ID and a sequence number within the log. The RocksDB key is the concatenation of log ID and sequence number (big endian). Reads are mostly relatively short range scans, each within a single log. Writes are mostly sequential within each log, but writes to different logs are randomly interleaved. Compactions are disabled; instead, when we accumulate a few tens of sst files, we create a new column family and start writing to it.
      
      So, a typical sst file consists of a few ranges of blocks, each range corresponding to one log ID (we use FlushBlockPolicy to cut blocks at log boundaries). A typical read goes like this: first, the iterator Seek() reads one block from each sst file; then a series of Next()s move through one sst file (since writes to each log are mostly sequential) until the subiterator reaches the end of this log in this sst file; then Next() switches to the next sst file and reads sequentially from that, and so on. Often a range scan will only return records from a small number of blocks in a small number of sst files; in this case, the cost of the initial Seek() reading one block from each file may be bigger than the cost of reading the actually useful blocks.
      
      Neither iterate_upper_bound nor bloom filters can prevent reading one block from each file in Seek(). But this PR can: if the index contains the first key of each block, we don't have to read a block until it actually makes it to the top of the merging iterator's heap, so for short range scans we won't read any blocks from most of the sst files.
      
      This PR does the deferred block loading inside the value() call. This is not ideal: there's no good way to report an IO error from inside value(). As discussed with siying offline, it would probably be better to change InternalIterator's interface to explicitly fetch the deferred value and get a status. I'll do that in a separate PR.
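
      For reference, a minimal sketch of opting in, assuming the feature is exposed as the `kBinarySearchWithFirstKey` index type on `BlockBasedTableOptions` (the path is a placeholder):
      ```
      #include <cassert>
      #include "rocksdb/db.h"
      #include "rocksdb/table.h"

      int main() {
        rocksdb::BlockBasedTableOptions table_options;
        // Store each data block's first key in the index, enabling the deferred
        // block fetch described above.
        table_options.index_type =
            rocksdb::BlockBasedTableOptions::kBinarySearchWithFirstKey;

        rocksdb::Options options;
        options.create_if_missing = true;
        options.table_factory.reset(
            rocksdb::NewBlockBasedTableFactory(table_options));

        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/firstkey_demo", &db);
        assert(s.ok());
        delete db;
      }
      ```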
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5289
      
      Differential Revision: D15256423
      
      Pulled By: al13n321
      
      fbshipit-source-id: 750e4c39ce88e8d41662f701cf6275d9388ba46a
    • Block cache trace analysis: Write time series graphs in csv files (#5490) · 554a6456
      Committed by haoyuhuang
      Summary:
      This PR adds a feature to the block cache trace analysis tool to write statistics into csv files.
      1. The analysis tool supports grouping the number of accesses per second by various labels, e.g., block, column family, block type, or a combination of them.
      2. It also computes reuse distance and reuse interval.
      
      Reuse distance: the cumulative size of unique blocks read between two consecutive accesses to the same block.
      Reuse interval: the time between two consecutive accesses to the same block.
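
      To make the two metrics concrete, a self-contained toy sketch that computes both over a hard-coded trace (the naive O(n^2) rescan is purely illustrative):
      ```
      #include <cstdint>
      #include <iostream>
      #include <string>
      #include <unordered_map>
      #include <unordered_set>
      #include <vector>

      struct Access {
        std::string block_id;
        uint64_t size;     // block size in bytes
        uint64_t time_us;  // access timestamp
      };

      int main() {
        std::vector<Access> trace = {
            {"A", 4096, 0}, {"B", 8192, 10}, {"C", 4096, 20}, {"A", 4096, 30}};

        // block -> index of its previous access in the trace
        std::unordered_map<std::string, size_t> last_pos;
        for (size_t i = 0; i < trace.size(); ++i) {
          auto it = last_pos.find(trace[i].block_id);
          if (it != last_pos.end()) {
            // Reuse interval: time between consecutive accesses to this block.
            uint64_t interval = trace[i].time_us - trace[it->second].time_us;
            // Reuse distance: cumulative size of *unique* blocks in between.
            std::unordered_set<std::string> seen;
            uint64_t distance = 0;
            for (size_t j = it->second + 1; j < i; ++j) {
              if (seen.insert(trace[j].block_id).second) distance += trace[j].size;
            }
            std::cout << trace[i].block_id << ": interval=" << interval
                      << "us distance=" << distance << "B\n";  // A: 30us 12288B
          }
          last_pos[trace[i].block_id] = i;
        }
      }
      ```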
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5490
      
      Differential Revision: D15901322
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: b5454fea408a32757a80be63de6fe1c8149ca70e
    • Fix build jemalloc api (#5470) · acb80534
      Committed by Huisheng Liu
      Summary:
      There is a compile error on Windows with MSVC in malloc_stats.cc where malloc_stats_print is referenced. The compiler only knows je_malloc_stats_print from jemalloc.h. Adding JEMALLOC_NO_RENAME replaces malloc_stats_print with je_malloc_stats_print.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5470
      
      Differential Revision: D15978720
      
      fbshipit-source-id: c05757a2e89e2e015a661d9626c352e4f32f97e4
    • C file should not include <cinttypes>, it is a C++ header. (#5499) · e731f440
      Committed by Sergei Petrunia
      Summary:
      Include <inttypes.h> instead.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5499
      
      Differential Revision: D15966937
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 2156c4329b91d26d447de94f1231264d52786350
    • JNI: Do not create 8M block cache for negative blockCacheSize values (#5465) · c92c58f8
      Committed by Jermy Li
      Summary:
      As documented on [BlockBasedTableConfig setBlockCacheSize()](https://github.com/facebook/rocksdb/blob/1966a7c055f6e182d627275051f5c09441aa922d/java/src/main/java/org/rocksdb/BlockBasedTableConfig.java#L728), a non-positive cacheSize should disable the cache. But when a negative number or 0 was configured, the unexpected result was an 8MB block cache.

      - Treat 0 as a valid size: when the block cache size is 0, an 8MB block cache is created, matching the default C++ API behavior. The comment is updated accordingly.
      - Set no_block_cache to true when a negative value is passed, so no block cache is created.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5465
      
      Differential Revision: D15968788
      
      Pulled By: sagar0
      
      fbshipit-source-id: ee02d6e95841c9e2c316a64bfdf192d46ff5638a
    • Also build compression libraries on AppVeyor CI (#5226) · 68980df8
      Committed by Adam Retter
      Summary:
      This adds some compression dependencies to AppVeyor CI (those whose builds can be easily scripted on Windows, i.e. Snappy, LZ4, and ZStd).
      
      Let's see if the CI passes ;-)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5226
      
      Differential Revision: D15967223
      
      fbshipit-source-id: 0914c613ac358cbb248df75cdee8099e836828dc
  8. 22 Jun 2019 · 2 commits
    • Compaction Reads should read no more than compaction_readahead_size bytes, when set! (#5498) · 22028aa9
      Committed by Vijay Nadimpalli
      Summary:
      As a result of https://github.com/facebook/rocksdb/issues/5431, the compaction_readahead_size given by a user was not honored exactly: that PR unified the readahead code paths for user reads and compaction reads, and the user-read path reads readahead_size + n bytes (see the FilePrefetchBuffer::TryReadFromCache method). Before the unification, ReadaheadRandomAccessFileReader used compaction_readahead_size as-is.
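
      For reference, the option being honored is a plain DBOptions field; a minimal sketch matching the 2MB setting used in the test plan below (the path is a placeholder):
      ```
      #include <cassert>
      #include "rocksdb/db.h"

      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        // With this fix, compaction input reads issue preads of exactly this
        // size (2MB here) rather than readahead_size + n bytes.
        options.compaction_readahead_size = 2 * 1024 * 1024;

        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/readahead_demo", &db);
        assert(s.ok());
        delete db;
      }
      ```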
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5498
      
      Test Plan:
      Ran strace command : strace -e pread64 -f -T -t ./db_compaction_test --gtest_filter=DBCompactionTest.PartialManualCompaction
      
      In the test, compaction_readahead_size was configured to 2MB, and the strace output verifies that the pread syscalls indeed request 2MB. Before the change, they requested more than 2MB.
      
      Strace Output:
      strace: Process 3798982 attached
      Note: Google Test filter = DBCompactionTest.PartialManualCompaction
      [==========] Running 1 test from 1 test case.
      [----------] Global test environment set-up.
      [----------] 1 test from DBCompactionTest
      [ RUN      ] DBCompactionTest.PartialManualCompaction
      strace: Process 3798983 attached
      strace: Process 3798984 attached
      strace: Process 3798985 attached
      strace: Process 3798986 attached
      strace: Process 3798987 attached
      strace: Process 3798992 attached
      [pid 3798987] 12:07:05 +++ exited with 0 +++
      strace: Process 3798993 attached
      [pid 3798993] 12:07:05 +++ exited with 0 +++
      strace: Process 3798994 attached
      strace: Process 3799008 attached
      strace: Process 3799009 attached
      [pid 3799008] 12:07:05 +++ exited with 0 +++
      strace: Process 3799010 attached
      [pid 3799009] 12:07:05 +++ exited with 0 +++
      strace: Process 3799011 attached
      [pid 3799010] 12:07:05 +++ exited with 0 +++
      [pid 3799011] 12:07:05 +++ exited with 0 +++
      strace: Process 3799012 attached
      [pid 3799012] 12:07:05 +++ exited with 0 +++
      strace: Process 3799013 attached
      strace: Process 3799014 attached
      [pid 3799013] 12:07:05 +++ exited with 0 +++
      strace: Process 3799015 attached
      [pid 3799014] 12:07:05 +++ exited with 0 +++
      [pid 3799015] 12:07:05 +++ exited with 0 +++
      strace: Process 3799016 attached
      [pid 3799016] 12:07:05 +++ exited with 0 +++
      strace: Process 3799017 attached
      [pid 3799017] 12:07:05 +++ exited with 0 +++
      strace: Process 3799019 attached
      [pid 3799019] 12:07:05 +++ exited with 0 +++
      strace: Process 3799020 attached
      strace: Process 3799021 attached
      [pid 3799020] 12:07:05 +++ exited with 0 +++
      [pid 3799021] 12:07:05 +++ exited with 0 +++
      strace: Process 3799022 attached
      [pid 3799022] 12:07:05 +++ exited with 0 +++
      strace: Process 3799023 attached
      [pid 3799023] 12:07:05 +++ exited with 0 +++
      strace: Process 3799047 attached
      strace: Process 3799048 attached
      [pid 3799047] 12:07:06 +++ exited with 0 +++
      [pid 3799048] 12:07:06 +++ exited with 0 +++
      [pid 3798994] 12:07:06 +++ exited with 0 +++
      strace: Process 3799052 attached
      [pid 3799052] 12:07:06 +++ exited with 0 +++
      strace: Process 3799054 attached
      strace: Process 3799069 attached
      strace: Process 3799070 attached
      [pid 3799069] 12:07:06 +++ exited with 0 +++
      strace: Process 3799071 attached
      [pid 3799070] 12:07:06 +++ exited with 0 +++
      [pid 3799071] 12:07:06 +++ exited with 0 +++
      strace: Process 3799072 attached
      strace: Process 3799073 attached
      [pid 3799072] 12:07:06 +++ exited with 0 +++
      [pid 3799073] 12:07:06 +++ exited with 0 +++
      strace: Process 3799074 attached
      [pid 3799074] 12:07:06 +++ exited with 0 +++
      strace: Process 3799075 attached
      [pid 3799075] 12:07:06 +++ exited with 0 +++
      strace: Process 3799076 attached
      [pid 3799076] 12:07:06 +++ exited with 0 +++
      strace: Process 3799077 attached
      [pid 3799077] 12:07:06 +++ exited with 0 +++
      strace: Process 3799078 attached
      [pid 3799078] 12:07:06 +++ exited with 0 +++
      strace: Process 3799079 attached
      [pid 3799079] 12:07:06 +++ exited with 0 +++
      strace: Process 3799080 attached
      [pid 3799080] 12:07:06 +++ exited with 0 +++
      strace: Process 3799081 attached
      [pid 3799081] 12:07:06 +++ exited with 0 +++
      strace: Process 3799082 attached
      [pid 3799082] 12:07:06 +++ exited with 0 +++
      strace: Process 3799083 attached
      [pid 3799083] 12:07:06 +++ exited with 0 +++
      strace: Process 3799086 attached
      strace: Process 3799087 attached
      [pid 3798984] 12:07:06 pread64(9, "\1\203W!\241QE\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 11177) = 53 <0.000121>
      [pid 3798984] 12:07:06 pread64(9, "\0\22\4rocksdb.properties\353Q\223\5\0\0\0\0\1\0\0"..., 38, 11139) = 38 <0.000106>
      [pid 3798984] 12:07:06 pread64(9, "\0$\4rocksdb.block.based.table.ind"..., 664, 10475) = 664 <0.000081>
      [pid 3798984] 12:07:06 pread64(9, "\0\v\3foo\2\7\0\0\0\0\0\0\0\270 \0\v\4foo\2\3\0\0\0\0\0\0\275"..., 74, 10401) = 74 <0.000138>
      [pid 3798984] 12:07:06 pread64(11, "\1\203W!\241QE\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 11177) = 53 <0.000097>
      [pid 3798984] 12:07:06 pread64(11, "\0\22\4rocksdb.properties\353Q\223\5\0\0\0\0\1\0\0"..., 38, 11139) = 38 <0.000086>
      [pid 3798984] 12:07:06 pread64(11, "\0$\4rocksdb.block.based.table.ind"..., 664, 10475) = 664 <0.000064>
      [pid 3798984] 12:07:06 pread64(11, "\0\v\3foo\2\21\0\0\0\0\0\0\0\270 \0\v\4foo\2\r\0\0\0\0\0\0\275"..., 74, 10401) = 74 <0.000064>
      [pid 3798984] 12:07:06 pread64(12, "\1\203W!\241QE\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 11177) = 53 <0.000080>
      [pid 3798984] 12:07:06 pread64(12, "\0\22\4rocksdb.properties\353Q\223\5\0\0\0\0\1\0\0"..., 38, 11139) = 38 <0.000090>
      [pid 3798984] 12:07:06 pread64(12, "\0$\4rocksdb.block.based.table.ind"..., 664, 10475) = 664 <0.000059>
      [pid 3798984] 12:07:06 pread64(12, "\0\v\3foo\2\33\0\0\0\0\0\0\0\270 \0\v\4foo\2\27\0\0\0\0\0\0\275"..., 74, 10401) = 74 <0.000065>
      [pid 3798984] 12:07:06 pread64(13, "\1\203W!\241QE\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 11177) = 53 <0.000070>
      [pid 3798984] 12:07:06 pread64(13, "\0\22\4rocksdb.properties\353Q\223\5\0\0\0\0\1\0\0"..., 38, 11139) = 38 <0.000059>
      [pid 3798984] 12:07:06 pread64(13, "\0$\4rocksdb.block.based.table.ind"..., 664, 10475) = 664 <0.000061>
      [pid 3798984] 12:07:06 pread64(13, "\0\v\3foo\2%\0\0\0\0\0\0\0\270 \0\v\4foo\2!\0\0\0\0\0\0\275"..., 74, 10401) = 74 <0.000065>
      [pid 3798984] 12:07:06 pread64(14, "\1\203W!\241QE\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 11177) = 53 <0.000118>
      [pid 3798984] 12:07:06 pread64(14, "\0\22\4rocksdb.properties\353Q\223\5\0\0\0\0\1\0\0"..., 38, 11139) = 38 <0.000093>
      [pid 3798984] 12:07:06 pread64(14, "\0$\4rocksdb.block.based.table.ind"..., 664, 10475) = 664 <0.000050>
      [pid 3798984] 12:07:06 pread64(14, "\0\v\3foo\2/\0\0\0\0\0\0\0\270 \0\v\4foo\2+\0\0\0\0\0\0\275"..., 74, 10401) = 74 <0.000082>
      [pid 3798984] 12:07:06 pread64(15, "\1\203W!\241QE\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 11177) = 53 <0.000080>
      [pid 3798984] 12:07:06 pread64(15, "\0\22\4rocksdb.properties\353Q\223\5\0\0\0\0\1\0\0"..., 38, 11139) = 38 <0.000086>
      [pid 3798984] 12:07:06 pread64(15, "\0$\4rocksdb.block.based.table.ind"..., 664, 10475) = 664 <0.000091>
      [pid 3798984] 12:07:06 pread64(15, "\0\v\3foo\0029\0\0\0\0\0\0\0\270 \0\v\4foo\0025\0\0\0\0\0\0\275"..., 74, 10401) = 74 <0.000174>
      [pid 3798984] 12:07:06 pread64(16, "\1\203W!\241QE\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 11177) = 53 <0.000080>
      [pid 3798984] 12:07:06 pread64(16, "\0\22\4rocksdb.properties\353Q\223\5\0\0\0\0\1\0\0"..., 38, 11139) = 38 <0.000093>
      [pid 3798984] 12:07:06 pread64(16, "\0$\4rocksdb.block.based.table.ind"..., 664, 10475) = 664 <0.000194>
      [pid 3798984] 12:07:06 pread64(16, "\0\v\3foo\2C\0\0\0\0\0\0\0\270 \0\v\4foo\2?\0\0\0\0\0\0\275"..., 74, 10401) = 74 <0.000086>
      [pid 3798984] 12:07:06 pread64(17, "\1\203W!\241QE\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 11177) = 53 <0.000079>
      [pid 3798984] 12:07:06 pread64(17, "\0\22\4rocksdb.properties\353Q\223\5\0\0\0\0\1\0\0"..., 38, 11139) = 38 <0.000047>
      [pid 3798984] 12:07:06 pread64(17, "\0$\4rocksdb.block.based.table.ind"..., 664, 10475) = 664 <0.000045>
      [pid 3798984] 12:07:06 pread64(17, "\0\v\3foo\2M\0\0\0\0\0\0\0\270 \0\v\4foo\2I\0\0\0\0\0\0\275"..., 74, 10401) = 74 <0.000107>
      [pid 3798983] 12:07:06 pread64(17, "\0\v\200\10foo\2P\0\0\0\0\0\0)U?MSg_)j(roFn($e"..., 2097152, 0) = 11230 <0.000091>
      [pid 3798983] 12:07:06 pread64(17, "", 2085922, 11230) = 0 <0.000073>
      [pid 3798983] 12:07:06 pread64(16, "\0\v\200\10foo\2F\0\0\0\0\0\0k[h3%.OPH_^:\\S7T&"..., 2097152, 0) = 11230 <0.000083>
      [pid 3798983] 12:07:06 pread64(16, "", 2085922, 11230) = 0 <0.000078>
      [pid 3798983] 12:07:06 pread64(15, "\0\v\200\10foo\2<\0\0\0\0\0\0+qToi_c{*S+4:N(:"..., 2097152, 0) = 11230 <0.000095>
      [pid 3798983] 12:07:06 pread64(15, "", 2085922, 11230) = 0 <0.000067>
      [pid 3798983] 12:07:06 pread64(14, "\0\v\200\10foo\0022\0\0\0\0\0\0%hw%OMa\"}9I609Q!B"..., 2097152, 0) = 11230 <0.000111>
      [pid 3798983] 12:07:06 pread64(14, "", 2085922, 11230) = 0 <0.000093>
      [pid 3798983] 12:07:06 pread64(13, "\0\v\200\10foo\2(\0\0\0\0\0\0p}Y&mu^DcaSGb2&nP"..., 2097152, 0) = 11230 <0.000128>
      [pid 3798983] 12:07:06 pread64(13, "", 2085922, 11230) = 0 <0.000076>
      [pid 3798983] 12:07:06 pread64(12, "\0\v\200\10foo\2\36\0\0\0\0\0\0YIyW#]oSs^6VHfB<`"..., 2097152, 0) = 11230 <0.000092>
      [pid 3798983] 12:07:06 pread64(12, "", 2085922, 11230) = 0 <0.000073>
      [pid 3798983] 12:07:06 pread64(11, "\0\v\200\10foo\2\24\0\0\0\0\0\0mfF8Jel/*Zf :-#s("..., 2097152, 0) = 11230 <0.000088>
      [pid 3798983] 12:07:06 pread64(11, "", 2085922, 11230) = 0 <0.000067>
      [pid 3798983] 12:07:06 pread64(9, "\0\v\200\10foo\2\n\0\0\0\0\0\0\\X'cjiHX)D,RSj1X!"..., 2097152, 0) = 11230 <0.000115>
      [pid 3798983] 12:07:06 pread64(9, "", 2085922, 11230) = 0 <0.000073>
      [pid 3798983] 12:07:06 pread64(8, "\1\315\5 \36\30\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 53, 754) = 53 <0.000098>
      [pid 3798983] 12:07:06 pread64(8, "\0\22\3rocksdb.properties;\215\5\0\0\0\0\1\0\0\0"..., 37, 717) = 37 <0.000064>
      [pid 3798983] 12:07:06 pread64(8, "\0$\4rocksdb.block.based.table.ind"..., 658, 59) = 658 <0.000074>
      [pid 3798983] 12:07:06 pread64(8, "\0\v\2foo\1\0\0\0\0\0\0\0\0\31\0\0\0\0\1\0\0\0\0\212\216\222P", 29, 30) = 29 <0.000064>
      [pid 3799086] 12:07:06 +++ exited with 0 +++
      [pid 3799087] 12:07:06 +++ exited with 0 +++
      [pid 3799054] 12:07:06 +++ exited with 0 +++
      strace: Process 3799104 attached
      [pid 3799104] 12:07:06 +++ exited with 0 +++
      [       OK ] DBCompactionTest.PartialManualCompaction (757 ms)
      [----------] 1 test from DBCompactionTest (758 ms total)
      
      [----------] Global test environment tear-down
      [==========] 1 test from 1 test case ran. (759 ms total)
      [  PASSED  ] 1 test.
      [pid 3798983] 12:07:06 +++ exited with 0 +++
      [pid 3798984] 12:07:06 +++ exited with 0 +++
      [pid 3798992] 12:07:06 +++ exited with 0 +++
      [pid 3798986] 12:07:06 +++ exited with 0 +++
      [pid 3798982] 12:07:06 +++ exited with 0 +++
      [pid 3798985] 12:07:06 +++ exited with 0 +++
      12:07:06 +++ exited with 0 +++
      
      Differential Revision: D15948422
      
      Pulled By: vjnadimpalli
      
      fbshipit-source-id: 9b189d1e8675d290c7784e4b33e5d3b5761d2ac8
    • Fix ingested file and directory not being synced (#5435) · 2730fe69
      Committed by Yi Wu
      Summary:
      It is not safe to assume the application has synced the SST file before ingesting it into the DB. The directory holding the ingested file also needs to be fsynced, otherwise the file can be lost. For the integrity of RocksDB, we need to sync the ingested file and its directory before applying the change to the manifest.

      Syncing after writing the global sequence number when write_global_seqno=true was also removed in https://github.com/facebook/rocksdb/issues/4172; this PR adds it back.
      
      Fixes https://github.com/facebook/rocksdb/issues/5287.
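
      For context, a minimal sketch of the ingestion path being hardened, using the public SstFileWriter and IngestExternalFile APIs (paths are placeholders):
      ```
      #include <cassert>
      #include "rocksdb/db.h"
      #include "rocksdb/env.h"
      #include "rocksdb/sst_file_writer.h"

      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/ingest_demo", &db);
        assert(s.ok());

        // Build an external SST file to ingest.
        rocksdb::SstFileWriter writer(rocksdb::EnvOptions(), options);
        s = writer.Open("/tmp/ingest_demo.sst");
        assert(s.ok());
        s = writer.Put("key1", "value1");
        assert(s.ok());
        s = writer.Finish();
        assert(s.ok());

        rocksdb::IngestExternalFileOptions ifo;
        ifo.move_files = false;          // copy instead of hard-link/move
        ifo.write_global_seqno = true;   // the path whose re-sync this PR restores
        // After this fix, RocksDB syncs the ingested file and its directory
        // before the manifest change is applied.
        s = db->IngestExternalFile({"/tmp/ingest_demo.sst"}, ifo);
        assert(s.ok());
        delete db;
      }
      ```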
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5435
      
      Test Plan:
      Ingest a file with the ldb command and observe fsync/fdatasync in the strace output. Tried both move_files=true and move_files=false.
      https://gist.github.com/yiwu-arbug/650a4023f57979056d83485fa863bef9
      
      More test suggestions are welcome.
      
      Differential Revision: D15941675
      
      Pulled By: riversand963
      
      fbshipit-source-id: 389533f3923065a96df2cdde23ff4724a1810d78
  9. 21 Jun 2019 · 4 commits
    • Stop printing after verification fails (#5493) · 1bfeffab
      Committed by Yanqin Jin
      Summary:
      Stop verification and printing once verification fails.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5493
      
      Differential Revision: D15928992
      
      Pulled By: riversand963
      
      fbshipit-source-id: 699feac034a217d57280aa3fb50f5aba06adf317
    • Add more callers for table reader. (#5454) · 705b8eec
      Committed by haoyuhuang
      Summary:
      This PR adds more callers for table readers. This information is only used for block cache analysis, so that we can know which caller accesses a block.
      1. It renames the BlockCacheLookupCaller to TableReaderCaller as passing the caller from upstream requires changes to table_reader.h and TableReaderCaller is a more appropriate name.
      2. It adds more table reader callers in table/table_reader_caller.h, e.g., kCompactionRefill, kExternalSSTIngestion, and kBuildTable.
      
      This PR is long because it requires modifying interfaces in table_reader.h, e.g., NewIterator.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5454
      
      Test Plan: make clean && COMPILE_WITH_ASAN=1 make check -j32.
      
      Differential Revision: D15819451
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: b6caa704c8fb96ddd15b9a934b7e7ea87f88092d
    • Fix segfault in ~DBWithTTLImpl() when called after Close() (#5485) · 0b0cb6f1
      Committed by feilongliu
      Summary:
      ~DBWithTTLImpl() fails after Close() has been called (Close() invokes DBImpl::Close()), because Close() deletes default_cf_handle_, which is then used by the GetOptions() call inside ~DBWithTTLImpl(), leading to a segfault.

      Fix by adding a Close() function to the DBWithTTLImpl class that does the closing plus the work originally done in ~DBWithTTLImpl(). If Close() is not called explicitly, it is called from ~DBWithTTLImpl().
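
      A minimal sketch of the resulting usage contract (the path is a placeholder): an explicit Close() is now safe, and the destructor still closes the DB when Close() is skipped.
      ```
      #include <cassert>
      #include "rocksdb/utilities/db_ttl.h"

      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;

        rocksdb::DBWithTTL* db = nullptr;
        rocksdb::Status s =
            rocksdb::DBWithTTL::Open(options, "/tmp/ttl_demo", &db, /*ttl=*/3600);
        assert(s.ok());

        s = db->Put(rocksdb::WriteOptions(), "key", "value");
        assert(s.ok());

        // With this fix, explicitly closing before destruction no longer
        // segfaults in ~DBWithTTLImpl(); the destructor skips the duplicate
        // teardown.
        s = db->Close();
        assert(s.ok());
        delete db;
      }
      ```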
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5485
      
      Test Plan: make clean;  USE_CLANG=1 make all check -j
      
      Differential Revision: D15924498
      
      fbshipit-source-id: 567397fb972961059083a1ae0f9f99ff74872b78
    • Sanitize and limit block_size under 4GB (#5492) · 24f73436
      Committed by Zhongyi Xie
      Summary:
      `Block::restart_index_`, `Block::restarts_`, and `Block::current_` are defined as uint32_t, but `BlockBasedTableOptions::block_size` is defined as a size_t, so a user might see corruption as in https://github.com/facebook/rocksdb/issues/5486.
      This PR adds a check in `BlockBasedTableFactory::SanitizeOptions` to disallow such configurations.
      yiwu-arbug
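
      A self-contained sketch of the kind of check SanitizeOptions can apply (illustrative, not the PR's exact code):
      ```
      #include <cstdint>
      #include <iostream>
      #include <limits>
      #include <string>

      // Illustrative stand-in for the new sanitization: reject block sizes
      // that cannot be represented by the uint32_t offsets (restarts_,
      // current_, restart_index_) used inside Block.
      bool SanitizeBlockSize(size_t block_size, std::string* err) {
        if (block_size > std::numeric_limits<uint32_t>::max()) {
          *err = "block_size exceeds the maximum of the uint32_t type";
          return false;
        }
        return true;
      }

      int main() {
        std::string err;
        std::cout << SanitizeBlockSize(4096, &err) << "\n";  // 1: accepted
        // 5GB is rejected (assumes 64-bit size_t).
        std::cout << SanitizeBlockSize(5ULL * 1024 * 1024 * 1024, &err)
                  << " " << err << "\n";
      }
      ```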
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5492
      
      Differential Revision: D15914047
      
      Pulled By: miasantreble
      
      fbshipit-source-id: c943f153d967e15aee7f2795730ab8259e2be201
  10. 20 Jun 2019 · 3 commits
    • Fix AlignedBuffer's usage in Encryption Env (#5396) · 68614a96
      Committed by Sagar Vemuri
      Summary:
      The usage of `AlignedBuffer` in env_encryption.cc writes and reads to/from the AlignedBuffer's internal buffer directly without going through AlignedBuffer's APIs (like `Append` and `Read`), causing encapsulation to break in some cases. The writes are especially problematic: after the data is written to the buffer (directly using either memmove or memcpy), the size of the buffer is not updated, causing the AlignedBuffer to lose track of the encapsulated buffer's current size.
      Fixed this by updating the buffer size after every write.
      
      Todo for later:
      Add an overloaded method to AlignedBuffer to support a memmove in addition to a memcopy. Encryption env does a memmove, and hence I couldn't switch to using `AlignedBuffer.Append()`.
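
      To illustrate the bug shape and the fix with a simplified stand-in (not RocksDB's actual AlignedBuffer):
      ```
      #include <cstring>
      #include <iostream>
      #include <vector>

      // Simplified stand-in for AlignedBuffer: a byte buffer that tracks how
      // many bytes are currently valid.
      struct BufferSketch {
        std::vector<char> data;
        size_t cur_size = 0;  // valid bytes; readers rely on this
        explicit BufferSketch(size_t cap) : data(cap) {}
        char* BufferStart() { return data.data(); }
        void Size(size_t n) { cur_size = n; }
      };

      int main() {
        BufferSketch buf(4096);

        // The buggy pattern: write into the raw buffer but never update the
        // tracked size, so the buffer appears to contain zero valid bytes.
        const char payload[] = "ciphertext";
        std::memcpy(buf.BufferStart(), payload, sizeof(payload));
        std::cout << "before fix, size=" << buf.cur_size << "\n";  // 0

        // The fix: every direct write is followed by a size update.
        buf.Size(sizeof(payload));
        std::cout << "after fix, size=" << buf.cur_size << "\n";   // 11
      }
      ```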
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5396
      
      Test Plan: `make check`
      
      Differential Revision: D15764756
      
      Pulled By: sagar0
      
      fbshipit-source-id: 2e24b52bd3b4b5056c5c1da157f91ddf89370183
    • Java: Make the generics of the Options interfaces more strict (#5461) · 5830c619
      Committed by Jurriaan Mous
      Summary:
      Make the generics of the Options interfaces more strict so they are usable in a Kotlin Multiplatform expect/actual typealias implementation without causing a Violation of Finite Bound Restriction.
      
      This fix would enable the creation of a generic Kotlin multiplatform library by just typealiasing the JVM implementation to the current Java implementation.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5461
      
      Differential Revision: D15903288
      
      Pulled By: sagar0
      
      fbshipit-source-id: 75e83fdf5d2fcede40744a17e767563d6a4b0696
    • Combine the read-ahead logic for user reads and compaction reads (#5431) · 24b118ad
      Committed by Vijay Nadimpalli
      Summary:
      Currently the read-ahead logic for user reads and compaction reads goes through different code paths, where compaction reads create new table readers and use `ReadaheadRandomAccessFile`. This change unifies the read-ahead logic to use the read-ahead in BlockBasedTableReader::InitDataBlock(). As a result, the `ReadaheadRandomAccessFile` class and the `new_table_reader_for_compaction_inputs` option are no longer used.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5431
      
      Test Plan:
      make check
      
      Here is the benchmarking - https://gist.github.com/vjnadimpalli/083cf423f7b6aa12dcdb14c858bc18a5
      
      Differential Revision: D15772533
      
      Pulled By: vjnadimpalli
      
      fbshipit-source-id: b71dca710590471ede6fb37553388654e2e479b9
  11. 19 Jun 2019 · 10 commits
  12. 18 Jun 2019 · 6 commits
    • Fix rocksdb lite and clang contrun test failures (#5477) · ddd088c8
      Committed by Zhongyi Xie
      Summary:
      Recent commit 671d15cb introduced some test failures:
      ```
      ===== Running stats_history_test
      [==========] Running 9 tests from 1 test case.
      [----------] Global test environment set-up.
      [----------] 9 tests from StatsHistoryTest
      [ RUN      ] StatsHistoryTest.RunStatsDumpPeriodSec
      monitoring/stats_history_test.cc:63: Failure
      dbfull()->SetDBOptions({{"stats_dump_period_sec", "0"}})
      Not implemented: Not supported in ROCKSDB LITE
      
      db/db_options_test.cc:28:11: error: unused variable 'kMicrosInSec' [-Werror,-Wunused-const-variable]
      const int kMicrosInSec = 1000000;
      ```
      This PR fixes these failures
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5477
      
      Differential Revision: D15871814
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 0a7023914d2c1784d9d2d3f5bfb47310d4855394
    • Block cache tracing: Fix minor bugs with downsampling and some benchmark results. (#5473) · bcfc53b4
      Committed by haoyuhuang
      Summary:
      As the code changes for block cache tracing are almost complete, I did a benchmark to compare the performance when block cache tracing is enabled/disabled.
      
      With a 1% downsampling ratio, the performance overhead of block cache tracing is negligible. When we trace all block accesses, throughput drops roughly six-fold with 16 threads issuing random reads, all served from the block cache.
      
      Setup:
      RocksDB:    version 6.2
      Date:       Mon Jun 17 17:11:13 2019
      CPU:        24 * Intel Core Processor (Skylake)
      CPUCache:   16384 KB
      Keys:       20 bytes each
      Values:     100 bytes each (100 bytes after compression)
      Entries:    10000000
      Prefix:    20 bytes
      Keys per prefix:    0
      RawSize:    1144.4 MB (estimated)
      FileSize:   1144.4 MB (estimated)
      Write rate: 0 bytes/second
      Read rate: 0 ops/second
      Compression: NoCompression
      Compression sampling rate: 0
      Memtablerep: skip_list
      Perf Level: 1
      
      I ran the readrandom workload for 1 minute. Detailed throughput results (ops/second):
      Sample rate 0: no block cache tracing.
      Sample rate 1: trace all block accesses.
      Sample rate 100: trace 1% of block accesses.
      1 thread:

      | Sample rate | 0 | 1 | 100 |
      | -- | -- | -- | -- |
      | 1 MB block cache size | 13,094 | 13,166 | 13,341 |
      | 10 GB block cache size | 202,243 | 188,677 | 229,182 |

      16 threads:

      | Sample rate | 0 | 1 | 100 |
      | -- | -- | -- | -- |
      | 1 MB block cache size | 208,761 | 178,700 | 201,872 |
      | 10 GB block cache size | 2,645,996 | 426,295 | 2,587,605 |
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5473
      
      Differential Revision: D15869479
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: 7ae802abe84811281a6af8649f489887cd7c4618
    • Support computing miss ratio curves using sim_cache. (#5449) · 2d1dd5bc
      Committed by haoyuhuang
      Summary:
      This PR adds a BlockCacheTraceSimulator that reports the miss ratios given different cache configurations. A cache configuration contains "cache_name,num_shard_bits,cache_capacities". For example, "lru, 1, 1K, 2K, 4M, 4G".
      
      When we replay the trace, we also perform lookups and inserts on the simulated caches.
      In the end, it reports the miss ratio for each tuple <cache_name, num_shard_bits, cache_capacity> in an output file.
      
      This PR also adds a block_cache_trace_analyzer main source so that we can run the analyzer from the command line.
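
      To make the configuration format concrete, a sketch that parses "cache_name,num_shard_bits,cache_capacities" strings such as the example above (the parser is illustrative, not the analyzer's actual code):
      ```
      #include <cstdint>
      #include <iostream>
      #include <sstream>
      #include <string>
      #include <vector>

      struct CacheConfig {
        std::string cache_name;
        uint32_t num_shard_bits = 0;
        std::vector<uint64_t> cache_capacities;  // bytes
      };

      // Parse suffixes K/M/G into byte counts.
      uint64_t ParseBytes(std::string tok) {
        uint64_t mult = 1;
        char suffix = tok.back();
        if (suffix == 'K') mult = 1024ULL;
        else if (suffix == 'M') mult = 1024ULL * 1024;
        else if (suffix == 'G') mult = 1024ULL * 1024 * 1024;
        if (mult != 1) tok.pop_back();
        return std::stoull(tok) * mult;
      }

      CacheConfig ParseCacheConfig(const std::string& line) {
        CacheConfig cfg;
        std::stringstream ss(line);
        std::string tok;
        for (int i = 0; std::getline(ss, tok, ','); ++i) {
          tok.erase(0, tok.find_first_not_of(' '));  // trim leading spaces
          if (i == 0) cfg.cache_name = tok;
          else if (i == 1) cfg.num_shard_bits = std::stoul(tok);
          else cfg.cache_capacities.push_back(ParseBytes(tok));
        }
        return cfg;
      }

      int main() {
        CacheConfig cfg = ParseCacheConfig("lru, 1, 1K, 2K, 4M, 4G");
        std::cout << cfg.cache_name << " shards=2^" << cfg.num_shard_bits << "\n";
        for (uint64_t c : cfg.cache_capacities) std::cout << c << " bytes\n";
      }
      ```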
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5449
      
      Test Plan:
      Added tests for block_cache_trace_analyzer.
      COMPILE_WITH_ASAN=1 make check -j32.
      
      Differential Revision: D15797073
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: aef0c5c2e7938f3e8b6a10d4a6a50e6928ecf408
    • Override check consistency for DBImplSecondary (#5469) · 7d8d5641
      Committed by Yanqin Jin
      Summary:
      `DBImplSecondary` calls `CheckConsistency()` during open. In the past, `DBImplSecondary` did not override this function, thus `DBImpl::CheckConsistency()` was called.
      The following can happen: the secondary instance performs a consistency check, which calls `GetFileSize(file_path)`, but the file at `file_path` has been deleted by the primary instance. `DBImpl::CheckConsistency` does not account for this and fails the check, which is undesirable. The solution: call `DBImpl::CheckConsistency()` first; if it passes, we are done. If not, give it a second chance and handle the case of file(s) having been deleted, as in the sketch below.
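
      A toy sketch of that two-pass shape (all names are hypothetical; the real override lives in DBImplSecondary):
      ```
      #include <iostream>
      #include <string>
      #include <unordered_map>
      #include <vector>

      struct FileMeta { std::string path; uint64_t expected_size; };

      // Toy "filesystem": 000008.sst is listed in the secondary's MANIFEST
      // snapshot but was deleted by the primary after the snapshot was taken.
      std::unordered_map<std::string, uint64_t> g_fs = {{"000007.sst", 1024}};

      bool StrictCheck(const std::vector<FileMeta>& files, std::string* err) {
        for (const auto& f : files) {
          auto it = g_fs.find(f.path);
          if (it == g_fs.end() || it->second != f.expected_size) {
            *err = f.path + " missing or size mismatch";
            return false;
          }
        }
        return true;
      }

      // Second chance: tolerate files that no longer exist, but still reject
      // files that exist with the wrong size.
      bool SecondaryCheck(const std::vector<FileMeta>& files, std::string* err) {
        if (StrictCheck(files, err)) return true;
        for (const auto& f : files) {
          auto it = g_fs.find(f.path);
          if (it != g_fs.end() && it->second != f.expected_size) return false;
        }
        err->clear();
        return true;
      }

      int main() {
        std::vector<FileMeta> files = {{"000007.sst", 1024}, {"000008.sst", 512}};
        std::string err;
        std::cout << "strict: " << StrictCheck(files, &err) << " (" << err << ")\n";
        std::cout << "secondary: " << SecondaryCheck(files, &err) << "\n";  // 1
      }
      ```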
      
      Test plan (on dev server):
      ```
      $make clean && make -j20 all
      $./db_secondary_test
      ```
      All other existing unit tests must pass as well.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5469
      
      Differential Revision: D15861845
      
      Pulled By: riversand963
      
      fbshipit-source-id: 507d72392508caed3cd003bb2e2aa43f993dd597
    • Persistent Stats: persist stats history to disk (#5046) · 671d15cb
      Committed by Zhongyi Xie
      Summary:
      This PR continues the work in https://github.com/facebook/rocksdb/pull/4748 and https://github.com/facebook/rocksdb/pull/4535 by adding a new DBOption `persist_stats_to_disk`, which instructs RocksDB to persist stats history to RocksDB itself. When statistics are enabled and both `stats_persist_period_sec` and `persist_stats_to_disk` are set, RocksDB periodically writes stats to a built-in column family in the following form: key -> (timestamp in microseconds)#(stats name), value -> stats value. The existing `GetStatsHistory` API detects the current value of `persist_stats_to_disk` and reads either from the in-memory data structure or from the hidden column family on disk.
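
      A minimal sketch of enabling the feature and reading the history back through the existing GetStatsHistory API (the path and period are placeholders):
      ```
      #include <cassert>
      #include <cstdint>
      #include <iostream>
      #include <limits>
      #include <memory>
      #include "rocksdb/db.h"
      #include "rocksdb/statistics.h"
      #include "rocksdb/stats_history.h"

      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        options.statistics = rocksdb::CreateDBStatistics();
        options.stats_persist_period_sec = 5;  // snapshot stats every 5s
        options.persist_stats_to_disk = true;  // new option: use the hidden CF

        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/stats_demo", &db);
        assert(s.ok());

        // ... run the workload and wait for a few persist periods ...

        std::unique_ptr<rocksdb::StatsHistoryIterator> it;
        s = db->GetStatsHistory(0 /*start_time*/,
                                std::numeric_limits<uint64_t>::max(), &it);
        assert(s.ok());
        for (; it->Valid(); it->Next()) {
          for (const auto& kv : it->GetStatsMap()) {
            std::cout << it->GetStatsTime() << " " << kv.first << "="
                      << kv.second << "\n";
          }
        }
        delete db;
      }
      ```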
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5046
      
      Differential Revision: D15863138
      
      Pulled By: miasantreble
      
      fbshipit-source-id: bb82abdb3f2ca581aa42531734ac799f113e931b
    • Make db_bloom_filter_test parallel (#5467) · ee294c24
      Committed by Maysam Yabandeh
      Summary:
      When run under TSAN, it sometimes exceeds 10 minutes and times out. The slowest are the six instances of `DBBloomFilterTestWithParam.BloomFilter`. Running the tests in parallel should take care of the timeout issue.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5467
      
      Differential Revision: D15856912
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 26c43c55312974c1b809c070342dee037d0219f4