1. 10 May, 2018 2 commits
    • A
      Apply use_direct_io_for_flush_and_compaction to writes only · 072ae671
      Andrew Kryczka committed
      Summary:
      Previously `DBOptions::use_direct_io_for_flush_and_compaction=true` combined with `DBOptions::use_direct_reads=false` could cause RocksDB to simultaneously read from two file descriptors for the same file, where background reads used direct I/O and foreground reads used buffered I/O. Our measurements found this mixed-mode I/O negatively impacted foreground read performance compared to when only buffered I/O was used.
      
      This PR makes the mixed-mode I/O situation impossible by repurposing `DBOptions::use_direct_io_for_flush_and_compaction` to only apply to background writes, and `DBOptions::use_direct_reads` to apply to all reads. There is no risk of background direct writes happening simultaneously with buffered reads since we never read from and write to the same file simultaneously.
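      
      The rule described above can be sketched as a small decision function. This is only an illustration, not RocksDB code: `IoOptions`, `IoKind` and `UsesDirectIo` are hypothetical names, and only the two option names come from the summary.
      
      ```cpp
      #include <cassert>
      
      // Hypothetical stand-in for the two DBOptions flags named in the summary.
      struct IoOptions {
        bool use_direct_reads = false;
        bool use_direct_io_for_flush_and_compaction = false;
      };
      
      // The kinds of I/O the summary talks about (illustrative only).
      enum class IoKind { kForegroundRead, kBackgroundRead, kBackgroundWrite };
      
      // Returns true if the given kind of I/O would use direct I/O under the
      // semantics described in this commit message: one flag governs all
      // reads, the other governs background (flush/compaction) writes only.
      bool UsesDirectIo(const IoOptions& opts, IoKind kind) {
        switch (kind) {
          case IoKind::kForegroundRead:
          case IoKind::kBackgroundRead:
            return opts.use_direct_reads;
          case IoKind::kBackgroundWrite:
            return opts.use_direct_io_for_flush_and_compaction;
        }
        assert(false && "unhandled IoKind");
        return false;
      }
      ```
      
      With this rule, foreground and background reads always agree on the I/O mode, so the mixed-mode read case cannot arise whatever the combination of flags.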
      Closes https://github.com/facebook/rocksdb/pull/3829
      
      Differential Revision: D7915443
      
      Pulled By: ajkr
      
      fbshipit-source-id: 78bcbf276449b7e7766ab6b0db246f789fb1b279
    • S
      Disallow to open RandomRW file if the file doesn't exist · 3690276e
      Siying Dong committed
      Summary:
      The only use of RandomRW is to change the seqno when bulk loading, and in that use case the file should already exist. Opening the file should fail if it does not exist.
      Closes https://github.com/facebook/rocksdb/pull/3827
      
      Differential Revision: D7913719
      
      Pulled By: siying
      
      fbshipit-source-id: 62cf6734f1a6acb9e14f715b927da388131c3492
  2. 27 Apr, 2018 1 commit
  3. 24 Apr, 2018 2 commits
    • G
      Support lowering CPU priority of background threads · 090c78a0
      Gabriel Wicke committed
      Summary:
      Background activities like compaction can negatively affect
      latency of higher-priority tasks like request processing. To avoid this,
      rocksdb already lowers the IO priority of background threads on Linux
      systems. While this takes care of typical IO-bound systems, it does not
      help much when CPU (temporarily) becomes the bottleneck. This is
      especially likely when using more expensive compression settings.
      
      This patch adds an API to allow for lowering the CPU priority of
      background threads, modeled on the IO priority API. Benchmarks (see
      below) show significant latency and throughput improvements when CPU
      bound. As a result, workloads with some CPU usage bursts should benefit
      from lower latencies at a given utilization, or should be able to push
      utilization higher at a given request latency target.
      
      A useful side effect is that compaction CPU usage is now easily visible
      in common tools, allowing for an easier estimation of the contribution
      of compaction vs. request processing threads.
      
      As with IO priority, the implementation is limited to Linux, degrading
      to a no-op on other systems.
      Closes https://github.com/facebook/rocksdb/pull/3763
      
      Differential Revision: D7740096
      
      Pulled By: gwicke
      
      fbshipit-source-id: e5d32373e8dc403a7b0c2227023f9ce4f22b413c
    • M
      Improve write time breakdown stats · affe01b0
      Mike Kolupaev committed
      Summary:
      There's a group of stats in PerfContext for profiling the write path. They break down the write time into WAL write, memtable insert, throttling, and everything else. We use these stats a lot for figuring out the cause of slow writes.
      
      These stats got a bit out of date and are now categorizing some interesting things as "everything else", and also do some double counting. This PR fixes it and adds two new stats: time spent waiting for other threads of the batch group, and time spent waiting for scheduling flushes/compactions. Probably these will be enough to explain all the occasional abnormally slow (multiple seconds) writes that we're seeing.
      Closes https://github.com/facebook/rocksdb/pull/3602
      
      Differential Revision: D7251562
      
      Pulled By: al13n321
      
      fbshipit-source-id: 0a2d0f5a4fa5677455e1f566da931cb46efe2a0d
  4. 21 Apr, 2018 1 commit
  5. 19 Apr, 2018 2 commits
    • Y
      Add block cache related DB properties · ad511684
      Yi Wu committed
      Summary:
      Add DB properties "rocksdb.block-cache-capacity", "rocksdb.block-cache-usage", "rocksdb.block-cache-pinned-usage" to show block cache usage.
      Closes https://github.com/facebook/rocksdb/pull/3734
      
      Differential Revision: D7657180
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: dd34a019d5878dab539c51ee82669e97b2b745fd
    • A
      include thread-pool priority in thread names · 3cea6139
      Andrew Kryczka committed
      Summary:
      Previously threads were named "rocksdb:bg\<index in thread pool\>", so the first thread in all thread pools would be named "rocksdb:bg0". Users want to be able to distinguish threads used for flush (high-pri) vs regular compaction (low-pri) vs compaction to bottom-level (bottom-pri). So I changed the thread naming convention to include the thread-pool priority.
      Closes https://github.com/facebook/rocksdb/pull/3702
      
      Differential Revision: D7581415
      
      Pulled By: ajkr
      
      fbshipit-source-id: ce04482b6acd956a401ef22dc168b84f76f7d7c1
  6. 13 Apr, 2018 1 commit
    • M
      WritePrepared Txn: rollback_merge_operands hack · d15397ba
      Maysam Yabandeh committed
      Summary:
      This is a hack that serves as a temporary fix for MyRocks's rollback of merge operands. MyRocks uses merge operands without the protection of locks, which violates the assumption behind the rollback algorithm. They are OK with merge operands not being rolled back, as that would just create a gap in the autoincrement column. The hack adds an option that disables the rollback of merge operands by default and enables it only to let the unit test pass.
      Closes https://github.com/facebook/rocksdb/pull/3711
      
      Differential Revision: D7597177
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 544be0f666c7e7abb7f651ec8b23124e05056728
  7. 10 Apr, 2018 1 commit
    • M
      Fix the memory leak with pinned partitioned filters · d2bcd761
      Maysam Yabandeh committed
      Summary:
      The existing unit test did not set the level, so the check that pinned partitioned filters/indexes are properly released from the block cache was not exercised, as pinning only takes effect in level 0. As a result, a memory leak in pinned partitioned filters was hidden. The patch fixes the test as well as the bug.
      Closes https://github.com/facebook/rocksdb/pull/3692
      
      Differential Revision: D7559763
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 55eff274945838af983c764a7d71e8daff092e4a
  8. 07 Apr, 2018 1 commit
  9. 06 Apr, 2018 1 commit
    • A
      protect valid backup files when max_valid_backups_to_open is set · faba3fb5
      Andrew Kryczka committed
      Summary:
      When `max_valid_backups_to_open` is set, the `BackupEngine` doesn't know about the files referenced by existing backups. This PR prevents us from deleting valid files when that option is set, in cases where we are unable to accurately determine refcount. There are warnings logged when we may miss deleting unreferenced files, and a recommendation in the header for users to periodically unset this option and run a full `GarbageCollect`.
      Closes https://github.com/facebook/rocksdb/pull/3518
      
      Differential Revision: D7008331
      
      Pulled By: ajkr
      
      fbshipit-source-id: 87907f964dc9716e229d08636a895d2fc7b72305
  10. 03 Apr, 2018 1 commit
    • S
      Level Compaction with TTL · 04c11b86
      Sagar Vemuri committed
      Summary:
      Level Compaction with TTL.
      
      As of today, a file could exist in the LSM tree without going through the compaction process for a really long time if there are no updates to the data in the file's key range. For example, in certain use cases, the keys are not actually "deleted"; instead they are just set to empty values. There might not be any more writes to this "deleted" key range, and if so, such data could remain in the LSM for a really long time resulting in wasted space.
      
      Introducing a TTL could solve this problem. Files (and, in turn, data) older than TTL will be scheduled for compaction when there is no other background work. This will make the data go through the regular compaction process and get rid of old unwanted data.
      This also has the (good) side-effect of all the data in the non-bottommost level being newer than ttl, and all data in the bottommost level older than ttl. It could lead to more writes while reducing space.
      
      This functionality can be controlled by the newly introduced column family option -- ttl.
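      
      As a rough sketch of the selection rule described above (files older than the TTL become compaction candidates), the following self-contained example uses a hypothetical `SstFile` struct and `FilesExpiredByTtl` helper. Neither is RocksDB's actual data structure or picker, and treating a TTL of 0 as "disabled" is an assumption made for the illustration.
      
      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <vector>
      
      // Hypothetical, simplified file metadata; not RocksDB's FileMetaData.
      struct SstFile {
        uint64_t number;         // file number
        uint64_t creation_time;  // seconds since epoch
      };
      
      // Returns the numbers of the files whose age exceeds ttl_seconds.
      // ttl_seconds == 0 is treated as "TTL disabled" (an assumption here).
      std::vector<uint64_t> FilesExpiredByTtl(const std::vector<SstFile>& files,
                                              uint64_t now,
                                              uint64_t ttl_seconds) {
        std::vector<uint64_t> expired;
        if (ttl_seconds == 0) {
          return expired;
        }
        for (const SstFile& f : files) {
          assert(f.creation_time <= now);  // files cannot come from the future
          if (now - f.creation_time > ttl_seconds) {
            expired.push_back(f.number);
          }
        }
        return expired;
      }
      ```
      
      In this sketch the caller would hand the returned file numbers to the compaction scheduler only when no other background work is pending, matching the description above.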
      
      TODO for later:
      - Make ttl mutable
      - Extend TTL to Universal compaction as well? (TTL is already supported in FIFO)
      - Maybe deprecate CompactionOptionsFIFO.ttl in favor of this new ttl option.
      Closes https://github.com/facebook/rocksdb/pull/3591
      
      Differential Revision: D7275442
      
      Pulled By: sagar0
      
      fbshipit-source-id: dcba484717341200d419b0953dafcdf9eb2f0267
  11. 27 Mar, 2018 2 commits
    • A
      Align SST file data blocks to avoid spanning multiple pages · f9f4d40f
      Anand Ananthabhotla committed
      Summary:
      Provide a block_align option in BlockBasedTableOptions to allow
      alignment of SST file data blocks. This avoids the higher
      IOPS/throughput load caused by data blocks smaller than 4KB spanning
      two 4KB pages. When this option is set to true, the block alignment
      is set to the lower of the block size and 4KB.
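      
      To illustrate why alignment helps, the sketch below counts how many 4KB pages a block read touches before and after its start offset is rounded up to the alignment. The helper names (`BlockAlignment`, `AlignUp`, `PagesTouched`) are hypothetical and the padding scheme is simplified; it also assumes the alignment is a power of two. It is not the table builder's actual code.
      
      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstdint>
      
      constexpr uint64_t kPageSize = 4096;
      
      // Alignment as described in the summary: the lower of block size and 4KB.
      uint64_t BlockAlignment(uint64_t block_size) {
        return std::min(block_size, kPageSize);
      }
      
      // Rounds offset up to the next multiple of alignment.
      // Assumes alignment is a non-zero power of two.
      uint64_t AlignUp(uint64_t offset, uint64_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        return (offset + alignment - 1) & ~(alignment - 1);
      }
      
      // Number of 4KB pages touched when reading size bytes starting at offset.
      uint64_t PagesTouched(uint64_t offset, uint64_t size) {
        assert(size > 0);
        return (offset + size - 1) / kPageSize - offset / kPageSize + 1;
      }
      ```
      
      For example, a 2000-byte block starting at offset 3000 touches two pages, while the same block starting at `AlignUp(3000, 4096)` touches one.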
      Closes https://github.com/facebook/rocksdb/pull/3502
      
      Differential Revision: D7400897
      
      Pulled By: anand1976
      
      fbshipit-source-id: 04cc3bd144e88e3431a4f97604e63ad7a0f06d44
    • M
      Fix race condition via concurrent FlushWAL · 35a4469b
      Maysam Yabandeh committed
      Summary:
      Currently log_writer->AddRecord in WriteImpl is protected from concurrent calls via FlushWAL only if the two_write_queues_ option is set. The patch fixes the problem by i) skipping log_writer->AddRecord in FlushWAL if manual_wal_flush is not set, and ii) protecting log_writer->AddRecord in WriteImpl via log_write_mutex_ if manual_wal_flush_ is set but two_write_queues_ is not.
      
      Fixes #3599
      Closes https://github.com/facebook/rocksdb/pull/3656
      
      Differential Revision: D7405608
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d6cc265051c77ae49c7c6df4f427350baaf46934
  12. 24 Mar, 2018 1 commit
    • S
      Add Java-API-Changes section to History · 7ffce280
      Sagar Vemuri committed
      Summary:
      We have not been updating our HISTORY.md change log with the RocksJava changes. Going forward, let's also add Java changes to HISTORY.md.
      There is an old java/HISTORY-JAVA.md, but it hasn't been updated in years. It is much easier to remember to update the change log in a single file, HISTORY.md.
      
      I added information about shared block cache here, which was introduced in #3623.
      Closes https://github.com/facebook/rocksdb/pull/3647
      
      Differential Revision: D7384448
      
      Pulled By: sagar0
      
      fbshipit-source-id: 9b6e569f44e6df5cb7ba06413d9975df0b517d20
  13. 23 Mar, 2018 3 commits
  14. 15 Mar, 2018 1 commit
    • A
      Fix WAL corruption from checkpoint/backup race condition · 0cdaa1a8
      Andrew Kryczka committed
      Summary:
      `Writer::WriteBuffer` was always called at the beginning of checkpoint/backup. But that log writer has no internal synchronization, which meant the same buffer could be flushed twice in a race condition case, causing a WAL entry to be duplicated. Then subsequent WAL entries would be at unexpected offsets, causing the 32KB block boundaries to be overlapped and manifesting as a corruption.
      
      This PR fixes the behavior to only use `WriteBuffer` (via `FlushWAL`) in checkpoint/backup when manual WAL flush is enabled. In that case, users are responsible for providing synchronization between WAL flushes. We can also consider removing the call entirely.
      Closes https://github.com/facebook/rocksdb/pull/3603
      
      Differential Revision: D7277447
      
      Pulled By: ajkr
      
      fbshipit-source-id: 1b15bd7fd930511222b075418c10de0aaa70a35a
  15. 09 Mar, 2018 1 commit
  16. 07 Mar, 2018 2 commits
    • A
      Disallow compactions if there isn't enough free space · 0a3db28d
      amytai committed
      Summary:
      This diff handles cases where compaction causes an ENOSPC error.
      This does not handle corner cases where another background job is started while compaction is running, and the other background job triggers ENOSPC, although we do allow the user to provision for these background jobs with SstFileManager::SetCompactionBufferSize.
      It also does not handle the case where compaction has finished and some other background job independently triggers ENOSPC.
      
      Usage: The functionality is inside SstFileManager. In particular, users should call SstFileManager::SetMaxAllowedSpaceUsage, which sets the reference high watermark for determining whether to cancel compactions.
      Closes https://github.com/facebook/rocksdb/pull/3449
      
      Differential Revision: D7016941
      
      Pulled By: amytai
      
      fbshipit-source-id: 8965ab8dd8b00972e771637a41b4e6c645450445
    • A
      Enable subcompactions in manual level-based compaction · 20c508c1
      Andrew Kryczka committed
      Summary:
      This is the simplest way I could think of to speed up `CompactRange`. It works but isn't that optimal because it relies on the same `max_compaction_bytes` and `max_subcompactions` options that are used in other places. If it turns out to be useful we can allow overriding these in `CompactRangeOptions` in the future.
      Closes https://github.com/facebook/rocksdb/pull/3549
      
      Differential Revision: D7117634
      
      Pulled By: ajkr
      
      fbshipit-source-id: d0cd03d6bd0d2fd7ea3fb13cd3b8bf7c47d11e42
  17. 03 Mar, 2018 1 commit
    • Y
      Blob DB: remove existing garbage collection implementation · 1209b6db
      Yi Wu committed
      Summary:
      Red diff to remove the existing implementation of garbage collection. The current approach is a reference-counting kind of approach and requires a lot of effort to get the size counter right on compaction and deletion. I'm going to go with a simple mark-sweep kind of approach and will send another PR for that.
      
      CompactionEventListener was added solely for blob db and it adds complexity and overhead to compaction iterator. Removing it as well.
      Closes https://github.com/facebook/rocksdb/pull/3551
      
      Differential Revision: D7130190
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: c3a375ad2639a3f6ed179df6eda602372cc5b8df
  18. 02 Mar, 2018 2 commits
    • M
      Fix a leak in prepared_section_completed_ · d060421c
      Maysam Yabandeh committed
      Summary:
      The zeroed entries were not removed from the prepared_section_completed_ map. This patch adds a unit test to show the problem and fixes it by refactoring the code. The new code is more efficient since i) it uses two separate mutexes to avoid contention between commit and prepare threads, and ii) it uses a sorted vector for maintaining unique log entries with prepare, which avoids a very large heap with many duplicate entries.
      Closes https://github.com/facebook/rocksdb/pull/3545
      
      Differential Revision: D7106071
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: b3ae17cb6cd37ef10b6b35e0086c15c758768a48
    • Y
      Add "rocksdb.live-sst-files-size" DB property · bf937cf1
      Yi Wu committed
      Summary:
      Add the "rocksdb.live-sst-files-size" DB property, which only includes files of the latest version. The existing "rocksdb.total-sst-files-size" includes files from all versions and thus includes files that are obsolete but not yet deleted. I'm going to use this new property to cap blob db sst + blob files size.
      Closes https://github.com/facebook/rocksdb/pull/3548
      
      Differential Revision: D7116939
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: c6a52e45ce0f24ef78708156e1a923c1dd6bc79a
  19. 28 Feb, 2018 1 commit
    • A
      skip CompactRange flush based on memtable contents · 3ae00472
      Andrew Kryczka committed
      Summary:
      CompactRange has a call to Flush because we guarantee that, at the time it's called, all existing keys in the range will be pushed through the user's compaction filter. However, previously the flush was done blindly, so it'd happen even if the memtable does not contain keys in the range specified by the user. This caused unnecessarily many L0 files to be created, leading to write stalls in some cases. This PR checks the memtable's contents, and decides to flush only if it overlaps with `CompactRange`'s range.
      
      - Move the memtable overlap check logic from `ExternalSstFileIngestionJob` to `ColumnFamilyData::RangesOverlapWithMemtables`
      - Reuse the above logic in `CompactRange` and skip flushing if no overlap
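      
      The flush decision described above boils down to a range-overlap test. The sketch below shows that idea with a hypothetical `KeyRange` struct and plain bytewise string comparison. It compares only bounding ranges, whereas the real `ColumnFamilyData::RangesOverlapWithMemtables` check may be more precise, so treat this as an illustration under those assumptions.
      
      ```cpp
      #include <cassert>
      #include <string>
      
      // Closed key range [smallest, largest] under bytewise ordering.
      // Hypothetical type, used only for this illustration.
      struct KeyRange {
        std::string smallest;
        std::string largest;
      };
      
      // Two closed ranges overlap when each one starts before the other ends.
      bool RangesOverlap(const KeyRange& a, const KeyRange& b) {
        assert(a.smallest <= a.largest && b.smallest <= b.largest);
        return a.smallest <= b.largest && b.smallest <= a.largest;
      }
      
      // Flush before the manual compaction only when the memtable could hold
      // keys inside the range requested by the user.
      bool ShouldFlushBeforeCompactRange(bool memtable_empty,
                                         const KeyRange& memtable,
                                         const KeyRange& requested) {
        return !memtable_empty && RangesOverlap(memtable, requested);
      }
      ```
      
      A memtable covering keys "b" through "d" would therefore not be flushed for a request on "e" through "g", which is the case that previously produced unnecessary L0 files.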
      Closes https://github.com/facebook/rocksdb/pull/3520
      
      Differential Revision: D7018897
      
      Pulled By: ajkr
      
      fbshipit-source-id: a3c6b1cfae56687b49dd89ccac7c948e53545934
  20. 23 Feb, 2018 2 commits
  21. 21 Feb, 2018 1 commit
    • A
      fix handling of empty string as checkpoint directory · 1960e73e
      Andrew Kryczka committed
      Summary:
      - made `CreateCheckpoint` properly return `InvalidArgument` when called with an empty directory. Previously it triggered an assertion failure due to a bug in the logic.
      - made `ldb` set empty `checkpoint_dir` if that's what the user specifies, so that we can use it to properly test `CreateCheckpoint` in the future.
      
      Differential Revision: D6874562
      
      fbshipit-source-id: dcc1bd41768261d9338987fa7711444289707ed7
  22. 13 Feb, 2018 1 commit
    • A
      Add delay before flush in CompactRange to avoid write stalling · ee1c8026
      Andrew Kryczka committed
      Summary:
      - Refactored logic for checking write stall condition to a helper function: `GetWriteStallConditionAndCause`. Now it is decoupled from the logic for updating WriteController / stats in `RecalculateWriteStallConditions`, so we can reuse it for predicting whether write stall will occur.
      - Updated `CompactRange` to first check whether the one additional immutable memtable / L0 file would cause stalling before it flushes. If so, it waits until that is no longer true.
      - Updated `bg_cv_` to be signaled on `SetOptions` calls. The stall conditions `CompactRange` cares about can change when (1) flush finishes, (2) compaction finishes, or (3) options dynamically change. The cv was already signaled for (1) and (2) but not yet for (3).
      Closes https://github.com/facebook/rocksdb/pull/3381
      
      Differential Revision: D6754983
      
      Pulled By: ajkr
      
      fbshipit-source-id: 5613e03f1524df7192dc6ae885d40fd8f091d972
  23. 06 Feb, 2018 1 commit
  24. 31 Jan, 2018 1 commit
  25. 26 Jan, 2018 1 commit
    • S
      Improve performance of long range scans with readahead · d938226a
      Sagar Vemuri committed
      Summary:
      This change improves the performance of iterators doing long range scans (e.g. big/full table scans in MyRocks) by using readahead and prefetching additional data on each disk IO. This prefetching is automatically enabled on noticing more than 2 IOs for the same table file during iteration. The readahead size starts with 8KB and is exponentially increased on each additional sequential IO, up to a max of 256 KB. This helps in cutting down the number of IOs needed to complete the range scan.
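      
      The readahead growth described above can be sketched as a simple schedule. The constants (8KB start, 256KB cap, more than 2 IOs before enabling) come from the summary, but the doubling factor is an assumption, since the text only says the size is "exponentially increased". `ReadaheadSchedule` is a hypothetical helper, not the iterator's actual code.
      
      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstddef>
      #include <vector>
      
      constexpr size_t kInitReadahead = 8 * 1024;   // starting size from the summary
      constexpr size_t kMaxReadahead = 256 * 1024;  // cap from the summary
      
      // Returns the readahead size applied to each of num_ios sequential reads
      // of one table file. 0 means "no readahead yet". Doubling per IO is an
      // assumption made for this illustration.
      std::vector<size_t> ReadaheadSchedule(int num_ios) {
        assert(num_ios >= 0);
        std::vector<size_t> sizes;
        size_t next = kInitReadahead;
        for (int io = 1; io <= num_ios; ++io) {
          if (io <= 2) {
            sizes.push_back(0);  // enabled only after more than 2 IOs
            continue;
          }
          sizes.push_back(next);
          next = std::min(next * 2, kMaxReadahead);
        }
        return sizes;
      }
      ```
      
      Under these assumptions the cap is reached on the eighth IO to a file, after which every read prefetches 256KB.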
      
      Constraints:
      - The prefetched data is stored by the OS in the page cache, so this currently works only for non-direct-read use cases, i.e., applications that use the page cache. (Direct-I/O support will be enabled in a later PR.)
      - This is currently enabled only when ReadOptions.readahead_size = 0 (which is the default value).
      
      Thanks to siying for the original idea and implementation.
      
      **Benchmarks:**
      Data fill:
      ```
      TEST_TMPDIR=/data/users/$USER/benchmarks/iter ./db_bench -benchmarks=fillrandom -num=1000000000 -compression_type="none" -level_compaction_dynamic_level_bytes
      ```
      Do a long range scan: Seekrandom with large number of nexts
      ```
      TEST_TMPDIR=/data/users/$USER/benchmarks/iter ./db_bench -benchmarks=seekrandom -duration=60 -num=1000000000 -use_existing_db -seek_nexts=10000 -statistics -histogram
      ```
      
      Page cache was cleared before each experiment with the command:
      ```
      sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
      ```
      ```
      Before:
      seekrandom   :   34020.945 micros/op 29 ops/sec;   32.5 MB/s (1636 of 1999 found)
      With this change:
      seekrandom   :    8726.912 micros/op 114 ops/sec;  126.8 MB/s (5702 of 6999 found)
      ```
      ~3.9X performance improvement.
      
      Also verified with strace and gdb that the readahead size is increasing as expected.
      ```
      strace -e readahead -f -T -t -p <db_bench process pid>
      ```
      Closes https://github.com/facebook/rocksdb/pull/3282
      
      Differential Revision: D6586477
      
      Pulled By: sagar0
      
      fbshipit-source-id: 8a118a0ed4594fbb7f5b1cafb242d7a4033cb58c
  26. 24 Jan, 2018 1 commit
  27. 19 Jan, 2018 1 commit
    • Y
      Fix Flush() keep waiting after flush finish · f1cb83fc
      Yi Wu committed
      Summary:
      A Flush() call could wait indefinitely if min_write_buffer_number_to_merge is used. Consider the sequence:
      1. The user calls Flush() with flush_options.wait = true.
      2. The manual flush starts in the background.
      3. A new memtable becomes immutable because of writes. The new memtable will not trigger a flush if min_write_buffer_number_to_merge is not reached.
      4. The manual flush finishes.
      
      Because the new memtable created at step 3 is not flushed, the previous logic of WaitForFlushMemTable() keeps waiting, even though the memtables it intended to flush have been flushed.
      
      Here, instead of only checking whether there are any more memtables to flush, WaitForFlushMemTable() also checks the id of the earliest memtable. If that id is larger than the id of the latest memtable at the time the flush was initiated, all the memtables that existed when the flush started have been flushed.
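      
      The new wait condition can be sketched as a predicate over the ids of the memtables still waiting to be flushed. `ManualFlushDone` and its parameters are hypothetical names for illustration, and the sketch assumes memtable ids increase with creation order. It is not the actual WaitForFlushMemTable() code.
      
      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstdint>
      #include <deque>
      
      // unflushed_ids: ids of immutable memtables not yet flushed, oldest first.
      // latest_id_at_flush_start: id of the newest memtable that existed when
      // the manual flush was initiated.
      // Returns true once every memtable that existed at flush start is flushed.
      bool ManualFlushDone(const std::deque<uint64_t>& unflushed_ids,
                           uint64_t latest_id_at_flush_start) {
        // Assumes ids grow with creation order, so the front is the earliest.
        assert(std::is_sorted(unflushed_ids.begin(), unflushed_ids.end()));
        if (unflushed_ids.empty()) {
          return true;  // the old condition: nothing left to flush
        }
        // The added condition: only newer memtables remain.
        return unflushed_ids.front() > latest_id_at_flush_start;
      }
      ```
      
      In the sequence above, if the flush started when the latest memtable id was 5 and only memtable 6 (created at step 3) remains, the predicate returns true and the waiter can stop, whereas the old "nothing left" check would keep waiting.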
      Closes https://github.com/facebook/rocksdb/pull/3378
      
      Differential Revision: D6746789
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 35e698f71c7f90b06337a93e6825f4ea3b619bfa
  28. 18 Jan, 2018 1 commit
    • A
      fix live WALs purged while file deletions disabled · 46e599fc
      Andrew Kryczka committed
      Summary:
      When calling `DisableFileDeletions` followed by `GetSortedWalFiles`, we guarantee the files returned by the latter call won't be deleted until after file deletions are re-enabled. However, `GetSortedWalFiles` didn't omit files already planned for deletion via `PurgeObsoleteFiles`, so the guarantee could be broken.
      
      We fix it by making `GetSortedWalFiles` wait for the number of pending purges to hit zero if file deletions are disabled. This condition is eventually met since `PurgeObsoleteFiles` is guaranteed to be called for the existing pending purges, and new purges cannot be scheduled while file deletions are disabled. Once the condition is met, `GetSortedWalFiles` simply returns the content of DB and archive directories, which nobody can delete (except for deletion scheduler, for which I plan to fix this bug later) until deletions are re-enabled.
      Closes https://github.com/facebook/rocksdb/pull/3341
      
      Differential Revision: D6681131
      
      Pulled By: ajkr
      
      fbshipit-source-id: 90b1e2f2362ea9ef715623841c0826611a817634
  29. 20 Dec, 2017 1 commit
    • Y
      Port 3 way SSE4.2 crc32c implementation from Folly · f54d7f5f
      yingsu00 committed
      Summary:
      **# Summary**
      
      RocksDB uses SSE crc32 intrinsics to calculate crc32 values, but it does so in a single-way fashion (not pipelined on a single CPU core). Intel's whitepaper published an algorithm that uses 3-way pipelining for the crc32 intrinsics and then uses the pclmulqdq intrinsic to combine the values. Because pclmulqdq has its own overhead, this algorithm shows perf gains on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB calls crc32c on are over 4KB. Initial db_bench runs show a tremendous CPU gain.
      
      This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag, NO_THREEWAY_CRC32C. If the user compiles the code with NO_THREEWAY_CRC32C=1, the old SSE Crc32c algorithm is used. If the server does not have SSE4.2 at run time, the slow (non-SSE) path is used.
      
      **# Performance Test Results**
      We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm.
      
      Before this change, the CRC32 value computation took about 11.5% of total CPU cost; with the new 3-way algorithm it dropped to around 4.5%. The overall throughput also improved from 25.53MB/s to 27.63MB/s.
      
      1) ReadRandom in db_bench overall metrics
      
          PER RUN
          Algorithm  | run | micros/op | ops/sec | Throughput (MB/s)
          3-way      | 1   | 4.143     | 241387  | 26.7
          3-way      | 2   | 3.775     | 264872  | 29.3
          3-way      | 3   | 4.116     | 242929  | 26.9
          FastCrc32c | 1   | 4.037     | 247727  | 27.4
          FastCrc32c | 2   | 4.648     | 215166  | 23.8
          FastCrc32c | 3   | 4.352     | 229799  | 25.4
      
          AVG
          Algorithm  | Average of micros/op | Average of ops/sec | Average of Throughput (MB/s)
          3-way      | 4.01                 | 249,729            | 27.63
          FastCrc32c | 4.35                 | 230,897            | 25.53
      
       2)   Crc32c computation CPU cost (inclusive samples percentage)
          PER RUN
          Implementation | run | TotalSamples  | Crc32c percentage
          3-way          | 1   | 4,572,250,000 | 4.37%
          3-way          | 2   | 3,779,250,000 | 4.62%
          3-way          | 3   | 4,129,500,000 | 4.48%
          FastCrc32c     | 1   | 4,663,500,000 | 11.24%
          FastCrc32c     | 2   | 4,047,500,000 | 12.34%
          FastCrc32c     | 3   | 4,366,750,000 | 11.68%
      
       **# Test Plan**
           make -j64 corruption_test && ./corruption_test
            By default it uses 3-way SSE algorithm
      
           NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test
      
          make clean && DEBUG_LEVEL=0 make -j64 db_bench
          make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
      Closes https://github.com/facebook/rocksdb/pull/3173
      
      Differential Revision: D6330882
      
      Pulled By: yingsu00
      
      fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
  30. 12 Dec, 2017 1 commit
  31. 08 Dec, 2017 1 commit