- 21 Jul, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: This diff updates the code to pin the merge operator operands while the merge operation is in progress, so that we can eliminate the memcpy cost. To do that we need a new public API for FullMerge that replaces the std::deque<std::string> with std::vector<Slice>.

This diff is stacked on top of D56493 and D56511.

In this diff we:
- Update FullMergeV2 arguments to be encapsulated in MergeOperationInput and MergeOperationOutput, which will make it easier to add new arguments in the future
- Replace std::deque<std::string> with std::vector<Slice> to pass operands
- Replace MergeContext std::deque with std::vector (based on a simple benchmark I ran https://gist.github.com/IslamAbdelRahman/78fc86c9ab9f52b1df791e58943fb187)
- Allow FullMergeV2 output to be an existing operand

```
[Everything in Memtable | 10K operands | 10 KB each | 1 operand per key]

DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=10000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000

[FullMergeV2]
readseq : 0.607 micros/op 1648235 ops/sec; 16121.2 MB/s
readseq : 0.478 micros/op 2091546 ops/sec; 20457.2 MB/s
readseq : 0.252 micros/op 3972081 ops/sec; 38850.5 MB/s
readseq : 0.237 micros/op 4218328 ops/sec; 41259.0 MB/s
readseq : 0.247 micros/op 4043927 ops/sec; 39553.2 MB/s

[master]
readseq : 3.935 micros/op 254140 ops/sec; 2485.7 MB/s
readseq : 3.722 micros/op 268657 ops/sec; 2627.7 MB/s
readseq : 3.149 micros/op 317605 ops/sec; 3106.5 MB/s
readseq : 3.125 micros/op 320024 ops/sec; 3130.1 MB/s
readseq : 4.075 micros/op 245374 ops/sec; 2400.0 MB/s
```

```
[Everything in Memtable | 10K operands | 10 KB each | 10 operands per key]

DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=1000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000

[FullMergeV2]
readseq : 3.472 micros/op 288018 ops/sec; 2817.1 MB/s
readseq : 2.304 micros/op 434027 ops/sec; 4245.2 MB/s
readseq : 1.163 micros/op 859845 ops/sec; 8410.0 MB/s
readseq : 1.192 micros/op 838926 ops/sec; 8205.4 MB/s
readseq : 1.250 micros/op 800000 ops/sec; 7824.7 MB/s

[master]
readseq : 24.025 micros/op 41623 ops/sec; 407.1 MB/s
readseq : 18.489 micros/op 54086 ops/sec; 529.0 MB/s
readseq : 18.693 micros/op 53495 ops/sec; 523.2 MB/s
readseq : 23.621 micros/op 42335 ops/sec; 414.1 MB/s
readseq : 18.775 micros/op 53262 ops/sec; 521.0 MB/s
```

```
[Everything in Block cache | 10K operands | 10 KB each | 1 operand per key]

[FullMergeV2]
$ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions
readseq : 14.741 micros/op 67837 ops/sec; 663.5 MB/s
readseq : 1.029 micros/op 971446 ops/sec; 9501.6 MB/s
readseq : 0.974 micros/op 1026229 ops/sec; 10037.4 MB/s
readseq : 0.965 micros/op 1036080 ops/sec; 10133.8 MB/s
readseq : 0.943 micros/op 1060657 ops/sec; 10374.2 MB/s

[master]
readseq : 16.735 micros/op 59755 ops/sec; 584.5 MB/s
readseq : 3.029 micros/op 330151 ops/sec; 3229.2 MB/s
readseq : 3.136 micros/op 318883 ops/sec; 3119.0 MB/s
readseq : 3.065 micros/op 326245 ops/sec; 3191.0 MB/s
readseq : 3.014 micros/op 331813 ops/sec; 3245.4 MB/s
```

```
[Everything in Block cache | 10K operands | 10 KB each | 10 operands per key]

DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10-operands-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions

[FullMergeV2]
readseq : 24.325 micros/op 41109 ops/sec; 402.1 MB/s
readseq : 1.470 micros/op 680272 ops/sec; 6653.7 MB/s
readseq : 1.231 micros/op 812347 ops/sec; 7945.5 MB/s
readseq : 1.091 micros/op 916590 ops/sec; 8965.1 MB/s
readseq : 1.109 micros/op 901713 ops/sec; 8819.6 MB/s

[master]
readseq : 27.257 micros/op 36687 ops/sec; 358.8 MB/s
readseq : 4.443 micros/op 225073 ops/sec; 2201.4 MB/s
readseq : 5.830 micros/op 171526 ops/sec; 1677.7 MB/s
readseq : 4.173 micros/op 239635 ops/sec; 2343.8 MB/s
readseq : 4.150 micros/op 240963 ops/sec; 2356.8 MB/s
```

Test Plan: COMPILE_WITH_ASAN=1 make check -j64
Reviewers: yhchiang, andrewkr, sdong
Reviewed By: sdong
Subscribers: lovro, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D57075
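The idea of passing operands as non-owning slices and letting the merge result refer to one of them can be pictured with a small, self-contained sketch. `SliceView`, `MergeIn`, `MergeOut`, and `FullMergeMax` are hypothetical names invented for illustration; they are not the RocksDB `MergeOperationInput` / `MergeOperationOutput` definitions, and the "max" rule is assumed here to be a bytewise maximum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical non-owning view over bytes that are kept alive (pinned) elsewhere.
struct SliceView {
  const char* data;
  std::size_t size;
  SliceView() : data(nullptr), size(0) {}
  explicit SliceView(const std::string& s) : data(s.data()), size(s.size()) {}
  // Bytewise three-way comparison; a shorter view that is a prefix sorts first.
  int compare(const SliceView& other) const {
    const std::size_t n = std::min(size, other.size);
    const int r = (n == 0) ? 0 : std::memcmp(data, other.data, n);
    if (r != 0) return r;
    if (size == other.size) return 0;
    return size < other.size ? -1 : 1;
  }
};

// Hypothetical input bundle: grouping arguments makes later additions easy.
struct MergeIn {
  const SliceView* existing_value;  // nullptr when the key has no base value
  std::vector<SliceView> operands;  // views only, no per-operand string copies
  MergeIn() : existing_value(nullptr) {}
};

// Hypothetical output bundle: either a freshly built value or a view of an input.
struct MergeOut {
  std::string new_value;
  SliceView existing_operand;
  bool uses_existing_operand;
  MergeOut() : uses_existing_operand(false) {}
};

// "max" merge: picks the largest input and returns it by reference, not by copy.
bool FullMergeMax(const MergeIn& in, MergeOut* out) {
  assert(out != nullptr);
  const SliceView* best = in.existing_value;
  for (std::size_t i = 0; i < in.operands.size(); ++i) {
    if (best == nullptr || in.operands[i].compare(*best) > 0) {
      best = &in.operands[i];
    }
  }
  if (best == nullptr) return false;  // nothing to merge
  out->existing_operand = *best;
  out->uses_existing_operand = true;
  return true;
}
```

In this sketch the winning operand is never copied into `new_value`; the caller reads it through `existing_operand`, which is only safe while the underlying bytes stay pinned.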
-
- 14 Jun, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: We have a lot of code duplication: whenever we call FullMerge, we duplicate the instrumentation and statistics code. This is a simple diff that refactors the code to use TimedFullMerge instead of FullMerge. Test Plan: COMPILE_WITH_ASAN=1 make check -j64 Reviewers: andrewkr, yhchiang, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D59577
-
- 04 May, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: We should not use IterKey::SetKey with copy = false unless we are pinning the iterator throughout its lifetime; otherwise we may release the temporarily pinned blocks, in which case the IterKey will be pointing to freed memory. Test Plan: added a new test Reviewers: sdong, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D57561
-
- 03 May, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: This diff is stacked on top of https://reviews.facebook.net/D56493. The current Iterator::Prev() implementation needs to copy every value, since the underlying Iterator may move after reading the value. This can be optimized by making sure that the block containing the value is pinned until the Iterator moves, which improves the throughput by up to 1.5X.

master
```
==> 1000000_Keys_100Byte.txt <==
readreverse : 0.449 micros/op 2225887 ops/sec; 246.2 MB/s
readreverse : 0.433 micros/op 2311508 ops/sec; 255.7 MB/s
readreverse : 0.436 micros/op 2294335 ops/sec; 253.8 MB/s
readreverse : 0.471 micros/op 2121295 ops/sec; 234.7 MB/s
readreverse : 0.465 micros/op 2152227 ops/sec; 238.1 MB/s
readreverse : 0.454 micros/op 2203011 ops/sec; 243.7 MB/s
readreverse : 0.451 micros/op 2216095 ops/sec; 245.2 MB/s
readreverse : 0.462 micros/op 2162447 ops/sec; 239.2 MB/s
readreverse : 0.476 micros/op 2099151 ops/sec; 232.2 MB/s
readreverse : 0.472 micros/op 2120710 ops/sec; 234.6 MB/s
avg : 242.34 MB/s

==> 1000000_Keys_1KB.txt <==
readreverse : 1.013 micros/op 986793 ops/sec; 978.7 MB/s
readreverse : 0.942 micros/op 1061136 ops/sec; 1052.5 MB/s
readreverse : 0.951 micros/op 1051901 ops/sec; 1043.3 MB/s
readreverse : 0.932 micros/op 1072894 ops/sec; 1064.1 MB/s
readreverse : 1.024 micros/op 976720 ops/sec; 968.7 MB/s
readreverse : 0.935 micros/op 1069169 ops/sec; 1060.4 MB/s
readreverse : 1.012 micros/op 988132 ops/sec; 980.1 MB/s
readreverse : 0.962 micros/op 1039579 ops/sec; 1031.1 MB/s
readreverse : 0.991 micros/op 1008924 ops/sec; 1000.7 MB/s
readreverse : 1.004 micros/op 996144 ops/sec; 988.0 MB/s
avg : 1016.76 MB/s

==> 1000000_Keys_10KB.txt <==
readreverse : 4.167 micros/op 239952 ops/sec; 2346.9 MB/s
readreverse : 4.070 micros/op 245713 ops/sec; 2403.3 MB/s
readreverse : 4.572 micros/op 218733 ops/sec; 2139.4 MB/s
readreverse : 4.497 micros/op 222388 ops/sec; 2175.2 MB/s
readreverse : 4.203 micros/op 237920 ops/sec; 2327.1 MB/s
readreverse : 4.206 micros/op 237756 ops/sec; 2325.5 MB/s
readreverse : 4.181 micros/op 239149 ops/sec; 2339.1 MB/s
readreverse : 4.157 micros/op 240552 ops/sec; 2352.8 MB/s
readreverse : 4.187 micros/op 238848 ops/sec; 2336.1 MB/s
readreverse : 4.106 micros/op 243575 ops/sec; 2382.4 MB/s
avg : 2312.78 MB/s

==> 100000_Keys_100KB.txt <==
readreverse : 41.281 micros/op 24224 ops/sec; 2366.0 MB/s
readreverse : 39.722 micros/op 25175 ops/sec; 2458.9 MB/s
readreverse : 40.319 micros/op 24802 ops/sec; 2422.5 MB/s
readreverse : 39.762 micros/op 25149 ops/sec; 2456.4 MB/s
readreverse : 40.916 micros/op 24440 ops/sec; 2387.1 MB/s
readreverse : 41.188 micros/op 24278 ops/sec; 2371.4 MB/s
readreverse : 40.061 micros/op 24962 ops/sec; 2438.1 MB/s
readreverse : 40.221 micros/op 24862 ops/sec; 2428.4 MB/s
readreverse : 40.084 micros/op 24947 ops/sec; 2436.7 MB/s
readreverse : 40.655 micros/op 24597 ops/sec; 2402.4 MB/s
avg : 2416.79 MB/s

==> 10000_Keys_1MB.txt <==
readreverse : 298.038 micros/op 3355 ops/sec; 3355.3 MB/s
readreverse : 335.001 micros/op 2985 ops/sec; 2985.1 MB/s
readreverse : 286.956 micros/op 3484 ops/sec; 3484.9 MB/s
readreverse : 329.954 micros/op 3030 ops/sec; 3030.8 MB/s
readreverse : 306.428 micros/op 3263 ops/sec; 3263.5 MB/s
readreverse : 330.749 micros/op 3023 ops/sec; 3023.5 MB/s
readreverse : 328.903 micros/op 3040 ops/sec; 3040.5 MB/s
readreverse : 324.853 micros/op 3078 ops/sec; 3078.4 MB/s
readreverse : 320.488 micros/op 3120 ops/sec; 3120.3 MB/s
readreverse : 320.536 micros/op 3119 ops/sec; 3119.8 MB/s
avg : 3150.21 MB/s
```

After memcpy elimination
```
==> 1000000_Keys_100Byte.txt <==
readreverse : 0.395 micros/op 2529890 ops/sec; 279.9 MB/s
readreverse : 0.368 micros/op 2715922 ops/sec; 300.5 MB/s
readreverse : 0.384 micros/op 2603929 ops/sec; 288.1 MB/s
readreverse : 0.375 micros/op 2663286 ops/sec; 294.6 MB/s
readreverse : 0.357 micros/op 2802180 ops/sec; 310.0 MB/s
readreverse : 0.363 micros/op 2757684 ops/sec; 305.1 MB/s
readreverse : 0.372 micros/op 2689603 ops/sec; 297.5 MB/s
readreverse : 0.379 micros/op 2638599 ops/sec; 291.9 MB/s
readreverse : 0.375 micros/op 2663803 ops/sec; 294.7 MB/s
readreverse : 0.375 micros/op 2665579 ops/sec; 294.9 MB/s
avg: 295.72 MB/s (1.22 X)

==> 1000000_Keys_1KB.txt <==
readreverse : 0.879 micros/op 1138112 ops/sec; 1128.8 MB/s
readreverse : 0.842 micros/op 1187998 ops/sec; 1178.3 MB/s
readreverse : 0.837 micros/op 1194915 ops/sec; 1185.1 MB/s
readreverse : 0.845 micros/op 1182983 ops/sec; 1173.3 MB/s
readreverse : 0.877 micros/op 1140308 ops/sec; 1131.0 MB/s
readreverse : 0.849 micros/op 1177581 ops/sec; 1168.0 MB/s
readreverse : 0.915 micros/op 1093284 ops/sec; 1084.3 MB/s
readreverse : 0.863 micros/op 1159418 ops/sec; 1149.9 MB/s
readreverse : 0.895 micros/op 1117670 ops/sec; 1108.5 MB/s
readreverse : 0.852 micros/op 1174116 ops/sec; 1164.5 MB/s
avg: 1147.17 MB/s (1.12 X)

==> 1000000_Keys_10KB.txt <==
readreverse : 3.870 micros/op 258386 ops/sec; 2527.2 MB/s
readreverse : 3.568 micros/op 280296 ops/sec; 2741.5 MB/s
readreverse : 4.005 micros/op 249694 ops/sec; 2442.2 MB/s
readreverse : 3.550 micros/op 281719 ops/sec; 2755.5 MB/s
readreverse : 3.562 micros/op 280758 ops/sec; 2746.1 MB/s
readreverse : 3.507 micros/op 285125 ops/sec; 2788.8 MB/s
readreverse : 3.463 micros/op 288739 ops/sec; 2824.1 MB/s
readreverse : 3.428 micros/op 291734 ops/sec; 2853.4 MB/s
readreverse : 3.553 micros/op 281491 ops/sec; 2753.2 MB/s
readreverse : 3.535 micros/op 282885 ops/sec; 2766.9 MB/s
avg : 2719.89 MB/s (1.17 X)

==> 100000_Keys_100KB.txt <==
readreverse : 22.815 micros/op 43830 ops/sec; 4281.0 MB/s
readreverse : 29.957 micros/op 33381 ops/sec; 3260.4 MB/s
readreverse : 25.334 micros/op 39473 ops/sec; 3855.4 MB/s
readreverse : 23.037 micros/op 43409 ops/sec; 4239.8 MB/s
readreverse : 27.810 micros/op 35958 ops/sec; 3512.1 MB/s
readreverse : 30.327 micros/op 32973 ops/sec; 3220.6 MB/s
readreverse : 29.704 micros/op 33665 ops/sec; 3288.2 MB/s
readreverse : 29.423 micros/op 33987 ops/sec; 3319.6 MB/s
readreverse : 23.334 micros/op 42856 ops/sec; 4185.9 MB/s
readreverse : 29.969 micros/op 33368 ops/sec; 3259.1 MB/s
avg : 3642.21 MB/s (1.5 X)

==> 10000_Keys_1MB.txt <==
readreverse : 244.748 micros/op 4085 ops/sec; 4085.9 MB/s
readreverse : 230.208 micros/op 4343 ops/sec; 4344.0 MB/s
readreverse : 235.655 micros/op 4243 ops/sec; 4243.6 MB/s
readreverse : 235.730 micros/op 4242 ops/sec; 4242.2 MB/s
readreverse : 237.346 micros/op 4213 ops/sec; 4213.3 MB/s
readreverse : 227.306 micros/op 4399 ops/sec; 4399.4 MB/s
readreverse : 194.957 micros/op 5129 ops/sec; 5129.4 MB/s
readreverse : 238.359 micros/op 4195 ops/sec; 4195.4 MB/s
readreverse : 221.588 micros/op 4512 ops/sec; 4513.0 MB/s
readreverse : 235.911 micros/op 4238 ops/sec; 4239.0 MB/s
avg : 4360.52 MB/s (1.38 X)
```

Test Plan: COMPILE_WITH_ASAN=1 make check -j64
Reviewers: andrewkr, yhchiang, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56511
-
- 29 Apr, 2016 1 commit
-
-
Committed by Peter Mattis
This avoids excessive iteration in tombstone fields.
-
- 27 Apr, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: While trying to reuse PinData() / ReleasePinnedData() to optimize away some memcpys, I realized that there is significant overhead in using PinData() / ReleasePinnedData() if they are called many times. This diff refactors the pinning logic by introducing PinnedIteratorsManager, a centralized component that is created once and notified whenever we need to pin an Iterator. This implementation has much less overhead than the original implementation. Test Plan: make check -j64 COMPILE_WITH_ASAN=1 make check -j64 Reviewers: yhchiang, sdong, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D56493
-
- 02 Apr, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: This patch is similar to D52563. When we iterate over a DB with merge operands, we keep creating std::queue to store the operands; optimize this by reusing the merge_operands_ data member.

Before the patch
```
./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq" --db="/dev/shm/bench_merge_memcpy_on_the_fly/" --merge_operator="put" --merge_keys=10000 --num=10000
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
mergerandom : 3.757 micros/op 266141 ops/sec; 29.4 MB/s ( updates:10000)
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.413 micros/op 2423538 ops/sec; 268.1 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.451 micros/op 2219071 ops/sec; 245.5 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.420 micros/op 2382039 ops/sec; 263.5 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.408 micros/op 2452017 ops/sec; 271.3 MB/s

DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
mergerandom : 3.947 micros/op 253376 ops/sec; 28.0 MB/s ( updates:10000)
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.441 micros/op 2266473 ops/sec; 250.7 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.471 micros/op 2122033 ops/sec; 234.8 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.440 micros/op 2271407 ops/sec; 251.3 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.429 micros/op 2331471 ops/sec; 257.9 MB/s
```

With the patch
```
./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq" --db="/dev/shm/bench_merge_memcpy_on_the_fly/" --merge_operator="put" --merge_keys=10000 --num=10000
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
mergerandom : 4.080 micros/op 245092 ops/sec; 27.1 MB/s ( updates:10000)
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.308 micros/op 3241843 ops/sec; 358.6 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.312 micros/op 3200408 ops/sec; 354.0 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.332 micros/op 3013962 ops/sec; 333.4 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.300 micros/op 3328017 ops/sec; 368.2 MB/s

DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
mergerandom : 3.973 micros/op 251705 ops/sec; 27.8 MB/s ( updates:10000)
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.320 micros/op 3123752 ops/sec; 345.6 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.335 micros/op 2986641 ops/sec; 330.4 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.339 micros/op 2950047 ops/sec; 326.4 MB/s
DB path: [/dev/shm/bench_merge_memcpy_on_the_fly/]
readseq : 0.319 micros/op 3131565 ops/sec; 346.4 MB/s
```

Test Plan: make check -j64
Reviewers: yhchiang, andrewkr, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56031
-
- 12 Mar, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: This patch bumps the counters in the frequent code paths DBIter::Next() / DBIter::Prev() in local data members and sends them to Statistics when the iterator is destroyed. A better solution would be to have a thread_local implementation for Statistics.

New performance
```
readseq : 0.035 micros/op 28597881 ops/sec; 3163.7 MB/s
1,851,568,819 stalled-cycles-frontend # 31.29% frontend cycles idle [49.86%]
884,929,823 stalled-cycles-backend # 14.95% backend cycles idle [50.21%]
readreverse : 0.071 micros/op 14077393 ops/sec; 1557.3 MB/s
3,239,575,993 stalled-cycles-frontend # 27.36% frontend cycles idle [49.96%]
1,558,253,983 stalled-cycles-backend # 13.16% backend cycles idle [50.14%]
```

Existing performance
```
readreverse : 0.174 micros/op 5732342 ops/sec; 634.1 MB/s
20,570,209,389 stalled-cycles-frontend # 70.71% frontend cycles idle [50.01%]
18,422,816,837 stalled-cycles-backend # 63.33% backend cycles idle [50.04%]
readseq : 0.119 micros/op 8400537 ops/sec; 929.3 MB/s
15,634,225,844 stalled-cycles-frontend # 79.07% frontend cycles idle [49.96%]
14,227,427,453 stalled-cycles-backend # 71.95% backend cycles idle [50.09%]
```

Test Plan: unit tests
Reviewers: yhchiang, sdong, igor
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D55107
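A rough illustration of the batching described in this summary: counts are accumulated in plain members on the hot path and pushed to a shared, atomically updated sink once, in the destructor. `SharedStats` and `CountingIter` are hypothetical names made up for this sketch and do not mirror the RocksDB `Statistics` or `DBIter` classes.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical shared sink; every update is an atomic read-modify-write.
struct SharedStats {
  std::atomic<std::uint64_t> next_count;
  std::atomic<std::uint64_t> prev_count;
  SharedStats() : next_count(0), prev_count(0) {}
};

// Hypothetical iterator wrapper that counts locally and reports once.
class CountingIter {
 public:
  explicit CountingIter(SharedStats* stats)
      : stats_(stats), local_next_(0), local_prev_(0) {
    assert(stats_ != nullptr);
  }
  // Non-copyable so the same local counts are never reported twice.
  CountingIter(const CountingIter&) = delete;
  CountingIter& operator=(const CountingIter&) = delete;

  ~CountingIter() {
    // One atomic add per counter for the whole lifetime of the iterator.
    stats_->next_count.fetch_add(local_next_, std::memory_order_relaxed);
    stats_->prev_count.fetch_add(local_prev_, std::memory_order_relaxed);
  }

  void Next() { ++local_next_; }  // hot path: plain increment, no atomics
  void Prev() { ++local_prev_; }

 private:
  SharedStats* stats_;
  std::uint64_t local_next_;
  std::uint64_t local_prev_;
};
```

The trade-off in this sketch is that the shared counters lag behind until the iterator is destroyed, so a long-lived iterator reports nothing in the meantime.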
-
- 05 Mar, 2016 1 commit
-
-
Committed by sdong
Change Property name from "rocksdb.current_version_number" to "rocksdb.current-super-version-number" Summary: I realized I was again wrong about the naming convention. Let me change it to the correct one. Test Plan: Run unit tests. Reviewers: IslamAbdelRahman, kradhakrishnan, yhchiang, andrewkr Reviewed By: andrewkr Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D55041
-
- 03 Mar, 2016 1 commit
-
-
Committed by sdong
Summary: We want to provide a way to detect whether an iterator is stale and needs to be recreated. Add an iterator property to return the version number. Test Plan: Add two unit tests for it. Reviewers: IslamAbdelRahman, yhchiang, anthony, kradhakrishnan, andrewkr Reviewed By: andrewkr Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54921
-
- 02 Mar, 2016 1 commit
-
-
Committed by sdong
Summary: Rename iterator property to follow the property naming convention. Test Plan: Run all existing tests. Reviewers: andrewkr, anthony, yhchiang, kradhakrishnan, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54957
-
- 01 Mar, 2016 1 commit
-
-
Committed by sdong
Summary: Add Iterator::GetProperty(), a way for users to communicate with the iterator, and replace Iterator::IsKeyPinned() with it. As a follow-up, I'll add a property for the version number attached to the iterator. Test Plan: Rerun existing tests and add a negative test case. Reviewers: yhchiang, andrewkr, kradhakrishnan, anthony, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54783
-
- 11 Feb, 2016 1 commit
-
-
Committed by Peter Mattis
Fixes #983.
-
- 10 Feb, 2016 1 commit
-
-
Committed by Baraa Hamodi
-
- 07 Jan, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: It looks like we are spending a significant amount of time creating std::deque<std::string> every time we do Iterator::Prev() {F921567}. By using merge_operands_ as a DBIter data member we create it once and reduce this overhead, and we see a ~30% performance improvement when using Iterator::Prev() on hot data.

Original performance
```
DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readreverse" --db="/dev/shm/bench_prev_opt/" --use_existing_db --disable_auto_compactions
readreverse : 0.713 micros/op 1402219 ops/sec; 155.1 MB/s
readreverse : 0.609 micros/op 1641386 ops/sec; 181.6 MB/s
readreverse : 0.684 micros/op 1461150 ops/sec; 161.6 MB/s
readreverse : 0.629 micros/op 1589842 ops/sec; 175.9 MB/s
readreverse : 0.647 micros/op 1544530 ops/sec; 170.9 MB/s
```

After optimization
```
DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readreverse" --db="/dev/shm/bench_prev_opt/" --use_existing_db --disable_auto_compactions
readreverse : 0.488 micros/op 2051189 ops/sec; 226.9 MB/s
readreverse : 0.505 micros/op 1980892 ops/sec; 219.1 MB/s
readreverse : 0.541 micros/op 1846971 ops/sec; 204.3 MB/s
readreverse : 0.497 micros/op 2013612 ops/sec; 222.8 MB/s
readreverse : 0.480 micros/op 2082665 ops/sec; 230.4 MB/s
```

Test Plan: make check -j64
Reviewers: sdong, anthony, rven, igor, yhchiang
Reviewed By: yhchiang
Subscribers: jkedgar, dhruba
Differential Revision: https://reviews.facebook.net/D52563
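The optimization in this commit amounts to constructing the operand container once and clearing it between uses instead of building a new one per call. A minimal sketch, assuming a hypothetical `OperandCollector` class (it is not `DBIter`, and the `push_front` ordering is only illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical holder that owns one deque for its whole lifetime.
class OperandCollector {
 public:
  // Gathers operands for one key. The deque is cleared, not reconstructed,
  // so its construction cost is paid once rather than on every call.
  std::size_t Collect(const std::vector<std::string>& operands) {
    operands_.clear();
    for (std::size_t i = 0; i < operands.size(); ++i) {
      operands_.push_front(operands[i]);  // newest operand ends up at the front
    }
    return operands_.size();
  }

  const std::deque<std::string>& operands() const { return operands_; }

 private:
  std::deque<std::string> operands_;  // reused across Collect() calls
};
```

How much this saves depends on the standard library in use; the benchmark numbers above are the commit author's measurements, not something this sketch reproduces.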
-
- 17 Dec, 2015 1 commit
-
-
Committed by Islam AbdelRahman
Summary: This patch updates the Iterator API to introduce new functions that allow users to keep the Slices returned by key() valid as long as the Iterator is not deleted.

- ReadOptions::pin_data : If true, keep loaded blocks in memory as long as the iterator is not deleted
- Iterator::IsKeyPinned() : If true, the Slice returned by key() is valid as long as the iterator is not deleted

Also add a new option BlockBasedTableOptions::use_delta_encoding to allow users to disable delta_encoding if needed.

Benchmark results (using https://phabricator.fb.com/P20083553)
```
// $ du -h /home/tec/local/normal.4K.Snappy/db10077
// 6.1G /home/tec/local/normal.4K.Snappy/db10077
// $ du -h /home/tec/local/zero.8K.LZ4/db10077
// 6.4G /home/tec/local/zero.8K.LZ4/db10077

// Benchmarks for shard db10077
// _build/opt/rocks/benchmark/rocks_copy_benchmark \
//   --normal_db_path="/home/tec/local/normal.4K.Snappy/db10077" \
//   --zero_db_path="/home/tec/local/zero.8K.LZ4/db10077"

// First run
// ============================================================================
// rocks/benchmark/RocksCopyBenchmark.cpp          relative  time/iter  iters/s
// ============================================================================
// BM_StringCopy                                                 1.73s  576.97m
// BM_StringPiece                                    103.74%     1.67s  598.55m
// ============================================================================
// Match rate : 1000000 / 1000000

// Second run
// ============================================================================
// rocks/benchmark/RocksCopyBenchmark.cpp          relative  time/iter  iters/s
// ============================================================================
// BM_StringCopy                                              611.99ms     1.63
// BM_StringPiece                                    203.76%  300.35ms     3.33
// ============================================================================
// Match rate : 1000000 / 1000000
```

Test Plan: Unit tests
Reviewers: sdong, igor, anthony, yhchiang, rven
Reviewed By: rven
Subscribers: dhruba, lovro, adsharma
Differential Revision: https://reviews.facebook.net/D48999
-
- 01 Dec, 2015 1 commit
-
-
Committed by sdong
Summary: With recent commit 33e0c938, db iterator skips perf context counter internal_key_skipped_count when blindly issuing internal Next(). Now increment the counter by one when issuing this Next() Test Plan: Run all existing tests Reviewers: rven, yhchiang, IslamAbdelRahman, kradhakrishnan, igor, anthony Reviewed By: anthony Subscribers: yoshinorim, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D51465
-
- 25 Nov, 2015 1 commit
-
-
Committed by sdong
Summary: Currently DBIter::Next() always compares the current key with itself first, which is unnecessary if the last key is not a merge key. I made the change and didn't see db_iter_test fail. I want to hear whether people have any idea what I am missing. Test Plan: Run all unit tests Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D48279
-
- 06 Nov, 2015 2 commits
-
-
Committed by Venkatesh Radhakrishnan
Summary: Use IterKey to store prefix_start_ so that it doesn't get freed Test Plan: PrefixTest.PrefixValid Reviewers: anthony, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D50289
-
Committed by Venkatesh Radhakrishnan
Summary: MyRocks testing found an issue that while iterating over keys that are outside the prefix, sometimes wrong results were seen for keys outside the prefix. We now tighten the range of keys seen with a new read option called prefix_seen_at_start. This remembers the starting prefix and then compares it on a Next for equality of prefix. If they are from a different prefix, it sets valid to false. Test Plan: PrefixTest.PrefixValid Reviewers: IslamAbdelRahman, sdong, yhchiang, anthony Reviewed By: anthony Subscribers: spetrunia, hermanlee4, yoshinorim, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D50211
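The behaviour described here (remember the prefix at the start, compare it on each Next, and invalidate the iterator when it differs) can be sketched over a sorted vector of keys. `PrefixBoundedIter` and its fixed-length prefix are assumptions made for illustration only; the real code uses a prefix extractor and internal keys, and taking the prefix from the seek target is this sketch's choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical iterator over sorted keys that stays within the seek prefix.
class PrefixBoundedIter {
 public:
  // `sorted_keys` must already be in ascending order.
  PrefixBoundedIter(const std::vector<std::string>& sorted_keys,
                    std::size_t prefix_len)
      : keys_(sorted_keys), prefix_len_(prefix_len), pos_(0), valid_(false) {}

  void Seek(const std::string& target) {
    pos_ = static_cast<std::size_t>(
        std::lower_bound(keys_.begin(), keys_.end(), target) - keys_.begin());
    start_prefix_ = target.substr(0, prefix_len_);  // remembered for Next()
    valid_ = pos_ < keys_.size() && SamePrefix();
  }

  void Next() {
    assert(valid_);
    ++pos_;
    // Leaving the starting prefix makes the iterator invalid.
    valid_ = pos_ < keys_.size() && SamePrefix();
  }

  bool Valid() const { return valid_; }
  const std::string& key() const {
    assert(valid_);
    return keys_[pos_];
  }

 private:
  bool SamePrefix() const {
    return keys_[pos_].compare(0, prefix_len_, start_prefix_) == 0;
  }

  std::vector<std::string> keys_;
  std::size_t prefix_len_;
  std::size_t pos_;
  bool valid_;
  std::string start_prefix_;
};
```

With keys `a:1`, `a:2`, `b:1` and a two-byte prefix, seeking to `a:1` visits the two `a:` keys and then reports invalid instead of stepping onto `b:1`.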
-
- 14 Oct, 2015 1 commit
-
-
Committed by sdong
Summary: Separate a new class InternalIterator from class Iterator, when the look-up is done internally, which also means they operate on key with sequence ID and type. This change will enable potential future optimizations but for now InternalIterator's functions are still the same as Iterator's. At the same time, separate the cleanup function to a separate class and let both of InternalIterator and Iterator inherit from it. Test Plan: Run all existing tests. Reviewers: igor, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, rven Reviewed By: rven Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D48549
-
- 18 Sep, 2015 1 commit
-
-
Committed by Andres Noetzli
Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted, the behavior is undefined. Single deletion of a non-existent key has no effect, but multiple consecutive single deletions are not allowed (see limitations).

In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed this behavior at the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: it removes the SingleDelete together with the value whenever there is no snapshot between them, while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot.

Most of the complex additions are in the Compaction Iterator; all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench.

Limitations:
- Not compatible with cuckoo hash tables
- Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this)
- Consecutive single deletions are currently not allowed (an older version of this patch supported this, so it could be resurrected if needed)

Test Plan: make all check
Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor
Reviewed By: igor
Subscribers: maykov, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D43179
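The compaction rule stated in the summary can be made concrete with a toy model of the entries for one user key. Everything below (`EntryType`, `Entry`, `SnapshotBetween`, `CompactKey`) is hypothetical and heavily simplified; in particular, the handling of a conventional delete here only illustrates that its tombstone survives, and is not a description of the real Compaction Iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class EntryType { kValue, kDelete, kSingleDelete };

// One version of a single user key; a larger seq means a newer write.
struct Entry {
  std::uint64_t seq;
  EntryType type;
};

// True if some snapshot can still see the older entry but not the newer one.
bool SnapshotBetween(std::uint64_t older_seq, std::uint64_t newer_seq,
                     const std::vector<std::uint64_t>& snapshots) {
  for (std::size_t i = 0; i < snapshots.size(); ++i) {
    if (snapshots[i] >= older_seq && snapshots[i] < newer_seq) return true;
  }
  return false;
}

// `entries` holds all versions of ONE user key, ordered newest first.
std::vector<Entry> CompactKey(const std::vector<Entry>& entries,
                              const std::vector<std::uint64_t>& snapshots) {
  std::vector<Entry> out;
  std::size_t i = 0;
  while (i < entries.size()) {
    const Entry& cur = entries[i];
    const bool covers_value =
        i + 1 < entries.size() && entries[i + 1].type == EntryType::kValue &&
        !SnapshotBetween(entries[i + 1].seq, cur.seq, snapshots);
    if (cur.type == EntryType::kSingleDelete && covers_value) {
      i += 2;  // SingleDelete and its value are dropped together
      continue;
    }
    out.push_back(cur);
    // A conventional Delete hides the value, but its tombstone is kept.
    i += (cur.type == EntryType::kDelete && covers_value) ? 2 : 1;
  }
  return out;
}
```

In this model a SingleDelete at sequence 10 over a value at sequence 5 leaves nothing behind, unless a snapshot at, say, sequence 7 still needs the value, in which case both entries are kept.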
-
- 12 Sep, 2015 1 commit
-
-
Committed by Manuel Ung
Summary: There are currently no statistics on seeks, only on gets. This adds the following counters:
- rocksdb.number.db.seek, rocksdb.number.db.next, rocksdb.number.db.prev (number of calls)
- rocksdb.db.iterate.bytes.read (number of bytes read from key + value using seek/next/prev)
- rocksdb.number.keys.seek.found, rocksdb.number.keys.next.found, rocksdb.number.keys.prev.found (number of calls where seek/next/prev found a value)

Test Plan:
./db_bench -statistics -benchmarks fillrandom,seekrandom -seek_nexts 5
./db_bench -statistics -benchmarks fillrandom,seekrandom -seek_nexts 5 -reverse_iterator
Reviewers: yhchiang, rven, kradhakrishnan, IslamAbdelRahman, MarkCallaghan, sdong, igor
Reviewed By: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D46605
-
- 09 Sep, 2015 1 commit
-
-
Committed by Andres Noetzli
Summary: In some cases, equality comparisons can be done more efficiently than three-way comparisons. There are quite a few places in the code where we only care about equality. This patch adds an Equal() method that defaults to using the Compare() method. Test Plan: make clean all check Reviewers: rven, anthony, yhchiang, igor, sdong Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46233
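The pattern described here, an `Equal()` that falls back to `Compare()` unless a subclass has something cheaper, looks roughly like the following. The `Cmp`, `BytewiseCmp`, and `ReverseCmp` classes are simplified stand-ins written for this note (they take `std::string` rather than `Slice`) and are not the RocksDB `Comparator` interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical comparator base class.
class Cmp {
 public:
  virtual ~Cmp() {}
  // Three-way comparison: <0, 0, or >0.
  virtual int Compare(const std::string& a, const std::string& b) const = 0;
  // Default equality is defined in terms of Compare(), so existing
  // subclasses keep working without any change.
  virtual bool Equal(const std::string& a, const std::string& b) const {
    return Compare(a, b) == 0;
  }
};

// Overrides Equal() with a direct equality test, which can reject on a
// length mismatch without ordering the two keys.
class BytewiseCmp : public Cmp {
 public:
  int Compare(const std::string& a, const std::string& b) const override {
    return a.compare(b);
  }
  bool Equal(const std::string& a, const std::string& b) const override {
    return a == b;
  }
};

// Only implements Compare(); inherits the default Equal().
class ReverseCmp : public Cmp {
 public:
  int Compare(const std::string& a, const std::string& b) const override {
    return b.compare(a);
  }
};
```

Call sites that only care about equality can then call `Equal()` and automatically benefit from any comparator that overrides it.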
-
- 27 Aug, 2015 1 commit
-
-
Committed by sdong
DBIter to filter out extra keys with higher sequence numbers when changing direction from forward to backward Summary: When DBIter changes iterating direction from forward to backward, it might see some much larger keys with higher sequence IDs. With this commit, these rows are actively filtered out. This should fix the existing disabled tests in db_iter_test. This may not be a perfect fix, but it has the least impact on existing code, in order to be safe. Test Plan: Enable existing tests and make sure they pass. Add a new test DBIterWithMergeIterTest.InnerMergeIteratorDataRace8. Also run all existing tests. Reviewers: yhchiang, rven, anthony, IslamAbdelRahman, kradhakrishnan, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D45567
-
- 20 Aug, 2015 1 commit
-
-
Committed by sdong
Summary: There is a check to fail the iterator if prefix extractor is specified but upper bound is out of the prefix for the seek key. Relax this constraint to allow users to set upper bound to the next prefix of the current one. Test Plan: make commit-prereq Reviewers: igor, anthony, kradhakrishnan, yhchiang, rven Reviewed By: rven Subscribers: tnovak, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D44949
-
- 12 Aug, 2015 1 commit
-
-
Committed by Andres Notzli
Summary: While working on single delete support for db_bench, I realized that db_bench/db_stress contain a bunch of duplicate code related to compression and found some typos. This patch removes duplicate code, typos and a redundant #ifndef in internal_stats.cc. Test Plan: make db_stress && make db_bench && ./db_bench --benchmarks=compress,uncompress Reviewers: yhchiang, sdong, rven, anthony, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43965
-
- 07 Aug, 2015 1 commit
-
-
Committed by Andres Noetzli
Summary: When seeking to the last occurrence of a key with sequence number 0, db_iter ends up in an endless loop because it seeks to type kValueTypeForSeek which is larger than kTypeDeletion/kTypeValue. Added test case that triggers the behavior. Test Plan: make clean all check Reviewers: igor, rven, anthony, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43653
-
- 06 Aug, 2015 1 commit
-
-
Committed by sdong
Summary: During forward iteration, if the current key is a merge key, the internal iterator is positioned at the next key. If Prev() is called at that point, an extra Prev() is needed to recover the location. This is the second attempt at fixing this after reverting ec70fea4. This time the fix is narrowed to the case where the current key is a merge key, and it avoids the reseeking logic for max_iterating skipping. Test Plan: enable the two disabled tests and make sure they pass Reviewers: rven, IslamAbdelRahman, kradhakrishnan, tnovak, yhchiang Reviewed By: yhchiang Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D43557
-
- 08 Jul, 2015 1 commit
-
-
Committed by Yueh-Hsuan Chiang
Summary: This diff reverts the following two previous diffs related to DBIter::FindPrevUserKey(), which makes db_stress unstable. We should bake a better fix for this. * "Fix a comparison in DBIter::FindPrevUserKey()" ec70fea4. * "Fixed endless loop in DBIter::FindPrevUserKey()" acee2b08. Test Plan: db_stress Reviewers: anthony, igor, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D41301
-
- 30 Jun, 2015 1 commit
-
-
Committed by Tomislav Novak
Summary: When seek target is a merge key (`kTypeMerge`), `DBIter::FindNextUserEntry()` advances the underlying iterator _past_ the current key (`saved_key_`); see `MergeValuesNewToOld()`. However, `FindPrevUserKey()` assumes that `iter_` points to an entry with the same user key as `saved_key_`. As a result, `it->Seek(key) && it->Prev()` can cause the iterator to be positioned at the _next_, instead of the previous, entry (new test, written by @lovro, reproduces the bug). This diff changes `FindPrevUserKey()` to also skip keys that are _greater_ than `saved_key_`. Test Plan: db_test Reviewers: igor, sdong Reviewed By: sdong Subscribers: leveldb, dhruba, lovro Differential Revision: https://reviews.facebook.net/D40791
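The behavior change described above can be illustrated with a toy model. This is only a sketch under stated assumptions: user keys are plain strings in a sorted vector, `pos` stands in for where the underlying iterator currently sits, and `FindPrevUserKeySketch` is a hypothetical name, not the actual `DBIter` code. The point it shows is that stepping backward while the key is greater than or equal to `saved_key` gives the same answer whether the iterator sits on `saved_key` or has already advanced past it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of the fix: `keys` is sorted ascending and
// `pos` is the current position of the underlying iterator. Returns the index
// of the last entry whose key is strictly less than `saved_key`, or -1 if
// there is no such entry.
long FindPrevUserKeySketch(const std::vector<std::string>& keys,
                           std::size_t pos, const std::string& saved_key) {
  if (keys.empty()) return -1;
  if (pos >= keys.size()) pos = keys.size() - 1;  // clamp an end position
  // Skip entries equal to saved_key AND entries greater than it. The second
  // case covers an iterator that was advanced past a merge key.
  while (keys[pos] >= saved_key) {
    if (pos == 0) return -1;  // ran off the front: no previous key
    --pos;
  }
  return static_cast<long>(pos);
}
```

With keys `{"a", "b", "c"}` and `saved_key` equal to `"b"`, the sketch returns index 0 both when `pos` is 1 (on the saved key) and when `pos` is 2 (already past it).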
-
- 26 Jun, 2015 1 commit
-
-
Summary: #7124486: RocksDB's Iterator.SeekToLast should seek to the last key before iterate_upper_bound if present Test Plan: ./db_iter_test runs successfully with the new testcase Reviewers: rven, yhchiang, igor, anthony, kradhakrishnan, sdong Reviewed By: sdong Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D40425
-
- 25 Apr, 2015 1 commit
-
-
Committed by clark.kang
-
- 25 Mar, 2015 1 commit
-
-
Committed by Anurag Indu
Summary: We have added new stats and perf_context for measuring the merge and filter operation time consumption. We have bounded all the merge operations within the GUARD statement and collected the total time for these operations in the DB. Test Plan: WIP Reviewers: rven, yhchiang, kradhakrishnan, igor, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D34377
-
- 27 Feb, 2015 1 commit
-
-
Committed by Igor Sugak
Summary: When using the latest clang (3.6 or 3.7/trunk), rocksdb fails with many errors. Almost all of them are missing-override errors. This diff adds the missing override keyword. No manual changes. Prerequisites: bear and clang 3.5 built with extra tools
```lang=bash
% USE_CLANG=1 bear make all # generate a compilation database http://clang.llvm.org/docs/JSONCompilationDatabase.html
% clang-modernize -p . -include . -add-override
% make format
```
Test Plan: Make sure all tests are passing.
```lang=bash
% #Use default fb code clang.
% make check
```
Verify fewer errors and no missing override errors.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make
```
Reviewers: igor, kradhakrishnan, rven, meyering, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D34077
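For readers unfamiliar with the keyword this commit adds, here is a minimal sketch of what `override` looks like on a derived class. The class names `IteratorBaseSketch` and `DerivedIteratorSketch` are hypothetical and are not taken from the RocksDB sources. Marking the derived method `override` asks the compiler to reject the declaration if it does not actually override a virtual method of the base class.

```cpp
#include <string>

// Hypothetical base class with one virtual method.
struct IteratorBaseSketch {
  virtual ~IteratorBaseSketch() = default;
  virtual std::string Name() const { return "base"; }
};

// `override` makes the compiler verify that Name() matches a virtual method
// in the base class. A signature mismatch (for example dropping `const`)
// becomes a compile error instead of silently declaring a new method.
struct DerivedIteratorSketch : IteratorBaseSketch {
  std::string Name() const override { return "derived"; }
};

// Calls through a base reference, so the derived implementation is selected
// by virtual dispatch.
std::string NameThroughBase(const IteratorBaseSketch& it) { return it.Name(); }
```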
-
- 24 Feb, 2015 1 commit
-
-
Committed by Igor Sugak
Summary: This diff contains trivial fixes for 6 scan-build warnings:
**db/c_test.c** The `db` variable is never read. Removed the assignment. scan-build report: http://home.fburl.com/~sugak/latest20/report-9b77d2.html#EndPath
**db/db_iter.cc** The `skipping` local variable is assigned false. Then, in the next switch block, the only non-returning case assigns `skipping` to true; the remaining cases don't use it and all return. scan-build report: http://home.fburl.com/~sugak/latest20/report-13fca7.html#EndPath
**db/log_reader.cc** In `bool Reader::SkipToInitialBlock()`, the `offset_in_block` local variable is assigned 0 `if (offset_in_block > kBlockSize - 6)` and then never used. Removed the assignment and renamed it to `initial_offset_in_block` to avoid confusion. scan-build report: http://home.fburl.com/~sugak/latest20/report-a618dd.html#EndPath
In `bool Reader::ReadRecord(Slice* record, std::string* scratch)`, the local variable `in_fragmented_record` is assigned false in the `kFullType` switch case block, which then returns without using it. In the `kFirstType` switch case block, the same `in_fragmented_record` is assigned false but later assigned true without prior use. Removed the assignment in both cases. scan-build reports: http://home.fburl.com/~sugak/latest20/report-bb86b0.html#EndPath http://home.fburl.com/~sugak/latest20/report-a975be.html#EndPath
**table/plain_table_key_coding.cc** The local variable `user_key_size` is assigned when declared, but in both places where it is used it is assigned `static_cast<uint32_t>(key.size() - 8)`. Changed to initialize the variable to the proper value in its declaration. scan-build report: http://home.fburl.com/~sugak/latest20/report-9e6b86.html#EndPath
**tools/db_stress.cc** Missing `break` in a switch case block. This seems to be a bug. Added the missing `break`.
Test Plan: Make sure all tests are passing and scan-build does not report 'Dead assignment' and 'Dead initialization' bugs.
```lang=bash
% make check
% make analyze
```
Reviewers: meyering, igor, kradhakrishnan, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D33795
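The commit message does not show the affected `db_stress.cc` code, so the following is a generic illustration of the bug class rather than the actual fix. All names (`OpSketch`, `LabelMissingBreak`, `LabelWithBreak`) are hypothetical. It shows how a missing `break` lets one case fall through into the next, which turns the first assignment into a dead store of the kind the analyzer flags.

```cpp
#include <string>

enum class OpSketch { kRead, kWrite, kDelete };

// Buggy variant: the kRead case has no `break`, so control falls through to
// kWrite and the "read" assignment is immediately overwritten.
std::string LabelMissingBreak(OpSketch op) {
  std::string label = "unknown";
  switch (op) {
    case OpSketch::kRead:
      label = "read";  // dead assignment: overwritten by the next case
    case OpSketch::kWrite:
      label = "write";
      break;
    case OpSketch::kDelete:
      label = "delete";
      break;
  }
  return label;
}

// Fixed variant: every case ends with `break`.
std::string LabelWithBreak(OpSketch op) {
  std::string label = "unknown";
  switch (op) {
    case OpSketch::kRead:
      label = "read";
      break;
    case OpSketch::kWrite:
      label = "write";
      break;
    case OpSketch::kDelete:
      label = "delete";
      break;
  }
  return label;
}
```

In the buggy variant a read operation is reported as a write; the fixed variant reports it correctly.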
-
- 05 Dec, 2014 1 commit
-
-
Committed by Yueh-Hsuan Chiang
Summary: Replace exception by setting valid_ = false in DBIter::MergeValuesNewToOld(). Test Plan: Not sure if I am right about this, but it seems we currently don't have a good way to test that code path, as it requires dynamically setting merge_operator = nullptr while Merge() is being called. Reviewers: igor, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D29811
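The pattern named in this commit, reporting a failure through iterator state rather than by throwing, can be sketched as follows. This is a toy model under stated assumptions: `MergeOperatorSketch` and `IterSketch` are hypothetical types, and the real `DBIter::MergeValuesNewToOld()` is considerably more involved.

```cpp
#include <string>

// Hypothetical stand-in for a merge operator: concatenates two values.
struct MergeOperatorSketch {
  std::string Merge(const std::string& base, const std::string& operand) const {
    return base + operand;
  }
};

// Hypothetical iterator that merges values. When no merge operator is
// configured it marks itself invalid instead of throwing an exception.
class IterSketch {
 public:
  explicit IterSketch(const MergeOperatorSketch* op) : merge_operator_(op) {}

  void MergeValues(const std::string& base, const std::string& operand) {
    if (merge_operator_ == nullptr) {
      valid_ = false;  // signal failure through state; callers check Valid()
      value_.clear();
      return;
    }
    value_ = merge_operator_->Merge(base, operand);
    valid_ = true;
  }

  bool Valid() const { return valid_; }
  const std::string& value() const { return value_; }

 private:
  const MergeOperatorSketch* merge_operator_;
  bool valid_ = false;
  std::string value_;
};
```

Callers of such an iterator are expected to check `Valid()` before reading `value()`, which is the usual contract for iterator-style APIs.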
-
- 07 Nov, 2014 1 commit
-
-
Committed by Igor Canadi
Summary: It turns out that -Wshadow has different rules for gcc than clang. The previous commit fixed clang. This commit fixes the rest of the warnings for gcc. Test Plan: compiles Reviewers: ljin, yhchiang, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D28131
-
- 31 Oct, 2014 1 commit
-
-
Committed by Yueh-Hsuan Chiang
Summary: Apply InfoLogLevel to the logs in db/db_iter.cc Test Plan: make Reviewers: igor, ljin, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D27861
-
- 01 Oct, 2014 1 commit
-
-
Committed by Danny Al-Gaaf
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
-