- 28 May 2016 (1 commit)

Committed by sdong
Summary: When rate_bytes_per_sec * refill_period_us_ overflows, the actual limited rate becomes very low. Handle this case so the rate stays large.

Test Plan: Added a unit test for it.

Reviewers: IslamAbdelRahman, andrewkr
Reviewed By: andrewkr
Subscribers: yiwu, lightmark, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D58929
-
- 20 May 2016 (2 commits)

Committed by Aaron Orenstein
Summary: Reduce use of argument-dependent name lookup in RocksDB.

Test Plan: 'make check' passed.

Reviewers: andrewkr
Reviewed By: andrewkr
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D58203
-
Committed by Dmitri Smirnov
Refactored for ease of reuse and customization as a library without wrapping. WinEnvThreads is a class intended for replacement; WinEnvIO is a class for reuse and behavior override. Added private virtual functions so the I/O classes can override fallocate and pread.
-
- 17 May 2016 (1 commit)

Committed by Dmitri Smirnov
Implement GetUniqueIdFromFile to support new tests and the feature.
-
- 13 May 2016 (1 commit)

Committed by Dmitri Smirnov
Conditionally retrofit thread_posix for use with std::thread, reusing the same logic; POSIX users continue using POSIX interfaces. Enable XPRESS compression in test runs. Fix a signed/unsigned mismatch introduced on master.
-
- 30 Apr 2016 (1 commit)

Committed by Dmitri Smirnov
Make preallocation consistent with other writable files, and make sure that we map no more than the pre-allocated size.
-
- 29 Apr 2016 (1 commit)

Committed by PraveenSinghRao
This reverts commit b54c3474. Revert the async file handle change, as it causes failures with AppVeyor.
-
- 28 Apr 2016 (1 commit)

Committed by Li Peng
Fix typos and remove duplicated words.
-
- 23 Apr 2016 (2 commits)

Committed by dx9
* Musl libc does not provide an adaptive mutex. Added a feature test for PTHREAD_MUTEX_ADAPTIVE_NP.
* Musl libc does not provide backtrace(3). Added a feature check for backtrace(3).
* Fixed a compiler error.
* Musl libc does not implement backtrace(3). Added a platform check for libexecinfo.
* Alpine does not appear to support the gcc -pg option. By default (gcc has the PIE option enabled) it fails with:

      gcc: error: -pie and -pg|p|profile are incompatible when linking

  When -fno-PIE and -nopie are used it fails with:

      /usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find gcrt1.o: No such file or directory

  Added a gcc -pg platform test that outputs PROFILING_FLAGS accordingly, and replaced the pg variable in the Makefile with PROFILING_FLAGS.
* Fixed a segfault when TEST_IOCTL_FRIENDLY_TMPDIR is undefined and the default candidates are not suitable.
* Use ASSERT_DOUBLE_EQ instead of ASSERT_EQ.
* When compiled with ROCKSDB_MALLOC_USABLE_SIZE, the UniversalCompactionFourPaths and UniversalCompactionSecondPathRatio tests fail due to premature memtable flushes on systems with 16-byte alignment: the Arena runs out of block space before GenerateNewFile() completes. Increased options.write_buffer_size.
-
Committed by PraveenSinghRao
-
- 20 Apr 2016 (1 commit)

Committed by Dmitri Smirnov
Comparable with Snappy on compression ratio. Implemented using the Windows API; does not require an external package. Available since Windows 8 and Windows Server 2012. Use -DXPRESS=1 with CMake to enable.
-
- 01 Apr 2016 (1 commit)

Committed by Yueh-Hsuan Chiang
Summary: Fixed the following compile warnings:

    /Users/yhchiang/rocksdb/util/posix_logger.h:32:11: error: unused variable 'kDebugLogChunkSize' [-Werror,-Wunused-const-variable]
    const int kDebugLogChunkSize = 128 * 1024;
    /Users/yhchiang/rocksdb/util/coding.h:24:20: error: unused variable 'kMaxVarint32Length' [-Werror,-Wunused-const-variable]
    const unsigned int kMaxVarint32Length = 5;
    2 errors generated.

Test Plan: make clean rocksdb

Reviewers: igor, sdong, anthony, IslamAbdelRahman, rven, kradhakrishnan, adamretter
Reviewed By: adamretter
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D56223
-
- 18 Mar 2016 (1 commit)

Committed by Dmitri Smirnov
calls. #ifdef in the source code and make this a default build option.
-
- 20 Feb 2016 (1 commit)

Committed by Dmitri Smirnov
by using default implementation for now as it works.
-
- 18 Feb 2016 (1 commit)

Committed by Andrew Kryczka
Summary: I broke it in D53781.

Test Plan: Tried the same code in util/env_posix.cc and it compiled successfully.

Reviewers: sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D54303
-
- 10 Feb 2016 (2 commits)

Committed by Baraa Hamodi
-
Committed by Andrew Kryczka
Summary: Added this new function, which returns the filename, size, and modified timestamp for each file in the provided directory. The default implementation retrieves the metadata sequentially using existing functions. In the next diff I'll make HdfsEnv override this function to use libhdfs's bulk get function. This won't work on Windows due to the path separator.

Test Plan: New unit test:

    $ ./env_test --gtest_filter=EnvPosixTest.ConsistentChildrenMetadata

Reviewers: yhchiang, sdong
Reviewed By: sdong
Subscribers: IslamAbdelRahman, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D53781
-
- 02 Feb 2016 (3 commits)

Committed by Tomas Kolda
-
Committed by Tomas Kolda
break compatibility with Windows 7. The issue with rotated logs was fixed another way.
-
Committed by Dmitri Smirnov
This change impacts only non-buffered I/O on Windows. Currently, there is a buffer per RandomAccessFile instance that is protected by a lock. The reason we maintain the buffer is that non-buffered I/O requires an aligned buffer to work. XPerf traces demonstrate that we accumulate considerable wait time while waiting for that lock.

This change makes it possible to set the random access buffer size to zero, which indicates a per-request allocation. We expect the allocation expense to be much less than the I/O cost plus the wait time, because the memory heap tends to re-use page-aligned allocations, especially with the use of jemalloc. This change does not affect buffer use as a read_ahead_buffer for compaction purposes.
-
- 14 Jan 2016 (1 commit)

Committed by Dmitri Smirnov
Use Yield macro to make it a little more portable between platforms.
-
- 05 Jan 2016 (1 commit)

Committed by Marek Kurdej
-
- 26 Dec 2015 (1 commit)

Committed by Nathan Bronson
Summary: This diff adds support for concurrent adds to the skiplist memtable implementations. Memory allocation is made thread-safe by the addition of a spinlock, with small per-core buffers to avoid contention. Concurrent memtable writes are made via an additional method and don't impose a performance overhead on the non-concurrent case, so parallelism can be selected on a per-batch basis.

Write thread synchronization is an increasing bottleneck for higher levels of concurrency, so this diff adds --enable_write_thread_adaptive_yield (default off). This feature causes threads joining a write batch group to spin for a short time (default 100 usec) using sched_yield, rather than going to sleep on a mutex. If the timing of the yield calls indicates that another thread has actually run during the yield, then spinning is avoided. This option improves performance for concurrent situations even without parallel adds, although it has the potential to increase CPU usage (and the heuristic adaptation is not yet mature).

Parallel writes are not currently compatible with inplace updates, update callbacks, or delete filtering. Enable them with --allow_concurrent_memtable_write (and --enable_write_thread_adaptive_yield). Parallel memtable writes are performance neutral when there is no actual parallelism, and in my experiments (SSD server-class Linux and varying contention and key sizes for fillrandom) they are always a performance win when there is more than one thread.

Statistics are updated earlier in the write path, dropping the number of DB mutex acquisitions from 2 to 1 for almost all cases.

This diff was motivated and inspired by Yahoo's cLSM work. It is more conservative than cLSM: RocksDB's write batch group leader role is preserved (along with all of the existing flush and write throttling logic) and concurrent writers are blocked until all memtable insertions have completed and the sequence number has been advanced, to preserve linearizability.

My test config is "db_bench -benchmarks=fillrandom -threads=$T -batch_size=1 -memtablerep=skip_list -value_size=100 --num=1000000/$T -level0_slowdown_writes_trigger=9999 -level0_stop_writes_trigger=9999 -disable_auto_compactions --max_write_buffer_number=8 -max_background_flushes=8 --disable_wal --write_buffer_size=160000000 --block_size=16384 --allow_concurrent_memtable_write" on a two-socket Xeon E5-2660 @ 2.2Ghz with lots of memory and an SSD hard drive. With 1 thread I get ~440Kops/sec. Peak performance for 1 socket (numactl -N1) is slightly more than 1Mops/sec, at 16 threads. Peak performance across both sockets happens at 30 threads, and is ~900Kops/sec, although with fewer threads there is less performance loss when the system has background work.

Test Plan:
1. concurrent stress tests for InlineSkipList and DynamicBloom
2. make clean; make check
3. make clean; DISABLE_JEMALLOC=1 make valgrind_check; valgrind db_bench
4. make clean; COMPILE_WITH_TSAN=1 make all check; db_bench
5. make clean; COMPILE_WITH_ASAN=1 make all check; db_bench
6. make clean; OPT=-DROCKSDB_LITE make check
7. verify no perf regressions when disabled

Reviewers: igor, sdong
Reviewed By: sdong
Subscribers: MarkCallaghan, IslamAbdelRahman, anthony, yhchiang, rven, sdong, guyg8, kradhakrishnan, dhruba
Differential Revision: https://reviews.facebook.net/D50589
-
- 15 Dec 2015 (1 commit)

Committed by Venkatesh Radhakrishnan
Running manual compactions in parallel with other automatic or manual compactions in restricted cases.

Summary: This diff provides a framework for doing manual compactions in parallel with other compactions. We now have a deque of manual compactions. We also pass manual compactions as an argument from RunManualCompactions down to BackgroundCompactions, so that RunManualCompactions can be reentrant. Parallelism is controlled by the routine ConflictingManualCompaction, which allows or disallows new parallel/manual compactions based on already existing ManualCompactions.

In this diff, by default manual compactions still have to run exclusive of other compactions. However, by setting the compaction option exclusive_manual_compaction to false, it is possible to run other compactions in parallel with a manual compaction. We are still restricted to one manual compaction per column family at a time. All of these restrictions will be relaxed in future diffs. I will be adding more tests later.

Test Plan: RocksDB regression + new tests + valgrind

Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong
Reviewed By: sdong
Subscribers: yoshinorim, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D47973
-
- 12 Dec 2015 (1 commit)

Committed by Dmitri Smirnov
Mostly due to differences in the sizes of int and long on 64-bit systems (Windows) vs GNU.
-
- 11 Dec 2015 (1 commit)

Committed by charsyam
-
- 09 Dec 2015 (1 commit)

Committed by yuslepukhin
Fix warnings. Take advantage of native snprintf on VS 15.
-
- 24 Nov 2015 (1 commit)

Committed by Vasili Svirski
* Fixed conversions from 'size_t' to narrower types by adding static_cast.

Tested:
* built the solution locally on Windows and Linux
* ran the tests
* CI system build successful
-
- 21 Nov 2015 (1 commit)

Committed by yuslepukhin
-
- 17 Nov 2015 (3 commits)

Committed by Dmitri Smirnov
-
Committed by Dmitri Smirnov
-
Committed by Islam AbdelRahman
Summary: Ran `arc2 lint --everything` (the linter over the whole code repo) to fix existing lint issues.

Test Plan: make check -j64

Reviewers: sdong, rven, anthony, kradhakrishnan, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D50769
-
- 11 Nov 2015 (2 commits)

Committed by Dmitri Smirnov
in 64-bit. Currently, a signed off_t type is used for both the offset and the length in bytes in the following interfaces: `Allocate` and `RangeSync`. On Linux, `off_t` is automatically either 32- or 64-bit depending on the platform. On Windows it is always a 32-bit signed long, which limits file access, and in particular space pre-allocation, to effectively 2 GB. The proposal is to replace off_t with uint64_t as a portable type and always access files via 64-bit interfaces. This may need modifications to posix code, but we lack the resources to test them.
-
Committed by Dmitri Smirnov
in 64-bit. Currently, a signed off_t type is used for both the offset and the length in bytes in the following interfaces: `Allocate` and `RangeSync`. On Linux, `off_t` is automatically either 32- or 64-bit depending on the platform. On Windows it is always a 32-bit signed long, which limits file access, and in particular space pre-allocation, to effectively 2 GB. The proposal is to replace off_t with uint64_t as a portable type and always access files via 64-bit interfaces. This may need modifications to posix code, but we lack the resources to test them.
-
- 30 Oct 2015 (1 commit)

Committed by sdong
Summary: Run "make format" for some recent commits.

Test Plan: Build and run tests.

Reviewers: IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D49707
-
- 28 Oct 2015 (1 commit)

Committed by Dmitri Smirnov
Introduce a new DBOption, random_access_max_buffer_size, to limit the size of the random access buffer used for unbuffered access. Implement read-ahead buffering when enabled; to that effect, propagate compaction_readahead_size and the new option to the env options to make them available to the implementation. Add a Hint() override so the SetupForCompaction() call invokes Hint(); readahead can now be set up from both Hint() and EnableReadAhead(). Add support for the new option random_access_max_buffer_size to db_bench and options_helper to make it string-parsable, and add a unit test.
-
- 16 Oct 2015 (1 commit)

Committed by sdong
Summary: Give a name to every kill point, and allow users to disable some kill points based on prefixes. The kill points can be passed by db_stress through a command line parameter. This provides a way for users to boost the chance of triggering low-frequency kill points, and allows follow-up changes in crash test scripts to improve crash test coverage.

Test Plan: Manually run db_stress with various values of --kill_random_test and --kill_prefix_blacklist. Like this:

    --kill_random_test=2 --kill_prefix_blacklist=Posix,WritableFileWriter::Append,WritableFileWriter::WriteBuffered,WritableFileWriter::Sync

Reviewers: igor, kradhakrishnan, rven, IslamAbdelRahman, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D48735
-
- 14 Oct 2015 (1 commit)

Committed by Praveen Rao
-
- 13 Oct 2015 (1 commit)

Committed by Praveen Rao
-
- 07 Oct 2015 (1 commit)

Committed by Dmitri Smirnov
Summary: This mirrors https://reviews.facebook.net/D45645. Currently, mmap returns IOError when a user tries to read data past the end of the file. This diff changes that behavior: now we return just the bytes that we can, and report the size we returned via a Slice result. This is consistent with non-mmap behavior and also with the pread() system call.
-