1. 28 May, 2016 1 commit
    • Handle overflow case of rate limiter's parameters · f62fbd2c
      sdong committed
      Summary: When rate_bytes_per_sec * refill_period_us_ overflows, the effective rate limit becomes very low. Handle this case so that the rate stays large.
      
      Test Plan: Add a unit test for it.
      
      Reviewers: IslamAbdelRahman, andrewkr
      
      Reviewed By: andrewkr
      
      Subscribers: yiwu, lightmark, leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D58929
      f62fbd2c
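The overflow described in this commit can be illustrated with a minimal sketch. The helper name `RefillBytesPerPeriod` and its signature are assumptions made for illustration, not the code from the diff; it only shows one way to detect that `rate_bytes_per_sec * refill_period_us` would overflow a signed 64-bit integer and to return a large value instead of a wrapped (very small or negative) one.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not the actual RocksDB function): number of bytes
// that may be granted in one refill period of `refill_period_us` microseconds.
int64_t RefillBytesPerPeriod(int64_t rate_bytes_per_sec,
                             int64_t refill_period_us) {
  assert(rate_bytes_per_sec > 0 && refill_period_us > 0);
  const int64_t kMax = std::numeric_limits<int64_t>::max();
  // For positive integers, rate * period > kMax exactly when
  // kMax / rate < period, so test by division before multiplying.
  if (kMax / rate_bytes_per_sec < refill_period_us) {
    // The product would overflow. The result is inexact, but it is large,
    // which matches the intent of a very high rate limit.
    return kMax / 1000000;
  }
  return rate_bytes_per_sec * refill_period_us / 1000000;
}
```

Checking by division first avoids ever evaluating the overflowing product, which would be undefined behavior for signed integers in C++.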
  2. 20 May, 2016 2 commits
  3. 17 May, 2016 1 commit
  4. 13 May, 2016 1 commit
  5. 30 April, 2016 1 commit
  6. 29 April, 2016 1 commit
  7. 28 April, 2016 1 commit
  8. 23 April, 2016 2 commits
    • Alpine Linux Build (#990) · b71c4e61
      dx9 committed
      * Musl libc does not provide adaptive mutex. Added feature test for PTHREAD_MUTEX_ADAPTIVE_NP.
      
      * Musl libc does not provide backtrace(3). Added a feature check for backtrace(3).
      
      * Fixed compiler error.
      
      * Musl libc does not implement backtrace(3). Added platform check for libexecinfo.
      
      * Alpine does not appear to support gcc -pg option. By default (gcc has PIE option enabled) it fails with:
      
      gcc: error: -pie and -pg|p|profile are incompatible when linking
      
      When -fno-PIE and -nopie are used it fails with:
      
      /usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find gcrt1.o: No such file or directory
      
      Added gcc -pg platform test and output PROFILING_FLAGS accordingly. Replaced pg var in Makefile with PROFILING_FLAGS.
      
      * fix segfault when TEST_IOCTL_FRIENDLY_TMPDIR is undefined and default candidates are not suitable
      
      * use ASSERT_DOUBLE_EQ instead of ASSERT_EQ
      
      * When compiled with ROCKSDB_MALLOC_USABLE_SIZE, the UniversalCompactionFourPaths and UniversalCompactionSecondPathRatio tests fail due to premature memtable flushes on systems with 16-byte alignment. The arena runs out of block space before GenerateNewFile() completes.
      
      Increased options.write_buffer_size.
      b71c4e61
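A feature test of the kind this commit mentions could be consumed in code roughly as follows. This is a sketch under stated assumptions: the macro `HAVE_PTHREAD_ADAPTIVE_MUTEX` and the function `InitMutex` are hypothetical names, and the macro is assumed to be defined by the build script only when a probe compile using `PTHREAD_MUTEX_ADAPTIVE_NP` succeeds. A build-time probe is used because glibc declares `PTHREAD_MUTEX_ADAPTIVE_NP` as an enumerator, so a plain `#ifdef` on that name would not detect it.

```cpp
#include <pthread.h>

#include <cassert>

// Hypothetical wrapper: initializes `mu`, using an adaptive mutex only when
// the build detected support for it. Returns 0 on success, otherwise the
// pthread error code.
int InitMutex(pthread_mutex_t* mu, bool adaptive) {
  assert(mu != nullptr);
#ifdef HAVE_PTHREAD_ADAPTIVE_MUTEX  // assumed to be set by the build script
  if (adaptive) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    if (rc == 0) rc = pthread_mutex_init(mu, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
  }
#else
  (void)adaptive;  // No adaptive mutex (e.g. musl): use the default type.
#endif
  return pthread_mutex_init(mu, nullptr);
}
```

On a libc without adaptive mutexes the caller's request is silently downgraded to a default mutex, so the same call site compiles on both glibc and musl.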
    • P
      b54c3474
  9. 20 April, 2016 1 commit
  10. 01 April, 2016 1 commit
    • Fixed compile warnings in posix_logger.h and coding.h · a558830f
      Yueh-Hsuan Chiang committed
      Summary:
      Fixed the following compile warnings:
      
      /Users/yhchiang/rocksdb/util/posix_logger.h:32:11: error: unused variable 'kDebugLogChunkSize' [-Werror,-Wunused-const-variable]
      const int kDebugLogChunkSize = 128 * 1024;
                ^
      /Users/yhchiang/rocksdb/util/coding.h:24:20: error: unused variable 'kMaxVarint32Length' [-Werror,-Wunused-const-variable]
      const unsigned int kMaxVarint32Length = 5;
                         ^
      2 errors generated.
      
      Test Plan: make clean rocksdb
      
      Reviewers: igor, sdong, anthony, IslamAbdelRahman, rven, kradhakrishnan, adamretter
      
      Reviewed By: adamretter
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D56223
      a558830f
  11. 18 March, 2016 1 commit
  12. 20 February, 2016 1 commit
  13. 18 February, 2016 1 commit
  14. 10 February, 2016 2 commits
    • Updated all copyright headers to the new format. · 21e95811
      Baraa Hamodi committed
      21e95811
    • Env function for bulk metadata retrieval · 59b3ee65
      Andrew Kryczka committed
      Summary:
      Added this new function, which returns filename, size, and modified
      timestamp for each file in the provided directory. The default implementation
      retrieves the metadata sequentially using existing functions. In the next diff
      I'll make HdfsEnv override this function to use libhdfs's bulk get function.
      
      This won't work on Windows due to the path separator.
      
      Test Plan:
      new unit test
      
        $ ./env_test --gtest_filter=EnvPosixTest.ConsistentChildrenMetadata
      
      Reviewers: yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: IslamAbdelRahman, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D53781
      59b3ee65
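The default behavior described in this commit (collecting name, size, and modification time for each file by calling existing per-file functions one after another) might look like the sketch below. The struct `ChildMetadata` and the function `GetChildrenMetadata` are hypothetical names, and plain POSIX calls stand in for the Env methods; this is not the code from the diff.

```cpp
#include <dirent.h>
#include <sys/stat.h>

#include <cassert>
#include <cstdint>
#include <ctime>
#include <string>
#include <vector>

// Hypothetical result type: one entry per file in the directory.
struct ChildMetadata {
  std::string name;
  uint64_t size_bytes;
  time_t modified_time;
};

// Hypothetical sequential implementation: lists `dir`, then stats each child
// individually. Returns false if the directory cannot be opened.
bool GetChildrenMetadata(const std::string& dir,
                         std::vector<ChildMetadata>* out) {
  assert(out != nullptr);
  out->clear();
  DIR* d = opendir(dir.c_str());
  if (d == nullptr) return false;
  while (struct dirent* entry = readdir(d)) {
    const std::string name = entry->d_name;
    if (name == "." || name == "..") continue;
    struct stat st;
    // The hard-coded '/' separator is the kind of detail that prevents this
    // approach from working unchanged on Windows.
    if (stat((dir + "/" + name).c_str(), &st) != 0) continue;  // file vanished
    out->push_back({name, static_cast<uint64_t>(st.st_size), st.st_mtime});
  }
  closedir(d);
  return true;
}
```

A backend with a native bulk call can replace the per-file `stat` loop with a single request, which is the motivation the commit gives for making this an overridable function.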
  15. 02 February, 2016 3 commits
    • T
    • Making use of GetSystemTimePreciseAsFileTime dynamic to not · 502d41f1
      Tomas Kolda committed
      break compatibility with Windows 7. The issue with rotated logs
      was fixed another way.
      502d41f1
    • Enable per-request buffer allocation in RandomAccessFile · 36300fbb
      Dmitri Smirnov committed
       This change impacts only non-buffered I/O on Windows.
       Currently, there is one buffer per RandomAccessFile
       instance, protected by a lock. We maintain the buffer
       because non-buffered I/O requires an aligned buffer to work.
       XPerf traces demonstrate that we accumulate considerable
       wait time on that lock.
       This change allows setting the random access buffer size to zero,
       which indicates per-request allocation.
       We expect the allocation expense to be much less than
       I/O costs plus wait time, because the memory heap
       tends to re-use page-aligned allocations, especially with
       Jemalloc.
       This change does not affect use of the buffer as a read_ahead_buffer for
       compaction purposes.
      36300fbb
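Per-request allocation of an aligned buffer, as opposed to one shared buffer guarded by a lock, can be sketched as below. All names here (`AlignedRequest`, `AlignRequest`, `AllocatePerRequest`) are hypothetical, and POSIX `posix_memalign` is used as a stand-in for the Windows allocator, since the commit itself targets Windows.

```cpp
#include <stdlib.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Frees memory obtained from posix_memalign.
struct FreeDeleter {
  void operator()(void* p) const { free(p); }
};
using AlignedBuffer = std::unique_ptr<char, FreeDeleter>;

// The aligned file range that covers a requested [offset, offset + n).
struct AlignedRequest {
  uint64_t offset;
  size_t size;
};

// Rounds the start down and the end up to a multiple of `alignment`,
// which must be a power of two.
AlignedRequest AlignRequest(uint64_t offset, size_t n, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const uint64_t mask = ~static_cast<uint64_t>(alignment - 1);
  const uint64_t start = offset & mask;
  const uint64_t end = (offset + n + alignment - 1) & mask;
  return {start, static_cast<size_t>(end - start)};
}

// Allocates a fresh aligned buffer for a single request, so no lock on a
// shared buffer is needed. Returns an empty pointer on failure.
AlignedBuffer AllocatePerRequest(size_t size, size_t alignment) {
  void* p = nullptr;
  if (posix_memalign(&p, alignment, size) != 0) return nullptr;
  return AlignedBuffer(static_cast<char*>(p));
}
```

The trade-off matches the commit's reasoning: each read pays for an allocation, but concurrent readers of the same file no longer serialize on one buffer.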
  16. 14 January, 2016 1 commit
  17. 05 January, 2016 1 commit
  18. 26 December, 2015 1 commit
    • support for concurrent adds to memtable · 7d87f027
      Nathan Bronson committed
      Summary:
      This diff adds support for concurrent adds to the skiplist memtable
      implementations.  Memory allocation is made thread-safe by the addition of
      a spinlock, with small per-core buffers to avoid contention.  Concurrent
      memtable writes are made via an additional method and don't impose a
      performance overhead on the non-concurrent case, so parallelism can be
      selected on a per-batch basis.
      
      Write thread synchronization is an increasing bottleneck for higher levels
      of concurrency, so this diff adds --enable_write_thread_adaptive_yield
      (default off).  This feature causes threads joining a write batch
      group to spin for a short time (default 100 usec) using sched_yield,
      rather than going to sleep on a mutex.  If the timing of the yield calls
      indicates that another thread has actually run during the yield then
      spinning is avoided.  This option improves performance for concurrent
      situations even without parallel adds, although it has the potential to
      increase CPU usage (and the heuristic adaptation is not yet mature).
      
      Parallel writes are not currently compatible with
      inplace updates, update callbacks, or delete filtering.
      Enable it with --allow_concurrent_memtable_write (and
      --enable_write_thread_adaptive_yield).  Parallel memtable writes
      are performance neutral when there is no actual parallelism, and in
      my experiments (SSD server-class Linux and varying contention and key
      sizes for fillrandom) they are always a performance win when there is
      more than one thread.
      
      Statistics are updated earlier in the write path, dropping the number
      of DB mutex acquisitions from 2 to 1 for almost all cases.
      
      This diff was motivated and inspired by Yahoo's cLSM work.  It is more
      conservative than cLSM: RocksDB's write batch group leader role is
      preserved (along with all of the existing flush and write throttling
      logic) and concurrent writers are blocked until all memtable insertions
      have completed and the sequence number has been advanced, to preserve
      linearizability.
      
      My test config is "db_bench -benchmarks=fillrandom -threads=$T
      -batch_size=1 -memtablerep=skip_list -value_size=100 --num=1000000/$T
      -level0_slowdown_writes_trigger=9999 -level0_stop_writes_trigger=9999
      -disable_auto_compactions --max_write_buffer_number=8
      -max_background_flushes=8 --disable_wal --write_buffer_size=160000000
      --block_size=16384 --allow_concurrent_memtable_write" on a two-socket
      Xeon E5-2660 @ 2.2 GHz with lots of memory and an SSD.  With 1
      thread I get ~440Kops/sec.  Peak performance for 1 socket (numactl
      -N1) is slightly more than 1Mops/sec, at 16 threads.  Peak performance
      across both sockets happens at 30 threads, and is ~900Kops/sec, although
      with fewer threads there is less performance loss when the system has
      background work.
      
      Test Plan:
      1. concurrent stress tests for InlineSkipList and DynamicBloom
      2. make clean; make check
      3. make clean; DISABLE_JEMALLOC=1 make valgrind_check; valgrind db_bench
      4. make clean; COMPILE_WITH_TSAN=1 make all check; db_bench
      5. make clean; COMPILE_WITH_ASAN=1 make all check; db_bench
      6. make clean; OPT=-DROCKSDB_LITE make check
      7. verify no perf regressions when disabled
      
      Reviewers: igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: MarkCallaghan, IslamAbdelRahman, anthony, yhchiang, rven, sdong, guyg8, kradhakrishnan, dhruba
      
      Differential Revision: https://reviews.facebook.net/D50589
      7d87f027
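The yield-then-block idea behind the adaptive yield option can be sketched as follows. This is a simplified illustration, not the write-thread code from the diff: the class `Waiter` is hypothetical, it uses `std::this_thread::yield` in place of `sched_yield`, and it omits the heuristic that stops spinning when the timing of the yield calls shows that another thread has run.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot waiter: spin with yield for a bounded time, then
// fall back to sleeping on a condition variable.
class Waiter {
 public:
  // Returns true if the notification was observed while spinning, false if
  // the caller had to block on the condition variable.
  bool Wait(std::chrono::microseconds max_yield) {
    assert(max_yield.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + max_yield;
    do {
      if (done_.load(std::memory_order_acquire)) return true;
      std::this_thread::yield();
    } while (std::chrono::steady_clock::now() < deadline);
    // Spinning did not help within the budget: sleep until notified.
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return done_.load(std::memory_order_acquire); });
    return false;
  }

  void Notify() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      done_.store(true, std::memory_order_release);
    }
    cv_.notify_all();
  }

 private:
  std::atomic<bool> done_{false};
  std::mutex mu_;
  std::condition_variable cv_;
};
```

With a budget such as the 100 usec default mentioned above, a waiter whose condition is satisfied quickly never touches the mutex, at the cost of extra CPU use while spinning.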
  19. 15 December, 2015 1 commit
    • Running manual compactions in parallel with other automatic or manual... · 030215bf
      Venkatesh Radhakrishnan committed
      Running manual compactions in parallel with other automatic or manual compactions in restricted cases
      
      Summary:
      This diff provides a framework for doing manual
      compactions in parallel with other compactions. We now have a deque of manual compactions. We also pass manual compactions as an argument from RunManualCompactions down to
      BackgroundCompactions, so that RunManualCompactions can be reentrant.
      Parallelism is controlled by the two routines
      ConflictingManualCompaction to allow/disallow new parallel/manual
      compactions based on already existing ManualCompactions. In this diff, manual compactions by default still have to run exclusive of other compactions. However, by setting the compaction option exclusive_manual_compaction to false, it is possible to run other compactions in parallel with a manual compaction. We are still restricted to one manual compaction per column family at a time. All of these restrictions will be relaxed in future diffs.
      I will be adding more tests later.
      
      Test Plan: Rocksdb regression + new tests + valgrind
      
      Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: yoshinorim, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D47973
      030215bf
  20. 12 December, 2015 1 commit
  21. 11 December, 2015 1 commit
  22. 09 December, 2015 1 commit
  23. 24 November, 2015 1 commit
    • Enable C4267 warning · 41b32c60
      Vasili Svirski committed
      * Fixed conversions from 'size_t' to 'type' by adding static_cast
      
      Tested:
      * built the solution on Windows and Linux locally
      * ran tests
      * CI system build was successful
      41b32c60
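The kind of change this commit describes, making a narrowing conversion from `size_t` explicit with `static_cast`, can be sketched as below. The function `Length32` is a hypothetical example, not code from the diff; the added assertion is an extra safeguard that the commit message does not mention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical example: returns the length of `s` as a 32-bit value.
// Writing `uint32_t n = s.size();` triggers MSVC warning C4267 on 64-bit
// builds because size_t is wider than uint32_t there.
uint32_t Length32(const std::string& s) {
  // Extra safeguard (an assumption of this sketch): verify the value fits.
  assert(s.size() <= std::numeric_limits<uint32_t>::max());
  return static_cast<uint32_t>(s.size());
}
```

The cast silences the warning by stating the narrowing intent; it does not by itself make an out-of-range value safe, which is why the sketch checks the range first.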
  24. 21 November, 2015 1 commit
  25. 17 November, 2015 3 commits
  26. 11 November, 2015 2 commits
    • Make use of portable `uint64_t` type to make possible file access · 5270b33b
      Dmitri Smirnov committed
        in 64-bit.
      
        Currently, a signed off_t type is being used for the following
        interfaces for both offset and the length in bytes:
        * `Allocate`
        * `RangeSync`
      
        On Linux `off_t` is automatically either 32 or 64-bit depending on
        the platform. On Windows it is always a 32-bit signed long, which
        limits file access, and in particular space pre-allocation,
        to effectively 2 GB.
      
        The proposal is to replace off_t with uint64_t as a portable type and
        always access files with 64-bit interfaces.
      
        May need to modify the POSIX code, but we lack the resources to test it.
      5270b33b
    • Make use of portable `uint64_t` type to make possible file access · 5421c972
      Dmitri Smirnov committed
        in 64-bit.
      
        Currently, a signed off_t type is being used for the following
        interfaces for both offset and the length in bytes:
        * `Allocate`
        * `RangeSync`
      
        On Linux `off_t` is automatically either 32 or 64-bit depending on
        the platform. On Windows it is always a 32-bit signed long, which
        limits file access, and in particular space pre-allocation,
        to effectively 2 GB.
      
        The proposal is to replace off_t with uint64_t as a portable type and
        always access files with 64-bit interfaces.
      
        May need to modify the POSIX code, but we lack the resources to test it.
      5421c972
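The 2 GB limit described in these two commits follows from the range of a 32-bit signed integer. The sketch below uses hypothetical helpers (`FitsInSigned`, `CheckedToSigned`) to show how an interface that accepts `uint64_t` might verify a value before handing it to a platform call that still takes a narrower signed type; it is an illustration, not code from the diffs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if `value` is representable in the signed
// integer type T (for example a 32-bit or 64-bit offset type).
template <typename T>
bool FitsInSigned(uint64_t value) {
  return value <= static_cast<uint64_t>(std::numeric_limits<T>::max());
}

// Hypothetical helper: converts after checking, for use at the boundary
// between a uint64_t interface and a platform call with a signed type.
template <typename T>
T CheckedToSigned(uint64_t value) {
  assert(FitsInSigned<T>(value));
  return static_cast<T>(value);
}
```

With a 32-bit signed type the largest representable offset is 2147483647, one byte short of 2 GiB, while a 64-bit type accepts far larger values.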
  27. 30 October, 2015 1 commit
  28. 28 October, 2015 1 commit
    • Implement smart buffer management. · 6fbc4f9f
      Dmitri Smirnov committed
        Introduce a new DBOption, random_access_max_buffer_size, to limit
        the size of the random access buffer used for unbuffered access.
        Implement read-ahead buffering when enabled.
        To that effect, propagate compaction_readahead_size and the new option
        to the env options to make them available to the implementation.
        Add a Hint() override so that the SetupForCompaction() call calls Hint();
        readahead can now be set up from both Hint() and EnableReadAhead().
        Add support for the new random_access_max_buffer_size option to
        db_bench and options_helper (to make it string-parsable),
        and to the unit test.
      6fbc4f9f
  29. 16 October, 2015 1 commit
    • Allow users to disable some kill points in db_stress · e1a5ff85
      sdong committed
      Summary:
      Give a name to every kill point, and allow users to disable some kill points based on prefixes. The kill points to disable can be passed to db_stress through a command-line parameter. This provides a way for users to boost the chance of triggering low-frequency kill points.
      This allows follow-up changes in crash test scripts to improve crash test coverage.
      
      Test Plan:
      Manually run db_stress with variable values of --kill_random_test and --kill_prefix_blacklist. Like this:
       --kill_random_test=2 --kill_prefix_blacklist=Posix,WritableFileWriter::Append,WritableFileWriter::WriteBuffered,WritableFileWriter::Sync
      
      Reviewers: igor, kradhakrishnan, rven, IslamAbdelRahman, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48735
      e1a5ff85
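Prefix-based disabling of named kill points, as described in this commit, can be sketched as below. The function names `SplitPrefixes` and `IsKillPointDisabled` are hypothetical and the matching rule (a kill point is disabled when its name starts with any listed prefix) is an assumption drawn from the summary, not the code from the diff.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a comma-separated flag value such as the one
// given to --kill_prefix_blacklist into individual prefixes.
std::vector<std::string> SplitPrefixes(const std::string& csv) {
  std::vector<std::string> prefixes;
  std::stringstream stream(csv);
  std::string item;
  while (std::getline(stream, item, ',')) {
    if (!item.empty()) prefixes.push_back(item);
  }
  return prefixes;
}

// Hypothetical helper: true if the kill point `name` starts with any of the
// given prefixes and should therefore be skipped.
bool IsKillPointDisabled(const std::string& name,
                         const std::vector<std::string>& prefixes) {
  for (const std::string& prefix : prefixes) {
    assert(!prefix.empty());  // an empty prefix would disable everything
    if (name.compare(0, prefix.size(), prefix) == 0) return true;
  }
  return false;
}
```

Disabling the frequently hit kill points this way leaves the random kill budget for the rarely reached ones, which is the coverage benefit the summary describes.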
  30. 14 October, 2015 1 commit
  31. 13 October, 2015 1 commit
  32. 07 October, 2015 1 commit