1. 18 Nov, 2015 1 commit
    • DBTest.MergeTestTime to only use fake time to be deterministic · d5540e18
      sdong committed
      Summary: DBTest.MergeTestTime is a test verifying timing counters. Depending on real time may cause non-deterministic results. Change to fake time to be deterministic.
      
      Test Plan: Run the test and make sure it passes
      
      Reviewers: yhchiang, anthony, rven, kradhakrishnan, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D50883
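
The fake-time idea can be sketched as follows. This is a minimal illustration and not the code from this commit: `Clock`, `FakeClock`, and `TimingCounter` are hypothetical names, and the real test swaps its time source through RocksDB's Env, which is not reproduced here.

```cpp
#include <cstdint>

// Hypothetical clock interface (an assumption for this sketch).
class Clock {
 public:
  virtual ~Clock() = default;
  virtual uint64_t NowMicros() = 0;
};

// A fake clock only moves when the test advances it, so elapsed-time
// counters take exactly the values the test expects.
class FakeClock : public Clock {
 public:
  uint64_t NowMicros() override { return now_micros_; }
  void Advance(uint64_t micros) { now_micros_ += micros; }

 private:
  uint64_t now_micros_ = 0;
};

// Accumulates elapsed time between Start() and Stop(), the way a timing
// counter would, reading time only through the injected clock.
class TimingCounter {
 public:
  explicit TimingCounter(Clock* clock) : clock_(clock) {}
  void Start() { start_ = clock_->NowMicros(); }
  void Stop() { total_micros_ += clock_->NowMicros() - start_; }
  uint64_t total_micros() const { return total_micros_; }

 private:
  Clock* clock_;
  uint64_t start_ = 0;
  uint64_t total_micros_ = 0;
};
```

With a `FakeClock`, a test can advance time by a fixed amount between `Start()` and `Stop()` and assert on the exact counter value, independent of machine load.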
  2. 19 Oct, 2015 1 commit
  3. 17 Oct, 2015 1 commit
  4. 14 Oct, 2015 2 commits
    • Make db_test_util compile under ROCKSDB_LITE · f55d3009
      Islam AbdelRahman committed
      Summary: db_test_util is used in multiple test files, but it doesn't compile under ROCKSDB_LITE.
      
      Test Plan:
      make check
      make static_lib
      OPT=-DROCKSDB_LITE make db_wal_test
      
      Reviewers: igor, yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48579
    • Separate InternalIterator from Iterator · 35ad531b
      sdong committed
      Summary:
      Separate a new class, InternalIterator, from class Iterator for look-ups that are done internally, which also means that it operates on keys with sequence ID and type.
      
      This change will enable potential future optimizations, but for now InternalIterator's functions are still the same as Iterator's.
      At the same time, move the cleanup function to a separate class and let both InternalIterator and Iterator inherit from it.
      
      Test Plan: Run all existing tests.
      
      Reviewers: igor, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, rven
      
      Reviewed By: rven
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48549
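
The class split described above might look roughly like the sketch below. All names here (`CleanupBase`, `UserIterator`, `InternalIter`, `EmptyInternalIter`) are hypothetical stand-ins chosen for illustration, and the interfaces are reduced to two methods; the actual RocksDB classes have more functions and different signatures.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical shared base: runs registered cleanup callbacks on destruction.
class CleanupBase {
 public:
  CleanupBase() = default;
  CleanupBase(const CleanupBase&) = delete;
  CleanupBase& operator=(const CleanupBase&) = delete;
  virtual ~CleanupBase() {
    for (auto& fn : cleanups_) fn();
  }
  void RegisterCleanup(std::function<void()> fn) {
    cleanups_.push_back(std::move(fn));
  }

 private:
  std::vector<std::function<void()>> cleanups_;
};

// User-facing iterator: keys are plain user keys.
class UserIterator : public CleanupBase {
 public:
  virtual bool Valid() const = 0;
  virtual void Next() = 0;
};

// Internal iterator: the same interface for now, but its keys would carry
// a sequence ID and type. Keeping it a distinct type lets the two diverge.
class InternalIter : public CleanupBase {
 public:
  virtual bool Valid() const = 0;
  virtual void Next() = 0;
};

// Trivial concrete internal iterator, used only to exercise the sketch.
class EmptyInternalIter final : public InternalIter {
 public:
  bool Valid() const override { return false; }
  void Next() override {}
};
```

Because the two iterator types no longer share an interface class, code that expects user keys cannot be handed an internal iterator by accident, while cleanup registration stays in one place.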
  5. 13 Oct, 2015 1 commit
  6. 07 Oct, 2015 1 commit
    • Support for LevelDB SST with .ldb suffix · 02675026
      dyniusz committed
      Summary:
      Handle SST files with both ".sst" and ".ldb" suffixes.
      This enables users to migrate from LevelDB to RocksDB.
      
      Test Plan:
      Added a unit test with a DB operating on SSTs under both naming schemes.
      See db/db_test.cc:SSTsWithLdbSuffixHandling for details.
      
      Reviewers: yhchiang, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D48003
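
Accepting both suffixes could be sketched like this. `ParseTableFileName` is a hypothetical helper written for this illustration, not RocksDB's own file-name parser, and it assumes table files are named as a run of decimal digits followed by the suffix. Overflow of very long digit runs is not handled.

```cpp
#include <cstdint>
#include <string>

// Returns true and stores the file number if `name` is "<digits>.sst" or
// "<digits>.ldb". Hypothetical helper for illustration only.
bool ParseTableFileName(const std::string& name, uint64_t* number) {
  const std::string::size_type dot = name.rfind('.');
  if (dot == std::string::npos || dot == 0) return false;

  const std::string suffix = name.substr(dot);
  if (suffix != ".sst" && suffix != ".ldb") return false;

  uint64_t value = 0;
  for (std::string::size_type i = 0; i < dot; ++i) {
    const char c = name[i];
    if (c < '0' || c > '9') return false;
    value = value * 10 + static_cast<uint64_t>(c - '0');
  }
  *number = value;
  return true;
}
```

For example, both `000123.sst` and `000045.ldb` are recognized as table files, while `MANIFEST-000001` and `000123.log` are rejected.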
  7. 24 Sep, 2015 1 commit
    • PlainTableReader to support non-mmap mode · df34aea3
      sdong committed
      Summary:
      PlainTableReader currently allows only mmap mode. Add support for non-mmap mode for more flexibility.
      Refactor the code to move all data-reading logic into PlainTableKeyDecoder, and consolidate the reads into calls to Read() and ReadVarint32(). Implement these calls separately for the mmap and non-mmap cases. In non-mmap mode, make copies of keys in several places where the buffer needs to move after the keys are read.
      
      Test Plan: Add a non-mmap mode to plain_table_db_test. Run it under valgrind too.
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D47187
  8. 18 Sep, 2015 1 commit
    • Support for SingleDelete() · 014fd55a
      Andres Noetzli committed
      Summary:
      This patch fixes #7460559. It introduces SingleDelete as a new database
      operation. This operation can be used to delete keys that were never
      overwritten (no put following another put of the same key). If an overwritten
      key is single deleted the behavior is undefined. Single deletion of a
      non-existent key has no effect but multiple consecutive single deletions are
      not allowed (see limitations).
      
      In contrast to the conventional Delete() operation, the deletion entry is
      removed along with the value when the two are lined up in a compaction. Note:
      The semantics are similar to @igor's prototype, which allowed this
      behavior at the granularity of a column family (
      https://reviews.facebook.net/D42093 ). This new patch, however, is more
      aggressive when it comes to removing tombstones: It removes the SingleDelete
      together with the value whenever there is no snapshot between them while the
      older patch only did this when the sequence number of the deletion was older
      than the earliest snapshot.
      
      Most of the complex additions are in the Compaction Iterator, all other changes
      should be relatively straightforward. The patch also includes basic support for
      single deletions in db_stress and db_bench.
      
      Limitations:
      - Not compatible with cuckoo hash tables
      - Single deletions cannot be used in combination with merges and normal
        deletions on the same key (other keys are not affected by this)
      - Consecutive single deletions are currently not allowed (an older version of
        this patch supported this, so it could be resurrected if needed)
      
      Test Plan: make all check
      
      Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor
      
      Reviewed By: igor
      
      Subscribers: maykov, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43179
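
The compaction rule described in the summary (drop a SingleDelete together with the value it covers when no snapshot lies between them) can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the Compaction Iterator logic from the patch: it handles the versions of a single key only, it treats a snapshot `s` as "between" when it can see the Put but not the SingleDelete, and the names `Entry`, `SnapshotBetween`, and `CompactKey` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class EntryType { kPut, kSingleDelete };

struct Entry {
  uint64_t seq;  // sequence number; larger means newer
  EntryType type;
};

// True if some snapshot can see the entry at `older` but not the one at
// `newer`, i.e. older <= snapshot < newer.
bool SnapshotBetween(const std::vector<uint64_t>& snapshots, uint64_t older,
                     uint64_t newer) {
  for (uint64_t s : snapshots) {
    if (s >= older && s < newer) return true;
  }
  return false;
}

// `entries` holds all versions of ONE key, newest first. A SingleDelete
// directly followed by a Put is dropped together with that Put unless a
// snapshot separates them. Everything else is kept unchanged.
std::vector<Entry> CompactKey(const std::vector<Entry>& entries,
                              const std::vector<uint64_t>& snapshots) {
  std::vector<Entry> out;
  for (std::size_t i = 0; i < entries.size(); ++i) {
    const bool drop_pair =
        entries[i].type == EntryType::kSingleDelete &&
        i + 1 < entries.size() &&
        entries[i + 1].type == EntryType::kPut &&
        !SnapshotBetween(snapshots, entries[i + 1].seq, entries[i].seq);
    if (drop_pair) {
      ++i;  // skip the Put as well: both entries disappear
      continue;
    }
    out.push_back(entries[i]);
  }
  return out;
}
```

In this model, a Put at sequence 10 followed by a SingleDelete at sequence 20 vanishes entirely when the only snapshot is at 5 or at 25, but both entries survive when a snapshot at 15 still needs to see the Put.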
  9. 17 Sep, 2015 1 commit
  10. 12 Sep, 2015 1 commit
    • Refactor to support file_reader_writer on Windows. · 30e82d5c
      Dmitri Smirnov committed
      Summary: The change https://reviews.facebook.net/differential/diff/224721/
      attempted to move common functionality out of platform-dependent
      code into a new facility called file_reader_writer.
      This includes:
      - Perf counters
      - Buffering
      - Rate limiting
      
      However, that change did not attempt to refactor the Windows code.
      To mitigate, we introduce new querying interfaces such as UseOSBuffer(),
      GetRequiredBufferAlignment() and ReaderWriterForward()
      for pure forwarding where required.
      WritableFile gets a new method, Truncate(). It tells
      the file how much data it holds on close.
      - When space is pre-allocated on Linux it is filled with zeros implicitly;
        no such thing exists on Windows, so we must truncate the file on close.
      - When operating in unbuffered mode the last page is filled with zeros, but we still want to truncate.
      
      Previously, Close() would take care of this, but now buffer management has shifted to the wrappers and the file has
      no idea of its true size.
      
      This means that Close() at the wrapper level must always include
      Truncate(), and the wrapper destructor should call Close() and
      guard against a double Close().
      Move buffered/unbuffered write logic to the wrapper.
      Utilize the aligned buffer class.
      Adjust tests and implement Truncate() where necessary.
      Come up with reasonable defaults for new virtual interfaces.
      Forward calls for RandomAccessReadAhead class to avoid double
      buffering and locking (double locking in unbuffered mode on Windows).
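
The Close()/Truncate() contract described above can be sketched with an in-memory stand-in. This is an illustration under stated assumptions, not the patch's code: `FakeFile` and `WriterWrapper` are hypothetical names, the "file" accepts only whole pages (mimicking unbuffered mode), and `page_size` is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-memory file that can only be written in whole pages.
class FakeFile {
 public:
  void AppendPage(const std::string& page) { data_ += page; }
  void Truncate(std::size_t size) { data_.resize(size); }
  std::size_t size() const { return data_.size(); }

 private:
  std::string data_;
};

// The wrapper owns the buffering, so only it knows the true file size.
class WriterWrapper {
 public:
  WriterWrapper(FakeFile* file, std::size_t page_size)
      : file_(file), page_size_(page_size) {}
  WriterWrapper(const WriterWrapper&) = delete;
  WriterWrapper& operator=(const WriterWrapper&) = delete;
  ~WriterWrapper() { Close(); }  // destructor closes if the caller did not

  void Append(const std::string& bytes) {
    buffer_ += bytes;
    logical_size_ += bytes.size();
    // Flush every complete page; keep the partial tail buffered.
    while (buffer_.size() >= page_size_) {
      file_->AppendPage(buffer_.substr(0, page_size_));
      buffer_.erase(0, page_size_);
    }
  }

  void Close() {
    if (closed_) return;  // guard against a double Close()
    closed_ = true;
    if (!buffer_.empty()) {
      buffer_.resize(page_size_, '\0');  // zero-pad the last page
      file_->AppendPage(buffer_);
      buffer_.clear();
    }
    file_->Truncate(logical_size_);  // cut the padding back off
  }

 private:
  FakeFile* file_;
  std::size_t page_size_;
  std::string buffer_;
  std::size_t logical_size_ = 0;
  bool closed_ = false;
};
```

Writing 11 bytes through a wrapper with 8-byte pages puts 16 bytes on the "disk" at close time, and the final Truncate() brings the file back to 11.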
  11. 01 Sep, 2015 2 commits
    • Add Subcompactions to Universal Compaction Unit Tests · 8b689546
      Ari Ekmekji committed
      Summary:
      Now that the approach to parallelizing L0-L1 level-based
      compactions by breaking the compaction job into subcompactions is
      being extended to apply to universal compactions as well, the unit
      tests need to account for this and run the universal compaction
      tests with subcompactions both enabled and disabled.
      
      Test Plan: make all && make check
      
      Reviewers: sdong, igor, noetzli, anthony, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D45657
    • Arena usage to be calculated using malloc_usable_size() · 3d78eb66
      sdong committed
      Summary: malloc_usable_size() gets a better estimation of memory usage. It is already used to calculate block cache memory usage. Use it in arena too.
      
      Test Plan: Run all unit tests
      
      Reviewers: anthony, kradhakrishnan, rven, IslamAbdelRahman, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D43317
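
A rough sketch of accounting for memory with malloc_usable_size() follows. It is not the Arena code from this commit: `TrackingArena` and `UsableSize` are hypothetical names, and because malloc_usable_size() is a non-standard extension, the sketch only calls it when building against glibc and otherwise falls back to the requested size.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_usable_size(), a glibc extension
#endif

// Size actually reserved for `ptr` by the allocator, or the requested size
// when the extension is unavailable (assumption made for portability).
std::size_t UsableSize(void* ptr, std::size_t requested) {
#if defined(__GLIBC__)
  (void)requested;
  return malloc_usable_size(ptr);
#else
  (void)ptr;
  return requested;
#endif
}

// Hypothetical arena that tracks memory by usable size, not requested size.
class TrackingArena {
 public:
  TrackingArena() = default;
  TrackingArena(const TrackingArena&) = delete;
  TrackingArena& operator=(const TrackingArena&) = delete;
  ~TrackingArena() {
    for (void* block : blocks_) std::free(block);
  }

  void* AllocateBlock(std::size_t bytes) {
    void* block = std::malloc(bytes);
    if (block == nullptr) return nullptr;
    blocks_.push_back(block);
    usage_ += UsableSize(block, bytes);
    return block;
  }

  std::size_t MemoryUsage() const { return usage_; }

 private:
  std::vector<void*> blocks_;
  std::size_t usage_ = 0;
};
```

The reported usage is at least the sum of the requested sizes, and it can be larger when the allocator rounds blocks up.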
  12. 21 Aug, 2015 1 commit
    • Add options.new_table_reader_for_compaction_inputs · 9130873a
      sdong committed
      Summary: Currently, compaction inputs share the same file descriptor and table reader as other foreground threads. This makes fadvise behave less predictably. Add options.new_table_reader_for_compaction_inputs to force the creation of a new file descriptor and a new table reader for compaction inputs.
      
      Test Plan: Add the option.
      
      Reviewers: rven, anthony, kradhakrishnan, IslamAbdelRahman, igor, yhchiang
      
      Reviewed By: igor
      
      Subscribers: igor, MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D43311
  13. 06 Aug, 2015 1 commit
    • Add two unit tests for SyncWAL() · 7ccd1c80
      sdong committed
      Summary:
      Add two unit tests for SyncWAL(). One makes sure SyncWAL() doesn't block writes in another thread. The other makes sure SyncWAL() doesn't wait for ongoing writes to finish before being executed.
      
      Create a new test file db_wal_test and move two WAL related tests from db_test to here.
      
      Test Plan: Run the new tests
      
      Reviewers: IslamAbdelRahman, rven, kradhakrishnan, kolmike, tnovak, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D43605
  14. 05 Aug, 2015 2 commits
    • [wal changes 3/3] method in DB to sync WAL without blocking writers · e06cf1a0
      Mike Kolupaev committed
      Summary:
      Subj. We really need this feature.
      
      Previous diff D40899 has most of the changes to make this possible, this diff just adds the method.
      
      Test Plan: `make check`, the new test fails without this diff; ran with ASAN, TSAN and valgrind.
      
      Reviewers: igor, rven, IslamAbdelRahman, anthony, kradhakrishnan, tnovak, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: MarkCallaghan, maykov, hermanlee4, yoshinorim, tnovak, dhruba
      
      Differential Revision: https://reviews.facebook.net/D40905
    • Add DBOptions::skip_stats_update_on_db_open · 14d0bfa4
      Yueh-Hsuan Chiang committed
      Summary:
      UpdateAccumulatedStats() is used to optimize compaction decisions,
      especially when the number of deletion entries is high, but this function
      can slow down DB::Open(), especially in disk environments.
      
      This patch adds DBOptions::skip_stats_update_on_db_open, which skips
      UpdateAccumulatedStats() during DB::Open() when it is set to true.
      
      Test Plan: Add DBCompactionTest.SkipStatsUpdateTest
      
      Reviewers: igor, anthony, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: tnovak, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42843
  15. 04 Aug, 2015 1 commit
    • Parallelize L0-L1 Compaction: Restructure Compaction Job · 40c64434
      Ari Ekmekji committed
      Summary:
      As of now compactions involving files from Level 0 and Level 1 are single
      threaded because the files in L0, although sorted, are not range partitioned like
      the other levels. This means that during L0-L1 compaction each file from L1
      needs to be merged with potentially all the files from L0.
      
      This attempt to parallelize the L0-L1 compaction assigns a thread and a
      corresponding iterator to each L1 file that then considers only the key range
      found in that L1 file and only the L0 files that have those keys (and only the
      specific portion of those L0 files in which those keys are found). In this way
      the overlap is minimized and potentially eliminated between different iterators
      focusing on the same files.
      
      The first step is to restructure the compaction logic to break L0-L1 compactions
      into multiple, smaller, sequential compactions. Eventually each of these smaller
      jobs will be run simultaneously. Areas to pay extra attention to are
      
        # Correct aggregation of compaction job statistics across multiple threads
        # Proper opening/closing of output files (make sure each thread's is unique)
        # Keys that span multiple L1 files
        # Skewed distributions of keys within L0 files
      
      Test Plan: Make and run db_test (newer version has separate compaction tests) and compaction_job_stats_test
      
      Reviewers: igor, noetzli, anthony, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: MarkCallaghan, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42699
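
The partitioning idea in the summary (each sub-job looks only at the key range of one L1 file and at the matching portion of each L0 file) can be sketched as below. This is a simplified illustration, not the patch's code: keys are plain integers, each L0 file is a sorted vector, and `KeyRange` and `SliceL0Files` are hypothetical names. A real partition would also have to cover keys that fall in gaps between L1 files.

```cpp
#include <algorithm>
#include <vector>

// Inclusive key range covered by one L1 file (assumes smallest <= largest).
struct KeyRange {
  int smallest;
  int largest;
};

// For one sub-job: the portion of each sorted L0 file that overlaps `range`.
// L0 files with no keys in the range are left out entirely.
std::vector<std::vector<int>> SliceL0Files(
    const std::vector<std::vector<int>>& l0_files, const KeyRange& range) {
  std::vector<std::vector<int>> slices;
  for (const auto& file : l0_files) {
    auto first = std::lower_bound(file.begin(), file.end(), range.smallest);
    auto last = std::upper_bound(file.begin(), file.end(), range.largest);
    if (first != last) slices.emplace_back(first, last);
  }
  return slices;
}
```

Since the L1 ranges do not overlap, the slices handed to different sub-jobs are disjoint, which is what would allow the sub-jobs to run on separate threads later.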
  16. 22 Jul, 2015 1 commit
    • Tests to avoid using TMPDIR directly · 85ac6553
      sdong committed
      Summary: Using TMPDIR directly can cause problems when tests run with the parallel option. Fix them.
      
      Test Plan: Run all tests in parallel
      
      Reviewers: kradhakrishnan, yhchiang, IslamAbdelRahman, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42807
  17. 18 Jul, 2015 3 commits
    • Move rate_limiter, write buffering, most perf context instrumentation and most... · 6e9fbeb2
      sdong committed
      Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env
      
      Summary: We want to keep Env a thin layer for better portability. Code that is less platform dependent should be moved out of Env. In this patch, I create a wrapper for file readers and writers, and move rate limiting, write buffering, as well as most perf context instrumentation and random kill, out of Env. This will make it easier to maintain multiple Envs in the future.
      
      Test Plan: Run all existing unit tests.
      
      Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42321
    • Don't let flushes preempt compactions · 35ca5936
      Igor Canadi committed
      Summary:
      When we first started, max_background_flushes was 0 by default and compaction thread was executing flushes (since there was no flush thread). Then, we switched the default max_background_flushes to 1. However, we still support the case where there is no flush thread and flushes are done in compaction. This is making our code a bit more complicated. By not supporting this use-case we can make our code simpler.
      
      We have a special case that when you set max_background_flushes to 0, we
      schedule the flush to execute on the compaction thread.
      
      Test Plan: make check (there might be some unit tests that depend on this behavior)
      
      Reviewers: IslamAbdelRahman, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41931
    • Fix ROCKSDB_WARNING · 79373c37
      agiardullo committed
      Summary:
      ROCKSDB_WARNING is only defined if either ROCKSDB_PLATFORM_POSIX or OS_WIN is defined. This works well for building rocksdb with its own build scripts, but it won't work when an outside project (like mongodb) doesn't define ROCKSDB_PLATFORM_POSIX.
      
      This fix defines ROCKSDB_WARNING for all platforms. No idea if it's defined correctly on non-POSIX, non-Windows platforms, but this is no worse than the current situation, where this macro is missing on unexpected platforms.
      
      This fix should hopefully fix anyone whose build broke now that we've switched from using #warning to Pragma (to support windows).  Unfortunately, while mongo-rocks compiles, it ignores the Pragma and doesn't print a warning.  I have not been able to figure out a way to implement this portably on all platforms.
      
      Of course, an alternate solution would be to just get rid of ROCKSDB_WARNING and live with include file redirects indefinitely.  Thoughts?
      
      Test Plan: build rocks, build mongorocks
      
      Reviewers: igor, kradhakrishnan, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42477
  18. 17 Jul, 2015 1 commit
    • Ensure Windows build w/o port/port.h in public headers · d1a45718
      Dmitri Smirnov committed
       - Remove make file defines from public headers and use _WIN32 because it is compiler defined
       - use __GNUC__ and __clang__ to guard non-portable attributes
       - add #include "port/port.h" to some new .cc files.
       - minor changes in CMakeLists to reflect recent changes
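
Guarding a non-portable attribute with compiler-defined macros, as the bullets above describe, might look like this. `SKETCH_ALIGNED`, `PaddedCounter`, and `IsWindowsBuild` are hypothetical names for this illustration and are not taken from the RocksDB headers; the `__declspec` branch assumes an MSVC-compatible compiler.

```cpp
// Pick an alignment attribute based on macros the compiler itself defines,
// so a public header needs no defines from the build scripts.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALIGNED(n) __attribute__((aligned(n)))
#elif defined(_WIN32)
#define SKETCH_ALIGNED(n) __declspec(align(n))
#else
#define SKETCH_ALIGNED(n)  // unknown compiler: fall back to no attribute
#endif

struct SKETCH_ALIGNED(64) PaddedCounter {
  long value;
};

// _WIN32 is defined by the compiler on Windows targets, unlike OS_WIN,
// which has to come from the makefile.
constexpr bool IsWindowsBuild() {
#if defined(_WIN32)
  return true;
#else
  return false;
#endif
}
```

In new code, the standard `alignas` specifier would avoid the macro altogether; the sketch only shows the guarding pattern.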
  19. 16 Jul, 2015 1 commit
    • Ensure Windows build w/o port/port.h in public headers · 247690fe
      Dmitri Smirnov committed
       - Remove make file defines from public headers and use _WIN32 because it is compiler defined
       - use __GNUC__ and __clang__ to guard non-portable attributes
       - add #include "port/port.h" to some new .cc files.
       - minor changes in CMakeLists to reflect recent changes
  20. 15 Jul, 2015 2 commits
  21. 14 Jul, 2015 3 commits
    • Deprecate purge_redundant_kvs_while_flush · a9c51095
      Igor Canadi committed
      Summary: This option is guarding the feature implemented 2 and a half years ago: D8991. The feature was enabled by default back then and has been running without issues. There is no reason why any client would turn this feature off. I found no reference in fbcode.
      
      Test Plan: none
      
      Reviewers: sdong, yhchiang, anthony, dhruba
      
      Reviewed By: dhruba
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42063
    • Move global static functions in db_test_util to DBTestBase · 49f42ad0
      Yueh-Hsuan Chiang committed
      Summary:
      Move global static functions in db_test_util to DBTestBase.
      This is to prevent unused function warning when decoupling
      db_test.cc into multiple files.
      
      Test Plan: db_test
      
      Reviewers: igor, sdong, anthony, IslamAbdelRahman, kradhakrishnan
      
      Reviewed By: kradhakrishnan
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D42009
    • Move reusable part of db_test.cc to util/db_test_util.h · 625467a0
      Yueh-Hsuan Chiang committed
      Summary:
      Move reusable part of db_test.cc to util/db_test_util.h.
      This makes it easier to partition db_test.cc into
      multiple smaller test files.
      
      Also, fixed many old lint errors in db_test.
      
      Test Plan: db_test
      
      Reviewers: igor, anthony, IslamAbdelRahman, sdong, kradhakrishnan
      
      Reviewed By: sdong, kradhakrishnan
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41973