1. 06 Oct, 2015 1 commit
  2. 01 Oct, 2015 1 commit
    • E
      New amalgamation target · 7a23e4d8
      Evan Shaw committed
      This commit adds two new targets to the Makefile: rocksdb.cc and rocksdb.h
      
      These files, when combined with the c.h header, are a self-contained RocksDB
      source distribution called an amalgamation. (The name comes from SQLite's, which
      is similar in concept.)
      
      The main benefit of an amalgamation is that it's very easy to drop into a
      new project. It also compiles faster compared to compiling individual source
      files and potentially gives the compiler more opportunity to make optimizations
      since it can see all functions at once.
      
      rocksdb.cc and rocksdb.h are generated by a new script, amalgamate.py.
      A detailed description of how amalgamate.py works is in a comment at the top of
      the file.
      
      There are also some small changes to existing files to enable the amalgamation:
      * Use quotes for includes in unity build
      * Fix an old header inclusion in util/xfunc.cc
      * Move some includes outside ifdef in util/env_hdfs.cc
      * Separate out tool sources in Makefile so they won't be included in unity.cc
      * Unity build now produces a static library
      
      Closes #733
  3. 30 Sep, 2015 1 commit
    • Y
      RocksDB Options file format and its serialization / deserialization. · 74b100ac
      Yueh-Hsuan Chiang committed
      Summary:
      This patch defines the format of RocksDB options file, which
      follows the INI file format, and implements functions for its
      serialization and deserialization.  An example RocksDB options
      file can be found in examples/rocksdb_option_file_example.ini.
      
      A typical RocksDB options file has three kinds of sections:
      Version, DBOptions, and one or more CFOptions.  The RocksDB
      options file in general follows the basic INI file format
      with the following extensions / modifications:
       * Escaped characters
         The following characters are escaped:
          - \n -- line feed - new line
          - \r -- carriage return
          - \\ -- backslash \
          - \: -- colon symbol :
          - \# -- hash tag #
       * Comments
         We support # style comments.  Comments can appear at the ending
         part of a line.
       * Statements
         A statement is of the form option_name = value.
         Each statement contains a '=', and extra white-space around it
         is allowed. However, we don't support multi-line statements.
         Furthermore, each line can contain at most one statement.
       * Section
         Sections are of the form [SectionTitle "SectionArgument"],
         where section argument is optional.
       * List
         We use a colon-separated string to represent a list.
         For instance, n1:n2:n3:n4 is a list containing four values.
      
      Below is an example of a RocksDB options file:
      
      [Version]
        rocksdb_version=4.0.0
        options_file_version=1.0
      [DBOptions]
        max_open_files=12345
        max_background_flushes=301
      [CFOptions "default"]
      [CFOptions "the second column family"]
      [CFOptions "the third column family"]
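
A minimal sketch of a reader for the format described above, assuming only the rules listed in this message. The function names (`strip_comment`, `unescape`, `split_list`, `parse_options_text`) are illustrative and are not the functions this patch adds; values are kept as raw strings rather than converted to typed options.

```python
import re

# The five escape sequences named in the commit message.
_UNESCAPE = {"n": "\n", "r": "\r", "\\": "\\", ":": ":", "#": "#"}
# [SectionTitle] or [SectionTitle "SectionArgument"]
_SECTION_RE = re.compile(r'^\[(\w+)(?:\s+"(.*)")?\]$')


def strip_comment(line):
    """Drop a trailing comment that starts at an unescaped '#'."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            out.append(line[i:i + 2])  # keep the escape pair intact
            i += 2
            continue
        if ch == "#":
            break
        out.append(ch)
        i += 1
    return "".join(out).strip()


def unescape(text):
    """Decode the five escape sequences back into plain characters."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in _UNESCAPE:
            out.append(_UNESCAPE[text[i + 1]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def split_list(value):
    """Split a list value on unescaped ':' and unescape each item."""
    items = []
    current = []
    i = 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            current.append(value[i:i + 2])
            i += 2
        elif value[i] == ":":
            items.append(unescape("".join(current)))
            current = []
            i += 1
        else:
            current.append(value[i])
            i += 1
    items.append(unescape("".join(current)))
    return items


def parse_options_text(text):
    """Return {(section_title, section_argument): {option_name: raw_value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = strip_comment(raw)
        if not line:
            continue
        match = _SECTION_RE.match(line)
        if match:
            current = sections.setdefault((match.group(1), match.group(2)), {})
            continue
        if current is None or "=" not in line:
            raise ValueError("unexpected line: %r" % raw)
        name, value = line.split("=", 1)  # one statement per line
        current[name.strip()] = value.strip()
    return sections
```

Section arguments are optional, so a section without one is keyed as `(title, None)`. Raw values stay escaped until `split_list` or `unescape` is applied, so an escaped colon inside a list item is not mistaken for a separator.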
      
      Test Plan: Added many tests in options_test.cc
      
      Reviewers: igor, IslamAbdelRahman, sdong, anthony
      
      Reviewed By: anthony
      
      Subscribers: maykov, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D46059
  4. 24 Sep, 2015 2 commits
    • A
      Remove ldb HexToString method's usage of sscanf · 4805fa0e
      Assaf Sela committed
      Summary:
      Fix hex2String performance issues by removing sscanf dependency.
      Also fixed some edge case handling (odd length, bad input).
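
A minimal sketch of table-driven hex decoding, the general technique for avoiding a scanf-style parse per byte. It is not the ldb code: the function name `hex_to_bytes`, the required `0x` prefix, and the choice to reject odd-length or non-hex input with an error are all assumptions made for illustration.

```python
# Lookup table from hex digit to its value, built once.
_HEX_VALUE = {c: i for i, c in enumerate("0123456789abcdef")}
_HEX_VALUE.update({c: i for i, c in enumerate("0123456789ABCDEF")})


def hex_to_bytes(text):
    """Decode a string such as '0x616263' into b'abc'."""
    if not text.startswith("0x"):  # assumed prefix convention
        raise ValueError("expected a 0x prefix")
    digits = text[2:]
    if len(digits) % 2 != 0:  # odd length: reject rather than guess
        raise ValueError("odd number of hex digits")
    out = bytearray()
    for i in range(0, len(digits), 2):
        try:
            high = _HEX_VALUE[digits[i]]
            low = _HEX_VALUE[digits[i + 1]]
        except KeyError:
            raise ValueError("invalid hex digit in %r" % text) from None
        out.append((high << 4) | low)
    return bytes(out)
```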
      
      Test Plan: Created a test file which called the old and new implementations, and validated that the results are the same. I'll paste the results in the Phabricator diff.
      
      Reviewers: igor, rven, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: thatsafunnyname, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D46785
    • I
      Add experimental DB::AddFile() to plug sst files into empty DB · f03b5c98
      Islam AbdelRahman committed
      Summary:
      This is an initial version of the bulk load feature.
      
      This diff allows us to create sst files and then bulk load them later. Right now the restrictions for loading an sst file are:
      (1) Memtables are empty
      (2) Added sst files have sequence number = 0, and existing values in database have sequence number = 0
      (3) Added sst files values are not overlapping
      
      Test Plan: unit testing
      
      Reviewers: igor, ott, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb, ott, dhruba
      
      Differential Revision: https://reviews.facebook.net/D39081
  5. 11 Sep, 2015 1 commit
    • A
      Refactored common code of Builder/CompactionJob out into a CompactionIterator · 8aa1f151
      Andres Noetzli committed
      Summary:
      Builder and CompactionJob share a lot of fairly complex code. This patch
      refactors this code into a separate class, the CompactionIterator. Because the
      shared code is fairly complex, this patch hopefully improves maintainability.
      While there is a lot of potential for further improvements, the patch is
      intentionally pretty close to the original structure because the change is
      already complex enough.
      
      Test Plan: make clean all check && ./db_stress
      
      Reviewers: rven, anthony, yhchiang, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D46197
  6. 09 Sep, 2015 1 commit
    • A
      TransactionDB Custom Locking API · 5e94f68f
      agiardullo committed
      Summary:
      Prototype of API to allow MyRocks to override default Mutex/CondVar used by transactions with their own implementations.  They would simply need to pass their own implementations of Mutex/CondVar to the templated TransactionDB::Open().
      
      Default implementation of TransactionDBMutex/TransactionDBCondVar provided (but the code is not currently changed to use this).
      
      Let me know if this API makes sense or if it should be changed
      
      Test Plan: n/a
      
      Reviewers: yhchiang, rven, igor, sdong, spetrunia
      
      Reviewed By: spetrunia
      
      Subscribers: maykov, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43761
  7. 01 Sep, 2015 1 commit
    • A
      Support static Status messages · 77a28615
      agiardullo committed
      Summary: Provide a way to specify a detailed static error message for a Status without incurring a memcpy.  Let me know what people think of this approach.
      
      Test Plan: added simple test
      
      Reviewers: igor, yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D44259
  8. 25 Aug, 2015 1 commit
    • A
      Common base class for transactions · 20d1e547
      agiardullo committed
      Summary:
      As I keep adding new features to transactions, I keep creating more duplicate code.  This diff cleans this up by creating a base implementation class for Transaction and OptimisticTransaction to inherit from.
      
      The code in TransactionBase.h/.cc is all just copied from elsewhere.  The only entertaining part of this class worth looking at is the virtual TryLock method which allows OptimisticTransactions and Transactions to share the same common code for Put/Get/etc.
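
A minimal sketch of the pattern described here: a base class owns the shared Put/Get logic and defers only the locking decision to a virtual method. The class names, the shared lock table, and the "record the key for later validation" behavior of the optimistic variant are assumptions for illustration, not the code in TransactionBase.h/.cc.

```python
from abc import ABC, abstractmethod


class TransactionBaseSketch(ABC):
    """Shared Put/Get logic; subclasses only decide how a key is 'locked'."""

    def __init__(self, db):
        self._db = db          # plain dict standing in for the database
        self._writes = {}      # pending writes of this transaction

    @abstractmethod
    def try_lock(self, key):
        """Return True if this transaction may proceed with `key`."""

    def put(self, key, value):
        if not self.try_lock(key):
            return False
        self._writes[key] = value
        return True

    def get(self, key):
        # Pending writes of this transaction win over committed data.
        if key in self._writes:
            return self._writes[key]
        return self._db.get(key)


class PessimisticSketch(TransactionBaseSketch):
    def __init__(self, db, lock_table):
        super().__init__(db)
        self._locks = lock_table  # shared dict: key -> owning transaction

    def try_lock(self, key):
        # Take the lock now, or fail if another transaction holds it.
        owner = self._locks.setdefault(key, self)
        return owner is self


class OptimisticSketch(TransactionBaseSketch):
    def __init__(self, db):
        super().__init__(db)
        self.tracked_keys = set()

    def try_lock(self, key):
        # No lock is taken; the key is only remembered for a later check.
        self.tracked_keys.add(key)
        return True
```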
      
      The rest of this diff is mostly red and easy on the eyes.
      
      Test Plan: No functionality change.  existing tests pass.
      
      Reviewers: sdong, jkedgar, rven, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45135
  9. 12 Aug, 2015 1 commit
    • A
      Pessimistic Transactions · c2f2cb02
      agiardullo committed
      Summary:
      Initial implementation of Pessimistic Transactions.  This diff contains the api changes discussed in D38913.  This diff is pretty large, so let me know if people would prefer to meet up to discuss it.
      
      MyRocks folks:  please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues.
      
      Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint().  After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex.  We can then decide which route is preferable.
      
      Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing.
      
      Test Plan: Unit tests, db_bench parallel testing.
      
      Reviewers: igor, rven, sdong, yhchiang, yoshinorim
      
      Reviewed By: sdong
      
      Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D40869
  10. 07 Aug, 2015 1 commit
    • A
      simple ManagedSnapshot wrapper · 16ea1c7d
      agiardullo committed
      Summary: Implemented this simple wrapper for something else I was working on.  Seemed like it makes sense to expose it instead of burying it in some random code.
      
      Test Plan: added test
      
      Reviewers: rven, kradhakrishnan, sdong, yhchiang
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43293
  11. 06 Aug, 2015 2 commits
    • P
      Add function 'GetInfoLogList()' · 960d936e
      Poornima Chozhiyath Raman committed
      Summary: The list of info log files of a db can be obtained using the new function.
      
      Test Plan: New test in db_test.cc passed.
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: IslamAbdelRahman, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D41715
    • S
      Add two unit tests for SyncWAL() · 7ccd1c80
      sdong committed
      Summary:
      Add two unit tests for SyncWAL(). One makes sure SyncWAL() doesn't block writes in another thread. The other makes sure SyncWAL() doesn't wait for ongoing writes to finish before being executed.
      
      Create a new test file db_wal_test and move two WAL related tests from db_test to here.
      
      Test Plan: Run the new tests
      
      Reviewers: IslamAbdelRahman, rven, kradhakrishnan, kolmike, tnovak, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D43605
  12. 05 Aug, 2015 2 commits
    • I
      Support delete rate limiting · c45a57b4
      Islam AbdelRahman committed
      Summary:
      Introduce DeleteScheduler, which allows enforcing a rate limit on file deletion.
      Instead of being deleted immediately, files are moved to a trash directory and deleted in a background thread that applies a sleep penalty between deletes if needed.
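
A minimal sketch of the idea, assuming the sleep penalty is simply the deleted file's size divided by the allowed rate; the commit message does not say how the penalty is computed. `DeleteSchedulerSketch`, its parameters, and the injectable `sleep` callable are illustrative names, and files with the same base name would collide in this simplified trash directory.

```python
import os
import queue
import threading
import time


class DeleteSchedulerSketch:
    def __init__(self, trash_dir, rate_bytes_per_sec, sleep=time.sleep):
        self._trash_dir = trash_dir
        self._rate = rate_bytes_per_sec
        self._sleep = sleep
        self._queue = queue.Queue()
        os.makedirs(trash_dir, exist_ok=True)
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def delete_file(self, path):
        """Move the file to trash right away; the real unlink happens later."""
        trash_path = os.path.join(self._trash_dir, os.path.basename(path))
        os.rename(path, trash_path)
        self._queue.put(trash_path)

    def wait_for_empty_trash(self):
        """Block until every queued file has been removed."""
        self._queue.join()

    def _run(self):
        # Background thread: remove one file, then pause so that
        # deleted bytes per second stay at or below the configured rate.
        while True:
            trash_path = self._queue.get()
            size = os.path.getsize(trash_path)
            os.remove(trash_path)
            if self._rate > 0:
                self._sleep(size / self._rate)
            self._queue.task_done()
```

Callers return as soon as the rename finishes, so the cost of the actual deletion is paid off the caller's path.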
      
      I have updated PurgeObsoleteFiles and PurgeObsoleteWALFiles to use the delete_scheduler instead of env_->DeleteFile
      
      Test Plan:
      added delete_scheduler_test
      existing unit tests
      
      Reviewers: kradhakrishnan, anthony, rven, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D43221
    • Y
      Expose the BackupEngine from the Java API · ce21afd2
      Yueh-Hsuan Chiang committed
      Summary:
      Merge pull request #665 by adamretter
      
      Exposes BackupEngine from C++ to the Java API. Previously only BackupableDB was available
      
      Test Plan: BackupEngineTest.java
      
      Reviewers: fyrz, igor, ankgup87, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D42873
  13. 04 Aug, 2015 1 commit
    • Y
      Add CompactOnDeletionCollector in utilities/table_properties_collectors. · 26894303
      Yueh-Hsuan Chiang committed
      Summary:
      This diff adds CompactOnDeletionCollector in utilities/table_properties_collectors,
      which applies a sliding window to an sst file and marks this file as need-compaction
      when it observes enough deletion entries within the consecutive keys covered by
      the sliding window.
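
A minimal sketch of the sliding-window rule described above, using a fixed-size deque over the most recent entries. The class name and the `window_size` / `deletion_trigger` parameters are illustrative; this is not the collector's actual implementation.

```python
from collections import deque


class CompactOnDeletionSketch:
    def __init__(self, window_size, deletion_trigger):
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        self._window = deque(maxlen=window_size)  # 1 = deletion, 0 = other
        self._deletions_in_window = 0
        self._trigger = deletion_trigger
        self.need_compaction = False

    def add(self, is_deletion):
        """Feed entries in key order, as they are written to the file."""
        if len(self._window) == self._window.maxlen:
            # The oldest entry is about to slide out of the window.
            self._deletions_in_window -= self._window[0]
        self._window.append(1 if is_deletion else 0)
        self._deletions_in_window += self._window[-1]
        if self._deletions_in_window >= self._trigger:
            self.need_compaction = True  # sticky once set
```

With a window of 4 and a trigger of 3, the entry sequence delete, put, delete, put, delete never has three deletions inside one window; one more delete does, and the file is marked.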
      
      Test Plan: compact_on_deletion_collector_test
      
      Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yoshinorim, sdong
      
      Reviewed By: sdong
      
      Subscribers: maykov, dhruba
      
      Differential Revision: https://reviews.facebook.net/D41175
  14. 21 Jul, 2015 2 commits
  15. 18 Jul, 2015 1 commit
    • S
      Move rate_limiter, write buffering, most perf context instrumentation and most... · 6e9fbeb2
      sdong committed
      Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env
      
      Summary: We want to keep Env a thin layer for better portability. Code that is less platform-dependent should be moved out of Env. In this patch, I create wrappers around file readers and writers, and move rate limiting, write buffering, as well as most perf context instrumentation and random kill out of Env. This will make it easier to maintain multiple Env implementations in the future.
      
      Test Plan: Run all existing unit tests.
      
      Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42321
  16. 16 Jul, 2015 4 commits
  17. 15 Jul, 2015 5 commits
  18. 14 Jul, 2015 2 commits
  19. 20 Jun, 2015 1 commit
    • V
      Add wal files to Checkpoint for multiple column families. · 04251e1e
      Venkatesh Radhakrishnan committed
      Summary:
      When there are multiple column families, the flush in
      GetLiveFiles is not atomic, so there are entries in the wal files
      which are needed to get a consistent RocksDB. We now add the log files to
      the checkpoint.
      
      Test Plan:
      CheckpointCF - This test forces more data to be written to
      the other column families after the flush of the first column family but
      before the second.
      
      Reviewers: igor, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D40323
  20. 03 Jun, 2015 1 commit
    • Y
      Allow EventListener::OnCompactionCompleted to return CompactionJobStats. · fe5c6321
      Yueh-Hsuan Chiang committed
      Summary:
      Allow EventListener::OnCompactionCompleted to return CompactionJobStats,
      which contains useful information about a compaction.
      
      Example CompactionJobStats returned by OnCompactionCompleted():
          smallest_output_key_prefix 05000000
          largest_output_key_prefix 06990000
          elapsed_time 42419
          num_input_records 300
          num_input_files 3
          num_input_files_at_output_level 2
          num_output_records 200
          num_output_files 1
          actual_bytes_input 167200
          actual_bytes_output 110688
          total_input_raw_key_bytes 5400
          total_input_raw_value_bytes 300000
          num_records_replaced 100
          is_manual_compaction 1
      
      Test Plan: Developed a mega test in db_test which covers 20 variables in CompactionJobStats.
      
      Reviewers: rven, igor, anthony, sdong
      
      Reviewed By: sdong
      
      Subscribers: tnovak, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D38463
  21. 02 Jun, 2015 1 commit
    • M
      more times in perf_context and iostats_context · ec7a9443
      Mike Kolupaev committed
      Summary:
      We occasionally get write stalls (>1s Write() calls) on HDD under read load. The following timers explain almost all of the stalls:
       - perf_context.db_mutex_lock_nanos
       - perf_context.db_condition_wait_nanos
       - iostats_context.open_time
       - iostats_context.allocate_time
       - iostats_context.write_time
       - iostats_context.range_sync_time
       - iostats_context.logger_time
      
      In my experiments each of these occasionally takes >1s on write path under some workload. There are rare cases when Write() takes long but none of these takes long.
      
      Test Plan: Added code to our application to write the listed timings to log for slow writes. They usually add up to almost exactly the time Write() call took.
      
      Reviewers: rven, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: march, dhruba, tnovak
      
      Differential Revision: https://reviews.facebook.net/D39177
  22. 30 May, 2015 1 commit
    • A
      Optimistic Transactions · dc9d70de
      agiardullo committed
      Summary: Optimistic transactions supporting begin/commit/rollback semantics.  Currently relies on checking the memtable to determine if there are any collisions at commit time.  Not yet implemented is a way of ensuring the memtable has some minimum amount of history so that we won't fail to commit when the memtable is empty.  You should probably start with transaction.h to get an overview of what is currently supported.
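
A minimal sketch of commit-time collision checking in general, assuming each key remembers the sequence number of its last write and a transaction fails if any key it wrote has changed since the transaction began. The class names and the in-memory dictionaries are illustrative stand-ins; this does not reproduce the memtable check in the patch.

```python
class OptimisticStoreSketch:
    def __init__(self):
        self.data = {}            # committed key -> value
        self.last_write_seq = {}  # key -> sequence number of its last write
        self.seq = 0              # latest sequence number handed out

    def begin(self):
        return OptimisticTxnSketch(self)


class OptimisticTxnSketch:
    def __init__(self, store):
        self._store = store
        self._start_seq = store.seq  # everything newer is a potential conflict
        self._writes = {}

    def put(self, key, value):
        self._writes[key] = value    # buffered, nothing is locked

    def rollback(self):
        self._writes.clear()

    def commit(self):
        # Validation: fail if any key we wrote was committed by someone
        # else after this transaction began.
        for key in self._writes:
            if self._store.last_write_seq.get(key, 0) > self._start_seq:
                return False
        for key, value in self._writes.items():
            self._store.seq += 1
            self._store.data[key] = value
            self._store.last_write_seq[key] = self._store.seq
        self._writes.clear()
        return True
```

Two transactions that begin at the same point and write the same key cannot both commit in this sketch: the first commit advances the key's sequence number, so the second fails validation.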
      
      Test Plan: Added a new test, but still need to look into stress testing.
      
      Reviewers: yhchiang, igor, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: adamretter, MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D33435
  23. 29 May, 2015 1 commit
  24. 13 May, 2015 1 commit
    • I
      Add more table properties to EventLogger · dbd95b75
      Igor Canadi committed
      Summary:
      Example output:
      
          {"time_micros": 1431463794310521, "job": 353, "event": "table_file_creation", "file_number": 387, "file_size": 86937, "table_info": {"data_size": "81801", "index_size": "9751", "filter_size": "0", "raw_key_size": "23448", "raw_average_key_size": "24.000000", "raw_value_size": "990571", "raw_average_value_size": "1013.890481", "num_data_blocks": "245", "num_entries": "977", "filter_policy_name": "", "kDeletedKeys": "0"}}
      
      Also fixed a bug where BuildTable() in recovery was passing Env::IOHigh argument into paranoid_checks_file parameter.
      
      Test Plan: make check + check out the output in the log
      
      Reviewers: sdong, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D38343
  25. 12 May, 2015 1 commit
    • A
      API to fetch from both a WriteBatchWithIndex and the db · 711465cc
      agiardullo committed
      Summary:
      Added a couple functions to WriteBatchWithIndex to make it easier to query the value of a key including reading pending writes from a batch.  (This is needed for transactions).
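
A minimal sketch of the lookup order this implies: pending writes in the batch are consulted first, and the database is read only if the batch says nothing about the key. The class and method names are illustrative, the database is a plain dict, and merge operations are left out.

```python
_DELETED = object()  # marker for a pending delete in the batch


class IndexedBatchSketch:
    def __init__(self):
        self._latest = {}  # key -> latest pending value, or _DELETED

    def put(self, key, value):
        self._latest[key] = value

    def delete(self, key):
        self._latest[key] = _DELETED

    def get_from_batch_and_db(self, db, key):
        """Return the value a reader would see if the batch were applied."""
        if key in self._latest:
            value = self._latest[key]
            return None if value is _DELETED else value
        return db.get(key)
```

A pending delete hides the committed value, which is why the batch has to record deletions explicitly instead of just dropping the key.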
      
      I created write_batch_with_index_internal.h to use to store an internal-only helper function since there wasn't a good place in the existing class hierarchy to store this function (and it didn't seem right to stick this function inside WriteBatchInternal::Rep).
      
      Since I needed to access the WriteBatchEntryComparator, I moved some helper classes from write_batch_with_index.cc into write_batch_with_index_internal.h/.cc.  WriteBatchIndexEntry, ReadableWriteBatch, and WriteBatchEntryComparator are all unchanged (just moved to a different file(s)).
      
      Test Plan: Added new unit tests.
      
      Reviewers: rven, yhchiang, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D38037
  26. 18 Apr, 2015 1 commit
    • I
      Add experimental API MarkForCompaction() · 6059bdf8
      Igor Canadi committed
      Summary:
      Some Mongo+Rocks datasets in Parse's environment are not doing compactions very frequently. During the quiet period (with no IO), we'd like to schedule compactions so that our reads become faster. Also, aggressively compacting during quiet periods helps when write bursts happen. In addition, we also want to compact files that contain deleted key ranges (like old oplog keys).
      
      All of this is currently not possible with CompactRange() because it's single-threaded and blocks all other compactions from happening. Running CompactRange() risks blocking writes because we generate too many Level 0 files before the compaction is over. Stopping writes is very dangerous because they hold transaction locks. We tried running manual compaction once on Mongo+Rocks and everything fell apart.
      
      MarkForCompaction() solves all of those problems. This is a very light-weight manual compaction. It is lower priority than automatic compactions, which means it shouldn't interfere with the background process keeping the LSM tree clean. However, if no automatic compactions need to be run (or we have extra background threads available), we will start compacting files that are marked for compaction.
      
      Test Plan: added a new unit test
      
      Reviewers: yhchiang, rven, MarkCallaghan, sdong
      
      Reviewed By: sdong
      
      Subscribers: yoshinorim, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D37083
  27. 14 Apr, 2015 1 commit
  28. 08 Apr, 2015 1 commit
    • J
      build: don't use a glob for java/rocksjni/* · cba59200
      Jim Meyering committed
      Summary:
      * src.mk (JNI_NATIVE_SOURCES): New variable, so we don't have to use
      a glob in Makefile
      * Makefile (JNI_NATIVE_SOURCES): Remove glob-using definition, now
      that the explicit list of sources is in src.mk.
      
      Test Plan:
        Run this:
          JAVA_HOME=/usr/local/jdk-7u67-64 PATH=$JAVA_HOME/bin:$PATH \
            make rocksdbjava
      
      Reviewers: yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D36633