1. 01 Sep, 2015 1 commit
    • Support static Status messages · 77a28615
      agiardullo committed
      Summary: Provide a way to specify a detailed static error message for a Status without incurring a memcpy.  Let me know what people think of this approach.
      
      Test Plan: added simple test
      
      Reviewers: igor, yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D44259
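
The idea of a static error message that avoids a copy can be sketched as follows. This is a minimal illustration, not RocksDB's actual Status class: `SimpleStatus`, `kLockTimeoutMsg`, and `LockTimedOut` are hypothetical names, and the sketch assumes the message always points to storage with static lifetime.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a status type (not RocksDB's Status).
class SimpleStatus {
 public:
  enum Code { kOk, kBusy, kTimedOut };

  SimpleStatus() : code_(kOk), msg_(nullptr) {}

  // `static_msg` must outlive every SimpleStatus that refers to it, for
  // example a string literal. Only the pointer is stored; no bytes are copied.
  SimpleStatus(Code code, const char* static_msg)
      : code_(code), msg_(static_msg) {
    assert(static_msg != nullptr);
  }

  bool ok() const { return code_ == kOk; }
  const char* static_message() const { return msg_; }

  // A string is only built when a caller explicitly asks for one.
  std::string ToString() const {
    if (code_ == kOk) return "OK";
    std::string out = (code_ == kBusy) ? "Busy" : "Timed out";
    if (msg_ != nullptr) {
      out += ": ";
      out += msg_;
    }
    return out;
  }

 private:
  Code code_;
  const char* msg_;  // not owned
};

// Hypothetical message with static storage duration.
static const char kLockTimeoutMsg[] = "Timeout waiting to lock key";

inline SimpleStatus LockTimedOut() {
  return SimpleStatus(SimpleStatus::kTimedOut, kLockTimeoutMsg);
}
```

Returning `LockTimedOut()` on a hot error path then costs a code and a pointer, and the text is materialized only if `ToString()` is called.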
  2. 25 Aug, 2015 1 commit
    • Common base class for transactions · 20d1e547
      agiardullo committed
      Summary:
      As I keep adding new features to transactions, I keep creating more duplicate code.  This diff cleans this up by creating a base implementation class for Transaction and OptimisticTransaction to inherit from.
      
      The code in TransactionBase.h/.cc is all just copied from elsewhere.  The only entertaining part of this class worth looking at is the virtual TryLock method which allows OptimisticTransactions and Transactions to share the same common code for Put/Get/etc.
      
      The rest of this diff is mostly red and easy on the eyes.
      
      Test Plan: No functionality change. Existing tests pass.
      
      Reviewers: sdong, jkedgar, rven, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45135
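
The role of a virtual TryLock in sharing the write path can be sketched as below. This is an illustrative, single-threaded sketch under assumed names (`TxnBase`, `LockingTxn`, `TrackingTxn`); it does not reproduce the real TransactionBase code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical base class: the shared Put() path calls a virtual TryLock().
class TxnBase {
 public:
  virtual ~TxnBase() = default;

  bool Put(const std::string& key, const std::string& value) {
    if (!TryLock(key)) return false;  // behavior differs per subclass
    pending_[key] = value;
    return true;
  }

  const std::map<std::string, std::string>& pending() const { return pending_; }

 protected:
  virtual bool TryLock(const std::string& key) = 0;

 private:
  std::map<std::string, std::string> pending_;
};

// Pessimistic flavor: takes a lock in a shared table before buffering a write.
class LockingTxn : public TxnBase {
 public:
  explicit LockingTxn(std::set<std::string>* lock_table) : locks_(lock_table) {
    assert(lock_table != nullptr);
  }
  ~LockingTxn() override {
    for (const auto& key : held_) locks_->erase(key);
  }

 protected:
  bool TryLock(const std::string& key) override {
    if (held_.count(key) > 0) return true;          // already ours
    if (!locks_->insert(key).second) return false;  // held by another txn
    held_.insert(key);
    return true;
  }

 private:
  std::set<std::string>* locks_;  // not owned
  std::set<std::string> held_;
};

// Optimistic flavor: never blocks, only remembers keys to validate at commit.
class TrackingTxn : public TxnBase {
 public:
  const std::set<std::string>& tracked() const { return tracked_; }

 protected:
  bool TryLock(const std::string& key) override {
    tracked_.insert(key);
    return true;
  }

 private:
  std::set<std::string> tracked_;
};
```

Both subclasses reuse the same `Put()` body; only the locking policy differs.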
  3. 12 Aug, 2015 1 commit
    • Pessimistic Transactions · c2f2cb02
      agiardullo committed
      Summary:
      Initial implementation of Pessimistic Transactions.  This diff contains the API changes discussed in D38913.  This diff is pretty large, so let me know if people would prefer to meet up to discuss it.
      
      MyRocks folks:  please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues.
      
      Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint().  After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex.  We can then decide which route is preferable.
      
      Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing.
      
      Test Plan: Unit tests, db_bench parallel testing.
      
      Reviewers: igor, rven, sdong, yhchiang, yoshinorim
      
      Reviewed By: sdong
      
      Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D40869
  4. 07 Aug, 2015 1 commit
    • simple ManagedSnapshot wrapper · 16ea1c7d
      agiardullo committed
      Summary: Implemented this simple wrapper for something else I was working on.  Seemed like it makes sense to expose it instead of burying it in some random code.
      
      Test Plan: added test
      
      Reviewers: rven, kradhakrishnan, sdong, yhchiang
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43293
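
A snapshot wrapper of this kind is typically an RAII guard: acquire in the constructor, release in the destructor. The sketch below shows that pattern against a made-up `FakeDb`; `FakeDb`, `FakeSnapshot`, and `ScopedSnapshot` are assumptions for illustration and are not the RocksDB types.

```cpp
#include <cassert>

// Hypothetical stand-ins for a database and its snapshot handle.
struct FakeSnapshot {
  int sequence;
};

class FakeDb {
 public:
  const FakeSnapshot* GetSnapshot() {
    ++live_;
    return new FakeSnapshot{next_seq_++};
  }
  void ReleaseSnapshot(const FakeSnapshot* snapshot) {
    --live_;
    delete snapshot;
  }
  int live_snapshots() const { return live_; }

 private:
  int live_ = 0;
  int next_seq_ = 1;
};

// RAII guard: the snapshot is released when the guard goes out of scope,
// including on early returns and exceptions.
class ScopedSnapshot {
 public:
  explicit ScopedSnapshot(FakeDb* db) : db_(db), snapshot_(nullptr) {
    assert(db != nullptr);
    snapshot_ = db_->GetSnapshot();
  }
  ~ScopedSnapshot() { db_->ReleaseSnapshot(snapshot_); }

  // Copying would release the same snapshot twice, so it is disabled.
  ScopedSnapshot(const ScopedSnapshot&) = delete;
  ScopedSnapshot& operator=(const ScopedSnapshot&) = delete;

  const FakeSnapshot* snapshot() const { return snapshot_; }

 private:
  FakeDb* db_;  // not owned
  const FakeSnapshot* snapshot_;
};
```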
  5. 06 Aug, 2015 2 commits
    • Add function 'GetInfoLogList()' · 960d936e
      Poornima Chozhiyath Raman committed
      Summary: The list of info log files of a db can be obtained using the new function.
      
      Test Plan: New test in db_test.cc passed.
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: IslamAbdelRahman, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D41715
    • Add two unit tests for SyncWAL() · 7ccd1c80
      sdong committed
      Summary:
      Add two unit tests for SyncWAL(). One makes sure SyncWAL() doesn't block writes in the other thread. The other makes sure SyncWAL() doesn't wait for ongoing writes to finish before being executed.
      
      Create a new test file db_wal_test and move two WAL-related tests from db_test into it.
      
      Test Plan: Run the new tests
      
      Reviewers: IslamAbdelRahman, rven, kradhakrishnan, kolmike, tnovak, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D43605
  6. 05 Aug, 2015 2 commits
    • Support delete rate limiting · c45a57b4
      Islam AbdelRahman committed
      Summary:
      Introduce DeleteScheduler, which allows enforcing a rate limit on file deletion.
      Instead of deleting files immediately, files are moved to a trash directory and deleted in a background thread that applies a sleep penalty between deletes if needed.
      
      I have updated PurgeObsoleteFiles and PurgeObsoleteWALFiles to use the delete_scheduler instead of env_->DeleteFile.
      
      Test Plan:
      added delete_scheduler_test
      existing unit tests
      
      Reviewers: kradhakrishnan, anthony, rven, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D43221
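
The sleep penalty between deletes can be illustrated with simple arithmetic: given the bytes deleted so far and a target rate, compute how long the background thread would need to pause to stay under the limit. The function name and signature below are hypothetical, and the formula is an assumption about one reasonable way to compute such a penalty, not the scheduler's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Returns how many microseconds a deleter should sleep so that
// `total_deleted_bytes` spread over the elapsed time does not exceed
// `rate_bytes_per_sec`. A non-positive rate means "no limit".
int64_t DeletePenaltyMicros(int64_t total_deleted_bytes,
                            int64_t rate_bytes_per_sec,
                            int64_t elapsed_micros) {
  assert(total_deleted_bytes >= 0 && elapsed_micros >= 0);
  if (rate_bytes_per_sec <= 0) return 0;

  const int64_t kMicrosPerSec = 1000000;
  // Time the deletions are allowed to take at the configured rate. The
  // division is split in two to avoid overflowing on large byte counts.
  const int64_t required_micros =
      (total_deleted_bytes / rate_bytes_per_sec) * kMicrosPerSec +
      (total_deleted_bytes % rate_bytes_per_sec) * kMicrosPerSec /
          rate_bytes_per_sec;

  return std::max<int64_t>(0, required_micros - elapsed_micros);
}
```

For example, after deleting 2 MiB at a 1 MiB/s limit with 0.5 s elapsed, the function reports a 1.5 s pause.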
    • Expose the BackupEngine from the Java API · ce21afd2
      Yueh-Hsuan Chiang committed
      Summary:
      Merge pull request #665 by adamretter
      
      Exposes BackupEngine from C++ to the Java API. Previously only BackupableDB was available.
      
      Test Plan: BackupEngineTest.java
      
      Reviewers: fyrz, igor, ankgup87, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D42873
  7. 04 Aug, 2015 1 commit
    • Add CompactOnDeletionCollector in utilities/table_properties_collectors. · 26894303
      Yueh-Hsuan Chiang committed
      Summary:
      This diff adds CompactOnDeletionCollector in utilities/table_properties_collectors,
      which applies a sliding window to an sst file and marks the file as need-compaction
      when it observes enough deletion entries within the consecutive keys covered by
      the sliding window.
      
      Test Plan: compact_on_deletion_collector_test
      
      Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yoshinorim, sdong
      
      Reviewed By: sdong
      
      Subscribers: maykov, dhruba
      
      Differential Revision: https://reviews.facebook.net/D41175
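
The sliding-window check described above can be sketched with a ring buffer of per-entry flags. This is a simplified illustration under assumed names (`DeletionWindow`, `window_size`, `deletion_trigger`); the real collector may organize its window differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tracks the most recent `window_size` entries and latches a flag once at
// least `deletion_trigger` of them are deletions.
class DeletionWindow {
 public:
  DeletionWindow(size_t window_size, size_t deletion_trigger)
      : window_(window_size, false), trigger_(deletion_trigger) {
    assert(window_size > 0);
  }

  // Called once per entry, in key order.
  void Add(bool is_deletion) {
    // The slot being overwritten holds the oldest entry in the window.
    if (window_[pos_]) --deletions_;
    window_[pos_] = is_deletion;
    if (is_deletion) ++deletions_;
    pos_ = (pos_ + 1) % window_.size();

    if (deletions_ >= trigger_) need_compaction_ = true;
  }

  bool NeedCompaction() const { return need_compaction_; }

 private:
  std::vector<bool> window_;
  size_t trigger_;
  size_t pos_ = 0;
  size_t deletions_ = 0;
  bool need_compaction_ = false;
};
```

The flag is sticky: once any window position has crossed the trigger, the file stays marked even if later keys contain no deletions.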
  8. 21 Jul, 2015 2 commits
  9. 18 Jul, 2015 1 commit
    • Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env · 6e9fbeb2
      sdong committed
      
      Summary: We want to keep Env a thin layer for better portability. Code that is less platform-dependent should be moved out of Env. In this patch, I create a wrapper for file readers and writers, and move rate limiting, write buffering, as well as most perf context instrumentation and random kill out of Env. It will make it easier to maintain multiple Env implementations in the future.
      
      Test Plan: Run all existing unit tests.
      
      Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42321
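
The layering described here, a thin platform-facing file interface with buffering implemented in a portable wrapper, might look roughly like the sketch below. `RawFile`, `BufferedWriter`, and `MemoryFile` are hypothetical names chosen for illustration, not the classes introduced by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Thin, platform-facing interface: the only part each platform implements.
class RawFile {
 public:
  virtual ~RawFile() = default;
  virtual void Append(const std::string& data) = 0;
};

// Portable wrapper: buffering lives here, outside the platform layer.
class BufferedWriter {
 public:
  BufferedWriter(RawFile* file, size_t buffer_size)
      : file_(file), capacity_(buffer_size) {
    assert(file != nullptr);
  }

  void Append(const std::string& data) {
    buffer_ += data;
    if (buffer_.size() >= capacity_) Flush();
  }

  // Pushes any buffered bytes down to the underlying file.
  void Flush() {
    if (buffer_.empty()) return;
    file_->Append(buffer_);
    buffer_.clear();
  }

 private:
  RawFile* file_;  // not owned
  size_t capacity_;
  std::string buffer_;
};

// In-memory RawFile for illustration; counts calls that reach the file.
class MemoryFile : public RawFile {
 public:
  void Append(const std::string& data) override {
    contents += data;
    ++appends;
  }
  std::string contents;
  int appends = 0;
};
```

Rate limiting or timing instrumentation could be added to the wrapper in the same way, leaving every `RawFile` implementation untouched.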
  10. 16 Jul, 2015 4 commits
  11. 15 Jul, 2015 5 commits
  12. 14 Jul, 2015 2 commits
  13. 20 Jun, 2015 1 commit
    • Add wal files to Checkpoint for multiple column families. · 04251e1e
      Venkatesh Radhakrishnan committed
      Summary:
      When there are multiple column families, the flush in
      GetLiveFiles is not atomic, so there are entries in the WAL files
      which are needed to get a consistent RocksDB. We now add the log files to
      the checkpoint.
      
      Test Plan:
      CheckpointCF - This test forces more data to be written to
      the other column families after the flush of the first column family but
      before the second.
      
      Reviewers: igor, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D40323
  14. 03 Jun, 2015 1 commit
    • Allow EventListener::OnCompactionCompleted to return CompactionJobStats. · fe5c6321
      Yueh-Hsuan Chiang committed
      Summary:
      Allow EventListener::OnCompactionCompleted to return CompactionJobStats,
      which contains useful information about a compaction.
      
      Example CompactionJobStats returned by OnCompactionCompleted():
          smallest_output_key_prefix 05000000
          largest_output_key_prefix 06990000
          elapsed_time 42419
          num_input_records 300
          num_input_files 3
          num_input_files_at_output_level 2
          num_output_records 200
          num_output_files 1
          actual_bytes_input 167200
          actual_bytes_output 110688
          total_input_raw_key_bytes 5400
          total_input_raw_value_bytes 300000
          num_records_replaced 100
          is_manual_compaction 1
      
      Test Plan: Developed a mega test in db_test which covers 20 variables in CompactionJobStats.
      
      Reviewers: rven, igor, anthony, sdong
      
      Reviewed By: sdong
      
      Subscribers: tnovak, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D38463
  15. 02 Jun, 2015 1 commit
    • more times in perf_context and iostats_context · ec7a9443
      Mike Kolupaev committed
      Summary:
      We occasionally get write stalls (>1s Write() calls) on HDD under read load. The following timers explain almost all of the stalls:
       - perf_context.db_mutex_lock_nanos
       - perf_context.db_condition_wait_nanos
       - iostats_context.open_time
       - iostats_context.allocate_time
       - iostats_context.write_time
       - iostats_context.range_sync_time
       - iostats_context.logger_time
      
      In my experiments each of these occasionally takes >1s on write path under some workload. There are rare cases when Write() takes long but none of these takes long.
      
      Test Plan: Added code to our application to write the listed timings to log for slow writes. They usually add up to almost exactly the time Write() call took.
      
      Reviewers: rven, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: march, dhruba, tnovak
      
      Differential Revision: https://reviews.facebook.net/D39177
  16. 30 May, 2015 1 commit
    • Optimistic Transactions · dc9d70de
      agiardullo committed
      Summary: Optimistic transactions supporting begin/commit/rollback semantics.  Currently relies on checking the memtable to determine if there are any collisions at commit time.  Not yet implemented is a way of ensuring the memtable has some minimum amount of history so that we won't fail to commit when the memtable is empty.  You should probably start with transaction.h to get an overview of what is currently supported.
      
      Test Plan: Added a new test, but still need to look into stress testing.
      
      Reviewers: yhchiang, igor, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: adamretter, MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D33435
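
Commit-time collision checking can be illustrated with per-key sequence numbers: a transaction remembers the sequence at which it started and refuses to commit if any key it wrote has since been updated. The sketch below uses an in-memory map as a stand-in for the memtable; `VersionedStore` and `OptimisticTxn` are hypothetical names, and the code is single-threaded for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical store that stamps each write with an increasing sequence.
struct VersionedStore {
  uint64_t last_sequence = 0;
  std::map<std::string, std::pair<uint64_t, std::string>> data;

  void Write(const std::string& key, const std::string& value) {
    data[key] = std::make_pair(++last_sequence, value);
  }
};

class OptimisticTxn {
 public:
  explicit OptimisticTxn(VersionedStore* store) : store_(store), start_seq_(0) {
    assert(store != nullptr);
    start_seq_ = store_->last_sequence;  // "begin": remember the start point
  }

  // Writes are only buffered; no locks are taken.
  void Put(const std::string& key, const std::string& value) {
    writes_[key] = value;
  }

  // Fails if any buffered key was written by someone else after we began.
  bool Commit() {
    for (const auto& w : writes_) {
      auto it = store_->data.find(w.first);
      if (it != store_->data.end() && it->second.first > start_seq_) {
        return false;  // collision detected
      }
    }
    for (const auto& w : writes_) store_->Write(w.first, w.second);
    writes_.clear();
    return true;
  }

 private:
  VersionedStore* store_;  // not owned
  uint64_t start_seq_;
  std::map<std::string, std::string> writes_;
};
```

When two transactions begin at the same point and write the same key, the first commit succeeds and the second is rejected.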
  17. 29 May, 2015 1 commit
  18. 13 May, 2015 1 commit
    • Add more table properties to EventLogger · dbd95b75
      Igor Canadi committed
      Summary:
      Example output:
      
          {"time_micros": 1431463794310521, "job": 353, "event": "table_file_creation", "file_number": 387, "file_size": 86937, "table_info": {"data_size": "81801", "index_size": "9751", "filter_size": "0", "raw_key_size": "23448", "raw_average_key_size": "24.000000", "raw_value_size": "990571", "raw_average_value_size": "1013.890481", "num_data_blocks": "245", "num_entries": "977", "filter_policy_name": "", "kDeletedKeys": "0"}}
      
      Also fixed a bug where BuildTable() in recovery was passing Env::IOHigh argument into paranoid_checks_file parameter.
      
      Test Plan: make check + check out the output in the log
      
      Reviewers: sdong, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D38343
  19. 12 May, 2015 1 commit
    • API to fetch from both a WriteBatchWithIndex and the db · 711465cc
      agiardullo committed
      Summary:
      Added a couple functions to WriteBatchWithIndex to make it easier to query the value of a key including reading pending writes from a batch.  (This is needed for transactions).
      
      I created write_batch_with_index_internal.h to use to store an internal-only helper function since there wasn't a good place in the existing class hierarchy to store this function (and it didn't seem right to stick this function inside WriteBatchInternal::Rep).
      
      Since I needed to access the WriteBatchEntryComparator, I moved some helper classes from write_batch_with_index.cc into write_batch_with_index_internal.h/.cc.  WriteBatchIndexEntry, ReadableWriteBatch, and WriteBatchEntryComparator are all unchanged (just moved to a different file(s)).
      
      Test Plan: Added new unit tests.
      
      Reviewers: rven, yhchiang, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D38037
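
Reading a key through pending writes first and the database second can be sketched as follows. The types and the function `GetFromBatchAndStore` are hypothetical, with plain maps standing in for the indexed batch and the database.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical pending write: either a new value or a deletion marker.
struct PendingWrite {
  bool is_delete;
  std::string value;
};

using Batch = std::map<std::string, PendingWrite>;  // uncommitted writes
using Store = std::map<std::string, std::string>;   // committed data

// Looks in the batch first so a reader sees its own uncommitted writes,
// then falls back to the store. Returns false if the key is absent or has
// a pending deletion.
bool GetFromBatchAndStore(const Batch& batch, const Store& store,
                          const std::string& key, std::string* value) {
  assert(value != nullptr);
  auto pending = batch.find(key);
  if (pending != batch.end()) {
    if (pending->second.is_delete) return false;  // deletion hides the store
    *value = pending->second.value;
    return true;
  }
  auto committed = store.find(key);
  if (committed == store.end()) return false;
  *value = committed->second;
  return true;
}
```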
  20. 18 Apr, 2015 1 commit
    • Add experimental API MarkForCompaction() · 6059bdf8
      Igor Canadi committed
      Summary:
      Some Mongo+Rocks datasets in Parse's environment are not doing compactions very frequently. During the quiet period (with no IO), we'd like to schedule compactions so that our reads become faster. Also, aggressively compacting during quiet periods helps when write bursts happen. In addition, we also want to compact files that are containing deleted key ranges (like old oplog keys).
      
      All of this is currently not possible with CompactRange() because it's single-threaded and blocks all other compactions from happening. Running CompactRange() risks blocking writes because we generate too many Level 0 files before the compaction is over. Stopping writes is very dangerous because they hold transaction locks. We tried running manual compaction once on Mongo+Rocks and everything fell apart.
      
      MarkForCompaction() solves all of those problems. This is a very lightweight manual compaction. It is lower priority than automatic compactions, which means it shouldn't interfere with the background process keeping the LSM tree clean. However, if no automatic compactions need to be run (or we have extra background threads available), we will start compacting files that are marked for compaction.
      
      Test Plan: added a new unit test
      
      Reviewers: yhchiang, rven, MarkCallaghan, sdong
      
      Reviewed By: sdong
      
      Subscribers: yoshinorim, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D37083
  21. 14 Apr, 2015 1 commit
  22. 08 Apr, 2015 1 commit
    • build: don't use a glob for java/rocksjni/* · cba59200
      Jim Meyering committed
      Summary:
      * src.mk (JNI_NATIVE_SOURCES): New variable, so we don't have to use
      a glob in Makefile
      * Makefile (JNI_NATIVE_SOURCES): Remove glob-using definition, now
      that the explicit list of sources is in src.mk.
      
      Test Plan:
        Run this:
          JAVA_HOME=/usr/local/jdk-7u67-64 PATH=$JAVA_HOME/bin:$PATH \
            make rocksdbjava
      
      Reviewers: yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D36633
  23. 31 Mar, 2015 2 commits
    • Makefile minor cleanup · 2511b7d9
      Igor Canadi committed
      Summary:
      Just a couple of small changes:
      1. removed signal_test, since it doesn't seem useful and we don't even run it as part of `make check`
      2. moved perf_context_test to TESTS instead of PROGRAMS
      3. `make release` probably shouldn't compile benchmarks. We currently rely on `make release` building db_bench (via Jenkins), so I left db_bench there.
      
      This is just a minor cleanup. We need to rethink our targets since they are a bit messy right now. We can do this during our tech debt week.
      
      Test Plan: make release
      
      Reviewers: anthony, rven, yhchiang, sdong, meyering
      
      Reviewed By: meyering
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D36171
    • db_bench can now disable flashcache for background threads · d61cb0b9
      Igor Canadi committed
      Summary: Most of the approach is copied from WebSQL's MySQL branch. It's nice that we can do this without touching core RocksDB code.
      
      Test Plan: Compiles and runs. Didn't test flashcache code, as I don't have a flashcache device and most of it is copy-pasted.
      
      Reviewers: MarkCallaghan, sdong
      
      Reviewed By: sdong
      
      Subscribers: rven, lgalanis, kradhakrishnan, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D35391
  24. 18 Mar, 2015 1 commit
    • Create an abstract interface for write batches · 81345b90
      agiardullo committed
      Summary: WriteBatch and WriteBatchWithIndex now both inherit from a common abstract base class.  This makes it easier to write code that is agnostic toward the implementation of the particular write batch.  In particular, I plan on utilizing this abstraction to allow transactions to support using either implementation of a write batch.
      
      Test Plan: modified existing WriteBatchWithIndex tests to test new functions.  Running all tests.
      
      Reviewers: igor, rven, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D34017
  25. 17 Mar, 2015 1 commit
    • rocksdb: Add gtest · a7aba2ef
      Igor Sugak committed
      Summary:
      Adds gtest fused source code into `third-party` directory. No manual changes.
      
      The latest gtest release, 1.7, has compilation errors with clang dev. The trunk version requires only one disabled warning (-Wno-missing-field-initializers).
      
      Fused code is made as described here https://fburl.com/90806322
      Details about why we need gtest source code instead of precompiled library https://fburl.com/90805763
      Source used from http://googletest.googlecode.com/svn/trunk
      
      Test Plan:
      Build and notice no errors. Also check in the logs that gtest-all.o is being compiled.
      ```lang=bash
      % USE_CLANG=1 make all
      ```
      
      Reviewers: lgalanis, yufei.zhu, rven, sdong, igor, meyering
      
      Reviewed By: meyering
      
      Subscribers: meyering, yhchiang, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D33345
  26. 14 Mar, 2015 1 commit
    • EventLogger · 52d8347a
      Igor Canadi committed
      Summary:
      Here's my proposal for making our LOGs easier to read by machines.
      
      The idea is to dump all events as JSON objects. JSON is easy to read by humans, but more importantly, it's easy to read by machines. That way, we can parse this, load into SQLite/mongo and then query or visualize.
      
      I started with table_create and table_delete events, but if everybody agrees, I'll continue by adding more events (flush, compaction, etc.).
      
      Test Plan:
      Ran db_bench. Observed:
      2015/01/15-14:13:25.788019 1105ef000 EVENT_LOG_v1 {"time_micros": 1421360005788015, "event": "table_file_creation", "file_number": 12, "file_size": 1909699}
      2015/01/15-14:13:25.956500 110740000 EVENT_LOG_v1 {"time_micros": 1421360005956498, "event": "table_file_deletion", "file_number": 12}
      
      Reviewers: yhchiang, rven, dhruba, MarkCallaghan, lgalanis, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D31647
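
Emitting events as one JSON object per log line can be sketched with a small builder. `JsonEvent` is a hypothetical helper, not the EventLogger API; it assumes keys and string values contain no characters that need JSON escaping, and it only mirrors the field layout shown in the sample output above.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Builds a flat JSON object such as {"event": "x", "file_number": 12}.
// Assumption: keys and string values need no escaping.
class JsonEvent {
 public:
  JsonEvent& Add(const std::string& key, int64_t value) {
    WriteKey(key);
    out_ << value;
    return *this;
  }

  JsonEvent& Add(const std::string& key, const std::string& value) {
    WriteKey(key);
    out_ << '"' << value << '"';
    return *this;
  }

  std::string Str() const { return "{" + out_.str() + "}"; }

 private:
  void WriteKey(const std::string& key) {
    assert(!key.empty());
    if (!first_) out_ << ", ";
    first_ = false;
    out_ << '"' << key << "\": ";
  }

  std::ostringstream out_;
  bool first_ = true;
};
```

A log writer would then prefix `Str()` with a marker such as `EVENT_LOG_v1` so that a parser can find the JSON payload on each line.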
  27. 12 Mar, 2015 1 commit
  28. 07 Mar, 2015 1 commit