1. 08 12月, 2015 3 次提交
    • A
      Support marking snapshots for write-conflict checking · ec704aaf
      agiardullo 提交于
      Summary:
      D50475 enables using SST files for transaction write-conflict checking.  In order for this to work, we need to make sure not to compact out SingleDeletes when there is an earlier transaction snapshot(D50295).  If there is a long-held snapshot, this could reduce the benefit of the SingleDelete optimization.
      
      This diff allows Transactions to mark snapshots as being used for write-conflict checking.  Then, during compaction, we will be able to optimize SingleDeletes better in the future.
      
      This diff adds a flag to SnapshotImpl which is used by Transactions.  This diff also passes the earliest write-conflict snapshot's sequence number to CompactionIterator.  This diff does not actually change Compaction (after this diff is pushed, D50295 will be able to use this information).
      
      Test Plan: no behavior change, ran existing tests
      
      Reviewers: rven, kradhakrishnan, yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D51183
      ec704aaf
    • S
      Revert "Fix a race condition in persisting options" · f307036b
      sdong 提交于
      This reverts commit 2fa3ed51. It breaks RocksDB lite build
      f307036b
    • Y
      Fix a race condition in persisting options · 2fa3ed51
      Yueh-Hsuan Chiang 提交于
      Summary:
      This patch fix a race condition in persisting options which will cause a crash when:
      
      * Thread A obtain cf options and start to persist options based on that cf options.
      * Thread B kicks in and finish DropColumnFamily and delete cf_handle.
      * Thread A wakes up and tries to finish the persisting options and crashes.
      
      Test Plan: Add a test in column_family_test that can reproduce the crash
      
      Reviewers: anthony, IslamAbdelRahman, rven, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D51609
      2fa3ed51
  2. 04 12月, 2015 1 次提交
    • A
      added public api to schedule flush/compaction, code to prevent race with db::open · e8180f99
      Alex Yang 提交于
      Summary:
      Fixes T8781168.
      
      Added a new function EnableAutoCompactions in db.h to be publicly
      avialable.  This allows compaction to be re-enabled after disabling it via
      SetOptions
      
      Refactored code to set the dbptr earlier on in TransactionDB::Open and DB::Open
      Temporarily disable auto_compaction in TransactionDB::Open until dbptr is set to
      prevent race condition.
      
      Test Plan:
      Ran make all check
      
      verified fix on myrocks side:
      was able to reproduce the seg fault with
      ../tools/mysqltest.sh --mem --force rocksdb.drop_table
      
      method was to manually sleep the thread after DB::Open but before TransactionDB ptr was
      assigned in transaction_db_impl.cc:
        DB::Open(db_options, dbname, column_families_copy, handles, &db);
        clock_t goal = (60000 * 10) + clock();
        while (goal > clock());
        ...dbptr(aka rdb) gets assigned below
      
      verified my changes fixed the issue.
      
      Also added unit test 'ToggleAutoCompaction' in transaction_test.cc
      
      Reviewers: hermanlee4, anthony
      
      Reviewed By: anthony
      
      Subscribers: alex, dhruba
      
      Differential Revision: https://reviews.facebook.net/D51147
      e8180f99
  3. 01 12月, 2015 1 次提交
    • S
      DB to only flush the column family with the largest memtable while... · db320b1b
      sdong 提交于
      DB to only flush the column family with the largest memtable while option.db_write_buffer_size is hit
      
      Summary: When option.db_write_buffer_size is hit, we currently flush all column families. Move to flush the column family with the largest active memt table instead. In this way, we can avoid too many small files in some cases.
      
      Test Plan: Modify test DBTest.SharedWriteBuffer to work with the updated behavior
      
      Reviewers: kradhakrishnan, yhchiang, rven, anthony, IslamAbdelRahman, igor
      
      Reviewed By: igor
      
      Subscribers: march, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D51291
      db320b1b
  4. 17 11月, 2015 1 次提交
  5. 14 11月, 2015 1 次提交
  6. 13 11月, 2015 1 次提交
    • N
      Don't merge WriteBatch-es if WAL is disabled · 6ce42dd0
      Nathan Bronson 提交于
      Summary:
      There's no need for WriteImpl to flatten the write batch group
      into a single WriteBatch if the WAL is disabled.  This diff moves the
      flattening into the WAL step, and skips flattening entirely if it isn't
      needed.  It's good for about 5% speedup on a multi-threaded workload
      with no WAL.
      
      This diff also adds clarifying comments about the chance for partial
      failure of WriteBatchInternal::InsertInto, and always sets bg_error_ if
      the memtable state diverges from the logged state or if a WriteBatch
      succeeds only partially.
      
      Benchmark for speedup:
        db_bench -benchmarks=fillrandom -threads=16 -batch_size=1 -memtablerep=skip_list -value_size=0 --num=200000 -level0_slowdown_writes_trigger=9999 -level0_stop_writes_trigger=9999 -disable_auto_compactions --max_write_buffer_number=8 -max_background_flushes=8 --disable_wal --write_buffer_size=160000000
      
      Test Plan: asserts + make check
      
      Reviewers: sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D50583
      6ce42dd0
  7. 11 11月, 2015 1 次提交
    • Y
      Enable RocksDB to persist Options file. · e114f0ab
      Yueh-Hsuan Chiang 提交于
      Summary:
      This patch allows rocksdb to persist options into a file on
      DB::Open, SetOptions, and Create / Drop ColumnFamily.
      Options files are created under the same directory as the rocksdb
      instance.
      
      In addition, this patch also adds a fail_if_missing_options_file in DBOptions
      that makes any function call return non-ok status when it is not able to
      persist options properly.
      
        // If true, then DB::Open / CreateColumnFamily / DropColumnFamily
        // / SetOptions will fail if options file is not detected or properly
        // persisted.
        //
        // DEFAULT: false
        bool fail_if_missing_options_file;
      
      Options file names are formatted as OPTIONS-<number>, and RocksDB
      will always keep the latest two options files.
      
      Test Plan:
      Add options_file_test.
      
      options_test
      column_family_test
      
      Reviewers: igor, IslamAbdelRahman, sdong, anthony
      
      Reviewed By: anthony
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48285
      e114f0ab
  8. 06 11月, 2015 1 次提交
    • V
      Prefix-based iterating only shows keys in prefix · 9d50afc3
      Venkatesh Radhakrishnan 提交于
      Summary:
      MyRocks testing found an issue that while iterating over keys
      that are outside the prefix, sometimes wrong results were seen for keys
      outside the prefix. We now tighten the range of keys seen with a new
      read option called prefix_seen_at_start. This remembers the starting
      prefix and then compares it on a Next for equality of prefix. If they
      are from a different prefix, it sets valid to false.
      
      Test Plan: PrefixTest.PrefixValid
      
      Reviewers: IslamAbdelRahman, sdong, yhchiang, anthony
      
      Reviewed By: anthony
      
      Subscribers: spetrunia, hermanlee4, yoshinorim, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D50211
      9d50afc3
  9. 04 11月, 2015 1 次提交
  10. 31 10月, 2015 1 次提交
    • I
      Update DB::AddFile() to have less restrictions · ff4499e2
      Islam AbdelRahman 提交于
      Summary:
      Update DB::AddFile() restrictions to be
        - Key range in loaded table file don't overlap with existing keys or tombstones in DB.
        - No other writes happen during AddFile call.
      
      The updated AddFile() will verify that the file key range don't overlap with any keys or tombstones in the DB, and then add the file to L0
      
      Test Plan: unit tests
      
      Reviewers: igor, rven, anthony, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: adsharma, ameyag, dhruba
      
      Differential Revision: https://reviews.facebook.net/D49233
      ff4499e2
  11. 30 10月, 2015 2 次提交
    • I
      Clean and expose CreateLoggerFromOptions · 2872e0c8
      Islam AbdelRahman 提交于
      Summary:
      CreateLoggerFromOptions have some parameters like  db_log_dir and env, these parameters are redundant since they already exist in DBOptions
      
      this patch remove the redundant parameters and expose CreateLoggerFromOptions to users
      
      Test Plan: make check
      
      Reviewers: igor, anthony, yhchiang, rven, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, hermanlee4
      
      Differential Revision: https://reviews.facebook.net/D49713
      2872e0c8
    • S
      "make format" in some recent commits · 296c3a1f
      sdong 提交于
      Summary: Run "make format" for some recent commits.
      
      Test Plan: Build and run tests
      
      Reviewers: IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D49707
      296c3a1f
  12. 29 10月, 2015 1 次提交
  13. 27 10月, 2015 1 次提交
  14. 20 10月, 2015 2 次提交
  15. 19 10月, 2015 5 次提交
  16. 18 10月, 2015 1 次提交
  17. 17 10月, 2015 3 次提交
  18. 14 10月, 2015 3 次提交
    • I
      Make db_test_util compile under ROCKSDB_LITE · f55d3009
      Islam AbdelRahman 提交于
      Summary: db_test_util is used in multiple test files but it dont compile under ROCKSDB_LITE
      
      Test Plan:
      make check
      make static_lib
      OPT=-DROCKSDB_LITE make db_wal_test
      
      Reviewers: igor, yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48579
      f55d3009
    • S
      Seperate InternalIterator from Iterator · 35ad531b
      sdong 提交于
      Summary:
      Separate a new class InternalIterator from class Iterator, when the look-up is done internally, which also means they operate on key with sequence ID and type.
      
      This change will enable potential future optimizations but for now InternalIterator's functions are still the same as Iterator's.
      At the same time, separate the cleanup function to a separate class and let both of InternalIterator and Iterator inherit from it.
      
      Test Plan: Run all existing tests.
      
      Reviewers: igor, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, rven
      
      Reviewed By: rven
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48549
      35ad531b
    • P
      Put wal_filter under #ifndef ROCKSDB_LITE · cc4d13e0
      Praveen Rao 提交于
      cc4d13e0
  19. 13 10月, 2015 2 次提交
  20. 10 10月, 2015 2 次提交
    • A
      Passing table properties to compaction callback · 3d07b815
      Alexey Maykov 提交于
      Summary: It would be nice to have and access to table properties in compaction callbacks. In MyRocks project, it will make possible to update optimizer statistics online.
      
      Test Plan: ran the unit test. Ran myrocks with the new way of collecting stats.
      
      Reviewers: igor, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48267
      3d07b815
    • S
      Pass column family ID to table property collector · 776bd8d5
      sdong 提交于
      Summary: Pass column family ID through TablePropertiesCollectorFactory::CreateTablePropertiesCollector() so that users can identify which column family this file is for and handle it differently.
      
      Test Plan: Add unit test scenarios in tests related to table properties collectors to verify the information passed in is correct.
      
      Reviewers: rven, yhchiang, anthony, kradhakrishnan, igor, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: yoshinorim, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48411
      776bd8d5
  21. 07 10月, 2015 1 次提交
    • D
      Support for LevelDB SST with .ldb suffix · 02675026
      dyniusz 提交于
      Summary:
      	Handle SST files with both ".sst" and ".ldb" suffix.
      	This enables user to migrate from leveldb to rocksdb.
      
      Test Plan:
              Added unit test with DB operating on SSTs with names schema.
              See db/dc_test.cc:SSTsWithLdbSuffixHandling for details
      
      Reviewers: yhchiang, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D48003
      02675026
  22. 03 10月, 2015 1 次提交
  23. 24 9月, 2015 1 次提交
    • I
      Add experimental DB::AddFile() to plug sst files into empty DB · f03b5c98
      Islam AbdelRahman 提交于
      Summary:
      This is an initial version of bulk load feature
      
      This diff allow us to create sst files, and then bulk load them later, right now the restrictions for loading an sst file are
      (1) Memtables are empty
      (2) Added sst files have sequence number = 0, and existing values in database have sequence number = 0
      (3) Added sst files values are not overlapping
      
      Test Plan: unit testing
      
      Reviewers: igor, ott, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb, ott, dhruba
      
      Differential Revision: https://reviews.facebook.net/D39081
      f03b5c98
  24. 23 9月, 2015 1 次提交
  25. 18 9月, 2015 2 次提交
    • A
      Support for SingleDelete() · 014fd55a
      Andres Noetzli 提交于
      Summary:
      This patch fixes #7460559. It introduces SingleDelete as a new database
      operation. This operation can be used to delete keys that were never
      overwritten (no put following another put of the same key). If an overwritten
      key is single deleted the behavior is undefined. Single deletion of a
      non-existent key has no effect but multiple consecutive single deletions are
      not allowed (see limitations).
      
      In contrast to the conventional Delete() operation, the deletion entry is
      removed along with the value when the two are lined up in a compaction. Note:
      The semantics are similar to @igor's prototype that allowed to have this
      behavior on the granularity of a column family (
      https://reviews.facebook.net/D42093 ). This new patch, however, is more
      aggressive when it comes to removing tombstones: It removes the SingleDelete
      together with the value whenever there is no snapshot between them while the
      older patch only did this when the sequence number of the deletion was older
      than the earliest snapshot.
      
      Most of the complex additions are in the Compaction Iterator, all other changes
      should be relatively straightforward. The patch also includes basic support for
      single deletions in db_stress and db_bench.
      
      Limitations:
      - Not compatible with cuckoo hash tables
      - Single deletions cannot be used in combination with merges and normal
        deletions on the same key (other keys are not affected by this)
      - Consecutive single deletions are currently not allowed (and older version of
        this patch supported this so it could be resurrected if needed)
      
      Test Plan: make all check
      
      Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor
      
      Reviewed By: igor
      
      Subscribers: maykov, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43179
      014fd55a
    • V
      Do not flag error if file to be deleted does not exist · 51e1c112
      Venkatesh Radhakrishnan 提交于
      Summary:
      Some users have observed errors in the log file when
      the log file or sst file is already deleted.
      
      Test Plan:
      Make sure that the errors do not appear for already deleted
      files.
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: anthony, kradhakrishnan, yhchiang, rven, igor, IslamAbdelRahman, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D47115
      51e1c112