1. 17 Apr, 2018 3 commits
  2. 16 Apr, 2018 4 commits
  3. 14 Apr, 2018 5 commits
    • Implemented Knuth shuffle to construct permutation for selecting no_o… · 28087acd
      Amy Tai committed
      Summary:
      …verwrite_keys. Also changed each no_overwrite_key set to an unordered set; otherwise the Knuth shuffle only gives a 2x speedup, because insertion (and the subsequent internal sorting) into an ordered set is the bottleneck.
      
      With this change, each iteration of permutation construction and prefix selection takes around 40 secs, as opposed to 360 secs previously. However, this still means that with the default 10 CF per blackbox test case, the test is going to time out given the default interval of 200 secs.
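
As an illustration of the approach described above, a minimal C++ sketch of a Knuth (Fisher-Yates) shuffle whose prefix is stored in an unordered set might look like the following. The function name `PickNoOverwriteKeys` and its parameters are hypothetical and not taken from db_stress. The sketch also stops shuffling once the prefix is fixed, which is an assumption rather than something the commit states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper (not the db_stress code): choose `percent`% of the keys
// in [0, max_key) uniformly at random and return them as an unordered set.
std::unordered_set<int64_t> PickNoOverwriteKeys(int64_t max_key, int percent,
                                                std::mt19937_64& rng) {
  assert(max_key >= 0 && percent >= 0 && percent <= 100);

  // Start from the identity permutation 0, 1, ..., max_key - 1.
  std::vector<int64_t> permutation(static_cast<std::size_t>(max_key));
  std::iota(permutation.begin(), permutation.end(), int64_t{0});

  const std::size_t num_selected =
      permutation.size() * static_cast<std::size_t>(percent) / 100;

  // Knuth shuffle, stopped early: slot i receives a uniformly random element
  // from the not-yet-fixed suffix [i, n). Only the prefix is needed.
  for (std::size_t i = 0; i < num_selected; ++i) {
    std::uniform_int_distribution<std::size_t> pick(i, permutation.size() - 1);
    std::swap(permutation[i], permutation[pick(rng)]);
  }

  // Hash-set insertion is O(1) on average; an ordered set would pay
  // O(log n) per insertion to keep its elements sorted.
  std::unordered_set<int64_t> selected;
  selected.reserve(num_selected);
  selected.insert(permutation.begin(),
                  permutation.begin() +
                      static_cast<std::ptrdiff_t>(num_selected));
  return selected;
}
```

Because the permutation holds distinct keys, the returned set always contains exactly `num_selected` elements.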
      
      Also, there is currently an assertion error affecting all blackbox tests in db_crashtest.py; this assertion error will be fixed in a future PR.
      Closes https://github.com/facebook/rocksdb/pull/3699
      
      Differential Revision: D7624616
      
      Pulled By: amytai
      
      fbshipit-source-id: ea64fbe83407ff96c1c0ecabbc6c830576939393
    • Make database files' permissions configurable · a0102aa6
      Xiaofei Du committed
      Summary: Closes https://github.com/facebook/rocksdb/pull/3709
      
      Differential Revision: D7610227
      
      Pulled By: xiaofeidu008
      
      fbshipit-source-id: 88a52f0f9f96e2195fccde995cf9760b785e9f07
    • add kEntryRangeDeletion · 31ee4bf2
      zhangjinpeng1987 committed
      Summary:
      When there are many range deletions in a range, we want to trigger manual compaction on this range to reclaim disk space as soon as possible and speed up reads.
      After this change, we can collect information about range deletions and store it in user properties, which can guide our manual compaction.
      Closes https://github.com/facebook/rocksdb/pull/3695
      
      Differential Revision: D7570322
      
      Pulled By: ajkr
      
      fbshipit-source-id: c358fa43b0aac6cc954d2eadc7d3bd8015373369
    • Merge raw and shared pointer log method impls · 1f5457ef
      Steven Fackler committed
      Summary:
      Calling rocksdb::Log, rocksdb::Info, etc with a `shared_ptr<Logger>` should behave the same as calling those functions with a `Logger *`. This PR achieves it by making the `shared_ptr<Logger>` versions delegate to the `Logger *` versions.
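
The delegation pattern described above can be sketched roughly as follows. Everything lives in a hypothetical `sketch` namespace: `Logger`, `StringLogger`, and `LogvImpl` are simplified stand-ins invented for illustration, not the real RocksDB declarations.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <memory>
#include <string>

namespace sketch {

// Simplified stand-in for a logger interface (not rocksdb::Logger).
class Logger {
 public:
  virtual ~Logger() = default;
  virtual void Logv(const char* format, va_list ap) = 0;
};

// Single shared implementation; a null logger is silently ignored.
static void LogvImpl(Logger* info_log, const char* format, va_list ap) {
  assert(format != nullptr);
  if (info_log != nullptr) {
    info_log->Logv(format, ap);
  }
}

// Raw-pointer overload.
void Log(Logger* info_log, const char* format, ...) {
  va_list ap;
  va_start(ap, format);
  LogvImpl(info_log, format, ap);
  va_end(ap);
}

// shared_ptr overload: unwraps the pointer and takes the same path,
// so both overloads behave identically by construction.
void Log(const std::shared_ptr<Logger>& info_log, const char* format, ...) {
  va_list ap;
  va_start(ap, format);
  LogvImpl(info_log.get(), format, ap);
  va_end(ap);
}

// Test logger that remembers the last formatted message.
class StringLogger : public Logger {
 public:
  void Logv(const char* format, va_list ap) override {
    char buf[256];
    std::vsnprintf(buf, sizeof(buf), format, ap);
    last = buf;
  }
  std::string last;
};

}  // namespace sketch
```

With this layout, any behavior added to the raw-pointer path (null checks, level filtering) is picked up by the `shared_ptr` path automatically.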
      
      Closes #3689
      Closes https://github.com/facebook/rocksdb/pull/3710
      
      Differential Revision: D7595557
      
      Pulled By: ajkr
      
      fbshipit-source-id: 64dd7f20fd42dc821bac7b8032705c35b483e00d
    • Improve accuracy of I/O stats collection of external SST ingestion. · c81b0abe
      Yanqin Jin committed
      Summary:
      RocksDB supports ingestion of external SSTs. If ingestion_options.move_files is true, RocksDB first tries to hard-link the external SSTs during ingestion. If an external SST file resides on a different FS, or the underlying FS does not support hard links, RocksDB performs an actual file copy. However, no matter which path is taken, the current code increases bytes-written when updating compaction stats, which is inaccurate when RocksDB does NOT copy the file.
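
The accounting issue can be illustrated with a small sketch. The struct `IngestStats`, its fields, and the function `AccountIngestedFile` are hypothetical names chosen for this example; the commit message does not say how the real stats are named or updated.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

namespace sketch {

// Hypothetical counters, for illustration only.
struct IngestStats {
  uint64_t bytes_written = 0;  // bytes physically copied
  uint64_t bytes_moved = 0;    // bytes made visible via a hard link
};

// `try_link` stands in for the hard-link attempt and returns true on success.
// It is only attempted when move_files is set, mirroring the description.
void AccountIngestedFile(uint64_t file_size, bool move_files,
                         const std::function<bool()>& try_link,
                         IngestStats* stats) {
  assert(stats != nullptr);
  const bool linked = move_files && try_link();
  if (linked) {
    // No data was copied, so bytes_written must stay untouched.
    stats->bytes_moved += file_size;
  } else {
    // Link was not requested or failed: fall back to a copy.
    stats->bytes_written += file_size;
  }
}

}  // namespace sketch
```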
      
      Rename a sync point.
      Closes https://github.com/facebook/rocksdb/pull/3713
      
      Differential Revision: D7604151
      
      Pulled By: riversand963
      
      fbshipit-source-id: dd0c0d9b9a69c7d9ffceafc3d9c23371aa413586
  4. 13 Apr, 2018 2 commits
  5. 12 Apr, 2018 2 commits
    • WritePrepared Txn: fix smallest_prep atomicity issue · 6f5e6445
      Maysam Yabandeh committed
      Summary:
      We introduced the smallest_prep optimization in commit b225de7e, which enables storing the smallest uncommitted sequence number along with the snapshot. This lets readers that read from the snapshot skip further checks and safely assume the data is committed if its sequence number is less than the smallest uncommitted sequence number at the time the snapshot was taken. The problem was that the smallest uncommitted and the snapshot must be taken atomically, and the lack of atomicity had led to readers using a smallest uncommitted obtained after the snapshot was taken and hence mistakenly skipping some data.
      This patch fixes the problem by i) separating the removal of prepare entries from the AddCommitted function, ii) removing the prepare entries AFTER the committed sequence number is published, iii) getting the smallest uncommitted (from the prepare list) BEFORE taking a snapshot. This guarantees that the smallest uncommitted that accompanies a snapshot is less than or equal to the value that would have been obtained atomically.
      
      Tested by running MySQLStyleTransactionTest/MySQLStyleTransactionTest.TransactionStressTest that was failing sporadically.
      Closes https://github.com/facebook/rocksdb/pull/3703
      
      Differential Revision: D7581934
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: dc9d6f4fb477eba75d4d5927326905b548a96a32
    • Improve visibility into the reasons for compaction. · d42bd041
      Yanqin Jin committed
      Summary:
      Add `compaction_reason` as part of event log for event `compaction started`.
      Add counters for each `CompactionReason`.
      Closes https://github.com/facebook/rocksdb/pull/3679
      
      Differential Revision: D7550348
      
      Pulled By: riversand963
      
      fbshipit-source-id: a19cff3a678c785aa5ef41aac78b9a5968fcc34d
  6. 11 Apr, 2018 3 commits
    • fix calling SetOptions on deprecated options · 019d7894
      Andrew Kryczka committed
      Summary:
      In `cf_options_type_info`, the deprecated options are all considered to have offset zero in the `MutableCFOptions` struct. Previously we weren't checking in `GetMutableOptionsFromStrings` whether the provided option was deprecated or not and simply wrote the provided value to the offset specified by `cf_options_type_info`. That meant setting any deprecated option would overwrite the first element in the struct, which is `write_buffer_size`. `db_stress` hit this often since it calls `SetOptions` with `soft_rate_limit=0` and `hard_rate_limit=0`. Both are deprecated, so `write_buffer_size` was set to zero, which caused a crash on the following assertion:
      
      ```
      db_stress: db/memtable.cc:106: rocksdb::MemTable::MemTable(const rocksdb::InternalKeyComparator&, const rocksdb::ImmutableCFOptions&, const rocksdb::MutableCFOptions&, rocksdb::WriteBufferManager*, rocksdb::SequenceNumber, uint32_t): Assertion `!ShouldScheduleFlush()' failed.
      ```
      
      We fix it by skipping deprecated options (and logging a warning) when users provide them to `SetOptions`. I didn't want to fail the call for compatibility reasons.
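
A toy model of the bug and the fix is sketched below. The struct `MutableOptions`, the `TypeInfo` table, the `max_bytes` field, and the `skip_deprecated` switch are all hypothetical simplifications; only the idea (deprecated entries carry offset zero and must be skipped rather than written) comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <map>
#include <string>

namespace sketch {

// Stand-in for MutableCFOptions: write_buffer_size is the first member,
// so it lives at offset zero.
struct MutableOptions {
  uint64_t write_buffer_size = 64 << 20;
  uint64_t max_bytes = 100;  // hypothetical second option
};

struct OptionInfo {
  std::size_t offset;
  bool deprecated;
};

// Toy version of an option table: deprecated entries all use offset 0.
const std::map<std::string, OptionInfo>& TypeInfo() {
  static const std::map<std::string, OptionInfo> info = {
      {"write_buffer_size", {offsetof(MutableOptions, write_buffer_size), false}},
      {"max_bytes", {offsetof(MutableOptions, max_bytes), false}},
      {"soft_rate_limit", {0, true}},
      {"hard_rate_limit", {0, true}},
  };
  return info;
}

// Returns false for unknown options. With skip_deprecated == false the
// function reproduces the bug; with true it models the fix.
bool SetOption(MutableOptions* opts, const std::string& name,
               const std::string& value, bool skip_deprecated) {
  assert(opts != nullptr);
  const auto it = TypeInfo().find(name);
  if (it == TypeInfo().end()) {
    return false;
  }
  if (skip_deprecated && it->second.deprecated) {
    return true;  // accepted but ignored (the real fix also logs a warning)
  }
  char* addr = reinterpret_cast<char*>(opts) + it->second.offset;
  *reinterpret_cast<uint64_t*>(addr) = std::strtoull(value.c_str(), nullptr, 10);
  return true;
}

}  // namespace sketch
```

Returning success for a skipped deprecated option mirrors the compatibility choice mentioned above: callers that still pass old options keep working.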
      Closes https://github.com/facebook/rocksdb/pull/3700
      
      Differential Revision: D7572596
      
      Pulled By: ajkr
      
      fbshipit-source-id: bd5d84e14c0c39f30c5d4c6df7c1503d2c28ecf1
    • fix some text in comments. · d95014b9
      Yanqin Jin committed
      Summary:
      1. Remove redundant text.
      2. Make terminology consistent across all comments and doc of RocksDB. Also do
         our best to conform to conventions. Specifically, use 'callback' instead of
         'call-back' [wikipedia](https://en.wikipedia.org/wiki/Callback_(computer_programming)).
      Closes https://github.com/facebook/rocksdb/pull/3693
      
      Differential Revision: D7560396
      
      Pulled By: riversand963
      
      fbshipit-source-id: ba8c251c487f4e7d1872a1a8dc680f9e35a6ffb8
    • make MockTimeEnv::current_time_ atomic to fix data race · 2770a94c
      Zhongyi Xie committed
      Summary:
      fix a new TSAN failure
      https://gist.github.com/miasantreble/7599c33f4e17da1024c67d4540dbe397
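
For readers unfamiliar with this kind of fix, the general shape is sketched below. `MockClock` is a hypothetical stand-in, not the real `MockTimeEnv`; the only point is that a time value written by one thread and read by another is held in a `std::atomic`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical mock clock. With a plain int64_t member, a reader thread
// calling now() while a test thread calls set_current_time() would be a
// data race that TSAN reports; std::atomic makes both accesses well defined.
class MockClock {
 public:
  void set_current_time(int64_t t) {
    assert(t >= current_time_.load());  // mock time only moves forward
    current_time_.store(t);
  }

  int64_t now() const { return current_time_.load(); }

 private:
  std::atomic<int64_t> current_time_{0};
};

}  // namespace sketch
```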
      Closes https://github.com/facebook/rocksdb/pull/3694
      
      Differential Revision: D7565310
      
      Pulled By: miasantreble
      
      fbshipit-source-id: f672c96e925797b34dec6e20b59527e8eebaa825
  7. 10 Apr, 2018 5 commits
  8. 08 Apr, 2018 2 commits
    • WritePrepared Txn: add stats · bde1c1a7
      Maysam Yabandeh committed
      Summary:
      Add stats that help monitor whether the DB has entered unlikely states that would hurt performance. These are mostly cases where we end up needing to acquire a mutex.
      Closes https://github.com/facebook/rocksdb/pull/3683
      
      Differential Revision: D7529393
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: f7d36279a8f39bd84d8ddbf64b5c97f670c5d6d9
    • WritePrepared Txn: add write_committed option to dump_wal · eb5a2954
      Maysam Yabandeh committed
      Summary:
      Currently dump_wal cannot print the prepared records from a WAL generated by the WRITE_PREPARED write policy, since the default reaction of the handler is to return NotSupported if WRITE_PREPARED markers are encountered. This patch enables the admin to pass the --write_committed=false option, which is then passed on to the handler. Note that DBFileDumperCommand and DBDumperCommand are not yet updated by this patch: they are not urgent, and this approach will need to be revised later when WRITE_UNPREPARED markers are added, so I leave them for future work.
      
      Tested by running it on a WAL generated by WRITE_PREPARED:
      $ ./ldb dump_wal --walfile=/dev/shm/dbbench/000003.log  | grep BEGIN_PREARE | head -1
      1,2,70,0,BEGIN_PREARE
      $ ./ldb dump_wal --walfile=/dev/shm/dbbench/000003.log --write_committed=false | grep BEGIN_PREARE | head -1
      1,2,70,0,BEGIN_PREARE PUT(0) : 0x30303031313330313938 PUT(0) : 0x30303032353732313935 END_PREPARE(0x74786E31313535383434323738303738363938313335312D30)
      Closes https://github.com/facebook/rocksdb/pull/3682
      
      Differential Revision: D7522090
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: a0332207261c61e18b2f9dfbe9feecd9a1339aca
  9. 07 Apr, 2018 2 commits
  10. 06 Apr, 2018 8 commits
    • protect valid backup files when max_valid_backups_to_open is set · faba3fb5
      Andrew Kryczka committed
      Summary:
      When `max_valid_backups_to_open` is set, the `BackupEngine` doesn't know about the files referenced by existing backups. This PR prevents us from deleting valid files when that option is set, in cases where we are unable to accurately determine refcount. There are warnings logged when we may miss deleting unreferenced files, and a recommendation in the header for users to periodically unset this option and run a full `GarbageCollect`.
      Closes https://github.com/facebook/rocksdb/pull/3518
      
      Differential Revision: D7008331
      
      Pulled By: ajkr
      
      fbshipit-source-id: 87907f964dc9716e229d08636a895d2fc7b72305
    • fix shared library compile on ppc · 65717700
      zhsj committed
      Summary:
      shared-ppc-objects is missing from the $(SHARED4) target.
      Closes https://github.com/facebook/rocksdb/pull/3619
      
      Differential Revision: D7475767
      
      Pulled By: ajkr
      
      fbshipit-source-id: d957ac7290bab3cd542af504405fb5ff912bfbf1
    • Support for Column family specific paths. · 446b32cf
      Phani Shekhar Mantripragada committed
      Summary:
      In this change, an option to set different paths for different column families is added.
      This option is set via the cf_paths setting of ColumnFamilyOptions and works in a similar fashion to the db_paths setting. cf_paths is a vector of DbPath values, each of which holds an absolute path and a target size. Multiple levels in a column family can go to different paths if cf_paths has more than one path.
      To maintain backward compatibility, if cf_paths is not specified for a column family, db_paths setting will be used. Note that, if db_paths setting is also not specified, RocksDB already has code to use db_name as the only path.
      
      Changes :
      1) A new member "cf_paths" is added to ImmutableCFOptions. It is set based on the cf_paths setting of ColumnFamilyOptions and the db_paths setting of ImmutableDBOptions. This member is used to identify the path information whenever files are accessed.
      2) Validation checks are added for cf_paths setting based on existing checks for db_paths setting.
      3) DestroyDB, PurgeObsoleteFiles etc. are edited to support multiple cf_paths.
      4) Unit tests are added appropriately.
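
The fallback order described in this summary (cf_paths, then db_paths, then db_name) can be sketched as below. The function `EffectivePaths` is a hypothetical helper, the `DbPath` struct here is a local stand-in, and using the maximum target size for the db_name fallback is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

namespace sketch {

// Local stand-in: an absolute path plus a target size in bytes.
struct DbPath {
  std::string path;
  uint64_t target_size;
};

// Hypothetical helper: resolve which paths a column family should use.
std::vector<DbPath> EffectivePaths(const std::vector<DbPath>& cf_paths,
                                   const std::vector<DbPath>& db_paths,
                                   const std::string& db_name) {
  assert(!db_name.empty());
  if (!cf_paths.empty()) {
    return cf_paths;  // column-family-specific paths take priority
  }
  if (!db_paths.empty()) {
    return db_paths;  // backward-compatible fallback
  }
  // Last resort: the database directory itself (size limit is an assumption).
  return {DbPath{db_name, std::numeric_limits<uint64_t>::max()}};
}

}  // namespace sketch
```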
      Closes https://github.com/facebook/rocksdb/pull/3102
      
      Differential Revision: D6951697
      
      Pulled By: ajkr
      
      fbshipit-source-id: 60d2262862b0a8fd6605b09ccb0da32bb331787d
    • Stats for false positive rate of full filters · 67182678
      Maysam Yabandeh committed
      Summary:
      Adds two stats to allow us measuring the false positive rate of full filters:
      - The total count of positives: rocksdb.bloom.filter.full.positive
      - The total count of true positives: rocksdb.bloom.filter.full.true.positive
      Note the term "full" in the stat names, which indicates that they are meaningful only for full filters. Block-based filters are to be deprecated soon, and supporting them is not worth the additional cost of if-then-else branches.
      
      Closes #3680
      
      Tested by:
      $ ./db_bench -benchmarks=fillrandom  -db /dev/shm/rocksdb-tmpdb --num=1000000 -bloom_bits=10
      $ ./db_bench -benchmarks="readwhilewriting"  -db /dev/shm/rocksdb-tmpdb --statistics -bloom_bits=10 --duration=60 --num=2000000 --use_existing_db 2>&1 > /tmp/full.log
      $ grep filter.full /tmp/full.log
      rocksdb.bloom.filter.full.positive COUNT : 3628593
      rocksdb.bloom.filter.full.true.positive COUNT : 3536026
      which gives a false positive rate of about 2.55%: (3628593 - 3536026) / 3628593.
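
The rate quoted above follows from the two counters. A small sketch of the arithmetic, using a hypothetical helper `FalsePositiveRate`, is shown here. Note that it computes the share of filter positives that turned out to be false, which is the ratio these two stats allow, not the textbook rate over all negative lookups.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: fraction of filter positives that were false,
// given the total positive count and the true positive count.
double FalsePositiveRate(uint64_t positives, uint64_t true_positives) {
  assert(positives > 0 && true_positives <= positives);
  return static_cast<double>(positives - true_positives) /
         static_cast<double>(positives);
}
```

With the counts from the log above, `FalsePositiveRate(3628593, 3536026)` evaluates to 92567 / 3628593, which is a little over 0.0255.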
      Closes https://github.com/facebook/rocksdb/pull/3681
      
      Differential Revision: D7517570
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 630ab1a473afdce404916d297035b6318de4c052
    • Clock cache should check if deleter is nullptr before calling it · 685912d0
      Yi Wu committed
      Summary:
      Clock cache should check if deleter is nullptr before calling it.
      Closes https://github.com/facebook/rocksdb/pull/3677
      
      Differential Revision: D7493602
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 4f94b188d2baf2cbc7c0d5da30fea1215a683de4
    • Fix pre_release callback argument list. · 147dfc7b
      Dmitri Smirnov committed
      Summary:
      The constness of a by-value primitive parameter does not affect the
        signature of the method, so it has no influence on whether the
        overriding method declares const bool instead of just bool. In
        addition, it is rarely useful and produces compatibility warnings
        in the VS 2015 compiler.
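
The language rule behind this change can be checked with a short sketch. `Callback` and `Impl` are hypothetical types, not the RocksDB pre-release callback classes.

```cpp
#include <type_traits>

namespace sketch {

// Top-level const on a by-value parameter is not part of the function type.
static_assert(std::is_same<void(const bool), void(bool)>::value,
              "const bool and bool parameters yield the same function type");

struct Callback {
  virtual ~Callback() = default;
  // Declared with `const bool` in the base class...
  virtual int Run(const bool flag) = 0;
};

struct Impl : Callback {
  // ...yet a plain `bool` parameter still overrides it, because both
  // declarations have the signature int(bool).
  int Run(bool flag) override { return flag ? 1 : 0; }
};

}  // namespace sketch
```

Since the `const` only constrains the function body, dropping it from the declaration changes nothing for callers or overriders.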
      Closes https://github.com/facebook/rocksdb/pull/3663
      
      Differential Revision: D7475739
      
      Pulled By: ajkr
      
      fbshipit-source-id: fb275378b5acc397399420ae6abb4b6bfe5bd32f
    • Blob DB: blob_dump to show uncompressed values · 36a9f229
      Yi Wu committed
      Summary:
      Make blob_dump tool able to show uncompressed values if the blob file is compressed. Also show total compressed vs. raw size at the end if --show_summary is provided.
      Closes https://github.com/facebook/rocksdb/pull/3633
      
      Differential Revision: D7348926
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: ca709cb4ed5cf6a550ff2987df8033df81516f8e
    • fix build for rocksdb lite · c827b2dc
      Zhongyi Xie committed
      Summary:
      currently rocksdb lite build fails due to the following errors:
      > db/db_sst_test.cc:29:51: error: ‘FlushJobInfo’ does not name a type
         virtual void OnFlushCompleted(DB* /*db*/, const FlushJobInfo& info) override {
                                                         ^
      db/db_sst_test.cc:29:16: error: ‘virtual void rocksdb::FlushedFileCollector::OnFlushCompleted(rocksdb::DB*, const int&)’ marked ‘override’, but does not override
         virtual void OnFlushCompleted(DB* /*db*/, const FlushJobInfo& info) override {
                      ^
      db/db_sst_test.cc:24:7: error: ‘class rocksdb::FlushedFileCollector’ has virtual functions and accessible non-virtual destructor [-Werror=non-virtual-dtor]
       class FlushedFileCollector : public EventListener {
             ^
      db/db_sst_test.cc: In member function ‘virtual void rocksdb::FlushedFileCollector::OnFlushCompleted(rocksdb::DB*, const int&)’:
      db/db_sst_test.cc:31:35: error: request for member ‘file_path’ in ‘info’, which is of non-class type ‘const int’
           flushed_files_.push_back(info.file_path);
                                         ^
      cc1plus: all warnings being treated as errors
      make: *** [db/db_sst_test.o] Error 1
      Closes https://github.com/facebook/rocksdb/pull/3676
      
      Differential Revision: D7493006
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 77dff0a5b23e27db51be9b9798e3744e6fdec64f
  11. 05 Apr, 2018 1 commit
  12. 04 Apr, 2018 2 commits
    • Make Optimistic Tx database stackable · 2a62ca17
      Dmitri Smirnov committed
      Summary:
      This change models the Optimistic Tx DB after the Pessimistic Tx DB. The motivation is to make the pointer polymorphic so that it can be held by the same raw or smart pointer.
      
      Currently, because the Optimistic Tx DB's inheritance is not rooted in a single DB root the way the Pessimistic Tx DB's is, it is more difficult to write clean code and have clear ownership of the database when options dictate instantiation of a plain DB, a Pessimistic Tx DB, or an Optimistic Tx DB.
      Closes https://github.com/facebook/rocksdb/pull/3566
      
      Differential Revision: D7184502
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 31d06efafd79497bb0c230e971857dba3bd962c3
    • Reduce default --nooverwritepercent in black-box crash tests · b058a337
      Andrew Kryczka committed
      Summary:
      Previously `python tools/db_crashtest.py blackbox` would do no useful work as the crash interval (two minutes) was shorter than the preparation phase. The preparation phase is slow because of the ridiculously inefficient way it computes which keys should not be overwritten. It was doing this for 60M keys since default values were `FLAGS_nooverwritepercent == 60` and `FLAGS_max_key == 100000000`.
      
      Move the "nooverwritepercent" override from whitebox-specific to the general options so it also applies to blackbox test runs. Now preparation phase takes a few seconds.
      Closes https://github.com/facebook/rocksdb/pull/3671
      
      Differential Revision: D7457732
      
      Pulled By: ajkr
      
      fbshipit-source-id: 601f4461a6a7e49e50449dcf15aebc9b8a98d6f0
  13. 03 Apr, 2018 1 commit