1. 19 12月, 2017 1 次提交
  2. 12 12月, 2017 1 次提交
    • S
      Refactor ReadBlockContents() · 2f1a3a4d
      Siying Dong 提交于
      Summary:
      Divide ReadBlockContents() to multiple sub-functions. Maintaining the input and intermediate data in a new class BlockFetcher.
      I hope in general it makes the code easier to maintain.
      Another motivation to do it is to clearly divide the logic before file reading and after file reading. The refactor will help us evaluate how can we make I/O async in the future.
      Closes https://github.com/facebook/rocksdb/pull/3244
      
      Differential Revision: D6520983
      
      Pulled By: siying
      
      fbshipit-source-id: 338d90bc0338472d46be7a7682028dc9114b12e9
      2f1a3a4d
  3. 01 12月, 2017 2 次提交
  4. 16 11月, 2017 1 次提交
  5. 14 11月, 2017 1 次提交
  6. 03 11月, 2017 1 次提交
  7. 13 10月, 2017 1 次提交
  8. 07 10月, 2017 1 次提交
    • Y
      WritePrepared Txn: Compaction/Flush · d1b74b0c
      Yi Wu 提交于
      Summary:
      Update Compaction/Flush to support WritePreparedTxnDB: Add SnapshotChecker which is a proxy to query WritePreparedTxnDB::IsInSnapshot. Pass SnapshotChecker to DBImpl on WritePreparedTxnDB open. CompactionIterator use it to check if a key has been committed and if it is visible to a snapshot. In CompactionIterator:
      * check if key has been committed. If not, output uncommitted keys AS-IS.
      * use SnapshotChecker to check if key is visible to a snapshot when in need.
      * do not output key with seq = 0 if the key is not committed.
      Closes https://github.com/facebook/rocksdb/pull/2926
      
      Differential Revision: D5902907
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 945e037fdf0aa652dc5ba0ad879461040baa0320
      d1b74b0c
  9. 04 10月, 2017 1 次提交
    • Y
      Add ValueType::kTypeBlobIndex · d1cab2b6
      Yi Wu 提交于
      Summary:
      Add kTypeBlobIndex value type, which will be used by blob db only, to insert a (key, blob_offset) KV pair. The purpose is to
      1. Make it possible to open existing rocksdb instance as blob db. Existing value will be of kTypeIndex type, while value inserted by blob db will be of kTypeBlobIndex.
      2. Make rocksdb able to detect if the db contains value written by blob db, if so return error.
      3. Make it possible to have blob db optionally store value in SST file (with kTypeValue type) or as a blob value (with kTypeBlobIndex type).
      
      The root db (DBImpl) basically pretended kTypeBlobIndex are normal value on write. On Get if is_blob is provided, return whether the value read is of kTypeBlobIndex type, or return Status::NotSupported() status if is_blob is not provided. On scan allow_blob flag is pass and if the flag is true, return wether the value is of kTypeBlobIndex type via iter->IsBlob().
      
      Changes on blob db side will be in a separate patch.
      Closes https://github.com/facebook/rocksdb/pull/2886
      
      Differential Revision: D5838431
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 3c5306c62bc13bb11abc03422ec5cbcea1203cca
      d1cab2b6
  10. 01 9月, 2017 1 次提交
  11. 14 8月, 2017 1 次提交
    • A
      rocksdb: make buildable on aarch64 · 5449c099
      Andrew Gallagher 提交于
      Summary:
      - Remove default arch-specified flags.
      - Move non-default arch-specific flags to arch-specific param.
      
      Reviewed By: yiwu-arbug
      
      Differential Revision: D5597499
      
      fbshipit-source-id: c53108ac39c73ac36893d3fd9aaf3b5e3080f1ae
      5449c099
  12. 08 8月, 2017 1 次提交
    • M
      Refactor PessimisticTransaction · bdc056f8
      Maysam Yabandeh 提交于
      Summary:
      This patch splits Commit and Prepare into lock-related logic and db-write-related logic. It moves lock-related logic to PessimisticTransaction to be reused by all children classes and movies the existing impl of db-write-related to PrepareInternal, CommitSingleInternal, and CommitInternal in WriteCommittedTxnImpl.
      Closes https://github.com/facebook/rocksdb/pull/2691
      
      Differential Revision: D5569464
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d1b8698e69801a4126c7bc211745d05c636f5325
      bdc056f8
  13. 06 8月, 2017 1 次提交
  14. 02 8月, 2017 1 次提交
    • Y
      Dump Blob DB options to info log · 1900771b
      Yi Wu 提交于
      Summary:
      * Dump blob db options to info log
      * Remove BlobDBOptionsImpl to disallow dynamic cast *BlobDBOptions into *BlobDBOptionsImpl. Move options there to be constants or into BlobDBOptions. The dynamic cast is broken after #2645
      * Change some of the default options
      * Remove blob_db_options.min_blob_size, which is unimplemented. Will implement it soon.
      Closes https://github.com/facebook/rocksdb/pull/2671
      
      Differential Revision: D5529912
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: dcd58ca981db5bcc7f123b65a0d6f6ae0dc703c7
      1900771b
  15. 28 7月, 2017 1 次提交
    • Y
      Blob DB TTL extractor · 6083bc79
      Yi Wu 提交于
      Summary:
      Introducing blob_db::TTLExtractor to replace extract_ttl_fn. The TTL
      extractor can be use to extract TTL from keys insert with Put or
      WriteBatch. Change over existing extract_ttl_fn are:
      * If value is changed, it will be return via std::string* (rather than Slice*). With Slice* the new value has to be part of the existing value. With std::string* the limitation is removed.
      * It can optionally return TTL or expiration.
      
      Other changes in this PR:
      * replace `std::chrono::system_clock` with `Env::NowMicros` so that I can mock time in tests.
      * add several TTL tests.
      * other minor naming change.
      Closes https://github.com/facebook/rocksdb/pull/2659
      
      Differential Revision: D5512627
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 0dfcb00d74d060b8534c6130c808e4d5d0a54440
      6083bc79
  16. 26 7月, 2017 1 次提交
  17. 25 7月, 2017 1 次提交
  18. 22 7月, 2017 1 次提交
    • P
      Cassandra compaction filter for purge expired columns and rows · 534c255c
      Pengchao Wang 提交于
      Summary:
      Major changes in this PR:
      * Implement CassandraCompactionFilter to remove expired columns and rows (if all column expired)
      * Move cassandra related code from utilities/merge_operators/cassandra to utilities/cassandra/*
      * Switch to use shared_ptr<> from uniqu_ptr for Column membership management in RowValue. Since columns do have multiple owners in Merge and GC process, use shared_ptr helps make RowValue immutable.
      * Rename cassandra_merge_test to cassandra_functional_test and add two TTL compaction related tests there.
      Closes https://github.com/facebook/rocksdb/pull/2588
      
      Differential Revision: D5430010
      
      Pulled By: wpc
      
      fbshipit-source-id: 9566c21e06de17491d486a68c70f52d501f27687
      534c255c
  19. 15 7月, 2017 1 次提交
  20. 11 7月, 2017 1 次提交
    • G
      Fix undefined behavior in Hash · 8f927e5f
      Giuseppe Ottaviano 提交于
      Summary:
      Instead of ignoring UBSan checks, fix the negative shifts in
      Hash(). Also add test to make sure the hash values are stable over
      time. The values were computed before this change, so the test also
      verifies the correctness of the change.
      Closes https://github.com/facebook/rocksdb/pull/2546
      
      Differential Revision: D5386369
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 6de4b44461a544d6222cc5d72d8cda2c0373d17e
      8f927e5f
  21. 29 6月, 2017 1 次提交
  22. 28 6月, 2017 2 次提交
    • Y
      Fix TARGETS file tests list · 982cec22
      Yi Wu 提交于
      Summary:
      1. The buckifier script assume each test "foo" comes with a .cc file of the same name (i.e. foo.cc). Update cassandra tests to follow this pattern so that the buckifier script can recognize them.
      2. add blob_db_test
      Closes https://github.com/facebook/rocksdb/pull/2506
      
      Differential Revision: D5331517
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 86f3eba471fc621186ab44cbd073b6162cde8e57
      982cec22
    • Y
      allow numa >= 2.0.8 · b49b3710
      Yi Wu 提交于
      Summary:
      Allow numa >= 2.0.8 in buck TARGET file.
      Closes https://github.com/facebook/rocksdb/pull/2504
      
      Differential Revision: D5330550
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 8ffb6167b4ad913877eac16a20a91023b31f8d41
      b49b3710
  23. 27 6月, 2017 1 次提交
    • E
      Encryption at rest support · 51778612
      Ewout Prangsma 提交于
      Summary:
      This PR adds support for encrypting data stored by RocksDB when written to disk.
      
      It adds an `EncryptedEnv` override of the `Env` class with matching overrides for sequential&random access files.
      The encryption itself is done through a configurable `EncryptionProvider`. This class creates is asked to create `BlockAccessCipherStream` for a file. This is where the actual encryption/decryption is being done.
      Currently there is a Counter mode implementation of `BlockAccessCipherStream` with a `ROT13` block cipher (NOTE the `ROT13` is for demo purposes only!!).
      
      The Counter operation mode uses an initial counter & random initialization vector (IV).
      Both are created randomly for each file and stored in a 4K (default size) block that is prefixed to that file. The `EncryptedEnv` implementation is such that clients of the `Env` class do not see this prefix (nor data, nor in filesize).
      The largest part of the prefix block is also encrypted, and there is room left for implementation specific settings/values/keys in there.
      
      To test the encryption, the `DBTestBase` class has been extended to consider a new environment variable called `ENCRYPTED_ENV`. If set, the test will setup a encrypted instance of the `Env` class to use for all tests.
      Typically you would run it like this:
      
      ```
      ENCRYPTED_ENV=1 make check_some
      ```
      
      There is also an added test that checks that some data inserted into the database is or is not "visible" on disk. With `ENCRYPTED_ENV` active it must not find plain text strings, with `ENCRYPTED_ENV` unset, it must find the plain text strings.
      Closes https://github.com/facebook/rocksdb/pull/2424
      
      Differential Revision: D5322178
      
      Pulled By: sdwilsh
      
      fbshipit-source-id: 253b0a9c2c498cc98f580df7f2623cbf7678a27f
      51778612
  24. 17 6月, 2017 1 次提交
  25. 03 6月, 2017 1 次提交
    • S
      Improve write buffer manager (and allow the size to be tracked in block cache) · 95b0e89b
      Siying Dong 提交于
      Summary:
      Improve write buffer manager in several ways:
      1. Size is tracked when arena block is allocated, rather than every allocation, so that it can better track actual memory usage and the tracking overhead is slightly lower.
      2. We start to trigger memtable flush when 7/8 of the memory cap hits, instead of 100%, and make 100% much harder to hit.
      3. Allow a cache object to be passed into buffer manager and the size allocated by memtable can be costed there. This can help users have one single memory cap across block cache and memtable.
      Closes https://github.com/facebook/rocksdb/pull/2350
      
      Differential Revision: D5110648
      
      Pulled By: siying
      
      fbshipit-source-id: b4238113094bf22574001e446b5d88523ba00017
      95b0e89b
  26. 02 6月, 2017 3 次提交
  27. 01 6月, 2017 2 次提交
    • T
      db: avoid `#include`ing malloc and jemalloc simultaneously · 0dc3040d
      Tamir Duberstein 提交于
      Summary:
      This fixes a compilation failure on Linux when the system libc is not
      glibc. jemalloc's configure script incorrectly assumes that glibc is
      always used on Linux systems, producing glibc-style signatures; when
      the system libc is e.g. musl, the following error is observed:
      
      ```
        [  0%] Building CXX object CMakeFiles/rocksdb.dir/db/db_impl.cc.o
        In file included from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/table/block.h:19:0,
                         from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/db/db_impl.cc:77:
        /x-tools/x86_64-unknown-linux-musl/x86_64-unknown-linux-musl/sysroot/usr/include/malloc.h:19:8: error: declaration of 'size_t malloc_usable_size(void*)' has a different exception specifier
         size_t malloc_usable_size(void *);
                ^~~~~~~~~~~~~~~~~~
        In file included from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/db/db_impl.cc:20:0:
        /go/native/x86_64-unknown-linux-musl/jemalloc/include/jemalloc/jemalloc.h:78:33: note: from previous declaration 'size_t malloc_usable_size(void*) throw ()'
         #  define je_malloc_usable_size malloc_usable_size
                                         ^
        /go/native/x86_64-unknown-linux-musl/jemalloc/include/jemalloc/jemalloc.h:239:41: note: in expansion of macro 'je_malloc_usable_size'
         JEMALLOC_EXPORT size_t JEMALLOC_NOTHROW je_malloc_usable_size(
                                                 ^~~~~~~~~~~~~~~~~~~~~
        CMakeFiles/rocksdb.dir/build.make:350: recipe for target 'CMakeFiles/rocksdb.dir/db/db_impl.cc.o' failed
      ```
      
      This works around the issue by rearranging the sources such that
      jemalloc's headers are never in the same scope as the system's malloc
      header. The jemalloc issue has been reported as well, see:
      https://github.com/jemalloc/jemalloc/issues/778.
      
      cc tschottdorf
      Closes https://github.com/facebook/rocksdb/pull/2188
      
      Differential Revision: D5163048
      
      Pulled By: siying
      
      fbshipit-source-id: c553125458892def175c1be5682b0330d80b2a0d
      0dc3040d
    • Y
      Fixing blob db sequence number handling · ad19eb86
      Yi Wu 提交于
      Summary:
      Blob db rely on base db returning sequence number through write batch after DB::Write(). However after recent changes to the write path, DB::Writ()e no longer return sequence number in some cases. Fixing it by have WriteBatchInternal::InsertInto() always encode sequence number into write batch.
      
      Stacking on #2375.
      Closes https://github.com/facebook/rocksdb/pull/2385
      
      Differential Revision: D5148358
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 8bda0aa07b9334ed03ed381548b39d167dc20c33
      ad19eb86
  28. 31 5月, 2017 1 次提交
    • Y
      update blob_db_test · 345878a7
      Yi Wu 提交于
      Summary:
      Re-enable blob_db_test with some update:
      * Commented out delay at the end of GC tests. Will update the logic later with sync point to properly trigger GC.
      * Added some helper functions.
      
      Also update make files to include blob_dump tool.
      Closes https://github.com/facebook/rocksdb/pull/2375
      
      Differential Revision: D5133793
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 95470b26d0c1f9592ba4b7637e027fdd263f425c
      345878a7
  29. 26 5月, 2017 1 次提交
  30. 25 5月, 2017 1 次提交
  31. 16 5月, 2017 2 次提交
    • N
      cross-platform compatibility improvements · 11c5d474
      Nikhil Benesch 提交于
      Summary:
      We've had a couple CockroachDB users fail to build RocksDB on exotic platforms, so I figured I'd try my hand at solving these issues upstream. The problems stem from a) `USE_SSE=1` being too aggressive about turning on SSE4.2, even on toolchains that don't support SSE4.2 and b) RocksDB attempting to detect support for thread-local storage based on OS, even though it can vary by compiler on the same OS.
      
      See the individual commit messages for details. Regarding SSE support, this PR should change virtually nothing for non-CMake based builds. `make`, `PORTABLE=1 make`, `USE_SSE=1 make`, and `PORTABLE=1 USE_SSE=1 make` function exactly as before, except that SSE support will be automatically disabled when a simple SSE4.2-using test program fails to compile, as it does on OpenBSD. (OpenBSD's ports GCC supports SSE4.2, but its binutils do not, so `__SSE_4_2__` is defined but an SSE4.2-using program will fail to assemble.) A warning is emitted in this case. The CMake build is modified to support the same set of options, except that `USE_SSE` is spelled `FORCE_SSE42` because `USE_SSE` is rather useless now that we can automatically detect SSE support, and I figure changing options in the CMake build is less disruptive than changing the non-CMake build.
      
      I've tested these changes on all the platforms I can get my hands on (macOS, Windows MSVC, Windows MinGW, and OpenBSD) and it all works splendidly. Let me know if there's anything you object to—I obviously don't mean to break any of your build pipelines in the process of fixing ours downstream.
      Closes https://github.com/facebook/rocksdb/pull/2199
      
      Differential Revision: D5054042
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 938e1fc665c049c02ae15698e1409155b8e72171
      11c5d474
    • Y
      Fix build error with blob DB. · 86d54925
      Yi Wu 提交于
      Summary:
      snprintf is in <stdio.h> and not in namespace std.
      Closes https://github.com/facebook/rocksdb/pull/2287
      
      Reviewed By: anirbanr-fb
      
      Differential Revision: D5054752
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 356807ec38f3c7d95951cdb41f31a3d3ae0714d4
      86d54925
  32. 13 5月, 2017 1 次提交
    • A
      Add GetAllKeyVersions API · 3fa9a39c
      Andrew Kryczka 提交于
      Summary:
      - Introduced an include/ file dedicated to db-related debug functions to avoid making db.h more complex
      - Added debugging function, `GetAllKeyVersions()`, to return a listing of internal data for a range of user keys. The new `struct KeyVersion` exposes data similar to internal key without exposing any internal type.
      - Migrated the "ldb idump" subcommand to use this function
      - The API takes an inclusive-exclusive range to match behavior of "ldb idump". This will be quite annoying for users who want to query a single user key's versions :(.
      Closes https://github.com/facebook/rocksdb/pull/2232
      
      Differential Revision: D4976007
      
      Pulled By: ajkr
      
      fbshipit-source-id: cab375da53a7595d6575af2b7e3b776aa3ad793e
      3fa9a39c
  33. 11 5月, 2017 2 次提交