1. 03 Jun, 2017 (1 commit)
    • Improve write buffer manager (and allow the size to be tracked in block cache) · 95b0e89b
      Siying Dong committed
      Summary:
      Improve write buffer manager in several ways:
      1. Size is tracked when an arena block is allocated, rather than on every allocation, so that it better tracks actual memory usage and the tracking overhead is slightly lower.
      2. We start to trigger memtable flush when usage reaches 7/8 of the memory cap, instead of 100%, which makes 100% much harder to hit.
      3. Allow a cache object to be passed into the buffer manager so that the memory allocated by memtables can be charged to it. This helps users keep one single memory cap across block cache and memtable.
      Closes https://github.com/facebook/rocksdb/pull/2350
      
      Differential Revision: D5110648
      
      Pulled By: siying
      
      fbshipit-source-id: b4238113094bf22574001e446b5d88523ba00017
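      The first two points of the summary above can be pictured with a small standalone sketch. It is not RocksDB's WriteBufferManager: the class name `SketchBufferManager` and its methods are hypothetical, and cache charging (point 3) is left out. It only shows usage being counted per arena block and the flush trigger sitting at 7/8 of the cap.

      ```cpp
      #include <cassert>
      #include <cstddef>

      // Hypothetical, simplified stand-in for a write buffer manager.
      // Names and behavior are illustrative assumptions, not RocksDB code.
      class SketchBufferManager {
       public:
        explicit SketchBufferManager(std::size_t cap) : cap_(cap) {
          assert(cap > 0);
        }

        // Called once per arena block, not once per small allocation.
        void ReserveBlock(std::size_t block_bytes) { used_ += block_bytes; }

        // Called when an arena block is released.
        void FreeBlock(std::size_t block_bytes) {
          assert(block_bytes <= used_);
          used_ -= block_bytes;
        }

        // A flush is requested once usage reaches 7/8 of the cap
        // (integer arithmetic, so the threshold rounds down).
        bool ShouldFlush() const { return used_ >= cap_ / 8 * 7; }

        std::size_t used() const { return used_; }

       private:
        std::size_t cap_;
        std::size_t used_ = 0;
      };
      ```

      With a cap of 800 bytes, reserving 600 bytes does not request a flush, while reserving another 100 bytes (700 in total) does.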
  2. 02 Jun, 2017 (1 commit)
    • Retire memenv https://github.com/facebook/rocksdb/pull/2082 · 5a9b4d74
      Maysam Yabandeh committed
      Summary:
      This is a manual commit of this PR:
      Retire InMemoryEnv in favor of MockEnv #2082
      With MockEnv doing the same yet being more mature, InMemoryEnv is redundant.
      
      Reviewed By: IslamAbdelRahman
      
      Differential Revision: D5162323
      
      fbshipit-source-id: 59fd0082a891dc99cc531e4da9d68bf891eae3f5
  3. 01 Jun, 2017 (2 commits)
    • db: avoid `#include`ing malloc and jemalloc simultaneously · 0dc3040d
      Tamir Duberstein committed
      Summary:
      This fixes a compilation failure on Linux when the system libc is not
      glibc. jemalloc's configure script incorrectly assumes that glibc is
      always used on Linux systems, producing glibc-style signatures; when
      the system libc is e.g. musl, the following error is observed:
      
      ```
        [  0%] Building CXX object CMakeFiles/rocksdb.dir/db/db_impl.cc.o
        In file included from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/table/block.h:19:0,
                         from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/db/db_impl.cc:77:
        /x-tools/x86_64-unknown-linux-musl/x86_64-unknown-linux-musl/sysroot/usr/include/malloc.h:19:8: error: declaration of 'size_t malloc_usable_size(void*)' has a different exception specifier
         size_t malloc_usable_size(void *);
                ^~~~~~~~~~~~~~~~~~
        In file included from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/db/db_impl.cc:20:0:
        /go/native/x86_64-unknown-linux-musl/jemalloc/include/jemalloc/jemalloc.h:78:33: note: from previous declaration 'size_t malloc_usable_size(void*) throw ()'
         #  define je_malloc_usable_size malloc_usable_size
                                         ^
        /go/native/x86_64-unknown-linux-musl/jemalloc/include/jemalloc/jemalloc.h:239:41: note: in expansion of macro 'je_malloc_usable_size'
         JEMALLOC_EXPORT size_t JEMALLOC_NOTHROW je_malloc_usable_size(
                                                 ^~~~~~~~~~~~~~~~~~~~~
        CMakeFiles/rocksdb.dir/build.make:350: recipe for target 'CMakeFiles/rocksdb.dir/db/db_impl.cc.o' failed
      ```
      
      This works around the issue by rearranging the sources such that
      jemalloc's headers are never in the same scope as the system's malloc
      header. The jemalloc issue has been reported as well, see:
      https://github.com/jemalloc/jemalloc/issues/778.
      
      cc tschottdorf
      Closes https://github.com/facebook/rocksdb/pull/2188
      
      Differential Revision: D5163048
      
      Pulled By: siying
      
      fbshipit-source-id: c553125458892def175c1be5682b0330d80b2a0d
    • Fixing blob db sequence number handling · ad19eb86
      Yi Wu committed
      Summary:
      Blob DB relies on the base DB returning the sequence number through the write batch after DB::Write(). However, after recent changes to the write path, DB::Write() no longer returns the sequence number in some cases. Fix it by having WriteBatchInternal::InsertInto() always encode the sequence number into the write batch.
      
      Stacking on #2375.
      Closes https://github.com/facebook/rocksdb/pull/2385
      
      Differential Revision: D5148358
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 8bda0aa07b9334ed03ed381548b39d167dc20c33
  4. 31 May, 2017 (1 commit)
    • update blob_db_test · 345878a7
      Yi Wu committed
      Summary:
      Re-enable blob_db_test with some updates:
      * Commented out delay at the end of GC tests. Will update the logic later with sync point to properly trigger GC.
      * Added some helper functions.
      
      Also update make files to include blob_dump tool.
      Closes https://github.com/facebook/rocksdb/pull/2375
      
      Differential Revision: D5133793
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 95470b26d0c1f9592ba4b7637e027fdd263f425c
  5. 24 May, 2017 (1 commit)
  6. 23 May, 2017 (1 commit)
  7. 16 May, 2017 (1 commit)
  8. 13 May, 2017 (1 commit)
    • Add GetAllKeyVersions API · 3fa9a39c
      Andrew Kryczka committed
      Summary:
      - Introduced an include/ file dedicated to db-related debug functions to avoid making db.h more complex
      - Added debugging function, `GetAllKeyVersions()`, to return a listing of internal data for a range of user keys. The new `struct KeyVersion` exposes data similar to internal key without exposing any internal type.
      - Migrated the "ldb idump" subcommand to use this function
      - The API takes an inclusive-exclusive range to match behavior of "ldb idump". This will be quite annoying for users who want to query a single user key's versions :(.
      Closes https://github.com/facebook/rocksdb/pull/2232
      
      Differential Revision: D4976007
      
      Pulled By: ajkr
      
      fbshipit-source-id: cab375da53a7595d6575af2b7e3b776aa3ad793e
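      The single-key annoyance mentioned in the summary has a common workaround under an inclusive-exclusive range: use the key itself as the begin key and its immediate bytewise successor as the end key. The sketch below is standalone and does not call RocksDB. `ImmediateSuccessor` and `KeysInRange` are hypothetical helpers, the `std::map` merely stands in for a database, and the trick assumes a bytewise comparator.

      ```cpp
      #include <cassert>
      #include <map>
      #include <string>
      #include <vector>

      // Smallest string strictly greater than `key` under bytewise ordering.
      // Assumption: keys are compared bytewise.
      std::string ImmediateSuccessor(const std::string& key) {
        std::string end = key;
        end.push_back('\0');
        return end;
      }

      // Hypothetical stand-in for a query over the range [begin, end).
      std::vector<std::string> KeysInRange(const std::map<std::string, int>& db,
                                           const std::string& begin,
                                           const std::string& end) {
        assert(begin <= end);
        std::vector<std::string> out;
        for (auto it = db.lower_bound(begin);
             it != db.end() && it->first < end; ++it) {
          out.push_back(it->first);
        }
        return out;
      }
      ```

      Querying `[key, ImmediateSuccessor(key))` then selects exactly one user key, because no other string sorts between `key` and `key` followed by a zero byte.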
  9. 11 May, 2017 (1 commit)
  10. 27 Apr, 2017 (1 commit)
  11. 25 Apr, 2017 (1 commit)
    • Reunite checkpoint and backup core logic · e5e545a0
      Andrew Kryczka committed
      Summary:
      These code paths forked when checkpoint was introduced by copy/pasting the core backup logic. Over time they diverged and bug fixes were sometimes applied to one but not the other (like fix to include all relevant WALs for 2PC), or it required extra effort to fix both (like fix to forge CURRENT file). This diff reunites the code paths by extracting the core logic into a function, CreateCustomCheckpoint(), that is customizable via callbacks to implement both checkpoint and backup.
      
      Related changes:
      
      - flush_before_backup is now forcibly enabled when 2PC is enabled
      - Extracted CheckpointImpl class definition into a header file. This is so the function, CreateCustomCheckpoint(), can be called by internal rocksdb code but not exposed to users.
      - Implemented more functions in DummyDB/DummyLogFile (in backupable_db_test.cc) that are used by CreateCustomCheckpoint().
      Closes https://github.com/facebook/rocksdb/pull/1932
      
      Differential Revision: D4622986
      
      Pulled By: ajkr
      
      fbshipit-source-id: 157723884236ee3999a682673b64f7457a7a0d87
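      The idea of one core routine customized through callbacks can be sketched as follows. This is only an illustration: `CreateCustomCheckpointSketch`, `SketchLiveFile` and the two-callback shape are assumptions made for the example and do not reproduce the signature of the real CreateCustomCheckpoint().

      ```cpp
      #include <cassert>
      #include <functional>
      #include <string>
      #include <vector>

      // Hypothetical description of a live file; `immutable` marks files
      // that can safely be shared (e.g. hard-linked) instead of copied.
      struct SketchLiveFile {
        std::string name;
        bool immutable;
      };

      using FileCallback = std::function<bool(const std::string& name)>;

      // Shared core loop: the caller decides what "link" and "copy" mean.
      // Returns false as soon as any callback reports a failure.
      bool CreateCustomCheckpointSketch(const std::vector<SketchLiveFile>& files,
                                        const FileCallback& link_file_cb,
                                        const FileCallback& copy_file_cb) {
        assert(link_file_cb && copy_file_cb);
        for (const auto& f : files) {
          const FileCallback& cb = f.immutable ? link_file_cb : copy_file_cb;
          if (!cb(f.name)) return false;
        }
        return true;
      }
      ```

      In this sketch a checkpoint-style caller would pass a callback that hard-links immutable files, while a backup-style caller could pass callbacks that copy in both cases, so a fix to the shared loop reaches both features.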
  12. 07 Apr, 2017 (3 commits)
    • Refactor compaction picker code · ff972870
      Siying Dong committed
      Summary:
      1. Move universal compaction picker to separate files compaction_picker_universal.cc and compaction_picker_universal.h.
      2. Rename some functions to make the code easier to understand.
      3. Move leveled compaction picking code to a dedicated class, so that we don't need to pass some common variables around when calling functions. It also allowed us to break down LevelCompactionPicker::PickCompaction() into smaller functions.
      Closes https://github.com/facebook/rocksdb/pull/2100
      
      Differential Revision: D4845948
      
      Pulled By: siying
      
      fbshipit-source-id: efa0ab4
    • Move various string utility functions into string_util · 343b59d6
      Sagar Vemuri committed
      Summary:
      This is an effort to consolidate all string-related utility functions in one common place, string_util, so that it is easier for everyone to know what string processing functions are available. Right now they are spread out across multiple modules, like logging and options_helper.
      
      Check the sub-commits for easier reviewing.
      Closes https://github.com/facebook/rocksdb/pull/2094
      
      Differential Revision: D4837730
      
      Pulled By: sagar0
      
      fbshipit-source-id: 344278a
    • Move memtable related files into memtable directory · df6f5a37
      Yi Wu committed
      Summary:
      Move memtable related files into memtable directory.
      Closes https://github.com/facebook/rocksdb/pull/2087
      
      Differential Revision: D4829242
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: ca70ab6
  13. 06 Apr, 2017 (2 commits)
  14. 05 Apr, 2017 (1 commit)
  15. 04 Apr, 2017 (1 commit)
  16. 31 Mar, 2017 (1 commit)
  17. 04 Mar, 2017 (1 commit)
  18. 01 Mar, 2017 (1 commit)
  19. 28 Feb, 2017 (1 commit)
  20. 24 Feb, 2017 (1 commit)
  21. 03 Feb, 2017 (1 commit)
  22. 26 Jan, 2017 (1 commit)
    • Generalize Env registration framework · 17c11806
      Andrew Kryczka committed
      Summary:
      The Env registration framework supports registering client Envs and selecting which one to instantiate according to a text field. This enabled things like adding the -env_uri argument to db_bench, so the same binary could be reused with different Envs just by changing CLI config.
      
      Now this problem has come up again in a non-Env context, as I want to instantiate a client Statistics implementation from db_bench, which is configured entirely via text parameters. Also, in the future we may wish to use it for deserializing client objects when loading OPTIONS file.
      
      This diff generalizes the Env registration logic to work with arbitrary types.
      
      - Generalized registration and instantiation code by templating them
      - The entire implementation is in a header file as that's Google style guide's recommendation for template definitions
      - Pattern match with std::regex_match rather than checking prefix, which was the previous behavior
      - Rename functions/files to be non-Env-specific
      Closes https://github.com/facebook/rocksdb/pull/1776
      
      Differential Revision: D4421933
      
      Pulled By: ajkr
      
      fbshipit-source-id: 34647d1
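      A templated registry that selects a factory by full regex match, as the bullets above describe, might look roughly like the sketch below. `SketchRegistry` and `SketchStats` are hypothetical names chosen for the example, and the instance-based design is an assumption rather than the layout of the actual RocksDB header. Only `std::regex_match` is taken from the summary.

      ```cpp
      #include <cassert>
      #include <functional>
      #include <memory>
      #include <regex>
      #include <string>
      #include <utility>
      #include <vector>

      // Hypothetical generic registry: maps a regex pattern to a factory
      // producing objects of an arbitrary type T.
      template <typename T>
      class SketchRegistry {
       public:
        using Factory = std::function<std::unique_ptr<T>(const std::string& uri)>;

        void Register(const std::string& pattern, Factory factory) {
          assert(factory);
          entries_.emplace_back(std::regex(pattern), std::move(factory));
        }

        // Returns an object from the first factory whose pattern matches the
        // whole text (std::regex_match), or nullptr if nothing matches.
        std::unique_ptr<T> Create(const std::string& uri) const {
          for (const auto& entry : entries_) {
            if (std::regex_match(uri, entry.first)) return entry.second(uri);
          }
          return nullptr;
        }

       private:
        std::vector<std::pair<std::regex, Factory>> entries_;
      };

      // Hypothetical client type, standing in for e.g. a Statistics object.
      struct SketchStats {
        std::string name;
      };
      ```

      Because `std::regex_match` must match the entire string, a pattern such as `mem://.*` accepts `mem://stats` but rejects `xmem://stats`. A pattern that should behave like the old prefix check therefore needs a trailing `.*`.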
  23. 17 Dec, 2016 (1 commit)
  24. 02 Dec, 2016 (1 commit)
  25. 30 Nov, 2016 (1 commit)
  26. 17 Nov, 2016 (1 commit)
  27. 21 Oct, 2016 (1 commit)
    • Support IngestExternalFile (remove AddFile restrictions) · 869ae5d7
      Islam AbdelRahman committed
      Summary:
      Changes in the diff
      
      API changes:
      - Introduce IngestExternalFile to replace AddFile (I think this makes the API clearer)
      - Introduce IngestExternalFileOptions (This struct will encapsulate the options for ingesting the external file)
      - Deprecate AddFile() API
      
      Logic changes:
      - If our file overlaps with the memtable, we will flush the memtable
      - We will find the first level in the LSM tree whose keys overlap with our file's key range
      - We will find the lowest level in the LSM tree above the level found in the previous step that our file can fit in, and ingest our file into it
      - We will assign a global sequence number to our new file
      - Remove AddFile restrictions by using global sequence numbers
      
      Other changes:
      - Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob
      
      Test Plan:
      unit tests (still need to add more)
      addfile_stress (https://reviews.facebook.net/D65037)
      
      Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: jkedgar, hcz, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65061
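      The level-selection steps listed under "Logic changes" can be approximated with the standalone sketch below. It is a deliberate simplification: `KeyRange`, `Overlaps` and `PickIngestLevel` are hypothetical names, "can fit in" is reduced to "does not overlap", and memtable flushing, global sequence numbers and running compactions are ignored.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <string>
      #include <vector>

      // Hypothetical key range with inclusive bounds.
      struct KeyRange {
        std::string smallest;
        std::string largest;
      };

      bool Overlaps(const KeyRange& a, const KeyRange& b) {
        return a.smallest <= b.largest && b.smallest <= a.largest;
      }

      // levels[0] is the top level (L0). Returns the level to ingest into:
      // the deepest level that sits above the first overlapping level.
      // If nothing overlaps, the file goes to the bottom level. If L0 itself
      // overlaps, the file goes to L0 (assumed to allow overlapping files).
      int PickIngestLevel(const std::vector<std::vector<KeyRange>>& levels,
                          const KeyRange& file) {
        assert(!levels.empty());
        std::size_t first_overlap = levels.size();
        for (std::size_t lvl = 0; lvl < levels.size(); ++lvl) {
          bool overlap = false;
          for (const KeyRange& r : levels[lvl]) {
            if (Overlaps(r, file)) {
              overlap = true;
              break;
            }
          }
          if (overlap) {
            first_overlap = lvl;
            break;
          }
        }
        return first_overlap == 0 ? 0 : static_cast<int>(first_overlap) - 1;
      }
      ```

      For a four-level tree where only L2 holds `[c, f]` and L3 holds `[a, z]`, a file covering `[d, e]` lands in L1, and a file covering `[g, h]` lands in L2.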
  28. 19 Oct, 2016 (1 commit)
    • Compaction Support for Range Deletion · 6fbe96ba
      Andrew Kryczka committed
      Summary:
      This diff introduces RangeDelAggregator, which takes ownership of iterators
      provided to it via AddTombstones(). The tombstones are organized in a two-level
      map (snapshot stripe -> begin key -> tombstone). Tombstone creation avoids data
      copy by holding Slices returned by the iterator, which remain valid thanks to pinning.
      
      For compaction, we create a hierarchical range tombstone iterator with structure
      matching the iterator over compaction input data. An aggregator based on that
      iterator is used by CompactionIterator to determine which keys are covered by
      range tombstones. In case of merge operand, the same aggregator is used by
      MergeHelper. Upon finishing each file in the compaction, relevant range tombstones
      are added to the output file's range tombstone metablock and file boundaries are
      updated accordingly.
      
      To check whether a key is covered by range tombstone, RangeDelAggregator::ShouldDelete()
      considers tombstones in the key's snapshot stripe. When this function is used outside of
      compaction, it also checks newer stripes, which can contain covering tombstones. Currently
      the intra-stripe check involves a linear scan; however, in the future we plan to collapse ranges
      within a stripe such that binary search can be used.
      
      RangeDelAggregator::AddToBuilder() adds all range tombstones in the table's key-range
      to a new table's range tombstone meta-block. Since range tombstones may fall in the gap
      between files, we may need to extend some files' key-ranges. The strategy is (1) first file
      extends as far left as possible and other files do not extend left, (2) all files extend right
      until either the start of the next file or the end of the last range tombstone in the gap,
      whichever comes first.
      
      One other notable change is adding release/move semantics to ScopedArenaIterator
      such that it can be used to transfer ownership of an arena-allocated iterator, similar to
      how unique_ptr is used for malloc'd data.
      
      Depends on D61473
      
      Test Plan: compaction_iterator_test, mock_table, end-to-end tests in D63927
      
      Reviewers: sdong, IslamAbdelRahman, wanning, yhchiang, lightmark
      
      Reviewed By: lightmark
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62205
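      The coverage check described in the summary (two-level map, linear scan within a stripe, optional look at newer stripes) can be sketched in isolation as below. `SketchRangeDelAggregator` and `SketchTombstone` are hypothetical names, and stripes are identified here by their largest sequence number, which is an assumption made for the example rather than a statement about the real RangeDelAggregator.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <limits>
      #include <map>
      #include <string>
      #include <vector>

      // Hypothetical range tombstone covering [begin, end) at sequence `seq`;
      // the begin key is the key of the inner map below.
      struct SketchTombstone {
        std::string end;  // exclusive
        std::uint64_t seq;
      };

      class SketchRangeDelAggregator {
       public:
        // One stripe per snapshot, plus a final stripe for everything newer.
        explicit SketchRangeDelAggregator(
            const std::vector<std::uint64_t>& snapshots) {
          for (std::uint64_t s : snapshots) stripes_[s];
          stripes_[std::numeric_limits<std::uint64_t>::max()];
        }

        void AddTombstone(const std::string& begin, const std::string& end,
                          std::uint64_t seq) {
          assert(begin < end);
          // The tombstone belongs to the first stripe whose bound is >= seq.
          stripes_.lower_bound(seq)->second.emplace(begin,
                                                    SketchTombstone{end, seq});
        }

        // Linear scan of the key's stripe; newer stripes are scanned only
        // when requested (the non-compaction case in the summary).
        bool ShouldDelete(const std::string& key, std::uint64_t key_seq,
                          bool check_newer_stripes) const {
          for (auto stripe = stripes_.lower_bound(key_seq);
               stripe != stripes_.end(); ++stripe) {
            for (const auto& entry : stripe->second) {
              const SketchTombstone& t = entry.second;
              if (entry.first <= key && key < t.end && key_seq < t.seq) {
                return true;
              }
            }
            if (!check_newer_stripes) break;
          }
          return false;
        }

       private:
        // snapshot stripe -> begin key -> tombstone
        std::map<std::uint64_t, std::multimap<std::string, SketchTombstone>>
            stripes_;
      };
      ```

      With a single snapshot at sequence 10, a tombstone `[b, d)` at sequence 15 does not cover key `c` at sequence 3 during compaction, since the snapshot must still see that key. The same lookup with newer stripes enabled reports the key as deleted.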
  29. 30 Sep, 2016 (1 commit)
  30. 24 Sep, 2016 (1 commit)
    • Split DBOptions into ImmutableDBOptions and MutableDBOptions · 9ed928e7
      Yi Wu committed
      Summary: Use ImmutableDBOptions/MutableDBOptions internally and DBOptions only for user-facing APIs. MutableDBOptions is little more than a placeholder for now. I'll start to move options to MutableDBOptions in following diffs.
      
      Test Plan:
        make all check
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64065
  31. 08 Sep, 2016 (1 commit)
  32. 06 Sep, 2016 (1 commit)
  33. 03 Sep, 2016 (1 commit)
  34. 27 Aug, 2016 (1 commit)
    • Expose ThreadPool under include/rocksdb/threadpool.h · e9b2af87
      Islam AbdelRahman committed
      Summary:
      This diff splits ThreadPool into
      - ThreadPool (abstract interface exposed in include/rocksdb/threadpool.h)
      - ThreadPoolImpl (actual implementation in util/threadpool_imp.h)
      
      This allows us to expose ThreadPool to the user so we can use it as an option later
      
      Test Plan: existing unit tests
      
      Reviewers: andrewkr, yiwu, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62085
  35. 23 Aug, 2016 (1 commit)
  36. 20 Aug, 2016 (1 commit)
    • Introduce ClockCache · 4cc37f59
      Yi Wu committed
      Summary:
      Clock-based cache implementation that aims to have better concurrency than
      the default LRU cache. See inline comments for implementation details.
      
      Test Plan:
      Update cache_test to run on both LRUCache and ClockCache. Adding some
      new tests to catch some of the bugs that I fixed while implementing the
      cache.
      
      Reviewers: kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61647