1. 16 Jan, 2014 6 commits
    • K
      Merge branch 'master' into performance · 1304d8c8
      kailiu committed
      Conflicts:
      	Makefile
      	db/db_impl.cc
      	db/db_impl.h
      	db/db_test.cc
      	db/memtable.cc
      	db/memtable.h
      	db/version_edit.h
      	db/version_set.cc
      	include/rocksdb/options.h
      	util/hash_skiplist_rep.cc
      	util/options.cc
      1304d8c8
    • K
      Remove the unnecessary use of shared_ptr · eae1804f
      kailiu committed
      Summary:
      shared_ptr is slower than unique_ptr (which comes with literally no performance cost compared with raw pointers).
      In memtable and memtable rep, we use shared_ptr where we should actually use unique_ptr.

      According to igor's previous work, we are likely to get a noticeable performance gain from this diff.
      
      Test Plan: make check
      
      Reviewers: dhruba, igor, sdong, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15213
      eae1804f
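A minimal sketch of the ownership change described in the commit above. `RepSketch`, `TableSketch` and `MakeTable` are hypothetical names used only for illustration; they are not the RocksDB classes. The point is that a single owner can hold its member through `std::unique_ptr`, which keeps no reference count.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a memtable representation (not the RocksDB class).
struct RepSketch {
  std::string name;
  explicit RepSketch(std::string n) : name(std::move(n)) {}
};

// Hypothetical owner: it is the only holder of the rep, so exclusive
// ownership is enough and no atomic reference count is maintained.
class TableSketch {
 public:
  explicit TableSketch(std::unique_ptr<RepSketch> rep) : rep_(std::move(rep)) {
    assert(rep_ != nullptr);
  }
  const std::string& RepName() const { return rep_->name; }

 private:
  std::unique_ptr<RepSketch> rep_;  // released when the table is destroyed
};

// Builds a table that takes over ownership of a freshly allocated rep.
inline TableSketch MakeTable(const std::string& name) {
  return TableSketch(std::unique_ptr<RepSketch>(new RepSketch(name)));
}
```

If several components really do need to share the object, `std::shared_ptr` is still the right tool; the sketch only covers the single-owner case the summary describes.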
    • I
      Move more functions from VersionSet to Version · 787f11bb
      Igor Canadi committed
      Summary:
      This moves functions:
      * VersionSet::Finalize() -> Version::UpdateCompactionStats()
      * VersionSet::UpdateFilesBySize() -> Version::UpdateFilesBySize()
      
      The diff depends on D15189, D15183 and D15171
      
      Test Plan: make check
      
      Reviewers: kailiu, sdong, haobo, dhruba
      
      Reviewed By: sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15201
      787f11bb
    • I
      Moving Compaction class to separate header file · 615d1ea2
      Igor Canadi committed
      Summary:
      I'm sure we'll all agree that version_set.cc needs simplifying. This diff moves Compaction class to a separate file.
      
      The diff depends on D15171 and D15183
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15189
      615d1ea2
    • I
      Move functions from VersionSet to Version · 2f4eda78
      Igor Canadi committed
      Summary:
      There were some functions in VersionSet that had no reason to be there instead of Version. Moving them to Version will make column families implementation easier.
      
      The functions moved are:
      * NumLevelBytes
      * LevelSummary
      * LevelFileSummary
      * MaxNextLevelOverlappingBytes
      * AddLiveFiles (previously AddLiveFilesCurrentVersion())
      * NeedSlowdownForNumLevel0Files
      
      The diff continues on (and depends on) D15171
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong, emayanke
      
      Reviewed By: sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15183
      2f4eda78
    • I
      Decrease reliance on VersionSet::NumberLevels() · 65a8a52b
      Igor Canadi committed
      Summary:
      With column families, VersionSet will not have a constant number of levels (each CF can have different options), so we'll need to eliminate calls to VersionSet::NumberLevels().

      This diff decreases the number of callsites, but we're not there yet. It associates the number of levels with Version (each version is associated with a single CF) instead of VersionSet.

      I have also slightly changed how VersionSet keeps track of manifest size.

      This diff also modifies the constructor of Compaction so that it takes input_version and automatically Ref()s it. Before, this was done outside of the constructor.

      In the next diffs I will continue to decrease the number of callsites of VersionSet::NumberLevels() and also references to current_.
      
      Test Plan: make check
      
      Reviewers: haobo, dhruba, kailiu, sdong
      
      Reviewed By: sdong
      
      Differential Revision: https://reviews.facebook.net/D15171
      65a8a52b
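The constructor change mentioned in the commit above (Compaction taking `input_version` and calling `Ref()` itself) can be sketched roughly as follows. `RefCountedVersionSketch` and `CompactionSketch` are hypothetical names, and releasing the reference in the destructor is an assumption of this sketch; the summary only states that the constructor takes the reference.

```cpp
#include <cassert>

// Hypothetical reference-counted version object (not the RocksDB Version).
class RefCountedVersionSketch {
 public:
  void Ref() { ++refs_; }
  void Unref() {
    assert(refs_ > 0);
    --refs_;
  }
  int refs() const { return refs_; }

 private:
  int refs_ = 0;
};

// Hypothetical compaction object: the constructor takes the reference, so
// callers can no longer forget to do it.
class CompactionSketch {
 public:
  explicit CompactionSketch(RefCountedVersionSketch* input_version)
      : input_version_(input_version) {
    assert(input_version_ != nullptr);
    input_version_->Ref();
  }
  // Assumption of this sketch: the reference is dropped on destruction.
  ~CompactionSketch() { input_version_->Unref(); }

  CompactionSketch(const CompactionSketch&) = delete;
  CompactionSketch& operator=(const CompactionSketch&) = delete;

 private:
  RefCountedVersionSketch* input_version_;
};
```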
  2. 15 Jan, 2014 16 commits
    • K
      Optimize MayContainHash() · cd535c22
      Kai Liu committed
      Summary:
      In latest leaf's, MayContainHash() consistently consumes 5%~7% CPU usage.
      
      I checked the code and did an experiment with/without inlining this method.
      
      In release mode, with `1024 * 1024 * 256` bits and `1024 * 512` entries, both variants call MayContainHash() 2^30 times with distinct parameters.

      The results showed that this patch reduced the running time from 9.127 sec to 7.891 sec.
      
      Test Plan: make check
      
      Reviewers: sdong, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15177
      cd535c22
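For context on why inlining helps a function like this, here is a minimal sketch of a bloom-style probe defined in the class body (and therefore implicitly inline). `BloomSketch` is a hypothetical class, and the double-hashing probe sequence is an assumption chosen for illustration; it is not claimed to be the code touched by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bloom filter over precomputed 32-bit hashes.
class BloomSketch {
 public:
  BloomSketch(uint32_t total_bits, uint32_t num_probes)
      : bits_(total_bits),
        num_probes_(num_probes),
        data_((total_bits + 7) / 8, 0) {
    assert(total_bits > 0 && num_probes > 0);
  }

  // Sets num_probes_ bits derived from h.
  void AddHash(uint32_t h) {
    const uint32_t delta = (h >> 17) | (h << 15);  // rotate right by 17 bits
    for (uint32_t i = 0; i < num_probes_; ++i) {
      const uint32_t bit = h % bits_;
      data_[bit / 8] |= static_cast<uint8_t>(1u << (bit % 8));
      h += delta;
    }
  }

  // Defined in the class body, so the compiler may inline it at call sites.
  // Returns false only if h was definitely never added.
  bool MayContainHash(uint32_t h) const {
    const uint32_t delta = (h >> 17) | (h << 15);
    for (uint32_t i = 0; i < num_probes_; ++i) {
      const uint32_t bit = h % bits_;
      if ((data_[bit / 8] & (1u << (bit % 8))) == 0) return false;
      h += delta;
    }
    return true;
  }

 private:
  uint32_t bits_;
  uint32_t num_probes_;
  std::vector<uint8_t> data_;
};
```

The function body is only a few arithmetic operations per probe, so call overhead can be a meaningful fraction of its cost when it is invoked billions of times.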
    • K
      Fix the return type of WriteBatch::Data(). · c8f16221
      kailiu committed
      Summary: Quick fix for https://reviews.facebook.net/D15123
      
      Test Plan: Make check
      
      Reviewers: sdong, vkrest
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15165
      c8f16221
    • S
      [RocksDB Performance Branch] DBImpl.NewInternalIterator() to reduce work inside mutex · 9b51af5a
      Siying Dong committed
      Summary: To reduce mutex contention caused by DBImpl.NewInternalIterator(), move all the iterator creation work in this function out of the mutex, leaving only the object ref and get inside it.

      Test Plan:
      make all check
      will also run db_stress for a while to make sure there are no problems.
      
      Reviewers: haobo, dhruba, kailiu
      
      Reviewed By: haobo
      
      CC: igor, leveldb
      
      Differential Revision: https://reviews.facebook.net/D14589
      
      Conflicts:
      	db/db_impl.cc
      9b51af5a
    • I
      Fix CompactRange to apply filter to every key · d9cd7a06
      Igor Canadi committed
      Summary:
      When doing CompactRange(), we should first flush the memtable and then calculate max_level_with_files. Also, we want to compact all the levels that have files, including level `max_level_with_files`.
      
      This patch fixed the unit test.
      
      Test Plan: Added a failing unit test and a fix, so it's not failing anymore.
      
      Reviewers: dhruba, haobo, sdong
      
      Reviewed By: haobo
      
      CC: leveldb, xjin
      
      Differential Revision: https://reviews.facebook.net/D14421
      d9cd7a06
    • I
      1ed2404f
    • I
      Fix test · 62910202
      Igor Canadi committed
      62910202
    • I
      Fix memtable construction in tests · 7f3e417f
      Igor Canadi committed
      7f3e417f
    • I
      VersionEdit not to take NumLevels() · 055e6df4
      Igor Canadi committed
      Summary:
      I will submit a sequence of diffs that prepare the master branch for column families. There are a lot of implicit assumptions in the code that make the column family implementation hard. If I make the change only in the column family branch, it will make merging back to master impossible.

      Most of the diffs will be simple code refactorings, so I hope we can have a fast turnaround time. Feel free to grab me in person to discuss any of them.

      This diff removes the number-of-levels check from VersionEdit. It is used only when VersionEdit is read, not written, but has to be set when it is written. I believe the right thing is to make VersionEdit dumb and check consistency on the caller side. This will also make it much easier to implement column families, since different column families can have different numbers of levels.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, sdong, kailiu
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15159
      055e6df4
    • I
      BuildBatchGroup -- memcpy outside of lock · 7d9f21cf
      Igor Canadi committed
      Summary: When building batch group, don't actually build a new batch since it requires heavy-weight mem copy and malloc. Only store references to the batches and build the batch group without lock held.
      
      Test Plan:
      `make check`
      
      I am also planning to run performance tests. The workload that will benefit from this change is readwhilewriting. I will post the results once I have them.
      
      Reviewers: dhruba, haobo, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb, xjin
      
      Differential Revision: https://reviews.facebook.net/D15063
      7d9f21cf
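The idea in the commit above (collect references under the lock, copy afterwards) can be sketched as follows. `WriterQueueSketch` and its methods are hypothetical names. The sketch assumes the queued batches stay alive and unmodified until the group has been built, which the caller must guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical queue of pending write batches, stored by pointer.
class WriterQueueSketch {
 public:
  void Enqueue(const std::string* batch) {
    assert(batch != nullptr);
    std::lock_guard<std::mutex> guard(mu_);
    pending_.push_back(batch);
  }

  // Takes the pending pointers while holding the lock, then performs the
  // allocation and the byte copies after the lock has been released.
  std::string BuildGroup() {
    std::vector<const std::string*> group;
    {
      std::lock_guard<std::mutex> guard(mu_);
      group.swap(pending_);  // cheap: only pointers change hands
    }
    std::size_t total = 0;
    for (const std::string* b : group) total += b->size();

    std::string merged;
    merged.reserve(total);  // single allocation, outside the lock
    for (const std::string* b : group) merged.append(*b);
    return merged;
  }

 private:
  std::mutex mu_;
  std::vector<const std::string*> pending_;
};
```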
    • K
      Move the compilation of the shared libraries to "make release" · 481c77e5
      kailiu committed
      Compiling the shared libraries takes a long time. To speed up development, it still makes sense to keep it separate from regular compilation.
      481c77e5
    • K
      A script that automatically reformats affected lines · d702d807
      Kai Liu committed
      Summary:
      Added a script that reformats only the affected lines in a given diff.

      I planned to make that file a pre-commit hook, but it looks a little more difficult than I thought. Since I don't want to spend too much time on this task right now, I eventually added a "make" command to achieve this with a few additional keystrokes.

      Also made the clang-format style inherit solely from Google's style. There are still debates on some of the style issues, but we can address them later once we reach a consensus.

      Test Plan: Made some ugly format changes and ran "make format"; all affected lines were formatted as expected.
      
      Reviewers: igor, sdong, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15147
      d702d807
    • N
      Use sanitized options while opening db · 1d9bac4d
      Naman Gupta committed
      Summary: We use SanitizeOptions() to set appropriate values for some options, based on other options. So we should use the sanitized options by default. Luckily it hasn't caused a bug yet, but it could result in a bug in the future.
      
      Test Plan: make check
      
      Reviewers: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14103
      1d9bac4d
    • S
      DB::Put() to estimate write batch data size needed and pre-allocate buffer · 9ea8bf90
      Siying Dong committed
      Summary:
      In one of the CPU profiles, we see some CPU cost from string::reserve() inside Batch.Put(). This patch should reduce some of that cost by allocating a sufficient buffer beforehand.

      Since it is a trivial percentage of CPU cost, I didn't find a way to show the improvement in one of the benchmarks. I'll deploy it to the same application and do the same CPU profiling to make sure those CPU costs are reduced.
      
      Test Plan: make all check
      
      Reviewers: haobo, kailiu, igor
      
      Reviewed By: haobo
      
      CC: leveldb, nkg-
      
      Differential Revision: https://reviews.facebook.net/D15135
      9ea8bf90
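A minimal sketch of the pre-allocation idea from the commit above: estimate the encoded size of a record first, then reserve once before appending. The record layout used here (one tag byte, then a 4-byte little-endian length before each of key and value) is an assumption made for the example and is not the actual WriteBatch format; `EncodePutSketch` and `AppendFixed32Sketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Appends v as 4 little-endian bytes.
inline void AppendFixed32Sketch(std::string* dst, uint32_t v) {
  for (int i = 0; i < 4; ++i) {
    dst->push_back(static_cast<char>((v >> (8 * i)) & 0xff));
  }
}

// Encodes one put record into a buffer that is sized up front, so the
// appends below do not trigger repeated reallocations.
inline std::string EncodePutSketch(const std::string& key,
                                   const std::string& value) {
  const std::size_t needed = 1 + 4 + key.size() + 4 + value.size();
  std::string rep;
  rep.reserve(needed);  // one allocation for the whole record
  rep.push_back('\x01');  // hypothetical "put" tag
  AppendFixed32Sketch(&rep, static_cast<uint32_t>(key.size()));
  rep.append(key);
  AppendFixed32Sketch(&rep, static_cast<uint32_t>(value.size()));
  rep.append(value);
  assert(rep.size() == needed);
  return rep;
}
```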
    • S
      Pre-calculate whether to slow down for too many level 0 files · fbbf0d14
      Siying Dong committed
      Summary: Currently in DBImpl::MakeRoomForWrite(), we do "versions_->NumLevelFiles(0) >= options_.level0_slowdown_writes_trigger" to check whether the writer thread needs to slow down. However, versions_->NumLevelFiles(0) is slightly more expensive than we expected. By caching the result of the comparison when installing a new version, we can avoid this function call every time.

      Test Plan:
      make all check
      Manually trigger this behavior by applying universal compaction style and make sure inserts slow down once there are a certain number of files.
      
      Reviewers: haobo, kailiu, igor
      
      Reviewed By: kailiu
      
      CC: nkg-, leveldb
      
      Differential Revision: https://reviews.facebook.net/D15141
      fbbf0d14
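The caching described in the commit above can be sketched as below: the comparison is evaluated once when a version is created, and the write path only reads a boolean. `SlowdownVersionSketch` is a hypothetical class; the accessor name follows the `NeedSlowdownForNumLevel0Files` function mentioned elsewhere in this log, but its real signature is not shown here.

```cpp
#include <cassert>

// Hypothetical immutable snapshot of level-0 state.
class SlowdownVersionSketch {
 public:
  SlowdownVersionSketch(int level0_files, int slowdown_trigger)
      : level0_files_(level0_files),
        // Evaluated once, when the version is installed.
        need_slowdown_(level0_files >= slowdown_trigger) {
    assert(level0_files >= 0);
  }

  // Hot path: a plain member read instead of recomputing the comparison.
  bool NeedSlowdownForNumLevel0Files() const { return need_slowdown_; }
  int NumLevel0Files() const { return level0_files_; }

 private:
  int level0_files_;
  bool need_slowdown_;
};
```

This works because the file count of a given version does not change after it is installed, so the cached answer cannot go stale for that version.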
    • S
      DB::Put() to estimate write batch data size needed and pre-allocate buffer · 51dd2192
      Siying Dong committed
      Summary:
      In one of the CPU profiles, we see some CPU cost from string::reserve() inside Batch.Put(). This patch should reduce some of that cost by allocating a sufficient buffer beforehand.

      Since it is a trivial percentage of CPU cost, I didn't find a way to show the improvement in one of the benchmarks. I'll deploy it to the same application and do the same CPU profiling to make sure those CPU costs are reduced.
      
      Test Plan: make all check
      
      Reviewers: haobo, kailiu, igor
      
      Reviewed By: haobo
      
      CC: leveldb, nkg-
      
      Differential Revision: https://reviews.facebook.net/D15135
      51dd2192
  3. 14 Jan, 2014 3 commits
    • N
      Add read/modify/write functionality to Put() api · 8454cfe5
      Naman Gupta committed
      Summary: The application can set a callback function, which is applied to the previous value and calculates the new value. The new value is set in place if the previous value existed in the memtable and the new value is smaller than the previous value. Otherwise the new value is added normally.
      
      Test Plan: fbmake. Added unit tests. All unit tests pass.
      
      Reviewers: dhruba, haobo
      
      Reviewed By: haobo
      
      CC: sdong, kailiu, xinyaohu, sumeet, leveldb
      
      Differential Revision: https://reviews.facebook.net/D14745
      8454cfe5
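A rough sketch of the read/modify/write flow described in the commit above. `MemSketch`, `UpdateFn` and `PutWithCallback` are hypothetical names, not the RocksDB API, and a `std::map` stands in for the memtable. The in-place condition follows the summary's wording: the previous value exists and the new value is smaller.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Callback: receives the previous value (nullptr if absent), returns the new one.
using UpdateFn = std::function<std::string(const std::string* prev)>;

// Hypothetical in-memory table standing in for a memtable.
class MemSketch {
 public:
  // Returns true if the value was overwritten in place, false if it was
  // added through the normal path.
  bool PutWithCallback(const std::string& key, const UpdateFn& update) {
    auto it = table_.find(key);
    const std::string* prev = (it == table_.end()) ? nullptr : &it->second;
    std::string next = update(prev);
    if (prev != nullptr && next.size() < prev->size()) {
      it->second.assign(next);  // fits in the existing buffer
      return true;
    }
    table_[key] = std::move(next);  // normal add
    ++normal_adds_;
    return false;
  }

  const std::string& Get(const std::string& key) const { return table_.at(key); }
  int normal_adds() const { return normal_adds_; }

 private:
  std::map<std::string, std::string> table_;
  int normal_adds_ = 0;
};
```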
    • K
      Compile dynamic library by default · ac2fe728
      kailiu committed
      Summary:
      Per request, some users need to use the dynamic rocksdb library instead of the static one.

      However, currently the dynamic libraries have to be compiled manually, which is inconvenient. I made the dynamic libraries compile by default.
      
      Test Plan: make clean; make; make clean;
      
      Reviewers: haobo, sdong, dhruba, igor
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15117
      ac2fe728
    • S
      WriteBatch to provide a way for user to query data size directly and only... · c4548d5f
      Siying Dong committed
      WriteBatch to provide a way for user to query data size directly and only return constant reference of data in Data()

      Summary:
      WriteBatch::Data() is currently easy for users to misuse. Also, there is no cheap way for a user of WriteBatch to know the accumulated data size. This patch fixes the problem by:
      (1) returning a constant reference from Data() so it's obvious to the caller what it means.
      (2) adding a function that returns the data size directly
      
      Test Plan: make all check
      
      Reviewers: haobo, igor, kailiu
      
      Reviewed By: kailiu
      
      CC: zshao, leveldb
      
      Differential Revision: https://reviews.facebook.net/D15123
      c4548d5f
  4. 12 Jan, 2014 1 commit
  5. 11 Jan, 2014 6 commits
  6. 10 Jan, 2014 4 commits
    • Y
      fix compile warning · afdd2d1a
      Yancey committed
      afdd2d1a
    • S
      StopWatch not to get time if it is created for statistics and it is disabled · 237a3da6
      Siying Dong committed
      Summary: Currently, even if statistics are not enabled, a StopWatch created only for stats still gets the time of day, which is wasteful. This patch adds a new option to StopWatch to skip getting the time in this case.
      
      Test Plan: make all check
      
      Reviewers: dhruba, haobo, igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14703
      
      Conflicts:
      	db/db_impl.cc
      237a3da6
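A minimal sketch of the behavior described in the commit above: when the watch is disabled it never reads the clock. `StopWatchSketch` is a hypothetical class, and `std::chrono::steady_clock` is used here for illustration rather than whatever time source the real StopWatch uses.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stopwatch that skips all clock reads when disabled.
class StopWatchSketch {
  using Clock = std::chrono::steady_clock;

 public:
  explicit StopWatchSketch(bool enabled)
      : enabled_(enabled),
        // Only read the clock when the result will actually be used.
        start_(enabled ? Clock::now() : Clock::time_point()) {}

  // Returns elapsed microseconds, or 0 without touching the clock if disabled.
  uint64_t ElapsedMicros() const {
    if (!enabled_) return 0;
    return static_cast<uint64_t>(
        std::chrono::duration_cast<std::chrono::microseconds>(Clock::now() -
                                                              start_)
            .count());
  }

  bool enabled() const { return enabled_; }

 private:
  bool enabled_;
  Clock::time_point start_;
};
```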
    • S
      [Performance Branch] A Hashed Linked List Based Mem Table · 424a524a
      Siying Dong committed
      Summary:
      Implement a mem table in which keys are hashed based on prefixes. In each bucket, entries are organized in a sorted linked list. It has the same thread safety guarantee as the skip list.

      The motivation is to optimize memory usage for the case where prefix hashing is the primary way of seeking to an entry. Compared to the hash skip list implementation, this implementation is more memory efficient, but inside each bucket, search is always linear. The target scenario is one where there are only a very limited number of records in each hash bucket.
      
      Test Plan: Add a test case in db_test
      
      Reviewers: haobo, kailiu, dhruba
      
      Reviewed By: haobo
      
      CC: igor, nkg-, leveldb
      
      Differential Revision: https://reviews.facebook.net/D14979
      424a524a
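The data structure described in the commit above can be sketched as a vector of buckets, each holding a sorted singly linked list, with the bucket chosen by hashing a fixed-length key prefix. `HashLinkListSketch` is a hypothetical class; the fixed prefix length and `std::hash` are assumptions of this example. The sketch is single-threaded and does not attempt the thread-safety guarantee mentioned in the summary.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <functional>
#include <string>
#include <vector>

// Hypothetical prefix-hashed table with one sorted linked list per bucket.
class HashLinkListSketch {
 public:
  HashLinkListSketch(std::size_t num_buckets, std::size_t prefix_len)
      : prefix_len_(prefix_len), buckets_(num_buckets) {
    assert(num_buckets > 0);
  }

  // Linear walk to the first element not less than key, then insert there.
  void Insert(const std::string& key) {
    std::forward_list<std::string>& bucket = buckets_[Index(key)];
    auto prev = bucket.before_begin();
    for (auto it = bucket.begin(); it != bucket.end() && *it < key; ++it) {
      ++prev;  // prev always trails it by one node
    }
    bucket.insert_after(prev, key);
  }

  // Linear search inside one bucket; stops early thanks to the sort order.
  bool Contains(const std::string& key) const {
    for (const std::string& k : buckets_[Index(key)]) {
      if (k == key) return true;
      if (k > key) break;
    }
    return false;
  }

 private:
  std::size_t Index(const std::string& key) const {
    return std::hash<std::string>()(key.substr(0, prefix_len_)) %
           buckets_.size();
  }

  std::size_t prefix_len_;
  std::vector<std::forward_list<std::string>> buckets_;
};
```

Each node costs one pointer plus the key, which is where the memory saving over a skip list comes from; the trade-off is the linear scan, so the layout suits workloads with few entries per prefix.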
    • I
      Feature requests for BackupableDB · cb37ddf2
      Igor Canadi committed
      Summary:
      This diff introduces some features that were requested by two internal customers:
      * Ability for backups not to share table files, because we can't guarantee that equal filename means equal content across replicas
      * Ability for two threads to call EnableFileDeletions() and DisableFileDeletions()
      * Ability to stop backup from another thread and not slow down the DB close
      * Copy the files to the temporary folder first and then atomically rename
      
      Test Plan: Added some tests to backupable_db_test
      
      Reviewers: dhruba, sanketh, muthu, sdong, haobo
      
      Reviewed By: haobo
      
      CC: leveldb, sanketh, muthu
      
      Differential Revision: https://reviews.facebook.net/D14769
      cb37ddf2
  7. 09 Jan, 2014 3 commits
  8. 08 Jan, 2014 1 commit
    • M
      Don't always compress L0 files written by memtable flush · 50994bf6
      Mark Callaghan committed
      Summary:
      Code was always compressing L0 files written by a memtable flush
      when compression was enabled. Now this is done when
      min_level_to_compress=0 for leveled compaction and when
      universal_compaction_size_percent=-1 for universal compaction.
      
      Task ID: #3416472
      
      Blame Rev:
      
      Test Plan:
      ran db_bench with compression options
      
      Revert Plan:
      
      Database Impact:
      
      Memcache Impact:
      
      Other Notes:
      
      EImportant:
      
      - begin *PUBLIC* platform impact section -
      Bugzilla: #
      - end platform impact -
      
      Reviewers: dhruba, igor, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14757
      50994bf6