1. 28 1月, 2014 1 次提交
  2. 27 1月, 2014 1 次提交
  3. 26 1月, 2014 1 次提交
  4. 25 1月, 2014 11 次提交
    • I
      Fix reduce levels · 04afa321
      Igor Canadi 提交于
      ReduceNumberOfLevels had segmentation fault in WriteSnapshot() since we
      didn't change the number of levels in VersionSet (we consider them
      immutable from now on). This fixes the problem.
      04afa321
    • S
      Moving Some includes from options.h to forward declaration · 8477255d
      Siying Dong 提交于
      Summary: By removing some includes form options.h and reply on forward declaration, we can more easily reason the dependencies.
      
      Test Plan: make all check
      
      Reviewers: kailiu, haobo, igor, dhruba
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15411
      8477255d
    • I
      Fixing iterator cleanup for Tailing iterator · f653fdcf
      Igor Canadi 提交于
      Immutable tailing iterator doesn't set CleanupState::mem, so we don't
      have to unref it.
      f653fdcf
    • I
      Add a call DisownData() to Cache, which should speed up shutdown · b13bdfa5
      Igor Canadi 提交于
      Summary: On a shutdown, freeing memory takes a long time. If we're shutting down, we don't really care about memory leaks. I added a call to Cache that will avoid freeing all objects in cache.
      
      Test Plan:
      I created a script to test the speedup and demonstrate how to use the call: https://phabricator.fb.com/P3864368
      
      Clean shutdown took 7.2 seconds, while fast and dirty one took 6.3 seconds. Unfortunately, the speedup is not that big, but should be bigger with bigger block_cache. I have set up the capacity to 80GB, but the script filled up only ~7GB.
      
      Reviewers: dhruba, haobo, MarkCallaghan, xjin
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15069
      b13bdfa5
    • I
      Make VersionSet::ReduceNumberOfLevels() static · 677fee27
      Igor Canadi 提交于
      Summary:
      A lot of our code implicitly assumes number_levels to be static. ReduceNumberOfLevels() breaks that assumption. For example, after calling ReduceNumberOfLevels(), DBImpl::NumberLevels() will be different from VersionSet::NumberLevels(). This is dangerous. Thankfully, it's not in public headers and is only used from LDB cmd tool. LDB tool is only using it statically, i.e. it never calls it with running DB instance. With this diff, we make it explicitly static. This way, we can assume number_levels to be immutable and not break assumption that lot of our code is relying upon. LDB tool can still use the method.
      
      Also, I removed the method from a separate file since it breaks filename completition. version_se<TAB> now completes to "version_set." instead of "version_set" (without the dot). I don't see a big reason that the function should be in a different file.
      
      Test Plan: reduce_levels_test
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15303
      677fee27
    • I
      MemTableListVersion · c583157d
      Igor Canadi 提交于
      Summary:
      MemTableListVersion is to MemTableList what Version is to VersionSet. I took almost the same ideas to develop MemTableListVersion. The reason is to have copying std::list done in background, while flushing, rather than in foreground (MultiGet() and NewIterator()) under a mutex! Also, whenever we copied MemTableList, we copied also some MemTableList metadata (flush_requested_, commit_in_progress_, etc.), which was wasteful.
      
      This diff avoids std::list copy under a mutex in both MultiGet() and NewIterator(). I created a small database with some number of immutable memtables, and creating 100.000 iterators in a single-thread (!) decreased from {188739, 215703, 198028} to {154352, 164035, 159817}. A lot of the savings come from code under a mutex, so we should see much higher savings with multiple threads. Creating new iterator is very important to LogDevice team.
      
      I also think this diff will make SuperVersion obsolete for performance reasons. I will try it in the next diff. SuperVersion gave us huge savings on Get() code path, but I think that most of the savings came from copying MemTableList under a mutex. If we had MemTableListVersion, we would never need to copy the entire object (like we still do in NewIterator() and MultiGet())
      
      Test Plan: `make check` works. I will also do `make valgrind_check` before commit
      
      Reviewers: dhruba, haobo, kailiu, sdong, emayanke, tnovak
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15255
      c583157d
    • K
      Add a make target for shared library · f131d4c2
      kailiu 提交于
      Summary:
      Previous we made `make release` also compile shared library. However it takes a long time to complete.
      
      To make our development process more efficient. I added a new make target shared_lib.
      
      User can of course run `make <library_name>` for direct compilation. However the <library_name> changed under certain condition. Thus we need `make shared_lib` to get rid of the memorization from users' side.
      
      Test Plan: make shared_lib
      
      Reviewers: igor, sdong, haobo, dhruba
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15309
      f131d4c2
    • I
      Revert "Moving to glibc-fb" · e832e72b
      Igor Canadi 提交于
      This reverts commit d24961b6.
      
      For some reason, glibc2.17-fb breaks gflags. Reverting for now
      e832e72b
    • K
      Temporarily disable caching index/filter blocks · 66dc033a
      kailiu 提交于
      Summary:
      Mixing index/filter blocks with data blocks resulted in some known
      issues.  To make sure in next release our users won't be affected,
      we added a new option in BlockBasedTableFactory::TableOption to
      conceal this functionality for now.
      
      This patch also introduced a BlockBasedTableReader::OpenOptions,
      which avoids the "infinite" growth of parameters in
      BlockBasedTableReader::Open().
      
      Test Plan: make check
      
      Reviewers: haobo, sdong, igor, dhruba
      
      Reviewed By: igor
      
      CC: leveldb, tnovak
      
      Differential Revision: https://reviews.facebook.net/D15327
      66dc033a
    • I
      Moving to glibc-fb · d24961b6
      Igor Canadi 提交于
      Summary:
      It looks like we might have some trouble when building the new release with 4.8, since fbcode is using glibc2.17-fb by default and we are using glibc2.17. It was reported by Benjamin Renard in our internal group.
      
      This diff moves our fbcode build to use glibc2.17-fb by default. I got some linker errors when compiling, complaining that `google::SetUsageMessage()` was undefined. After deleting all offending lines, the compile was successful and everything works.
      
      Test Plan:
      Compiled
      Ran ./db_bench ./db_stress ./db_repl_stress
      
      Reviewers: kailiu
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15405
      d24961b6
    • S
      If User setting of compaction multipliers overflow, use default value 1 instead · 4605e20c
      Siying Dong 提交于
      Summary: Currently, compaction multipliers can overflow and cause unexpected behaviors. In this patch, we detect those overflows and use multiplier 1 for them.
      
      Test Plan: make all check
      
      Reviewers: dhruba, haobo, igor, kailiu
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15321
      4605e20c
  5. 24 1月, 2014 4 次提交
    • L
      CompactRange() to return status · aba2acb5
      Lei Jin 提交于
      Summary: as title
      
      Test Plan:
      make all check
      What else tests shall I cover?
      
      Reviewers: igor, haobo
      
      CC:
      
      Differential Revision: https://reviews.facebook.net/D15339
      aba2acb5
    • T
      Tailing iterator · 81c9cc9b
      Tomislav Novak 提交于
      Summary:
      This diff implements a special type of iterator that doesn't create a snapshot
      (can be used to read newly inserted data) and is optimized for doing sequential
      reads.
      
      TailingIterator uses current superversion number to determine whether to
      invalidate its internal iterators. If the version hasn't changed, it can often
      avoid doing expensive seeks over immutable structures (sst files and immutable
      memtables).
      
      Test Plan:
      * new unit tests
      * running LD with this patch
      
      Reviewers: igor, dhruba, haobo, sdong, kailiu
      
      Reviewed By: sdong
      
      CC: leveldb, lovro, march
      
      Differential Revision: https://reviews.facebook.net/D15285
      81c9cc9b
    • I
      Fix performance regression in statistics · 4e91f27c
      Igor Canadi 提交于
      Summary:
      For some reason, D15099 caused a big performance regression: https://fburl.com/16059000
      
      After digging a bit, I figured out that the reason was that std::atomic_uint_fast64_t was allocated in an array. When I switched from an array to vector, the QPS returned to the previous level. I'm not sure why this is happening, but this diff seems to fix the performance regression.
      
      Test Plan: I ran the regression script, observed the performance going back to normal
      
      Reviewers: tnovak, kailiu, haobo
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15375
      4e91f27c
    • K
      Add google-style checker to "arc lint" · d0458469
      kailiu 提交于
      Summary:
      After we reached a consensus on code format, which follows exactly
      Google's coding style, a natural follow-up is to have a style checker
      that can handle stuffs beyond format.
      
      Google already has a powerful style checker "cpplint.py" and,
      luckily, phabricator already provides the built-in linter for it!
      Next time with "arc lint" most style inconsistency will be detected
      (but will not be fixed).
      
      Also I copied cpplint.py to linters directory, which is mostly
      because we may need the flexibility to make some modifications on
      it for our own need.
      
      Test Plan:
      ran arc lint table/block_based_table_builder.cc to see the amazing
      results.
      
      Reviewers: haobo, sdong, igor, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15369
      d0458469
  6. 23 1月, 2014 2 次提交
    • I
      Unfriending classes · fb01755a
      Igor Canadi 提交于
      Summary:
      In this diff I made some effort to reduce usage of friending. To do that, I had to expose Compaction::inputs_ through a method inputs(). Not sure if this is a good idea, there is a trade-off. I think it's less confusing than having lots of friends.
      
      I also thought about other friendship relationships, but they are too much tangled at this point. Once you friend two classes, it's very hard to unfriend them :)
      
      Test Plan: make check
      
      Reviewers: haobo, kailiu, sdong, dhruba
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15267
      fb01755a
    • I
      Refactor Recover() code · 6fe9b577
      Igor Canadi 提交于
      Summary:
      This diff does two things:
      * Rethinks how we call Recover() with read_only option. Before, we call it with pointer to memtable where we'd like to apply those changes to. This memtable is set in db_impl_readonly.cc and it's actually DBImpl::mem_. Why don't we just apply updates to mem_ right away? It seems more intuitive.
      * Changes when we apply updates to manifest. Before, the process is to recover all the logs, flush it to sst files and then do one giant commit that atomically adds all recovered sst files and sets the next log number. This works good enough, but causes some small troubles for my column family approach, since I can't have one VersionEdit apply to more than single column family[1]. The change here is to commit the files recovered from logs right away. Here is the state of the world before the change:
      1. Recover log 5, add new sst files to edit
      2. Recover log 7, add new sst files to edit
      3. Recover log 8, add new sst files to edit
      4. Commit all added sst files to manifest and mark log files 5, 7 and 8 as recoverd (via SetLogNumber(9) function)
      After the change, we'll do:
      1. Recover log 5, commit the new sst files and set log 5 as recovered
      2. Recover log 7, commit the new sst files and set log 7 as recovered
      3. Recover log 8, commit the new sst files and set log 8 as recovered
      
      The added (small) benefit is that if we fail after (2), the new recovery will only have to recover log 8. In previous case, we'll have to restart the recovery from the beginning. The bigger benefit will be to enable easier integration of multiple column families in Recovery code path.
      
      [1] I'm happy to dicuss this decison, but I believe this is the cleanest way to go. It also makes backward compatibility much easier. We don't have a requirement of adding multiple column families atomically.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15237
      6fe9b577
  7. 22 1月, 2014 1 次提交
  8. 18 1月, 2014 4 次提交
    • M
      Boost access before mutex is unlocked · 4e8321bf
      Mark Callaghan 提交于
      Summary:
      This moves the use of versions_ to before the mutex is unlocked
      to avoid a possible race.
      
      Task ID: #
      
      Blame Rev:
      
      Test Plan:
      make check
      
      Revert Plan:
      
      Database Impact:
      
      Memcache Impact:
      
      Other Notes:
      
      EImportant:
      
      - begin *PUBLIC* platform impact section -
      Bugzilla: #
      - end platform impact -
      
      Reviewers: haobo, dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15279
      4e8321bf
    • I
      Statistics code cleanup · 83681bf9
      Igor Canadi 提交于
      Summary: I'm separating code-cleanup part of https://reviews.facebook.net/D14517. This will make D14517 easier to understand and this diff easier to review.
      
      Test Plan: make check
      
      Reviewers: haobo, kailiu, sdong, dhruba, tnovak
      
      Reviewed By: tnovak
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15099
      83681bf9
    • I
      Fix SIGSEGV in compaction picker · 0f4a75b7
      Igor Canadi 提交于
      Summary:
      The SIGSEGV was introduced by https://reviews.facebook.net/D15171
      
      I also fixed ExpandWhileOverlapping() which returned the failure by setting it's own stack variable to nullptr (!). This bug is present in 2.6 release, so I guess ExpandWhileOverlapping never fails :)
      
      Test Plan: `make check`. Also MarkCallaghan confirmed it fixed the SIGSEGV he reported.
      
      Reviewers: MarkCallaghan, kailiu, sdong, dhruba, haobo
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15261
      0f4a75b7
    • M
      Fix SlowdownAmount · 439e36db
      Mark Callaghan 提交于
      Summary:
      This had a few bugs.
      1) bottom and top were reversed. top is for the max value but the callers were passing the max
      value to bottom. The result is that the max sleep is used when n >= bottom.
      2) one of the callers passed values with type double and these values are frequently between
      1.0 and 2.0 so rounding will do some bad things
      3) sometimes the function returned 0 when there should be a stall
      
      With this change and one other diff (out for review soon) there are slightly fewer stalls on one workload.
      
      With the fix.
      Stalls(secs): 160.166 level0_slowdown, 0.000 level0_numfiles, 0.000 memtable_compaction, 58.495 leveln_slowdown
      Stalls(count): 910261 level0_slowdown, 0 level0_numfiles, 0 memtable_compaction, 54526 leveln_slowdown
      
      Without the fix.
      Stalls(secs): 172.227 level0_slowdown, 0.000 level0_numfiles, 0.000 memtable_compaction, 56.538 leveln_slowdown
      Stalls(count): 160831 level0_slowdown, 0 level0_numfiles, 0 memtable_compaction, 52845 leveln_slowdown
      
      Task ID: #
      
      Blame Rev:
      
      Test Plan:
      run db_bench for --benchmarks=overwrite with IO-bound database
      
      Revert Plan:
      
      Database Impact:
      
      Memcache Impact:
      
      Other Notes:
      
      EImportant:
      
      - begin *PUBLIC* platform impact section -
      Bugzilla: #
      - end platform impact -
      
      Reviewers: haobo
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15243
      439e36db
  9. 17 1月, 2014 3 次提交
    • K
      Fix some "make format" issue · e19bad9b
      Kai Liu 提交于
      Summary:
      * make sure when some pre-check fails, the script won't halt immediately.
      * change fburl to google's short url.
      * Fix a bug in this script: now it checks the uncommitted code only.
      
      Test Plan: Ran the script under differnet environments.
      
      Reviewers: igor
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15231
      e19bad9b
    • I
      Remove compaction pointers · 6d6fb709
      Igor Canadi 提交于
      Summary: The only thing we do with compaction pointers is set them to some values, we never actually read them. I don't know what we used them for, but it doesn't look like we use them anymore.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15225
      6d6fb709
    • I
      CompactionPicker · c699c84a
      Igor Canadi 提交于
      Summary:
      This is a big one. This diff moves all the code related to picking compactions from VersionSet to new class CompactionPicker. Column families' compactions will be completely separate processes, so we need to have multiple CompactionPickers.
      
      To make this easier to review, most of the code change is just copy/paste. There is also a small change not to use VersionSet::current_, but rather to take `Version* version` as a parameter. Most of the other code is exactly the same.
      
      In future diffs, I will also make some improvements to CompactionPickers. I think the most important part will be encapsulating it better. Currently Version, VersionSet, Compaction and CompactionPicker are all friend classes, which makes it harder to change the implementation.
      
      This diff depends on D15171, D15183, D15189 and D15201
      
      Test Plan: `make check`
      
      Reviewers: kailiu, sdong, dhruba, haobo
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15207
      c699c84a
  10. 16 1月, 2014 5 次提交
    • K
      Remove the unnecessary use of shared_ptr · eae1804f
      kailiu 提交于
      Summary:
      shared_ptr is slower than unique_ptr (which literally comes with no performance cost compare with raw pointers).
      In memtable and memtable rep, we use shared_ptr when we'd actually should use unique_ptr.
      
      According to igor's previous work, we are likely to make quite some performance gain from this diff.
      
      Test Plan: make check
      
      Reviewers: dhruba, igor, sdong, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15213
      eae1804f
    • I
      Move more functions from VersionSet to Version · 787f11bb
      Igor Canadi 提交于
      Summary:
      This moves functions:
      * VersionSet::Finalize() -> Version::UpdateCompactionStats()
      * VersionSet::UpdateFilesBySize() -> Version::UpdateFilesBySize()
      
      The diff depends on D15189, D15183 and D15171
      
      Test Plan: make check
      
      Reviewers: kailiu, sdong, haobo, dhruba
      
      Reviewed By: sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15201
      787f11bb
    • I
      Moving Compaction class to separate header file · 615d1ea2
      Igor Canadi 提交于
      Summary:
      I'm sure we'll all agree that version_set.cc needs simplifying. This diff moves Compaction class to a separate file.
      
      The diff depends on D15171 and D15183
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15189
      615d1ea2
    • I
      Move functions from VersionSet to Version · 2f4eda78
      Igor Canadi 提交于
      Summary:
      There were some functions in VersionSet that had no reason to be there instead of Version. Moving them to Version will make column families implementation easier.
      
      The functions moved are:
      * NumLevelBytes
      * LevelSummary
      * LevelFileSummary
      * MaxNextLevelOverlappingBytes
      * AddLiveFiles (previously AddLiveFilesCurrentVersion())
      * NeedSlowdownForNumLevel0Files
      
      The diff continues on (and depends on) D15171
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong, emayanke
      
      Reviewed By: sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15183
      2f4eda78
    • I
      Decrease reliance on VersionSet::NumberLevels() · 65a8a52b
      Igor Canadi 提交于
      Summary:
      With column families VersionSet will not have a constant number of levels (each CF can have different options), so we'll need to eliminate call to VersionSet::NumberLevels()
      
      This diff decreases number of callsites, but we're not there yet. It associates number of levels with Version (each version is associated with single CF) instead of VersionSet.
      
      I have also slightly changed how VersionSet keeps track of manifest size.
      
      This diff also modifies constructor of Compaction such that it takes input_version and automatically Ref()s it. Before this was done outside of constructor.
      
      In next diffs I will continue to decrease number of callsites of VersionSet::NumberLevels() and also references to current_
      
      Test Plan: make check
      
      Reviewers: haobo, dhruba, kailiu, sdong
      
      Reviewed By: sdong
      
      Differential Revision: https://reviews.facebook.net/D15171
      65a8a52b
  11. 15 1月, 2014 7 次提交