1. 16 Apr, 2014 2 commits
    • RocksDBLite · 588bca20
      Committed by Igor Canadi
      Summary:
      Introducing RocksDBLite! Removes all the non-essential features and reduces the binary size. This effort should help our adoption on mobile.
      
      Binary size when compiling for iOS (`TARGET_OS=IOS m static_lib`) is down from 15MB to 9MB (without stripping).
      
      Test Plan: compiles :)
      
      Reviewers: dhruba, haobo, ljin, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17835
    • Don't roll empty logs · e6acb874
      Committed by Igor Canadi
      Summary:
      With multiple column families, especially when a manual Flush is executed, we might roll the log file even though the current log file is empty (no data has been written to the log).
      
      After this diff, we won't create a new log file if the current one is empty.
      
      Next, I will write an algorithm that will flush column families that reference old log files (i.e., that haven't been flushed in a while).
      
      Test Plan: Added a unit test. Confirmed that the unit test fails in master.
      
      Reviewers: dhruba, haobo, ljin, sdong
      
      Reviewed By: ljin
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17631
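The rule described in this commit can be illustrated with a minimal sketch. The `LogFile` and `LogRoller` types below are hypothetical stand-ins invented for illustration; they are not the RocksDB classes, and the real change lives inside the DB's flush path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a write-ahead log file; not the RocksDB class.
struct LogFile {
  uint64_t number;
  std::vector<std::string> records;
  bool empty() const { return records.empty(); }
};

// Hypothetical log manager that rolls to a new file only when needed.
class LogRoller {
 public:
  LogRoller() : next_number_(1) {
    logs_.push_back(LogFile{next_number_++, {}});
  }

  void Append(const std::string& record) {
    logs_.back().records.push_back(record);
  }

  // Called on flush. Returns true if a new log file was created.
  bool MaybeRoll() {
    if (logs_.back().empty()) {
      return false;  // nothing was written, so keep using the current file
    }
    logs_.push_back(LogFile{next_number_++, {}});
    return true;
  }

  uint64_t current_number() const { return logs_.back().number; }
  std::size_t file_count() const { return logs_.size(); }

 private:
  uint64_t next_number_;
  std::vector<LogFile> logs_;
};
```

With several column families, each manual flush would call something like `MaybeRoll()`; under this rule only the first flush after a write creates a new file.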
  2. 15 Apr, 2014 1 commit
  3. 08 Apr, 2014 1 commit
  4. 05 Apr, 2014 1 commit
    • Create log::Writer out of DB Mutex · ea0198fe
      Committed by sdong
      Summary: Our measurements show that the log::Writer constructor can sometimes take hundreds of milliseconds. It's unclear why, but for now simply move it out of the DB mutex.
      
      Test Plan: make all check
      
      Reviewers: haobo, ljin, igor
      
      Reviewed By: haobo
      
      CC: nkg-, yhchiang, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17487
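The general pattern, building a possibly slow object before taking the lock and only swapping a pointer under it, might look like the sketch below. `SlowWriter` and `Db` are hypothetical names used for illustration; this is not the DBImpl code.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical writer whose constructor may be slow (for example, it may
// touch the file system). Not the real log::Writer.
class SlowWriter {
 public:
  explicit SlowWriter(std::string name) : name_(std::move(name)) {}
  const std::string& name() const { return name_; }

 private:
  std::string name_;
};

class Db {
 public:
  void SwitchLog(const std::string& name) {
    // Construct before locking so other threads are not blocked while the
    // constructor runs.
    std::unique_ptr<SlowWriter> fresh(new SlowWriter(name));
    std::lock_guard<std::mutex> guard(mu_);
    // Only the cheap pointer swap happens under the mutex. The old writer
    // ends up in `fresh`, which is destroyed after `guard` releases the lock.
    writer_.swap(fresh);
  }

  std::string CurrentLogName() {
    std::lock_guard<std::mutex> guard(mu_);
    return writer_ ? writer_->name() : std::string();
  }

 private:
  std::mutex mu_;
  std::unique_ptr<SlowWriter> writer_;
};
```

Because locals are destroyed in reverse order of construction, the previous writer is also torn down outside the critical section.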
  5. 04 Apr, 2014 1 commit
  6. 03 Apr, 2014 1 commit
    • [RocksDB] Fix a race condition in GetSortedWalFiles · 48bc0c6a
      Committed by Haobo Xu
      Summary: This patch fixes a race condition where a log file is moved to the archive directory in the middle of GetSortedWalFiles. Without the fix, the log file would be missing from the result, which leads to a gap in the transaction log iterator. A test utility, SyncPoint, is added to help reproduce the race condition.
      
      Test Plan: TransactionLogIteratorRace; make check
      
      Reviewers: dhruba, ljin
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17121
  7. 28 Mar, 2014 1 commit
  8. 25 Mar, 2014 1 commit
    • [rocksdb] new CompactionFilterV2 API · b47812fb
      Committed by Danny Guo
      Summary:
      This diff adds a new CompactionFilterV2 API that rolls up the
      decisions for kv pairs during compactions. These kv pairs must share the
      same key prefix. They are buffered inside the db.
      
          typedef std::vector<Slice> SliceVector;
          virtual std::vector<bool> Filter(int level,
                                       const SliceVector& keys,
                                       const SliceVector& existing_values,
                                       std::vector<std::string>* new_values,
                                       std::vector<bool>* values_changed
                                       ) const = 0;
      
      Applications can override the Filter() function to operate
      on the buffered kv pairs. More details are in the inline documentation.
      
      Test Plan:
      make check. Added unit tests to make sure Keep, Delete, and
      Change all work.
      
      Reviewers: haobo
      
      CCs: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15087
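A batch filter with the signature quoted in this commit could be overridden roughly as follows. This is a sketch under assumptions: `Slice` is replaced by `std::string` so the example is self-contained, the class names are hypothetical, and it assumes that a `true` entry in the returned vector means "drop this pair", as with the original single-key CompactionFilter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for illustration only; the real API uses rocksdb::Slice.
using Slice = std::string;
using SliceVector = std::vector<Slice>;

// Mirrors the Filter() signature quoted in the commit message.
class CompactionFilterV2Sketch {
 public:
  virtual ~CompactionFilterV2Sketch() {}
  virtual std::vector<bool> Filter(int level, const SliceVector& keys,
                                   const SliceVector& existing_values,
                                   std::vector<std::string>* new_values,
                                   std::vector<bool>* values_changed) const = 0;
};

// Hypothetical filter: drop pairs with an empty value, truncate long values.
class TrimFilter : public CompactionFilterV2Sketch {
 public:
  explicit TrimFilter(std::size_t max_len) : max_len_(max_len) {}

  std::vector<bool> Filter(int /*level*/, const SliceVector& keys,
                           const SliceVector& existing_values,
                           std::vector<std::string>* new_values,
                           std::vector<bool>* values_changed) const override {
    assert(keys.size() == existing_values.size());
    std::vector<bool> to_delete(keys.size(), false);  // assumed: true = drop
    new_values->assign(keys.size(), std::string());
    values_changed->assign(keys.size(), false);
    for (std::size_t i = 0; i < keys.size(); ++i) {
      const Slice& v = existing_values[i];
      if (v.empty()) {
        to_delete[i] = true;
      } else if (v.size() > max_len_) {
        (*new_values)[i] = v.substr(0, max_len_);
        (*values_changed)[i] = true;
      }
    }
    return to_delete;
  }

 private:
  std::size_t max_len_;
};
```

The point of the batch form is that one call sees every buffered pair sharing a prefix, so a decision for one key can depend on its neighbours.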
  9. 21 Mar, 2014 1 commit
  10. 19 Mar, 2014 1 commit
  11. 18 Mar, 2014 1 commit
    • Fix race condition in manifest roll · ae25742a
      Committed by Igor Canadi
      Summary:
      When the manifest is getting rolled the following happens:
      1) manifest_file_number_ is assigned to a new manifest number (even though the old one is still current)
      2) mutex is unlocked
      3) SetCurrentFile() creates temporary file manifest_file_number_.dbtmp
      4) SetCurrentFile() renames manifest_file_number_.dbtmp to CURRENT
      5) mutex is locked
      
      If FindObsoleteFiles happens between (3) and (4) it will:
      1) Delete manifest_file_number_.dbtmp (because it's not in pending_outputs_)
      2) Delete old manifest (because the manifest_file_number_ already points to a new one)
      
      I introduce the concept of prev_manifest_file_number_ that will avoid the race condition.
      
      However, we should discuss the future of MANIFEST file rolling. We found some race conditions with it last week, and who knows how many more there are. Nobody is using it in production because we don't trust the implementation. Should we even support it?
      
      Test Plan: make check
      
      Reviewers: ljin, dhruba, haobo, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16929
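One way to picture the role of prev_manifest_file_number_ is the sketch below. It is an assumption about how such a field could protect the old manifest during the roll, not the actual VersionSet code; the struct and function names are hypothetical, and the temporary .dbtmp file is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical state mirroring the description in the commit message.
struct ManifestState {
  uint64_t manifest_file_number = 0;       // may already name the new manifest
  uint64_t prev_manifest_file_number = 0;  // old manifest, current until CURRENT is renamed
};

// A manifest file may be deleted only if it is neither the new manifest nor
// the previous one.
bool IsManifestObsolete(const ManifestState& s, uint64_t file_number) {
  return file_number != s.manifest_file_number &&
         file_number != s.prev_manifest_file_number;
}

// Start of the roll: remember the old number before assigning the new one.
void BeginRoll(ManifestState* s, uint64_t new_number) {
  s->prev_manifest_file_number = s->manifest_file_number;
  s->manifest_file_number = new_number;
}

// Once CURRENT points at the new manifest, the old one is no longer protected.
void FinishRoll(ManifestState* s) {
  s->prev_manifest_file_number = s->manifest_file_number;
}
```

An obsolete-file scan that runs between `BeginRoll` and `FinishRoll` then leaves both manifests alone.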
  12. 12 Mar, 2014 2 commits
    • Fix data race against logging data structure because of LogBuffer · bd45633b
      Committed by sdong
      Summary:
      @igor pointed out that there is a potential data race because of the way we use the newly introduced LogBuffer. After "bg_compaction_scheduled_--" or "bg_flush_scheduled_--", both counters can become 0. As soon as the lock is released after that, DBImpl's destructor can go ahead and destroy all the state inside the DB, including the info_log object held in a shared pointer by the options object it keeps. At that point it is no longer safe to continue using the info logger to write the delayed logs.
      
      With the patch, the lock is released temporarily so the log buffer can be flushed before "bg_compaction_scheduled_--" or "bg_flush_scheduled_--". To make sure we don't miss any pending flush or compaction, a new flag bg_schedule_needed_ is added, which is set to true if there is a pending flush or compaction that was not scheduled because of the max thread limit. If the flag is set, the scheduling function will be called before the compaction or flush thread finishes.
      
      Thanks @igor for this finding!
      
      Test Plan: make all check
      
      Reviewers: haobo, igor
      
      Reviewed By: haobo
      
      CC: dhruba, ljin, yhchiang, igor, leveldb
      
      Differential Revision: https://reviews.facebook.net/D16767
    • [CF] db_stress for column families · 457c78eb
      Committed by Igor Canadi
      Summary:
      I have had this diff for a while to test the column families implementation. Last night, I ran it successfully for 10 hours with the command:
      
         time ./db_stress --threads=30 --ops_per_thread=200000000 --max_key=5000 --column_families=20 --clear_column_family_one_in=3000000 --verify_before_write=1  --reopen=50 --max_background_compactions=10 --max_background_flushes=10 --db=/tmp/db_stress
      
      It is ready to be committed :)
      
      Test Plan: Ran it for 10 hours
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16797
  13. 11 Mar, 2014 2 commits
  14. 08 Mar, 2014 2 commits
  15. 06 Mar, 2014 1 commit
    • Buffer info logs when picking compactions and write them out after releasing the mutex · ecb1ffa2
      Committed by sdong
      Summary: Currently, while the background thread is picking compactions, it writes out multiple info logs, especially for universal compaction. This means the thread may wait on log writes while holding the mutex, which is bad. To remove this risk, write all those info logs to a buffer and flush it after releasing the mutex.
      
      Test Plan:
      make all check
      check the log lines while running some tests that trigger compactions.
      
      Reviewers: haobo, igor, dhruba
      
      Reviewed By: dhruba
      
      CC: i.am.jin.lei, dhruba, yhchiang, leveldb, nkg-
      
      Differential Revision: https://reviews.facebook.net/D16515
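The buffering idea can be sketched as below. `InfoLog`, `LogBufferSketch`, and `PickCompaction` are hypothetical names for illustration and do not reproduce the LogBuffer class this commit introduces.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink standing in for the info log.
struct InfoLog {
  std::vector<std::string> lines;
  void Write(const std::string& line) { lines.push_back(line); }
};

// Collects messages in memory; cheap enough to call while holding a mutex.
class LogBufferSketch {
 public:
  explicit LogBufferSketch(InfoLog* log) : log_(log) {}

  void Add(std::string msg) { pending_.push_back(std::move(msg)); }

  // Writes buffered messages to the real log. Call after releasing the mutex.
  void Flush() {
    for (const std::string& m : pending_) log_->Write(m);
    pending_.clear();
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  InfoLog* log_;
  std::vector<std::string> pending_;
};

// Picks a "compaction" under the mutex, buffering log lines instead of
// writing them while the lock is held.
int PickCompaction(std::mutex* mu, InfoLog* log) {
  LogBufferSketch buffer(log);
  int picked_level = 0;
  {
    std::lock_guard<std::mutex> guard(*mu);
    picked_level = 1;  // placeholder for the real picking logic
    buffer.Add("picked level " + std::to_string(picked_level));
  }  // mutex released here
  buffer.Flush();  // potentially slow I/O happens outside the lock
  return picked_level;
}
```

The trade-off is that log lines appear slightly later than the events they describe, in exchange for a shorter critical section.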
  16. 01 Mar, 2014 1 commit
    • Add ReadOptions to TransactionLogIterator. · a77527f2
      Committed by Yueh-Hsuan Chiang
      Summary:
      Add an optional input parameter ReadOptions to DB::GetUpdatesSince(),
      which allows the verification of checksums to be disabled by setting
      ReadOptions::verify_checksums to false.
      
      Test Plan: Tests are done off-line and will not be included in the regular unit test.
      
      Reviewers: igor
      
      Reviewed By: igor
      
      CC: leveldb, xjin, dhruba
      
      Differential Revision: https://reviews.facebook.net/D16305
  17. 28 Feb, 2014 1 commit
  18. 27 Feb, 2014 1 commit
  19. 26 Feb, 2014 2 commits
  20. 15 Feb, 2014 2 commits
    • Fix table properties · 422bb09c
      Committed by Igor Canadi
      Summary: Adapt table properties to column family world
      
      Test Plan: make check
      
      Reviewers: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16161
    • [CF] DB test to run on non-default column family · c67d48c8
      Committed by Igor Canadi
      Summary:
      This is a huge diff and it was hectic, but the idea is actually quite simple. Every operation (Put, Get, etc.) done on the default column family in DBTest is now forwarded to a non-default one ("pikachu"). The good news is that we had zero test failures! Column families look stable so far.
      
      One interesting test that I adapted for column families is MultiThreadedTest. I replaced every Put() with a WriteBatch writing to all column families concurrently. Every Put in the write batch contains a unique_id. Instead of Get(), I do a multiget across all column families with the same key. If atomicity holds, I expect to see the same unique_id in all column families.
      
      Test Plan: This is a test!
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16149
  21. 14 Feb, 2014 1 commit
  22. 13 Feb, 2014 2 commits
    • [CF] Rethinking ColumnFamilyHandle and fix to dropping column families · b06840aa
      Committed by Igor Canadi
      Summary:
      Changes to the public behavior:
      * When opening a DB or creating a new column family, the client gets a ColumnFamilyHandle.
      * As long as the column family handle is alive, the client can do whatever they want with it, even drop it.
      * A dropped column family can still be read from (using the column family handle).
      * Added a new call, CloseColumnFamily(). The client has to close all column families it has opened before deleting the DB.
      * As soon as a column family is closed, any calls to the DB using that column family handle will fail (including any outstanding calls).
      
      Internally:
      * Ref-counting ColumnFamilyData
      * New thread-safety for ColumnFamilySet
      * Dropped column families are now completely dropped and their memory cleaned-up
      
      Test Plan: added some tests to column_family_test
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16101
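The "dropped but still readable while a handle is alive" behavior follows naturally from reference counting. The sketch below uses `std::shared_ptr` to show the idea; the type names are hypothetical, and the actual commit ref-counts ColumnFamilyData with its own mechanism.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical per-column-family state, kept alive by reference counting.
struct ColumnFamilyDataSketch {
  std::map<std::string, std::string> kv;
  bool dropped = false;
};

// In this sketch a handle is simply a shared reference to the data.
using HandleSketch = std::shared_ptr<ColumnFamilyDataSketch>;

class DbSketch {
 public:
  HandleSketch CreateColumnFamily(const std::string& name) {
    HandleSketch h = std::make_shared<ColumnFamilyDataSketch>();
    families_[name] = h;
    return h;
  }

  // Dropping removes the family from the DB's set, but any open handle keeps
  // the data alive and readable.
  void DropColumnFamily(const std::string& name) {
    auto it = families_.find(name);
    if (it == families_.end()) return;
    it->second->dropped = true;
    families_.erase(it);
  }

  bool Has(const std::string& name) const { return families_.count(name) > 0; }

 private:
  std::map<std::string, HandleSketch> families_;
};
```

Closing the handle corresponds to releasing the last reference (`h.reset()` here), at which point the memory is freed.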
    • preload table handle on Recover() when max_open_files == -1 · 5fbf2ef4
      Committed by Lei Jin
      Summary: This covers existing table files before DB open happens and avoids contention on table cache
      
      Test Plan: db_test
      
      Reviewers: haobo, sdong, igor, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16089
  23. 06 Feb, 2014 2 commits
    • [CF] Options -> DBOptions · f276e0e5
      Committed by Igor Canadi
      Summary: Replaced most occurrences of Options with the more specific DBOptions. This brings us very close to supporting different configuration options for each column family.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15933
    • [CF] Rethink table cache · c24d8c4e
      Committed by Igor Canadi
      Summary:
      Adapting the table cache to column families is interesting. We want the table cache to be a global LRU, so if some column families are not used as often as others, we want their tables to be evicted from the cache. However, the current TableCache object also constructs tables on its own: if a table is not found in the cache, TableCache automatically creates a new table. We want each column family to be able to specify a different table factory.
      
      To solve the problem, we still have a single LRU, but we provide the LRUCache object to TableCache on construction. We have one TableCache per column family, but the underlying cache is shared by all TableCache objects.
      
      This allows us to have a global LRU while still supporting different table factories for different column families. In the future, it will also be able to support different directories for different column families.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15915
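The arrangement described here, one shared LRU with a per-column-family wrapper that owns its own factory, can be sketched as follows. `LruSketch` and `TableCacheSketch` are hypothetical; tables are modelled as strings, and the cache key format is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Minimal LRU cache keyed by string; a stand-in for the shared LRUCache.
class LruSketch {
 public:
  explicit LruSketch(std::size_t capacity) : capacity_(capacity) {}

  // Returns the cached value or nullptr on a miss. A hit becomes most
  // recently used.
  const std::string* Lookup(const std::string& key) {
    auto it = index_.find(key);
    if (it == index_.end()) return nullptr;
    order_.splice(order_.begin(), order_, it->second);
    return &it->second->second;
  }

  void Insert(const std::string& key, std::string value) {
    auto it = index_.find(key);
    if (it != index_.end()) {
      order_.erase(it->second);
      index_.erase(it);
    }
    order_.emplace_front(key, std::move(value));
    index_[key] = order_.begin();
    if (order_.size() > capacity_) {  // evict the least recently used entry
      index_.erase(order_.back().first);
      order_.pop_back();
    }
  }

  std::size_t size() const { return order_.size(); }

 private:
  using Entry = std::pair<std::string, std::string>;
  std::size_t capacity_;
  std::list<Entry> order_;
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};

// One per column family: owns its table factory, shares the cache.
class TableCacheSketch {
 public:
  using Factory = std::function<std::string(int)>;

  TableCacheSketch(std::string cf, std::shared_ptr<LruSketch> cache, Factory f)
      : cf_(std::move(cf)), cache_(std::move(cache)), factory_(std::move(f)) {}

  std::string FindTable(int file_number) {
    const std::string key = cf_ + "/" + std::to_string(file_number);
    if (const std::string* hit = cache_->Lookup(key)) return *hit;
    std::string table = factory_(file_number);  // miss: use this family's factory
    cache_->Insert(key, table);
    return table;
  }

 private:
  std::string cf_;
  std::shared_ptr<LruSketch> cache_;
  Factory factory_;
};
```

Because eviction is decided by the single shared cache, a rarely used column family loses its entries first, while each family still builds tables its own way.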
  24. 05 Feb, 2014 2 commits
    • [CF] Move InternalStats to ColumnFamilyData · 7b9f1349
      Committed by Igor Canadi
      Summary: InternalStats is a messy thing, keeping both DB data and column family data. However, it's better off living in ColumnFamilyData than in DBImpl. For now, at least.
      
      Test Plan: make check
      
      Reviewers: dhruba, kailiu, haobo, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15879
    • [CF] Split SanitizeOptions into two · 73f62255
      Committed by Igor Canadi
      Summary:
      There are now three SanitizeOptions functions: one for DBOptions, one for ColumnFamilyOptions, and one for Options (which just calls the other two).
      
      I have also reshuffled some options -- table_cache options and info_log should live in DBOptions, for example.
      
      Test Plan: make check doesn't complain
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15873
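The three-overload layout might be pictured like this. The struct names, the two fields shown, and every numeric limit are hypothetical values chosen for the example; only the structure (a combined options type delegating to the two specific sanitizers) reflects the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical, heavily reduced option structs.
struct DBOptionsSketch {
  int max_open_files = 5000;
};
struct ColumnFamilyOptionsSketch {
  std::size_t write_buffer_size = 4u << 20;
};
// The combined type inherits from both specific option sets.
struct OptionsSketch : DBOptionsSketch, ColumnFamilyOptionsSketch {};

// DB-wide sanitizer. The limits are made up for illustration.
DBOptionsSketch SanitizeOptions(const DBOptionsSketch& src) {
  DBOptionsSketch r = src;
  if (r.max_open_files != -1) {
    r.max_open_files = std::max(20, std::min(r.max_open_files, 1000000));
  }
  return r;
}

// Per-column-family sanitizer. The limit is made up for illustration.
ColumnFamilyOptionsSketch SanitizeOptions(const ColumnFamilyOptionsSketch& src) {
  ColumnFamilyOptionsSketch r = src;
  r.write_buffer_size = std::max<std::size_t>(64u << 10, r.write_buffer_size);
  return r;
}

// Combined sanitizer: just calls the other two on the respective base parts.
OptionsSketch SanitizeOptions(const OptionsSketch& src) {
  OptionsSketch r;
  static_cast<DBOptionsSketch&>(r) =
      SanitizeOptions(static_cast<const DBOptionsSketch&>(src));
  static_cast<ColumnFamilyOptionsSketch&>(r) =
      SanitizeOptions(static_cast<const ColumnFamilyOptionsSketch&>(src));
  return r;
}
```

Splitting the logic this way lets each column family be sanitized on its own later, without touching DB-wide settings.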
  25. 04 Feb, 2014 3 commits
  26. 01 Feb, 2014 2 commits
  27. 31 Jan, 2014 1 commit
  28. 30 Jan, 2014 1 commit
    • InternalStatistics · 3c0dcf0e
      Committed by Igor Canadi
      Summary:
      In DBImpl we keep track of some statistics internally and expose them via GetProperty(). This diff encapsulates all the internal statistics into a class InternalStatistics. Most of it is copy/paste.
      
      Apart from cleaning up db_impl.cc, this diff is also necessary for Column families, since every column family should have its own CompactionStats, MakeRoomForWrite-stall stats, etc. It's much easier to keep track of it in every column family if it's nicely encapsulated in its own class.
      
      Test Plan: make check
      
      Reviewers: dhruba, kailiu, haobo, sdong, emayanke
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15273