1. 11 May, 2013 2 commits
  2. 10 May, 2013 1 commit
  3. 04 May, 2013 1 commit
    • [Rocksdb] Support Merge operation in rocksdb · 05e88540
      Haobo Xu committed
      Summary:
      This diff introduces a new Merge operation into rocksdb.
      The purpose of this review is mostly getting feedback from the team (everyone please) on the design.
      
      Please focus on the four files under include/leveldb/, as they spell the client visible interface change.
      include/leveldb/db.h
      include/leveldb/merge_operator.h
      include/leveldb/options.h
      include/leveldb/write_batch.h
      
      Please go over local/my_test.cc carefully, as it is a concrete use case.

      Please also review the implementation files to see if the straw man implementation makes sense.

      Note that the diff passes make check and truly supports a forward iterator over the db and a version
      of Get that's based on the iterator.
      
      Future work:
      - Integration with compaction
      - A raw Get implementation
      
      I am working on a wiki that explains the design and implementation choices, but coding comes
      just naturally and I think it might be a good idea to share the code earlier. The code is
      heavily commented.
      
      Test Plan: run all local tests
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: dhruba
      
      CC: leveldb, zshao, sheki, emayanke, MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D9651
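
The idea behind a merge operation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `ToyMergeStore`, `MergeFn`, and `AddCounters` are hypothetical names and do not reproduce the interface in include/leveldb/merge_operator.h. The sketch records operands on write and folds them over the base value on read, so the writer never performs a read-modify-write.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy store, NOT the RocksDB API: it only illustrates recording
// merge operands on write and combining them lazily on read.
class ToyMergeStore {
 public:
  // Combines an optional existing value with one operand into a new value.
  using MergeFn = std::function<std::string(
      const std::optional<std::string>& existing, const std::string& operand)>;

  explicit ToyMergeStore(MergeFn fn) : merge_(std::move(fn)) {
    assert(merge_ && "a merge function is required");
  }

  // Put replaces the base value and discards any pending operands.
  void Put(const std::string& key, const std::string& value) {
    entries_[key] = Entry{value, {}};
  }

  // Merge only appends an operand; nothing is read or combined here.
  void Merge(const std::string& key, const std::string& operand) {
    entries_[key].operands.push_back(operand);
  }

  // Get folds the pending operands over the base value, oldest first.
  std::optional<std::string> Get(const std::string& key) const {
    auto it = entries_.find(key);
    if (it == entries_.end()) return std::nullopt;
    std::optional<std::string> result = it->second.base;
    for (const std::string& operand : it->second.operands) {
      result = merge_(result, operand);
    }
    return result;
  }

 private:
  struct Entry {
    std::optional<std::string> base;
    std::vector<std::string> operands;
  };
  MergeFn merge_;
  std::map<std::string, Entry> entries_;
};

// Example merge function: treat values as decimal counters and add them.
inline std::string AddCounters(const std::optional<std::string>& existing,
                               const std::string& operand) {
  const std::int64_t base = existing ? std::stoll(*existing) : 0;
  return std::to_string(base + std::stoll(operand));
}
```

With `AddCounters`, a `Put("n", "10")` followed by `Merge("n", "5")` and `Merge("n", "-3")` makes `Get("n")` return "12". Compaction, listed under future work in the commit, is not modelled.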
  4. 03 May, 2013 1 commit
    • Timestamp and TTL Wrapper for rocksdb · d786b25e
      Mayank Agarwal committed
      Summary:
      When the database is opened with the DBTimestamp::Open call, timestamps are prepended to the value during Put and stripped from it during Get. The timestamp is used, both in Get and in a custom compaction filter, to discard values that have exceeded the TTL specified during Open.
      I have made a temporary change to the Makefile so that we can test with the temporary file TestTime.cc. I have also changed the private members in db_impl.h to protected so that the new class DBTimestamp can inherit them.

      Test Plan: make db_timestamp; TestTime.cc (will not be checked in) shows how to use the APIs currently, but I will write unit tests shortly
      
      Reviewers: dhruba, vamsi, haobo, sheki, heyongqiang, vkrest
      
      Reviewed By: vamsi
      
      CC: zshao, xjin, vkrest, MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D10311
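
The prepend-and-strip scheme described in the summary can be sketched as follows. This is only an illustration under stated assumptions: `ToyTtlStore` is a hypothetical class, not the DBTimestamp wrapper from the commit; the current time is passed in explicitly to keep the example deterministic; and the timestamp is stored as 8 bytes in host byte order, which a real on-disk format would not want.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <map>
#include <optional>
#include <string>

// Hypothetical toy wrapper, NOT the DBTimestamp class from the commit.
class ToyTtlStore {
 public:
  explicit ToyTtlStore(std::int64_t ttl_seconds) : ttl_seconds_(ttl_seconds) {
    assert(ttl_seconds > 0);
  }

  // Put prepends the write time (8 bytes, host byte order) to the user value.
  void Put(const std::string& key, const std::string& value,
           std::int64_t now_seconds) {
    std::string stored(sizeof(now_seconds), '\0');
    std::memcpy(&stored[0], &now_seconds, sizeof(now_seconds));
    stored += value;
    data_[key] = stored;
  }

  // Get strips the timestamp and hides values whose age exceeds the TTL.
  std::optional<std::string> Get(const std::string& key,
                                 std::int64_t now_seconds) const {
    auto it = data_.find(key);
    if (it == data_.end()) return std::nullopt;
    std::int64_t written = 0;
    std::memcpy(&written, it->second.data(), sizeof(written));
    if (now_seconds - written > ttl_seconds_) return std::nullopt;  // expired
    return it->second.substr(sizeof(written));
  }

 private:
  std::int64_t ttl_seconds_;
  std::map<std::string, std::string> data_;
};
```

An expired value stays in the map until something removes it; in the commit that role is played by the custom compaction filter, which this sketch does not model.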
  5. 23 April, 2013 1 commit
  6. 21 April, 2013 1 commit
    • [RocksDB] CompactionFilter cleanup · b4243e5a
      Haobo Xu committed
      Summary:
      - Removed the compaction_filter_value from the callback interface. Restrict compaction filter to purging values.
      - Modified some comments to reflect the current status.
      
      Test Plan: make check
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10335
  7. 12 April, 2013 1 commit
  8. 11 April, 2013 1 commit
    • Set FD_CLOEXEC after each file open · e21ba94a
      heyongqiang committed
      Summary: as subject. This is causing problems in adsconv. Ideally, this flag should be set in open, but that is only supported in Linux kernel ≥2.6.23 and glibc ≥2.7.

      Test Plan: run db_test
      
      Reviewers: dhruba, MarkCallaghan, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb, chip
      
      Differential Revision: https://reviews.facebook.net/D10089
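
Setting the flag after the open can be sketched with the POSIX `fcntl` calls `F_GETFD` and `F_SETFD`. The helper names `SetCloseOnExec` and `OpenWithCloseOnExec` are hypothetical and are not taken from the commit.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Sets FD_CLOEXEC on an already-open descriptor, preserving other fd flags.
// Returns true on success.
inline bool SetCloseOnExec(int fd) {
  const int flags = fcntl(fd, F_GETFD);
  if (flags == -1) return false;
  return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) != -1;
}

// Opens a file and marks it close-on-exec in a second step.
// Returns the descriptor, or -1 on failure.
inline int OpenWithCloseOnExec(const char* path, int open_flags,
                               mode_t mode = 0644) {
  assert(path != nullptr);
  const int fd = open(path, open_flags, mode);
  if (fd == -1) return -1;
  if (!SetCloseOnExec(fd)) {
    close(fd);
    return -1;
  }
  return fd;
}
```

Because the flag is set in a separate call, a fork and exec on another thread between `open` and `fcntl` can still inherit the descriptor. Passing `O_CLOEXEC` to `open` closes that window, which is why the summary calls it the ideal approach where the kernel and glibc support it.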
  9. 28 March, 2013 1 commit
    • memory manage statistics · 63f216ee
      Abhishek Kona committed
      Summary:
      Earlier, the Statistics object was a raw pointer. This meant the user had to clean up
      the Statistics object after creating the database. In most use cases the database is created in a function and the statistics pointer goes out of scope there, so the Statistics object would never be deleted.
      It is now managed by a shared_ptr.
      
      Want this in before the next release.
      
      Test Plan: make all check.
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9735
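
The ownership problem and its fix can be shown in a small sketch. `ToyStatistics`, `ToyOptions`, `ToyDb`, and `OpenToyDb` are hypothetical stand-ins; the real Options and Statistics types differ. The point is that the database holds its own `std::shared_ptr`, so the object survives the creating function and is freed when the last owner goes away.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-ins; the real Options and Statistics types differ.
struct ToyStatistics {
  std::uint64_t reads = 0;
};

struct ToyOptions {
  std::shared_ptr<ToyStatistics> statistics;  // shared, not raw, ownership
};

class ToyDb {
 public:
  explicit ToyDb(const ToyOptions& options) : stats_(options.statistics) {}

  // Counts a read if statistics collection was configured.
  void Get() {
    if (stats_) ++stats_->reads;
  }

 private:
  std::shared_ptr<ToyStatistics> stats_;  // keeps the object alive
};

// The local `options` goes out of scope on return, but the returned database
// still owns the statistics object. `observer` (optional) lets a caller watch
// the object's lifetime without owning it.
inline std::unique_ptr<ToyDb> OpenToyDb(std::weak_ptr<ToyStatistics>* observer) {
  ToyOptions options;
  options.statistics = std::make_shared<ToyStatistics>();
  if (observer != nullptr) *observer = options.statistics;
  return std::make_unique<ToyDb>(options);
}
```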
  10. 21 March, 2013 1 commit
    • Ability to configure bufferedio-reads, filesystem-readaheads and mmap-read-write per database. · ad96563b
      Dhruba Borthakur committed
      Summary:
      This patch allows an application to specify whether to use bufferedio,
      reads-via-mmaps and writes-via-mmaps per database. Earlier, there
      was a global static variable that was used to configure this functionality.
      
      The default setting remains the same (and is backward compatible):
       1. use bufferedio
       2. do not use mmaps for reads
       3. use mmap for writes
       4. use readaheads for reads needed for compaction
      
      I also added a parameter to db_bench to be able to explicitly specify
      whether to do readaheads for compactions or not.
      
      Test Plan: make check
      
      Reviewers: sheki, heyongqiang, MarkCallaghan
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9429
  11. 20 March, 2013 1 commit
  12. 09 March, 2013 1 commit
  13. 07 March, 2013 1 commit
    • Do not allow Transaction Log Iterator to fall ahead when writer is writing the same file · d68880a1
      Abhishek Kona committed
      Summary:
      Store the last flushed sequence number in db_impl and check against it in
      the transaction log iterator. Do not attempt to read ahead if we do not know
      whether the data has been flushed completely.
      Does not work if flush is disabled. Any ideas on fixing that?
      * Minor change: iter->Next is called automatically the first time.

      Test Plan:
      Existing tests pass.
      More ideas on testing this?
      Planning to run some stress tests.
      
      Reviewers: dhruba, heyongqiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9087
  14. 05 March, 2013 1 commit
    • [RocksDB] Add bulk_load option to Options and ldb · 7b435007
      Zheng Shao committed
      Summary:
      Add a shortcut function to make it easier for people
      to efficiently bulk_load data into RocksDB.
      
      Test Plan:
      Tried ldb with "--bulk_load" and "--bulk_load --compact" and verified the outcome.
      Needs to consult the team on how to test this automatically.
      
      Reviewers: sheki, dhruba, emayanke, heyongqiang
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8907
  15. 04 March, 2013 2 commits
    • Add rate_delay_limit_milliseconds · 993543d1
      Mark Callaghan committed
      Summary:
      This adds the rate_delay_limit_milliseconds option to make the delay
      configurable in MakeRoomForWrite when the max compaction score is too high.
      This delay is called the Ln slowdown. This change also counts the Ln slowdown
      per level to make it possible to see where the stalls occur.
      
      From IO-bound performance testing, the Level N stalls occur:
      * with compression -> at the largest uncompressed level. This makes sense
                            because compaction for compressed levels is much
                            slower. When Lx is uncompressed and Lx+1 is compressed
                            then files pile up at Lx because the (Lx,Lx+1)->Lx+1
                            compaction process is the first to be slowed by
                            compression.
      * without compression -> at level 1
      
      Task ID: #1832108
      
      Test Plan:
      run with real data, added test
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D9045
    • Ability for rocksdb to compact when flushing the in-memory memtable to a file in L0. · 806e2643
      Dhruba Borthakur committed
      Summary:
      Rocks accumulates recent writes and deletes in the in-memory memtable.
      When the memtable is full, it writes the contents of the memtable to
      a file in L0.
      
      This patch removes redundant records at the time of the flush. If there
      are multiple versions of the same key in the memtable, then only the
      most recent one is dumped into the output file. The purging of
      redundant records occurs only if the most recent snapshot is earlier
      than the earliest record in the memtable.
      
      Should we switch on this feature by default or should we keep this feature
      turned off in the default settings?
      
      Test Plan: Added test case to db_test.cc
      
      Reviewers: sheki, vamsi, emayanke, heyongqiang
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8991
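
The flush-time purge rule can be sketched as below. `ToyRecord` and `FlushRecords` are hypothetical names, and the memtable is modelled as a vector sorted by key and then by descending sequence number, which is an assumption made for the example rather than a description of the actual data structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record type; a larger sequence number means a more recent write.
struct ToyRecord {
  std::string key;
  std::uint64_t sequence;
  std::string value;
};

// `memtable` must be sorted by key ascending, then by sequence descending.
// `newest_snapshot` is the sequence number of the most recent snapshot
// (0 if there is none). Returns the records that would be written to L0.
inline std::vector<ToyRecord> FlushRecords(const std::vector<ToyRecord>& memtable,
                                           std::uint64_t newest_snapshot) {
  if (memtable.empty()) return {};
  std::uint64_t earliest = memtable.front().sequence;
  for (const ToyRecord& record : memtable) {
    earliest = std::min(earliest, record.sequence);
  }
  // Purge only if the newest snapshot predates every record in the memtable,
  // so no snapshot can still need an older version.
  const bool can_purge = newest_snapshot < earliest;

  std::vector<ToyRecord> output;
  for (const ToyRecord& record : memtable) {
    const bool older_version =
        !output.empty() && output.back().key == record.key;
    if (can_purge && older_version) continue;  // drop redundant version
    output.push_back(record);
  }
  return output;
}
```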
  16. 01 March, 2013 1 commit
  17. 26 February, 2013 1 commit
  18. 23 February, 2013 1 commit
  19. 22 February, 2013 1 commit
    • Counters for bytes written and read. · ec77366e
      Abhishek Kona committed
      Summary:
      * Counters for bytes read and written.
      As a part of this diff, I also want to:
      * Measure compaction times. @dhruba, can you point out which function
        I should time to get compaction times? I was looking at CompactRange.
      
      Test Plan: db_test
      
      Reviewers: dhruba, emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8763
  20. 21 February, 2013 1 commit
    • Introduce histogram in statistics.h · fe10200d
      Abhishek Kona committed
      Summary:
      * Introduce a histogram in statistics.h.
      * Add a stop watch to measure time.
      * Introduce two timers as a proof of concept.
      Replaced NULL with nullptr to fix some lint errors.
      Should be useful for google.
      
      Test Plan:
      ran db_bench and check stats.
      make all check
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8637
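
A stop watch that feeds a histogram can be sketched with a scope-bound timer. `ToyHistogram` and `ToyStopWatch` are hypothetical names chosen for illustration, and the histogram simply stores raw samples instead of bucketing them as a production implementation likely would.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical histogram: stores raw samples rather than buckets.
class ToyHistogram {
 public:
  void Add(std::uint64_t micros) { samples_.push_back(micros); }
  std::size_t Count() const { return samples_.size(); }
  double Average() const {
    if (samples_.empty()) return 0.0;
    const double sum = std::accumulate(samples_.begin(), samples_.end(), 0.0);
    return sum / static_cast<double>(samples_.size());
  }

 private:
  std::vector<std::uint64_t> samples_;
};

// Hypothetical stop watch: measures the lifetime of the object and records
// the elapsed microseconds into the histogram when it is destroyed.
class ToyStopWatch {
 public:
  explicit ToyStopWatch(ToyHistogram* histogram)
      : histogram_(histogram), start_(std::chrono::steady_clock::now()) {
    assert(histogram_ != nullptr);
  }
  ToyStopWatch(const ToyStopWatch&) = delete;
  ToyStopWatch& operator=(const ToyStopWatch&) = delete;

  ~ToyStopWatch() {
    const auto elapsed = std::chrono::steady_clock::now() - start_;
    histogram_->Add(static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count()));
  }

 private:
  ToyHistogram* histogram_;
  std::chrono::steady_clock::time_point start_;
};
```

Declaring a `ToyStopWatch` at the top of a function times that function on every exit path, including early returns.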
  21. 05 February, 2013 1 commit
    • Allow the logs to be purged by TTL. · b63aafce
      Kai Liu committed
      Summary:
      * Add a SplitByTTLLogger to enable this feature. In this diff I implemented a generalized AutoSplitLoggerBase class to simplify the development of such classes.
      * Refactor the existing AutoSplitLogger and fix several bugs.

      Test Plan:
      * Added unit tests for the different types of "auto-splittable" loggers individually.
      * Tested the composite logger, which allows the log files to be split by both TTL and log size.
      
      Reviewers: heyongqiang, dhruba
      
      Reviewed By: heyongqiang
      
      CC: zshao, leveldb
      
      Differential Revision: https://reviews.facebook.net/D8037
  22. 01 February, 2013 1 commit
    • Fixed cache key for block cache · 4dcc0c89
      Kosie van der Merwe committed
      Summary:
      Added a function to `RandomAccessFile` to generate a unique ID for that file. Currently only `PosixRandomAccessFile` implements this behaviour, and only on Linux.
      
      Changed how key is generated in `Table::BlockReader`.
      
      Added tests to check whether the unique ID is stable, unique and not a prefix of another unique ID. Added tests to see that `Table` uses the cache more efficiently.
      
      Test Plan: make check
      
      Reviewers: chip, vamsi, dhruba
      
      Reviewed By: chip
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8145
  23. 26 January, 2013 1 commit
    • Fix poor error on num_levels mismatch and few other minor improvements · 0b83a831
      Chip Turner committed
      Summary:
      Previously, if you opened a db with num_levels set lower than
      the database, you received the unhelpful message "Corruption:
      VersionEdit: new-file entry."  Now you get a more verbose message
      describing the issue.
      
      Also, fix handling of compression_levels (both the run-over-the-end
      issue and the memory management of it).
      
      Lastly, unique_ptr'ify a couple of minor calls.
      
      Test Plan: make check
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8151
  24. 25 January, 2013 1 commit
    • Use fallocate to prevent excessive allocation of sst files and logs · 3dafdfb2
      Chip Turner committed
      Summary:
      On some filesystems, pre-allocation can be a considerable
      amount of space.  xfs in our production environment pre-allocates by
      1GB, for instance.  By using fallocate to inform the kernel of our
      expected file sizes, we eliminate this wastage (which isn't recovered
      until the file is closed which, in the case of LOG files, can be a
      considerable amount of time).
      
      Test Plan:
      created an xfs loopback filesystem, mounted with
      allocsize=4M, and ran db_stress.  LOG file without this change was 4M,
      and with it it was 128k then grew to normal size.
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: adsharma, leveldb
      
      Differential Revision: https://reviews.facebook.net/D7953
  25. 24 January, 2013 1 commit
    • Fix a number of object lifetime/ownership issues · 2fdf91a4
      Chip Turner committed
      Summary:
      Replace manual memory management with std::unique_ptr in a
      number of places; not exhaustive, but this fixes a few leaks with file
      handles as well as clarifies semantics of the ownership of file handles
      with log classes.
      
      Test Plan: db_stress, make check
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: zshao, leveldb, heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D8043
  26. 18 January, 2013 1 commit
  27. 17 January, 2013 2 commits
  28. 15 January, 2013 1 commit
    • Various build cleanups/improvements · c0cb289d
      Chip Turner committed
      Summary:
      Specific changes:
      
      1) Turn on -Werror so all warnings are errors
      2) Fix some warnings the above now complains about
      3) Add proper dependency support so changing a .h file forces a .c file
      to rebuild
      4) Automatically use fbcode gcc on any internal machine rather than
      whatever system compiler is lying around
      5) Fix jemalloc to once again be used in the builds (seemed like it
      wasn't being?)
      6) Fix issue where 'git' would fail in build_detect_version because of
      LD_LIBRARY_PATH being set in the third-party build system
      
      Test Plan:
      make, make check, make clean, touch a header file, make sure
      rebuild is expected
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D7887
  29. 20 December, 2012 1 commit
    • Enhance ReadOnly mode to process the all committed transactions. · f4c2b7cf
      Dhruba Borthakur committed
      Summary:
      Leveldb has an api OpenForReadOnly() that opens the database
      in readonly mode. This call had an option to not process the
      transaction log.  This patch removes this option and always
      processes all transactions that had been committed. It has
      been done in such a way that it does not create/write to
      any new files in the process. The invariant of "no-writes"
      to the leveldb data directory is still true.
      
      This enhancement allows multiple threads to open the same database
      in readonly mode and access all transactions that were committed right
      up to the OpenForReadOnly call.
      
      I changed the public API to match the new semantics because
      there are no users who are currently using this api.
      
      Test Plan: make clean check
      
      Reviewers: sheki
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D7479
  30. 18 December, 2012 1 commit
    • Enhancements to rocksdb for better support for replication. · 3d1e92b0
      Dhruba Borthakur committed
      Summary:
      1. The OpenForReadOnly() call should not lock the db. This is useful
      so that multiple processes can open the same database concurrently
      for reading.
      2. GetUpdatesSince should not error out if the archive directory
      does not exist.
      3. A new constructor for WriteBatch that takes a serialized
      string as a parameter.
      
      Test Plan: make clean check
      
      Reviewers: sheki
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D7449
  31. 17 December, 2012 1 commit
    • manifest_dump: Add --hex=1 option · c2809753
      Zheng Shao committed
      Summary: Without this option, manifest_dump does not print binary keys for files in a human-readable way.
      
      Test Plan:
      ./manifest_dump --hex=1 --verbose=0 --file=/data/users/zshao/fdb_comparison/leveldb/fbobj.apprequest-0_0_original/MANIFEST-000002
      manifest_file_number 589 next_file_number 590 last_sequence 2311567 log_number 543  prev_log_number 0
      --- level 0 --- version# 0 ---
       532:1300357['0000455BABE20000' @ 2183973 : 1 .. 'FFFCA5D7ADE20000' @ 2184254 : 1]
       536:1308170['000198C75CE30000' @ 2203313 : 1 .. 'FFFCF94A79E30000' @ 2206463 : 1]
       542:1321644['0002931AA5E50000' @ 2267055 : 1 .. 'FFF77B31C5E50000' @ 2270754 : 1]
       544:1286390['000410A309E60000' @ 2278592 : 1 .. 'FFFE470A73E60000' @ 2289221 : 1]
       538:1298778['0006BCF4D8E30000' @ 2217050 : 1 .. 'FFFD77DAF7E30000' @ 2220489 : 1]
       540:1282353['00090D5356E40000' @ 2231156 : 1 .. 'FFFFF4625CE40000' @ 2231969 : 1]
      --- level 1 --- version# 0 ---
       510:2112325['000007F9C2D40000' @ 1782099 : 1 .. '146F5B67B8D80000' @ 1905458 : 1]
       511:2121742['146F8A3023D60000' @ 1824388 : 1 .. '28BC8FBB9CD40000' @ 1777993 : 1]
       512:801631['28BCD396F1DE0000' @ 2080191 : 1 .. '3082DBE9ADDB0000' @ 1989927 : 1]
      
      Reviewers: dhruba, sheki, emayanke
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D7425
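
The kind of conversion such an option needs can be sketched as a plain byte-to-hex encoder. `ToHex` is a hypothetical helper, not the function manifest_dump actually uses; it produces uppercase digits to match the keys shown in the test plan output.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: renders each byte of a binary key as two uppercase
// hexadecimal digits so that unprintable bytes become readable.
inline std::string ToHex(const std::string& bytes) {
  static const char kDigits[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(bytes.size() * 2);
  for (unsigned char c : bytes) {
    out.push_back(kDigits[c >> 4]);    // high nibble
    out.push_back(kDigits[c & 0x0F]);  // low nibble
  }
  return out;
}
```

Iterating as `unsigned char` matters: with a signed `char`, bytes of 0x80 and above would produce negative indexes into the digit table.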
  32. 13 December, 2012 2 commits
  33. 11 December, 2012 1 commit
  34. 08 December, 2012 1 commit
    • GetUpdatesSince API to enable replication. · 80550089
      Abhishek Kona committed
      Summary:
      How it works:
      * GetUpdatesSince takes a SequenceNumber.
      * The log file whose first SequenceNumber is nearest to, and less than, the requested SequenceNumber is found.
      * Seek within that log file until the requested SequenceNumber is found.
      * Return an iterator that contains the logic to return records one by one.
      
      Test Plan:
      * Test case included to check the good code path.
      * Will update with more test-cases.
      * Feedback required on test-cases.
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D7119
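
The file-selection step above can be sketched with a binary search. `ToyLogFile` and `FindStartingLogFile` are hypothetical names, and the sketch assumes the files are already sorted by their first sequence number. It also treats a file whose first sequence number equals the requested one as a match, which is an assumption about the boundary case that the summary does not spell out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of a log file.
struct ToyLogFile {
  std::uint64_t file_number;
  std::uint64_t first_sequence;  // sequence number of the file's first record
};

// `files` must be sorted by first_sequence ascending.
// Returns the index of the last file whose first_sequence is <= `requested`,
// or -1 if the requested sequence number precedes every file.
inline int FindStartingLogFile(const std::vector<ToyLogFile>& files,
                               std::uint64_t requested) {
  // upper_bound finds the first file that starts AFTER the requested number.
  auto it = std::upper_bound(
      files.begin(), files.end(), requested,
      [](std::uint64_t sequence, const ToyLogFile& file) {
        return sequence < file.first_sequence;
      });
  if (it == files.begin()) return -1;
  return static_cast<int>(it - files.begin()) - 1;
}
```

From the returned file, a reader would then scan forward record by record until it reaches the requested sequence number, as the summary describes.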
  35. 29 November, 2012 2 commits
    • Move WAL files to archive directory, instead of deleting. · d4627e6d
      sheki committed
      Summary:
      Create a directory "archive" in the DB directory.
      During DeleteObsoleteFiles, move the WAL files (*.log) to the archive directory
      instead of deleting them.
      
      Test Plan: Created a DB using DB_Bench. Reopened it. Checked if files move.
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6975
    • Fix all the lint errors. · d29f1819
      Abhishek Kona committed
      Summary:
      Scripted and removed all trailing spaces and converted all tabs to
      spaces.
      
      Also fixed other lint errors.
      All lint errors from this point of time should be taken seriously.
      
      Test Plan: make all check
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D7059