1. 13 Dec, 2013 3 commits
    • I
      [backupable db] Delete db_dir children when restoring backup · 417b453f
      Igor Canadi committed
      Summary:
      I realized that the manifest will get deleted by PurgeObsoleteFiles in DBImpl, but it is still cleaner to delete
      files before we restore the backup.
      
      Test Plan: backupable_db_test
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14619
      417b453f
    • M
      Add monitoring for universal compaction and add counters for compaction IO · e9e6b00d
      Mark Callaghan committed
      Summary:
      Adds these counters:
      { WAL_FILE_SYNCED, "rocksdb.wal.synced" },
        number of writes that request a WAL sync
      { WAL_FILE_BYTES, "rocksdb.wal.bytes" },
        number of bytes written to the WAL
      { WRITE_DONE_BY_SELF, "rocksdb.write.self" },
        number of writes processed by the calling thread
      { WRITE_DONE_BY_OTHER, "rocksdb.write.other" },
        number of writes not processed by the calling thread. Instead these were
        processed by the current holder of the write lock
      { WRITE_WITH_WAL, "rocksdb.write.wal" },
        number of writes that request WAL logging
      { COMPACT_READ_BYTES, "rocksdb.compact.read.bytes" },
        number of bytes read during compaction
      { COMPACT_WRITE_BYTES, "rocksdb.compact.write.bytes" },
        number of bytes written during compaction
      
      Per-interval stats output was updated with WAL stats and correct stats for universal compaction
      including a correct value for write-amplification. It now looks like:
                                     Compactions
      Level  Files Size(MB) Score Time(sec)  Read(MB) Write(MB)    Rn(MB)  Rnp1(MB)  Wnew(MB) RW-Amplify Read(MB/s) Write(MB/s)      Rn     Rnp1     Wnp1     NewW    Count  Ln-stall Stall-cnt
      --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        0        7      464  46.4       281      3411      3875      3411         0      3875        2.1      12.1        13.8      621        0      240      240      628       0.0         0
      Uptime(secs): 310.8 total, 2.0 interval
      Writes cumulative: 9999999 total, 9999999 batches, 1.0 per batch, 1.22 ingest GB
      WAL cumulative: 9999999 WAL writes, 9999999 WAL syncs, 1.00 writes per sync, 1.22 GB written
      Compaction IO cumulative (GB): 1.22 new, 3.33 read, 3.78 write, 7.12 read+write
      Compaction IO cumulative (MB/sec): 4.0 new, 11.0 read, 12.5 write, 23.4 read+write
      Amplification cumulative: 4.1 write, 6.8 compaction
      Writes interval: 100000 total, 100000 batches, 1.0 per batch, 12.5 ingest MB
      WAL interval: 100000 WAL writes, 100000 WAL syncs, 1.00 writes per sync, 0.01 MB written
      Compaction IO interval (MB): 12.49 new, 14.98 read, 21.50 write, 36.48 read+write
      Compaction IO interval (MB/sec): 6.4 new, 7.6 read, 11.0 write, 18.6 read+write
      Amplification interval: 101.7 write, 102.9 compaction
      Stalls(secs): 142.924 level0_slowdown, 0.000 level0_numfiles, 0.805 memtable_compaction, 0.000 leveln_slowdown
      Stalls(count): 132461 level0_slowdown, 0 level0_numfiles, 3 memtable_compaction, 0 leveln_slowdown
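
The cumulative amplification figures in the sample output can be reproduced from the IO totals printed above them. The sketch below is one reading of that arithmetic, not the code from this commit: the struct and function names are hypothetical, and the formulas are inferred from the sample numbers (for example, (3.78 + 1.22) / 1.22 is about 4.1 for write amplification). The interval lines in the same sample do not follow from this arithmetic, so treat it as an illustration only.

```cpp
#include <cassert>

// Hypothetical container for the totals shown in the stats output.
// All fields use the same unit (GB in the cumulative lines).
struct IoTotals {
  double ingest;            // user data written ("new")
  double wal;               // bytes written to the WAL
  double compaction_read;   // bytes read during compaction
  double compaction_write;  // bytes written during compaction
};

// Assumed formula: bytes written to storage per ingested byte.
inline double WriteAmplification(const IoTotals& t) {
  assert(t.ingest > 0);
  return (t.compaction_write + t.wal) / t.ingest;
}

// Assumed formula: bytes read plus written per ingested byte.
inline double CompactionAmplification(const IoTotals& t) {
  assert(t.ingest > 0);
  return (t.compaction_read + t.compaction_write + t.wal) / t.ingest;
}
```

With the cumulative values from the sample (1.22 ingest, 1.22 WAL, 3.33 read, 3.78 write), these functions give roughly 4.1 and 6.8, matching the "Amplification cumulative" line.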
      
      Task ID: #3329644, #3301695
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14583
      e9e6b00d
    • I
      portable %lu printing · 249e736b
      Igor Canadi committed
      249e736b
  2. 12 Dec, 2013 5 commits
    • I
      Add readrandom with both memtable and sst regression test · f5f5c645
      Igor Canadi committed
      Summary: @MarkCallaghan's tests indicate that performance with 8k rows in the memtable is much worse than with an empty memtable. I wanted to add a regression test that measures this effect, so we could optimize it. However, the current config shows 634461 QPS on my devbox. Mark, any idea why this is so much faster than your measurements?
      
      Test Plan: Ran the regression test.
      
      Reviewers: MarkCallaghan, dhruba, haobo
      
      Reviewed By: MarkCallaghan
      
      CC: leveldb, MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D14511
      f5f5c645
    • S
      Introduce MergeContext to Lazily Initialize merge operand list · a8029fdc
      Siying Dong committed
      Summary: In get operations, merge_operands is used in only a few cases. Lazily initializing it can reduce average latency in some cases.
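
The idea of deferring allocation of the operand list until a merge operand actually shows up can be sketched as follows. This is a minimal illustration, not the MergeContext class from the commit: the class name `LazyOperandList` and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>

// Hypothetical stand-in for a lazily initialized merge operand holder.
class LazyOperandList {
 public:
  // Allocates the underlying deque only on the first push, so lookups
  // that never see a merge operand pay no allocation cost.
  void PushOperand(const std::string& operand) {
    if (!operands_) {
      operands_.reset(new std::deque<std::string>());
    }
    operands_->push_back(operand);
  }

  bool allocated() const { return operands_ != nullptr; }
  std::size_t size() const { return operands_ ? operands_->size() : 0; }

  const std::string& at(std::size_t i) const {
    assert(operands_ && i < operands_->size());
    return (*operands_)[i];
  }

 private:
  std::unique_ptr<std::deque<std::string>> operands_;  // null until needed
};
```

A lookup that finds a plain value never calls `PushOperand()`, so `allocated()` stays false for the common case.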
      
      Test Plan: make all check
      
      Reviewers: haobo, kailiu, dhruba
      
      Reviewed By: haobo
      
      CC: igor, nkg-, leveldb
      
      Differential Revision: https://reviews.facebook.net/D14415
      
      Conflicts:
      	db/db_impl.cc
      	db/memtable.cc
      a8029fdc
    • S
      [RocksDB Performance Branch] Avoid sorting in Version::Get() by presorting... · bc5dd19b
      Siying Dong committed
      [RocksDB Performance Branch] Avoid sorting in Version::Get() by presorting them in VersionSet::Builder::SaveTo()
      
      Summary: Pre-sort files in VersionSet::Builder::SaveTo() so that Version::Get() does not need to sort them when looking up a value. This avoids the cost of vector operations and sorting in Version::Get().
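
A minimal sketch of the "sort once at build time, never on the read path" idea follows. The types are hypothetical and simplified, and the newest-first ordering by file number is an assumption made for illustration; the commit message does not state the sort key.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified file metadata.
struct FileMeta {
  uint64_t number;
  std::string smallest_key;
};

class FileList {
 public:
  // Build step: sort once, newest file first
  // (assumption: a higher file number means a newer file).
  explicit FileList(std::vector<FileMeta> files) : files_(std::move(files)) {
    assert(files_.size() < SIZE_MAX);
    std::sort(files_.begin(), files_.end(),
              [](const FileMeta& a, const FileMeta& b) {
                return a.number > b.number;
              });
  }

  // Lookup path: callers iterate in stored order, with no per-call
  // copy or sort.
  const std::vector<FileMeta>& files() const { return files_; }

 private:
  std::vector<FileMeta> files_;
};
```

The cost of sorting moves to the point where the file set changes, which happens far less often than reads.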
      
      Test Plan: make all check
      
      Reviewers: haobo, kailiu, dhruba
      
      Reviewed By: dhruba
      
      CC: nkg-, igor, leveldb
      
      Differential Revision: https://reviews.facebook.net/D14409
      bc5dd19b
    • S
      When flushing mem tables, create iterators out of mutex · 0304e3d2
      Siying Dong committed
      Summary:
      Creating new iterators over memtables can be expensive, so move their creation out of the mutex.
      The mems vector in DBImpl::WriteLevel0Table() appears to be a local vector used only by flushing, and the memtables being flushed are immutable, so this should be safe.
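
The pattern described here, copying cheap handles under the lock and doing the expensive construction after releasing it, can be sketched like this. All names (`FlushQueue`, `MemTableIterator`, the `MemTable` alias) are hypothetical stand-ins, not the types in db_impl.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

using MemTable = std::vector<std::string>;  // stand-in for an immutable memtable

// Stand-in for an iterator that is costly to construct.
struct MemTableIterator {
  explicit MemTableIterator(std::shared_ptr<const MemTable> m)
      : mem(std::move(m)) {}
  std::shared_ptr<const MemTable> mem;
  std::size_t pos = 0;
};

class FlushQueue {
 public:
  void Add(std::shared_ptr<const MemTable> m) {
    assert(m != nullptr);
    std::lock_guard<std::mutex> lock(mu_);
    pending_.push_back(std::move(m));
  }

  std::vector<MemTableIterator> MakeIterators() {
    std::vector<std::shared_ptr<const MemTable>> mems;
    {
      // Only cheap pointer copies happen while the mutex is held.
      std::lock_guard<std::mutex> lock(mu_);
      mems = pending_;
    }
    // Iterator construction runs outside the lock. This is safe here
    // because the memtables are immutable and kept alive by shared_ptr.
    std::vector<MemTableIterator> iters;
    iters.reserve(mems.size());
    for (const auto& m : mems) iters.emplace_back(m);
    return iters;
  }

 private:
  std::mutex mu_;
  std::vector<std::shared_ptr<const MemTable>> pending_;
};
```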
      
      Test Plan: make all check
      
      Reviewers: haobo, dhruba, kailiu
      
      Reviewed By: dhruba
      
      CC: igor, leveldb
      
      Differential Revision: https://reviews.facebook.net/D14577
      
      Conflicts:
      	db/db_impl.cc
      0304e3d2
    • I
      [RocksDB perf] Cache speedup · e8d40c31
      Igor Canadi committed
      Summary:
      I ran a Get benchmark where all the data is in the cache and observed that most of the time is spent waiting for the lock in LRUCache.
      
      This is an effort to optimize LRUCache.
      
      Test Plan:
      The data was loaded with fillseq. Then, I ran a benchmark:
      
          /db_bench --db=/tmp/rocksdb_stat_bench --num=1000000 --benchmarks=readrandom --statistics=1 --use_existing_db=1 --threads=16 --disable_seek_compaction=1 --cache_size=20000000000 --cache_numshardbits=8 --table_cache_numshardbits=8
      
      I ran the benchmark three times. Here are the results:
      AFTER THE PATCH: 798072, 803998, 811807
      BEFORE THE PATCH: 782008, 815593, 763017
      
      Reviewers: dhruba, haobo, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14571
      e8d40c31
  3. 11 Dec, 2013 8 commits
  4. 10 Dec, 2013 5 commits
    • D
      Rename leveldb to rocksdb in C api · 6c4e110c
      Doğan Çeçen committed
      6c4e110c
    • D
      Fix shared lib build · f6012ab8
      Doğan Çeçen committed
      f6012ab8
    • I
      Fix unused variable warning · 784e62f9
      Igor Canadi committed
      784e62f9
    • I
      [RocksDB] BackupableDB · fb9fce4f
      Igor Canadi committed
      Summary:
      In this diff I present BackupableDB v1. You can use it to back up your DB, and it will take incremental snapshots for you.
      Let's first describe how you would use BackupableDB. It inherits the StackableDB interface, so you can construct it with your DB object -- it adds a method RollTheSnapshot() to the DB object. When you call RollTheSnapshot(), the current snapshot of the DB is stored in the backup dir. To restore, call RestoreDBFromBackup() on a BackupableDB (it is a static method) and it will restore all files from the backup dir. The next version will also support automatic backups every X minutes.
      
      There are multiple things you can configure:
      1. backup_env and db_env can be different, which is awesome because then you can easily back up to HDFS or wherever you feel like.
      2. sync - if true, it *guarantees* backup consistency on machine reboot
      3. number of snapshots to keep - this keeps the last N snapshots around in case you want to restore from an earlier snapshot. All backups are incremental - if we already have 00010.sst, we will not copy it again. *IMPORTANT* -- This relies on the assumption that 00010.sst never changes - two files named 00010.sst from the same DB will always be exactly the same. Is this true? I always copy manifest, current and log files.
      4. You can decide whether to flush the memtables before you back up, or whether you're fine with backing up the log files -- either way, you get a complete and consistent view of the database at the time of backup.
      5. More options can be found in BackupableDBOptions
      
      Here is the directory structure I use:
      
         backup_dir/CURRENT_SNAPSHOT - just 4 bytes holding the latest snapshot
                     0, 1, 2, ... - files containing serialized version of each snapshot - containing a list of files
                     files/*.sst - sst files shared between snapshots - if one snapshot references 00010.sst and another one needs to back it up from the DB, it will just reference the same file
                     files/ 0/, 1/, 2/, ... - snapshot directories containing private snapshot files - current, manifest and log files
      
      All the files are ref counted and deleted immediately when they go out of scope.
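
The sharing and ref counting described above can be sketched as a small bookkeeping table. This is only an illustration of the scheme, not BackupableDB's implementation: `BackupFileTable` and its methods are hypothetical, and real file copying and deletion are replaced by returned lists of file names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical bookkeeping for files shared between snapshots.
class BackupFileTable {
 public:
  // Records a snapshot and returns the files that actually need copying,
  // i.e. those not already referenced by an earlier snapshot.
  std::vector<std::string> AddSnapshot(int id,
                                       const std::vector<std::string>& files) {
    assert(snapshots_.count(id) == 0);
    std::vector<std::string> to_copy;
    for (const auto& f : files) {
      if (refs_[f]++ == 0) to_copy.push_back(f);  // first reference: copy
    }
    snapshots_[id] = files;
    return to_copy;
  }

  // Drops a snapshot and returns the files whose ref count reached zero,
  // which can be deleted right away.
  std::vector<std::string> DeleteSnapshot(int id) {
    std::vector<std::string> to_delete;
    auto it = snapshots_.find(id);
    if (it == snapshots_.end()) return to_delete;
    for (const auto& f : it->second) {
      auto r = refs_.find(f);
      if (--(r->second) == 0) {
        to_delete.push_back(f);
        refs_.erase(r);
      }
    }
    snapshots_.erase(it);
    return to_delete;
  }

 private:
  std::map<std::string, int> refs_;                     // file -> ref count
  std::map<int, std::vector<std::string>> snapshots_;   // snapshot -> files
};
```

If snapshot 0 holds 00010.sst and 00011.sst, and snapshot 1 holds 00010.sst and 00012.sst, only 00012.sst is copied for snapshot 1, and deleting snapshot 0 frees only 00011.sst.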
      
      Some other stuff in this diff:
      1. Added GetEnv() method to the DB. Discussed with @haobo and we agreed that it seems like the right thing to do.
      2. Fixed StackableDB interface. The way it was set up before, I was not able to implement BackupableDB.
      
      Test Plan:
      I have a unittest, but please don't look at this yet. I just hacked it up to help me with debugging. I will write a lot of good tests and update the diff.
      
      Also, `make asan_check`
      
      Reviewers: dhruba, haobo, emayanke
      
      Reviewed By: dhruba
      
      CC: leveldb, haobo
      
      Differential Revision: https://reviews.facebook.net/D14295
      fb9fce4f
    • I
      Fixing git branch detection in Jenkins · 26bc40a8
      Igor Canadi committed
      Branch detection did not work in Jenkins. I realized that it set
      GIT_BRANCH env variable to point to the current branch, so let's try
      using this for branch detection.
      26bc40a8
  5. 07 Dec, 2013 3 commits
    • I
      Print stack trace on assertion failure · 9644e0e0
      Igor Canadi committed
      Summary:
      This will help me a lot! When we hit an assertion in a unit test, we now get the whole stack trace.
      
      Also, I changed the stack trace a bit: it now includes the actual demangled C++ class::function symbols!
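
For readers unfamiliar with demangling, the sketch below shows one common way to turn a mangled symbol into a readable name, using `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang). It is an illustration only; the commit message does not say which mechanism the patch uses, and the helper name `Demangle` is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium-ABI mangled symbol,
// or the input unchanged if demangling fails.
inline std::string Demangle(const char* mangled) {
  assert(mangled != nullptr);
  int status = 0;
  // With null buffer arguments, __cxa_demangle allocates the result
  // with malloc; the caller must free it.
  char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) return mangled;
  std::string result(out);
  std::free(out);
  return result;
}
```

For example, `Demangle("_ZN3Foo3barEv")` yields `Foo::bar()`.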
      
      Test Plan: Added ASSERT_TRUE(false) to a test, observed a stack trace
      
      Reviewers: haobo, dhruba, kailiu
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14499
      9644e0e0
    • I
      Enable regression tests to be run on other branches · 07c84488
      Igor Canadi committed
      Summary: When running regression tests on other branches, this will push values to entity rocksdb_build.$git_branch
      
      Test Plan: Ran regression test on regression branch, observed values sent to ODS in entity rocksdb_build.regression
      
      Reviewers: kailiu
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14493
      07c84488
    • I
      Make DBWithTTL more like StackableDB · 0a5ec498
      Igor Canadi committed
      Summary: Now DBWithTTL takes DB* and can behave more like StackableDB. This saves us a lot of duplicate work defining interfaces.
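
The "stackable" approach, where a wrapper takes a pointer to a base object, forwards everything by default, and overrides only what it changes, can be sketched as below. The classes are hypothetical and far smaller than the real DB interface; `CountingStore` merely stands in for a layer such as DBWithTTL.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical minimal key-value interface.
class KVStore {
 public:
  virtual ~KVStore() {}
  virtual void Put(const std::string& k, const std::string& v) = 0;
  virtual bool Get(const std::string& k, std::string* v) const = 0;
};

// Simple in-memory base store.
class MapStore : public KVStore {
 public:
  void Put(const std::string& k, const std::string& v) override {
    data_[k] = v;
  }
  bool Get(const std::string& k, std::string* v) const override {
    auto it = data_.find(k);
    if (it == data_.end()) return false;
    *v = it->second;
    return true;
  }

 private:
  std::map<std::string, std::string> data_;
};

// Forwards every call to the wrapped store, so layers built on top
// do not have to re-declare the whole interface.
class StackableStore : public KVStore {
 public:
  explicit StackableStore(KVStore* base) : base_(base) {
    assert(base_ != nullptr);
  }
  void Put(const std::string& k, const std::string& v) override {
    base_->Put(k, v);
  }
  bool Get(const std::string& k, std::string* v) const override {
    return base_->Get(k, v);
  }

 protected:
  KVStore* base_;  // not owned
};

// A layer that overrides only the method it changes.
class CountingStore : public StackableStore {
 public:
  using StackableStore::StackableStore;
  void Put(const std::string& k, const std::string& v) override {
    ++puts_;
    StackableStore::Put(k, v);
  }
  int puts() const { return puts_; }

 private:
  int puts_ = 0;
};
```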
      
      Test Plan: ttl_test with ASAN - OK
      
      Reviewers: emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14481
      0a5ec498
  6. 06 Dec, 2013 2 commits
  7. 05 Dec, 2013 2 commits
  8. 04 Dec, 2013 4 commits
    • M
      Add compression options to db_bench · 97aa401e
      Mark Callaghan committed
      Summary:
      This adds 2 options for compression to db_bench:
      * universal_compression_size_percent
      * compression_level - to set zlib compression level
      It also logs compression_size_percent at startup in LOG
      
      Test Plan:
      make check, run db_bench
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14439
      97aa401e
    • S
      [rocksdb] statistics counters for memtable hits and misses · 28a1b9b9
      Sajal Jain committed
      Summary:
      Added counters:
      rocksdb.memtable.hit - for memtable hit
      rocksdb.memtable.miss - for memtable miss
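
How such hit and miss counters might be recorded on a lookup can be sketched as follows. Only the two counter strings come from the commit message; the enum values, the `Stats` class and `GetFromMemtable()` are hypothetical names used for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical ticker ids; only the string names below are from the commit.
enum Ticker { MEMTABLE_HIT = 0, MEMTABLE_MISS, TICKER_MAX };

inline const char* TickerName(Ticker t) {
  switch (t) {
    case MEMTABLE_HIT:  return "rocksdb.memtable.hit";
    case MEMTABLE_MISS: return "rocksdb.memtable.miss";
    default:            return "unknown";
  }
}

// Thread-safe counters, one per ticker.
class Stats {
 public:
  Stats() {
    for (auto& c : counts_) c.store(0);
  }
  void Record(Ticker t) {
    assert(t < TICKER_MAX);
    counts_[t].fetch_add(1, std::memory_order_relaxed);
  }
  uint64_t Get(Ticker t) const {
    return counts_[t].load(std::memory_order_relaxed);
  }

 private:
  std::atomic<uint64_t> counts_[TICKER_MAX];
};

// Looks up a key in a stand-in memtable and records a hit or a miss.
inline bool GetFromMemtable(const std::map<std::string, std::string>& mem,
                            const std::string& key, std::string* value,
                            Stats* stats) {
  assert(value != nullptr && stats != nullptr);
  auto it = mem.find(key);
  if (it == mem.end()) {
    stats->Record(MEMTABLE_MISS);
    return false;
  }
  *value = it->second;
  stats->Record(MEMTABLE_HIT);
  return true;
}
```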
      
      Test Plan: db_bench tests
      
      Reviewers: igor, dhruba, haobo
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D14433
      28a1b9b9
    • I
      Killing Transform Rep · eb12e47e
      Igor Canadi committed
      Summary:
      Let's get rid of TransformRep and its children. We have confirmed that HashSkipListRep works better with multifeed, so there is no benefit to keeping this around.
      
      This diff is mostly just deleting references to obsolete functions. I also have a diff for fbcode that we'll need to push when we switch to the new release.
      
      I had to expose HashSkipListRepFactory in the client header files because db_impl.cc needs access to the GetTransform() function for SanitizeOptions.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14397
      eb12e47e
    • I
      Get rid of some shared_ptrs · 043fc14c
      Igor Canadi committed
      Summary:
      I went through all remaining shared_ptrs and removed the ones that I found unnecessary. Only GenerateCachePrefix() is called fairly often, so don't expect big perf wins.
      
      The ones that are left are accessed infrequently and I think we're fine with keeping them.
      
      Test Plan: make asan_check
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14427
      043fc14c
  9. 03 Dec, 2013 2 commits
  10. 02 Dec, 2013 3 commits
  11. 30 Nov, 2013 1 commit
  12. 29 Nov, 2013 1 commit
  13. 28 Nov, 2013 1 commit