1. 31 Jan, 2014 3 commits
    • Enable iterating column families with a concurrent writer · 9ca638a8
      Igor Canadi committed
      Summary:
      Sometimes we iterate through column families and unlock the mutex in the body of the iteration. While the mutex is unlocked, a column family might be created or dropped. We need to be able to continue iterating through column families even if the current column family is dropped.
      
      This diff implements a circular linked list that connects all column families, and then uses that list to iterate through them. Even if a column family is dropped, its next_ pointer can still be used to advance to another live column family.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15603
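The idea in the summary above (a dropped column family keeps its next_ pointer so an iterator standing on it can still move on) can be sketched as follows. This is a minimal illustration under stated assumptions, not the diff's code: `CfNode`, `CfList`, `Add`, `Drop`, `First` and `NextAlive` are hypothetical names, locking is omitted, and a dropped node is assumed to be kept alive (for example by a reference count) until no iterator points at it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified node; not RocksDB's actual ColumnFamilyData.
struct CfNode {
  std::string name;
  bool dropped = false;
  CfNode* next = this;  // a lone node forms a one-element ring
  CfNode* prev = this;
  explicit CfNode(std::string n) : name(std::move(n)) {}
};

// Circular doubly linked list with a dummy head node.
class CfList {
 public:
  CfList() : head_("dummy") {}

  // Link n in just before the dummy head, i.e. at the tail of the ring.
  void Add(CfNode* n) {
    assert(n != nullptr);
    n->prev = head_.prev;
    n->next = &head_;
    head_.prev->next = n;
    head_.prev = n;
  }

  // Unlink n from its neighbours but leave n->next untouched, so an
  // iterator currently positioned on n can still advance.
  void Drop(CfNode* n) {
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->dropped = true;
  }

  // First live node, or nullptr if the list is empty.
  CfNode* First() { return NextAlive(&head_); }

  // Next live node after n (n itself may already be dropped),
  // or nullptr once the iteration wraps around to the dummy head.
  CfNode* NextAlive(CfNode* n) {
    CfNode* cur = n->next;
    while (cur != &head_ && cur->dropped) cur = cur->next;
    return cur == &head_ ? nullptr : cur;
  }

 private:
  CfNode head_;
};
```

One limitation of this sketch: a node added at the tail after the iterator's current node was dropped may be missed if the dropped node's stale next pointer already leads to the head.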
    • MakeRoomForWrite() support for column families · 6973bb17
      Igor Canadi committed
      Summary: Making room for write will be the hardest part of the column family implementation. For now, I just iterate through all column families and run MakeRoomForWrite() for every one.
      
      Test Plan: make check does not complain
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15597
    • Merge branch 'master' into columnfamilies · c37e7de6
      Igor Canadi committed
      Conflicts:
      	db/db_impl.cc
      	db/db_impl.h
  2. 30 Jan, 2014 14 commits
    • InternalStatistics · 3c0dcf0e
      Igor Canadi committed
      Summary:
      In DBImpl we keep track of some statistics internally and expose them via GetProperty(). This diff encapsulates all the internal statistics into a class InternalStatistics. Most of it is copy/paste.
      
      Apart from cleaning up db_impl.cc, this diff is also necessary for column families, since every column family should have its own CompactionStats, MakeRoomForWrite-stall stats, etc. It's much easier to keep track of them in every column family if they are nicely encapsulated in their own class.
      
      Test Plan: make check
      
      Reviewers: dhruba, kailiu, haobo, sdong, emayanke
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15273
    • set bg_error_ when background flush goes wrong · d118707f
      Lei Jin committed
      Summary: as title
      
      Test Plan: unit test
      
      Reviewers: haobo, igor, sdong, kailiu, dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15435
    • Fix some lint warnings · 514e42c7
      Igor Canadi committed
    • Change ColumnFamilyData from struct to class · fa99d53e
      Igor Canadi committed
      Summary: ColumnFamilyData has grown a lot and now holds much more data. It makes more sense to encapsulate it better by making it a class.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15579
    • convert Tickers back to array with padding and alignment · fb84c49a
      Lei Jin committed
      Summary:
      Pad each Ticker structure to 64 bytes and align it on a 64-byte
      boundary to avoid cache-line false sharing.
      Please refer to task 3615553 for more details.
      
      Test Plan:
      db_bench
      
      LevelDB:    version 2.0s
      Date:       Wed Jan 29 12:23:17 2014
      CPU:        32 * Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
      CPUCache:   20480 KB
      rocksdb.build.overwrite.qps 49638
      rocksdb.build.overwrite.p50_micros 58.73
      rocksdb.build.overwrite.p75_micros 210.56
      rocksdb.build.overwrite.p99_micros 733.28
      rocksdb.build.fillseq.qps 366729
      rocksdb.build.fillseq.p50_micros 1.00
      rocksdb.build.fillseq.p75_micros 1.00
      rocksdb.build.fillseq.p99_micros 2.65
      rocksdb.build.readrandom.qps 1152995
      rocksdb.build.readrandom.p50_micros 11.27
      rocksdb.build.readrandom.p75_micros 15.69
      rocksdb.build.readrandom.p99_micros 33.59
      rocksdb.build.readrandom_smallblockcache.qps 956047
      rocksdb.build.readrandom_smallblockcache.p50_micros 15.23
      rocksdb.build.readrandom_smallblockcache.p75_micros 17.31
      rocksdb.build.readrandom_smallblockcache.p99_micros 31.49
      rocksdb.build.readrandom_memtable_sst.qps 1105183
      rocksdb.build.readrandom_memtable_sst.p50_micros 12.04
      rocksdb.build.readrandom_memtable_sst.p75_micros 15.78
      rocksdb.build.readrandom_memtable_sst.p99_micros 32.49
      rocksdb.build.readrandom_fillunique_random.qps 487856
      rocksdb.build.readrandom_fillunique_random.p50_micros 29.65
      rocksdb.build.readrandom_fillunique_random.p75_micros 40.93
      rocksdb.build.readrandom_fillunique_random.p99_micros 78.68
      rocksdb.build.memtablefillrandom.qps 91304
      rocksdb.build.memtablefillrandom.p50_micros 171.05
      rocksdb.build.memtablefillrandom.p75_micros 196.12
      rocksdb.build.memtablefillrandom.p99_micros 291.73
      rocksdb.build.memtablereadrandom.qps 1340411
      rocksdb.build.memtablereadrandom.p50_micros 9.48
      rocksdb.build.memtablereadrandom.p75_micros 13.95
      rocksdb.build.memtablereadrandom.p99_micros 30.36
      rocksdb.build.readwhilewriting.qps 491004
      rocksdb.build.readwhilewriting.p50_micros 29.58
      rocksdb.build.readwhilewriting.p75_micros 40.34
      rocksdb.build.readwhilewriting.p99_micros 76.78
      
      Reviewers: igor, haobo
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15573
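The padding and alignment described in the summary above might look roughly like this. It is a sketch under stated assumptions rather than the diff's code: `PaddedTicker`, `TickerArray`, `kNumTickers` and the accessor names are hypothetical, and a 64-byte cache line is assumed as in the summary.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLineSize = 64;  // assumption: 64-byte cache lines

// Each ticker occupies exactly one cache line, so two threads bumping
// different tickers never write to the same line (no false sharing).
struct alignas(kCacheLineSize) PaddedTicker {
  std::atomic<std::uint64_t> value{0};
  char padding[kCacheLineSize - sizeof(std::atomic<std::uint64_t>)];
};

static_assert(sizeof(PaddedTicker) == kCacheLineSize,
              "ticker must fill exactly one cache line");
static_assert(alignof(PaddedTicker) == kCacheLineSize,
              "ticker must start on a cache-line boundary");

constexpr std::size_t kNumTickers = 4;  // hypothetical ticker count

struct TickerArray {
  PaddedTicker tickers[kNumTickers];

  // Relaxed ordering is enough for independent counters.
  void Record(std::size_t i, std::uint64_t n) {
    tickers[i].value.fetch_add(n, std::memory_order_relaxed);
  }
  std::uint64_t Get(std::size_t i) const {
    return tickers[i].value.load(std::memory_order_relaxed);
  }
};
```

The explicit padding member is redundant once `alignas` is applied, since the compiler rounds the size up to the alignment anyway, but it keeps the intent visible and the `static_assert`s guard both properties.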
    • Fix column family test (create directory) · 15999e72
      Igor Canadi committed
    • PurgeObsoleteFiles in DropColumnFamily · 4662969b
      Igor Canadi committed
      Summary: When we drop the column family, we want to delete all the files from that column family.
      
      Test Plan: make check
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15561
    • Merge branch 'master' into columnfamilies · 20b231d7
      Igor Canadi committed
    • Read from and write to different column families · f24a3ee5
      Igor Canadi committed
      Summary: This one is big. It adds the ability to write to and read from different column families (see the unit test). It also supports recovery of different column families from the log, which was the hardest part to reason about. We need to make sure never to delete a log file that has unflushed data from any column family. To support that, I added another concept, versions_->MinLogNumber().
      
      Test Plan: Added a unit test in column_family_test
      
      Reviewers: dhruba, haobo, sdong, kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15537
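One way to read the MinLogNumber() idea from the summary above is as a minimum over per-column-family log numbers. The sketch below rests on that assumption: the summary does not show the implementation, and `ColumnFamilyLogState`, its `log_number` field and `CanDeleteLog` are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical per-column-family state: the oldest log file that may
// still contain unflushed data for this column family.
struct ColumnFamilyLogState {
  std::uint64_t log_number;
};

// Smallest log number still needed by any column family.
// Returns the maximum uint64_t value when there are no column families.
std::uint64_t MinLogNumber(const std::vector<ColumnFamilyLogState>& cfs) {
  std::uint64_t min_log = std::numeric_limits<std::uint64_t>::max();
  for (const auto& cf : cfs) {
    min_log = std::min(min_log, cf.log_number);
  }
  return min_log;
}

// A log file is safe to delete only if every column family has already
// flushed everything it wrote to that file.
bool CanDeleteLog(std::uint64_t log_file_number,
                  const std::vector<ColumnFamilyLogState>& cfs) {
  return log_file_number < MinLogNumber(cfs);
}
```

With column families at log numbers 7, 4 and 9, log file 3 could be removed, while log file 4 has to stay because one column family may still have unflushed data in it.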
    • LIBNAME in Makefile is not really configurable · b7db2411
      Kai Liu committed
      Summary:
      In the new third-party release tool, `LIBNAME=<customized_library> make`
      does not actually change LIBNAME.
      
      However, it is odd that the same approach works with the old third-party
      release tools. I checked a previous rocksdb version, and both librocksdb.a
      and librocksdb_debug.a were correctly generated and copied to the
      right place.
      
      Test Plan:
      `LIBNAME=hello make -j32` generates hello.a
      `make -j32` generates librocksdb.a
      
      Reviewers: igor, sdong, haobo, dhruba
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15555
    • Canonicalize "RocksDB" in make_new_version.sh · b1874af8
      kailiu committed
      Summary: Change all occurrences of "rocksdb" to its canonical form "RocksDB".
      
      Test Plan: N/A
      
      Reviewers: igor
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15549
    • Improve make_new_version.sh · c9eef784
      kailiu committed
    • Installation instructions for CentOS · 9a597dc6
      Igor Canadi committed
    • add include <atomic> to version_set.h · e57f0cc1
      Igor Canadi committed
  3. 29 Jan, 2014 7 commits
    • Add history log and revise script · 9fe60d50
      kailiu committed
      Summary:
      * Add a change log for rocksdb releases.
      * Remove the hacky parts of make_new_version.sh, which are either
        no longer useful or will be done in our dedicated 3rd-party release
        tool.
      
      Test Plan: N/A
      
      Reviewers: igor, haobo, sdong, dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15543
    • Merge branch 'master' into columnfamilies · c1071ed9
      Igor Canadi committed
    • only corrupt private file checksum in backupable_db_test · 9a126ba3
      Lei Jin committed
      Summary:
      If the test happens (randomly) to corrupt a shared file's checksum, the
      checksum will be inconsistent between meta files from different backups.
      BackupEngine will then detect this issue and fail. In reality this
      does not happen, since the checksum is checked on every backup. So here
      we corrupt only the checksum of a private file, which lets BackupEngine
      be constructed properly (but fail during restore).
      
      Test Plan: run test with valgrind
      
      Reviewers: igor
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15531
    • Only get the manifest file size if there is no error · 5d2c6282
      Igor Canadi committed
      Summary:
      I came across this while working on column families. CorruptionTest::RecoverWriteError raised a SIGSEGV because descriptor_log_->file() was nullptr. I'm not sure why it doesn't happen in master, but better safe than sorry.
      
      @kailiu, can we get this in release, too?
      
      Test Plan: make check
      
      Reviewers: kailiu, dhruba, haobo
      
      Reviewed By: haobo
      
      CC: leveldb, kailiu
      
      Differential Revision: https://reviews.facebook.net/D15513
    • Better interface to create BackupEngine · e5ec7384
      Igor Canadi committed
      Summary: I think it looks nicer. In RocksDB we have both styles, but I think that static method is the more common version.
      
      Test Plan: backupable_db_test
      
      Reviewers: ljin, benj, swk
      
      Reviewed By: ljin
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15519
    • Export BackupEngine · ec2fa4a6
      Igor Canadi committed
      Summary:
      Lots of clients have problems with using StackableDB interface. It's nice to have BackupableDB as a layer on top of DB, but not necessary.
      
      This diff exports BackupEngine, which can be used to create backups without forcing clients to use StackableDB interface.
      
      Test Plan: backupable_db_test
      
      Reviewers: dhruba, ljin, swk
      
      Reviewed By: ljin
      
      CC: leveldb, benj
      
      Differential Revision: https://reviews.facebook.net/D15477
    • add checksum for backup files · 9dc29414
      Lei Jin committed
      Summary: Keep the checksum of each backed-up file in the meta file. When restoring these files, compute their checksums on the fly and compare them against the values in the meta file. Fail the restore process if a checksum does not match.
      
      Test Plan: unit test
      
      Reviewers: haobo, igor, sdong, kailiu
      
      Reviewed By: igor
      
      CC: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D15381
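The record-then-verify flow described in the summary above can be sketched with in-memory strings standing in for files. Everything here is illustrative: `BackupMeta`, `RecordFile` and `VerifyOnRestore` are hypothetical names, and a plain bitwise CRC-32 is used as a stand-in because the summary does not say which checksum the diff uses.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320). A stand-in only;
// the checksum actually used by the diff is not stated in the summary.
std::uint32_t Crc32(const std::string& data) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : data) {
    crc ^= byte;
    for (int i = 0; i < 8; ++i) {
      // Subtracting from zero yields an all-ones mask when the low bit is set.
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

// Hypothetical meta file contents: file name -> checksum at backup time.
struct BackupMeta {
  std::map<std::string, std::uint32_t> checksums;
};

// At backup time, remember the checksum of each file that is copied.
void RecordFile(BackupMeta* meta, const std::string& name,
                const std::string& contents) {
  meta->checksums[name] = Crc32(contents);
}

// At restore time, recompute the checksum and compare it with the meta
// file. Returns false for unknown files or mismatching contents, in
// which case the caller should fail the restore.
bool VerifyOnRestore(const BackupMeta& meta, const std::string& name,
                     const std::string& contents) {
  auto it = meta.checksums.find(name);
  if (it == meta.checksums.end()) return false;
  return it->second == Crc32(contents);
}
```

A real implementation would compute the checksum incrementally while copying the file instead of holding the whole contents in memory.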
  4. 28 Jan, 2014 9 commits
    • [column families] Removing VersionSet::current() · 4bf25357
      Igor Canadi committed
      Summary: Instead of VersionSet::current(), DBImpl uses default_cfd_->current directly.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15483
    • Update monitoring to include average time per compaction and stall · 90f29ccb
      Mark Callaghan committed
      Summary:
      The new columns are msComp and msStall that provide average time per compaction and stall for that level in milliseconds.
      Level  Files Size(MB) Score Time(sec)  Read(MB) Write(MB)    Rn(MB)  Rnp1(MB)  Wnew(MB) RW-Amplify Read(MB/s) Write(MB/s)      Rn     Rnp1     Wnp1     NewW    Count   msComp   msStall  Ln-stall Stall-cnt
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        0        8       15   1.5         2         0        30         0         0        30        0.0       0.0        15.5        0        0        0        0       16      112       0.2       1.3      7568
        1        8       16   1.6         1        26        26        15        11        16        3.5      17.6        18.1        8        6       13        7        3      362       0.0       0.0         0
        2        1        2   0.0         0         0         2         0         0         2        0.0       0.0        18.4        0        0        0        0        1       50       0.0       0.0         0
      
      Test Plan:
      run db_bench
      
      Reviewers: haobo
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15345
    • Fix UnmarkEOF for partial blocks · 3d33da75
      Schalk-Willem Kruger committed
      Summary:
      Blocks in the transaction log are a fixed size, but the last block in the transaction log file is usually a partial block. When a new record is added after the reader hit the end of the file, a new physical record will be appended to the last block. ReadPhysicalRecord can only read full blocks and assumes that the file position indicator is aligned to the start of a block. If the reader is forced to read further by simply clearing the EOF flag, ReadPhysicalRecord will read a full block starting from somewhere in the middle of a real block, causing it to lose alignment and to have a partial physical record at the end of the read buffer. This will result in length mismatches and checksum failures. When the log file is tailed for replication this will cause the log iterator to become invalid, necessitating the creation of a new iterator which will have to read the log file from scratch.
      
      This diff fixes this issue by reading the remaining portion of the last block we read from. This is done when the reader is forced to read further (UnmarkEOF is called).
      
      Test Plan:
      - Added unit tests
      - Stress test (with replication). Check dbdir/LOG file for corruptions.
      - Test on test tier
      
      Reviewers: emayanke, haobo, dhruba
      
      Reviewed By: haobo
      
      CC: vamsi, sheki, dhruba, kailiu, igor
      
      Differential Revision: https://reviews.facebook.net/D15249
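The alignment arithmetic behind the fix described above can be sketched as follows. This is only an illustration, not the diff's code: `RemainingInBlock` is a hypothetical helper, and the 32 KB block size is an assumption based on the LevelDB-style log format.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: fixed log block size of 32 KB, as in the LevelDB-style format.
constexpr std::size_t kBlockSize = 32768;

// Given how many bytes of the log file the reader has consumed so far,
// return how many more bytes belong to the block it stopped in.
// Zero means the reader is already block-aligned and can keep reading
// full blocks; otherwise it must first read this remainder to realign.
std::size_t RemainingInBlock(std::uint64_t bytes_consumed) {
  const std::size_t used_in_block =
      static_cast<std::size_t>(bytes_consumed % kBlockSize);
  return used_in_block == 0 ? 0 : kBlockSize - used_in_block;
}
```

For example, a reader that hit end-of-file after 100 bytes would read at most 32668 further bytes to finish the partial block before returning to full-block reads.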
    • LogAndApply to take ColumnFamilyData · 511b03a5
      Igor Canadi committed
      Summary: This removes the default implementation of LogAndApply that applied the changes to the default column family. It is mostly simple reformatting.
      
      Test Plan: make check
      
      Reviewers: dhruba, kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15465
    • [column families] Move memtable and immutable memtable list to column family data · eb055609
      Igor Canadi committed
      Summary: All memtables and immutable memtables are moved from DBImpl to ColumnFamilyData. For now, they are all referenced from the default column family in DBImpl. It shouldn't be hard to get them from a custom column family.
      
      Test Plan: make check
      
      Reviewers: dhruba, kailiu, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15459
    • Merge branch 'master' into columnfamilies · ae16606f
      Igor Canadi committed
      Conflicts:
      	db/version_set.cc
      	db/version_set.h
    • Fsync directory after we create a new file · 832158e7
      Igor Canadi committed
      Summary:
      @dhruba, I'm not sure where we need to sync the directory. I implemented the function in Env() and added the dir sync just after we close the newly created file in the builder.
      
      Should I also add FsyncDir() to new files that get created by a compaction?
      
      Test Plan: Confirmed that FsyncDir is returning Status::OK()
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D14751
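On POSIX systems, syncing a directory after creating a file in it usually means opening the directory itself and calling fsync() on that descriptor. The sketch below shows that pattern only; it is not the Env function from the diff, and the free function `FsyncDir` with an errno-style return value is a hypothetical shape chosen for illustration.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cerrno>
#include <string>

// Flush the directory entry metadata for `dir` to stable storage.
// Returns 0 on success, or the errno value describing the failure.
int FsyncDir(const std::string& dir) {
  // A directory can be opened read-only just to obtain a descriptor.
  const int fd = open(dir.c_str(), O_RDONLY);
  if (fd < 0) return errno;

  // Capture errno before close(), which may overwrite it.
  const int err = (fsync(fd) == 0) ? 0 : errno;
  close(fd);
  return err;
}
```

The file's own data still has to be synced separately; the directory sync only makes the new directory entry durable.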
    • Merge branch 'master' into columnfamilies · cf783c67
      Igor Canadi committed
      Conflicts:
      	db/version_set.h
    • Move NeedsCompaction() from VersionSet to Version · 6c2ca1d3
      Igor Canadi committed
      Summary: There is no reason to have the functions NeedsCompaction(), MaxCompactionScore() and MaxCompactionScoreLevel() in VersionSet, since they don't access any data in VersionSet.
      
      Test Plan: make check
      
      Reviewers: kailiu, haobo, sdong
      
      Reviewed By: kailiu
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15333
  5. 27 Jan, 2014 1 commit
  6. 26 Jan, 2014 1 commit
  7. 25 Jan, 2014 5 commits