- 12 11月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: This patch clarifies and refactors the logic around tracked keys in transactions. Closes https://github.com/facebook/rocksdb/pull/3140 Differential Revision: D6290258 Pulled By: maysamyabandeh fbshipit-source-id: 03b50646264cbcc550813c060b180fc7451a55c1
-
- 02 11月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: GetCommitTimeWriteBatch is currently used to store some state as part of commit in 2PC. In MyRocks it is specifically used to store some data that would be needed only during recovery. So it is not need to be stored in memtable right after each commit. This patch enables an optimization to write the GetCommitTimeWriteBatch only to the WAL. The batch will be written to memtable during recovery when the WAL is replayed. To cover the case when WAL is deleted after memtable flush, the batch is also buffered and written to memtable right before each memtable flush. Closes https://github.com/facebook/rocksdb/pull/3071 Differential Revision: D6148023 Pulled By: maysamyabandeh fbshipit-source-id: 2d09bae5565abe2017c0327421010d5c0d55eaa7
-
- 03 10月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: Implement the rollback of WritePrepared txns. For each modified value, it reads the value before the txn and write it back. This would cancel out the effect of transaction. It also remove the rolled back txn from prepared heap. Closes https://github.com/facebook/rocksdb/pull/2946 Differential Revision: D5937575 Pulled By: maysamyabandeh fbshipit-source-id: a6d3c47f44db3729f44b287a80f97d08dc4e888d
-
- 14 9月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: We had two proposals for lock-free commit maps. This patch implements the latter one that was simpler. We can later experiment with both proposals. In this impl each entry is an std::atomic of uint64_t, which are accessed via memory_order_acquire/release. In x86_64 arch this is compiled to simple reads and writes from memory. Closes https://github.com/facebook/rocksdb/pull/2861 Differential Revision: D5800724 Pulled By: maysamyabandeh fbshipit-source-id: 41abae9a4a5df050a8eb696c43de11c2770afdda
-
- 09 9月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: Implements CommitBatch and CommitWithoutPrepare for WritePreparedTxn Closes https://github.com/facebook/rocksdb/pull/2854 Differential Revision: D5793999 Pulled By: maysamyabandeh fbshipit-source-id: d8b9858221162c6ac7a1f6912cbd3481d0d8a503
-
- 17 8月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: Implement the main body of WritePrepared pseudo code. This includes PrepareInternal and CommitInternal, as well as AddCommitted which updates the commit map. It also provides a IsInSnapshot method that could be later called form the read path to decide if a version is in the read snapshot or it should other be skipped. This patch lacks unit tests and does not attempt to offer an efficient implementation. The idea is that to have the API specified so that we can work on related tasks in parallel. Closes https://github.com/facebook/rocksdb/pull/2713 Differential Revision: D5640021 Pulled By: maysamyabandeh fbshipit-source-id: bfa7a05e8d8498811fab714ce4b9c21530514e1c
-
- 08 8月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: This patch splits Commit and Prepare into lock-related logic and db-write-related logic. It moves lock-related logic to PessimisticTransaction to be reused by all children classes and movies the existing impl of db-write-related to PrepareInternal, CommitSingleInternal, and CommitInternal in WriteCommittedTxnImpl. Closes https://github.com/facebook/rocksdb/pull/2691 Differential Revision: D5569464 Pulled By: maysamyabandeh fbshipit-source-id: d1b8698e69801a4126c7bc211745d05c636f5325
-
- 06 8月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: This opens space for the new implementations of TransactionDBImpl such as WritePreparedTxnDBImpl that has a different policy of how to write to DB. Closes https://github.com/facebook/rocksdb/pull/2689 Differential Revision: D5568918 Pulled By: maysamyabandeh fbshipit-source-id: f7eac866e175daf3793ae79da108f65cc7dc7b25
-
- 03 8月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: This patch refactors TransactionImpl by separating the logic for pessimistic concurrency control from the implementation of how to write the data to rocksdb. The existing implementation is named WriteCommittedTxnImpl as it writes committed data to the db. A template named WritePreparedTxnImpl is also added which will be later completed to provide a an alternative implementation. Closes https://github.com/facebook/rocksdb/pull/2676 Differential Revision: D5549998 Pulled By: maysamyabandeh fbshipit-source-id: 16298e86b43ca4849324c1f35c731913c6d17bec
-
- 29 7月, 2017 1 次提交
-
-
由 Siying Dong 提交于
Summary: Replace dynamic_cast<> so that users can choose to build with RTTI off, so that they can save several bytes per object, and get tiny more memory available. Some nontrivial changes: 1. Add Comparator::GetRootComparator() to get around the internal comparator hack 2. Add the two experiemental functions to DB 3. Add TableFactory::GetOptionString() to avoid unnecessary casting to get the option string 4. Since 3 is done, move the parsing option functions for table factory to table factory files too, to be symmetric. Closes https://github.com/facebook/rocksdb/pull/2645 Differential Revision: D5502723 Pulled By: siying fbshipit-source-id: fd13cec5601cf68a554d87bfcf056f2ffa5fbf7c
-
- 22 7月, 2017 2 次提交
-
-
由 Sagar Vemuri 提交于
Summary: This reverts the previous commit 1d7048c5, which broke the build. Did a `git revert 1d7048c5`. Closes https://github.com/facebook/rocksdb/pull/2627 Differential Revision: D5476473 Pulled By: sagar0 fbshipit-source-id: 4756ff5c0dfc88c17eceb00e02c36176de728d06
-
由 Victor Gao 提交于
Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually. Reviewed By: igorsugak Differential Revision: D5454343 fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2
-
- 16 7月, 2017 1 次提交
-
-
由 Siying Dong 提交于
Summary: Closes https://github.com/facebook/rocksdb/pull/2589 Differential Revision: D5431502 Pulled By: siying fbshipit-source-id: 8ebf8c87883daa9daa54b2303d11ce01ab1f6f75
-
- 25 6月, 2017 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: Throughput: 46k tps in our sysbench settings (filling the details later) The idea is to have the simplest change that gives us a reasonable boost in 2PC throughput. Major design changes: 1. The WAL file internal buffer is not flushed after each write. Instead it is flushed before critical operations (WAL copy via fs) or when FlushWAL is called by MySQL. Flushing the WAL buffer is also protected via mutex_. 2. Use two sequence numbers: last seq, and last seq for write. Last seq is the last visible sequence number for reads. Last seq for write is the next sequence number that should be used to write to WAL/memtable. This allows to have a memtable write be in parallel to WAL writes. 3. BatchGroup is not used for writes. This means that we can have parallel writers which changes a major assumption in the code base. To accommodate for that i) allow only 1 WriteImpl that intends to write to memtable via mem_mutex_--which is fine since in 2PC almost all of the memtable writes come via group commit phase which is serial anyway, ii) make all the parts in the code base that assumed to be the only writer (via EnterUnbatched) to also acquire mem_mutex_, iii) stat updates are protected via a stat_mutex_. Note: the first commit has the approach figured out but is not clean. Submitting the PR anyway to get the early feedback on the approach. If we are ok with the approach I will go ahead with this updates: 0) Rebase with Yi's pipelining changes 1) Currently batching is disabled by default to make sure that it will be consistent with all unit tests. Will make this optional via a config. 2) A couple of unit tests are disabled. They need to be updated with the serial commit of 2PC taken into account. 3) Replacing BatchGroup with mem_mutex_ got a bit ugly as it requires releasing mutex_ beforehand (the same way EnterUnbatched does). This needs to be cleaned up. Closes https://github.com/facebook/rocksdb/pull/2345 Differential Revision: D5210732 Pulled By: maysamyabandeh fbshipit-source-id: 78653bd95a35cd1e831e555e0e57bdfd695355a4
-
- 28 4月, 2017 1 次提交
-
-
由 Siying Dong 提交于
Summary: Closes https://github.com/facebook/rocksdb/pull/2226 Differential Revision: D4967547 Pulled By: siying fbshipit-source-id: dd3b58ae1e7a106ab6bb6f37ab5c88575b125ab4
-
- 11 4月, 2017 2 次提交
-
-
由 Manuel Ung 提交于
Summary: Upgrading a shared lock was silently succeeding because the actual locking code was skipped. This is because if the keys are tracked, it is assumed that they are already locked and do not require locking. Fix this by recording in tracked keys whether the key was locked exclusively or not. Note that lock downgrades are impossible, which is the behaviour we expect. This fixes facebook/mysql-5.6#587. Closes https://github.com/facebook/rocksdb/pull/2122 Differential Revision: D4861489 Pulled By: IslamAbdelRahman fbshipit-source-id: 58c7ebe7af098bf01b9774b666d3e9867747d8fd
-
由 Manuel Ung 提交于
Summary: Extend TransactionOptions to include max_write_batch_size which determines the maximum size of the writebatch representation. If memory limit is exceeded, the operation will abort with subcode kMemoryLimit. Closes https://github.com/facebook/rocksdb/pull/2124 Differential Revision: D4861842 Pulled By: lth fbshipit-source-id: 46fd172ea67cc90bbba829bf0d70cfab2261c161
-
- 06 12月, 2016 1 次提交
-
-
由 Manuel Ung 提交于
Summary: This is an implementation of non-exclusive locks for pessimistic transactions. It is relatively simple and does not prevent starvation (ie. it's possible that request for exclusive access will never be granted if there are always threads holding shared access). It is done by changing `KeyLockInfo` to hold an set a transaction ids, instead of just one, and adding a flag specifying whether this lock is currently held with exclusive access or not. Some implementation notes: - Some lock diagnostic functions had to be updated to return a set of transaction ids for a given lock, eg. `GetWaitingTxn` and `GetLockStatusData`. - Deadlock detection is a bit more complicated since a transaction can now wait on multiple other transactions. A BFS is done in this case, and deadlock detection depth is now just a limit on the number of transactions we visit. - Expirable transactions do not work efficiently with shared locks at the moment, but that's okay for now. Closes https://github.com/facebook/rocksdb/pull/1573 Differential Revision: D4239097 Pulled By: lth fbshipit-source-id: da7c074
-
- 20 10月, 2016 1 次提交
-
-
由 Manuel Ung 提交于
Summary: Implement deadlock detection. This is done by maintaining a TxnID -> TxnID map which represents the edges in the wait for graph (this is named `wait_txn_map_`). Test Plan: transaction_test Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64491
-
- 08 10月, 2016 2 次提交
-
-
由 Reid Horuff 提交于
Summary: This exposes a transactions state through a public api rather than through a public member variable. I also do some name refactoring. ExecutionStatus => TransactionState exec_status_ => trx_state_ Test Plan: It compiles and transaction_test passes. Reviewers: IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, mung, dhruba, sdong Differential Revision: https://reviews.facebook.net/D64689
-
由 Reid Horuff 提交于
Summary: When constructing a write batch a client may now call MarkWalTerminationPoint() on that batch. No batch operations after this call will be added written to the WAL but will still be inserted into the Memtable. This facility is used to remove one of the three WriteImpl calls in 2PC transactions. This produces a ~1% perf improvement. ``` RocksDB - unoptimized 2pc, sync_binlog=1, disable_2pc=off INFO 2016-08-31 14:30:38,814 [main]: REQUEST PHASE COMPLETED. 75000000 requests done in 2619 seconds. Requests/second = 28628 RocksDB - optimized 2pc , sync_binlog=1, disable_2pc=off INFO 2016-08-31 16:26:59,442 [main]: REQUEST PHASE COMPLETED. 75000000 requests done in 2581 seconds. Requests/second = 29054 ``` Test Plan: Two unit tests added. Reviewers: sdong, yiwu, IslamAbdelRahman Reviewed By: yiwu Subscribers: hermanlee4, dhruba, andrewkr Differential Revision: https://reviews.facebook.net/D64599
-
- 01 10月, 2016 1 次提交
-
-
由 Manuel Ung 提交于
Summary: This diff does 3 things: Expose TransactionID so that we can identify transactions when we retrieve locking and lock wait information. This is exposed as `Transaction::GetID`. Expose lock state information by locking all stripes in all column families and copying their contents to a data structure. This is exposed as `TransactionDB::GetLockStatusData`. Adds support for tracking the transaction and the key being waited on, and exposes this as `Transaction::GetWaitingTxn`. Test Plan: unit tests Reviewers: horuff, sdong Reviewed By: sdong Subscribers: vasilep, hermanlee4, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64413
-
- 12 8月, 2016 1 次提交
-
-
由 Aaron Gao 提交于
Summary: make transactionDB working with StackableDB Test Plan: make all check -j64 Reviewers: andrewkr, yiwu, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D60705
-
- 18 5月, 2016 1 次提交
-
-
由 Reid Horuff 提交于
Summary: This tests that a prepared transaction is not lost after several crashes, restarts, and memtable flushes. Test Plan: TwoPhaseLongPrepareTest Reviewers: sdong Subscribers: hermanlee4, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D58185
-
- 11 5月, 2016 1 次提交
-
-
由 Reid Horuff 提交于
Summary: Two Phase Commit addition to RocksDB. See wiki: https://github.com/facebook/rocksdb/wiki/Two-Phase-Commit-Implementation Quip: https://fb.quip.com/pxZrAyrx53r3 Depends on: WriteBatch modification: https://reviews.facebook.net/D54093 Memtable Log Referencing and Prepared Batch Recovery: https://reviews.facebook.net/D56919 Test Plan: - SimpleTwoPhaseTransactionTest - PersistentTwoPhaseTransactionTest. - TwoPhaseRollbackTest - TwoPhaseMultiThreadTest - TwoPhaseLogRollingTest - TwoPhaseEmptyWriteTest - TwoPhaseExpirationTest Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: leveldb, hermanlee4, andrewkr, vasilep, dhruba, santoshb Differential Revision: https://reviews.facebook.net/D56925
-
- 08 3月, 2016 1 次提交
-
-
由 agiardullo 提交于
Summary: Previously, reusing a transaction (by passing it as an argument to BeginTransaction) would not clear the transaction's snapshot. This is not a clear, well-definited behavior. Test Plan: improved test Reviewers: sdong, IslamAbdelRahman, horuff, jkedgar Reviewed By: jkedgar Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D55053
-
- 01 3月, 2016 1 次提交
-
-
由 agiardullo 提交于
Summary: Add function to reinitialize a transaction object so that it can be reused. This is an optimization so users can potentially avoid reallocating transaction objects. Test Plan: added tests Reviewers: yhchiang, kradhakrishnan, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: jkedgar, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D53835
-
- 10 2月, 2016 2 次提交
-
-
由 Baraa Hamodi 提交于
-
由 agiardullo 提交于
Summary: MyRocks wants to be able to un-lock a key that was just locked by GetForUpdate(). To do this safely, I am now keeping track of the number of reads(for update) and writes for each key in a transaction. UndoGetForUpdate() will only unlock a key if it hasn't been written and the read count reaches 0. Test Plan: more unit tests Reviewers: igor, rven, yhchiang, spetrunia, sdong Reviewed By: spetrunia, sdong Subscribers: spetrunia, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47043
-
- 03 2月, 2016 1 次提交
-
-
Summary: Doing inline checking of transaction expiration instead of using a callback. Test Plan: To be added Reviewers: anthony Reviewed By: anthony Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D53673
-
- 29 1月, 2016 1 次提交
-
-
由 agiardullo 提交于
Summary: Removed a couple of memory allocations Test Plan: changes covered by existing tests Reviewers: rven, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D53523
-
- 12 12月, 2015 1 次提交
-
-
由 agiardullo 提交于
Summary: Currently, transactions can fail even if there is no actual write conflict. This is due to relying on only the memtables to check for write-conflicts. Users have to tune memtable settings to try to avoid this, but it's hard to figure out exactly how to tune these settings. With this diff, TransactionDB will use both memtables and SST files to determine if there are any write conflicts. This relies on the fact that BlockBasedTable stores sequence numbers for all writes that happen after any open snapshot. Also, D50295 is needed to prevent SingleDelete from disappearing writes (the TODOs in this test code will be fixed once the other diff is approved and merged). Note that Optimistic transactions will still rely on tuning memtable settings as we do not want to read from SST while on the write thread. Also, memtable settings can still be used to reduce how often TransactionDB needs to read SST files. Test Plan: unit tests, db bench Reviewers: rven, yhchiang, kradhakrishnan, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb, yoshinorim Differential Revision: https://reviews.facebook.net/D50475
-
- 10 10月, 2015 1 次提交
-
-
由 agiardullo 提交于
Summary: Support for Transaction::CreateSnapshotOnNextOperation(). This is to fix a write-conflict race-condition that Yoshinori was running into when testing MyRocks with LinkBench. Test Plan: New tests Reviewers: yhchiang, spetrunia, rven, igor, yoshinorim, sdong Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D48099
-
- 30 9月, 2015 1 次提交
-
-
由 agiardullo 提交于
Summary: Should have used auto& instead of auto. Also needed to change the code a bit due to const correctness. Test Plan: unit tests Reviewers: sdong, igor, yoshinorim, yhchiang Reviewed By: yhchiang Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47787
-
- 12 9月, 2015 1 次提交
-
-
由 agiardullo 提交于
Summary: Transaction::RollbackToSavePoint() will now release any locks that were taken since the previous SavePoint. To do this cleanly, I moved tracked_keys_ management into TransactionBase. Test Plan: New Transaction test. Reviewers: igor, rven, sdong Reviewed By: sdong Subscribers: dhruba, spetrunia, leveldb Differential Revision: https://reviews.facebook.net/D46761
-
- 10 9月, 2015 1 次提交
-
-
由 agiardullo 提交于
Summary: Added funtions to fetch the number of locked keys in a transaction, the number of pending puts/merge/deletes, and the elapsed time Test Plan: unit tests Reviewers: yoshinorim, jkedgar, rven, sdong, yhchiang, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D45417
-
- 09 9月, 2015 1 次提交
-
-
由 agiardullo 提交于
Summary: Prototype of API to allow MyRocks to override default Mutex/CondVar used by transactions with their own implementations. They would simply need to pass their own implementations of Mutex/CondVar to the templated TransactionDB::Open(). Default implementation of TransactionDBMutex/TransactionDBCondVar provided (but the code is not currently changed to use this). Let me know if this API makes sense or if it should be changed Test Plan: n/a Reviewers: yhchiang, rven, igor, sdong, spetrunia Reviewed By: spetrunia Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43761
-
- 25 8月, 2015 1 次提交
-
-
由 agiardullo 提交于
Summary: As I keep adding new features to transactions, I keep creating more duplicate code. This diff cleans this up by creating a base implementation class for Transaction and OptimisticTransaction to inherit from. The code in TransactionBase.h/.cc is all just copied from elsewhere. The only entertaining part of this class worth looking at is the virtual TryLock method which allows OptimisticTransactions and Transactions to share the same common code for Put/Get/etc. The rest of this diff is mostly red and easy on the eyes. Test Plan: No functionality change. existing tests pass. Reviewers: sdong, jkedgar, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D45135
-
- 12 8月, 2015 2 次提交
-
-
由 agiardullo 提交于
Summary: Clean up transactions to use the new RollbackToSavePoint api in WriteBatchWithIndex. Note, this diff depends on Pessimistic Transactions diff and ManagedSnapshot diff (D40869 and D43293). Test Plan: unit tests Reviewers: rven, yhchiang, kradhakrishnan, spetrunia, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43371
-
由 agiardullo 提交于
Summary: Initial implementation of Pessimistic Transactions. This diff contains the api changes discussed in D38913. This diff is pretty large, so let me know if people would prefer to meet up to discuss it. MyRocks folks: please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues. Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint(). After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex. We can then decide which route is preferable. Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing. Test Plan: Unit tests, db_bench parallel testing. Reviewers: igor, rven, sdong, yhchiang, yoshinorim Reviewed By: sdong Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D40869
-