-
由 Philipp Reisner 提交于
There was a race condition: In a situation with a SyncSource+Primary and a SyncTarget+Secondary node, and a resync dependency to some other device. After both nodes decided to do the resync, the other device finishes its resync process. At that time SyncSource already sent the P_SYNC_UUID packet, and already updated its peer disk state to Inconsistent. The SyncTarget node waits for the P_SYNC_UUID and sends a state packet to report the resync dependency change. That packet still carries a disk state of Outdated. Impact: If application writes come in, during that time on the Primary node, those do not get replicated, and the out-of-sync counter gets increased. => The completion of resync is not detected on the primary node. => stalled. Those blocks get resync'ed with the next resync, since the are get marked as out-of-sync in the bitmap. In order to fix this, we filter out that wrong state change in the sanitize_state() function. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
e4f925e1