    Fix PSYNC2 incomplete command bug as described in #3899. · a91cc5bc
    antirez committed
    This bug was discovered by @kevinmcgehee and constituted a major hidden
    bug in the PSYNC2 implementation, caused by the propagation of
    incomplete commands from the master to slaves.
    
    The bug had several results:
    
    1. Borrowing from Kevin's text in the issue: "Given that slaves blindly
    copy over their master's input into their own replication backlog over
    successive read syscalls, it's possible that with large commands or
    small TCP buffers, partial commands are present in this buffer. If the
    master were to fail before successfully propagating the entire command
    to a slave, the slaves will never execute the partial command (since the
    client is invalidated) but will copy it to replication backlog which may
    relay those invalid bytes to its slaves on PSYNC2, corrupting the
    backlog and possibly other valid commands that follow the failover.
    Simple command boundaries aren't sufficient to capture this, either,
    because in the case of a MULTI/EXEC block, if the master successfully
    propagates a subset of the commands but not the EXEC, then the
    transaction in the backlog becomes corrupt and could corrupt other
    slaves that consume this data."
    
    2. As identified later by @yangsiran, the bug has another effect.
    Through the same mechanism as the first problem, a slave that itself
    has a slave could receive a full resynchronization request while an
    already half-applied command is in the backlog. Once the RDB is ready,
    it is sent to the sub-slave, and replication then continues by sending
    the sub-slave the other half of the command, which is not valid.
    
    The fix, designed by @yangsiran and @antirez and implemented by
    @antirez, uses a secondary buffer in order to feed the sub-masters and
    update the replication backlog and offsets only when a given part of
    the query buffer is actually *applied* to the state of the instance,
    that is, when the command gets processed and is not pending in the
    Redis transaction buffer because of CLIENT_MULTI state.
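
The deferral described above can be illustrated with a minimal sketch. This is not the code from the commit: the struct and function names (`MasterLink`, `Replica`, `link_receive`, `link_process`) are hypothetical, and a toy newline-terminated protocol stands in for the real Redis protocol. It only shows the idea that bytes reach the backlog once the command they belong to is complete and is not queued inside a MULTI block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 256

/* Hypothetical, simplified stand-ins for the real Redis structures. */
typedef struct {
    char backlog[BUF_SIZE]; /* bytes that may be relayed to sub-slaves */
    size_t backlog_len;     /* also serves as the replication offset here */
} Replica;

typedef struct {
    char pending[BUF_SIZE]; /* secondary buffer: received, not yet applied */
    size_t pending_len;
} MasterLink;

/* Return 1 if the line of length len (newline included) is exactly `name`. */
int is_cmd(const char *p, size_t len, const char *name) {
    size_t n = strlen(name);
    return len == n + 1 && memcmp(p, name, n) == 0;
}

/* Store raw bytes read from the master. Nothing is propagated yet. */
void link_receive(MasterLink *l, const char *data, size_t len) {
    assert(len <= BUF_SIZE - l->pending_len);
    memcpy(l->pending + l->pending_len, data, len);
    l->pending_len += len;
}

/* Move to the backlog only the prefix of `pending` that has been applied:
 * complete commands that are not waiting inside a MULTI ... EXEC block. */
void link_process(MasterLink *l, Replica *r) {
    size_t pos = 0;     /* end of the last complete command scanned */
    size_t applied = 0; /* end of the last command that is safe to propagate */
    int in_multi = 0;   /* pending always starts outside a transaction */

    for (;;) {
        const char *start = l->pending + pos;
        const char *nl = memchr(start, '\n', l->pending_len - pos);
        if (nl == NULL) break; /* partial command: keep it pending */
        size_t cmd_len = (size_t)(nl - start) + 1;

        if (is_cmd(start, cmd_len, "MULTI")) in_multi = 1;
        else if (is_cmd(start, cmd_len, "EXEC")) in_multi = 0;

        pos += cmd_len;
        if (!in_multi) applied = pos; /* whole transaction goes at EXEC */
    }

    assert(applied <= BUF_SIZE - r->backlog_len);
    memcpy(r->backlog + r->backlog_len, l->pending, applied);
    r->backlog_len += applied;

    /* Keep the unapplied tail for the next read. */
    memmove(l->pending, l->pending + applied, l->pending_len - applied);
    l->pending_len -= applied;
}
```

In this sketch, if the master dies after sending half a command or a MULTI without its EXEC, those bytes stay in `pending` and never enter `backlog`, so a sub-slave reading the backlog sees only whole, applied commands.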
    
    Given that the backlog and offsets now agree with the commands that
    were actually processed, issues 1 and 2 should no longer be possible.
    
    Thanks to @kevinmcgehee, @yangsiran and @oranagra for their work in
    identifying and designing a fix for this problem.
server.h 87.4 KB