1. 06 Aug, 2015 2 commits
    • A
      startBgsaveForReplication(): log what you really do. · ce5761e0
      antirez committed
      ce5761e0
    • A
      Replication: add REPLCONF CAPA EOF support. · 3e6d4d59
      antirez committed
      Add the concept of slave capabilities to Redis. The slave now presents
      itself to the Redis master with a set of capabilities in the form:
      
          REPLCONF capa SOMECAPA capa OTHERCAPA ...
      
      This has the effect of setting slave->slave_capa with the corresponding
      SLAVE_CAPA macros, which the master can test later to understand whether
      the slave will understand certain formats and protocols of the
      replication process. This makes it much simpler to introduce new
      replication capabilities in the future in a way that doesn't break old
      slaves or masters.
      
      This patch was designed and implemented together with Oran Agra
      (@oranagra).
      3e6d4d59
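
The capability negotiation described in this commit can be sketched in a few lines. This is only an illustrative Python model, not the Redis C code: the bit value given to `SLAVE_CAPA_EOF`, the `KNOWN_CAPAS` table and the `parse_replconf()` helper are assumptions made for the example, as is the choice to silently ignore unknown capability names.

```python
# Hypothetical capability bits. Only the EOF capability is named in the commit
# title; the numeric value chosen here is an assumption.
SLAVE_CAPA_NONE = 0
SLAVE_CAPA_EOF = 1 << 0

# Assumed lookup table from capability name to bit.
KNOWN_CAPAS = {"eof": SLAVE_CAPA_EOF}


def parse_replconf(args, slave_capa=SLAVE_CAPA_NONE):
    """Fold 'capa NAME' pairs from a REPLCONF argument list into a bitmask.

    Unknown capability names are ignored in this sketch, so a newer slave
    can still announce itself to a master that knows fewer capabilities.
    """
    if len(args) % 2 != 0:
        raise ValueError("REPLCONF expects option/value pairs")
    for i in range(0, len(args), 2):
        option, value = args[i].lower(), args[i + 1].lower()
        if option == "capa":
            slave_capa |= KNOWN_CAPAS.get(value, 0)
    return slave_capa


def slave_understands_eof(slave_capa):
    """The kind of test a master could run later, before picking a format."""
    return bool(slave_capa & SLAVE_CAPA_EOF)
```

Keeping the capabilities in a bitmask means the later check is a single AND, and adding a new capability only requires a new bit and a new table entry.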
  2. 05 Aug, 2015 4 commits
    • A
      Fix replication slave pings period. · 55ba7727
      antirez committed
      For PINGs we use the period configured by the user, but for the newlines
      sent to slaves waiting for an RDB to be created (including slaves waiting
      for the FULLRESYNC reply) we need to ping once per second, since
      the timeout is fixed and needs to be refreshed.
      55ba7727
    • A
      Make sure we re-emit SELECT after each new slave full sync setup. · 15de6b10
      antirez committed
      In previous commits we moved the FULLRESYNC to the moment we start the
      BGSAVE, so that the offset we provide is the right one. However this
      also means that we need to re-emit the SELECT statement every time a new
      slave starts to accumulate the changes.
      
      To obtain this effect in a cleaner way, the function that sends the
      FULLRESYNC reply was given the more important role of also doing
      this and changing the slave state. So it was renamed to
      replicationSetupSlaveForFullResync() to better reflect what it does now.
      15de6b10
    • A
      Don't send SELECT to slaves in WAIT_BGSAVE_START state. · a5a06a8e
      antirez committed
      a5a06a8e
    • A
      syncCommand() comments improved. · 62b5c60e
      antirez committed
      62b5c60e
  3. 04 Aug, 2015 1 commit
    • A
      PSYNC initial offset fix. · 292fec05
      antirez committed
      This commit attempts to fix a bug involving PSYNC and diskless
      replication (currently experimental) found by Yuval Inbar from Redis Labs
      and that was later found to have even more far-reaching effects (the bug also
      exists when diskless replication is off).
      
      The gist of the bug is that a Redis master replies with +FULLRESYNC to
      a PSYNC attempt that fails and requires a full resynchronization.
      However, the baseline offset sent along with FULLRESYNC was always the
      current master replication offset. This is not OK, because there are
      many reasons that may delay the RDB file creation. And... guess what,
      the master offset we communicate must be the one at the time the RDB
      was created. So for example:
      
      1) When the BGSAVE for replication is delayed because one is already
         in progress but is not usable for replication.
      2) When the BGSAVE is not needed because we attach to one that is currently ongoing.
      3) When the BGSAVE is delayed because of diskless replication.
      
      In all the above cases the PSYNC reply is wrong and the slave may
      reconnect later claiming to need a wrong offset: this may cause
      data corruption later.
      292fec05
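
The offset mismatch described in this commit can be reproduced with a toy model. The sketch below is an assumption-laden Python illustration, not Redis code: the `Master` and `Slave` classes, the `reply_immediately` switch and the `run()` helper are all hypothetical names introduced for the example.

```python
class Slave:
    def __init__(self):
        # Offset announced to this slave along with +FULLRESYNC.
        self.fullresync_offset = None


class Master:
    def __init__(self):
        self.repl_offset = 0  # current master replication offset
        self.waiting = []     # slaves waiting for a BGSAVE to start

    def propagate_write(self, nbytes):
        self.repl_offset += nbytes

    def psync_failed(self, slave, reply_immediately):
        if reply_immediately:
            # Old behavior: announce the offset at the time of the PSYNC.
            slave.fullresync_offset = self.repl_offset
        self.waiting.append(slave)

    def start_bgsave(self):
        # The RDB reflects the dataset at this exact offset.
        rdb_offset = self.repl_offset
        for slave in self.waiting:
            if slave.fullresync_offset is None:
                # Fixed behavior: announce the offset when the BGSAVE starts.
                slave.fullresync_offset = rdb_offset
        self.waiting.clear()
        return rdb_offset


def run(reply_immediately):
    master, slave = Master(), Slave()
    master.psync_failed(slave, reply_immediately)
    master.propagate_write(100)  # writes arrive while the BGSAVE is delayed
    rdb_offset = master.start_bgsave()
    return slave.fullresync_offset, rdb_offset
```

In this model `run(True)` returns `(0, 100)`: the slave believes it is at offset 0 while the RDB it loads already contains the first 100 bytes of writes. `run(False)` returns `(100, 100)`, so the announced offset and the RDB contents agree.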
  4. 28 Jul, 2015 2 commits
    • A
      Force slaves to resync after unsuccessful PSYNC. · c1e94b6b
      antirez committed
      Using chained replication where C is slave of B which is in turn slave of
      A, if B reconnects the replication link with A but discovers it is no
      longer possible to PSYNC, slaves of B must be disconnected and PSYNC
      not allowed, since the new B dataset may be completely different after
      the synchronization with the master.
      
      Note that there are various semantic differences in the way this is
      handled now compared to the past. In the past the semantics were:
      
      1. When a slave lost connection with its master, it disconnected the chained
      slaves ASAP. This is not needed, since after a successful PSYNC with the
      master the slaves can continue and don't need to resync in turn.
      
      2. However, after a failed PSYNC the replication backlog was not reset, so a
      slave was able to PSYNC successfully even if the instance did a full
      sync with its master and now contained an entirely different data set.
      
      Now instead chained slaves are not disconnected when the slave loses the
      connection with its master, but only when it is forced to full SYNC with
      its master. This means that if the slave having chained slaves does a
      successful PSYNC, all its slaves can continue without trouble.
      
      See issue #2694 for more details.
      c1e94b6b
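
The new semantics for chained slaves can be summarized as a small state model. This is a hedged Python sketch rather than the actual implementation: the `Instance` class, its method names and the `backlog_usable` flag are hypothetical and only mirror the rules stated in the commit message.

```python
class Instance:
    """Models B in the chain A -> B -> C."""

    def __init__(self, chained_slaves):
        self.chained_slaves = list(chained_slaves)
        # Whether chained slaves may still PSYNC against this instance.
        self.backlog_usable = True

    def on_master_link_lost(self):
        # New semantics: losing the link alone does not disconnect anybody,
        # because a later successful PSYNC lets everyone continue.
        pass

    def on_psync_reply(self, continued):
        if continued:
            # Successful PSYNC: chained slaves keep their connections.
            return
        # Forced full sync: the dataset may now be completely different,
        # so chained slaves are dropped and must not PSYNC afterwards.
        self.chained_slaves.clear()
        self.backlog_usable = False
```

For example, an instance created with `Instance(["C"])` keeps `"C"` attached after `on_master_link_lost()` and after `on_psync_reply(True)`, and only drops it after `on_psync_reply(False)`.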
    • A
      278ea9d1
  5. 27 Jul, 2015 2 commits
  6. 26 Jul, 2015 5 commits
  7. 11 Jun, 2015 1 commit
    • A
      Use best effort address binding to connect to the master · 8366907b
      antirez committed
      We usually want to reach the master using the address of the interface
      Redis is bound to (via the "bind" config option). That's useful since
      the master will get (and publish) the slave address by getting the peer
      name of the incoming socket connection from the slave.
      
      However, when this is not possible, for example because the slave is
      bound to the loopback interface but replicates from a master accessed via
      an external interface, we still want to connect to the master even
      from a different interface: in this case it is not really important that
      the master will provide any other address, while it is vital to be able
      to replicate correctly.
      
      Related to issues #2609 and #2612.
      8366907b
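
The "best effort" idea in this commit is to try the preferred source address first and fall back to an unbound connection if that fails. The Python sketch below shows the pattern with the standard `socket.create_connection()` call; the `connect_best_effort()` helper and its `connect` parameter (injected so the fallback can be exercised without a network) are assumptions for illustration and do not correspond to Redis function names.

```python
import socket


def connect_best_effort(host, port, bind_addr=None, timeout=5.0,
                        connect=socket.create_connection):
    """Connect to (host, port), preferring bind_addr as the source address.

    If binding to bind_addr makes the connection fail, retry without any
    source address so that replication can still proceed.
    """
    if bind_addr is not None:
        try:
            # Port 0 lets the OS pick the local port for the bound address.
            return connect((host, port), timeout=timeout,
                           source_address=(bind_addr, 0))
        except OSError:
            pass  # best effort only: fall through to the unbound attempt
    return connect((host, port), timeout=timeout)
```

The trade-off is the one the commit message names: after a fallback the master may see a different peer address than the one the slave is bound to, which is accepted in exchange for being able to replicate at all.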
  8. 01 Apr, 2015 2 commits
    • A
      Net: improve prepareClientToWrite() error handling and comments. · 6c60526d
      antirez committed
      When we fail to set up the write handler it does not make sense to keep
      the client around, since it is missing writes: whether it is a client or
      a slave, the connection should be terminated ASAP.
      
      Moreover, what the function does exactly with its return value, and in
      which case the write handler is installed on the socket, was not clear,
      so the function's comments were improved to make the goals of the function
      more obvious.
      
      Also related to #2485.
      6c60526d
    • O
      fixes to diskless replication. · 159875b5
      Oran Agra committed
      The master was closing the connection if the RDB transfer took a long time.
      It also sent PINGs to the slave before it got the initial ACK, in which case the slave wouldn't be able to find the EOF marker.
      159875b5
  9. 24 Mar, 2015 1 commit
    • A
      Replication: disconnect blocked clients when switching to slave role. · c3ad7090
      antirez committed
      Bug as old as Redis and blocking operations. It's hard to trigger since
      it only happens on instance role switch, but the results are quite bad
      since an inconsistency between master and slave is created.
      
      How to trigger the bug is a good description of the bug itself.
      
      1. Client does "BLPOP mylist 0" in master.
      2. Master is turned into slave, that replicates from New-Master.
      3. Client does "LPUSH mylist foo" in New-Master.
      4. New-Master propagates write to slave.
      5. Slave receives the LPUSH, the blocked client gets served.
      
      Now Master "mylist" key has "foo", Slave "mylist" key is empty.
      
      Highlights:
      
      * At step "2" above, the client remains attached, basically escaping any
        check performed during command dispatch: read only slave, in that case.
      * At step "5" the slave (that was the master), serves the blocked client
        consuming a list element, which is not consumed on the master side.
      
      This scenario is technically likely to happen during failovers, however
      since Redis Sentinel already disconnects clients using the CLIENT
      command when changing the role of the instance, the bug is avoided in
      Sentinel deployments.
      
      Closes #2473.
      c3ad7090
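
The five steps listed in this commit can be replayed with a small simulation. The Python model below is a sketch under stated assumptions: `Instance`, `become_slave()` and `role_switch_scenario()` are hypothetical names, and list and blocking semantics are reduced to the minimum needed to show the inconsistency.

```python
from collections import deque


class Instance:
    def __init__(self):
        self.lists = {}    # key -> deque of values
        self.blocked = []  # (client_inbox, key) pairs blocked in BLPOP

    def blpop(self, client_inbox, key):
        items = self.lists.get(key)
        if items:
            client_inbox.append(items.popleft())
        else:
            self.blocked.append((client_inbox, key))

    def lpush(self, key, value):
        items = self.lists.setdefault(key, deque())
        items.appendleft(value)
        # Serve clients blocked on this key, consuming list elements.
        for entry in list(self.blocked):
            inbox, blocked_key = entry
            if blocked_key == key and items:
                inbox.append(items.popleft())
                self.blocked.remove(entry)

    def become_slave(self, disconnect_blocked):
        if disconnect_blocked:
            # The fix: blocked clients are dropped on the role switch.
            self.blocked.clear()


def role_switch_scenario(disconnect_blocked):
    old_master, new_master = Instance(), Instance()
    inbox = []
    old_master.blpop(inbox, "mylist")            # step 1
    old_master.become_slave(disconnect_blocked)  # step 2
    new_master.lpush("mylist", "foo")            # step 3
    old_master.lpush("mylist", "foo")            # steps 4 and 5 (propagated write)
    return list(new_master.lists["mylist"]), list(old_master.lists["mylist"])
```

Without the fix, `role_switch_scenario(False)` returns `(['foo'], [])`: the new master holds the element while the slave has already handed it to the blocked client. With the fix, `role_switch_scenario(True)` returns `(['foo'], ['foo'])`.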
  10. 04 Feb, 2015 1 commit
  11. 12 Dec, 2014 1 commit
  12. 11 Dec, 2014 1 commit
  13. 03 Dec, 2014 1 commit
    • A
      Network bandwidth tracking + refactoring. · 1b732c09
      antirez committed
      Track bandwidth used by clients and replication (but diskless
      replication is not tracked since the actual transfer happens in the
      child process).
      
      This includes a refactoring that makes tracking new instantaneous
      metrics simpler.
      1b732c09
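
One common way to track an instantaneous metric such as bytes per second is to sample a running total periodically and average the last few per-interval rates. The Python sketch below illustrates that general technique only; the `InstMetric` class, the sample count and the millisecond clock are assumptions, and the commit message does not say that Redis implements it this way.

```python
class InstMetric:
    """Averages the last N per-interval rates of a monotonically growing counter."""

    def __init__(self, samples=16):
        self.samples = [0.0] * samples  # ring buffer of rates (units per second)
        self.idx = 0
        self.last_time_ms = None
        self.last_count = 0

    def track(self, now_ms, total_count):
        """Record a sample, given the current time and the running total."""
        if self.last_time_ms is not None:
            elapsed_ms = now_ms - self.last_time_ms
            if elapsed_ms > 0:
                rate = (total_count - self.last_count) * 1000.0 / elapsed_ms
                self.samples[self.idx] = rate
                self.idx = (self.idx + 1) % len(self.samples)
        self.last_time_ms = now_ms
        self.last_count = total_count

    def average(self):
        return sum(self.samples) / len(self.samples)
```

With one such object per metric (for example one for input bytes and one for output bytes), adding a new instantaneous metric only means creating another instance and calling `track()` from the same periodic timer.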
  14. 12 Nov, 2014 1 commit
    • A
      Diskless SYNC: fix RDB EOF detection. · bb7fea0d
      antirez committed
      RDB EOF detection was relying on the final part of the RDB transfer
      being a magic 40-byte EOF marker. However, as the slave is put online
      immediately, and because of socket timeouts, the replication stream is
      actually contiguous with the RDB file.
      
      This means that to detect the EOF correctly we should either:
      
      1) Scan all the stream searching for the mark. Sucks CPU-wise.
      2) Start to send the replication stream only after an acknowledgment.
      3) Implement a proper chunked encoding.
      
      For now solution "2" was picked, so in the case of diskless replication
      the master does not start to send the stream of commands ASAP. We wait
      for the first REPLCONF ACK command from the slave, which certifies
      that the slave correctly loaded the RDB file and is ready to get more
      data.
      bb7fea0d
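
The reason a tail-only check breaks when the replication stream follows the RDB immediately can be shown with a short sketch. This Python model is illustrative only: the `TailEOFDetector` class and the sample marker are hypothetical, and the only fact taken from the commit message is that the marker is 40 bytes long and expected at the very end of the transfer.

```python
EOF_MARK_LEN = 40


class TailEOFDetector:
    """Reports EOF when the bytes received so far END with the marker."""

    def __init__(self, marker):
        if len(marker) != EOF_MARK_LEN:
            raise ValueError("marker must be 40 bytes long")
        self.marker = marker
        self.tail = b""

    def feed(self, chunk):
        # Keep only the last 40 bytes seen: cheap, no scan of the payload.
        self.tail = (self.tail + chunk)[-EOF_MARK_LEN:]
        return self.tail == self.marker


def detects_eof(stream_after_rdb):
    marker = b"#" * EOF_MARK_LEN  # made-up marker for the example
    detector = TailEOFDetector(marker)
    return detector.feed(b"rdb-payload" + marker + stream_after_rdb)
```

In this model `detects_eof(b"")` is true, while `detects_eof(b"PING\r\n")` is false because the marker is no longer at the tail of what was read. Holding the command stream back until the slave's first REPLCONF ACK, as solution "2" does, guarantees the first situation.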
  15. 11 Nov, 2014 1 commit
  16. 30 Oct, 2014 1 commit
    • M
      Networking: add more outbound IP binding fixes · 0014966c
      Matt Stancliff committed
      Same as the original bind fixes (we just missed these the
      first time around).
      
      This helps Redis not automatically send
      connections from the first IP on an interface if we are bound
      to a specific IP address (e.g. with multiple IP aliases on one
      interface, you want to send from _your_ IP, not from the first IP
      on the interface).
      0014966c
  17. 29 Oct, 2014 1 commit
    • A
      Diskless replication: missing listRewind() added. · 9ec22d92
      antirez committed
      This caused BGSAVE to be triggered a second time without any need when
      we switch from socket to disk target via the command
      
          CONFIG SET repl-diskless-sync no
      
      and there is already a slave waiting for the BGSAVE to start.
      Comments were also clarified about what is happening.
      9ec22d92
  18. 27 Oct, 2014 4 commits
  19. 24 Oct, 2014 1 commit
  20. 17 Oct, 2014 6 commits
  21. 16 Oct, 2014 1 commit