1. 12 Dec, 2018 3 commits
  2. 14 Nov, 2018 2 commits
  3. 31 Oct, 2018 3 commits
  4. 10 Oct, 2018 3 commits
  5. 04 Aug, 2018 1 commit
  6. 03 Aug, 2018 2 commits
    • A
      Set repl_down_since to zero on state change. · 677f7585
      antirez committed
      PR #5081 fixes an "interesting" bug in Redis Cluster failover that more
      generally concerns the updating of repl_down_since, which is used to
      count the time a slave was left disconnected from its master.
      
      While the fix provided resolves the specific issue, in general the
      validity of repl_down_since is limited to states that are different
      than the state CONNECTED, and the disconnected time is set when the
      state is DISCONNECTED. However from CONNECTED to other states, the state
      machine must always go to DISCONNECTED first. So it makes sense to set
      the field to zero (since it is meaningless in that context) when the
      state is set to CONNECTED.
      677f7585
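The rule described in this commit message can be sketched as a single state-setter that owns the timestamp. This is a minimal illustration, not the Redis source: the `link_state` enum, the `replica_link` struct and both function names are hypothetical, and the states are reduced to three.

```c
#include <time.h>

/* Hypothetical, simplified replication states (not the Redis definitions). */
typedef enum { LINK_DISCONNECTED, LINK_CONNECTING, LINK_CONNECTED } link_state;

typedef struct {
    link_state state;
    time_t down_since;   /* 0 means "not meaningful": the link is up */
} replica_link;

/* Single place where the state changes, so the timestamp stays consistent. */
static void link_set_state(replica_link *l, link_state next, time_t now) {
    if (next == LINK_CONNECTED) {
        l->down_since = 0;       /* meaningless while connected */
    } else if (next == LINK_DISCONNECTED && l->state == LINK_CONNECTED) {
        l->down_since = now;     /* start counting the outage */
    }
    l->state = next;
}

/* Seconds spent disconnected, or 0 while the link is up. */
static time_t link_down_seconds(const replica_link *l, time_t now) {
    return l->down_since ? now - l->down_since : 0;
}
```

Because every path out of CONNECTED passes through DISCONNECTED, the timestamp is always refreshed before it is read again, which is why zeroing it on CONNECTED is safe in this sketch.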
    • W
      fix server.repl_down_since resetting, so that slaves could failover · 8c6223f9
      WuYunlong committed
      automatically as expected.
      8c6223f9
  7. 24 Jul, 2018 1 commit
    • O
      fix rare replication stream corruption with disk-based replication · 9535c215
      Oran Agra committed
      The slave sends \n keepalive messages to the master while parsing the rdb,
      and later sends REPLCONF ACK once a second. rarely, the master receives both
      a linefeed char and a REPLCONF in the same read, \n*3\r\n$8\r\nREPLCONF\r\n...
      and it tries to trim two chars (\r\n) from the query buffer,
      trimming the '*' from *3\r\n$8\r\nREPLCONF\r\n...
      
      then the master tries to process a command starting with '3' and replies to
      the slave a bunch of -ERR and one +OK.
      although the slave silently ignores these (prints a log message), this corrupts
      the replication offset at the slave since the slave increases the replication
      offset, and the master did not.
      
      other than the fix in processInlineBuffer, i did several other improvements
      while hunting this very rare bug.
      
      - when redis replies with "unknown command" it includes a portion of the
        arguments, not just the command name. so it would be easier to understand
        what was received, in my case, on the slave side, it was -ERR, but
        the "arguments" were the interesting part (containing info on the error).
      - about a year ago i added code in addReplyErrorLength to print the error to
        the log in case of a reply to master (since this string isn't actually
        transmitted to the master), now changed that block to print a similar log
        message to indicate an error being sent from the master to the slave.
        note that the slave is marked as CLIENT_SLAVE only after PSYNC was received,
        so this will not cause any harm for REPLCONF, and will only indicate problems
        that are gonna corrupt the replication stream anyway.
      - two places where c->reply was emptied, and i wanted to reset sentlen
        this is a precaution (i did not actually see such a problem), since a
        non-zero sentlen will cause corruption to be transmitted on the socket.
      9535c215
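The trimming problem described in this commit message can be illustrated with a small helper that counts how many bytes the line terminator really occupies. This is a sketch under assumptions, not the code of processInlineBuffer: the function name `consume_inline_line` and its signature are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Returns the number of bytes consumed by the first line in buf (payload plus
 * terminator), or 0 if no complete line is available yet. *payload_len
 * receives the length of the line without its terminator. */
static size_t consume_inline_line(const char *buf, size_t len,
                                  size_t *payload_len) {
    const char *newline = memchr(buf, '\n', len);
    if (newline == NULL) return 0;

    size_t terminator = 1;                        /* the '\n' itself */
    if (newline != buf && newline[-1] == '\r') {  /* optional preceding '\r' */
        newline--;
        terminator = 2;
    }
    *payload_len = (size_t)(newline - buf);
    return *payload_len + terminator;
}
```

For the stream `\n*3\r\n$8\r\nREPLCONF\r\n` the helper consumes exactly one byte, so the `*` that starts the next command stays in the buffer. Always trimming two bytes would remove it, which is the corruption the message describes.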
  8. 29 Jun, 2018 3 commits
    • Z
      fix exists command on slave · 5f1fcc59
      zhaozhao.zz committed
      5f1fcc59
    • A
      Fix infinite loop in dbRandomKey(). · ab145a9f
      antirez committed
      Thanks to @kevinmcgehee for signaling the issue and reasoning about the
      consequences and potential fixes.
      
      Issue #5015.
      ab145a9f
    • A
      Sentinel: add an option to deny online script reconfiguration. · 2fa43ece
      antirez committed
      The ability of "SENTINEL SET" to change the reconfiguration script at
      runtime is a problem even in the security model of Redis: any client
      inside the network may set any executable to be run once a failover is
      triggered.
      
      This option adds protection for this problem: by default the two
      SENTINEL SET subcommands modifying scripts paths are denied. However the
      user is still able to revert that using the Sentinel configuration file
      in order to allow such a feature.
      2fa43ece
  9. 13 Jun, 2018 2 commits
  10. 01 Jun, 2018 1 commit
  11. 29 May, 2018 10 commits
  12. 23 May, 2018 3 commits
    • A
      Fix ae.c when a timer finalizerProc adds an event. · 17f5de89
      antirez committed
      While this feature is not used by Redis, ae.c implements the ability for
      a timer to call a finalizer callback when a timer event is deleted.
      This feature was bugged since the start, and because it was never used
      we never noticed a problem. However Anthony LaTorre was using the same
      library in order to implement a different system: he found a bug that he
      describes as follows, and which he fixed with the patch in this commit,
      sent to me by private email:
      
          --- Anthony email ---
      
      I've found one bug in the current implementation of the timed events.
      It's possible to lose track of a timed event if an event is added in
      the finalizerProc of another event.
      
      For example, suppose you start off with three timed events 1, 2, and
      3. Then the linked list looks like:
      
      3 -> 2 -> 1
      
      Then, you run processTimeEvents and events 2 and 3 finish, so now the
      list looks like:
      
      -1 -> -1 -> 2
      
      Now, on the next iteration of processTimeEvents it starts by deleting
      the first event, and suppose this finalizerProc creates a new event,
      so that the list looks like this:
      
      4 -> -1 -> 2
      
      On the next iteration of the while loop, when it gets to the second
      event, the variable prev is still set to NULL, so that the head of the
      event loop after the next event will be set to 2, i.e. after deleting
      the next event the event loop will look like:
      
      2
      
      and the event with id 4 will be lost.
      
      I've attached an example program to illustrate the issue. If you run
      it you will see that it prints:
      
      ```
      foo id = 0
      spam!
      ```
      
      But if you uncomment line 29 and run it again it won't print "spam!".
      
          --- End of email ---
      
      Test.c source code is as follows:
      
          #include "ae.h"
          #include <stdio.h>
      
          aeEventLoop *el;
      
          int foo(struct aeEventLoop *el, long long id, void *data)
          {
              printf("foo id = %lld\n", id);
      
              return AE_NOMORE;
          }
      
          int spam(struct aeEventLoop *el, long long id, void *data)
          {
              printf("spam!\n");
      
              return AE_NOMORE;
          }
      
          void bar(struct aeEventLoop *el, void *data)
          {
              aeCreateTimeEvent(el, 0, spam, NULL, NULL);
          }
      
          int main(int argc, char **argv)
          {
              el = aeCreateEventLoop(100);
      
              //aeCreateTimeEvent(el, 0, foo, NULL, NULL);
              aeCreateTimeEvent(el, 0, foo, NULL, bar);
      
              aeMain(el);
      
              return 0;
          }
      
      Anthony fixed the problem by using a linked list for the list of timers, and
      sent me back this patch after he tested the code in production for some time.
      The code looks sane to me, so committing it to Redis.
      17f5de89
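The lost-event scenario from the email can be reproduced without ae.c. The sketch below is a standalone model, not the patch itself: the `event` and `loop` types, the `spawn_on_free` field standing in for a finalizerProc, and all function names are hypothetical. It also assumes the fix amounts to giving each node its own `prev` pointer, so that unlinking no longer depends on a loop-local variable.

```c
#include <stdlib.h>

#define DELETED_ID (-1)   /* marks an event scheduled for removal */
#define SPAWNED_ID 4      /* id of the event created by the finalizer */

typedef struct event {
    long long id;
    struct event *prev, *next;
    int spawn_on_free;    /* hypothetical: the finalizer adds a new event */
} event;

typedef struct { event *head; } loop;

/* New events are pushed at the head, as in the email's example. */
static event *add_event(loop *l, long long id, int spawn_on_free) {
    event *e = calloc(1, sizeof(*e));
    if (e == NULL) abort();
    e->id = id;
    e->spawn_on_free = spawn_on_free;
    e->next = l->head;
    if (l->head) l->head->prev = e;
    l->head = e;
    return e;
}

/* Buggy sweep: uses a loop-local prev (and ignores the node's prev field),
 * so it goes stale when a finalizer pushes a new head. */
static void sweep_stale_prev(loop *l) {
    event *te = l->head, *prev = NULL;
    while (te) {
        if (te->id == DELETED_ID) {
            event *next = te->next;
            if (prev == NULL) l->head = next;   /* may overwrite a new head */
            else prev->next = next;
            if (te->spawn_on_free) add_event(l, SPAWNED_ID, 0);
            free(te);
            te = next;
            continue;
        }
        prev = te;
        te = te->next;
    }
}

/* Fixed sweep: unlinks through the node's own prev/next pointers. */
static void sweep_node_prev(loop *l) {
    event *te = l->head;
    while (te) {
        event *next = te->next;
        if (te->id == DELETED_ID) {
            if (te->prev) te->prev->next = te->next;
            else l->head = te->next;
            if (te->next) te->next->prev = te->prev;
            if (te->spawn_on_free) add_event(l, SPAWNED_ID, 0);
            free(te);
        }
        te = next;
    }
}

/* Returns 1 if an event with the given id is reachable from the head. */
static int contains(const loop *l, long long id) {
    for (const event *e = l->head; e; e = e->next)
        if (e->id == id) return 1;
    return 0;
}
```

Starting from the list `-1 -> -1 -> 2`, where the first deleted event spawns a new one, `sweep_stale_prev` leaves only event 2 reachable, while `sweep_node_prev` keeps both the spawned event and event 2.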
    • A
      Sentinel: fix delay in detecting ODOWN. · 266e6423
      antirez committed
      See issue #2819 for details. The gist is that when we want to send INFO
      because we are over the time, we used to send only INFO commands, no
      longer sending PING commands. However if a master fails exactly when we
      are about to send an INFO command, the PING times will result zero
      because the PONG reply was already received, and we'll fail to send more
      PINGs, since we try only to send INFO commands: the failure detector
      will delay until the connection is closed and re-opened for "long
      timeout".
      
      This commit changes the logic so that we can send the three kinds of
      messages regardless of the fact we sent another one already in the same
      code path. It could happen that we go over the message limit for the
      link by a few messages, but this is not significant. However now we'll
      not introduce delays in sending commands just because there was
      something else to send at the same time.
      266e6423
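The change in control flow described in this commit message can be sketched as the difference between an exclusive branch chain and independent checks. Everything here is illustrative: the `link_timers` struct, the bitmask names and both functions are hypothetical. The third message type is assumed to be a hello message, since the commit message names only INFO and PING.

```c
/* Hypothetical bitmask of the messages chosen in one pass over a link. */
enum { SEND_INFO = 1, SEND_PING = 2, SEND_HELLO = 4 };

typedef struct {
    long long now, last_info, last_ping, last_hello;   /* milliseconds */
    long long info_period, ping_period, hello_period;  /* milliseconds */
} link_timers;

/* Old behaviour as described above: the first branch taken hides the rest,
 * so an overdue INFO suppresses the PING in the same pass. */
static int due_exclusive(const link_timers *t) {
    if (t->now - t->last_info > t->info_period) return SEND_INFO;
    else if (t->now - t->last_ping > t->ping_period) return SEND_PING;
    else if (t->now - t->last_hello > t->hello_period) return SEND_HELLO;
    return 0;
}

/* New behaviour: every message type is checked on its own, so more than
 * one can be selected in the same pass. */
static int due_independent(const link_timers *t) {
    int out = 0;
    if (t->now - t->last_info > t->info_period) out |= SEND_INFO;
    if (t->now - t->last_ping > t->ping_period) out |= SEND_PING;
    if (t->now - t->last_hello > t->hello_period) out |= SEND_HELLO;
    return out;
}
```

The cost of the independent form is that a pass can exceed the per-link message budget by a few messages, which the commit message accepts as insignificant.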
    • Z
      AOF & RDB: be compatible with rdbchecksum no · eafaf172
      zhaozhao.zz committed
      eafaf172
  13. 08 May, 2018 1 commit
  14. 27 Mar, 2018 1 commit
  15. 26 Mar, 2018 2 commits
  16. 14 Mar, 2018 1 commit
    • A
      Cluster: ability to prevent slaves from failing over their masters. · 70597a30
      antirez committed
      This commit, in some parts derived from PR #3041 which is no longer
      possible to merge (because the user deleted the original branch),
      implements the ability of slaves to have a special configuration
      that prevents them from trying to start a failover when the master is failing.
      
      There are multiple reasons for wanting this, and the feature was
      requested in issue #3021 some time ago.
      
      The differences between this patch and the original PR are the
      following:
      
      1. The flag is saved/loaded on the nodes configuration.
      2. The 'myself' node is now flag-aware, the flag is updated as needed
         when the configuration is changed via CONFIG SET.
      3. The flag name uses NOFAILOVER instead of NO_FAILOVER to be consistent
         with existing NOADDR.
      4. The redis.conf documentation was rewritten.
      
      Thanks to @deep011 for the original patch.
      70597a30
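Points 1 to 3 of this commit message describe a node flag that mirrors a configuration option. A minimal sketch of that idea follows. The flag bit values, the `node` struct and both function names are assumptions made for illustration and are not copied from the Redis Cluster source.

```c
/* Illustrative flag bits; the values are assumptions, not the real ones. */
#define NODE_FLAG_NOADDR     (1 << 0)
#define NODE_FLAG_NOFAILOVER (1 << 1)

typedef struct { int flags; } node;

/* Mirror the configuration option into the node's own flags. Meant to be
 * called at startup and after every runtime configuration change. */
static void sync_nofailover_flag(node *myself, int no_failover_configured) {
    if (no_failover_configured) myself->flags |= NODE_FLAG_NOFAILOVER;
    else myself->flags &= ~NODE_FLAG_NOFAILOVER;
}

/* In this sketch a slave may start a failover only when the flag is clear. */
static int may_start_failover(const node *myself) {
    return (myself->flags & NODE_FLAG_NOFAILOVER) == 0;
}
```

Keeping the flag on the node rather than reading the option directly is what allows it to be saved in the nodes configuration and seen by other nodes, as the list above describes.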
  17. 02 Mar, 2018 1 commit