1. 22 Mar, 2015 2 commits
    • Cluster: separate unknown master check from the rest. · 47bbaa17
      antirez committed
      In no case should we attempt a failover if myself->slaveof is
      NULL.
    • Cluster: refactoring around configEpoch handling. · 0595420b
      antirez committed
      This commit moves the process of generating a new config epoch without
      consensus out of the clusterCommand() implementation, in order to make
      it reusable for other purposes (the current target is a CLUSTER
      FAILOVER option that forces the failover when no master majority is
      reachable).
      
      Moreover, the commit moves other functions that are similarly related to
      config epochs to a new logical section of the cluster.c file, just for
      clarity.
  2. 20 Mar, 2015 1 commit
    • Cluster: better cluster state transition handling. · 62893f5b
      antirez committed
      Previously we relied on the global cluster state to make sure all the hash
      slots are linked to some node when getNodeByQuery() is called, so
      finding an unbound hash slot was checked with an assertion. However,
      this is fragile. The cluster state is often updated in the
      clusterBeforeSleep() function, and not ASAP on state change, so we may
      end up processing clients with a cluster state that is 'ok' while
      certain hash slots are still set to NULL.
      
      With this commit the condition is also checked in getNodeByQuery() and
      reported with an identical error code of -CLUSTERDOWN but a slightly
      different error message, so that we have more debugging clues in the
      future.
      
      Root cause of issue #2288.
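The check described in 62893f5b can be sketched roughly as below. This is a simplified illustration, not the Redis source: `lookupSlot`, the `node` type and the `LOOKUP_*` codes are hypothetical names. It only shows the idea of testing the slot itself instead of asserting on it, while keeping the two failure causes distinguishable.

```c
#include <stddef.h>

#define SLOTS 16384   /* number of hash slots in Redis Cluster */

/* Hypothetical stand-in for a cluster node. */
typedef struct node { const char *name; } node;

/* Hypothetical result codes. Both failures would be reported to the
 * client as -CLUSTERDOWN, only with different messages. */
enum { LOOKUP_OK = 0, LOOKUP_DOWN_STATE = 1, LOOKUP_DOWN_UNBOUND = 2 };

/* Return the node serving `slot`, or NULL with *err set to the reason. */
node *lookupSlot(node *slots[SLOTS], int state_ok, int slot, int *err) {
    *err = LOOKUP_OK;
    if (!state_ok) { *err = LOOKUP_DOWN_STATE; return NULL; }
    /* The global state may still say 'ok' while this slot is unbound,
     * so check the slot explicitly rather than asserting. */
    if (slots[slot] == NULL) { *err = LOOKUP_DOWN_UNBOUND; return NULL; }
    return slots[slot];
}
```

A caller would map `LOOKUP_DOWN_STATE` and `LOOKUP_DOWN_UNBOUND` to the same error code with different text, which is what gives the extra debugging clue.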
  3. 18 Mar, 2015 2 commits
  4. 28 Feb, 2015 1 commit
  5. 26 Feb, 2015 2 commits
    • Improvements to PR #2425 · 53659404
      antirez committed
      1. Remove useless "cs" initialization.
      2. Add a "select" var to capture a condition checked multiple times.
      3. Avoid duplication of the same if (!copy) conditional.
      4. Don't increment dirty if copy is given (no deletion is performed),
         otherwise we propagate MIGRATE when not needed.
    • Add last_dbid to migrateCachedSocket to avoid redundant SELECT · 97c4167a
      Tommy Wang committed
      Avoid redundant SELECT calls when continuously migrating keys to
      the same dbid within a target Redis instance.
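The idea behind the last_dbid field can be sketched as follows. The `cachedSocket` struct and `needSelect` helper are hypothetical simplifications (the real code also has to deal with write failures and socket expiry); only the "remember the last selected db" logic is shown.

```c
/* Hypothetical cached connection to a migration target. */
typedef struct cachedSocket {
    int fd;
    long last_dbid;   /* db selected by the last SELECT, -1 if none yet */
} cachedSocket;

/* Return 1 if a SELECT must be sent before migrating into `dbid`
 * (and remember the new db), or 0 if the target is already on it. */
int needSelect(cachedSocket *cs, long dbid) {
    if (cs->last_dbid == dbid) return 0;
    cs->last_dbid = dbid;
    return 1;
}
```

With this in place, a run of MIGRATE calls into the same target db pays for SELECT only once.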
  6. 30 Jan, 2015 2 commits
  7. 29 Jan, 2015 5 commits
    • Cluster: initialize unused fields in gossip section. · f2cd2fcd
      antirez committed
      Otherwise we risk sending uninitialized data, which may contain
      anything, to other nodes. This was not actually possible, but only because
      the initialization of the buffer where the cluster packet header is created
      covered more than the 3 gossip sections we use, so the memory was already
      all filled with zeroes by the memset().
    • Cluster: initialize unused fields in gossip section. · 2553f6c9
      antirez committed
      Otherwise we risk sending uninitialized data, which may contain
      anything, to other nodes. This was not actually possible, but only because
      the initialization of the buffer where the cluster packet header is created
      covered more than the 3 gossip sections we use, so the memory was already
      all filled with zeroes by the memset().
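A minimal sketch of the concern in the two commits above, assuming a made-up `gossipEntry` layout (the field names and sizes are illustrative, not the real wire format): every byte of a section that will be written to a socket gets a defined value, including fields that are not used yet.

```c
#include <string.h>
#include <stdint.h>

/* Hypothetical gossip section; not the real Redis layout. */
typedef struct gossipEntry {
    char name[8];
    uint16_t port;
    uint16_t notused1;   /* reserved, must still be initialized */
    uint32_t notused2;   /* reserved, must still be initialized */
} gossipEntry;

/* Fill one section. Zeroing first guarantees the reserved fields
 * never carry leftover memory contents to other nodes. */
void fillGossip(gossipEntry *g, const char *name, uint16_t port) {
    memset(g, 0, sizeof(*g));
    strncpy(g->name, name, sizeof(g->name) - 1);   /* stays NUL-terminated */
    g->port = port;
}
```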
    • Cluster: magical 10% of nodes explained in comments. · 2616d6f6
      antirez committed
    • CLUSTER count-failure-reports command added. · 92f29b89
      antirez committed
    • Cluster: use a number of gossip sections proportional to cluster size. · 8dd32632
      antirez committed
      Otherwise it is impossible to receive the majority of failure reports in
      the node_timeout*2 window in larger clusters.
      
      Even with a 200-node cluster, 20 gossip sections are a very reasonable
      number of bytes to send.
      
      A side effect of this change is faster node joins in large
      clusters, because the cluster layout takes less time to propagate.
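The sizing rule implied by this commit and by the "10% of nodes" commit above can be sketched like this. The function name is hypothetical, and the floor of 3 is an assumption taken from the "3 gossip sections we use" mentioned in an earlier commit message; a real implementation would also cap the result at the number of nodes actually available to gossip about.

```c
/* Number of gossip sections for a cluster with `nodes` known nodes:
 * one tenth of the nodes, but never fewer than 3. */
int gossipSections(int nodes) {
    int wanted = nodes / 10;
    if (wanted < 3) wanted = 3;
    return wanted;
}
```

For the 200-node cluster mentioned in the commit message this yields the 20 sections quoted there.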
  8. 22 Jan, 2015 6 commits
    • Fix cluster migrate memory leak · ebb07a0b
      Matt Stancliff committed
      Fixes valgrind error:
      48 bytes in 1 blocks are definitely lost in loss record 196 of 373
         at 0x4910D3: je_malloc (jemalloc.c:944)
         by 0x42807D: zmalloc (zmalloc.c:125)
         by 0x41FA0D: dictGetIterator (dict.c:543)
         by 0x41FA48: dictGetSafeIterator (dict.c:555)
         by 0x459B73: clusterHandleSlaveMigration (cluster.c:2776)
         by 0x45BF27: clusterCron (cluster.c:3123)
         by 0x423344: serverCron (redis.c:1239)
         by 0x41D6CD: aeProcessEvents (ae.c:311)
         by 0x41D8EA: aeMain (ae.c:455)
         by 0x41A84B: main (redis.c:3832)
    • Fix potential invalid read past end of array · 98faed3a
      Matt Stancliff committed
      If the array has N elements, we can't read element +1 if we are already at N.
      
      Also, we need to move elements by their storage size in the array,
      not just by individual bytes.
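Both points from 98faed3a show up in a generic "remove one element from a pointer array" routine. The sketch below is not the patched Redis function; `removeAt` is a hypothetical helper written only to illustrate the two rules.

```c
#include <string.h>
#include <stddef.h>

/* Remove the element at `idx` from an array of `*count` pointers.
 * Returns 1 if an element was removed, 0 if idx is out of range. */
int removeAt(void **arr, size_t *count, size_t idx) {
    if (idx >= *count) return 0;
    /* Elements after idx. This is 0 for the last element, so we never
     * touch arr[idx + 1] when it would be past the end. */
    size_t remaining = *count - idx - 1;
    if (remaining > 0) {
        /* Scale by the element size: memmove() counts bytes. */
        memmove(arr + idx, arr + idx + 1, remaining * sizeof(arr[0]));
    }
    (*count)--;
    return 1;
}
```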
    • Fix cluster reset memory leak · 97ffeb7c
      Matt Stancliff committed
      [maybe] Fixes valgrind errors:
      32 bytes in 4 blocks are definitely lost in loss record 107 of 228
         at 0x80EA447: je_malloc (jemalloc.c:944)
         by 0x806E59C: zrealloc (zmalloc.c:125)
         by 0x80A9AFC: clusterSetMaster (cluster.c:801)
         by 0x80AEDC9: clusterCommand (cluster.c:3994)
         by 0x80682A5: call (redis.c:2049)
         by 0x8068A20: processCommand (redis.c:2309)
         by 0x8076497: processInputBuffer (networking.c:1143)
         by 0x8073BAF: readQueryFromClient (networking.c:1208)
         by 0x8060E98: aeProcessEvents (ae.c:412)
         by 0x806123B: aeMain (ae.c:455)
         by 0x806C3DB: main (redis.c:3832)
      
      64 bytes in 8 blocks are definitely lost in loss record 143 of 228
         at 0x80EA447: je_malloc (jemalloc.c:944)
         by 0x806E59C: zrealloc (zmalloc.c:125)
         by 0x80AAB40: clusterProcessPacket (cluster.c:801)
         by 0x80A847F: clusterReadHandler (cluster.c:1975)
         by 0x30000FF: ???
      
      80 bytes in 10 blocks are definitely lost in loss record 148 of 228
         at 0x80EA447: je_malloc (jemalloc.c:944)
         by 0x806E59C: zrealloc (zmalloc.c:125)
         by 0x80AAB40: clusterProcessPacket (cluster.c:801)
         by 0x80A847F: clusterReadHandler (cluster.c:1975)
         by 0x2FFFFFF: ???
    • Fix sending uninitialized bytes · 4a36350d
      Matt Stancliff committed
      Fixes valgrind error:
      Syscall param write(buf) points to uninitialised byte(s)
         at 0x514C35D: ??? (syscall-template.S:81)
         by 0x456B81: clusterWriteHandler (cluster.c:1907)
         by 0x41D596: aeProcessEvents (ae.c:416)
         by 0x41D8EA: aeMain (ae.c:455)
         by 0x41A84B: main (redis.c:3832)
       Address 0x5f268e2 is 2,274 bytes inside a block of size 8,192 alloc'd
         at 0x4932D1: je_realloc (jemalloc.c:1297)
         by 0x428185: zrealloc (zmalloc.c:162)
         by 0x4269E0: sdsMakeRoomFor.part.0 (sds.c:142)
         by 0x426CD7: sdscatlen (sds.c:251)
         by 0x4579E7: clusterSendMessage (cluster.c:1995)
         by 0x45805A: clusterSendPing (cluster.c:2140)
         by 0x45BB03: clusterCron (cluster.c:2944)
         by 0x423344: serverCron (redis.c:1239)
         by 0x41D6CD: aeProcessEvents (ae.c:311)
         by 0x41D8EA: aeMain (ae.c:455)
         by 0x41A84B: main (redis.c:3832)
       Uninitialised value was created by a stack allocation
         at 0x457810: nodeUpdateAddressIfNeeded (cluster.c:1236)
    • Cluster: node deletion cleanup / centralization. · 0a3edcbe
      antirez committed
    • Cluster: set the slaves->slaveof field to NULL when master is freed. · 5130c253
      antirez committed
      Related to issue #2289.
  9. 13 Jan, 2015 3 commits
    • Cluster: fetch my IP even if msg is not MEET for the first time. · df1a7fc4
      antirez committed
      To prevent misconfigured cluster nodes from forcing an IP update on
      other nodes, nodes are required to update their own address only on
      MEET messages. However, it does not make sense to apply this rule the
      first time a node is contacted and does not yet have an IP: we just
      risk that myself->ip remains unassigned if messages are lost, or if
      the cluster creation procedure does not make sure every node is
      targeted by at least one incoming MEET message.
      
      Also fix the logging of the IP switch, avoiding the :-1 tail.
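The rule from df1a7fc4 reduces to a small predicate. The sketch below uses hypothetical names (`MSG_TYPE_MEET`, `shouldLearnOwnIp`) and an arbitrary numeric value for the message type; it only illustrates the condition, not how the address is actually obtained.

```c
#define MSG_TYPE_MEET 2   /* hypothetical value for the MEET message type */

/* Return 1 if the node should (re)learn its own IP from this message:
 * always on MEET, and also when it has no IP assigned yet. */
int shouldLearnOwnIp(int msg_type, const char *myip) {
    return msg_type == MSG_TYPE_MEET || myip[0] == '\0';
}
```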
    • Cluster: clusterMsgDataGossip structure, explicit padding + minor stuff. · 45e2a26d
      antirez committed
      Also explicitly set version to 0, add a protocol version define, improve
      comments in the gossip structure.
      
      Note that the structure layout is the same after the change, we are just
      making the padding explicit with an additional unused 16-bit field.
      So this commit is still able to talk with the previous versions of
      cluster nodes.
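Why naming the padding does not change the layout can be seen with two toy structs. These are hypothetical and much smaller than clusterMsgDataGossip, and the equality holds on common ABIs where uint32_t is 4-byte aligned, which is an assumption about the platform.

```c
#include <stdint.h>
#include <stddef.h>

/* The compiler silently inserts 2 padding bytes after `port`
 * so that `value` is 4-byte aligned. */
typedef struct implicitPad {
    uint16_t port;
    uint32_t value;
} implicitPad;

/* Same layout, but the padding now has a name and can be
 * initialized before the struct is sent over the wire. */
typedef struct explicitPad {
    uint16_t port;
    uint16_t notused;
    uint32_t value;
} explicitPad;
```

Since size and field offsets match, a node using the second definition stays wire-compatible with one using the first.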
    • Suppress valgrind error about write sending uninitialized data. · 799a3cca
      antirez committed
      Valgrind checks that the buffers we transfer via syscalls are
      composed entirely of initialized bytes. This is useful: it lets us
      avoid leaking information in uninitialized parts of messages
      transferred to other hosts. This commit fixes one such issue.
  10. 12 Jan, 2015 1 commit
    • Cluster: initialize mf_end. · 1584c7a3
      antirez committed
      It can't be initialized by resetManualFailover(), since it is actual state
      that the function uses, so we need to initialize it at startup time. Not
      really a bug in practical terms, but it showed up in valgrind and is not
      technically correct anyway.
  11. 18 Dec, 2014 1 commit
    • Cluster: Notify user on accept error · 8bce6542
      Matt Stancliff committed
      If we woke up to accept a connection, but we can't
      accept it, inform the user of the error going on
      with their networking.
      
      (The previous message was the same for success or error!)
  12. 16 Dec, 2014 1 commit
  13. 15 Dec, 2014 1 commit
  14. 09 Dec, 2014 1 commit
    • Cluster PUBLISH message: fix totlen count. · 73996c86
      antirez committed
      The size of the bulk_data field was not subtracted from the count. It is
      not possible to declare it simply as 'char bulk_data[]' since the structure
      is nested inside another structure.
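The length computation at issue can be sketched with toy structs. The layouts, the 8-byte placeholder size and the `publishTotlen` name are assumptions made for illustration; only the "subtract the placeholder, add the real payload" step mirrors the commit message.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical PUBLISH payload. bulk_data is a fixed-size placeholder
 * for variable-length data, because a flexible array member is not
 * allowed in a struct that is nested inside another struct. */
typedef struct pubMsg {
    uint32_t channel_len;
    uint32_t message_len;
    unsigned char bulk_data[8];
} pubMsg;

/* Hypothetical packet: fixed header followed by the payload. */
typedef struct packet {
    char header[16];
    pubMsg publish;
} packet;

/* Total bytes to send for a given channel and message length. */
size_t publishTotlen(size_t channel_len, size_t message_len) {
    size_t totlen = sizeof(packet);
    totlen -= sizeof(((pubMsg *)0)->bulk_data);   /* drop the placeholder */
    totlen += channel_len + message_len;          /* add the real payload */
    return totlen;
}
```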
  15. 31 Oct, 2014 2 commits
    • Parse cluster state file in IPv6 compatible way · 75e68625
      Matt Stancliff committed
      We need to pick the port based on the _last_ colon, not the first one.
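Splitting on the last colon is what keeps IPv6 addresses, which contain colons themselves, intact. A minimal sketch, with `splitHostPort` being a hypothetical helper rather than the function touched by the commit:

```c
#include <string.h>
#include <stdlib.h>

/* Split "host:port" at the LAST colon. Copies the host part into `host`
 * (capacity `hostlen`) and returns the port, or -1 on malformed input. */
int splitHostPort(const char *addr, char *host, size_t hostlen) {
    const char *colon = strrchr(addr, ':');   /* last colon, not first */
    if (colon == NULL) return -1;
    size_t n = (size_t)(colon - addr);
    if (n >= hostlen) return -1;              /* host would not fit */
    memcpy(host, addr, n);
    host[n] = '\0';
    return atoi(colon + 1);
}
```

Using strchr() instead would cut an address such as `::1:7001` at its first colon and produce an empty host.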
    • Networking: add more outbound IP binding fixes · f1a6f780
      Matt Stancliff committed
      Same as the original bind fixes (we just missed these the
      first time around).
      
      This keeps Redis from automatically opening outbound
      connections from the first IP on an interface when we are bound
      to a specific IP address (e.g. with multiple IP aliases on one
      interface, you want to send from _your_ IP, not from the first IP
      on the interface).
  16. 09 Oct, 2014 2 commits
    • Cluster: process gossip section only for known nodes. · c6226f26
      antirez committed
      With the exception of nodes sending MEET packets: we have to trust them
      since they can send us MEET packets only when the cluster is initially
      created or because of manual sysadmin action.
    • Cluster: fix logic to detect we are among a minority. · 419eb185
      antirez committed
      In the cluster evaluation function we are supposed to set the cluster
      state to "fail" if we are among a minority. However, the code did not
      detect that it was in a minority partition when exactly half the masters
      were reachable, which is a minority.
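The corrected condition can be sketched as a strict-majority test. The function name is hypothetical; the point is that "exactly half" must fall on the minority side.

```c
/* Return 1 if this side of a partition must be treated as a minority.
 * A majority needs strictly more than half of the masters. */
int inMinority(int total_masters, int reachable_masters) {
    int needed_quorum = total_masters / 2 + 1;
    return reachable_masters < needed_quorum;
}
```

With 4 masters the quorum is 3, so a node that can reach only 2 of them is in a minority and should set the state to "fail".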
  17. 08 Oct, 2014 1 commit
  18. 06 Oct, 2014 1 commit
    • Clean up text throughout project · bd62c952
      Matt Stancliff committed
        - Remove trailing newlines from redis.conf
        - Fix comment misspelling
        - Clarify zipEncodeLength usage and a C API mention (#1243, #1242)
        - Fix cluster typos (inspired by @papanikge #1507)
        - Fix rewite -> rewrite in a few places (inspired by #682)
      
      Closes #1243, #1242, #1507
  19. 19 Sep, 2014 2 commits
  20. 26 Aug, 2014 3 commits