Commits · 47bbaa17b00e4d4f1128d9249cbbe7be8cc702e5 · 别团等shy哥发育 / redis

22 Mar, 2015 · 2 commits

Cluster: separate unknown master check from the rest. · 47bbaa17
Committed by antirez on Mar 20, 2015
```
In no case should we attempt a failover if myself->slaveof is NULL.
```

Cluster: refactoring around configEpoch handling. · 0595420b

Committed by antirez on Mar 20, 2015

This commit moves the process of generating a new config epoch without
consensus out of the clusterCommand() implementation, in order to make
it reusable for other reasons (current target is to have a CLUSTER
FAILOVER option forcing the failover when no master majority is
reachable).

Moreover, the commit moves other functions that are similarly related to
config epochs into a new logical section of the cluster.c file, just for
clarity.

20 Mar, 2015 · 1 commit

Cluster: better cluster state transition handling. · 62893f5b

Committed by antirez on Mar 20, 2015

Previously we relied on the global cluster state to make sure all the
hash slots are linked to some node when getNodeByQuery() is called, so
finding an unbound hash slot was checked with an assertion. However,
this is fragile. The cluster state is often updated in the
clusterBeforeSleep() function, and not ASAP on state change, so clients
may be processed with a cluster state that is 'ok' while certain hash
slots are still set to NULL.

With this commit the condition is also checked in getNodeByQuery() and
reported with an identical error code of -CLUSTERDOWN but a slightly
different error message, so that we have more debugging clues in the
future.

Root cause of issue #2288.
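
A minimal sketch of the check described in this commit message, assuming a simplified slot table. The names `node`, `lookup_slot_owner`, and the result codes are hypothetical and only illustrate the idea; this is not the actual cluster.c code.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 16384   /* number of hash slots in Redis Cluster */

/* Hypothetical, simplified node type. */
typedef struct node { const char *name; } node;

/* Hypothetical result codes for the lookup. */
enum { LOOKUP_OK = 0, LOOKUP_DOWN_UNBOUND = 1 };

/* Return the node serving `slot`, or NULL with *err set when the slot is
 * unbound. The caller can then reply with a -CLUSTERDOWN style error
 * instead of tripping an assertion. */
node *lookup_slot_owner(node **slot_table, unsigned slot, int *err) {
    assert(slot_table != NULL && err != NULL);
    node *n = slot_table[slot % NUM_SLOTS];
    *err = (n == NULL) ? LOOKUP_DOWN_UNBOUND : LOOKUP_OK;
    return n;
}
```

The point of the design is that the lookup itself reports the unbound slot, so it no longer depends on the global state having been refreshed first.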

18 Mar, 2015 · 2 commits

Cluster: more robust slave check in CLUSTER REPLICATE. · d8236ea2
Committed by antirez on Mar 18, 2015
```
There are rare conditions where node->slaveof may be NULL even if the
node is a slave. Checking by flag is much more robust.
```
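
The difference between the two checks can be sketched as follows. The struct and the flag values are hypothetical simplifications chosen for illustration, not the real cluster.h definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits. */
#define NODE_FLAG_MASTER (1 << 0)
#define NODE_FLAG_SLAVE  (1 << 1)

typedef struct node {
    int flags;
    struct node *slaveof;   /* may be NULL even for a slave */
} node;

/* Fragile: a slave whose master is unknown looks like a non-slave. */
int is_slave_by_pointer(const node *n) {
    assert(n != NULL);
    return n->slaveof != NULL;
}

/* Robust: the role flag does not depend on the master pointer. */
int is_slave_by_flag(const node *n) {
    assert(n != NULL);
    return (n->flags & NODE_FLAG_SLAVE) != 0;
}
```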

Add command CLUSTER MYID · f36482dd
Committed by Michel Martens on Mar 10, 2015
  
28 Feb, 2015 · 1 commit

Migrate: replace conditional with pre-computed value. · 938dfdc1
Committed by antirez on Feb 27, 2015
  
26 Feb, 2015 · 2 commits

Improvements to PR #2425 · 53659404

Committed by antirez on Feb 26, 2015

1. Remove useless "cs" initialization.
2. Add a "select" var to capture a condition checked multiple times.
3. Avoid duplication of the same if (!copy) conditional.
4. Don't increment dirty if copy is given (no deletion is performed),
   otherwise we propagate MIGRATE when not needed.

Add last_dbid to migrateCachedSocket to avoid redundant SELECT · 97c4167a
Committed by Tommy Wang on Feb 25, 2015
```
Avoid redundant SELECT calls when continuously migrating keys to
the same dbid within a target Redis instance.
```
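
A rough sketch of the caching idea, assuming a simplified cached-socket struct. Only the `last_dbid` name comes from the commit title; `cached_socket` and `need_select` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified cached connection to a migration target. */
typedef struct cached_socket {
    int fd;
    long last_dbid;   /* last DB selected on this connection, -1 if none */
} cached_socket;

/* Return 1 if a SELECT must be sent before migrating a key to `dbid`
 * and record the new selection; return 0 if the target DB is unchanged. */
int need_select(cached_socket *cs, long dbid) {
    assert(cs != NULL);
    if (cs->last_dbid == dbid) return 0;
    cs->last_dbid = dbid;
    return 1;
}
```

A real implementation would presumably record the new dbid only after the SELECT was written successfully; the sketch skips error handling.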

30 Jan, 2015 · 2 commits

Cluster: some bias towards FAIL/PFAIL nodes in gossip sections. · 55f2bc64

Committed by antirez on Jan 30, 2015

This improves the PFAIL -> FAIL switch. It is too late at this point in
the RC releases to add a proper separate PFAIL/FAIL dictionary to do
this in a less randomized way. Experiments show that this helps in
practice: with 20 nodes and node-timeout set to 5 seconds, the
PFAIL -> FAIL switch takes 2.5 seconds on average without this commit,
and 1 second with it.

More correct wanted / maxiterations values in clusterSendPing(). · 0f1b9c3d
Committed by antirez on Jan 30, 2015

29 Jan, 2015 · 5 commits

Cluster: initialized not used fields in gossip section. · f2cd2fcd

Committed by antirez on Jan 24, 2015

Otherwise we risk sending uninitialized data, which may contain
anything, to other nodes. In practice this could not happen, but only
because the initialized part of the buffer where the cluster packet
header is created was larger than the 3 gossip sections we use, so the
memory was already filled with zeroes by the memset().

Cluster: initialized not used fields in gossip section. · 2553f6c9

Committed by antirez on Jan 24, 2015

Cluster: magical 10% of nodes explained in comments. · 2616d6f6
Committed by antirez on Jan 29, 2015

CLUSTER count-failure-reports command added. · 92f29b89
Committed by antirez on Jan 29, 2015

Cluster: use a number of gossip sections proportional to cluster size. · 8dd32632

Committed by antirez on Jan 29, 2015

Otherwise it is impossible to receive the majority of failure reports in
the node_timeout*2 window in larger clusters.

Still, with a 200-node cluster, 20 gossip sections are a very reasonable
amount of bytes to send.

A side effect of this change is also faster cluster node joins for large
clusters, because the cluster layout takes less time to propagate.
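
The sizing rule can be sketched like this. The 10% ratio and the 200 nodes -> 20 sections figure come from the commit messages in this log. The floor of 3 sections and the exclusion of the sender and receiver are assumptions made for the sketch, and `gossip_sections_wanted` is a hypothetical name.

```c
#include <assert.h>

/* Number of gossip sections to put in a ping for a cluster of
 * `cluster_size` nodes: about 10% of the nodes, at least 3 (assumed
 * floor), and never more than the nodes available to gossip about
 * (assumed to be everyone except the sender and the receiver). */
int gossip_sections_wanted(int cluster_size) {
    assert(cluster_size >= 0);
    int candidates = cluster_size - 2;
    if (candidates < 0) candidates = 0;
    int wanted = cluster_size / 10;
    if (wanted < 3) wanted = 3;
    if (wanted > candidates) wanted = candidates;
    return wanted;
}
```

With these assumptions a 200-node cluster yields 20 sections, while small clusters stay at the fixed minimum.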

22 Jan, 2015 · 6 commits

Fix cluster migrate memory leak · ebb07a0b

Committed by Matt Stancliff on Jan 15, 2015

Fixes valgrind error:
48 bytes in 1 blocks are definitely lost in loss record 196 of 373
   at 0x4910D3: je_malloc (jemalloc.c:944)
   by 0x42807D: zmalloc (zmalloc.c:125)
   by 0x41FA0D: dictGetIterator (dict.c:543)
   by 0x41FA48: dictGetSafeIterator (dict.c:555)
   by 0x459B73: clusterHandleSlaveMigration (cluster.c:2776)
   by 0x45BF27: clusterCron (cluster.c:3123)
   by 0x423344: serverCron (redis.c:1239)
   by 0x41D6CD: aeProcessEvents (ae.c:311)
   by 0x41D8EA: aeMain (ae.c:455)
   by 0x41A84B: main (redis.c:3832)

Fix potential invalid read past end of array · 98faed3a

Committed by Matt Stancliff on Jan 14, 2015

If array has N elements, we can't read +1 if we are already at N.

Also, we need to move elements by their storage size in the array,
not just by individual bytes.
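
Both points can be illustrated with a generic removal helper. This is a sketch over a plain array of pointers with the hypothetical name `remove_at`, not the code touched by the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove the element at `idx` from an array of `*len` pointers by
 * shifting the tail down by one slot. */
void remove_at(void **items, size_t *len, size_t idx) {
    assert(items != NULL && len != NULL && idx < *len);
    size_t remaining = *len - idx - 1;   /* elements after idx */
    if (remaining > 0) {
        /* Only touch idx + 1 when something follows it, and size the
         * move in bytes: element count times element size. */
        memmove(items + idx, items + idx + 1,
                remaining * sizeof(items[0]));
    }
    (*len)--;
}
```

Passing `remaining` alone as the length would move that many bytes rather than that many elements, which is the second mistake the message describes.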

Fix cluster reset memory leak · 97ffeb7c

Committed by Matt Stancliff on Jan 14, 2015

[maybe] Fixes valgrind errors:
32 bytes in 4 blocks are definitely lost in loss record 107 of 228
   at 0x80EA447: je_malloc (jemalloc.c:944)
   by 0x806E59C: zrealloc (zmalloc.c:125)
   by 0x80A9AFC: clusterSetMaster (cluster.c:801)
   by 0x80AEDC9: clusterCommand (cluster.c:3994)
   by 0x80682A5: call (redis.c:2049)
   by 0x8068A20: processCommand (redis.c:2309)
   by 0x8076497: processInputBuffer (networking.c:1143)
   by 0x8073BAF: readQueryFromClient (networking.c:1208)
   by 0x8060E98: aeProcessEvents (ae.c:412)
   by 0x806123B: aeMain (ae.c:455)
   by 0x806C3DB: main (redis.c:3832)

64 bytes in 8 blocks are definitely lost in loss record 143 of 228
   at 0x80EA447: je_malloc (jemalloc.c:944)
   by 0x806E59C: zrealloc (zmalloc.c:125)
   by 0x80AAB40: clusterProcessPacket (cluster.c:801)
   by 0x80A847F: clusterReadHandler (cluster.c:1975)
   by 0x30000FF: ???

80 bytes in 10 blocks are definitely lost in loss record 148 of 228
   at 0x80EA447: je_malloc (jemalloc.c:944)
   by 0x806E59C: zrealloc (zmalloc.c:125)
   by 0x80AAB40: clusterProcessPacket (cluster.c:801)
   by 0x80A847F: clusterReadHandler (cluster.c:1975)
   by 0x2FFFFFF: ???

Fix sending uninitialized bytes · 4a36350d

Committed by Matt Stancliff on Jan 14, 2015

Fixes valgrind error:
Syscall param write(buf) points to uninitialised byte(s)
   at 0x514C35D: ??? (syscall-template.S:81)
   by 0x456B81: clusterWriteHandler (cluster.c:1907)
   by 0x41D596: aeProcessEvents (ae.c:416)
   by 0x41D8EA: aeMain (ae.c:455)
   by 0x41A84B: main (redis.c:3832)
 Address 0x5f268e2 is 2,274 bytes inside a block of size 8,192 alloc'd
   at 0x4932D1: je_realloc (jemalloc.c:1297)
   by 0x428185: zrealloc (zmalloc.c:162)
   by 0x4269E0: sdsMakeRoomFor.part.0 (sds.c:142)
   by 0x426CD7: sdscatlen (sds.c:251)
   by 0x4579E7: clusterSendMessage (cluster.c:1995)
   by 0x45805A: clusterSendPing (cluster.c:2140)
   by 0x45BB03: clusterCron (cluster.c:2944)
   by 0x423344: serverCron (redis.c:1239)
   by 0x41D6CD: aeProcessEvents (ae.c:311)
   by 0x41D8EA: aeMain (ae.c:455)
   by 0x41A84B: main (redis.c:3832)
 Uninitialised value was created by a stack allocation
   at 0x457810: nodeUpdateAddressIfNeeded (cluster.c:1236)

Cluster: node deletion cleanup / centralization. · 0a3edcbe
Committed by antirez on Jan 21, 2015

Cluster: set the slaves->slaveof field to NULL when master is freed. · 5130c253
Committed by antirez on Jan 21, 2015
```
Related to issue #2289.
```

13 Jan, 2015 · 3 commits

Cluster: fetch my IP even if msg is not MEET for the first time. · df1a7fc4

Committed by antirez on Jan 13, 2015

To prevent misconfigured cluster nodes from forcing an IP update on
other nodes at some point, nodes are required to update their own
address only on MEET messages. However, it does not make sense to apply
this restriction the first time a node is contacted and does not yet
have an IP: we just risk that myself->ip remains unassigned if messages
are lost, or if cluster creation procedures don't make sure every node
is targeted by at least one incoming MEET message.

Also fix the logging of the IP switch avoiding the :-1 tail.

Cluster: clusterMsgDataGossip structure, explicit padding + minor stuff. · 45e2a26d

Committed by antirez on Jan 13, 2015

Also explicitly set version to 0, add a protocol version define, improve
comments in the gossip structure.

Note that the structure layout is the same after the change: we are just
making the padding explicit with an additional unused 16-bit field, so
this commit is still able to talk with previous versions of cluster
nodes.
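
Why an explicit 16-bit field can leave the layout unchanged is easier to see on a small example. The two structs below are hypothetical simplifications, not the real clusterMsgDataGossip, and the equality holds on typical ABIs where a `uint32_t` is 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry that relies on compiler-inserted padding. */
typedef struct gossip_implicit {
    char ip[46];
    uint16_t port;
    uint16_t flags;
    /* 2 bytes of implicit padding here on typical ABIs */
    uint32_t reserved;
} gossip_implicit;

/* Same entry with the padding turned into a named, unused field. */
typedef struct gossip_explicit {
    char ip[46];
    uint16_t port;
    uint16_t flags;
    uint16_t notused1;   /* the former padding */
    uint32_t reserved;
} gossip_explicit;

/* Return 1 if both layouts have the same size and field offsets. */
int layouts_match(void) {
    return sizeof(gossip_implicit) == sizeof(gossip_explicit) &&
           offsetof(gossip_implicit, reserved) ==
           offsetof(gossip_explicit, reserved);
}

/* Abort early in debug builds if the layouts ever diverge. */
void assert_wire_compatible(void) {
    assert(layouts_match());
}
```

Naming the padding also means the bytes can be zeroed deliberately instead of carrying whatever was on the stack.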

Suppress valgrind error about write sending uninitialized data. · 799a3cca

Committed by antirez on Jan 13, 2015

Valgrind checks that the buffers we transfer via syscalls are composed
entirely of initialized bytes. This is useful: it lets us avoid leaking
information in uninitialized parts of messages transferred to other
hosts. This commit fixes one such issue.

12 Jan, 2015 · 1 commit

Cluster: initialize mf_end. · 1584c7a3

Committed by antirez on Jan 12, 2015

It can't be initialized by resetManualFailover(), since it is actual
state that the function uses, so we need to initialize it at startup
time. Not really a bug in practical terms, but it showed up in valgrind
and is not technically correct anyway.

18 Dec, 2014 · 1 commit

Cluster: Notify user on accept error · 8bce6542

Committed by Matt Stancliff on Mar 06, 2014

If we woke up to accept a connection, but we can't
accept it, inform the user of the error going on
with their networking.

(The previous message was the same for success or error!)

16 Dec, 2014 · 1 commit

Fix comment in clusterHandleSlaveFailover(). · 86213b4e
Committed by antirez on Dec 16, 2014
  
15 Dec, 2014 · 1 commit

Make sure buffer is enough in clusterSendPing(). · 2c6dc9f1
Committed by antirez on Dec 15, 2014
  
09 Dec, 2014 · 1 commit

Cluster PUBLISH message: fix totlen count. · 73996c86

Committed by antirez on Nov 28, 2014

The size of the bulk_data field was not subtracted from the count. It is
not possible to declare it simply as 'char bulk_data[]' since the
structure is nested into another structure.
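
The length computation can be sketched as below. The struct is a hypothetical simplification with an 8-byte placeholder; only the `bulk_data` name comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PUBLISH payload. bulk_data has a placeholder
 * size; the real channel and message bytes are variable-length. */
typedef struct publish_msg {
    uint32_t channel_len;
    uint32_t message_len;
    unsigned char bulk_data[8];   /* placeholder only */
} publish_msg;

/* Bytes actually sent for this payload: the fixed fields, minus the
 * placeholder, plus the real channel and message bytes. */
size_t publish_totlen(uint32_t channel_len, uint32_t message_len) {
    size_t fixed = sizeof(publish_msg) -
                   sizeof(((publish_msg *)0)->bulk_data);
    /* For this struct the fixed part is exactly the two length fields. */
    assert(fixed == 2 * sizeof(uint32_t));
    return fixed + channel_len + message_len;
}
```

Forgetting the subtraction would overstate every message by the placeholder size, 8 bytes in this sketch.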

31 Oct, 2014 · 2 commits

Parse cluster state file in IPv6 compatible way · 75e68625
Committed by Matt Stancliff on Oct 23, 2014
```
We need to pick the port based on the _last_ colon, not the first one.
```
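
Splitting on the last colon can be sketched with `strrchr`. The helper name `split_host_port` and its return convention are hypothetical, chosen for the example rather than taken from the cluster config parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split "host:port" at the LAST colon so that IPv6 addresses keep their
 * internal colons. Copies the host part into `host` and returns the
 * port, or -1 if there is no colon or the host does not fit. */
int split_host_port(const char *addr, char *host, size_t hostlen) {
    assert(addr != NULL && host != NULL);
    const char *colon = strrchr(addr, ':');
    if (colon == NULL) return -1;
    size_t n = (size_t)(colon - addr);
    if (n >= hostlen) return -1;
    memcpy(host, addr, n);
    host[n] = '\0';
    return atoi(colon + 1);
}
```

Using `strchr` instead would cut an address such as `2001:db8::1:7001` at its first colon and misread both the host and the port.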

Networking: add more outbound IP binding fixes · f1a6f780

Committed by Matt Stancliff on Oct 28, 2014

Same as the original bind fixes (we just missed these the
first time around).

This helps Redis not automatically send
connections from the first IP on an interface if we are bound
to a specific IP address (e.g. with multiple IP aliases on one
interface, you want to send from _your_ IP, not from the first IP
on the interface).

09 Oct, 2014 · 2 commits

Cluster: process gossip section only for known nodes. · c6226f26

Committed by antirez on Oct 08, 2014

With the exception of nodes sending MEET packets: we have to trust them
since they can send us MEET packets only when the cluster is initially
created or because of manual sysadmin action.

Cluster: fix logic to detect we are among a minority. · 419eb185

Committed by antirez on Oct 08, 2014

In the cluster evaluation function we are supposed to set the cluster
state as "fail" if we are among a minority. However, the code did not
detect that we were in a minority partition when exactly half the
masters were reachable, which is still a minority.
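
The boundary case is easy to get wrong, so here is a small sketch of a strict-majority test. `have_majority` is a hypothetical helper, not the function from cluster.c.

```c
#include <assert.h>

/* Return 1 if `reachable` masters out of `total` form a strict
 * majority. Exactly half is NOT a majority. */
int have_majority(int reachable, int total) {
    assert(total >= 0 && reachable >= 0 && reachable <= total);
    int needed_quorum = total / 2 + 1;
    return reachable >= needed_quorum;
}
```

With 6 masters the quorum is 4, so a partition that sees exactly 3 of them must treat itself as a minority.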

08 Oct, 2014 · 1 commit

Cluster: more chatty slaves when failover is stalled. · 9a867b68
Committed by antirez on Oct 07, 2014
  
06 Oct, 2014 · 1 commit

Clean up text throughout project · bd62c952

Committed by Matt Stancliff on Jul 31, 2014

  - Remove trailing newlines from redis.conf
  - Fix comment misspelling
  - Clarifies zipEncodeLength usage and a C API mention (#1243, #1242)
  - Fix cluster typos (inspired by @papanikge #1507)
  - Fix rewite -> rewrite in a few places (inspired by #682)

Closes #1243, #1242, #1507

19 Sep, 2014 · 2 commits

Cluster: claim ping_sent time even if we can't connect. · 015cbf30

Committed by antirez on Sep 17, 2014

This fixes a potential bug that was never observed in practice, since
the asynchronous connect returns ok every time (to fail later, calling
the handler), so a ping is queued and ping_sent always happens to be
populated.

However, technically connect(2) with a non-blocking socket may return an
error synchronously, so before this fix the code was not correct.

Cluster: new option to work with partial slots coverage. · d4c3c124
Committed by antirez on Sep 17, 2014

26 Aug, 2014 · 3 commits

Cluster: Fix segfault if cluster config corrupt · 3e223841

Committed by Matt Stancliff on Mar 27, 2014

This commit adds a size check after initial config
line parsing to make sure we have *at least* 8 arguments
per line.

Also, instead of asserting for cluster->myself, we just test
and error out normally (since the error does a hard exit anyway).

Closes #1597

Fix memory leak in cluster config parsing · 29ff27d4
Committed by Matt Stancliff on Aug 01, 2014
```
The continue stops us from triggering the
free after the long line for loop, so add it
earlier.
```

Clarify existing slot wording on cluster start · d409b5ac
Committed by Matt Stancliff on Aug 01, 2014
