Commits · 88a5cede882ee4268685acae5ef6c8660f7f4e44 · Turbo码先生 / redis

11 Dec, 2013 3 commits

Fix wrong repldboff type which causes dropped replication in rare cases. · 88a5cede
Committed by Yossi Gottlieb on Dec 24, 2012

88a5cede

Slaves heartbeats during sync improved. · 11120689

Committed by antirez on Dec 10, 2013

The previous fix for false positive timeout detected by master was not
complete. There is another blocking stage while loading data for the
first synchronization with the master, that is, flushing away the
current data from the DB memory.

This commit uses the newly introduced dict.c callback in order to do
some incremental work (sending "\n" heartbeats to the master) while
flushing the old data from memory.

It is hard to write a regression test for this issue unfortunately. More
support for debugging in the Redis core would be needed in terms of
functionalities to simulate a slow DB loading / deletion.

11120689

dict.c: added optional callback to dictEmpty(). · 2eb781b3

Committed by antirez on Dec 10, 2013

The Redis hash table implementation has many non-blocking features, such
as incremental rehashing. However, while deleting a large hash table
there was no way to have a callback called to do some incremental work.

This commit adds this support as an optional callback argument to
dictEmpty(). The callback is currently called at a fixed interval (one
time every 65k deletions).
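
A minimal sketch of the idea, not the actual dict.c code: a bulk delete loop that invokes an optional callback once every 65536 deletions. The `node` type and the `list_build()`, `list_empty()` and `count_calls()` helpers are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical singly linked list standing in for the entries of a table. */
typedef struct node {
    struct node *next;
} node;

/* Frees every node. If `callback` is not NULL it is invoked once every
 * 65536 deletions (including right before the first one), so the caller
 * can do some incremental work. Returns the number of nodes freed. */
size_t list_empty(node *head, void (*callback)(void *), void *privdata) {
    size_t freed = 0;
    while (head != NULL) {
        node *next = head->next;
        if (callback != NULL && (freed & 65535) == 0) callback(privdata);
        free(head);
        freed++;
        head = next;
    }
    return freed;
}

/* Builds a list of n nodes. Returns NULL if n is 0 or on allocation failure. */
node *list_build(size_t n) {
    node *head = NULL;
    for (size_t i = 0; i < n; i++) {
        node *nd = malloc(sizeof(*nd));
        if (nd == NULL) {
            list_empty(head, NULL, NULL);
            return NULL;
        }
        nd->next = head;
        head = nd;
    }
    return head;
}

/* Example callback: counts how many times it was invoked. */
void count_calls(void *privdata) {
    (*(size_t *)privdata)++;
}
```

In a server, the callback could be the place to send a heartbeat byte, so that a long deletion does not look like a dead peer.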

2eb781b3

04 Dec, 2013 2 commits
- WAIT command: synchronous replication for Redis. · c5618e7f
  Committed by antirez on Dec 04, 2013
  
  c5618e7f
- BLPOP blocking code refactored to be generic & reusable. · 82b672f6
  Committed by antirez on Dec 03, 2013
  
  82b672f6
03 Dec, 2013 1 commit
- Removed old comments and dead code from freeClient(). · 2e027c48
  Committed by antirez on Dec 03, 2013
  
  2e027c48
30 Nov, 2013 1 commit
- Cluster: basic data structures for nodes black list. · 8f18345e
  Committed by antirez on Nov 29, 2013
  
  8f18345e
21 Nov, 2013 1 commit

Sentinel: test for writable config file. · 297de1ab

Committed by antirez on Nov 21, 2013

This commit introduces a function called when Sentinel is ready for
normal operations to avoid putting Sentinel specific stuff in redis.c.

297de1ab

19 Nov, 2013 2 commits
- Sentinel: sentinelFlushConfig() to CONFIG REWRITE + fsync. · e257ab2b
  Committed by antirez on Nov 19, 2013
  
  e257ab2b
- Sentinel: CONFIG REWRITE support for Sentinel config. · 5998769c
  Committed by antirez on Nov 19, 2013
  
  5998769c
05 Nov, 2013 1 commit

SCAN code refactored to parse cursor first. · ebcb6251

Committed by antirez on Nov 05, 2013

The previous implementation of SCAN parsed the cursor in the generic
function implementing SCAN, SSCAN, HSCAN and ZSCAN.

The actual higher-level command implementation only checked for empty
keys and returned ASAP in that case. As a result, inverting the
arguments of, for instance, SSCAN and writing:

    SSCAN 0 key

instead of

    SSCAN key 0

resulted in no error, since 0 is very likely a non-existing key name.
The iterator simply returned no elements at all.

In order to fix this issue the code was refactored to extract the
function that parses the cursor and returns the error. Every higher-level
command implementation now parses the cursor first and only later checks
whether the key exists.
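
As a rough illustration of why parsing the cursor first catches inverted arguments, here is a hedged sketch of a strict cursor parser. The function name `parse_scan_cursor()` is hypothetical and this is not the function used in Redis; it only shows that a key name passed in the cursor position fails validation before any key lookup happens.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses `s` as an unsigned decimal cursor.
 * Returns 0 and stores the value in *cursor on success.
 * Returns -1 if `s` is empty, starts with a non-digit (sign, space,
 * letters), has trailing garbage, or does not fit an unsigned long. */
int parse_scan_cursor(const char *s, unsigned long *cursor) {
    char *end;
    unsigned long value;

    if (s[0] < '0' || s[0] > '9') return -1;
    errno = 0;
    value = strtoul(s, &end, 10);
    if (errno == ERANGE || *end != '\0') return -1;
    *cursor = value;
    return 0;
}
```

With `SSCAN 0 key` the cursor argument is `key`, so the parser returns -1 and the command can reply with an error instead of silently iterating a missing key named `0`.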

ebcb6251

28 Oct, 2013 3 commits
- ZSCAN implemented. · 2c643ffa
  Committed by antirez on Oct 28, 2013
  
  2c643ffa
- HSCAN implemented. · e50090aa
  Committed by antirez on Oct 28, 2013
  
  e50090aa
- SSCAN implemented. · 4a1f1cc0
  Committed by antirez on Oct 28, 2013
  
  4a1f1cc0
25 Oct, 2013 1 commit
- Add SCAN command · 7f490b19
  Committed by Pieter Noordhuis on Jul 09, 2012
  
  7f490b19
09 Oct, 2013 2 commits

Cluster: time switched from seconds to milliseconds. · ba424286

Committed by antirez on Oct 09, 2013

All the cluster internal state involving time now uses mstime_t
and mstime() in order to have millisecond resolution.

Also the clusterCron() function is now called with a 10 Hz frequency
instead of 1 Hz.

The cluster node_timeout must also be configured in milliseconds by the
user in redis.conf.

ba424286

Cluster: cluster stuff moved from redis.h to cluster.h. · 929b6a44
Committed by antirez on Oct 09, 2013

929b6a44

03 Oct, 2013 1 commit

Cluster: new clusterDoBeforeSleep() API. · 7afc0dd5

Committed by antirez on Oct 03, 2013

The new API is able to remember operations to perform before returning
to the event loop, such as checking if there is the failover quorum for
a slave, saving and fsyncing the configuration file, and so forth.

Because these operations are performed before returning to the event
loop, we are sure that messages sent in the same event loop run
will be delivered *after* the configuration is already saved, which is
sometimes a requirement. For instance, we want to publish a new epoch
only when it is already stored in nodes.conf, in order to avoid going
back in the logical clock when a node is restarted.

This new API provides a big performance advantage compared to saving and
possibly fsyncing the configuration file multiple times in the same
event loop run, especially in the case of big clusters with tens or
hundreds of nodes.
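
The pattern described above can be sketched as a set of pending flags that are ORed together during one event loop run and executed once before sleeping. This is an illustrative sketch only: the flag names, the `node_state` struct and both functions are hypothetical, and the counters stand in for the real save and fsync work.

```c
/* Hypothetical flags for deferred operations. */
#define TODO_SAVE_CONFIG  (1 << 0)
#define TODO_FSYNC_CONFIG (1 << 1)

typedef struct {
    int todo_before_sleep; /* pending operations, ORed together */
    int saves;             /* how many times the config was "saved" */
    int fsyncs;            /* how many times the config was "fsynced" */
} node_state;

/* Remembers that `flags` must be handled before returning to the event
 * loop. Calling it many times with the same flag costs nothing extra. */
void do_before_sleep(node_state *st, int flags) {
    st->todo_before_sleep |= flags;
}

/* Runs every pending operation exactly once, then clears the flags. */
void before_sleep(node_state *st) {
    if (st->todo_before_sleep & TODO_SAVE_CONFIG) {
        st->saves++; /* stand-in for writing the configuration file */
        if (st->todo_before_sleep & TODO_FSYNC_CONFIG) st->fsyncs++;
    }
    st->todo_before_sleep = 0;
}
```

However many handlers request a save during the same run, the file is written a single time, which is where the performance advantage comes from.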

7afc0dd5

02 Oct, 2013 1 commit
- Cluster: bus messages stats in CLUSTER info. · 6c4d904b
  Committed by antirez on Oct 02, 2013
  
  6c4d904b
30 Sep, 2013 1 commit

Cluster: time field removed from cluster messages header. · 1dedf9aa

Committed by antirez on Sep 30, 2013

The new algorithm does not check the reply time, since checking the
currentEpoch in the reply ensures that the reply is about the current
election process.

1dedf9aa

26 Sep, 2013 3 commits
- Cluster: react faster when a slave wins an election. · 7c4b8f29
  Committed by antirez on Sep 26, 2013
  
  7c4b8f29
- Cluster: master node now uses new protocol to vote. · a445aa30
  Committed by antirez on Sep 26, 2013
  
  a445aa30
- Cluster: slave node now uses the new protocol to get elected. · fb9b76fe
  Committed by antirez on Sep 26, 2013
  
  fb9b76fe
25 Sep, 2013 1 commit
- Cluster: configEpoch added in cluster nodes description. · 12483b00
  Committed by antirez on Sep 25, 2013
  
  12483b00
20 Sep, 2013 1 commit

Cluster: added time field in cluster bus messages. · 925ea9f8

Committed by antirez on Sep 20, 2013

The time is sent in requests and copied back into reply packets.
This way the node receiving a reply can compare its time field with its
local clock and check the age of the request associated with this reply.

This is an easy way to discard delayed replies. Note that only one clock
is used here, the one of the node sending the request. The node that
answers only copies the field back into the reply, so no
synchronization is needed between clocks of different hosts.
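
A small sketch of the echo scheme, under the assumption that the time is a millisecond timestamp. The `bus_msg` struct and the function names are hypothetical and reduced to the single field that matters here.

```c
#include <stdint.h>

/* Hypothetical message header reduced to the time field. */
typedef struct {
    int64_t time; /* set by the requester from its own clock */
} bus_msg;

/* Replying node: copies the time back without reading its own clock. */
void make_reply(const bus_msg *request, bus_msg *reply) {
    reply->time = request->time;
}

/* Requesting node: compares the echoed time with its own clock.
 * Returns 1 if the request behind this reply is older than max_age_ms. */
int reply_is_stale(const bus_msg *reply, int64_t now_ms, int64_t max_age_ms) {
    return (now_ms - reply->time) > max_age_ms;
}
```

Both timestamps in the subtraction come from the requester's clock, so skew between hosts does not affect the result.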

925ea9f8

04 Sep, 2013 1 commit

Cluster: free HANDSHAKE nodes after node_timeout. · 72587e6c

Committed by antirez on Sep 04, 2013

Handshake nodes should turn into normal nodes or be freed in a
reasonable amount of time, otherwise they'll keep accumulating if the
address they are associated with is not reachable for some reason.

72587e6c

22 Aug, 2013 1 commit
- Use listenToPort() in cluster.c as well. · 81a6a963
  Committed by antirez on Aug 22, 2013
  
  81a6a963
12 Aug, 2013 1 commit

Replication: better way to send a preamble before RDB payload. · 89ffba91

Committed by antirez on Aug 12, 2013

During the replication full resynchronization process, the RDB file is
transferred from the master to the slave. However, there is a short
preamble to send first, which is currently just the bulk payload length
of the file in the usual Redis form $..length..<CR><LF>.

This preamble used to be sent with a direct write call, assuming that
there was always room in the socket output buffer to hold the few bytes
needed. However, this does not scale in case we need to send more
stuff, and is not very robust code in general.

This commit introduces a more general mechanism to send a preamble up to
2GB in size (the max length of an sds string) in a non-blocking way.
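
For reference, the length preamble mentioned above can be produced as follows. This is a minimal sketch with a hypothetical function name; it only formats the bytes and says nothing about how Redis queues or writes them.

```c
#include <stdio.h>

/* Writes "$<rdb_size>\r\n" into buf.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the buffer is too small. */
int format_rdb_preamble(char *buf, size_t buflen, unsigned long long rdb_size) {
    int n = snprintf(buf, buflen, "$%llu\r\n", rdb_size);
    if (n < 0 || (size_t)n >= buflen) return -1;
    return n;
}
```

A non-blocking sender would keep this buffer together with an offset of bytes already written, and resume from that offset whenever the socket becomes writable again.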

89ffba91

06 Aug, 2013 2 commits

Add per-db average TTL information in INFO output. · 112fa479

Committed by antirez on Aug 06, 2013

Example:

db0:keys=221913,expires=221913,avg_ttl=655

The algorithm uses a running average with only two samples (current and
previous). Keys found to be expired are considered at TTL zero even if
the actual TTL can be negative.

The TTL is reported in milliseconds.
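
The two-sample running average can be sketched like this. The function names are hypothetical, and treating a stored average of zero as "no previous sample" is an assumption of this sketch rather than something stated in the commit message.

```c
/* Keys that are already expired count as TTL zero, never negative. */
long long clamp_ttl(long long ttl_ms) {
    return ttl_ms < 0 ? 0 : ttl_ms;
}

/* Blends the previous average with the average measured in the current
 * cycle, giving each the same weight. A previous value of 0 is treated
 * as "no sample yet" (assumption of this sketch). */
long long update_avg_ttl(long long prev_avg_ms, long long current_avg_ms) {
    if (prev_avg_ms == 0) return current_avg_ms;
    return (prev_avg_ms + current_avg_ms) / 2;
}
```

Because only the current and the previous value are kept, the reported figure reacts quickly to changes but is also noisy.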

112fa479

Some activeExpireCycle() refactoring. · 6500fabf
Committed by antirez on Aug 06, 2013

6500fabf

05 Aug, 2013 1 commit

Draft #1 of a new expired keys collection algorithm. · b09ea1bd

Committed by antirez on Aug 05, 2013

The main idea here is that when we are no longer able to expire keys at
the rate they are created, we can't block longer in the normal expire
cycle, as this would result in latency spikes that are too big.

For this reason the commit introduces a "fast" expire cycle that does
not run for more than 1 millisecond but is called in the beforeSleep()
hook of the event loop, so much more often, and with a frequency bound
to the frequency of executed commands.

The fast expire cycle is only called when the standard expiration
algorithm runs out of time, that is, when it consumed more than
REDIS_EXPIRELOOKUPS_TIME_PERC of CPU in a given cycle without being able
to bring the number of already expired but not yet collected keys
below 25% of the number of keys.

You can test this commit with different loads, but a simple way is to
use the following:

Extreme load with pipelining:

    redis-benchmark -r 100000000 -n 100000000 \
            -P 32 set ele:rand:000000000000 foo ex 2

Remove the -P 32 option in order to avoid pipelining for a more
real-world load.

In another terminal tab you can monitor the Redis behavior with:

    redis-cli -i 0.1 -r -1 info keyspace

and

    redis-cli --latency-history

Note: this commit makes Redis print a lot of debug messages, so it
is not a good idea to use it in production.

b09ea1bd

22 Jul, 2013 1 commit

Introduction of a new string encoding: EMBSTR · 894eba07

Committed by antirez on Jun 05, 2012

Previously two string encodings were used for string objects:

1) REDIS_ENCODING_RAW: a string object with obj->ptr pointing to an sds
string.

2) REDIS_ENCODING_INT: a string object where the obj->ptr void pointer
is cast to a long.

This commit introduces an experimental new encoding called
REDIS_ENCODING_EMBSTR that implements an object represented by an sds
string that is not modifiable but is allocated in the same memory chunk
as the robj structure itself.

The chunk looks like the following:

+--------------+-----------+------------+--------+----+
| robj data... | robj->ptr | sds header | string | \0 |
+--------------+-----+-----+------------+--------+----+
                     |                       ^
                     +-----------------------+

The robj->ptr points to the contiguous sds string data, so the object
can be manipulated with the same functions used to manipulate plain
string objects. However, we need just one malloc and one free in order
to allocate or release this kind of object. Moreover, it has better
cache locality.

This new allocation strategy should benefit both memory usage and
performance. A performance gain between 60 and 70% was observed
during micro-benchmarks, however there is more work to do to evaluate
the performance impact and the memory usage behavior.
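
The single-allocation layout can be illustrated with a simplified sketch. It is not the Redis implementation: the `emb_obj` type and `emb_create()` are hypothetical, and the sds header that sits between the object and the string in the real layout is reduced here to a plain length field inside the object.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object whose string bytes live in the same allocation. */
typedef struct {
    size_t len; /* string length, excluding the terminating NUL */
    char *ptr;  /* points just past this struct, inside the same chunk */
} emb_obj;

/* Allocates the object and a copy of the string with a single malloc().
 * Returns NULL on allocation failure. Release it with a single free(). */
emb_obj *emb_create(const char *s, size_t len) {
    emb_obj *o = malloc(sizeof(*o) + len + 1);
    if (o == NULL) return NULL;
    o->len = len;
    o->ptr = (char *)(o + 1);
    memcpy(o->ptr, s, len);
    o->ptr[len] = '\0';
    return o;
}
```

Since the string cannot grow without moving the whole object, a layout like this only suits values that are not modified in place, which matches the "not modifiable" constraint above.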

894eba07

16 Jul, 2013 1 commit
- Chunked loading of RDB to prevent redis from stalling reading very large keys. · 63d15dfc
  Committed by yoav on Dec 12, 2012
  
  63d15dfc
12 Jul, 2013 2 commits

SORT ALPHA: use collation instead of binary comparison. · cf1579a7

Committed by antirez on Jul 12, 2013

Note that we only do it when STORE is not used, otherwise we want an
absolutely locale independent and binary safe sorting in order to ensure
AOF / replication consistency.

This is probably an unexpected behavior violating the least surprise
rule, but there is currently no other simple / good alternative.

cf1579a7

Fixed compareStringObject() and introduced collateStringObject(). · 81e55ec0

Committed by antirez on Jul 12, 2013

compareStringObject() was not always giving the same result when
comparing the same two strings under different encodings (integer or
sds), since it switched to strcmp() when at least one of the strings was
not sds encoded.

For instance, with the two strings "123" and "123\x00456", where the
first string was integer encoded, the old implementation of
compareStringObject() returned 0 as if the strings were equal, while
the second string is actually "greater" than the first in a binary
comparison.

The same comparison, but with "123" encoded as an sds string, would
instead return a value < 0, which is correct. It is not impossible that
the above caused some obscure bug, since the comparison was not always
deterministic, and compareStringObject() is used in the implementation
of skiplists, hash tables, and so forth.

At the same time, collateStringObject() was introduced by this commit, so
that it can be used by the SORT command to return sorted strings using
collation instead of binary comparison. See next commit.
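
The difference between the two comparisons can be reproduced with a short sketch. `binary_compare()` is a hypothetical helper written for this illustration, not the Redis function: it compares the common prefix with memcmp() and then breaks ties by length, so embedded NUL bytes are taken into account.

```c
#include <string.h>

/* Binary-safe comparison of two buffers with explicit lengths.
 * Returns <0, 0 or >0 like memcmp(); on a common prefix the shorter
 * buffer sorts first. */
int binary_compare(const char *a, size_t alen, const char *b, size_t blen) {
    size_t minlen = alen < blen ? alen : blen;
    int cmp = memcmp(a, b, minlen);
    if (cmp != 0) return cmp;
    return (alen > blen) - (alen < blen);
}
```

With the strings from the commit message, `strcmp("123", "123\0" "456")` stops at the first NUL and returns 0, while `binary_compare("123", 3, "123\0" "456", 7)` returns a negative value.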

81e55ec0

09 Jul, 2013 3 commits
- getClientPeerId() refactored into two functions. · d0001fe8
  Committed by antirez on Jul 09, 2013
  
  d0001fe8
- getClientPeerId() now reports errors. · e4c019e7
  Committed by antirez on Jul 09, 2013

  We now also use it in the CLIENT KILL implementation.

  e4c019e7
- getClientPeerID introduced. · 5cdc5da9
  Committed by antirez on Jul 09, 2013

  The function returns a unique identifier for the client, as ip:port for
  IPv4 and IPv6 clients, or as path:0 for Unix socket clients.

  See the top comment in the function for more info.

  5cdc5da9
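
The identifier format described for getClientPeerID can be sketched as below. The function name and signature are hypothetical; the real function works on a client structure and a socket, while this sketch only shows the ip:port and path:0 formatting.

```c
#include <stdio.h>

/* Formats a peer identifier into buf.
 * If unix_path is not NULL the result is "<path>:0", otherwise it is
 * "<ip>:<port>". Returns the length written (excluding the NUL), or -1
 * if the buffer is too small. */
int format_peer_id(char *buf, size_t buflen, const char *ip, int port,
                   const char *unix_path) {
    int n;
    if (unix_path != NULL) {
        n = snprintf(buf, buflen, "%s:0", unix_path);
    } else {
        n = snprintf(buf, buflen, "%s:%d", ip, port);
    }
    if (n < 0 || (size_t)n >= buflen) return -1;
    return n;
}
```

For example, `format_peer_id(buf, sizeof(buf), "127.0.0.1", 6379, NULL)` fills `buf` with `127.0.0.1:6379`.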
08 Jul, 2013 2 commits

Update REDIS_CLUSTER_IPLEN to INET6_ADDRSTRLEN. · 6181455a

Committed by Geoff Garside on Jun 18, 2011

Change REDIS_CLUSTER_IPLEN to INET6_ADDRSTRLEN so that the clusterNode
ip character buffer is big enough to hold an IPv6 address.

6181455a

Add macro to define clusterNode.ip buffer size. · 9cfa02fe

Committed by Geoff Garside on Jun 18, 2011

Add REDIS_CLUSTER_IPLEN macro to define the size of the clusterNode ip
character array. Additionally use this macro in inet_ntop(3) calls where
the size of the array was being defined manually.

The REDIS_CLUSTER_IPLEN is defined as INET_ADDRSTRLEN which defines the
correct size of a buffer to store an IPv4 address in. The
INET_ADDRSTRLEN macro itself is defined in the <netinet/in.h> header
file and should be portable across the majority of systems.

9cfa02fe

Turbo码先生 / redis (in sync with the fork source project)