提交 · 51857c7e5cc6479323d53778e37a197f7c140fbf · xindoo / redis

02 6月, 2012 2 次提交

Fixed RESTORE hash failure (Issue #532) · 51857c7e

由 Alex Mitrofanov 提交于 6月 01, 2012

(additional commit notes by antirez@gmail.com):

The rdbIsObjectType() macro was not updated when the new RDB object type
of ziplist encoded hashes was added.

As a result RESTORE, that uses rdbLoadObjectType(), failed when a
ziplist encoded hash was loaded.
This does not affected normal RDB loading because in that case we use
the lower-level function rdbLoadType().

The commit also adds a regression test.

51857c7e

RDB type loading functions clarified in comments. · c7a25200

由 antirez 提交于 6月 02, 2012

Improved comments to make clear that rdbLoadType() just loads a
general TYPE in the context of RDB that can be an object type or an
expire type, end-of-file, and so forth.

While rdbLoadObjectType() enforces that the type is a valid Object Type
otherwise it returns -1.

c7a25200

01 6月, 2012 1 次提交

BITOP bug when called against non existing keys fixed. · 1419406e

由 antirez 提交于 5月 31, 2012

In the issue #529 an user reported a bug that can be triggered with the
following code:

flushdb
set a
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
bitop or x a b

The bug was introduced with the speed optimization in commit 8bbc0768
that specializes every BITOP operation loop up to the minimum length of
the input strings.

However the computation of the minimum length contained an error when a
non existing key was present in the input, after a key that was non zero
length.

This commit fixes the bug and adds a regression test for it.

1419406e

25 5月, 2012 2 次提交

Tests modified to account for INFO fields renaming. · bc70b8e5

由 antirez 提交于 5月 25, 2012

Commit 33e1db36 modified the name of a
few INFO fields. This commit changes the Redis test to account for this
changes.

bc70b8e5

Four new persistence fields in INFO. A few renamed. · 33e1db36

由 antirez 提交于 5月 25, 2012

The 'persistence' section of INFO output now contains additional four
fields related to RDB and AOF persistence:

 rdb_last_bgsave_time_sec       Duration of latest BGSAVE in sec.
 rdb_current_bgsave_time_sec    Duration of current BGSAVE in sec.
 aof_last_rewrite_time_sec      Duration of latest AOF rewrite in sec.
 aof_current_rewrite_time_sec   Duration of current AOF rewrite in sec.

The 'current' fields are set to -1 if a BGSAVE / AOF rewrite is not in
progress. The 'last' fileds are set to -1 if no previous BGSAVE / AOF
rewrites were performed.

Additionally a few fields in the persistence section were renamed for
consistency:

 changes_since_last_save -> rdb_changes_since_last_save
 bgsave_in_progress -> rdb_bgsave_in_progress
 last_save_time -> rdb_last_save_time
 last_bgsave_status -> rdb_last_bgsave_status
 bgrewriteaof_in_progress -> aof_rewrite_in_progress
 bgrewriteaof_scheduled -> aof_rewrite_scheduled

After the renaming, fields in the persistence section start with rdb_ or
aof_ prefix depending on the persistence method they describe.
The field 'loading' and related fields are not prefixed because they are
unique for both the persistence methods.

33e1db36

24 5月, 2012 13 次提交

BITOP command 10x speed improvement. · d8668038

由 antirez 提交于 5月 23, 2012

This commit adds a fast-path to the BITOP that can be used for all the
bytes from 0 to the minimal length of the string, and if there are
at max 16 input keys.

Often the intersected bitmaps are roughly the same size, so this
optimization can provide a 10x speed boost to most real world usages
of the command.

Bytes are processed four full words at a time, in loops specialized
for the specific BITOP sub-command, without the need to check for
length issues with the inputs (since we run this algorithm only as far
as there is data from all the keys at the same time).

The remaining part of the string is intersected in the usual way using
the slow but generic algorith.

It is possible to do better than this with inputs that are not roughly
the same size, sorting the input keys by length, by initializing the
result string in a smarter way, and noticing that the final part of the
output string composed of only data from the longest string does not
need any proecessing since AND, OR and XOR against an empty string does
not alter the output (zero in the first case, and the original string in
the other two cases).

More implementations will be implemented later likely, but this should
be enough to release Redis 2.6-RC4 with bitops merged in.

Note: this commit also adds better testing for BITOP NOT command, that
is currently the faster and hard to optimize further since it just
flips the bits of a single input string.

d8668038

BITOP: handle integer encoded objects correctly. · fa4a5d59

由 antirez 提交于 5月 22, 2012

A bug in the implementation caused BITOP to crash the server if at least
one one of the source objects was integer encoded.

The new implementation takes an additional array of Redis objects
pointers and calls getDecodedObject() to get a reference to a string
encoded object, and then uses decrRefCount() to release the object.

Tests modified to cover the regression and improve coverage.

fa4a5d59

BITCOUNT performance improved. · 7c34643f

由 antirez 提交于 5月 20, 2012

At Redis's default optimization level the command is now much faster,
always using a constant-time bit manipualtion technique to count bits
instead of GCC builtin popcount, and unrolling the loop.

The current implementation performance is 1.5GB/s in a MBA 11" (1.8 Ghz
i7) compiled with both GCC and clang.

The algorithm used is described here:

http://graphics.stanford.edu/~seander/bithacks.html

7c34643f

bitop.c renamed bitops.c · 80f8028e

由 antirez 提交于 5月 20, 2012

bitop.c contains the "Bit related string operations" so it seems more
logical to call it bitops instead of bitop.
This also makes it matching the name of the test (unit/bitops.tcl).

80f8028e

Bit operations tests improved. · 01d3a7e7

由 antirez 提交于 5月 20, 2012

Fuzzing tests of BITCOUNT / BITOP are iterated multiple times.
The new BITCOUNT fuzzing test uses random strings in a wider interval of
lengths including zero-len strings.

01d3a7e7

popcount() optimization for speed. · 343d3bd2

由 antirez 提交于 5月 20, 2012

We run the array by 32 bit words instead of processing it byte per byte.
If the code is compiled using GCC __builtin_popcount() builtin function
is used instead.

343d3bd2

BITCOUNT refactoring. · dbbbe49e

由 antirez 提交于 5月 19, 2012

The low level popualtion counting function is now separated from the
BITCOUNT command implementation, so that the low level function can be
further optimized and eventually used in other contexts if needed.

dbbbe49e

由 antirez 提交于 5月 19, 2012

All the general string operations are implemented in t_string.c, however
the bit operations, while targeting the string type, are better served
in a specific file where we have the implementations of the following
four commands and helper functions:

    GETBIT
    SETBIT
    BITOP
    BITCOUNT

In the future this file will probably contain more code related to
making the BITOP and BITCOUNT operations faster.

760e7765

BITOP and BITCOUNT tests. · a3f2b489

由 antirez 提交于 5月 17, 2012

The Redis implementation is tested against Tcl implementations of the
same operation. Both fuzzing and testing of specific aspects of the
commands behavior are performed.

a3f2b489

New commands: BITOP and BITCOUNT. · 0bd6d68e

由 antirez 提交于 5月 16, 2012

The motivation for this new commands is to be search in the usage of
Redis for real time statistics. See the article "Fast real time metrics
using Redis".

http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps/

In general Redis strings when used as bitmaps using the SETBIT/GETBIT
command provide a very space-efficient and fast way to store statistics.
For instance in a web application with users, every user can be
associated with a key that shows every day in which the user visited the
web service. This information can be really valuable to extract user
behaviour information.

With Redis bitmaps doing this is very simple just saying that a given
day is 0 (the data the service was put online) and all the next days are
1, 2, 3, and so forth. So with SETBIT it is possible to set the bit
corresponding to the current day every time the user visits the site.

It is possible to take the count of the bit sets on the run, this is
extremely easy using a Lua script. However a fast bit count native
operation can be useful, especially if it can operate on ranges, or when
the string is small like in the case of days (even if you consider many
years it is still extremely little data).

For this reason BITOP was introduced. The command counts the number of
bits set to 1 in a string, with optional range:

BITCOUNT key [start end]

The start/end parameters are similar to GETRANGE. If omitted the whole
string is tested.

Population counting is more useful when bit-level operations like AND,
OR and XOR are avaialble. For instance I can test multiple users to see
the number of days three users visited the site at the same time. To do
this we can take the AND of all the bitmaps, and then count the set bits.

For this reason the BITOP command was introduced:

BITOP [AND|OR|XOR|NOT] dest_key src_key1 src_key2 src_key3 ... src_keyN

In the special case of NOT (that inverts the bits) only one source key
can be passed.

The judicious use of BITCOUNT and BITOP combined can lead to interesting
use cases with very space efficient representation of data.

The implementation provided is still not tested and optimized for speed,
next commits will introduce unit tests. Later the implementation will be
profiled to see if it is possible to gain an important amount of speed
without making the code much more complex.

0bd6d68e

Add aof_rewrite_buffer_length INFO field. · 6f05a653

由 antirez 提交于 5月 24, 2012

The INFO output, persistence section, already contained the field
describing the size of the current AOF buffer to flush on disk. However
the other AOF buffer, used to accumulate changes during an AOF rewrite,
was not mentioned in the INFO output.

This commit introduces a new field called aof_rewrite_buffer_length with
the length of the rewrite buffer.

6f05a653

Allow an AOF rewrite buffer > 2GB (Fix for issue #504). · 47ca4b6e

由 antirez 提交于 5月 22, 2012

During the AOF rewrite process, the parent process needs to accumulate
the new writes in an in-memory buffer: when the child will terminate the
AOF rewriting process this buffer (that ist the difference between the
dataset when the rewrite was started, and the current dataset) is
flushed to the new AOF file.

We used to implement this buffer using an sds.c string, but sds.c has a
2GB limit. Sometimes the dataset can be big enough, the amount of writes
so high, and the rewrite process slow enough that we overflow the 2GB
limit, causing a crash, documented on github by issue #504.

In order to prevent this from happening, this commit introduces a new
system to accumulate writes, implemented by a linked list of blocks of
10 MB each, so that we also avoid paying the reallocation cost.

Note that theoretically modern operating systems may implement realloc()
simply as a remaping of the old pages, thus with very good performances,
see for instance the mremap() syscall on Linux. However this is not
always true, and jemalloc by default avoids doing this because there are
issues with the current implementation of mremap().

For this reason we are using a linked list of blocks instead of a single
block that gets reallocated again and again.

The changes in this commit lacks testing, that will be performed before
merging into the unstable branch. This fix will not enter 2.4 because it
is too invasive. However 2.4 will log a warning when the AOF rewrite
buffer is near to the 2GB limit.

47ca4b6e

Dead code removed from replication.c. · ef379976

由 antirez 提交于 5月 24, 2012

The user @jokea noticed that the following line of code into
replication.c made little sense:

    addReplySds(slave,sdsempty());

Investigating a bit I found that this was introduced by commit 6208b3a7
three years ago in the early stages of Redis. The code apparently is not
useful at all, so I'm removing it.

This change will not be backported into 2.4 so that in the rare case
this should introduce a bug, we'll have a chance to detect it into the
development branch. However following the code path it seems like the
code is not useful at all, so the risk is truly small.

ef379976

23 5月, 2012 2 次提交

J

Set fd to writable when poll(2) detects POLLERR or POLLHUP event. · e150ce3c
由 jokea 提交于 5月 23, 2012

e150ce3c

Fixed issue #516 (ZINTERSTORE mixing sets and zsets). · 348ee1a4

由 antirez 提交于 5月 23, 2012

Weeks ago trying to fix an harmless GCC warning I introduced a bug in
the ziplist-encoded implementations of sorted sets.

The bug completely broke zuiNext() iterator, that is used in the
ZINTERSTORE and ZUNIONSTORE implementation, so those two commands are no
longer reliable starting from Redis version 2.4.12 and latest 2.6.0-RC
releases.

This commit fixes the problem and adds a regression test.

348ee1a4

22 5月, 2012 1 次提交

Redis test: include bug report on crash. · 2bcd18a2

由 antirez 提交于 5月 22, 2012

Due to a change in the format of the bug report in case of crash of
failed assertion the test suite was no longer able to properly log it.
Instead just a protocol error was logged by the Redis TCL client that
provided no clue about the actual problem.

This commit resolves the issue by logging everything from the first line
of the log including the string REDIS BUG REPORT, till the end of the
file.

2bcd18a2

21 5月, 2012 2 次提交

Use comments to split aof.c into sections. · 5a559993

由 antirez 提交于 5月 21, 2012

This makes the code more readable, it is still not the case to split the
file itself into three different files, but the logical separation
improves the readability especially since new commits are going to
introduce an additional section.

5a559993

TODO file removed. · d44a36e0

由 antirez 提交于 5月 20, 2012

The list of things to do is since long time in two places:

1) Github issues.
2) I've a private TOOD list of random ideas, what makes sense is later
moved to github issues. So github is anyway the true source of things to
do.

d44a36e0

16 5月, 2012 2 次提交

Deleted jemalloc.orig from /deps. · d165f785

由 antirez 提交于 5月 16, 2012

In the commit upgrading jemalloc to version 3.0.0 I added the old
version of Jemalloc in the 'jemalloc.orig' directory for an error.
This commit removes the not useful version of jemalloc.

d165f785

Jemalloc updated to 3.0.0. · ad4c0b41

由 antirez 提交于 5月 15, 2012

Full changelog here:

http://www.canonware.com/cgi-bin/gitweb.cgi?p=jemalloc.git;a=blob_plain;f=ChangeLog;hb=master

Notable improvements from the point of view of Redis:

1) Bugfixing.
2) Support for Valgrind.
3) Support for OSX Lion, FreeBSD.

ad4c0b41

15 5月, 2012 4 次提交
- P
  
  Whitespace · 3c72c94a
  由 Pieter Noordhuis 提交于 5月 14, 2012
  
  3c72c94a
- D
  
  use port_getn instead of port_get · 6d312770
  由 Dave Pacheco 提交于 3月 26, 2012
  
  6d312770
- D
  
  first cut at event port support · 05da63da
  由 Dave Pacheco 提交于 3月 26, 2012
  
  05da63da
- P
  
  Fix `install` target on OSX (see #495) · f16d090c
  由 Pieter Noordhuis 提交于 5月 04, 2012
  
  f16d090c
14 5月, 2012 3 次提交

Added time.h include in redis-cli. · e9f0419c

由 antirez 提交于 5月 14, 2012

redis-cli.c uses the time() function to seed the PRNG, but time.h was
not included. This was not noticed since sys/time.h is included and was
enough in most systems (but not correct). With Ubuntu 12.04 GCC
generates a warning that made us aware of the issue.

e9f0419c

activeExpireCycle(): better precision in max time used. · b3624f5a

由 antirez 提交于 5月 14, 2012

activeExpireCycle() can consume no more than a few milliseconds per
iteration. This commit improves the precision of the check for the time
elapsed in two ways:

1) We check every 16 iterations instead of the main loop instead of 256.
2) We reset iterations at the start of the function and not every time
   we switch to the next database, so the check is correctly performed
   every 16 iterations.

b3624f5a

Impovements for: Redis timer, hashes rehashing, keys collection. · 61daf891

由 antirez 提交于 5月 13, 2012

A previous commit introduced REDIS_HZ define that changes the frequency
of calls to the serverCron() Redis function. This commit improves
different related things:

1) Software watchdog: now the minimal period can be set according to
REDIS_HZ. The minimal period is two times the timer period, that is:

    (1000/REDIS_HZ)*2 milliseconds

2) The incremental rehashing is now performed in the expires dictionary
as well.

3) The activeExpireCycle() function was improved in different ways:

- Now it checks if it already used too much time using microseconds
  instead of milliseconds for better precision.
- The time limit is now calculated correctly, in the previous version
  the division was performed before of the multiplication resulting in
  a timelimit of 0 if HZ was big enough.
- Databases with less than 1% of buckets fill in the hash table are
  skipped, because getting random keys is too expensive in this
  condition.

4) tryResizeHashTables() is now called at every timer call, we need to
   match the number of calls we do to the expired keys colleciton cycle.

5) REDIS_HZ was raised to 100.

61daf891

13 5月, 2012 1 次提交

Redis timer interrupt frequency configurable as REDIS_HZ. · 94343492

由 antirez 提交于 5月 13, 2012

Redis uses a function called serverCron() that is very similar to the
timer interrupt of an operating system. This function is used to handle
a number of asynchronous things, like active expired keys collection,
clients timeouts, update of statistics, things related to the cluster
and replication, triggering of BGSAVE and AOF rewrite process, and so
forth.

In the past the timer was called 1 time per second. At some point it was
raised to 10 times per second, but it still was fixed and could not be
changed even at compile time, because different functions called from
serverCron() assumed a given fixed frequency.

This commmit makes the frequency configurable, so that it is simpler to
pick a good tradeoff between overhead of this function (that is usually
very small) and the responsiveness of Redis during a few critical
circumstances where a lot of work is done inside the timer.

An example of such a critical condition is mass-expire of a lot of keys
in the same second. Up to a given percentage of CPU time is used to
perform expired keys collection per expire cylce. Now changing the
REDIS_HZ macro it is possible to do less work but more times per second
in order to block the server for less time.

If this patch will work well in our tests it will enter Redis 2.6-final.

94343492

12 5月, 2012 2 次提交

A

Comment improved so that the code goal is more clear. Thx to @agladysh. · f333788f
由 antirez 提交于 5月 11, 2012

f333788f

More incremental active expired keys collection process. · 1dcc95d0

由 antirez 提交于 5月 11, 2012

If a large amonut of keys are all expiring about at the same time, the
"active" expired keys collection cycle used to block as far as the
percentage of already expired keys was >= 25% of the total population of
keys with an expire set.

This could block the server even for many seconds in order to reclaim
memory ASAP. The new algorithm uses at max a small amount of
milliseconds per cycle, even if this means reclaiming the memory less
promptly it also means a more responsive server.

1dcc95d0

11 5月, 2012 3 次提交
- A
  
  If the computer running the Redis test is slow, we revert to --clients 1 to avoid false positives. · 27737964
  由 antirez 提交于 5月 11, 2012
  
  27737964
- A
  
  redis-cli pipe mode: handle EINTR properly as well so that SIGSTOP/SIGCONT are handled correctly. · ea66be60
  由 antirez 提交于 5月 11, 2012
  
  ea66be60
- A
  
  redis-cli pipe mode: handle EAGAIN while writing to socket. · f6bd9122
  由 antirez 提交于 5月 11, 2012
  
  f6bd9122
10 5月, 2012 1 次提交
- A
  
  Fix PREFIX typo in Makefile. · 8c630146
  由 antirez 提交于 5月 09, 2012
  
  8c630146
09 5月, 2012 1 次提交
- A
  
  Allow PREFIX to be overrided in Makefile. · b26808a5
  由 antirez 提交于 5月 09, 2012
  
  b26808a5