- 29 5月, 2018 1 次提交
-
-
由 antirez 提交于
Close #4947.
-
- 23 5月, 2018 4 次提交
-
-
由 antirez 提交于
-
由 antirez 提交于
While this feature is not used by Redis, ae.c implements the ability for a timer to call a finalizer callback when an timer event is deleted. This feature was bugged since the start, and because it was never used we never noticed a problem. However Anthony LaTorre was using the same library in order to implement a different system: he found a bug that he describes as follows, and which he fixed with the patch in this commit, sent me by private email: --- Anthony email --- 've found one bug in the current implementation of the timed events. It's possible to lose track of a timed event if an event is added in the finalizerProc of another event. For example, suppose you start off with three timed events 1, 2, and 3. Then the linked list looks like: 3 -> 2 -> 1 Then, you run processTimeEvents and events 2 and 3 finish, so now the list looks like: -1 -> -1 -> 2 Now, on the next iteration of processTimeEvents it starts by deleting the first event, and suppose this finalizerProc creates a new event, so that the list looks like this: 4 -> -1 -> 2 On the next iteration of the while loop, when it gets to the second event, the variable prev is still set to NULL, so that the head of the event loop after the next event will be set to 2, i.e. after deleting the next event the event loop will look like: 2 and the event with id 4 will be lost. I've attached an example program to illustrate the issue. If you run it you will see that it prints: ``` foo id = 0 spam! ``` But if you uncomment line 29 and run it again it won't print "spam!". --- End of email --- Test.c source code is as follows: #include "ae.h" #include <stdio.h> aeEventLoop *el; int foo(struct aeEventLoop *el, long long id, void *data) { printf("foo id = %lld\n", id); return AE_NOMORE; } int spam(struct aeEventLoop *el, long long id, void *data) { printf("spam!\n"); return AE_NOMORE; } void bar(struct aeEventLoop *el, void *data) { aeCreateTimeEvent(el, 0, spam, NULL, NULL); } int main(int argc, char **argv) { el = aeCreateEventLoop(100); //aeCreateTimeEvent(el, 0, foo, NULL, NULL); aeCreateTimeEvent(el, 0, foo, NULL, bar); aeMain(el); return 0; } Anthony fixed the problem by using a linked list for the list of timers, and sent me back this patch after he tested the code in production for some time. The code looks sane to me, so committing it to Redis.
-
由 antirez 提交于
See issue #2819 for details. The gist is that when we want to send INFO because we are over the time, we used to send only INFO commands, no longer sending PING commands. However if a master fails exactly when we are about to send an INFO command, the PING times will result zero because the PONG reply was already received, and we'll fail to send more PINGs, since we try only to send INFO commands: the failure detector will delay until the connection is closed and re-opened for "long timeout". This commit changes the logic so that we can send the three kind of messages regardless of the fact we sent another one already in the same code path. It could happen that we go over the message limit for the link by a few messages, but this is not significant. However now we'll not introduce delays in sending commands just because there was something else to send at the same time.
-
由 zhaozhao.zz 提交于
-
- 08 5月, 2018 1 次提交
-
-
由 huijing.whj 提交于
-
- 27 3月, 2018 1 次提交
-
-
由 antirez 提交于
-
- 26 3月, 2018 2 次提交
-
-
由 zhaozhao.zz 提交于
-
由 zhaozhao.zz 提交于
-
- 14 3月, 2018 2 次提交
-
-
由 antirez 提交于
-
由 antirez 提交于
This commit, in some parts derived from PR #3041 which is no longer possible to merge (because the user deleted the original branch), implements the ability of slaves to have a special configuration preventing that they try to start a failover when the master is failing. There are multiple reasons for wanting this, and the feautre was requested in issue #3021 time ago. The differences between this patch and the original PR are the following: 1. The flag is saved/loaded on the nodes configuration. 2. The 'myself' node is now flag-aware, the flag is updated as needed when the configuration is changed via CONFIG SET. 3. The flag name uses NOFAILOVER instead of NO_FAILOVER to be consistent with existing NOADDR. 4. The redis.conf documentation was rewritten. Thanks to @deep011 for the original patch.
-
- 02 3月, 2018 2 次提交
- 01 3月, 2018 3 次提交
- 28 2月, 2018 1 次提交
-
-
由 charsyam 提交于
-
- 27 2月, 2018 5 次提交
-
-
由 antirez 提交于
AE_BARRIER was implemented like: - Fire the readable event. - Do not fire the writabel event if the readable fired. However this may lead to the writable event to never be called if the readable event is always fired. There is an alterantive, we can just invert the sequence of the calls in case AE_BARRIER is set. This commit does that.
-
由 antirez 提交于
In case the write handler is already installed, it could happen that we serve the reply of a query in the same event loop cycle we received it, preventing beforeSleep() from guaranteeing that we do the AOF fsync before sending the reply to the client. The AE_BARRIER mechanism, introduced in a previous commit, prevents this problem. This commit makes actual use of this new feature to fix the bug.
-
由 antirez 提交于
Add AE_BARRIER to the writable event loop so that slaves requesting votes can't be served before we re-enter the event loop in the next iteration, so clusterBeforeSleep() will fsync to disk in time. Also add the call to explicitly fsync, given that we modified the last vote epoch variable.
-
由 antirez 提交于
AOF fsync=always, and certain Redis Cluster bus operations, require to fsync data on disk before replying with an acknowledge. In such case, in order to implement Group Commits, we want to be sure that queries that are read in a given cycle of the event loop, are never served to clients in the same event loop iteration. This way, by using the event loop "before sleep" callback, we can fsync the information just one time before returning into the event loop for the next cycle. This is much more efficient compared to calling fsync() multiple times. Unfortunately because of a bug, this was not always guaranteed: the actual way the events are installed was the sole thing that could control. Normally this problem is hard to trigger when AOF is enabled with fsync=always, because we try to flush the output buffers to the socekt directly in the beforeSleep() function of Redis. However if the output buffers are full, we actually install a write event, and in such a case, this bug could happen. This change to ae.c modifies the event loop implementation to make this concept explicit. Write events that are registered with: AE_WRITABLE|AE_BARRIER Are guaranteed to never fire after the readable event was fired for the same file descriptor. In this way we are sure that data is persisted to disk before the client performing the operation receives an acknowledged. However note that this semantics does not provide all the guarantees that one may believe are automatically provided. Take the example of the blocking list operations in Redis. With AOF and fsync=always we could have: Client A doing: BLPOP myqueue 0 Client B doing: RPUSH myqueue a b c In this scenario, Client A will get the "a" elements immediately after the Client B RPUSH will be executed, even before the operation is persisted. However when Client B will get the acknowledge, it can be sure that "b,c" are already safe on disk inside the list. What to note here is that it cannot be assumed that Client A receiving the element is a guaranteed that the operation succeeded from the point of view of Client B. This is due to the fact that the barrier exists within the same socket, and not between different sockets. However in the case above, the element "a" was not going to be persisted regardless, so it is a pretty synthetic argument.
-
由 antirez 提交于
-
- 19 2月, 2018 1 次提交
-
-
由 antirez 提交于
This commit adds two new fields in the INFO output, stats section: expired_stale_perc:0.34 expired_time_cap_reached_count:58 The first field is an estimate of the number of keys that are yet in memory but are already logically expired. They reason why those keys are yet not reclaimed is because the active expire cycle can't spend more time on the process of reclaiming the keys, and at the same time nobody is accessing such keys. However as the active expire cycle runs, while it will eventually have to return to the caller, because of time limit or because there are less than 25% of keys logically expired in each given database, it collects the stats in order to populate this INFO field. Note that expired_stale_perc is a running average, where the current sample accounts for 5% and the history for 95%, so you'll see it changing smoothly over time. The other field, expired_time_cap_reached_count, counts the number of times the expire cycle had to stop, even if still it was finding a sizeable number of keys yet to expire, because of the time limit. This allows people handling operations to understand if the Redis server, during mass-expiration events, is able to collect keys fast enough usually. It is normal for this field to increment during mass expires, but normally it should very rarely increment. When instead it constantly increments, it means that the current workloads is using a very important percentage of CPU time to expire keys. This feature was created thanks to the hints of Rashmi Ramesh and Bart Robinson from Twitter. In private email exchanges, they noted how it was important to improve the observability of this parameter in the Redis server. Actually in big deployments, the amount of keys that are yet to expire in each server, even if they are logically expired, may account for a very big amount of wasted memory.
-
- 16 2月, 2018 13 次提交
-
-
由 antirez 提交于
-
由 antirez 提交于
-
由 Dvir Volk 提交于
-
由 Dvir Volk 提交于
-
由 Dvir Volk 提交于
-
由 Dvir Volk 提交于
-
由 Dvir Volk 提交于
-
由 Dvir Volk 提交于
-
由 Dvir Volk 提交于
-
由 Dvir Volk 提交于
-
由 Dvir Volk 提交于
-
由 antirez 提交于
See #3832.
-
由 oranagra 提交于
since slave isn't replying to it's master, these errors go unnoticed. since we don't expect the master to send garbadge to the slave, this should be safe. (as long as we don't log OOM errors there)
-
- 13 2月, 2018 4 次提交
-
-
由 charsyam 提交于
-
由 Guy Benoish 提交于
-
由 antirez 提交于
See #3858.
-
由 Guy Benoish 提交于
It is possible to do BGREWRITEAOF even if appendonly=no. This is by design. stopAppendonly() didn't turn off aof_rewrite_scheduled (it can be turned on again by BGREWRITEAOF even while appendonly is off anyway). After configuring `appendonly yes` it will see that the state is AOF_OFF, there's no RDB fork, so it will do rewriteAppendOnlyFileBackground() which will fail since the aof_child_pid is set (was scheduled and started by cron). Solution: stopAppendonly() will turn off the schedule flag (regardless of who asked for it). startAppendonly() will terminate any existing fork and start a new one (so it is the most recent).
-