1. 02 Mar, 2018 (2 commits)
  2. 01 Mar, 2018 (3 commits)
  3. 28 Feb, 2018 (1 commit)
  4. 27 Feb, 2018 (5 commits)
    • ae.c: instead of not firing, on AE_BARRIER invert the sequence. · 1e2f0d69
      antirez committed
      AE_BARRIER was implemented like this:
      
          - Fire the readable event.
          - Do not fire the writable event if the readable event fired.
      
      However, this may lead to the writable event never being called if the
      readable event always fires. There is an alternative: we can simply
      invert the sequence of the calls when AE_BARRIER is set, as sketched
      below. This commit does that.
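      A minimal, self-contained C sketch of the inverted firing order (the
      FileEvent struct and handler names are illustrative stand-ins, not the
      real ae.c API):

          #include <stdio.h>

          #define AE_READABLE 1
          #define AE_WRITABLE 2
          #define AE_BARRIER  4

          typedef struct {
              int mask;              /* Events registered for this fd. */
              void (*rproc)(void);   /* Readable handler. */
              void (*wproc)(void);   /* Writable handler. */
          } FileEvent;

          static void on_read(void)  { printf("readable fired\n"); }
          static void on_write(void) { printf("writable fired\n"); }

          /* Fire the handlers for one ready fd. 'ready' is the mask
           * reported by the poller. With AE_BARRIER set, the readable
           * handler runs after the writable one, rather than the writable
           * handler being suppressed. */
          static void fire(FileEvent *fe, int ready) {
              int invert = fe->mask & AE_BARRIER;

              if (!invert && (fe->mask & ready & AE_READABLE)) fe->rproc();
              if (fe->mask & ready & AE_WRITABLE) fe->wproc();
              if (invert && (fe->mask & ready & AE_READABLE)) fe->rproc();
          }

          int main(void) {
              FileEvent fe = { AE_READABLE|AE_WRITABLE|AE_BARRIER,
                               on_read, on_write };
              fire(&fe, AE_READABLE|AE_WRITABLE); /* writable fires first */
              return 0;
          }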
    • AOF: fix a bug that may prevent proper fsyncing when fsync=always. · b2e4aad9
      antirez committed
      When the write handler is already installed, it can happen that we
      serve the reply to a query in the same event loop cycle in which we
      received it, preventing beforeSleep() from guaranteeing that the AOF
      fsync is done before the reply is sent to the client.
      
      The AE_BARRIER mechanism, introduced in a previous commit, prevents
      this problem. This commit makes actual use of the new feature to fix
      the bug, as sketched below.
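      A hedged sketch (simplified flags and names, not the exact
      networking.c code) of how the fix applies AE_BARRIER: when the fsync
      policy is "always", the write handler is installed with the barrier
      flag, so a reply can never leave in the same iteration that read its
      query, and it waits for the next beforeSleep() AOF fsync first:

          #include <stdio.h>

          #define AE_WRITABLE      2
          #define AE_BARRIER       4
          #define AOF_FSYNC_ALWAYS 1

          /* Pick the flags used to install the write handler for a client. */
          static int write_event_flags(int aof_fsync_policy) {
              int flags = AE_WRITABLE;
              if (aof_fsync_policy == AOF_FSYNC_ALWAYS)
                  flags |= AE_BARRIER; /* reply waits for the next fsync */
              return flags;
          }

          int main(void) {
              printf("fsync=always: flags=%d\n",
                     write_event_flags(AOF_FSYNC_ALWAYS));
              printf("otherwise:    flags=%d\n", write_event_flags(0));
              return 0;
          }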
    • Cluster: improve crash-recovery safety after failover auth vote. · 93bad8ae
      antirez committed
      Add AE_BARRIER to the writable event so that slaves requesting votes
      can't be served before we re-enter the event loop in the next
      iteration; this way clusterBeforeSleep() will fsync to disk in time.
      Also add an explicit fsync call, given that we modified the last vote
      epoch variable (see the sketch below).
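      A minimal POSIX sketch (the file name and format are invented for
      illustration, not the real cluster config handling) of the durability
      rule this commit enforces: the updated last vote epoch must be
      fsynced before the vote reply may be served:

          #include <fcntl.h>
          #include <stdio.h>
          #include <unistd.h>

          /* Persist the vote epoch durably; only after this returns 0 is
           * it safe to let the vote reply leave the socket, so a crash
           * cannot make us vote twice in the same epoch. */
          static int persist_vote_epoch(const char *path,
                                        unsigned long long epoch) {
              int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
              if (fd == -1) return -1;
              dprintf(fd, "lastVoteEpoch %llu\n", epoch);
              if (fsync(fd) == -1) { close(fd); return -1; } /* explicit fsync */
              return close(fd);
          }

          int main(void) {
              if (persist_vote_epoch("vote_epoch.conf", 42) == 0)
                  printf("vote epoch durable; safe to reply\n");
              return 0;
          }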
    • ae.c: introduce the concept of read->write barrier. · e32752e8
      antirez committed
      AOF fsync=always, and certain Redis Cluster bus operations, require
      fsyncing data to disk before replying with an acknowledgment.
      In such cases, in order to implement group commits, we want to be sure
      that queries read in a given cycle of the event loop are never served
      to clients in the same event loop iteration. This way, by using the
      event loop "before sleep" callback, we can fsync the information just
      once before returning into the event loop for the next cycle. This is
      much more efficient than calling fsync() multiple times; the sketch
      below illustrates the idea.
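      A toy sketch of the group-commit pattern (names invented for
      illustration): queries processed in one event loop cycle only append
      to a buffer, and a single fsync in the before-sleep hook covers all
      of them:

          #include <stdio.h>

          static int pending_writes = 0;

          /* Each query only appends; no fsync here. */
          static void process_query(const char *q) {
              printf("append to AOF buffer: %s\n", q);
              pending_writes++;
          }

          /* Runs once per event loop cycle, before replies are sent. */
          static void before_sleep(void) {
              if (pending_writes) {
                  printf("fsync once, covering %d queries\n", pending_writes);
                  pending_writes = 0;
              }
          }

          int main(void) {
              /* One event loop cycle: many queries, one fsync. */
              process_query("RPUSH myqueue a");
              process_query("RPUSH myqueue b");
              before_sleep();
              return 0;
          }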
      
      Unfortunately, because of a bug, this was not always guaranteed: the
      only thing determining the firing order was the way the events
      happened to be installed. Normally this problem is hard to trigger
      when AOF is enabled with fsync=always, because we try to flush the
      output buffers to the socket directly in the beforeSleep() function
      of Redis. However, if the output buffers are full, we actually
      install a write event, and in that case this bug could happen.
      
      This change to ae.c modifies the event loop implementation to make
      this concept explicit. Write events that are registered with:
      
          AE_WRITABLE|AE_BARRIER
      
      are guaranteed to never fire after the readable event has fired for
      the same file descriptor in the same iteration. In this way we are
      sure that data is persisted to disk before the client performing the
      operation receives an acknowledgment.
      
      However, note that these semantics do not provide all the guarantees
      one may believe are automatically provided. Take the example of the
      blocking list operations in Redis.
      
      With AOF and fsync=always we could have:
      
          Client A doing: BLPOP myqueue 0
          Client B doing: RPUSH myqueue a b c
      
      In this scenario, Client A will get the "a" element immediately after
      Client B's RPUSH is executed, even before the operation is persisted.
      However, when Client B gets the acknowledgment, it can be sure that
      "b,c" are already safe on disk inside the list.
      
      What to note here is that Client A receiving the element cannot be
      taken as a guarantee that the operation succeeded from the point of
      view of Client B.
      
      This is because the barrier exists within the same socket, and not
      between different sockets. However, in the case above the element "a"
      was not going to be persisted regardless, so it is a rather synthetic
      argument.
    • Fix ziplist prevlen encoding description. See #4705. · 262f4039
      antirez committed
  5. 19 Feb, 2018 (1 commit)
    • Track number of logically expired keys still in memory. · 83923afa
      antirez committed
      This commit adds two new fields to the INFO output, stats section:
      
          expired_stale_perc:0.34
          expired_time_cap_reached_count:58
      
      The first field is an estimate of the fraction of keys that are still
      in memory but are already logically expired. The reason why those
      keys have not yet been reclaimed is that the active expire cycle
      can't spend more time on the reclaiming process, and at the same time
      nobody is accessing them. As the active expire cycle runs, it will
      eventually have to return to the caller, because of the time limit or
      because fewer than 25% of the keys in each given database are
      logically expired; while it runs, it collects the stats used to
      populate this INFO field.
      
      Note that expired_stale_perc is a running average, where the current
      sample accounts for 5% and the history for 95%, so you'll see it
      changing smoothly over time (see the sketch below).
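      A tiny C sketch of that smoothing (the sample values are invented;
      the 5%/95% weights come from the commit message):

          #include <stdio.h>

          int main(void) {
              double stale_perc = 0.0;
              double samples[] = { 0.40, 0.35, 0.30, 0.34, 0.33 };
              for (int i = 0; i < 5; i++) {
                  /* Exponential moving average: 5% new sample, 95% history. */
                  stale_perc = samples[i] * 0.05 + stale_perc * 0.95;
                  printf("expired_stale_perc: %.4f\n", stale_perc);
              }
              return 0;
          }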
      
      The other field, expired_time_cap_reached_count, counts the number of
      times the expire cycle had to stop because of the time limit, even
      though it was still finding a sizeable number of keys to expire. This
      lets operators understand whether the Redis server, during
      mass-expiration events, is able to collect keys fast enough. It is
      normal for this field to increment during mass expires, but otherwise
      it should increment very rarely. When instead it increments
      constantly, it means that the current workload is spending a
      significant percentage of CPU time expiring keys.
      
      This feature was created thanks to hints from Rashmi Ramesh and Bart
      Robinson of Twitter. In private email exchanges, they noted how
      important it was to improve the observability of this parameter in
      the Redis server. In big deployments, the keys that are logically
      expired but not yet reclaimed in each server may account for a very
      large amount of wasted memory.
  6. 16 Feb, 2018 (13 commits)
  7. 13 Feb, 2018 (5 commits)
  8. 03 Feb, 2018 (1 commit)
  9. 02 Feb, 2018 (1 commit)
  10. 24 Jan, 2018 (5 commits)
  11. 18 Jan, 2018 (3 commits)
    • Fix migrateCommand() access of an uninitialized byte. · 4acd6973
      antirez committed
    • Replication buffer fills up on high rate traffic. · 548e4fe0
      Guy Benoish committed
      When feeding the master with high-rate traffic, the slave's feed is
      much slower. This causes the replication buffer to grow
      (indefinitely), which leads to slave disconnection. The problem is
      that writeToClient() decides to stop writing after
      NET_MAX_WRITES_PER_EVENT writes (in order to be fair to clients). We
      should ignore this limit when the client is a slave: it's better if
      ordinary clients wait longer, since the alternative is that the slave
      has no chance to stay in sync in this situation. A sketch of the
      check follows.
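      An illustrative sketch (the constant value and flag layout are
      simplified stand-ins for Redis' networking.c) of the fairness cap
      with the slave exemption:

          #include <stdio.h>

          #define NET_MAX_WRITES_PER_EVENT (1024*64)
          #define CLIENT_SLAVE (1<<0)

          typedef struct { int flags; } client;

          /* Return 1 if writeToClient() should yield back to the event
           * loop; a slave link is never throttled, or the replication
           * buffer can grow without bound under high write rates. */
          static int should_stop_writing(client *c, long long totwritten) {
              return totwritten > NET_MAX_WRITES_PER_EVENT &&
                     !(c->flags & CLIENT_SLAVE);
          }

          int main(void) {
              client normal = { 0 }, slave = { CLIENT_SLAVE };
              printf("normal stops: %d\n", should_stop_writing(&normal, 1<<20));
              printf("slave stops:  %d\n", should_stop_writing(&slave, 1<<20));
              return 0;
          }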
    • Cluster: improve anti-affinity algo in redis-trib.rb. · efa7063c
      antirez committed
      See #3462 and related PRs.
      
      We use a simple algorithm to calculate the level of anti-affinity
      violation, and then an optimizer that performs random swaps until
      things improve; a sketch of the idea follows.
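      A self-contained toy sketch (the data model is invented: shard[i]
      groups a master with its replicas, host[i] is the machine node i runs
      on) of scoring colocations and hill-climbing with random swaps:

          #include <stdio.h>
          #include <stdlib.h>
          #include <time.h>

          #define NODES 6

          /* Count pairs of nodes from the same shard on the same host. */
          static int score(const int host[NODES], const int shard[NODES]) {
              int s = 0;
              for (int i = 0; i < NODES; i++)
                  for (int j = i + 1; j < NODES; j++)
                      if (shard[i] == shard[j] && host[i] == host[j]) s++;
              return s;
          }

          int main(void) {
              int shard[NODES] = { 0, 0, 1, 1, 2, 2 };
              int host[NODES]  = { 0, 0, 1, 1, 2, 2 }; /* all pairs colocated */
              srand((unsigned)time(NULL));

              int best = score(host, shard);
              for (int iter = 0; iter < 1000 && best > 0; iter++) {
                  int a = rand() % NODES, b = rand() % NODES;
                  int tmp = host[a]; host[a] = host[b]; host[b] = tmp;
                  int s = score(host, shard);
                  if (s < best) best = s;  /* keep an improving swap */
                  else { tmp = host[a]; host[a] = host[b]; host[b] = tmp; }
              }
              printf("remaining violations: %d\n", best);
              return 0;
          }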