1. 11 Mar, 2014 1 commit
    • M
      Fix "can't bind to address" error reporting. · 6f4b5ef6
      Matt Stancliff committed
      Report the actual port used for the listening attempt instead of
      server.port.
      
      Originally, Redis would just listen on server.port.
      But, with clustering, Redis uses a Cluster Port too,
      so we can't say server.port is always where we are listening.
      
      If you tried to launch Redis with a too-high port number (any
      port where Port+10000 > 65535), Redis would refuse to start, but
      only print an error saying it can't connect to the Redis port.
      
      This patch fixes much confusion.
      6f4b5ef6
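The failure mode above comes from the second listening port being derived from the first. A minimal sketch of that arithmetic, assuming the offset of 10000 stated in the commit message; the helper names are hypothetical and this is not Redis's own code:

```python
CLUSTER_PORT_OFFSET = 10000  # offset stated in the commit message
MAX_TCP_PORT = 65535


def listening_ports(port, cluster_enabled):
    """Return every port the server would try to bind (hypothetical helper)."""
    ports = [port]
    if cluster_enabled:
        ports.append(port + CLUSTER_PORT_OFFSET)
    return ports


def first_unbindable_port(port, cluster_enabled):
    """Return the first port outside the valid TCP range, or None."""
    for candidate in listening_ports(port, cluster_enabled):
        if not 1 <= candidate <= MAX_TCP_PORT:
            return candidate
    return None
```

Reporting the value returned by `first_unbindable_port` rather than the configured port is the kind of message the commit describes: the error names the port that actually failed.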
  2. 05 Mar, 2014 8 commits
    • A
      4d5ba596
    • A
      Sentinel: more aggressive failover start desynchronization. · 313f8831
      antirez committed
      Sentinel needs to avoid split brain conditions due to multiple sentinels
      trying to get voted at the exact same time.
      
      So far some desynchronization was provided by fluctuating server.hz,
      that is the frequency of the timer function call. However the
      desynchronization provided in this way was not enough when using many
      Sentinel instances, especially when a large quorum value is used in
      order to force a greater degree of agreement (more than N/2+1).
      
      It was verified that it was likely to trigger a split brain
      condition, forcing the system to try again after a timeout.
      Usually the system will succeed after a few retries, but this is not
      optimal.
      
      This commit desynchronizes instances in a more effective way to make it
      likely that the first attempt will be successful.
      313f8831
    • A
      CONFIG REWRITE should be logged at WARNING level. · 5ee23944
      antirez committed
      5ee23944
    • A
      Cluster: invalidate current transaction on redirections. · d2e16801
      antirez committed
      d2e16801
    • A
      Document why we update peak memory in INFO. · 1d9eb47f
      antirez committed
      1d9eb47f
    • A
      Fix configEpoch assignment when a cluster slot gets "closed". · e4833ed8
      antirez committed
      This is still code to rework in order to use agreement to obtain a new
      configEpoch when a slot is migrated, however this commit handles the
      special case that happens when the nodes are just started and everybody
      has a configEpoch of 0. In this special condition, having the maximum
      configEpoch is not enough, as the special epoch 0 is not unique (all the
      others are).
      
      This does not fix the intrinsic race condition of a failover happening
      while we are resharding, that will be addressed later.
      e4833ed8
    • M
      Force INFO used_memory_peak to match peak memory · 7c092b67
      Matt Stancliff committed
      used_memory_peak only updates in serverCron every server.hz,
      but Redis can use more memory and a user can request memory
      INFO before used_memory_peak gets updated in the next
      cron run.
      
      This patch updates used_memory_peak to the current
      memory usage if the current memory usage is higher
      than the recorded used_memory_peak value.
      
      (And it only calls zmalloc_used_memory() once instead of
      twice as it was doing before.)
      7c092b67
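The reporting rule in this commit can be sketched as follows. This is an illustration only: `read_used_memory` is a hypothetical stand-in for `zmalloc_used_memory()`, and `recorded_peak` stands for the value last stored by the periodic cron run.

```python
def info_memory_fields(read_used_memory, recorded_peak):
    """Return (used_memory, used_memory_peak) for an INFO-style report."""
    used = read_used_memory()         # read the allocator counter once, reuse it
    peak = max(recorded_peak, used)   # the peak never reports below current usage
    return used, peak
```

For example, with a recorded peak of 100 and current usage of 150, the reported peak becomes 150 instead of waiting for the next cron run.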
    • M
      Improved bigkeys with progress, pipelining and summary · 23addbb5
      michael-grunder committed
      This commit reworks the redis-cli --bigkeys command to provide more
      information about our progress as well as output summary information
      when we're done.
      
       - We now show an approximate percentage completion as we go
       - Hiredis pipelining is used for TYPE and SIZE retrieval
       - A summary of keyspace distribution and overall breakout at the end
      23addbb5
  3. 28 Feb, 2014 1 commit
  4. 27 Feb, 2014 5 commits
    • A
      warnigns -> warnings in redisBitpos(). · 9104f1e6
      antirez committed
      9104f1e6
    • A
      More consistent BITPOS behavior with bit=0 and ranges. · eacc0951
      antirez committed
      With the new behavior it is possible to specify just the start in the
      range (the end will be assumed to be the last byte), or it is possible
      to specify both start and end.
      
      This is useful to change the behavior of the command when looking for
      zeros inside a string.
      
      1) If the user specifies both start and end, and no 0 is found inside
         the range, the command returns -1.
      
      2) If instead no range is specified, or just the start is given, even
         if in the actual string no 0 bit is found, the command returns the
         first bit on the right after the end of the string.
      
      So for example if the string stored at key foo is "\xff\xff":
      
          BITPOS foo 0 (returns 16)
          BITPOS foo 0 0 -1 (returns -1)
          BITPOS foo 0 0 (returns 16)
      
      The idea is that when no end is given the user is just looking for the
      first bit that is zero and can be set to 1 with SETBIT, as it is
      "available". Instead when a specific range is given, we just look for a
      zero within the boundaries of the range.
      eacc0951
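The two rules for zero bits can be sketched in a few lines. This is a simplified model rather than the Redis implementation: the function name is hypothetical, bit 0 is taken to be the most significant bit of the first byte, negative indexes count from the end of the string, and the empty-string case is not modelled.

```python
def bitpos_zero(data, start=None, end=None):
    """Position of the first 0 bit in `data` (bytes), following the rules above."""
    end_given = end is not None
    length = len(data)
    start = 0 if start is None else start
    end = length - 1 if end is None else end
    # Negative indexes count from the end of the string.
    if start < 0:
        start = max(length + start, 0)
    if end < 0:
        end = max(length + end, 0)
    end = min(end, length - 1)
    if start > end:
        return -1  # an empty range contains no bits at all
    for byte_index in range(start, end + 1):
        byte = data[byte_index]
        for bit in range(8):
            if not (byte >> (7 - bit)) & 1:
                return byte_index * 8 + bit
    # Rule 1: an explicit end means "look only inside the range".
    # Rule 2: otherwise report the first bit just past the string.
    return -1 if end_given else length * 8
```

With `data = b"\xff\xff"`, calling `bitpos_zero(data)` or `bitpos_zero(data, 0)` gives 16, while `bitpos_zero(data, 0, -1)` gives -1, matching the example in the commit message.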
    • A
      Initial implementation of BITPOS. · 1f8005ca
      antirez committed
      It appears to work, but more stress testing, along with both unit tests and
      fuzz testing, is needed in order to ensure the implementation is sane.
      1f8005ca
    • A
      Fix misaligned word access in redisPopcount(). · 24265edb
      antirez committed
      24265edb
    • M
      Fix IP representation in clusterMsgDataGossip · 7e274194
      Matt Stancliff committed
      7e274194
  5. 25 Feb, 2014 13 commits
  6. 20 Feb, 2014 1 commit
  7. 18 Feb, 2014 4 commits
    • A
      Sentinel: SENTINEL_SLAVE_RECONF_RETRY_PERIOD -> RECONF_TIMEOUT · 6e4662e4
      antirez committed
      Rename define to match the new meaning.
      6e4662e4
    • A
      Sentinel: fix slave promotion timeout. · bd31fcf1
      antirez committed
      If we can't reconfigure a slave in time during failover, go forward
      anyway: the slave will be fixed by Sentinels in the future, once they
      detect it is misconfigured.
      
      Otherwise a failover in progress may never terminate if for some reason
      the slave is unable to sync with the master while at the same time
      it is not disconnected.
      bd31fcf1
    • A
      Get absolute config file path before processing 'dir'. · 58b6dd9b
      antirez committed
      The code tried to obtain the configuration file absolute path after
      processing the configuration file. However if config file was a relative
      path and a "dir" statement was processed while reading the config, the absolute
      path obtained was wrong.
      
      With this fix the absolute path is obtained before processing the
      configuration while the server is still in the original directory where
      it was executed.
      58b6dd9b
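The ordering issue is easy to reproduce in any language: a relative path only resolves correctly while the process is still in its original working directory. A minimal sketch, with hypothetical function names, where `os.chdir` stands in for the effect of a "dir" statement:

```python
import os


def resolve_config_path(config_path):
    """Resolve `config_path` against the current working directory."""
    return os.path.abspath(config_path)


def start_server(config_path, configured_dir):
    """Hypothetical startup: resolve the config path first, then apply 'dir'."""
    absolute_config = resolve_config_path(config_path)  # before any chdir
    os.chdir(configured_dir)                            # effect of a "dir" statement
    return absolute_config
```

Swapping the two statements in `start_server` would resolve the relative name against `configured_dir` instead, which is the wrong path described above.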
    • A
      Sentinel: better specify startup errors due to config file. · c36a5dce
      antirez committed
      Now it logs the file name if it is not accessible. Also there is a
      different error for the missing config file case, and for the non
      writable file case.
      c36a5dce
  8. 13 Feb, 2014 3 commits
    • A
      Update cached time in rdbLoad() callback. · 3c1672da
      antirez committed
      server.unixtime and server.mstime are cached less precise timestamps
      that we use every time we don't need an accurate time representation and
      a syscall would be too slow for the number of calls we require.
      
      One such example is the initialization and update process of the last
      interaction time with the client, that is used for timeouts.
      
      However rdbLoad() can take some time to load the DB, but at the same
      time it did not update the time during DB loading. This resulted in the
      bug described in issue #1535, where in the replication process the slave
      loads the DB, creates the redisClient representation of its master, but
      the timestamp is so old that the master, under certain conditions, is
      sensed as already "timed out".
      
      Thanks to @yoav-steinberg and Redis Labs Inc for the bug report and
      analysis.
      3c1672da
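The idea of a cached clock that must be refreshed during long operations can be sketched like this. The class and function names are hypothetical, and the refresh interval is an arbitrary choice for illustration, not the one Redis uses.

```python
class CachedClock:
    """Cache a timestamp so hot paths avoid a system call on every read."""

    def __init__(self, now_fn):
        self._now_fn = now_fn
        self.unixtime = now_fn()

    def refresh(self):
        self.unixtime = self._now_fn()


def load_records(records, clock, refresh_every=2):
    """Hypothetical loader that refreshes the cached time while it works."""
    loaded = 0
    for _ in records:
        loaded += 1
        if loaded % refresh_every == 0:
            clock.refresh()  # keep the cached time current during a long load
    return loaded
```

Without the `refresh()` call, anything that reads `clock.unixtime` right after the load would see the time from before loading started, which is how a freshly created master link could look already timed out.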
    • A
      Log when CONFIG REWRITE goes bad. · 116617c5
      antirez committed
      116617c5
    • A
      Fix script cache bug in the scripting engine. · 14143fbe
      antirez committed
      This commit fixes a serious Lua scripting replication issue, described
      by GitHub issue #1549. The root cause of the problem is that scripts
      were put inside the script cache, assuming that slaves and AOF already
      contained them, even if the scripts sometimes produced no changes in the
      data set, and were not actually propagated to AOF/slaves.
      
      Example:
      
          eval "if tonumber(KEYS[1]) > 0 then redis.call('incr', 'x') end" 1 0
      
      Then:
      
          evalsha <sha1 step 1 script> 1 0
      
      At this step sha1 of the script is added to the replication script cache
      (the script is marked as known to the slaves) and EVALSHA command is
      transformed to EVAL. However it is not dirty (there are no changes to the db),
      so it is not propagated to the slaves. Then the script is called again:
      
          evalsha <sha1 step 1 script> 1 1
      
      At this step master checks that the script already exists in the
      replication script cache and doesn't transform it to EVAL command. It is
      dirty and propagated to the slaves, but they fail to evaluate the script
      as they don't have it in the script cache.
      
      The fix is trivial and just uses the new API to force the propagation of
      the executed command regardless of the dirty state of the data set.
      
      Thank you to @minus-infinity on GitHub for finding the issue,
      understanding the root cause, and fixing it.
      14143fbe
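The sequence described in this commit can be replayed with a toy model. Everything here is a simplification made for illustration: the class names are hypothetical, a script is just a string, and the `dirty` flag is passed in by the caller instead of being derived from running Lua.

```python
import hashlib


def script_sha1(body):
    return hashlib.sha1(body.encode()).hexdigest()


class Replica:
    def __init__(self):
        self.scripts = set()  # digests of scripts this replica has seen

    def apply(self, command, payload):
        if command == "EVAL":
            self.scripts.add(script_sha1(payload))
            return True
        return payload in self.scripts  # EVALSHA fails for unknown digests


class Master:
    def __init__(self, replica, force_propagation):
        self.replica = replica
        self.force_propagation = force_propagation
        self.scripts = {}        # digest -> body, the local script cache
        self.repl_cache = set()  # digests assumed known to replicas

    def eval(self, body, dirty):
        digest = script_sha1(body)
        self.scripts[digest] = body
        if dirty:
            self.replica.apply("EVAL", body)
        return digest

    def evalsha(self, digest, dirty):
        body = self.scripts[digest]
        if digest in self.repl_cache:
            return self.replica.apply("EVALSHA", digest) if dirty else True
        self.repl_cache.add(digest)          # marked as known to replicas
        if dirty or self.force_propagation:  # the fix: send the EVAL regardless
            return self.replica.apply("EVAL", body)
        return True
```

With `force_propagation=False`, calling `eval(..., dirty=False)`, then `evalsha(digest, dirty=False)`, then `evalsha(digest, dirty=True)` ends with the replica rejecting the EVALSHA. With `force_propagation=True` the second step delivers the script body, so the third step succeeds.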
  9. 12 Feb, 2014 2 commits
    • A
      AOF write error: retry with a frequency of 1 hz. · 0296aab6
      antirez committed
      0296aab6
    • A
      AOF: don't abort on write errors unless fsync is 'always'. · dd73a7bf
      antirez committed
      A system similar to the RDB write error handling is used, in which when
      we can't write to the AOF file, writes are no longer accepted until we
      are able to write again.
      
      For fsync == always we still abort on errors since there is currently no
      easy way to avoid replying with success to the user otherwise, and this
      would violate the contract with the user of only acknowledging data
      already secured on disk.
      dd73a7bf
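The policy described in this commit amounts to a small state machine. The sketch below is a hedged illustration with hypothetical names; the policy strings mirror the `appendfsync` values `always`, `everysec` and `no`, and raising an exception stands in for the server aborting.

```python
class AofFatalError(Exception):
    """Stands in for the server aborting on an unrecoverable AOF error."""


class AofState:
    def __init__(self, fsync_policy):
        self.fsync_policy = fsync_policy  # "always", "everysec" or "no"
        self.last_write_ok = True

    def on_write_result(self, ok):
        """Record the outcome of one attempt to write to the AOF file."""
        if ok:
            self.last_write_ok = True   # recovered: writes are accepted again
            return
        if self.fsync_policy == "always":
            # Acknowledging data that is not on disk would break the contract.
            raise AofFatalError("AOF write failed with fsync policy 'always'")
        self.last_write_ok = False      # refuse writes until a retry succeeds

    def accepts_writes(self):
        return self.last_write_ok
```

A retry loop would call `on_write_result` again after each attempt, so a transient failure such as a full disk only blocks writes for as long as it lasts.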
  10. 11 Feb, 2014 2 commits