1. 11 Sep, 2012 1 commit
    • A
      Make sure that SELECT argument is an integer or return an error. · bfc197c3
      antirez committed
      Unfortunately we still had the lame atoi() without any error checking in
      place, so "SELECT foo" would work as "SELECT 0". This was not a huge
      problem per se, but some people expected that DB names can be strings and
      not just numbers, and without errors you get the feeling that they can be
      strings, but not the behavior.
      
      Now getLongFromObjectOrReply() is used, as almost everywhere else across
      the code, generating an error if the number is not an integer or
      overflows the long type.
      
      Thanks to @mipearson for reporting that on Twitter.
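
The difference between atoi() and a checked conversion can be sketched as follows. This is a minimal illustration built on the standard strtol(), not the body of getLongFromObjectOrReply(); the helper name parse_long_strict is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not the Redis implementation): parse s as a base-10
 * long. Returns 1 and stores the value in *out on success. Returns 0 if s is
 * empty, is not a number, has trailing characters, or overflows long. */
static int parse_long_strict(const char *s, long *out) {
    char *end;
    long v;

    if (s == NULL || *s == '\0') return 0;
    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0') return 0; /* no digits, or trailing junk */
    if (errno == ERANGE) return 0;          /* value does not fit in a long */
    *out = v;
    return 1;
}
```

With this helper "foo" is rejected, whereas atoi("foo") silently yields 0. Note that strtol() itself accepts leading whitespace and a sign, so a stricter parser would need extra checks.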
  2. 10 Sep, 2012 1 commit
  3. 05 Sep, 2012 4 commits
    • A
      BITCOUNT regression test for #582 fixed for 32 bit target. · 74e57d0e
      antirez committed
      Bug #582 was not present in 32 bit builds of Redis as
      getObjectFromLong() will return an error for overflow.
      
      This commit makes sure that the test does not fail because of the error
      returned when running against 32 bit builds.
    • H
      BITCOUNT: fix segmentation fault. · 749aac72
      Haruto Otake committed
      Remove unsafe and unnecessary cast.
      Until now, this cast could lead to a segmentation fault when end > UINT_MAX.
      
      setbit foo 0 1
      bitcount foo 0 4294967295
      => ok
      bitcount foo 0 4294967296
      => causes a segmentation fault.
      
      Note by @antirez: the commit was modified a bit to also change the
      string length type to long, since it's guaranteed to be at most 512 MB in
      size, so we can work with the same type across the whole code path.
      
      A regression test was also added.
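
Keeping start, end and the string length in the same long type, with no narrowing cast, can be sketched like this. It is an illustrative helper under the assumption that negative indexes count from the end of the string; clamp_range is a hypothetical name, not the function in the Redis source.

```c
#include <limits.h>

/* Hypothetical helper: clamp an inclusive [start, end] byte range to a
 * string of len bytes, using long for every value so that a large end is
 * never truncated by a cast. Negative indexes count from the end.
 * Returns the number of bytes in the range (0 if it is empty) and stores
 * the first byte offset in *first when the range is not empty. */
static long clamp_range(long start, long end, long len, long *first) {
    if (start < 0) start = len + start;
    if (end < 0) end = len + end;
    if (start < 0) start = 0;
    if (end < 0) end = 0;
    if (end >= len) end = len - 1;
    if (start > end) return 0;
    *first = start;
    return end - start + 1;
}
```

An end value larger than UINT_MAX is simply clamped to the last byte instead of wrapping around.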
    • S
      Merge pull request #576 from saj/fix-slave-ping-period · 24bc807b
      Salvatore Sanfilippo committed
      Bug fix: slaves being pinged every second
    • A
      Scripting: Force SORT BY constant determinism inside SORT itself. · 36741b2c
      antirez committed
      SORT is able to return unordered output (faster than when ordering) if
      the "BY" clause is used with a constant value. However we try to play
      well with the determinism requirements of scripting by always providing
      sorted output when SORT (and other similar commands) are called by Lua
      scripts.
      
      However we used the general mechanism in place in scripting in order to
      reorder SORT output, that is, if the command has the "S" flag set, the
      Lua scripting engine will take an additional step when converting a
      multi bulk reply to Lua value, calling a Lua sorting function.
      
      This is suboptimal as we can do it faster inside SORT itself.
      This is also broken as issue #545 shows us: basically when SORT is used
      with a constant BY, and additionally also GET is used, the Lua scripting
      engine was trying to order the output as a flat array, while it was
      actually a list of key-value pairs.
      
      What we do now is to recognize if the caller of SORT is the Lua client
      (since we can check this using the REDIS_LUA_CLIENT flag). If so, and if
      a "don't sort" condition is triggered by the BY option with a constant
      string, we force the lexicographical sorting.
      
      This commit fixes this bug and improves the performance, and at the same
      time simplifies the implementation. This does not mean I'm smart today,
      it means I was stupid when I committed the original implementation ;)
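
Forcing a lexicographical order on otherwise unordered elements can be sketched with the standard qsort(). This is only an illustration of the idea on plain C strings; the function names are hypothetical and SORT itself works on Redis objects, not on a char pointer array.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of C strings. */
static int cmp_str(const void *a, const void *b) {
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Hypothetical helper: sort the elements in place so that the same set of
 * elements always yields the same output order. */
static void sort_lexicographically(const char **items, size_t n) {
    qsort(items, n, sizeof(items[0]), cmp_str);
}
```

Because the order depends only on the elements, two replicas holding the same set produce identical script output.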
  4. 04 Sep, 2012 2 commits
    • A
      Sentinel: reply -IDONTKNOW to get-master-addr-by-name on lack of info. · 9bd0e097
      antirez committed
      If we don't have any clue about a master since it never replied to INFO
      so far, reply with an -IDONTKNOW error to SENTINEL
      get-master-addr-by-name requests.
    • A
      Sentinel: more easy master redirection if master is a slave. · 8bdde086
      antirez committed
      Before this commit, Sentinel redirected to the reported master ip/addr
      only if the current instance reported itself as a slave in the first INFO
      output received.
      
      Now we also redirect to the reported master ip/addr if we find that the
      runid is different and the reported role is slave.
      
      This unifies the behavior of Sentinel in the case of a reboot (where it
      will see the first INFO output with the wrong role and will perform the
      redirection), with the behavior of Sentinel in the case of a change in
      what it sees in the INFO output of the master.
  5. 02 Sep, 2012 1 commit
    • A
      Send an async PING before starting replication with master. · bb66fc31
      antirez committed
      During the first synchronization step of the replication process, a Redis
      slave connects with the master in a non blocking way. However once the
      connection is established the replication continues sending the REPLCONF
      command, and sometimes the AUTH command if needed. Those commands are
      sent in a partially blocking way (blocking with a timeout in the order of
      seconds).
      
      Because it is common for a blocked master to accept connections even if
      it is actually not able to reply to the slave requests, it was easy for
      a slave to block if the master had serious issues, but was still able to
      accept connections in the listening socket.
      
      For this reason we now send an asynchronous PING request just after the
      non blocking connection succeeds, and wait for the reply before
      continuing with the replication process. It is very unlikely that a
      master replying to PING can't reply to the other commands.
      
      This solution was proposed by Didier Spezia (Thanks!) so that we don't
      need to turn all the replication process into a non blocking affair, but
      still the probability of a slave blocking is minimal even in the event of
      a failing master.
      
      Also we now use getsockopt(SO_ERROR) in order to check errors ASAP
      in the event handler, instead of waiting for actual I/O to return an
      error.
      
      This commit fixes issue #632.
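
Checking a socket for a pending error from inside an event handler can be sketched with the POSIX getsockopt() call and the SO_ERROR option. The wrapper name pending_socket_error is hypothetical and the snippet is not taken from the Redis replication code.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: return the pending error on fd (0 if there is none),
 * or -1 if getsockopt() itself fails. Reading SO_ERROR also clears the
 * pending error on the socket. */
static int pending_socket_error(int fd) {
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1) return -1;
    return err;
}
```

A handler fired for a non blocking connect can call this first and abort early on a non-zero result, instead of discovering the failure on the next read or write.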
  6. 31 Aug, 2012 3 commits
    • A
      Scripting: Reset Lua fake client reply_bytes after command execution. · e323635c
      antirez committed
      Lua scripting uses a fake client in order to run commands in the context
      of a client, accumulate the reply, and convert it into a Lua object
      to return to the caller. This client is reused again and again, and is
      referenced by the server.lua_client globally accessible pointer.
      
      However after every call to redis.call() or redis.pcall(), that is
      handled by the luaRedisGenericCommand() function, the reply_bytes field
      of the client was not set back to zero. This field is used to estimate
      the amount of memory currently used in the reply. Because of the lack of
      reset, script after script executed, this value used to get bigger and
      bigger, and in the end on 32 bit systems it triggered the following
      assert:
      
          redisAssert(c->reply_bytes < ULONG_MAX-(1024*64));
      
      On 64 bit systems this does not happen because it takes too much time to
      reach values near to 2^64 for users to see the practical effect of the
      bug.
      
      Now in the cleanup stage of luaRedisGenericCommand() we reset the
      reply_bytes counter to zero, avoiding the issue. It is not practical to
      add a test for this bug, but the fix was manually tested using a
      debugger.
      
      This commit fixes issue #656.
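
The shape of the fix, resetting a per-reply counter on a client object that is reused across calls, can be sketched as below. The struct and function names are hypothetical simplifications; the real client structure and luaRedisGenericCommand() are far larger.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for the reused fake client. */
struct fake_client {
    unsigned long reply_bytes; /* estimated memory used by the current reply */
};

/* Account for len bytes appended to the reply. */
static void fake_client_add_reply(struct fake_client *c, size_t len) {
    c->reply_bytes += len;
}

/* Run one command on the reused client and return the reply size.
 * The cleanup step resets the counter so it cannot grow call after call. */
static unsigned long run_command(struct fake_client *c, size_t reply_len) {
    unsigned long used;

    fake_client_add_reply(c, reply_len);
    used = c->reply_bytes;
    c->reply_bytes = 0; /* the reset that was missing */
    return used;
}
```

Without the reset, the second call would report the sum of both replies, and the counter would eventually approach ULONG_MAX on 32 bit systems.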
    • A
      Scripting: require at least one argument for redis.call(). · 46c31a15
      antirez committed
      Redis used to crash with a call like the following:
      
          EVAL "redis.call()" 0
      
      Now the explicit check for at least one argument prevents the problem.
      
      This commit fixes issue #655.
    • A
      Sentinel: do not crash against slaves not publishing the runid. · 6276434a
      antirez committed
      Older versions of Redis (before 2.4.17) don't publish the runid field in
      INFO. This commit makes Sentinel able to handle that without crashing.
  7. 29 Aug, 2012 2 commits
  8. 28 Aug, 2012 5 commits
    • A
      712656e8
    • A
      Sentinel: Sentinel-side support for slave priority. · 3ec701e0
      antirez committed
      The slave priority that is now published by Redis in INFO output is
      now used by Sentinel in order to select the slave with minimum priority
      for promotion, and in order to consider slaves with priority set to 0 as
      not able to play the role of master (they will never be promoted by
      Sentinel).
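
The priority rule described above can be sketched as a simple scan. This covers only the priority criterion with hypothetical type and function names; the actual Sentinel selection may weigh other conditions that this commit message does not describe.

```c
#include <stddef.h>

/* Hypothetical, reduced view of what Sentinel knows about a slave. */
struct slave_info {
    const char *name;
    int priority; /* as published in INFO; 0 means "never promote" */
};

/* Return the slave with the lowest non-zero priority, or NULL if no slave
 * is eligible for promotion. */
static const struct slave_info *select_slave(const struct slave_info *slaves,
                                             size_t n) {
    const struct slave_info *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (slaves[i].priority == 0) continue; /* can never be master */
        if (best == NULL || slaves[i].priority < best->priority)
            best = &slaves[i];
    }
    return best;
}
```

Returning NULL when every slave has priority 0 mirrors the rule that such slaves are never promoted, even if nothing else is available.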
      
      The "slave-priority" field is now one of the fields that Sentinel
      publishes when describing an instance via the SENTINEL commands such as
      "SENTINEL slaves mastername".
    • A
      Sentinel: Redis-side support for slave priority. · 169a44cb
      antirez committed
      A Redis slave can now be configured with a priority, that is an integer
      number that is shown in INFO output and can be read and set using the
      redis.conf file or the CONFIG GET/SET command.
      
      This field is used by Sentinel during slave election. A slave with lower
      priority is preferred. A slave with priority zero is never elected (and
      is considered to be impossible to elect even if it is the only slave
      available).
      
      A following commit will add support on the Sentinel side as well.
    • A
      Sentinel: suppress harmless warning by initializing 'table' to NULL. · c14e0eca
      antirez committed
      Note that the assertion guarantees that one of the if branches setting
      table is always entered.
    • A
      Incrementally flush RDB on disk while loading it from a master. · 784b9308
      antirez committed
      This fixes issue #539.
      
      Basically if there is enough free memory the OS may buffer the RDB file
      that the slave transfers to disk from the master. The file may
      actually be flushed to disk all at once by the operating system when it
      gets closed by Redis, causing the close system call to block for a long
      time.
      
      This patch is a modified version of one provided by yoav-steinberg of
      @garantiadata (the original version was posted in the issue #539
      comments), and tries to flush the OS buffers incrementally (every 8 MB
      of loaded data).
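
Flushing incrementally while writing can be sketched with write() and fsync(). This is a minimal illustration assuming plain fsync() as the flush primitive; the struct, the function names and the use of fsync() are assumptions, and the actual patch may use a different system call.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define SYNC_EVERY_BYTES ((size_t)8 * 1024 * 1024) /* flush every 8 MB */

/* Hypothetical writer state. */
struct incr_writer {
    int fd;
    size_t unsynced;     /* bytes written since the last flush */
    unsigned long syncs; /* number of flushes performed so far */
};

/* Open (or truncate) path for writing. Returns 0 on success, -1 on error. */
static int incr_writer_open(struct incr_writer *w, const char *path) {
    w->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    w->unsynced = 0;
    w->syncs = 0;
    return w->fd == -1 ? -1 : 0;
}

/* Write len bytes and flush once at least SYNC_EVERY_BYTES are pending, so
 * the final close() has little buffered data left to push to disk. */
static ssize_t incr_write(struct incr_writer *w, const void *buf, size_t len) {
    ssize_t n = write(w->fd, buf, len);

    if (n <= 0) return n;
    w->unsynced += (size_t)n;
    if (w->unsynced >= SYNC_EVERY_BYTES) {
        if (fsync(w->fd) == -1) return -1;
        w->unsynced = 0;
        w->syncs++;
    }
    return n;
}
```

Each flush costs a short pause, but the pauses are spread over the transfer instead of being concentrated in a single long close() call.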
  9. 24 Aug, 2012 4 commits
    • A
      1caa627e
    • A
      Better Out of Memory handling. · 6fdc6354
      antirez committed
      The previous implementation of zmalloc.c was not able to handle out of
      memory in an application-specific way. It just logged an error on
      standard error, and aborted.
      
      The result was that in the case of an actual out of memory in Redis
      where malloc returned NULL (In Linux this actually happens under
      specific overcommit policy settings and/or with no or little swap
      configured) the error was not properly logged in the Redis log.
      
      This commit fixes this problem, fixing issue #509.
      Now the out of memory is properly reported in the Redis log and a stack
      trace is generated.
      
      The approach used is to provide a configurable out of memory handler
      to zmalloc (otherwise the default one logging the event on
      standard error is used).
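
A configurable out of memory handler can be sketched with a function pointer that the application overrides. All names here are hypothetical and do not reproduce the zmalloc.c interface; recording_oom_handler only stands in for an application handler, which in a server would typically log and abort.

```c
#include <stdio.h>
#include <stdlib.h>

typedef void (*oom_handler_fn)(size_t size);

/* Default behavior: report on standard error and abort. */
static void default_oom_handler(size_t size) {
    fprintf(stderr, "out of memory trying to allocate %zu bytes\n", size);
    abort();
}

static oom_handler_fn oom_handler = default_oom_handler;

/* Install an application-specific handler; NULL restores the default. */
static void set_oom_handler(oom_handler_fn fn) {
    oom_handler = fn ? fn : default_oom_handler;
}

/* Allocation wrapper that reports failures through the current handler. */
static void *checked_malloc(size_t size) {
    void *p = malloc(size);

    if (p == NULL) oom_handler(size);
    return p;
}

/* Example application handler: remember the size that failed. */
static size_t last_failed_size;
static void recording_oom_handler(size_t size) {
    last_failed_size = size;
}
```

The allocator stays generic, while the application decides where the failure is reported, for example in its own log together with a stack trace.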
    • A
      Sentinel: send SCRIPT KILL on -BUSY reply and SDOWN instance. · 850789ce
      antirez committed
      From the point of view of Redis an instance replying -BUSY is down,
      since it is effectively not able to reply to user requests. However
      a looping script is a recoverable condition in Redis if the script has
      not yet performed any write to the dataset. In that case performing a
      fail over is not optimal, so Sentinel now tries to restore the normal server
      condition by killing the script with a SCRIPT KILL command.
      
      If the script already performed some write before entering an infinite
      (or long enough to timeout) loop, SCRIPT KILL will not work and the
      fail over will be triggered anyway.
    • A
      Sentinel: fixed a crash on script execution. · 01477753
      antirez committed
      The call to sentinelScheduleScriptExecution() lacked the final NULL
      argument to signal the end of arguments. This resulted in a crash.
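
Why a missing terminator crashes can be seen from how a variadic function walks its arguments. The function below is a hypothetical sketch, not sentinelScheduleScriptExecution() itself: it keeps reading until it finds NULL, so a call without the sentinel reads past the real arguments.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: count the string arguments that follow
 * path, up to (and not including) the NULL sentinel. The caller MUST end
 * the argument list with (char *)NULL, otherwise behavior is undefined. */
static size_t count_script_args(const char *path, ...) {
    va_list ap;
    size_t n = 0;

    (void)path; /* only the variable arguments are counted */
    va_start(ap, path);
    while (va_arg(ap, char *) != NULL) n++;
    va_end(ap);
    return n;
}
```

A correct call looks like count_script_args("/path/to/script", "arg1", "arg2", (char *)NULL). The explicit cast matters because a bare NULL is not guaranteed to be passed as a pointer in a variadic call.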
  10. 22 Aug, 2012 1 commit
  11. 21 Aug, 2012 2 commits
    • A
      redis-benchmark: disable big buffer cleanup in hiredis context. · 227b4293
      antirez committed
      This new hiredis feature allows us to reuse a previous context reader
      buffer even if already very big in order to maximize performance with
      big payloads (usually hiredis re-creates buffers when they are too big
      and unused in order to save memory).
    • A
      hiredis library updated. · d6704c9b
      antirez committed
      This version of hiredis merges modifications of the Redis fork with
      latest changes in the hiredis repository.
      
      The same version was pushed on the hiredis repository and will probably
      be merged into the master branch shortly.
  12. 14 Aug, 2012 2 commits
  13. 03 Aug, 2012 2 commits
    • A
      Sentinel: SENTINEL FAILOVER command implemented. · cada7f96
      antirez committed
      This command can be used in order to force a Sentinel instance to start
      a failover for the specified master, as leader, forcing the failover
      even if the master is up.
      
      The commit also adds some minor refactoring and other improvements to
      functions already implemented that make them able to work when the
      master is not in SDOWN condition. For instance, slave selection
      assumed that we ask INFO every second to every slave. This is true
      only when the master is in SDOWN condition, so slave selection did not
      work when the master was not in SDOWN condition.
    • A
      Sentinel: client reconfiguration script execution. · 6275004c
      antirez committed
      This commit adds support to optionally execute a script when one of the
      following events happens:
      
      * The failover starts (with a slave already promoted).
      * The failover ends.
      * The failover is aborted.
      
      The script is called with enough parameters (documented in the example
      sentinel.conf file) to provide information about the old and new ip:port
      pair of the master, the role of the sentinel (leader or observer) and
      the name of the master.
      
      The goal of the script is to inform clients of the configuration change
      in a way specific to the environment Sentinel is running in, that can't
      be implemented in a general way inside Sentinel itself.
  14. 02 Aug, 2012 2 commits
  15. 31 Jul, 2012 5 commits
    • A
      Sentinel: when leader in wait-start, sense another leader as race. · fd92b366
      antirez committed
      When we are in wait start, if another leader (or any other external
      entity) turns a slave into a master, abort the failover, and detect it
      as an observer.
      
      Note that the wait-start state is mainly there for this reason but the
      abort was not yet implemented.
      
      This adds a new sentinel event -failover-abort-race.
    • A
      91c15ed1
    • M
      Use correct variable name for value to convert. · f1d187bb
      Michael Parker committed
      Note by @antirez: this code was never compiled because utils.c lacked the
      float.h include, so we never noticed this variable was misspelled in the
      past.
      
      This should provide a noticeable speed boost when saving certain types
      of databases with many sorted sets inside.
    • A
      Sentinel: sentinel.conf self-documentation improved. · ed2a691a
      antirez committed
    • A
      Sentinel: abort failover when in wait-start if master is back. · 75084e05
      antirez committed
      When we are a Leader Sentinel in wait-start state, starting with this
      commit the failover is aborted if the master returns online.
      
      This improves the way we handle a notable case of net split, that is the
      split between Sentinels and Redis servers, that will be a very common
      case of split because Sentinels will often be installed in the client's
      network and servers can be in a different arm of the network.
      
      When Sentinels and Redis servers are isolated the master is in ODOWN
      condition since the Sentinels can agree about this state, however the
      failover does not start since there are no good slaves to promote (in
      this specific case all the slaves are unreachable).
      
      However when the split is resolved, Sentinels may sense the slave back
      a moment before they sense the master is back, so the failover may start
      without a good reason (since the master is actually working too).
      
      Now this condition is reversible, so the failover will be aborted
      immediately after if the master is detected to be working again, that
      is, not in SDOWN nor in ODOWN condition.
  16. 29 Jul, 2012 3 commits