1. 25 Mar, 2014 8 commits
    • M
      Replace magic 32 with REDIS_EVENTLOOP_FDSET_INCR · 6826af1b
      Matt Stancliff committed
      32 was the additional number of file descriptors Redis
      would reserve when managing a too-low ulimit.  The
      number 32 was in too many places statically, so now
      we use a macro instead that looks more appropriate.
      
      When Redis sets up the server event loop, it uses:
          server.maxclients+REDIS_EVENTLOOP_FDSET_INCR
      
      So, when reserving file descriptors, it makes sense to
      reserve at least REDIS_EVENTLOOP_FDSET_INCR FDs instead
      of only 32.  Currently, REDIS_EVENTLOOP_FDSET_INCR is
      set to 128 in redis.h.
      
      Also, I replaced the static 128 in the while f < old loop
      with REDIS_EVENTLOOP_FDSET_INCR as well, which results
      in no change since it was already 128.
      
      Impact: Users now need at least maxclients+128 as
      their open file limit instead of maxclients+32 to obtain
      actual "maxclients" number of clients.  Redis will carve
      the extra REDIS_EVENTLOOP_FDSET_INCR file descriptors it
      needs out of the "maxclients" range instead of failing
      to start (unless the local ulimit -n is too low to accommodate
      the request).
      6826af1b
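The "Impact" paragraph of 6826af1b can be read as a small piece of arithmetic. The sketch below is only an illustration of that reading, not the code from the commit: the function name `effective_maxclients` and its parameters are hypothetical, and the macro value 128 is taken from the commit message.

```c
/* Value quoted in the commit message for redis.h. */
#define REDIS_EVENTLOOP_FDSET_INCR 128

/* Hypothetical helper: given the process open-file limit and the
 * requested maxclients, return how many clients can really be served
 * once the reserved descriptors are carved out of the limit.
 * Returns -1 when the limit leaves no room for even one client. */
static long effective_maxclients(long open_file_limit, long requested) {
    long needed = requested + REDIS_EVENTLOOP_FDSET_INCR;
    if (open_file_limit >= needed) return requested;

    /* Limit too low: take the reserved descriptors out of the
     * "maxclients" range instead of refusing to start. */
    long available = open_file_limit - REDIS_EVENTLOOP_FDSET_INCR;
    return available >= 1 ? available : -1;
}
```

Under these assumptions, a limit of 10128 is the smallest that yields the full 10000 requested clients, while a limit of 1024 leaves 896.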
    • M
      Fix maxclients error handling · 611372fa
      Matt Stancliff committed
      Everywhere in the Redis code base, maxclients is treated
      as an int with (int)maxclients or `maxclients = atoi(source)`,
      so let's make maxclients an int.
      
      This fixes a bug where someone could specify a negative maxclients
      on startup and it would work (as well as set maxclients very high)
      because:
      
          unsigned int maxclients;
          char *update = "-300";
          maxclients = atoi(update);
          if (maxclients < 1) goto fail;
      
      But, (maxclients < 1) can only catch the case when maxclients
      is exactly 0.  maxclients happily sets itself to -300, which isn't
      -300, but rather 4294966996, which isn't < 1, so... everything
      "worked."
      
      maxclients config parsing checks for the case of < 1, but maxclients
      CONFIG SET parsing was checking for case of < 0 (allowing
      maxclients to be set to 0).  CONFIG SET parsing is now updated to
      match config parsing of < 1.
      
      It's tempting to add a MINIMUM_CLIENTS define, but... I didn't.
      
      These changes were inspired by antirez#356, but this doesn't
      fix that issue.
      611372fa
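The signedness problem described in 611372fa can be reproduced in isolation. The two helpers below are hypothetical (they are not Redis functions) and only contrast the check with an unsigned and a signed variable; both return 1 when the value would be accepted.

```c
#include <stdlib.h>

/* Buggy pattern from the commit message: a negative atoi() result is
 * converted to a huge unsigned value, so the "< 1" test only ever
 * rejects exactly 0. */
static int accept_unsigned(const char *source) {
    unsigned int maxclients = (unsigned int)atoi(source);
    if (maxclients < 1) return 0;
    return 1;
}

/* With a plain int the same test rejects zero and negative values. */
static int accept_signed(const char *source) {
    int maxclients = atoi(source);
    if (maxclients < 1) return 0;
    return 1;
}
```

With a 32-bit `unsigned int`, `accept_unsigned("-300")` accepts the value (stored as 4294966996, as the message notes), whereas `accept_signed("-300")` rejects it.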
    • M
      Sentinel: remove variable causing warning · 3bd32406
      Matt Stancliff committed
      GCC-4.9 warned about this, but clang didn't.
      
      This commit fixes warning:
      sentinel.c: In function 'sentinelReceiveHelloMessages':
      sentinel.c:2156:43: warning: variable 'master' set but not used [-Wunused-but-set-variable]
           sentinelRedisInstance *ri = c->data, *master;
      3bd32406
    • A
      Fixed undefined variable value with certain code paths. · 79349aff
      antirez committed
      In sentinelFlushConfig() fd could be undefined when the following if
      statement was true:
      
              if (rewrite_status == -1) goto werr;
      
      This could cause random file descriptors to get closed.
      79349aff
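The hazard fixed in 79349aff is a `goto` into a cleanup label that closes a descriptor which was never assigned. The sketch below shows one common way to make such an error path safe; it is an assumption about the general shape of the fix, not the actual body of sentinelFlushConfig(), and `flush_config_sketch` is a hypothetical name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 on success, -1 on error. */
static int flush_config_sketch(const char *path, int rewrite_status) {
    int fd = -1;    /* initialised so the error path can test it */

    if (rewrite_status == -1) goto werr;    /* fd is still -1 here */
    if ((fd = open(path, O_RDONLY)) == -1) goto werr;
    if (fsync(fd) == -1) goto werr;
    if (close(fd) == -1) return -1;         /* fd is gone either way */
    return 0;

werr:
    /* Only close a descriptor we really opened; an uninitialised fd
     * could otherwise name some unrelated open file. */
    if (fd != -1) close(fd);
    return -1;
}
```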
    • M
      Sentinel: Notify user when config can't be saved · 80dec5e4
      Matt Stancliff committed
      80dec5e4
    • J
      Small typo fixed · a2ec9a90
      Jan-Erik Rediger committed
      a2ec9a90
    • M
      Fix data loss when save AOF/RDB with no free space · 88c6c669
      Matt Stancliff committed
      Previously, the (!fp) would only catch lack of free space
      under OS X.  Linux waits to discover it can't write until
      it actually writes contents to disk.
      
      (fwrite() returns success even if the underlying file
      has no free space to write into.  All the errors
      only show up at flush/sync/close time.)
      
      Fixes antirez/redis#1604
      88c6c669
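The point made in 88c6c669 is that a successful fopen() and fwrite() do not prove the data reached the disk, so every later step has to be checked as well. The following is a minimal sketch of that pattern, assuming a POSIX system; `save_buffer` is a hypothetical helper, not the Redis save routine.

```c
#include <stdio.h>
#include <unistd.h>

/* Write len bytes to path. Returns 0 only if every step, including
 * flush, sync and close, reported success; -1 otherwise. */
static int save_buffer(const char *path, const char *buf, size_t len) {
    FILE *fp = fopen(path, "w");
    if (!fp) return -1;

    /* May succeed even on a full disk: data can sit in the stdio buffer. */
    if (fwrite(buf, 1, len, fp) != len) goto werr;
    /* Errors such as ENOSPC often surface only from here on. */
    if (fflush(fp) == EOF) goto werr;
    if (fsync(fileno(fp)) == -1) goto werr;
    /* fp must not be used again after fclose(), even when it fails. */
    if (fclose(fp) == EOF) return -1;
    return 0;

werr:
    fclose(fp);
    return -1;
}
```

A caller would treat any -1 as "the snapshot was not written" and keep the previous file rather than replacing it.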
    • J
      Finally fix the `install_server.sh` script. · 6182dad3
      Jan-Erik Rediger committed
      Includes changes from a dozen bug reports and pull requests.
      Was tested on Ubuntu, Debian and CentOS.
      6182dad3
  2. 24 Mar, 2014 4 commits
    • A
      Sample and cache RSS in serverCron(). · 4ebc7e37
      antirez committed
      Obtaining the RSS (Resident Set Size) info is slow in Linux and OSX.
      This slowed down the generation of the INFO 'memory' section.
      
      Since the RSS does not need to be a real-time measurement, we
      now sample it with server.hz frequency (10 times per second by default)
      and use this value both to show the INFO rss field and to compute the
      fragmentation ratio.
      
      Practically this does not make any difference for memory profiling of
      Redis but speeds up the INFO call significantly.
      4ebc7e37
    • A
      sdscatvprintf(): Try to use a static buffer. · 72257f4b
      antirez committed
      For small content the function now tries to use a static buffer to avoid
      a malloc/free cycle that is too costly when the function is used in
      performance-critical code paths such as INFO output generation.
      
      This change was verified to have positive effects in the execution speed
      of the INFO command.
      72257f4b
    • A
      Cache uname() output across INFO calls. · 7054f2a0
      antirez committed
      Uname was profiled to be a slow syscall. It always produces the same
      output in the context of a single execution of Redis, so calling it at
      every INFO output generation does not make too much sense.
      
      The uname utsname structure was changed to a static variable. At the
      same time a static integer was added to check if we need to call uname
      the first time.
      7054f2a0
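The caching described in 7054f2a0 (a static utsname structure plus a static flag for the first call) can be sketched as follows. `cached_uname` is a hypothetical wrapper used here for illustration; the commit itself places the static variables inside the INFO generation code.

```c
#include <sys/utsname.h>

/* Return the uname() data, performing the syscall only on first use. */
static const struct utsname *cached_uname(void) {
    static struct utsname name;     /* survives across calls */
    static int call_uname = 1;      /* 1 until the first call is done */

    if (call_uname) {
        uname(&name);
        call_uname = 0;
    }
    return &name;
}
```

Every call after the first returns the same pointer without entering the kernel, which is safe because the output does not change during a single run of the process.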
    • A
      sdscatvprintf(): guess buflen using format length. · 7a20f096
      antirez committed
      sdscatvprintf() uses a loop where it tries to output the formatted
      string in a buffer of the initial length. If there is not enough room,
      a buffer of doubled size is tried, and so forth.
      
      The initial guess for the buffer length was very poor, a hardcoded
      "16". This caused the printf to be processed multiple times without a
      good reason. Given that printf functions are already not fast, the
      overhead was significant.
      
      The new heuristic is to use a buffer 4 times the length of the format
      buffer, and 32 as minimal size. This appears to be a good balance for
      typical uses of the function inside the Redis code base.
      
      This change made the INFO command 3 times faster.
      7a20f096
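The sizing heuristic from 7a20f096 (4 times the format length, at least 32, doubling on overflow) can be illustrated with plain malloc() and vsnprintf(). This is a sketch under those assumptions and not the sds implementation: `initial_buflen` and `format_alloc` are hypothetical names, and the overflow test uses the vsnprintf() return value, which may differ from how the real function detects truncation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Initial guess: 4x the format string length, never below 32. */
static size_t initial_buflen(const char *fmt) {
    size_t len = strlen(fmt) * 4;
    return len < 32 ? 32 : len;
}

/* Format into a malloc'ed string, doubling the buffer until the
 * output fits. Returns NULL on allocation or formatting failure;
 * the caller frees the result. */
static char *format_alloc(const char *fmt, ...) {
    size_t buflen = initial_buflen(fmt);

    for (;;) {
        char *buf = malloc(buflen);
        if (buf == NULL) return NULL;

        va_list ap;
        va_start(ap, fmt);
        int n = vsnprintf(buf, buflen, fmt, ap);
        va_end(ap);

        if (n < 0) { free(buf); return NULL; }
        if ((size_t)n < buflen) return buf;   /* it fit, including the NUL */

        free(buf);                            /* too small: double and retry */
        buflen *= 2;
    }
}
```

A better first guess matters because every retry repeats the whole formatting pass, which is the overhead the commit message describes.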
  3. 21 Mar, 2014 12 commits
    • A
      Use 24 bits for the lru object field and improve resolution. · 001775f5
      antirez committed
      There were 2 spare bits inside the Redis object structure that are now
      used in order to enlarge 4x the range of the LRU field.
      
      At the same time the resolution was improved from 10 to 1 second: this
      still provides 194 days before the LRU counter overflows (restarting from
      zero).
      
      This is not a problem since it only causes lack of eviction precision for
      objects not touched for a very long time, and the lack of precision is
      only temporary.
      001775f5
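The 194-day figure in 001775f5 follows directly from the field width and resolution: a 24-bit counter ticking once per second wraps after 2^24 seconds. The helpers below only reproduce that arithmetic; their names are hypothetical and they are not part of Redis.

```c
#include <stdint.h>

/* Largest value an LRU field of the given width can hold (bits < 32). */
static uint32_t lru_max(unsigned bits) {
    return (UINT32_C(1) << bits) - 1;
}

/* Days until a counter of the given width wraps to zero when it
 * advances once every resolution_secs seconds. */
static double wrap_days(unsigned bits, unsigned resolution_secs) {
    double ticks = (double)(UINT32_C(1) << bits);
    return ticks * resolution_secs / 86400.0;
}
```

For 24 bits at 1 second resolution this gives 16777216 / 86400, roughly 194.2 days, matching the commit message. Two extra bits multiply the number of representable values by 4, which is the "4x" range mentioned above.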
    • A
      Specify lruclock in redisServer structure via REDIS_LRU_BITS. · c68189a1
      antirez committed
      The padding field was totally useless: removed.
      c68189a1
    • A
      Set LRU parameters via REDIS_LRU_BITS define. · ff8c8187
      antirez committed
      ff8c8187
    • A
      Unify stats reset for CONFIG RESETSTAT / initServer(). · e3b71a1c
      antirez committed
      Now CONFIG RESETSTAT makes sure to reset all the fields, and in the
      future it will be simpler to avoid missing new fields.
      e3b71a1c
    • A
      Sentinel: sentinelRefreshInstanceInfo() minor refactoring. · 0937377a
      antirez committed
      Test sentinel.tilt condition on top and return if it is true.
      This allows removing the check for the tilt condition in the remaining
      code paths of the function.
      0937377a
    • A
      Sentinel test: 02 unit better coverage + refactoring. · 686839b4
      antirez committed
      686839b4
    • A
      Sentinel test: foreach_instance_id implements 'break'. · 6d0e408a
      antirez committed
      6d0e408a
    • A
      Sentinel: instance_is_killed proc added to sentinel.tcl. · ba2edc41
      antirez committed
      ba2edc41
    • A
      9c2063fb
    • A
      Sentinel: down-after-milliseconds is not master-specific. · ffa8f479
      antirez committed
      addReplySentinelRedisInstance() modified so that this field is displayed
      for all kinds of instances: Sentinels, Masters, Slaves.
      ffa8f479
    • A
      Sentinel failure detection implementation improved. · 42091a79
      antirez committed
      Failure detection in Sentinel is ping-pong based. It used to work by
      remembering the last time a valid PONG reply was received, and checking
      if the reception time was too old compared to the current time.
      
      PINGs were sent at a fixed interval of 1 second.
      
      This works in a decent way, but does not scale well when we want to set
      very small values of "down-after-milliseconds" (this is the node
      timeout basically).
      
      This commit reimplements the failure detection, making a number of
      changes. Some changes are inspired by the Redis Cluster failure detection
      code:
      
      * A new last_ping_time field is added in representation of instances.
        If non zero, we have an active ping that was sent at the specified
        time. When a valid reply to ping is received, the field is zeroed
        again.
      * last_ping_time is not reset when we reconnect the link or send a new
        ping, so from our point of view it represents the time we started
        waiting for the instance to reply to our pings without receiving a
        reply.
      * last_ping_time is now used in order to check if the instance is
        timed out. This means that we can have a node timeout of 100
        milliseconds and yet the system will work well since the new check is
        not bound to the period used to send pings.
      * Pings are now sent every second, or more often if the value of
        down-after-milliseconds is less than one second, with a lower limit
        of 10 HZ ping frequency.
      * Link reconnection code was improved. This is used in order to try to
        reconnect the link when we are at 50% of the node timeout without a
        valid reply received yet. However the old code triggered unnecessary
        reconnections when the node timeout was very small. Now that should be
        ok.
      
      The new code passes the tests but more testing is needed and more unit
      tests stressing the failure detector, so currently this is merged only
      in the unstable branch.
      42091a79
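The bookkeeping described in the bullet list of 42091a79 can be sketched with a single timestamp per instance. Everything below is illustrative: the struct and function names are hypothetical, times are plain milliseconds, and the 100 ms floor in `ping_period` is only one reading of the "10 HZ" remark in the message, not something the commit text confirms.

```c
#include <stdint.h>

typedef int64_t mstime_t;

typedef struct {
    mstime_t last_ping_time;     /* 0 = no ping pending, otherwise the time
                                    the first unanswered ping was sent */
    mstime_t down_after_period;  /* down-after-milliseconds */
} instance_sketch;

/* Sending a ping starts the wait only if none is already pending, so
 * repeated pings or reconnections do not reset the clock. */
static void on_ping_sent(instance_sketch *ri, mstime_t now) {
    if (ri->last_ping_time == 0) ri->last_ping_time = now;
}

/* A valid reply zeroes the field again. */
static void on_valid_reply(instance_sketch *ri) {
    ri->last_ping_time = 0;
}

/* Timed out when we have been waiting longer than the node timeout. */
static int is_timed_out(const instance_sketch *ri, mstime_t now) {
    return ri->last_ping_time != 0 &&
           (now - ri->last_ping_time) > ri->down_after_period;
}

/* Ping every second, or more often for sub-second timeouts.
 * ASSUMPTION: periods are clamped to at least 100 ms (10 per second). */
static mstime_t ping_period(const instance_sketch *ri) {
    mstime_t period = ri->down_after_period < 1000 ? ri->down_after_period
                                                   : 1000;
    return period < 100 ? 100 : period;
}
```

Because the timeout check depends on `last_ping_time` rather than on the ping interval, a timeout of a few hundred milliseconds can work even though pings go out at a different rate.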
    • A
      Sentinel: use CLIENT SETNAME when connecting to Redis. · 38241c4b
      antirez committed
      This makes debugging / monitoring of Sentinels simpler since you can
      identify sentinels in CLIENT LIST output of Redis instances.
      38241c4b
  4. 15 Mar, 2014 2 commits
    • M
      Fix segfault from accessing array out of bounds · 9de07558
      Matt Stancliff committed
      argc == 2; argv[2] == crash
      9de07558
    • A
      Sentinel: be safe under crash-recovery assumptions. · a31a0b43
      antirez committed
      Sentinel's main safety argument is that there are no two configurations
      for the same master with the same version (configuration epoch).
      
      For this to be true Sentinels need to be authorized by a majority.
      Additionally Sentinels need to do two important things:
      
      * Never vote again for the same epoch.
      * Never exchange an old vote for a fresh one.
      
      The first prerequisite, in a crash-recovery system model, requires
      persisting the master->leader_epoch on durable storage before replying
      to messages. This was not the case.
      
      We also make sure to persist the current epoch in order to never reply
      to stale vote requests from other Sentinels after a recovery.
      
      The configuration is persisted by making use of fsync(). In the context
      of this code, this is considered a good enough guarantee that our
      durable state is restored after a restart. However, this may not
      always be the case, depending on the kind of hardware and operating
      system used.
      a31a0b43
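The ordering requirement in a31a0b43 (persist first, reply afterwards, never vote twice in one epoch) can be sketched as below. This is a simplified illustration and not the Sentinel voting code: `vote_state`, `vote_sketch` and the two `persist_*` stand-ins are hypothetical, and a real persist callback would write the configuration and fsync() it.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t current_epoch;  /* highest epoch seen so far */
    uint64_t leader_epoch;   /* epoch of the last vote given */
    char leader[41];         /* who received that vote */
} vote_state;

typedef int (*persist_fn)(const vote_state *);

/* Stand-ins for a real "write config + fsync" routine. */
static int persist_ok(const vote_state *s)   { (void)s; return 0; }
static int persist_fail(const vote_state *s) { (void)s; return -1; }

/* Returns the leader to report in the reply, or NULL when the new
 * state could not be made durable (in which case no reply is sent
 * and the in-memory state is left untouched). */
static const char *vote_sketch(vote_state *s, uint64_t req_epoch,
                               const char *candidate, persist_fn persist) {
    vote_state next = *s;
    int changed = 0;

    if (req_epoch > s->current_epoch) {
        next.current_epoch = req_epoch;
        changed = 1;
    }
    /* Grant a vote only for an epoch newer than the last vote given. */
    if (s->leader_epoch < req_epoch) {
        next.leader_epoch = req_epoch;
        strncpy(next.leader, candidate, sizeof(next.leader) - 1);
        next.leader[sizeof(next.leader) - 1] = '\0';
        changed = 1;
    }
    if (changed) {
        if (persist(&next) != 0) return NULL;   /* durable before reply */
        *s = next;
    }
    return s->leader;
}
```

In this sketch a second request for the same epoch gets the earlier answer back, and because the state is saved before it is reported, a crash and restart cannot produce a different vote for that epoch.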
  5. 14 Mar, 2014 2 commits
    • A
      Sentinel: fake PUBLISH command to receive HELLO messages. · 6b0e36ff
      antirez committed
      Now the way HELLO messages are received is unified.
      Now it is no longer needed for Sentinels to converge to the higher
      configuration for a master to be able to chat via some Redis instance,
      they are able to directly exchange configurations.
      
      Note that this commit does not include the (trivial) change needed to
      send HELLO messages to Sentinel instances as well, since by mistake I
      committed the change in the previous commit that refactored hello
      messages processing into a separate function.
      6b0e36ff
    • A
  6. 13 Mar, 2014 1 commit
  7. 11 Mar, 2014 6 commits
  8. 10 Mar, 2014 2 commits
  9. 05 Mar, 2014 3 commits