2244 Commits

Author SHA1 Message Date
Jan-Erik Rediger
a2ec9a9068 Small typo fixed 2014-03-25 08:02:56 +01:00
Matt Stancliff
88c6c669c3 Fix data loss when save AOF/RDB with no free space
Previously, the (!fp) would only catch lack of free space
under OS X.  Linux waits to discover it can't write until
it actually writes contents to disk.

(fwrite() returns success even if the underlying file
has no free space to write into.  All the errors
only show up at flush/sync/close time.)

Fixes antirez/redis#1604
2014-03-24 21:56:22 +01:00
antirez
4ebc7e3738 Sample and cache RSS in serverCron().
Obtaining the RSS (Resident Set Size) info is slow in Linux and OSX.
This slowed down the generation of the INFO 'memory' section.

Since the RSS does not require to be a real-time measurement, we
now sample it with server.hz frequency (10 times per second by default)
and use this value both to show the INFO rss field and to compute the
fragmentation ratio.

Practically this does not make any difference for memory profiling of
Redis but speeds up the INFO call significantly.
2014-03-24 12:03:44 +01:00
antirez
72257f4b87 sdscatvprintf(): Try to use a static buffer.
For small content the function now tries to use a static buffer to avoid
a malloc/free cycle that is too costly when the function is used in the
context of performance critical code path such as INFO output generation.

This change was verified to have positive effects in the execution speed
of the INFO command.
2014-03-24 10:23:58 +01:00
antirez
7054f2a029 Cache uname() output across INFO calls.
Uname was profiled to be a slow syscall. It produces always the same
output in the context of a single execution of Redis, so calling it at
every INFO output generation does not make too much sense.

The uname utsname structure was modified as a static variable. At the
same time a static integer was added to check if we need to call uname
the first time.
2014-03-24 10:02:58 +01:00
antirez
7a20f09647 sdscatvprintf(): guess buflen using format length.
sdscatvprintf() uses a loop where it tries to output the formatted
string in a buffer of the initial length, if there was not enough room,
a buffer of doubled size is tried and so forth.

The initial guess for the buffer length was very poor, an hardcoded
"16". This caused the printf to be processed multiple times without a
good reason. Given that printf functions are already not fast, the
overhead was significant.

The new heuristic is to use a buffer 4 times the length of the format
buffer, and 32 as minimal size. This appears to be a good balance for
typical uses of the function inside the Redis code base.

This change improved INFO command performances 3 times.
2014-03-24 09:45:18 +01:00
antirez
001775f5fb Use 24 bits for the lru object field and improve resolution.
There were 2 spare bits inside the Redis object structure that are now
used in order to enlarge 4x the range of the LRU field.

At the same time the resolution was improved from 10 to 1 second: this
still provides 194 days before the LRU counter overflows (restarting from
zero).

This is not a problem since it only causes lack of eviction precision for
objects not touched for a very long time, and the lack of precision is
only temporary.
2014-03-21 11:19:28 +01:00
antirez
c68189a19f Specify lruclock in redisServer structure via REDIS_LRU_BITS.
The padding field was totally useless: removed.
2014-03-21 11:17:52 +01:00
antirez
ff8c81873a Set LRU parameters via REDIS_LRU_BITS define. 2014-03-21 11:17:36 +01:00
antirez
e3b71a1c02 Unify stats reset for CONFIG RESETSTAT / initServer().
Now CONFIG RESETSTAT makes sure to reset all the fields, and in the
future it will be simpler to avoid missing new fields.
2014-03-21 11:16:12 +01:00
antirez
0937377aa2 Sentinel: sentinelRefreshInstanceInfo() minor refactoring.
Test sentinel.tilt condition on top and return if it is true.
This allows to remove the check for the tilt condition in the remaining
code paths of the function.
2014-03-21 11:16:12 +01:00
antirez
9c2063fb8f Sentinel: propagate down-after-ms changes to slaves and sentinels. 2014-03-21 11:16:12 +01:00
antirez
ffa8f479a5 Sentinel: down-after-milliseconds is not master-specific.
addReplySentinelRedisInstance() modified so that this field is displayed
for all the kind of instances: Sentinels, Masters, Slaves.
2014-03-21 11:16:12 +01:00
antirez
42091a79bb Sentinel failure detection implementation improved.
Failure detection in Sentinel is ping-pong based. It used to work by
remembering the last time a valid PONG reply was received, and checking
if the reception time was too old compared to the current current time.

PINGs were sent at a fixed interval of 1 second.

This works in a decent way, but does not scale well when we want to set
very small values of "down-after-milliseconds" (this is the node
timeout basically).

This commit reiplements the failure detection making a number of
changes. Some changes are inspired to Redis Cluster failure detection
code:

* A new last_ping_time field is added in representation of instances.
  If non zero, we have an active ping that was sent at the specified
  time. When a valid reply to ping is received, the field is zeroed
  again.
* last_ping_time is not reset when we reconnect the link or send a new
  ping, so from our point of view it represents the time we started
  waiting for the instance to reply to our pings without receiving a
  reply.
* last_ping_time is now used in order to check if the instance is
  timed out. This means that we can have a node timeout of 100
  milliseconds and yet the system will work well since the new check is
  not bound to the period used to send pings.
* Pings are now sent every second, or often if the value of
  down-after-milliseconds is less than one second. With a lower limit of
  10 HZ ping frequency.
* Link reconnection code was improved. This is used in order to try to
  reconnect the link when we are at 50% of the node timeout without a
  valid reply received yet. However the old code triggered unnecessary
  reconnections when the node timeout was very small. Now that should be
  ok.

The new code passes the tests but more testing is needed and more unit
tests stressing the failure detector, so currently this is merged only
in the unstable branch.
2014-03-21 11:16:11 +01:00
antirez
38241c4b1e Sentinel: use CLIENT SETNAME when connecting to Redis.
This makes debugging / monitoring of Sentinels simpler since you can
identify sentinels in CLIENT LIST output of Redis instances.
2014-03-21 11:16:11 +01:00
Matt Stancliff
9de0755869 Fix segfault from accessing array out of bounds
argc == 2; argv[2] == crash
2014-03-14 22:57:10 +01:00
antirez
a31a0b4336 Sentinel: be safe under crash-recovery assumptions.
Sentinel's main safety argument is that there are no two configurations
for the same master with the same version (configuration epoch).

For this to be true Sentinels require to be authorized by a majority.
Additionally Sentinels require to do two important things:

* Never vote again for the same epoch.
* Never exchange an old vote for a fresh one.

The first prerequisite, in a crash-recovery system model, requires to
persist the master->leader_epoch on durable storage before to reply to
messages. This was not the case.

We also make sure to persist the current epoch in order to never reply
to stale votes requests from other Sentinels, after a recovery.

The configuration is persisted by making use of fsync(), this is
considered in the context of this code a good enough guarantee that
after a restart our durable state is restored, however this may not
always be the case depending on the kind of hardware and operating
system used.
2014-03-14 22:57:06 +01:00
antirez
6b0e36ffc6 Sentinel: fake PUBLISH command to receive HELLO messages.
Now the way HELLO messages are received is unified.
Now it is no longer needed for Sentinels to converge to the higher
configuration for a master to be able to chat via some Redis instance,
the are able to directly exchanges configurations.

Note that this commit does not include the (trivial) change needed to
send HELLO messages to Sentinel instances as well, since for an error I
committed the change in the previous commit that refactored hello
messages processing into a separated function.
2014-03-14 11:04:54 +01:00
antirez
bd48ff69b0 Sentinel: HELLO processing refactored into sentinelProcessHelloMessage(). 2014-03-14 10:56:56 +01:00
antirez
3703112671 Linenoise updated, multiline mode enabled in redis-cli. 2014-03-13 15:12:04 +01:00
zhanghailei
ccd2c18c1c According to context,the size should be 16 rather than 64 2014-03-11 10:12:19 +01:00
zhanghailei
daa5d2a6c9 FIXED a typo more thank should be more than 2014-03-11 10:12:13 +01:00
zhanghailei
ca0720b694 refer to updateLRUClock's comment REDIS_LRU_CLOCK_MAX is 22 bits,but #define REDIS_LRU_CLOCK_MAX ((1<<21)-1) only 21 bits 2014-03-11 10:12:00 +01:00
Matt Stancliff
ffe742f92a Reset op_sec_last_sample_ops when reset requested
This value needs to be set to zero (in addition to
stat_numcommands) or else people may see
a negative operations per second count after they
run CONFIG RESETSTAT.

Fixes antirez/redis#1577
2014-03-11 10:10:54 +01:00
antirez
eb9e1526e6 DEBUG ERROR implemented.
The new "error" subcommand of the DEBUG command can reply with an user
selected error, specified as its sole argument:

    DEBUG ERROR "LOADING please wait..."

The error is generated just prefixing the command argument with a "-"
character, and replacing newlines with spaces (since error replies can't
include newlines).

The goal of the command is to help in Client libraries unit tests by
making simple to simulate a command call triggering a given error.
2014-03-10 23:04:37 +01:00
Matt Stancliff
a6970570e3 Fix return value check for anetTcpAccept
anetTcpAccept returns ANET_ERR, not AE_ERR.

This isn't a physical error since both ANET_ERR
and AE_ERR are -1, but better to be consistent.
2014-03-10 15:47:20 +01:00
antirez
fe0ab7d234 Fixed memory leak in SORT LIMIT option argument parsing on error. 2014-03-10 15:45:29 +01:00
antirez
464fef9bf8 Redis 2.8.7. 2014-03-05 14:42:50 +01:00
antirez
aab16ead92 Cast saveparams[].seconds to long for %ld format specifier. 2014-03-05 11:26:50 +01:00
antirez
55b8f6ec1c Document why we update peak memory in INFO. 2014-03-05 10:16:20 +01:00
Matt Stancliff
647a261465 Force INFO used_memory_peak to match peak memory
used_memory_peak only updates in serverCron every server.hz,
but Redis can use more memory and a user can request memory
INFO before used_memory_peak gets updated in the next
cron run.

This patch updates used_memory_peak to the current
memory usage if the current memory usage is higher
than the recorded used_memory_peak value.

(And it only calls zmalloc_used_memory() once instead of
twice as it was doing before.)
2014-03-05 10:16:16 +01:00
antirez
9c66cd91be Sentinel test: Makefile target added. 2014-03-05 10:16:12 +01:00
michael-grunder
6991792943 Improved bigkeys with progress, pipelining and summary
This commit reworks the redis-cli --bigkeys command to provide more
information about our progress as well as output summary information
when we're done.

 - We now show an approximate percentage completion as we go
 - Hiredis pipelining is used for TYPE and SIZE retreival
 - A summary of keyspace distribution and overall breakout at the end
2014-03-05 10:16:06 +01:00
antirez
1606978a0b Sentinel: more aggressive failover start desynchronization.
Sentinel needs to avoid split brain conditions due to multiple sentinels
trying to get voted at the exact same time.

So far some desynchronization was provided by fluctuating server.hz,
that is the frequency of the timer function call. However the
desynchonization provided in this way was not enough when using many
Sentinel instances, especially when a large quorum value is used in
order to force a greater degree of agreement (more than N/2+1).

It was verified that it was likely to trigger a split brain
condition, forcing the system to try again after a timeout.
Usually the system will succeed after a few retries, but this is not
optimal.

This commit desynchronizes instances in a more effective way to make it
likely that the first attempt will be successful.
2014-03-05 10:15:32 +01:00
antirez
6200191943 CONFIG REWRITE should be logged at WARNING level. 2014-03-05 10:15:32 +01:00
antirez
c1cc28f230 warnigns -> warnings in redisBitpos(). 2014-02-27 15:55:56 +01:00
antirez
d79f9ebdb5 More consistent BITPOS behavior with bit=0 and ranges.
With the new behavior it is possible to specify just the start in the
range (the end will be assumed to be the first byte), or it is possible
to specify both start and end.

This is useful to change the behavior of the command when looking for
zeros inside a string.

1) If the user specifies both start and end, and no 0 is found inside
   the range, the command returns -1.

2) If instead no range is specified, or just the start is given, even
   if in the actual string no 0 bit is found, the command returns the
   first bit on the right after the end of the string.

So for example if the string stored at key foo is "\xff\xff":

    BITPOS foo (returns 16)
    BITPOS foo 0 -1 (returns -1)
    BITPOS foo 0 (returns 16)

The idea is that when no end is given the user is just looking for the
first bit that is zero and can be set to 1 with SETBIT, as it is
"available". Instead when a specific range is given, we just look for a
zero within the boundaries of the range.
2014-02-27 15:55:47 +01:00
antirez
25e2791ec3 Initial implementation of BITPOS.
It appears to work but more stress testing, and both unit tests and
fuzzy testing, is needed in order to ensure the implementation is sane.
2014-02-27 15:54:31 +01:00
antirez
580ed0768f Fix misaligned word access in redisPopcount(). 2014-02-27 15:53:40 +01:00
antirez
65194d11c3 redis-cli: also remove useless uint8_t. 2014-02-25 15:09:15 +01:00
antirez
16c2189d4e redis-cli: don't use uint64_t where actually not needed.
The computation is just something to take the CPU busy, no need to use a
specific type. Since stdint.h was not included this prevented
compilation on certain systems.
2014-02-25 15:09:15 +01:00
antirez
1b2bcd4210 redis-cli: check argument existence for --pattern. 2014-02-25 15:09:15 +01:00
antirez
6ec3dc2c42 redis-cli: --intrinsic-latency run mode added. 2014-02-25 15:09:15 +01:00
antirez
02ae6cb01a redis-cli: added comments to split program in parts. 2014-02-25 15:09:15 +01:00
michael-grunder
cf76fdd04f Update --bigkeys to use SCAN
This commit changes the findBigKeys() function in redis-cli.c to use the new
SCAN command for iterating the keyspace, rather than RANDOMKEY.  Because we
can know when we're done using SCAN, it will exit after exhausting the keyspace.
2014-02-25 15:08:38 +01:00
antirez
85fa77e02e Sentinel: log quorum with +monitor event. 2014-02-25 10:30:35 +01:00
antirez
6e6106797e Sentinel: generate +monitor events at startup. 2014-02-25 10:30:35 +01:00
antirez
96162c0c58 Sentinel: log +monitor and +set events.
Now that we have a runtime configuration system, it is very important to
be able to log how the Sentinel configuration changes over time because
of API calls.
2014-02-25 10:30:35 +01:00
antirez
39eacde136 Sentinel: added missing exit(1) after checking for config file. 2014-02-25 10:30:35 +01:00
antirez
d83ab8f9d4 Sentinel: IDONTKNOW error removed.
This error was conceived for the older version of Sentinel that worked
via master redirection and that was not able to get configuration
updates from other Sentinels via the Pub/Sub channel of masters or
slaves.

This reply does not make sense today, every Sentinel should reply with
the best information it has currently. The error will make even more
sense in the future since the plan is to allow Sentinels to update the
configuration of other Sentinels via gossip with a direct chat without
the prerequisite that they have at least a monitored instance in common.
2014-02-25 10:30:35 +01:00