3592 Commits

Author SHA1 Message Date
antirez
9c2063fb8f Sentinel: propagate down-after-ms changes to slaves and sentinels. 2014-03-21 11:16:12 +01:00
antirez
ffa8f479a5 Sentinel: down-after-milliseconds is not master-specific.
addReplySentinelRedisInstance() modified so that this field is displayed
for all kinds of instances: Sentinels, Masters, Slaves.
2014-03-21 11:16:12 +01:00
antirez
42091a79bb Sentinel failure detection implementation improved.
Failure detection in Sentinel is ping-pong based. It used to work by
remembering the last time a valid PONG reply was received, and checking
if the reception time was too old compared to the current time.

PINGs were sent at a fixed interval of 1 second.

This works in a decent way, but does not scale well when we want to set
very small values of "down-after-milliseconds" (this is the node
timeout basically).

This commit reimplements the failure detection, making a number of
changes. Some changes are inspired by the Redis Cluster failure detection
code:

* A new last_ping_time field is added in representation of instances.
  If non zero, we have an active ping that was sent at the specified
  time. When a valid reply to ping is received, the field is zeroed
  again.
* last_ping_time is not reset when we reconnect the link or send a new
  ping, so from our point of view it represents the time we started
  waiting for the instance to reply to our pings without receiving a
  reply.
* last_ping_time is now used in order to check if the instance is
  timed out. This means that we can have a node timeout of 100
  milliseconds and yet the system will work well since the new check is
  not bound to the period used to send pings.
* Pings are now sent every second, or more often if the value of
  down-after-milliseconds is less than one second, with a lower limit of
  10 HZ ping frequency.
* Link reconnection code was improved. This is used in order to try to
  reconnect the link when we are at 50% of the node timeout without a
  valid reply received yet. However the old code triggered unnecessary
  reconnections when the node timeout was very small. Now that should be
  ok.
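
The list above can be sketched as follows. This is a minimal
illustration, not the actual Sentinel code: the instance_t type, the
function names and the PING_PERIOD_MS constant are hypothetical, and the
10 HZ bound on the ping frequency is not modeled.

```c
#include <stdint.h>

typedef int64_t mstime_t;

/* Hypothetical, simplified instance record (not the real Sentinel struct). */
typedef struct {
    mstime_t last_ping_time;    /* 0 = no ping pending, otherwise the time
                                   of the oldest unanswered ping. */
    mstime_t down_after_period; /* down-after-milliseconds. */
} instance_t;

#define PING_PERIOD_MS 1000 /* Assumed default ping period. */

/* Called whenever a ping is sent. The field is not overwritten while a
 * ping is pending, so it keeps the time we started waiting for a reply. */
void instance_ping_sent(instance_t *ri, mstime_t now) {
    if (ri->last_ping_time == 0) ri->last_ping_time = now;
}

/* Called when a valid reply to ping is received: zero the field again. */
void instance_pong_received(instance_t *ri) {
    ri->last_ping_time = 0;
}

/* The timeout check depends only on how long we have been waiting,
 * not on the period used to send pings. */
int instance_is_timed_out(const instance_t *ri, mstime_t now) {
    return ri->last_ping_time != 0 &&
           (now - ri->last_ping_time) > ri->down_after_period;
}

/* Ping every second, or more often for sub-second timeouts. */
mstime_t instance_ping_period(const instance_t *ri) {
    return ri->down_after_period < PING_PERIOD_MS ?
           ri->down_after_period : PING_PERIOD_MS;
}
```

With a 100 milliseconds timeout, an instance that got its first
unanswered ping at time 1000 is considered timed out from time 1101 on,
regardless of how many further pings were sent in the meantime.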

The new code passes the tests but more testing is needed and more unit
tests stressing the failure detector, so currently this is merged only
in the unstable branch.
2014-03-21 11:16:11 +01:00
antirez
38241c4b1e Sentinel: use CLIENT SETNAME when connecting to Redis.
This makes debugging / monitoring of Sentinels simpler since you can
identify sentinels in CLIENT LIST output of Redis instances.
2014-03-21 11:16:11 +01:00
Matt Stancliff
9de0755869 Fix segfault from accessing array out of bounds
argc == 2; argv[2] == crash
2014-03-14 22:57:10 +01:00
antirez
a31a0b4336 Sentinel: be safe under crash-recovery assumptions.
Sentinel's main safety argument is that there are no two configurations
for the same master with the same version (configuration epoch).

For this to be true Sentinels must be authorized by a majority.
Additionally Sentinels must do two important things:

* Never vote again for the same epoch.
* Never exchange an old vote for a fresh one.

The first prerequisite, in a crash-recovery system model, requires
persisting master->leader_epoch on durable storage before replying to
messages. This was not the case.

We also make sure to persist the current epoch in order to never reply
to stale vote requests from other Sentinels after a recovery.

The configuration is persisted using fsync(). In the context of this
code this is considered a good enough guarantee that our durable state
is restored after a restart, although this may not always be the case
depending on the kind of hardware and operating system used.
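
A minimal sketch of the two voting rules, under stated assumptions: the
vote_state_t type and both function names are hypothetical, and the
actual write to disk is left to the caller, who is expected to persist
the state and fsync() it before sending the reply.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified voting state (not the real Sentinel structs). */
typedef struct {
    uint64_t current_epoch; /* Highest epoch seen so far. */
    uint64_t leader_epoch;  /* Epoch of the last vote we granted. */
    char leader[41];        /* ID of the instance we voted for. */
} vote_state_t;

/* Returns 1 if a vote for req_epoch may be granted, 0 otherwise. */
int vote_allowed(const vote_state_t *st, uint64_t req_epoch) {
    if (req_epoch < st->current_epoch) return 0; /* Stale request. */
    if (req_epoch <= st->leader_epoch) return 0; /* Already voted. */
    return 1;
}

/* Records the vote in memory. The caller must make this state durable
 * before replying, otherwise a crash could lead to a second vote for
 * the same epoch after the restart. */
void vote_record(vote_state_t *st, uint64_t req_epoch, const char *candidate) {
    st->current_epoch = req_epoch;
    st->leader_epoch = req_epoch;
    strncpy(st->leader, candidate, sizeof(st->leader) - 1);
    st->leader[sizeof(st->leader) - 1] = '\0';
}
```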
2014-03-14 22:57:06 +01:00
antirez
6b0e36ffc6 Sentinel: fake PUBLISH command to receive HELLO messages.
Now the way HELLO messages are received is unified.
Sentinels no longer need to chat via some Redis instance in order to
converge to the higher configuration for a master: they are able to
directly exchange configurations.

Note that this commit does not include the (trivial) change needed to
send HELLO messages to Sentinel instances as well, since by mistake I
committed that change in the previous commit, which refactored hello
message processing into a separate function.
2014-03-14 11:04:54 +01:00
antirez
bd48ff69b0 Sentinel: HELLO processing refactored into sentinelProcessHelloMessage(). 2014-03-14 10:56:56 +01:00
antirez
3703112671 Linenoise updated, multiline mode enabled in redis-cli. 2014-03-13 15:12:04 +01:00
zhanghailei
ccd2c18c1c According to context, the size should be 16 rather than 64 2014-03-11 10:12:19 +01:00
zhanghailei
daa5d2a6c9 FIXED a typo more thank should be more than 2014-03-11 10:12:13 +01:00
zhanghailei
ca0720b694 refer to updateLRUClock's comment REDIS_LRU_CLOCK_MAX is 22 bits,but #define REDIS_LRU_CLOCK_MAX ((1<<21)-1) only 21 bits 2014-03-11 10:12:00 +01:00
Matt Stancliff
ffe742f92a Reset op_sec_last_sample_ops when reset requested
This value needs to be set to zero (in addition to
stat_numcommands) or else people may see
a negative operations per second count after they
run CONFIG RESETSTAT.

Fixes antirez/redis#1577
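
Why the count can go negative can be sketched like this. The stats_t
type and the function names are hypothetical, and the sampling is
simplified to a single difference between two counters.

```c
/* Hypothetical, simplified statistics record. */
typedef struct {
    long long stat_numcommands;    /* Total commands processed. */
    long long last_sample_ops;     /* stat_numcommands at the last sample. */
} stats_t;

/* Takes one sample: operations per second over the elapsed interval. */
long long ops_per_sec_sample(stats_t *s, long long elapsed_ms) {
    long long ops = s->stat_numcommands - s->last_sample_ops;
    s->last_sample_ops = s->stat_numcommands;
    return elapsed_ms > 0 ? ops * 1000 / elapsed_ms : 0;
}

/* Resetting only stat_numcommands would leave last_sample_ops larger
 * than the counter, making the next difference negative. */
void reset_stats(stats_t *s) {
    s->stat_numcommands = 0;
    s->last_sample_ops = 0;
}
```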
2014-03-11 10:10:54 +01:00
antirez
6fb9e2d717 Typo in sentinel.conf, exists -> exits. 2014-03-11 10:10:30 +01:00
antirez
eb9e1526e6 DEBUG ERROR implemented.
The new "error" subcommand of the DEBUG command can reply with a
user-selected error, specified as its sole argument:

    DEBUG ERROR "LOADING please wait..."

The error is generated by just prefixing the command argument with a "-"
character, and replacing newlines with spaces (since error replies can't
include newlines).

The goal of the command is to help in client library unit tests by
making it simple to simulate a command call triggering a given error.
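
The reply generation described above can be sketched as follows. The
function name make_error_reply() is hypothetical and this is not the
code used by Redis. Replacing carriage returns in addition to newlines
is an assumption of this sketch; the trailing "\r\n" is the usual
terminator of an error reply in the Redis protocol.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated "-<msg>\r\n" string with any CR or LF
 * inside msg replaced by a space, or NULL on out of memory.
 * The caller must free() the result. */
char *make_error_reply(const char *msg) {
    size_t len = strlen(msg);
    char *reply = malloc(len + 4); /* '-' + msg + "\r\n" + '\0' */
    if (reply == NULL) return NULL;

    reply[0] = '-';
    for (size_t i = 0; i < len; i++) {
        char c = msg[i];
        reply[i + 1] = (c == '\r' || c == '\n') ? ' ' : c;
    }
    memcpy(reply + len + 1, "\r\n", 3); /* Copies the terminating '\0' too. */
    return reply;
}
```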
2014-03-10 23:04:37 +01:00
Matt Stancliff
a6970570e3 Fix return value check for anetTcpAccept
anetTcpAccept returns ANET_ERR, not AE_ERR.

This isn't a physical error since both ANET_ERR
and AE_ERR are -1, but better to be consistent.
2014-03-10 15:47:20 +01:00
antirez
fe0ab7d234 Fixed memory leak in SORT LIMIT option argument parsing on error. 2014-03-10 15:45:29 +01:00
antirez
464fef9bf8 Redis 2.8.7. 2.8.7 2014-03-05 14:42:50 +01:00
antirez
aab16ead92 Cast saveparams[].seconds to long for %ld format specifier. 2014-03-05 11:26:50 +01:00
antirez
86afc0706d Merge branch '2.8' of github.com:/antirez/redis into 2.8 2014-03-05 10:22:56 +01:00
antirez
55b8f6ec1c Document why we update peak memory in INFO. 2014-03-05 10:16:20 +01:00
Matt Stancliff
647a261465 Force INFO used_memory_peak to match peak memory
used_memory_peak only updates in serverCron every server.hz,
but Redis can use more memory and a user can request memory
INFO before used_memory_peak gets updated in the next
cron run.

This patch updates used_memory_peak to the current
memory usage if the current memory usage is higher
than the recorded used_memory_peak value.

(And it only calls zmalloc_used_memory() once instead of
twice as it was doing before.)
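
A minimal sketch of the idea, with hypothetical names (server_stats_t,
info_refresh_peak): the caller reads the current memory usage once and
passes it in, and the recorded peak is raised if needed before being
reported.

```c
#include <stddef.h>

/* Hypothetical, simplified holder for the recorded peak. */
typedef struct {
    size_t stat_peak_memory;
} server_stats_t;

/* Raises the recorded peak to used_now if used_now is higher, then
 * returns the peak to report. used_now is expected to come from a
 * single call to the allocator's usage function. */
size_t info_refresh_peak(server_stats_t *s, size_t used_now) {
    if (used_now > s->stat_peak_memory) s->stat_peak_memory = used_now;
    return s->stat_peak_memory;
}
```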
2014-03-05 10:16:16 +01:00
antirez
9c66cd91be Sentinel test: Makefile target added. 2014-03-05 10:16:12 +01:00
michael-grunder
6991792943 Improved bigkeys with progress, pipelining and summary
This commit reworks the redis-cli --bigkeys command to provide more
information about our progress as well as output summary information
when we're done.

 - We now show an approximate percentage completion as we go
 - Hiredis pipelining is used for TYPE and SIZE retrieval
 - A summary of keyspace distribution and overall breakout at the end
2014-03-05 10:16:06 +01:00
antirez
7d65b7199a BITPOS fuzzy testing. 2014-03-05 10:16:02 +01:00
antirez
c19cfde65d Basic BITPOS tests. 2014-03-05 10:15:55 +01:00
antirez
55f4b20f31 Sentinel test: set less time sensitive defaults.
This commit sets the failover timeout to 30 seconds instead of the 180
seconds default, and allows multiple slaves to be reconfigured at the
same time.

This makes tests less sensitive to timing, with the result that there are
fewer false positives due to normal behaviors that require time to
succeed or to be retried.

However the long term solution is probably some way to detect when a
test failed because of timing issues (for example split brain during
leader election) and retry it.
2014-03-05 10:15:32 +01:00
antirez
1606978a0b Sentinel: more aggressive failover start desynchronization.
Sentinel needs to avoid split brain conditions due to multiple sentinels
trying to get voted at the exact same time.

So far some desynchronization was provided by fluctuating server.hz,
that is the frequency of the timer function call. However the
desynchronization provided in this way was not enough when using many
Sentinel instances, especially when a large quorum value is used in
order to force a greater degree of agreement (more than N/2+1).

It was verified that it was likely to trigger a split brain
condition, forcing the system to try again after a timeout.
Usually the system will succeed after a few retries, but this is not
optimal.

This commit desynchronizes instances in a more effective way to make it
likely that the first attempt will be successful.
2014-03-05 10:15:32 +01:00
antirez
6200191943 CONFIG REWRITE should be logged at WARNING level. 2014-03-05 10:15:32 +01:00
antirez
2af2173a60 Sentinel test: debugging console improved. 2014-03-05 10:15:32 +01:00
antirez
1dc1e31c2b Sentinel test: initial debugging console. 2014-03-05 10:15:32 +01:00
antirez
b132121a71 Sentinel test: be more patient in create_redis_master_slave_cluster. 2014-03-05 10:15:32 +01:00
antirez
d1d706c923 Sentinel test: add test start time in output. 2014-03-05 10:15:32 +01:00
antirez
c99cd2fd40 Sentinel test: use 1000 as retry in initial 00 unit test. 2014-03-05 10:15:32 +01:00
antirez
f33d4a6c24 Sentinel test: initial tests in 03 unit. 2014-03-05 10:15:32 +01:00
antirez
c2d99d49b5 Sentinel test: foreach_instance_id now supports 'continue'. 2014-03-05 10:15:32 +01:00
antirez
e40f8378bd Sentinel test: fixed typo in unit 03 top comment. 2014-03-05 10:15:32 +01:00
antirez
a0335e18ba Sentinel test: Makefile target added. 2014-02-28 16:00:17 +01:00
antirez
950cb76e7b BITPOS fuzzy testing. 2014-02-27 15:58:34 +01:00
antirez
42e3630d77 Basic BITPOS tests. 2014-02-27 15:58:30 +01:00
antirez
1892b56224 warnigns -> warnings in redisBitpos(). 2014-02-27 15:58:27 +01:00
antirez
2c8036f7b2 More consistent BITPOS behavior with bit=0 and ranges.
With the new behavior it is possible to specify just the start in the
range (the end will be assumed to be the last byte), or it is possible
to specify both start and end.

This is useful to change the behavior of the command when looking for
zeros inside a string.

1) If the user specifies both start and end, and no 0 is found inside
   the range, the command returns -1.

2) If instead no range is specified, or just the start is given, even
   if in the actual string no 0 bit is found, the command returns the
   first bit on the right after the end of the string.

So for example if the string stored at key foo is "\xff\xff":

    BITPOS foo (returns 16)
    BITPOS foo 0 -1 (returns -1)
    BITPOS foo 0 (returns 16)

The idea is that when no end is given the user is just looking for the
first bit that is zero and can be set to 1 with SETBIT, as it is
"available". Instead when a specific range is given, we just look for a
zero within the boundaries of the range.
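
The two cases can be sketched as follows. This is a simplified
illustration, not the redisBitpos() implementation: first_zero_bit() and
its end_given flag are hypothetical, offsets are non-negative byte
offsets, and an empty string or a start past the end simply returns -1
here, which is not necessarily what the command does. Bits are numbered
from the most significant bit of the first byte, as SETBIT does.

```c
#include <stddef.h>

/* Returns the position of the first 0 bit in bytes start..end
 * (inclusive) of s. If none is found, returns -1 when the caller gave
 * an explicit end, or the first bit position after the end of the
 * string when it did not. */
long first_zero_bit(const unsigned char *s, size_t len,
                    size_t start, size_t end, int end_given) {
    if (len == 0 || start >= len) return -1; /* Simplification. */
    if (end >= len) end = len - 1;

    for (size_t byte = start; byte <= end; byte++) {
        for (int bit = 0; bit < 8; bit++) {
            if (!(s[byte] & (0x80 >> bit)))
                return (long)(byte * 8 + bit);
        }
    }
    return end_given ? -1 : (long)(len * 8);
}
```

For the two byte string "\xff\xff" this gives 16 when no end is given
and -1 when both start and end are given, matching the example above.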
2014-02-27 15:58:21 +01:00
antirez
3294f74fef Initial implementation of BITPOS.
It appears to work but more stress testing, with both unit tests and
fuzzy testing, is needed in order to ensure the implementation is sane.
2014-02-27 15:58:14 +01:00
antirez
82d2e295b8 Added two more BITCOUNT tests stressing misaligned access. 2014-02-27 15:57:56 +01:00
antirez
a3eb3f9c3b BITCOUNT fuzzy test with random start/end added.
It was verified in practice that this test stresses the implementation
much more: deliberately introduced errors were trivial to detect with
different offsets, but impossible to detect when always starting at zero
and counting bits over the full length of the string.
2014-02-27 15:57:53 +01:00
antirez
30a92b6c76 Fix misaligned word access in redisPopcount(). 2014-02-27 15:57:49 +01:00
antirez
c1cc28f230 warnigns -> warnings in redisBitpos(). 2014-02-27 15:55:56 +01:00
antirez
d79f9ebdb5 More consistent BITPOS behavior with bit=0 and ranges.
With the new behavior it is possible to specify just the start in the
range (the end will be assumed to be the last byte), or it is possible
to specify both start and end.

This is useful to change the behavior of the command when looking for
zeros inside a string.

1) If the user specifies both start and end, and no 0 is found inside
   the range, the command returns -1.

2) If instead no range is specified, or just the start is given, even
   if in the actual string no 0 bit is found, the command returns the
   first bit on the right after the end of the string.

So for example if the string stored at key foo is "\xff\xff":

    BITPOS foo (returns 16)
    BITPOS foo 0 -1 (returns -1)
    BITPOS foo 0 (returns 16)

The idea is that when no end is given the user is just looking for the
first bit that is zero and can be set to 1 with SETBIT, as it is
"available". Instead when a specific range is given, we just look for a
zero within the boundaries of the range.
2014-02-27 15:55:47 +01:00
antirez
25e2791ec3 Initial implementation of BITPOS.
It appears to work but more stress testing, with both unit tests and
fuzzy testing, is needed in order to ensure the implementation is sane.
2014-02-27 15:54:31 +01:00
antirez
c955b47d62 Added two more BITCOUNT tests stressing misaligned access. 2014-02-27 15:53:40 +01:00