4005 Commits

Author SHA1 Message Date
antirez
4d5ba5962c Cast saveparams[].seconds to long for %ld format specifier. 2014-03-05 11:26:46 +01:00
antirez
ef4e465316 Sentinel test: set less time sensitive defaults.
This commit sets the failover timeout to 30 seconds instead of the 180
seconds default, and allows to reconfigure multiple slaves at the same
time.

This makes tests less sensible to timing, with the result that there are
less false positives due to normal behaviors that require time to
succeed or to be retried.

However the long term solution is probably some way in order to detect
when a test failed because of timing issues (for example split brain
during leader election) and retry it.
2014-03-05 10:22:07 +01:00
antirez
313f8831ed Sentinel: more aggressive failover start desynchronization.
Sentinel needs to avoid split brain conditions due to multiple sentinels
trying to get voted at the exact same time.

So far some desynchronization was provided by fluctuating server.hz,
that is the frequency of the timer function call. However the
desynchonization provided in this way was not enough when using many
Sentinel instances, especially when a large quorum value is used in
order to force a greater degree of agreement (more than N/2+1).

It was verified that it was likely to trigger a split brain
condition, forcing the system to try again after a timeout.
Usually the system will succeed after a few retries, but this is not
optimal.

This commit desynchronizes instances in a more effective way to make it
likely that the first attempt will be successful.
2014-03-05 10:22:07 +01:00
antirez
5ee2394474 CONFIG REWRITE should be logged at WARNING level. 2014-03-05 10:22:07 +01:00
antirez
fa8aca5236 Sentinel test: debugging console improved. 2014-03-05 10:22:07 +01:00
antirez
cf746c398c Sentinel test: initial debugging console. 2014-03-05 10:22:07 +01:00
antirez
cbf51b60a2 Sentinel test: be more patient in create_redis_master_slave_cluster. 2014-03-05 10:22:07 +01:00
antirez
23a12bb8c1 Sentiel test: add test start time in output. 2014-03-05 10:22:07 +01:00
antirez
d2e16801f0 Cluster: invalidate current transaction on redirections. 2014-03-05 10:22:07 +01:00
antirez
21557a0d8b Sentinel test: use 1000 as retry in initial 00 unit test. 2014-03-05 10:22:07 +01:00
antirez
290a5d7ff6 Sentinel test: initial tests in 03 unit. 2014-03-05 10:22:07 +01:00
antirez
b2668466f4 Sentinel test: foreach_instance_id now supports 'continue'. 2014-03-05 10:22:07 +01:00
antirez
a76bca16e6 Sentienl test: fixed typo in unit 03 top comment. 2014-03-05 10:22:07 +01:00
antirez
1d9eb47f9d Document why we update peak memory in INFO. 2014-03-05 10:22:07 +01:00
antirez
e4833ed8bf Fix configEpoch assignment when a cluster slot gets "closed".
This is still code to rework in order to use agreement to obtain a new
configEpoch when a slot is migrated, however this commit handles the
special case that happens when the nodes are just started and everybody
has a configEpoch of 0. In this special condition to have the maximum
configEpoch is not enough as the special epoch 0 is not unique (all the
others are).

This does not fixes the intrinsic race condition of a failover happening
while we are resharding, that will be addressed later.
2014-03-05 10:22:07 +01:00
Matt Stancliff
7c092b679f Force INFO used_memory_peak to match peak memory
used_memory_peak only updates in serverCron every server.hz,
but Redis can use more memory and a user can request memory
INFO before used_memory_peak gets updated in the next
cron run.

This patch updates used_memory_peak to the current
memory usage if the current memory usage is higher
than the recorded used_memory_peak value.

(And it only calls zmalloc_used_memory() once instead of
twice as it was doing before.)
2014-03-05 10:22:07 +01:00
michael-grunder
23addbb5a3 Improved bigkeys with progress, pipelining and summary
This commit reworks the redis-cli --bigkeys command to provide more
information about our progress as well as output summary information
when we're done.

 - We now show an approximate percentage completion as we go
 - Hiredis pipelining is used for TYPE and SIZE retreival
 - A summary of keyspace distribution and overall breakout at the end
2014-03-05 10:22:07 +01:00
antirez
a46811693d Sentinel test: Makefile target added. 2014-02-28 16:00:14 +01:00
antirez
32db37c3c4 BITPOS fuzzy testing. 2014-02-27 15:52:43 +01:00
antirez
1d62f83364 Basic BITPOS tests. 2014-02-27 15:52:43 +01:00
antirez
9104f1e672 warnigns -> warnings in redisBitpos(). 2014-02-27 15:52:43 +01:00
antirez
eacc0951a2 More consistent BITPOS behavior with bit=0 and ranges.
With the new behavior it is possible to specify just the start in the
range (the end will be assumed to be the first byte), or it is possible
to specify both start and end.

This is useful to change the behavior of the command when looking for
zeros inside a string.

1) If the user specifies both start and end, and no 0 is found inside
   the range, the command returns -1.

2) If instead no range is specified, or just the start is given, even
   if in the actual string no 0 bit is found, the command returns the
   first bit on the right after the end of the string.

So for example if the string stored at key foo is "\xff\xff":

    BITPOS foo (returns 16)
    BITPOS foo 0 -1 (returns -1)
    BITPOS foo 0 (returns 16)

The idea is that when no end is given the user is just looking for the
first bit that is zero and can be set to 1 with SETBIT, as it is
"available". Instead when a specific range is given, we just look for a
zero within the boundaries of the range.
2014-02-27 15:52:43 +01:00
antirez
1f8005ca09 Initial implementation of BITPOS.
It appears to work but more stress testing, and both unit tests and
fuzzy testing, is needed in order to ensure the implementation is sane.
2014-02-27 15:52:43 +01:00
antirez
cbeaf17374 Added two more BITCOUNT tests stressing misaligned access. 2014-02-27 15:52:43 +01:00
antirez
9c206cc8e8 BITCOUNT fuzzy test with random start/end added.
It was verified in practice that this test is able to stress much more
the implementation by introducing errors that were only trivially to
detect with different offsets but impossible to detect starting always
at zero and counting bits the full length of the string.
2014-02-27 15:52:43 +01:00
antirez
24265edb6c Fix misaligned word access in redisPopcount(). 2014-02-27 15:52:43 +01:00
Matt Stancliff
7e274194bf Fix IP representation in clusterMsgDataGossip 2014-02-27 15:52:43 +01:00
antirez
d242b698f4 Sentinel test: add stub for unit 04. 2014-02-25 15:55:21 +01:00
antirez
cd8b149dff Sentinel test: added TODO items in 02 unit. 2014-02-25 15:55:21 +01:00
antirez
c12ebb4633 Sentinel test: check role at end of unit 01. 2014-02-25 15:09:46 +01:00
michael-grunder
a2b3f2eae2 Update --bigkeys to use SCAN
This commit changes the findBigKeys() function in redis-cli.c to use the new
SCAN command for iterating the keyspace, rather than RANDOMKEY.  Because we
can know when we're done using SCAN, it will exit after exhausting the keyspace.
2014-02-25 15:09:46 +01:00
antirez
cb7ccc9ad0 Sentinel test: kill masters instead of using DEBUG SLEEP in all tests. 2014-02-25 15:09:46 +01:00
antirez
48fa34bf9e redis-cli: also remove useless uint8_t. 2014-02-25 15:09:46 +01:00
antirez
ff20d05650 redis-cli: don't use uint64_t where actually not needed.
The computation is just something to take the CPU busy, no need to use a
specific type. Since stdint.h was not included this prevented
compilation on certain systems.
2014-02-25 15:09:46 +01:00
antirez
6a95ddb248 redis-cli: check argument existence for --pattern. 2014-02-25 15:09:46 +01:00
antirez
2431b63ff4 redis-cli: --intrinsic-latency run mode added. 2014-02-25 15:09:46 +01:00
antirez
68e9597e8a redis-cli: added comments to split program in parts. 2014-02-25 15:09:46 +01:00
antirez
31ed0911fa Sentinel test: restart instances left killed by previous unit.
An unit can abort in the middle for an error. The next unit should not
assume that the instances are in a clean state, and must restart what
was left killed.
2014-02-25 10:24:16 +01:00
antirez
edf48d17a2 Sentinel test: jump to next unit on test failure.
Sentinel tests are designed to be dependent on the previous tests in the
same unit, so usually we can't continue with the next test in the same
unit if a previous test failed.
2014-02-25 10:24:16 +01:00
antirez
7f521d764f Sentinel test: test majority crashing Sentinels.
The test was previously performed by removing the master from the
Sentinel monitored masters. The test with the Sentinels crashed is
more similar to real-world partitions / failures.
2014-02-25 10:24:16 +01:00
antirez
491bcbaf73 Sentinel test: restart_instance should refresh pid attrib.
Also kill_instance was modified to warn when a test will try to kill the
same instance multiple times for error.
2014-02-25 10:24:16 +01:00
antirez
55643361f1 Sentinel test: more stuff mored from 00-base to init.
The area a number of mandatory tests to craete a stable setup for
testing that is not too sensitive to timing issues. All those tests
moved to includes/init-tests, and marked as (init).
2014-02-25 10:24:16 +01:00
antirez
4f6ed3412e Sentinel: log quorum with +monitor event. 2014-02-25 10:24:16 +01:00
antirez
55a2e10b03 Sentinel test: removed useless code to set SDOWN timeout.
The new common initialization code used to start a new unit already set
the timeout to 2000 milliseconds.
2014-02-25 10:24:16 +01:00
antirez
56e18ba4f6 Sentinel: generate +monitor events at startup. 2014-02-25 10:24:16 +01:00
antirez
cd68a1d45a Sentinel: log +monitor and +set events.
Now that we have a runtime configuration system, it is very important to
be able to log how the Sentinel configuration changes over time because
of API calls.
2014-02-25 10:24:16 +01:00
antirez
4af2acf2b0 Sentinel: added missing exit(1) after checking for config file. 2014-02-25 10:24:16 +01:00
antirez
fecfd721df Sentinel test: tmp dir and gitignore added. 2014-02-25 10:24:16 +01:00
antirez
b18109e351 Sentinel test: minor fixes to --pause-on-error. 2014-02-25 10:24:16 +01:00
antirez
2ed707bcaa Sentinel test: --pause-on-error option added.
Pause the test with running instances available for state inspection on
error.
2014-02-25 10:24:16 +01:00