Commit Graph

4707 Commits

Author SHA1 Message Date
dc8584696a fix issue 1787 2014-06-01 02:23:24 +08:00
8a588ac14d More trailing spaces in sentinel.c removed. 2014-05-28 15:46:05 +02:00
b187c591e3 Small typo fixed 2014-05-28 09:46:01 +02:00
efbe173962 Cluster test: add tmp dir to Git repo. 2014-05-26 18:08:12 +02:00
3e0aef1e2e Merge pull request #1775 from mattsta/fix-test-against-new-PID-format
Fix test framework to detect proper server PID
2014-05-26 17:56:58 +02:00
7a0c5fdf12 Disable recursive watchdog signal handler
If we are in the signal handler, we don't want to handle
the signal again.  In extreme cases, this can cause a stack overflow
and segfault Redis.

Fixes #1771
2014-05-26 17:53:33 +02:00
88c2307535 Cluster: always allow ok -> fail switch in clusterUpdateState().
There is a time defined by REDIS_CLUSTER_WRITABLE_DELAY where fail -> ok
switch is not possible after startup as a master for some time, however
the contrary (ok -> fail) should always be possible.
2014-05-26 16:24:12 +02:00
3495caf4fb Cluster test: catch FLUSHALL errors on node reset.
FLUSHALL will fail on read-only slaves, but there the command is not
needed in order to reset the instance with CLUSTER RESET so errors can
be ignored.
2014-05-26 11:00:11 +02:00
69fa133ec2 Sentinel example config: explain you don't need to specify slaves. 2014-05-26 10:17:12 +02:00
6c16ecaaaa Fix test framework to detect proper server PID
Previously the PID format was:
[PID] Timestamp

But it recently changed to:
PID:X Timestamp

The tcl testing framework was grabbing the PID from \[\d+\], but
that's not valid anymore.

Now we grab the pid from "PID: <PID>" in the part of Redis startup
output to the right of the ASCII logo.
2014-05-23 13:54:29 -04:00
d0566daeaf Cluster test: basic failover unit added. 2014-05-23 11:47:47 +02:00
aa5dfb3c2c Cluster test: move basic read/write test into a procedure. 2014-05-23 11:41:50 +02:00
a700bc74a8 Cluster test: more reliable 01-faildet unit.
Do things in a sequence that prevents failover during failure detection.
2014-05-23 11:40:34 +02:00
94e3bb568a Fixed typo in word avarege in result message of --intrinsic-latency analyzer 2014-05-22 20:01:12 +02:00
b239a32aae redisLogFromHandler() format changed to match new logs format. 2014-05-22 19:24:35 +02:00
d98fa718e0 Tag every log line with role.
Every log contains, just after the pid, a single character that provides
information about the role of an instance:

S - Slave
M - Master
C - Writing child
X - Sentinel
2014-05-22 18:48:37 +02:00
39603a7e31 Cluster: slave validity factor is now user configurable.
Check the commit changes in the example redis.conf for more information.
2014-05-22 16:57:54 +02:00
ce37488919 Test: AOF test false positive when running in slow hosts.
The bug was triggered by running the test with Valgrind (which is a lot
slower and more sensible to timing issues) after the recent changes
that made Redis more promptly able to reply with the -LOADING error.
2014-05-22 16:05:03 +02:00
17b05afda9 Test: dump.tcl fixed for RESTORE new error msg. 2014-05-22 15:56:17 +02:00
762b1ae2be Fix an error in redis-trib where we always talk with same node.
While iterating the list of nodes we want to set the slot as stable in
the current node, not always in the first node of the list.
2014-05-21 18:17:02 +02:00
c68c78719f redis-trib fix improved: move keys from N nodes to owner. 2014-05-21 16:40:46 +02:00
33f943b4cd Fix blocking operations from missing new lists
Behrad Zari discovered [1] and Josiah reported [2]: if you block
and wait for a list to exist, but the list creates from
a non-push command, the blocked client never gets notified.

This commit adds notification of blocked clients into
the DB layer and away from individual commands.

Lists can be created by [LR]PUSH, SORT..STORE, RENAME, MOVE,
and RESTORE.  Previously, blocked client notifications were
only triggered by [LR]PUSH.  Your client would never get
notified if a list were created by SORT..STORE or RENAME or
a RESTORE, etc.

Blocked client notification now happens in one unified place:
  - dbAdd() triggers notification when adding a list to the DB

Two new tests are added that fail prior to this commit.

All test pass.

Fixes #1668

[1]: https://groups.google.com/forum/#!topic/redis-db/k4oWfMkN1NU
[2]: #1668
2014-05-21 09:52:52 -04:00
56161ca0a4 redis-trib fix: use MIGRATE REPLACE when fixing slots.
This fixes issue #1765.
2014-05-21 12:15:06 +02:00
7f772355f4 Regression test for issue #1764. 2014-05-20 16:20:16 +02:00
ce2b2f22d9 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2014-05-20 16:15:13 +02:00
ce7c47265b Merge pull request #1764 from michael-grunder/lua_cache_segfault
Fix LUA_OBJCACHE segfault.
2014-05-20 16:14:34 +02:00
4ddc77041f Remove trailing spaces from scripting.c 2014-05-20 16:11:22 +02:00
01e3f9ba1d Remove trailing spaces from sentinel.c. 2014-05-20 14:22:42 +02:00
ea0e2524aa Fix LUA_OBJCACHE segfault.
When scanning the argument list inside of a redis.call() invocation
for pre-cached values, there was no check being done that the
argument we were on was in fact within the bounds of the cache size.

So if a redis.call() command was ever executed with more than 32
arguments (current cache size #define setting) redis-server could
segfault.
2014-05-19 13:18:13 -07:00
a9e62ab9fa HyperLogLog regression test for issue #1762. 2014-05-19 15:44:04 +02:00
04d901aed5 Merge pull request #1762 from trink/unstable
Correct the HyperLogLog stale cache flag to prevent unnecessary computat...
2014-05-19 15:39:30 +02:00
194b78050e Cluster test: better failure detection test and framework improvements. 2014-05-19 15:26:19 +02:00
a7da78e472 Cluster test: failure detection initial tests. 2014-05-19 11:39:15 +02:00
4c04744734 Cluster test: proper initialization at unit startup. 2014-05-19 11:24:15 +02:00
ba52cd06c8 Correct the HyperLogLog stale cache flag to prevent unnecessary computations.
Set the MSB as documented.
2014-05-18 07:26:26 -07:00
67133d2f48 Cluster: use clusterSetNodeAsMaster() during slave failover.
clusterHandleSlaveFailover() was reimplementing what
clusterSetNodeAsMaster() without any good reason.
2014-05-15 17:03:28 +02:00
8c6e92c3bc Cluster: clear todo_before_sleep flags when executing actions.
Thanks to this change, when there is some code like:

    clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|...);
    ... and later before returning to the event loop ...
    clusterUpdateState();

The clusterUpdateState() function will clar the flag and will not be
repeated in the clusterBeforeSleep() function. This especially important
for config save/fsync flags which are slow to execute and not a good
idea to repeat without a good reason.

This is implemented for all the CLUSTER_TODO flags.
2014-05-15 16:33:13 +02:00
7b87cda70e Fixed typo in CLUSTER RESET implementation. 2014-05-15 12:33:57 +02:00
796f4ae9f7 CLUSTER RESET implemented.
The new command is able to reset a cluster node so that it starts again
as a fresh node. By default the command performs a soft reset (the same
as calling it as CLUSTER RESET SOFT), and the following steps are
performed:

1) All slots are set as unassigned.
2) The list of known nodes is flushed.
3) Node is set as master if it is a slave.

When an hard reset is performed with CLUSTER RESET HARD the following
additional operations are performed:

4) A new Node ID is created at random.
5) Epochs are set to 0.

CLUSTER RESET is useful both when the sysadmin wants to reconfigure a
node with a different role (for example turning a slave into a master)
and for testing purposes.

It also may play a role in automatically provisioned Redis Clusters,
since it allows to reset a node back to the initial state in order to be
reconfigured.
2014-05-15 11:43:06 +02:00
8b9d5ecbd1 Remove trailing spaces from cluster.c file. 2014-05-15 10:18:36 +02:00
0b7aa26164 Cluster test: added function assert_cluster_state. 2014-05-14 15:21:57 +02:00
60e5d1724c Cluster: don't accept cluster bus connections during startup. 2014-05-14 12:05:00 +02:00
6baac558d8 Cluster: better handling of stolen slots.
The previous code handling a lost slot (by another master with an higher
configuration for the slot) was defensive, considering it an error and
putting the cluster in an odd state requiring redis-cli fix.

This was changed, because actually this only happens either in a
legitimate way, with failovers, or when the admin messed with the config
in order to reconfigure the cluster. So the new code instead will try to
make sure that the keys stored match the new slots map, by removing all
the keys in the slots we lost ownership from.

The function that deletes the keys from the lost slots is called only
if the node does not lose all its slots (resulting in a reconfiguration
as a slave of the node that got ownership). This is an optimization
since the replication code will anyway flush all the instance data in
a faster way.
2014-05-14 10:46:37 +02:00
27ca133d35 cluster.tcl: fix redis links leak in refresh_nodes_map. 2014-05-14 09:10:03 +02:00
cdf2271c5b cluster.tcl: saner error handling.
Better handling of connection errors in order to update the table and
recovery, populate the startup nodes table after fetching the list of
nodes.

More work to do about it, it is still not as reliable as
redis-rb-cluster implementation which is the minimal reference
implementation for Redis Cluster clients.
2014-05-14 00:15:52 +02:00
bae30479fb redis.tcl: return I/O error message when peer closes connection. 2014-05-14 00:14:35 +02:00
832a298005 Cluster: fixed data_age computation / check integer overflow. 2014-05-12 17:46:15 +02:00
7c4decb101 Fix lack of strtold under Cygwin
Renaming strtold to strtod then casting
the result is the standard way of dealing with
no strtold in Cygwin.
2014-05-12 11:11:09 -04:00
3e0e51dd9f Fix lack of SA_ONSTACK under Cygwin
Fixes #232
2014-05-12 11:10:24 -04:00
2692339138 Cluster: forced failover implemented.
Using CLUSTER FAILOVER FORCE it is now possible to failover a master in
a forced way, which means:

1) No check to understand if the master is up is performed.
2) No data age of the slave is checked. Evan a slave with very old data
   can manually failover a master in this way.
3) No chat with the master is attempted to reach its replication offset:
   the master can just be down.
2014-05-12 16:34:20 +02:00