5020 Commits

Author SHA1 Message Date
antirez
7d016e6f39 Cluster: include node IDs in SLOTS output.
CLUSTER SLOTS now includes IDs in the nodes description associated with
a given slot range. Certain client library implementations need a way
to reference a node uniquely, so they were relying on CLUSTER
NODES, which is not a stable API and may change frequently depending on
Redis Cluster future requirements.
2016-01-29 12:02:27 +01:00
antirez
438942a540 Remove spurious entries in 3.0.7 changelog.
Certain things were only applicable to 3.2.0 RC2 and RC3.
2016-01-29 09:18:48 +01:00
antirez
6bf60cbf5e UPDATE: Redis 3.0.7.
We had to fix a few last-minute bugs.
2016-01-28 13:02:27 +01:00
antirez
1a7e68556f Use a smoother running average for avg_ttl in INFO.
Reported here:
https://www.reddit.com/r/redis/comments/42r0i0/avg_ttl_varies_a_lot/
2016-01-26 15:29:48 +01:00
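A smoother running average can be sketched as a weighted blend of the previous average and the new sample. This is a minimal illustration, not the code from the commit: the helper name `smooth_avg_ttl` and the 49/50 versus 1/50 weights are assumptions chosen for the example.

```c
#include <assert.h>

/* Hypothetical helper: blend a new sample into a running average.
 * The old average keeps 49/50 of the weight and the new sample gets
 * 1/50 (illustrative weights, assumed for this sketch). */
long long smooth_avg_ttl(long long current_avg, long long sample) {
    assert(sample >= 0);
    if (current_avg == 0) return sample;   /* first sample seeds the average */
    return (current_avg / 50) * 49 + (sample / 50);
}
```

With these weights a single outlier sample moves the reported value only slightly, which is the kind of behavior the Reddit report asked for.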
antirez
13f48d8dbf Fix merge conflicts from 3.2. 2016-01-26 14:25:49 +01:00
antirez
4d62a82b4a Cluster: mismatch sender ID log put back at DEBUG level. 2016-01-26 14:24:24 +01:00
antirez
4685f253c3 Cluster: fix missing ntohs() call to access gossip section port. 2016-01-26 14:22:46 +01:00
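The class of bug fixed here can be illustrated with a small sketch: a 16-bit port stored in network byte order must go through `ntohs()` before it is used as a host integer. The `gossip_entry` struct and the `gossip_port` helper are hypothetical simplifications, not the real cluster bus structures.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical, simplified gossip entry: the port travels in
 * network byte order. */
typedef struct {
    char ip[46];
    uint16_t port;   /* network byte order on the wire */
} gossip_entry;

/* Return the port in host byte order. Reading g->port directly would
 * give a byte-swapped value on little-endian hosts. */
uint16_t gossip_port(const gossip_entry *g) {
    return ntohs(g->port);
}
```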
antirez
025f936cc4 Better address update strategy when processing gossip sections.
The change covers the case where:

1. There is a node we can't reach (in fail or pfail state).
2. We see a different address for this node, in the gossip section sent
to us by a node that, instead, is able to talk with the node we cannot
talk to.

In this case it's a good bet to switch to the address reported by this
node, since there was an address switch and it is able to talk with the
node and we are not.

However previously this was done in a dangerous way, by initiating a
handshake. The handshake, using the MEET packet, forces the receiver to
join our cluster, and this is not a good idea. If the node in question
really just switched address, but is the same node, it already knows about
us, so we just need to perform an address update and a reconnection.

So with this commit instead we just update the address of the node,
release the node link if any, and attempt to reconnect in the next
clusterCron() cycle.

The commit also improves debugging messages printed by Cluster during
address or ID switches.
2016-01-26 14:22:32 +01:00
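The strategy described in the commit above can be sketched roughly as follows. The `node_t` struct, the flag names and `update_address_from_gossip` are hypothetical stand-ins for the real cluster structures; the sketch only shows the decision, under the assumption that a dropped link is reopened by a later periodic cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NODE_PFAIL 1
#define NODE_FAIL  2

/* Hypothetical, simplified node record (not the real clusterNode). */
typedef struct {
    char ip[46];
    int port;
    int flags;
    bool link_up;   /* stands in for an open node link */
} node_t;

/* If we cannot reach the node and a peer that can reach it reports a
 * different address, adopt that address and drop the link so that a
 * later periodic cycle reconnects. No MEET-style handshake is started.
 * Returns true when the address was updated. */
bool update_address_from_gossip(node_t *n, const char *ip, int port) {
    assert(strlen(ip) < sizeof(n->ip));
    if (!(n->flags & (NODE_PFAIL | NODE_FAIL))) return false;
    if (strcmp(n->ip, ip) == 0 && n->port == port) return false;
    strcpy(n->ip, ip);
    n->port = port;
    n->link_up = false;   /* release the link; reconnect later */
    return true;
}
```

A reachable node is left untouched, so a stale gossip entry cannot redirect a working connection.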
antirez
5d9a533591 Fix memory leak in masterauth config option loading. 2016-01-26 14:21:49 +01:00
antirez
72f5326076 Fix merge issues with 3.2 backports. 3.0.7 2016-01-25 15:57:52 +01:00
antirez
53c9c299df Redis 3.0.7. 2016-01-25 15:54:36 +01:00
antirez
d4090b169d Minor MIGRATE refactoring.
Centralize cleanup of newargv in a single place.
Add more comments to help a bit following a complex function.

Related to issue #3016.
2016-01-25 15:23:08 +01:00
antirez
29c89df46e More variadic MIGRATE fixes.
Another leak was fixed in the case of syntax error by restructuring the
allocation strategy for the two dynamic vectors.

We also make sure to always close the cached socket on I/O errors so that
all the I/O errors are handled the same, even if we had a previously
queued error of a different kind from the destination server.

Thanks to Kevin McGehee. Related to issue #3016.
2016-01-25 15:23:04 +01:00
antirez
14e1599660 Various fixes to MIGRATE with multiple keys.
In issue #3016 Kevin McGehee identified multiple very serious issues in
the new implementation of MIGRATE. This commit attempts to restructure
the code in order to avoid mistakes. An analysis of the new
implementation is in progress in order to check for possible edge cases.
2016-01-25 15:22:58 +01:00
antirez
4300a973b8 Test: Handle LOADING in restart_instance. 2016-01-25 15:21:57 +01:00
antirez
5a402ce2d5 Detect and show crashes on Sentinel/Cluster tests. 2016-01-25 15:21:53 +01:00
antirez
515392c216 Cluster: fix setting nodes slaveof pointer to NULL on node release.
With this commit we preserve the list of nodes that have .slaveof set
to the node, even when the node is turned into a slave, and make sure to
fix the .slaveof pointers to NULL when a node is freed from memory,
regardless of whether it's a slave or a master.

Basically we try to remember the logical master in the current
configuration even if the logical master advertised it as a slave
already. However we still remember the associations, so that when a node
is freed we can fix them.

This should fix issue #3002.
2016-01-25 15:21:49 +01:00
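The invariant described above (no `.slaveof` pointer may outlive the node it points to) can be sketched with a simplified node type. The `node` struct, the fixed-size `slaves` array and the helper names are assumptions made for this example, not the real cluster code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLAVES 8

/* Hypothetical, simplified node (not the real clusterNode). */
typedef struct node {
    struct node *slaveof;             /* logical master, or NULL */
    struct node *slaves[MAX_SLAVES];  /* nodes whose slaveof points here */
    int numslaves;
} node;

void node_add_slave(node *master, node *slave) {
    assert(master->numslaves < MAX_SLAVES);
    master->slaves[master->numslaves++] = slave;
    slave->slaveof = master;
}

/* Called before a node is released: clear every back-reference to it,
 * whether the node currently acts as a master or as a slave. */
void node_detach(node *n) {
    for (int i = 0; i < n->numslaves; i++) n->slaves[i]->slaveof = NULL;
    n->numslaves = 0;
    if (n->slaveof) {
        node *m = n->slaveof;
        for (int i = 0; i < m->numslaves; i++) {
            if (m->slaves[i] == n) {
                /* Swap the last entry into the freed position. */
                m->slaves[i] = m->slaves[--m->numslaves];
                break;
            }
        }
        n->slaveof = NULL;
    }
}
```

The key point is that the slave list is kept even after the node itself becomes a slave, so the cleanup loop still finds every pointer that needs to be reset.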
antirez
d5872e8e31 Cluster: clarify node->slave may be NULL. 2016-01-25 15:21:43 +01:00
antirez
8cae6e955b Cluster: fix rebalancing to always empty nodes.
Because of a rounding error, even with weight=0 a node was sometimes
left with an assigned slot.

Close #3001.
2016-01-25 15:21:40 +01:00
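One way to avoid this kind of rounding problem is to treat a zero weight as a special case, so the target never depends on floating point arithmetic. The sketch below is an assumption about how such a guard could look; redis-trib itself is not written in C and the helper `expected_slots` is hypothetical.

```c
#include <assert.h>

#define TOTAL_SLOTS 16384

/* Hypothetical helper: number of slots a node should own given its
 * weight and the sum of all weights. A zero weight maps to exactly
 * zero slots instead of going through floating point math. */
int expected_slots(double weight, double total_weight) {
    assert(weight >= 0 && total_weight > 0);
    if (weight == 0) return 0;
    return (int)(TOTAL_SLOTS / total_weight * weight);
}
```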
antirez
791a295636 Cluster: redis-trib move_to_slot: don't send SETSLOT to slaves. 2016-01-25 15:21:36 +01:00
antirez
f2879c25d1 Cluster: fix redis-trib reference of variable in warning. 2016-01-25 15:21:31 +01:00
antirez
49b1e78820 CLUSTER BUMPEPOCH initial implementation fixed. 2016-01-25 15:21:27 +01:00
antirez
7942e7090e Cluster: implement redis-trib fix when slot is open without owners.
Still work to do.
2016-01-25 15:21:24 +01:00
antirez
beb5058e6f Cluster: implement redis-trib fix for uncovered slots. 2016-01-25 15:21:20 +01:00
antirez
8ef716d19a Cluster: CLUSTER BUMPEPOCH introduced to help redis-trib fix.
Sometimes during "fixes" we have to set up a new configuration and assign
slots to nodes. With BUMPEPOCH we can make sure the new configuration of
the node will win if there are conflicting configurations (for example
another node is *also* claiming the same slot because the cluster is
totally messed up).
2016-01-25 15:21:17 +01:00
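The idea behind BUMPEPOCH can be sketched as "take an epoch greater than any epoch seen so far". The `epochs_t` struct and `bump_epoch` function below are hypothetical simplifications for illustration; they assume the caller already knows the greatest configuration epoch in the cluster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the epochs known to one node. */
typedef struct {
    uint64_t my_epoch;       /* this node's configuration epoch */
    uint64_t current_epoch;  /* highest epoch this node has seen */
} epochs_t;

/* Give this node a config epoch greater than every known epoch,
 * unless it already holds the greatest one.
 * Returns 1 if the epoch was bumped, 0 otherwise. */
int bump_epoch(epochs_t *e, uint64_t max_epoch) {
    assert(e != NULL);
    if (e->my_epoch != 0 && e->my_epoch == max_epoch) return 0;
    if (e->current_epoch < max_epoch) e->current_epoch = max_epoch;
    e->current_epoch++;
    e->my_epoch = e->current_epoch;
    return 1;
}
```

Because slot ownership conflicts are resolved in favor of the higher configuration epoch, a node bumped this way wins against another node claiming the same slot.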
antirez
5da1e640ad Cluster: don't allow CLUSTER SETSLOT with slaves. 2016-01-25 15:21:13 +01:00
antirez
53edd42a4e Cluster: check packets length before accessing far fields. 2016-01-19 13:18:08 +01:00
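A length check of this kind can be sketched as follows. The layout used here (a 4-byte signature followed by a big-endian 32-bit total length) and the name `packet_is_complete` are assumptions for the example, not the actual cluster bus header definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire header: 4-byte signature, then a big-endian
 * 32-bit total packet length. */
#define HDR_MIN_LEN 8

/* Validate that enough bytes were received before reading a field
 * that lives past the start of the buffer.
 * Returns 1 if the buffer holds a complete packet, 0 otherwise. */
int packet_is_complete(const unsigned char *buf, size_t received) {
    assert(buf != NULL);
    if (received < HDR_MIN_LEN) return 0;   /* length field not there yet */
    uint32_t totlen = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                      ((uint32_t)buf[6] << 8)  |  (uint32_t)buf[7];
    if (totlen < HDR_MIN_LEN) return 0;     /* declared length is impossible */
    return received >= totlen;
}
```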
antirez
e50b9a0757 Scripting: handle trailing comments.
This fix, provided by Paul Kulchenko (@pkulchenko), allows the Lua
scripting engine to evaluate statements with a trailing comment like the
following one:

    EVAL "print() --comment" 0

Lua can't parse the above if the string does not end with a newline, so
now a final newline is always added automatically. This does not change
the SHA1 of scripts since the SHA1 is computed on the body we pass to
EVAL, without the other code we add to register the function.

Close #2951.
2016-01-08 15:45:18 +01:00
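The effect of the extra newline can be sketched with a small wrapper that builds the function definition around the script body. The function name passed in and the helper `wrap_script` are hypothetical; the point is only that `end` is placed on its own line so a trailing `--comment` cannot swallow it.

```c
#include <assert.h>
#include <stdio.h>

/* Wrap a script body in a function definition. The newline before
 * "end" keeps a trailing "--comment" from hiding the terminator.
 * The function name is a hypothetical placeholder.
 * Returns the number of bytes written, or -1 if dst is too small. */
int wrap_script(char *dst, size_t dstlen, const char *name, const char *body) {
    assert(dst != NULL && dstlen > 0);
    int n = snprintf(dst, dstlen, "function %s() %s\nend", name, body);
    return (n < 0 || (size_t)n >= dstlen) ? -1 : n;
}
```

Since the newline is part of the wrapper and not of the body, a hash computed over the body alone is unaffected.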
antirez
e9abc94483 Allow MIGRATE to always be called on local keys for open slots.
Extend the MIGRATE extra freedom to be able to be called in the context
of the local slot, anytime there is a slot open in one or the other
direction (importing or migrating). This is useful for redis-trib to fix
the cluster when it is in an odd state.

This fix allows "redis-trib fix" to do its work in certain cases where
previously an error was reported.
2016-01-08 15:36:37 +01:00
antirez
583194c8f4 Fix typos & grammar in clusterBumpConfigEpochWithoutConsensus() comment. 2016-01-08 15:35:44 +01:00
antirez
319a4c04c7 Cluster: don't send -ASK to MIGRATE.
For non-existing keys, we don't want to send -ASK redirections to
MIGRATE, since when moving slots from the migrating node to the
importing node, we want just to ignore keys that are no longer there.
They may be expired or deleted between the GETKEYSINSLOT call and the
MIGRATE call. Otherwise this causes an error during migrations with
redis-trib (or equivalent cluster management tools).
2016-01-06 12:24:13 +01:00
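The rule from this commit can be sketched as a small decision function. The enum, the parameters and the name `redirect_for_migrating_slot` are hypothetical; the sketch assumes the caller already counted how many of the requested keys are missing locally.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { REPLY_SERVE, REPLY_ASK } redirect_t;

/* Hypothetical decision helper for a slot in migrating state.
 * Missing keys normally trigger an -ASK redirection, but MIGRATE is
 * served locally so that keys expired or deleted in the meantime are
 * simply skipped. */
redirect_t redirect_for_migrating_slot(bool is_migrate_cmd, int missing_keys) {
    assert(missing_keys >= 0);
    if (missing_keys == 0) return REPLY_SERVE;
    return is_migrate_cmd ? REPLY_SERVE : REPLY_ASK;
}
```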
antirez
a8c2aa0f44 Cluster test: do leaks detection with OSX leaks utility. 2016-01-02 13:25:27 +01:00
antirez
f9c971ff87 redis-trib: Remove duplicated key in hash initialization. 2016-01-02 13:25:27 +01:00
antirez
3d61cb0cb1 Suppress harmless warnings. 3.0.6 2015-12-18 16:19:47 +01:00
antirez
ddc4d7f8c7 Changelog typo fixed. 2015-12-18 16:14:55 +01:00
antirez
d51fdfda42 Redis 3.0.6 2015-12-18 16:10:43 +01:00
antirez
7ce7387202 Cluster: rebalance now supports --threshold option. 2015-12-18 15:52:22 +01:00
antirez
b4b7c57cb0 Cluster: redis-trib reshard / rebalance --pipeline support. 2015-12-18 15:52:22 +01:00
antirez
c48355e920 Cluster: verify slaves consistency after resharding. 2015-12-18 11:34:08 +01:00
antirez
99cb476500 Fix CMD_DENYOOM macro name after backporting. 2015-12-18 09:15:47 +01:00
antirez
514ee7135e Cluster: allows abbreviated node IDs with rebalance --weight option. 2015-12-18 09:12:24 +01:00
antirez
025628bd76 Cluster: rebalancing option --simulate, and a fix. 2015-12-18 09:12:20 +01:00
antirez
f501f5f4c9 Cluster: redis-trib rebalance initial implementation. 2015-12-18 09:12:16 +01:00
antirez
a49a57ccd1 Initial implementation of redis-trib info subcommand. 2015-12-18 09:12:11 +01:00
antirez
0bc1993879 fix sprintf and snprintf format string
There are some cases of printing an unsigned integer with the %d conversion
specifier and vice versa (a signed integer with the %u specifier).

Patch by Sergey Polovko. Backported to Redis from Disque.
2015-12-18 09:10:51 +01:00
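A matching pair of conversion specifiers looks like the following sketch. The helper `format_counts` and its message text are made up for illustration; only the `%d` for `int` and `%u` for `unsigned int` pairing is the point.

```c
#include <assert.h>
#include <stdio.h>

/* Format a signed and an unsigned value with matching specifiers:
 * %d for int, %u for unsigned int. Swapping them would print a large
 * unsigned value as negative, or a negative value as a huge number. */
int format_counts(char *dst, size_t dstlen, int delta, unsigned int total) {
    assert(dst != NULL);
    return snprintf(dst, dstlen, "delta=%d total=%u", delta, total);
}
```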
antirez
cb61d003ab Cluster: resharding test now checks AOF consistency.
It's a key invariant that when AOF is enabled, after the cluster
reshards, a crash-recovery event causes all the keys to be still fine
with the expected logical content. Now this is part of unit 04.
2015-12-17 17:53:33 +01:00
antirez
d999f5a68e Fix a race that may lead to the active (slave) client to be freed.
In issue #2948 a crash was reported in processCommand(). Later Oran Agra
(@oranagra) traced the bug (in private chat) to the following sequence
of events:

1. Some maxmemory is set.
2. The slave is the currently active client and is executing PING or
   REPLCONF or whatever a slave can send to its master.
3. freeMemoryIfNeeded() is called since maxmemory is set.
4. flushSlavesOutputBuffers() is called by freeMemoryIfNeeded().
5. During slaves buffers flush, a write error could be encountered in
   writeToClient() or sendReplyToClient() depending on the version of
   Redis. This will trigger freeClient() against the currently active
   client, so a segmentation fault will likely happen in
   processCommand() immediately after the call to freeMemoryIfNeeded().

There are different possible fixes:

1. Add flags to writeToClient() (recent versions code base) so that
   we can ignore the write errors, and use this flag in
   flushSlavesOutputBuffers(). However this is not simple to do in older
   versions of Redis.
2. Use freeClientAsync() during write errors. This works but changes the
   current behavior of releasing clients ASAP when possible. Normally
   we write to clients during the normal event loop processing, in the
   writable client, where there is no active client, so no care must be
   taken.
3. The fix of this commit: to detect that the current client is no
   longer valid. This fix is a bit "ad-hoc", but works across all the
   versions and has the advantage of not changing the remaining
   behavior. It only alters what happens during this race condition,
   hopefully.
2015-12-17 09:48:44 +01:00
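The third option, detecting that the current client is no longer valid, can be sketched as follows. Everything here is a simplified stand-in: `client`, `release_client`, `reclaim_memory` and `process_command` are hypothetical names, and the victim parameter simulates a write error hitting a specific client.

```c
#include <assert.h>
#include <stddef.h>

typedef struct client { int fd; } client;

/* Hypothetical global tracking the client whose command is running. */
static client *current_client = NULL;

/* Releasing a client clears the global if it points to that client. */
void release_client(client *c) {
    if (current_client == c) current_client = NULL;
    c->fd = -1;   /* stands in for closing the socket and freeing memory */
}

/* Stand-in for a memory-reclaim step that may hit a write error and
 * release a client as a side effect. */
void reclaim_memory(client *victim) {
    if (victim) release_client(victim);
}

/* Returns 1 if the command may proceed, 0 if the caller's client was
 * released during the reclaim step and must not be touched again. */
int process_command(client *c, client *victim) {
    assert(c != NULL);
    current_client = c;
    reclaim_memory(victim);
    if (current_client == NULL) return 0;
    return 1;
}
```

The check after the reclaim step is the whole fix in this sketch: normal execution is unchanged, and only the race path returns early.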
antirez
f1ab834658 Log address causing SIGSEGV. 2015-12-15 18:02:18 +01:00
Sun He
8bb9cb38be lua_struct.c/getnum: throw error if overflow happens
Fix issue #2855
2015-12-14 17:58:55 +01:00
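An overflow-checked digit parser in the spirit of this fix can be sketched as below. The function `parse_size` is a hypothetical stand-in that returns an error code; it does not use the Lua API and is not the actual getnum implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Parse a run of decimal digits at *s into *out, advancing *s.
 * Returns 0 on success, -1 if the value would not fit in an int.
 * With no digits present, *out is set to the default value. */
int parse_size(const char **s, int dflt, int *out) {
    assert(s != NULL && *s != NULL && out != NULL);
    if (!isdigit((unsigned char)**s)) { *out = dflt; return 0; }
    int a = 0;
    do {
        int d = **s - '0';
        /* Check before multiplying so the overflow never happens. */
        if (a > INT_MAX / 10 || a * 10 > INT_MAX - d) return -1;
        a = a * 10 + d;
        (*s)++;
    } while (isdigit((unsigned char)**s));
    *out = a;
    return 0;
}
```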
antirez
a5d27d395f Fix 3.0 merge issues with new MIGRATE. 2015-12-13 10:23:04 +01:00